XML Programming Techniques

Daniela Florescu, Oracle
Donald Kossmann, ETH

Post on 13-Sep-2014

TRANSCRIPT

Page 1: XML Programming Techniques

XML Programming Techniques

Daniela Florescu, Oracle
Donald Kossmann, ETH

Page 2: XML Programming Techniques

Why this tutorial?
• Has XML changed the way we build apps?
  - No! (just another layer; made things worse!)
• Should XML change the way we build apps?
  - Yes! (our hypothesis)
• So what are the options/tradeoffs?

Page 3: XML Programming Techniques

Overview
• Introduction
  - Applications & Architectures
• Interfaces to existing languages (Java, .NET, …)
  - XML APIs: SAX, DOM, StAX
  - Code generators: JAXB 2.0, XML Beans, SDO, EMF
• Extensions to existing programming languages
  - JavaScript (ECMA), AJAX, PHP
  - SQL/XML
  - Microsoft's XLinq
• "Native" XML programming languages
  - Domain-specific languages: BPEL
  - Pure XML type system: XQuery, XSLT, XQueryP
  - Research: Curl, XL, XDuce, Links, XQuery!, SIMKIN
• Comparison of existing solutions

Page 4: XML Programming Techniques

Killer Advantages of XML
• Platform/vendor independent, international (Unicode)
• Human and machine readable
• Serialization of data
• Hype: $$$ and people
  - Tools and human resources available
  - Standardization, secure investment
• Family of technologies
  - XQuery, XML Schema, SOAP, WS Security, … (all building blocks for SOA)
• XML is not new!
  - Best of breed from OO, DB, documents, distributed systems, …

Page 5: XML Programming Techniques

Killer Advantages
• Decouple data from application
  - Data lives longer than code (legacy problem)
  - Data first, schema later (pay as you go along)
• Spectrum: unstructured to structured data
  - Potentially all data
  - Pay as you go along
• Spectrum: data, meta-data, code
  - Potentially all information
  - Avoid technology jungle: one size fits all

Page 6: XML Programming Techniques

Some Problems of XML
• Not complete; pieces of puzzle missing
  - RDF compatibility, programming, …
• Bottom-up standardization
• Bottom-up product development
• Too much fluff
  - Do you need processing instructions?
• No references, no support for N:M relationships
• No design methodology
  - ER / UML were not designed for XML
• Some things are good and bad
  - Lexical and binary representation of data
  - All data are context-sensitive (no cut & paste!)

Page 7: XML Programming Techniques

Why is programming for XML different?
• XML is not based on entities + relationships
• XML decouples data from its interpretation
  - Data first, schema later
  - Spectrum: unstructured to structured data
  - Spectrum: data, meta-data, code
• Don't bury the killer advantages of XML in the programming language!

Page 8: XML Programming Techniques

Typical XML Applications
• Blogs: RSS, Atom
  - Why XML: platform-independent, serialization, structured-unstructured data
  - Unused potential: RSS as a building block of any streaming application
• EAI: Web Services, REST
  - Why XML: family of standards, serialization, platform-independent, machine readable
  - Unused potential: performance, declarative programming, strong typing

Page 9: XML Programming Techniques

Typical XML Applications (ctd.)
• Office: OpenOffice, Microsoft Office
  - Why XML: structured-unstructured data, hype
  - Unused potential: ???
• Scientific data
  - Why XML: data first/schema later, hype, structured-unstructured data
  - Unused potential: ???
• Eclipse (XMI), configuration files
  - Why XML: XML is not new, human readable, data/code/metadata
  - Unused potential: data first/schema later

Page 10: XML Programming Techniques

XML Architectures
• XML another layer for comm. + presentation
• Leave everything else as before
• XML makes things worse (another layer)
  - More marshalling, more logging, more complexity

[Figure: diagram with boxes labeled XML, SQL, Objects]

Page 11: XML Programming Techniques

XML Architectures
• Common runtime; ideally no marshalling
• Exploit best of all worlds
• Not clear how to do the cut
• Example: Microsoft LINQ

[Figure: diagram with boxes labeled XML, SQL, Objects]

Page 12: XML Programming Techniques

XML Architectures
• XML used by different components at different layers for different purposes
• Examples: Eclipse, PHP (most frameworks)

[Figure: architecture diagram with one box labeled XML]

Page 13: XML Programming Techniques

XML Architectures
• XML everywhere and nowhere
• Example: WebLogic, WebSphere

[Figure: architecture diagram with multiple boxes labeled XML]

Page 14: XML Programming Techniques

XML Architectures
• XML everywhere
• Only a little bit of native code
  - Jim Gray: "Extremist Approach" (ACM Queue)
• Example: XQuery, XQueryP

[Figure: architecture diagram with one box labeled XML]

Page 15: XML Programming Techniques

What is right for me?
• How deep does the XML go into architecture?
  - Wrap XML as an additional layer
  - How big is wrapper compared to rest of code?
• Am I too lazy to learn a new language?
  - Cost to train people, how safe is that investment
• What tools support my SE process?
  - Do I have a methodology for the XML app?
• What application? What computations?
• What kind of XML data?
  - Persistent, data on the wire, typed, distributed, ...
• What kind of XML data model?
  - Serialized XML, Infoset, PSVI, XDM, ...

Page 16: XML Programming Techniques

What is right for me?
• Optimizability, performance
  - Cost for data marshalling
  - Can I stream data; no need to parse whole message
  - Do things several times (e.g., logging, checking integrity)
• Productivity of programmers
  - Technology jungle vs. one unified model
  - Optimization, logging, ... are all automatic; focus on application logic and not on mundane tasks
  - Static typing of programs
  - Programming style (declarative vs. imperative)
• Standard compliance: W3C XML family
• Other domain-specific goodies
  - Support for push / events, error handling, logging, asynchronous computation, …
• Exploits / exposes killer advantages of XML
• XML syntax?

Page 18: XML Programming Techniques

Overview of XML APIs
• DOM
  - Any XML application, updates + navigational read
• SAX
  - Low-level XML processing, no updates, only forward navigation
• StAX (JSR 173), XMLPullParser
  - Low level like SAX, but pull (instead of push)
• TokenIterator (BEA XQuery processor)
  - Like JSR 173, but full support for XQuery data model
• XQJ / JSR 225
  - Standard for Java interface for XQuery results
• Microsoft XMLReader Streaming API
  - Microsoft's streaming XML interface
• (Many more that I have omitted.)
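To make the push/pull distinction concrete, here is a minimal sketch that extracts the same data twice. It uses Python's standard library as a stand-in for the Java APIs named on the slide: `xml.sax` plays the role of SAX (the parser calls the application), and `xml.dom.pulldom` approximates a StAX-style pull loop (the application asks for the next event). The sample document and the function names are made up for illustration.

```python
import xml.sax
from xml.dom import pulldom

# Made-up sample data for illustration.
DOC = ("<accounts><account><owner>D. Duck</owner></account>"
       "<account><owner>M. Mouse</owner></account></accounts>")


class OwnerHandler(xml.sax.ContentHandler):
    """Push style: the parser drives and calls back into this handler."""

    def __init__(self):
        super().__init__()
        self.in_owner = False
        self.owners = []

    def startElement(self, name, attrs):
        if name == "owner":
            self.in_owner = True
            self.owners.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.in_owner:
            self.owners[-1] += content

    def endElement(self, name):
        if name == "owner":
            self.in_owner = False


def owners_push(doc):
    handler = OwnerHandler()
    xml.sax.parseString(doc.encode("utf-8"), handler)
    return handler.owners


def owners_pull(doc):
    """Pull style: the application drives and asks for one event at a time."""
    owners = []
    events = pulldom.parseString(doc)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "owner":
            events.expandNode(node)  # materialize only this small subtree
            owners.append("".join(child.data for child in node.childNodes))
    return owners
```

In the push version the application state (`in_owner`) has to live in the handler between callbacks; in the pull version it is simply the position in the loop, which is the usual argument for pull APIs.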

Page 19: XML Programming Techniques

Classification Criteria
• Navigational access?
• Random access (by node id)?
• Decouple navigation from data reads?
• Updates?
• Infoset or XQuery Data Model?
• Target programming language?
• Target data consumer?

Page 20: XML Programming Techniques

Decoupling
• Idea:
  - methods to navigate through data (XML tree)
  - methods to read properties at current position (node)
• Example: DOM (tree-based model)
  - navigation: firstChild, parentNode, nextSibling, …
  - properties: nodeName, getNamedItem, …
  - (updates: createElement, setNamedItem, …)
• Assessment:
  - good: read parts of document, integrate existing stores
  - bad: materialize temp. query results, transformations
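The split between navigation calls and property reads can be seen in a few lines of DOM code. The sketch below uses Python's `xml.dom.minidom`, which exposes the same W3C DOM names listed above (`firstChild`, `nextSibling`, `parentNode`, `nodeName`, `getNamedItem`, `createElement`); the sample document and the function name `dom_walk` are assumptions for illustration.

```python
from xml.dom import minidom

# Made-up sample document (no whitespace between elements, so firstChild is <owner>).
SAMPLE = ('<BankAccount><owner id="4711">D. Duck</owner>'
          '<balance curr="EUR">123.54</balance></BankAccount>')


def dom_walk(xml_text):
    doc = minidom.parseString(xml_text)
    root = doc.documentElement

    # Navigation: move through the tree without reading any data.
    owner = root.firstChild
    balance = owner.nextSibling
    back_at_root = balance.parentNode is root

    # Properties: read data at the current position.
    names = [owner.nodeName, balance.nodeName]
    owner_id = owner.attributes.getNamedItem("id").value

    # Update: build a new node and attach it to the tree.
    comment = doc.createElement("comment")
    comment.appendChild(doc.createTextNode("checked"))
    root.appendChild(comment)

    return names, owner_id, back_at_root, root.lastChild.toxml()
```

Because navigation and reads are separate calls, an implementation can sit on top of an existing store and fetch values only when a property is actually read, which is the "integrate existing stores" advantage noted above.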

Page 21: XML Programming Techniques

Non Decoupling
• Idea:
  - Combined navigation + read properties
  - Special methods for fast forward, reverse navigation
• Example: TokenIterator (token stream)
  - Token getNext(), void skipToNextNode(), …
• Assessment:
  - good: fewer method calls, stream-based processing
  - good: integration of data from multiple sources
  - bad: difficult to wrap existing XML data sources
  - bad: reverse navigation tricky, difficult programming model
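A token stream of this kind can be sketched as follows. This is not the BEA TokenIterator; it is a hypothetical Python class, built on `xml.etree.ElementTree.iterparse`, that only mirrors the two method names from the slide. Each call to `get_next()` both advances the position and returns the data at that position, and `skip_to_next_node()` fast-forwards past a subtree.

```python
import io
import xml.etree.ElementTree as ET


class TokenIterator:
    """Hypothetical token stream: navigation and data reads are one call."""

    def __init__(self, xml_text):
        self._events = ET.iterparse(io.StringIO(xml_text), events=("start", "end"))
        self._depth = 0

    def get_next(self):
        """Return the next (kind, name, data) token, or None at end of stream.

        For a "start" token, data is the attribute dict; for an "end" token,
        data is the element's leading text.
        """
        try:
            kind, elem = next(self._events)
        except StopIteration:
            return None
        self._depth += 1 if kind == "start" else -1
        data = dict(elem.attrib) if kind == "start" else (elem.text or "")
        return (kind, elem.tag, data)

    def skip_to_next_node(self):
        """Skip the rest of the element whose start token was just returned."""
        target = self._depth - 1
        while self._depth > target:
            if self.get_next() is None:
                break
```

A short usage example: after reading the start token of the first `account`, one `skip_to_next_node()` call lands directly on the second one. There is no way to go back, which illustrates why reverse navigation is listed as tricky.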

Page 22: XML Programming Techniques

Classification of APIs

API      DM       Nav.  Rand.  Decp.  Upd.  Platf.
DOM      InfoSet  yes   no     yes    yes   many
SAX      InfoSet  no    no     no     no    Java
JSR173   InfoSet  (no)  no     yes    no    Java
TokIter  XQuery   (no)  no     no     no    Java
XQJ      XQuery   yes   yes    yes    yes   Java
MS       InfoSet  (no)  no     yes    no    .NET

(DM = data model, Nav. = navigational access, Rand. = random access, Decp. = decoupling, Upd. = updates, Platf. = platform)

Page 23: XML Programming Techniques

Summary: XML APIs
• Good: programmers stay in their world
• Bad: APIs are clumsy (not declarative)
• Bad: no logical/physical data independence
• Bad: APIs require data marshalling
• Programming via XML APIs is an extreme case:
  - How deep XML goes into architecture
  - How lazy am I to learn a new language

Page 25: XML Programming Techniques

Code Generators
• Idea
  - Input: XML Schema (XSD)
  - Output: code in target language (mostly Java)
• Examples
  - JAXB: XML <-> Java objects (un-)marshalling
    - Given XML and Java class, automatic translation
    - Many similar open source projects (e.g. Castor)
  - XML Beans: Java getters and setters for XML
    - Compiles Java interfaces based on XSD
    - Implements an XML store + XPath/XQuery access
    - Open source, but owned by BEA
  - SDO, EMF (see next slides)

Page 26: XML Programming Techniques

Eclipse Modeling Framework (EMF)
• Background: Model Driven Architecture
• Idea: compile (Java) code from model
• EMF supports the following models
  - UML 2.0 diagrams (e.g., IBM Rational Rose)
  - XMI (XML Metadata Interchange)
  - Annotated Java
  - XML Schema (but restricted!!!)
• Reference: http://www.eclipse.org/emf

Page 27: XML Programming Techniques

EMF: ECore and EObject
• ECore is a meta model
  - Model to describe models
  - All models (UML, etc.) are described with ECore
  - Analogue: XML Schema
• EObject is a model to represent instances
  - All instances (Java objects) implement EObject
  - Analogue: XML instance

Page 28: XML Programming Techniques

XML Schema vs. ECore

[Figure: XML Schema vs. ECore. Labels: XMLSchema.xsd, ECore.xsd, ECore.ecore, XMLSchema.ecore, connected by "describes" relationships]

Page 29: XML Programming Techniques

EMF from UML: Example
• UML 2.0 class diagram: class BankAccount with attributes -owner : string and -balance : decimal
• Generated Java code

    public interface BankAccount extends EObject {
        String getOwner();
        void setOwner(String value);
        double getBalance();
        void setBalance(double value);
    }

• Generated code is annotated; can be manually extended, regenerated
• Generates interfaces + implementation (i.e., class)
• Very big community (!)

Page 30: XML Programming Techniques

EMF from XSD

    <xsd:schema targetNamespace="…"
                xmlns:xsd="…">
      <xsd:complexType name="BankAccount">
        <xsd:sequence>
          <xsd:element name="owner" type="xsd:string"/>
          <xsd:element name="balance" type="xsd:double"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>

• Creates the same Java (interface + class)
• Works for simple cases
• Does not work for complex XML Schemas
  - Generated Java not always equivalent to XML Schema
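To illustrate what a schema-derived binding does, here is a hand-written sketch of the kind of code a generator would emit for the BankAccount type. It is not output of EMF or JAXB: it is Python rather than Java, and the names `unmarshal` and `marshal` are assumptions for illustration. The point is that the application sees typed fields (`owner: str`, `balance: float`) instead of XML nodes.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class BankAccount:
    """Typed view of the BankAccount complex type (owner: string, balance: double)."""
    owner: str
    balance: float


def unmarshal(xml_text: str) -> BankAccount:
    """XML -> object: read each child element and convert it to its declared type."""
    root = ET.fromstring(xml_text)
    return BankAccount(owner=root.findtext("owner"),
                       balance=float(root.findtext("balance")))


def marshal(account: BankAccount) -> str:
    """Object -> XML: write the fields back in schema (sequence) order."""
    root = ET.Element("BankAccount")
    ET.SubElement(root, "owner").text = account.owner
    ET.SubElement(root, "balance").text = repr(account.balance)
    return ET.tostring(root, encoding="unicode")
```

A flat sequence of simple-typed elements maps cleanly to fields like this. Schema features with no direct class-level counterpart (for example mixed content or choice groups) are the kind of case where such a mapping stops being equivalent to the schema.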

Page 31: XML Programming Techniques

Summary EMF
• Very popular in MDA community
  - If you believe in MDA, here you go
• Technical advantages
  - References are part of ECore (fixes XML bug)
  - ECore shares some of the XML advantages
  - EObjects are strongly typed
• Technical disadvantages (common to all CGs)
  - Does not support whole XML Schema
  - Does not support declarative programming
  - Optimizability à la DB is not likely to happen
• Platform: Java + Eclipse
  - If you hate Microsoft, here you go
• Code generators = XML APIs++ (productivity)
  - Schema-based static typing, data independence

Page 32: XML Programming Techniques

SDO, ADO.NET
• SDO = service data objects (J2EE platform)
  - BEA, IBM, Oracle et al.
• ADO = ActiveX data objects (.NET platform)
  - Microsoft
• Uniform access to data from different sources
  - In particular XML, Web sources
  - Java or C# interface to access any kind of data
• Protocol for disconnected client/server access
  - Client propagates change lists to server
• Implementation: IBM's SDO on top of EMF
  - Conceived by IBM as an extension of EMF
  - W.r.t. XML binding, similar tradeoffs as EMF

Page 34: XML Programming Techniques

ECMAScript = JavaScript ~ JScript
• History
  - Started 1995: Sun and Netscape
  - March 1996: Netscape Navigator 2.0
  - August 1996: Microsoft IE 3.0 (JScript)
  - June 1997, 1998: first standards (ECMAScript)
  - Dec. 1999: ECMA-262 (current version): regular expressions, formatting, try/catch, …
  - June 2004, Dec 2005: E4X (ECMAScript for XML)
• Purpose
  - Enliven Web pages (dynamic Web-based GUIs)
  - Scripting language for experts and users
• http://www.ecma-international.org

Page 35: XML Programming Techniques

ECMAScript Overview
• "Object-based" language (not fully OO)
  - Objects have properties (e.g., name, balance)
  - Properties contain objects, primitives, methods
  - Primitives: e.g., Boolean, String, null
  - Properties have attributes (e.g., ReadOnly)
  - Objects are created through constructors
  - Constructors use prototypes
  - Built-in objects: Object, Array, Function, …
  - Example objects: pop-up, menu, text field, …
• Event-based language (there is no "main")
  - Attach code to events (mouse, errors, aborts, …)
• Syntax resembles Java, C, Self

Page 36: XML Programming Techniques

E4X (ECMA-357)
• Simplify access and manipulation of XML
  - DOM perceived as too clumsy
  - XML is a primitive (like String, Boolean, …)

    var x = new XML()
    x = <BankAccount>
          <owner id="4711">D. Duck</owner>
          <balance curr="EUR">123.54</balance>
        </BankAccount>

Page 37: XML Programming Techniques

E4X
• Access to elements
  - Child access: "."

        x.balance

  - Attribute axis: ".@"

        x.balance.@curr

  - Iteration

        var total = 0;
        for each (x in allBankAccounts.BankAccount) {
          total += x.balance
        }

• Updates
  - Delete nodes

        delete x.comment

  - Insert nodes

        x.comment += <comment>blabla</comment>
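For comparison, the same operations can be written against a conventional tree API. The sketch below uses Python's `xml.etree.ElementTree` as a stand-in (it is not E4X); the second account and the function name `e4x_like_demo` are made up for illustration. Each comment names the E4X expression it corresponds to.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; only the first account comes from the E4X slide.
SAMPLE = (
    "<allBankAccounts>"
    '<BankAccount><owner id="4711">D. Duck</owner>'
    '<balance curr="EUR">123.54</balance><comment>old</comment></BankAccount>'
    '<BankAccount><owner id="4712">M. Mouse</owner>'
    '<balance curr="EUR">10.00</balance></BankAccount>'
    "</allBankAccounts>"
)


def e4x_like_demo(xml_text):
    all_accounts = ET.fromstring(xml_text)
    x = all_accounts.find("BankAccount")

    balance = x.findtext("balance")            # E4X: x.balance
    currency = x.find("balance").get("curr")   # E4X: x.balance.@curr

    # E4X: for each (x in allBankAccounts.BankAccount) { total += x.balance }
    total = 0.0
    for account in all_accounts.findall("BankAccount"):
        total += float(account.findtext("balance"))  # explicit string-to-number conversion

    # E4X: delete x.comment
    for comment in x.findall("comment"):
        x.remove(comment)

    # E4X: x.comment += <comment>blabla</comment>
    ET.SubElement(x, "comment").text = "blabla"

    comments = [c.text for c in x.findall("comment")]
    return balance, currency, total, comments
```

The library version needs method calls and an explicit `float(...)` where E4X uses operators, which is the "clumsiness" the E4X design set out to remove.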

Page 38: XML Programming Techniques

AJAX: Asynchronous JavaScript and XML
• Goal: fine-grained interaction between Web browser and Web server
  - Faster, more interactive, user-friendly Web GUI
  - Web GUI should be as powerful as desktop GUI
• Idea: exploit JavaScript, HTTP and XML
  - JavaScript has methods to invoke HTTP requests
  - AJAX uses XML to ship data from/to server
• Why so successful?
  - Nothing new; it is all there already
  - Just do it!

Page 39: XML Programming Techniques

AJAX Example
• HTML form

    <form> Product:
      <input type="text" id="pname" onkeyup="autoComp(this.value)"/>
    </form>

• JavaScript

    // Assumes a browser; the slide leaves the creation of xmlHttp implicit.
    var xmlHttp = new XMLHttpRequest()
    function autoComp(str) {
      // Encode the typed prefix so special characters survive in the query string.
      var url = "www.myapp.com/pname.do?" + "p=" + encodeURIComponent(str)
      xmlHttp.open("GET", url, true)   // true = asynchronous request
      xmlHttp.send(null)
    }

Page 40: XML Programming Techniques

PHP
• Compile first, execute later interpreter:
  - Compiles into intermediate language
  - Executes opcodes (might contain a lot of functionality)
• Dynamically typed language
  - Types include integer, float, boolean, string, array (hash), object, null
  - Type juggling is automatic at runtime based on context

Page 41: XML Programming Techniques

PHP: Accessing XML
• Treats XML values as if they were native PHP types
  - Takes advantage of the new Zend Engine II Overloading API
  - Takes advantage of the dynamic nature of PHP
  - Uses Gnome project's libxml2 library

Page 42: XML Programming Techniques

Simple Access to XML…

Page 43: XML Programming Techniques

Proposal: XML Content Store
• Goals
  - Process and manage XML data from many sources: web services, RSS feeds, messages, configuration files, user data
  - Create an API to abstract CRUD details
• Results
  - Allow for rapid application design without worrying about tedious persistence details
• Implementation
  - Example API: PHP
  - Persistence layer: upcoming release of DB2, code-named Viper, with native XML support

Page 44: XML Programming Techniques

Summary
• JavaScript, AJAX, PHP are very popular
  - Essential building block of Web 2.0
  - Good: mature platforms, great community
  - Good: domain-specific goodies
• E4X and PHP provide native support for XML
  - XML data type
  - Syntax to access and manipulate XML
• E4X, PHP are not compatible with standards
  - They argue that this is a feature
  - Bad: but they do miss some of the XML advantages

Page 46: XML Programming Techniques

Processing XML with SQL
• Mapping XML data into tuples in a relational database, then use (a variant of) SQL
• User-controlled shredding, then use classical SQL
  - Model-driven shredding (Florescu, Kossmann, 99)
    - Edge, binary approach + alternatives
    - Corresponds to generic APIs (e.g. DOM) for Java
    - Plus: very general; integrates well with relational data
    - Minus: poor performance
  - Schema-based shredding (Shanmugasundaram et al. 99)
    - Map XML Schema / DTD to SQL DDL
    - Plus: integrates well with relational data
    - Minus: missing tools, complicated
• Automatic shredding, then use SQL/XML
  - Plus: usability, logical/physical data independence
  - Minus: less user control
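The model-driven "edge" idea can be sketched with an in-memory SQLite database: every parent-child link in the XML tree becomes one row of a single generic table, and path expressions turn into self-joins. This is a simplified illustration, not the exact schema from the cited paper; the column layout, the sample data and the function names are assumptions, and attributes are ignored for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up sample data for illustration.
DOC = ("<accounts>"
       "<BankAccount><owner>D. Duck</owner><balance>123.54</balance></BankAccount>"
       "<BankAccount><owner>M. Mouse</owner><balance>10.00</balance></BankAccount>"
       "</accounts>")


def shred(conn, xml_text):
    """Store one row per element: (parent id, position, tag, own id, text value)."""
    conn.execute("CREATE TABLE edge (source INTEGER, ordinal INTEGER, "
                 "name TEXT, target INTEGER, value TEXT)")
    next_id = 0

    def visit(elem, parent_id, ordinal):
        nonlocal next_id
        next_id += 1
        node_id = next_id
        text = (elem.text or "").strip() or None
        conn.execute("INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
                     (parent_id, ordinal, elem.tag, node_id, text))
        for position, child in enumerate(elem, start=1):
            visit(child, node_id, position)

    visit(ET.fromstring(xml_text), 0, 1)  # source 0 marks the document root


def owners(conn):
    """The path BankAccount/owner becomes a self-join on the edge table."""
    rows = conn.execute(
        "SELECT o.value FROM edge a JOIN edge o ON o.source = a.target "
        "WHERE a.name = 'BankAccount' AND o.name = 'owner' "
        "ORDER BY a.ordinal")
    return [value for (value,) in rows]
```

The table works for any document without knowing its schema, which matches the "very general" plus; each additional path step adds another self-join, which is one way to read the "poor performance" minus.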

Page 47: XML Programming Techniques

History of SQL / XML
• First edition part of SQL:2003
  - Part 14 of the SQL standard
  - Pre-dates XQuery standard
  - Limited functionality: storage and publishing
• Second edition: work in progress
  - More complete integration of XQuery + XQuery Data Model
  - Advanced query capabilities
  - Expected to be published in 2006

Page 48: XML Programming Techniques

XML Type in SQL
A new type (like varchar, date, numeric)
SQL:2003: XML type restricted to
  XML document or
  XML element or
  Sequence of XML elements
SQL/XML, 2nd edition
  Full support of XQuery Data Model
  XML(SEQUENCE), XML(ANY CONTENT), ...

Page 49: XML Programming Techniques

Example (SQL:2003)
create table books(
  title varchar(20),
  authors XML);

Title           Authors
XQuery 1.0      <author>D. Chamberlin</author>
                <author>D. Florescu</author>
                <author>et al.</author>
Dual Buffering  "D. Kossmann"

No schema validation, no typing!

Page 50: XML Programming Techniques

XML View on Relational Data

Phantasy-People
Id    Name
4711  Wutz
911   Potter

SELECT XMLGEN(<person id="{ $Id }">{ $Name }</person>) AS Person
FROM Phantasy-People

Person
<person id="4711">Wutz</person>
<person id="911">Potter</person>

Page 51: XML Programming Techniques

XML View on XML Data

Title           Authors
XQuery 1.0      <author>D. Chamberlin</author>
                <author>D. Florescu</author>
                <author>et al.</author>
Dual Buffering  <author>D. Kossmann</author>

SELECT Title, XMLGEN(<pa>{$Authors[1]/text()}</pa>) AS PrimA
FROM MyAuthors;

Title           PrimA
XQuery 1.0      <pa>D. Chamberlin</pa>
Dual Buffering  <pa>D. Kossmann</pa>

Page 52: XML Programming Techniques

XMLAGG

SalesTable
Product  Sales
Fish     500
Bread    20
Fish     400

SELECT Product, XMLAGG(XMLELEMENT(NAME "S", Sales)) AS AllSales
FROM SalesTable
GROUP BY Product;

Product  AllSales
Fish     <S>500</S><S>400</S>
Bread    <S>20</S>
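To make the grouping semantics concrete, here is a small Python analogue of the XMLAGG query above. It is only a sketch of what the aggregate computes (one `<S>` element per row, concatenated per group), not of how a database evaluates it; the function name `xml_agg` is made up for this example.

```python
import xml.etree.ElementTree as ET

sales_table = [("Fish", 500), ("Bread", 20), ("Fish", 400)]


def xml_agg(rows):
    """Group rows by product and concatenate one <S> element per sale."""
    groups = {}
    for product, sales in rows:
        elem = ET.Element("S")
        elem.text = str(sales)
        groups.setdefault(product, []).append(
            ET.tostring(elem, encoding="unicode"))
    return {product: "".join(parts) for product, parts in groups.items()}


all_sales = xml_agg(sales_table)
```

Here `all_sales["Fish"]` holds the two concatenated `<S>` elements, matching the Fish row of the result table.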

Page 53: XML Programming Techniques

SQL/XML: 2nd Edition
XML datatype will support XQuery data model
  XML(UNTYPED CONTENT): old XML Infoset model
  XML(SEQUENCE): holds heterogeneous sequences
  ... (other parameterized types; validated data possible! Non-well-formed XML data possible, too.)
  Full XML Schema support and validation
XMLQuery() function
  Create XML content using XQuery
XMLTable() function
  Shred XML to relational data using XQuery
Mapping between SQL & XQuery data model
  XMLCAST between XML and SQL types

Page 54: XML Programming Techniques

XMLExists

SELECT Title FROM books
WHERE XMLEXISTS(Authors, //author = "et al.");

Explicit PASSING also possible (see XMLQuery)

Title           Authors
XQuery 1.0      <author>D. Chamberlin</author>
                <author>D. Florescu</author>
                <author>et al.</author>
Dual Buffering  <author>D. Kossmann</author>

Page 55: XML Programming Techniques

XMLQuery expression
SQL expression: use in SELECT for constructing XML

select XMLQuery(
         'for $i in .//PurchaseOrder
          where $i/PoNo = $j/val
          return $i//Item'
         passing p.pocol, xmlelement("val", 2100) as "j"
         returning content)
from purchaseorder p

<Item itemno="21"><Quantity>200</Quantity>..</Item>
<Item itemno="22"><Quantity>22</Quantity>..</Item>

pocol maps to the default (context) item
The XMLElement value maps to $j

Page 56: XML Programming Techniques

XMLTable construct
Used in FROM clause: translates XML into relational data
Splits up the result into SQL columns; passing is always BY REF

select items.pos, items.itemno, items.quantity
from purchaseorder p,
     XMLTable('for $i in /PurchaseOrder//Items
               where $i/Quantity > 200
               return $i'
              passing p.pocol
              columns pos for ordinality,
                      itemno number path 'ItemNo',
                      quantity number DEFAULT 0 path 'Quantity'
     ) items;

POS  ITEMNO  QUANTITY
---  ------  --------
1    21      21
2    22      0

Default value is used if the path does not return a value
Ordinality returns the sequential position
Relational columns are returned in the result
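The column mapping of XMLTable (a row expression, an ordinality column, and per-column paths with defaults) can be mimicked in a few lines of Python. This is a hedged sketch of the idea only: `xml_table`, its parameters and the sample purchase order are invented for illustration, and the row filter from the slide is left out so that the default is visible.

```python
import xml.etree.ElementTree as ET

po = ET.fromstring(
    "<PurchaseOrder><Items>"
    "<Item><ItemNo>21</ItemNo><Quantity>250</Quantity></Item>"
    "<Item><ItemNo>22</ItemNo></Item>"
    "</Items></PurchaseOrder>")


def xml_table(doc, row_path, columns):
    """Yield one tuple per node matched by row_path.

    columns is a list of (child path, converter, default); the first
    field of every tuple is the 1-based ordinality.
    """
    for pos, node in enumerate(doc.findall(row_path), start=1):
        row = [pos]
        for path, convert, default in columns:
            text = node.findtext(path)
            row.append(convert(text) if text is not None else default)
        yield tuple(row)


rows = list(xml_table(po, ".//Item",
                      [("ItemNo", int, None), ("Quantity", int, 0)]))
```

The second item has no `Quantity` child, so its quantity column falls back to the default `0`.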

Page 57: XML Programming Techniques

SQL/XML
Good
  Takes advantage of the entire SQL infrastructure (e.g. triggers, PL/SQL)
  Transactional support
  Scalability, clustering, reliability
  Global optimization (XML and relational)
  Standard implemented and supported by Microsoft, Oracle, IBM, DataDirect, etc.
Bad
  Requires data to be loaded in the database
    not good for temporary XML data
    not worth the effort for small volumes of data
    database is a complex component, hard to fit in an architectural diagram
  Blend of the two languages (SQL, XQuery) isn't natural or easy to use
  XQuery not supported entirely by database engines
    No XML updates à la XQuery yet

Page 58: XML Programming Techniques

Overview
Introduction
  Applications & Architectures
Interfaces to existing languages (Java, .NET, …)
  XML APIs: SAX, DOM, StAX
  Code generators: JAXB 2.0, XML Beans, SDO, EMF
Extensions to existing programming languages
  JavaScript (ECMA), AJAX, PHP
  SQL/XML
  Microsoft's XLinq
"Native" XML Programming Languages
  Domain-specific languages: BPEL
  Pure XML Type System: XQuery, XSLT, XQueryP
  Research: Curl, XL, XDuce, Links, XQuery!, SIMKIN
Comparison of existing solutions

Page 59: XML Programming Techniques

XLinq in .NET
http://msdn.microsoft.com/data/linq/

[Diagram layers: C# and Visual Basic; .NET Common Language Integration; Standard Query Operators; XLinq and DLinq]
XLinq: declarative access to transient XML data
DLinq: declarative access to persistent relational data

Page 60: XML Programming Techniques

XLinq: main concepts
XML type added as a basic type (C#, VB)
  Infoset, no typed data
  No support for the XML Data Model (XDM)
  Temporary, not persistent XML data
Library of basic XML manipulation functions (e.g. navigation, construction)
Basic .NET Standard Query Operators
  Collection-oriented set of operations
  Second order
  General, not XML specific
High-level syntax similar to SELECT-FROM-WHERE
Natively integrated with the language, not through APIs
Goal: eliminate the need for DOM processing

Page 61: XML Programming Techniques

.NET Standard Query Operators
Set of second-order operators
  similar to the relational algebra
Work on all ordered collections in .NET
In particular, they work on collections of XML elements
Build your own algebraic query execution plan by hand!

Page 62: XML Programming Techniques

.NET Standard Query Operators
Where(selectFunction)
  Items.Where(i => i.price < 100)
Select(mappingFunction)
  Products.Select(p => new {p.name, p.price})
SelectMany(mappingFunction)
  Customers.SelectMany(c => c.orders)
Take, Skip
  Products.OrderByDescending(p => p.price).Take(3)
TakeWhile, SkipWhile(predicate)
  Products.OrderByDescending(p => p.price).TakeWhile(p => p.price < 100)
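For readers more at home outside .NET, the operators above have close counterparts in most languages with higher-order functions. The following Python sketch shows rough analogues using only the standard library; the sample data is invented, and the mapping is an approximation rather than a statement about how the .NET operators are implemented.

```python
from itertools import chain, islice, takewhile

products = [{"name": "pen", "price": 3}, {"name": "desk", "price": 250},
            {"name": "lamp", "price": 40}, {"name": "chair", "price": 120}]
customers = [{"name": "Ann", "orders": [10, 11]},
             {"name": "Bob", "orders": [12]}]

# Where: keep the items that satisfy a predicate.
cheap = [p for p in products if p["price"] < 100]
# Select: map each item to a new shape.
names_prices = [(p["name"], p["price"]) for p in products]
# SelectMany: map each item to a collection and flatten the result.
all_orders = list(chain.from_iterable(c["orders"] for c in customers))
# OrderByDescending + Take: the three most expensive products.
by_price_desc = sorted(products, key=lambda p: p["price"], reverse=True)
top3 = list(islice(by_price_desc, 3))
# TakeWhile: stop at the first item that fails the predicate.
expensive_prefix = list(
    takewhile(lambda p: p["price"] >= 100, by_price_desc))
```

As with the .NET operators, each step takes a collection and a function and returns a collection, so the steps compose into a hand-built query plan.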

Page 63: XML Programming Techniques

.NET Standard Query Operators
Join(outer, inner, outerKeySelection, innerKeySelection, resultSelector)
  Customers.Join(orders, c => c.CustomerID, o => o.CustomerID,
                 (c, o) => new {c.name, o.Total})
GroupJoin(outer, inner, outerKeySelection, innerKeySelection, resultSelector)
  Customers.GroupJoin(orders, c => c.CustomerID, o => o.CustomerID,
                      (c, co) => new {c.name, Total = co.Sum(o => o.Total)})
OrderBy(comparisonFunct), ThenBy(comparisonFunct)
  Collection.OrderBy(…).ThenBy(…).ThenBy(…)
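The difference between Join (flat pairs) and GroupJoin (one result per outer item, with all its matches) is easy to miss in the signatures alone. The sketch below spells it out in Python as a hash-based equi-join; `join`, `group_join` and the sample data are hypothetical names chosen for this illustration, not the .NET implementation.

```python
from collections import defaultdict

customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"},
             {"id": 3, "name": "Cy"}]
orders = [{"customer_id": 1, "total": 30}, {"customer_id": 2, "total": 5},
          {"customer_id": 1, "total": 12}]


def group_join(outer, inner, outer_key, inner_key, result):
    """Pair every outer item with the list of inner items sharing its key."""
    index = defaultdict(list)
    for item in inner:
        index[inner_key(item)].append(item)
    return [result(o, index.get(outer_key(o), [])) for o in outer]


def join(outer, inner, outer_key, inner_key, result):
    """Flat equi-join: one result per matching (outer, inner) pair."""
    nested = group_join(outer, inner, outer_key, inner_key,
                        lambda o, matches: [result(o, m) for m in matches])
    return [row for rows in nested for row in rows]


pairs = join(customers, orders,
             lambda c: c["id"], lambda o: o["customer_id"],
             lambda c, o: (c["name"], o["total"]))
totals = group_join(customers, orders,
                    lambda c: c["id"], lambda o: o["customer_id"],
                    lambda c, co: (c["name"], sum(o["total"] for o in co)))
```

Note that the customer without orders disappears from `pairs` but is kept in `totals` with a sum of zero, which is the outer-join flavour of GroupJoin.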

Page 64: XML Programming Techniques

.NET Standard Query Operators
GroupBy(collection, keySelector)
GroupBy(collection, equalityComparer)
Distinct, Union, Intersect, Except
  Based on GetHashCode and Equals
ToDictionary(collection, keySelector)
  Creates a one-to-one dictionary
ToLookup(collection, keySelector)
  Creates a one-to-many dictionary
Any(collection, predicate), All(collection, predicate)
  products.Any(p => p.price > 100)
Sum, Count, Min, Max, Average, Aggregate

Page 65: XML Programming Techniques

Constructing XML data
C#, VB: (nested) functional notation
  new XMLElement("person",
      new XMLAttribute("age", 45),
      new XMLElement("name", "Patrick Hines"),
      new XMLElement("phone", "425-555-0144"))
VB 9.0: inlined XML with dynamic content
  <contact>
    <name><%myName%></name>
  </contact>

Page 66: XML Programming Techniques

A more complex example
new XMLElement("contacts", contacts.
    Where(c => c.address.city == "New York").
    OrderBy(c => c.age).
    Select(c => new XMLElement("contact",
        new XMLElement("name", c.name),
        new XMLElement("phone", c.phone))))

Linq works across data models (objects, tuples, XML)
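The same filter, sort and construct pipeline can be sketched in Python with the standard `xml.etree.ElementTree` module. This is an analogue for comparison only: the `element` helper and the sample records are assumptions made for this example, and ElementTree is a DOM-like API, so the sketch does not reproduce XLinq's language integration.

```python
import xml.etree.ElementTree as ET

contacts = [
    {"name": "Ann", "phone": "555-0101", "age": 41, "city": "New York"},
    {"name": "Bob", "phone": "555-0102", "age": 29, "city": "Boston"},
    {"name": "Cy", "phone": "555-0103", "age": 35, "city": "New York"},
]


def element(tag, *children, text=None):
    """Small functional constructor, in the spirit of nested constructors."""
    elem = ET.Element(tag)
    elem.text = text
    elem.extend(children)
    return elem


# Where + OrderBy over plain objects ...
selected = sorted((c for c in contacts if c["city"] == "New York"),
                  key=lambda c: c["age"])
# ... then Select into XML elements.
root = element("contacts", *(
    element("contact",
            element("name", text=c["name"]),
            element("phone", text=c["phone"]))
    for c in selected))
xml_text = ET.tostring(root, encoding="unicode")
```

The query part works on ordinary objects and only the last step produces XML, which is the "works across data models" point of the slide.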

Page 67: XML Programming Techniques

Navigation primitives in XLinq
Similar to the path axes in XPath 1.0
  Nodes(): retrieves all the children
  Elements(): retrieves all element children
  Elements("name"): selects child elements by name
  Attributes()
  Parent()
  Descendants()
  etc.

Page 68: XML Programming Techniques

Updating primitives in XLinq
Add()
  Adds new content to an existing XML tree
Remove()
  Deletes nodes from a tree
ReplaceContent()
  Replaces the content of a node
SetElement()
  Particular case of ReplaceContent
SetAttribute()

Page 69: XML Programming Techniques

Declarative XML querying in XLinq
Select-From-Where style syntax directly supported in C# 3.0 (no API barrier)
Can be logically mapped into a combination of query operators (see above)

from c in contacts.Elements("contact"),
     average = contacts.Elements("contact").
               Average(x => (int) x.Element("netWorth"))
where (int) c.Element("netWorth") > average
orderby (string) c.Element("name")
select c

Page 70: XML Programming Techniques

Conclusion on XLinq
Good
  Usability for .NET developers (simple tasks)
  Integration with the rest of .NET's tools and libraries
Bad
  No support for typed data
  No static analysis
    No schema-based static typing
    No optimization based on static knowledge
  Blend of imperative and declarative code problematic
    Semantics: lazy evaluation
    Semantics: error handling
    Semantics: imperative "and" and "or" are non-commutative
    Optimization: global dataflow analysis hard
    Optimization: streaming and indexing are explicit

Page 71: XML Programming Techniques

Overview
Introduction
  Applications & Architectures
Interfaces to existing languages (Java, .NET, …)
  XML APIs: SAX, DOM, StAX
  Code generators: JAXB 2.0, XML Beans, SDO, EMF
Extensions to existing programming languages
  JavaScript (ECMA), AJAX, PHP
  SQL/XML
  Microsoft's XLinq
"Native" XML Programming Languages
  Domain-specific languages: BPEL
  Pure XML Type System: XQuery, XSLT, XQueryP
  Research: Curl, XL, XDuce, Links, XQuery!, SIMKIN
Comparison of existing solutions

Page 72: XML Programming Techniques

WS-BPEL
Web Services Business Process Execution Language (version 2.0)
  OASIS, May 2006 working draft
Not a general-purpose programming language
Designed for a specific task:
  Specification of the implementation of a Web Service created by the composition and orchestration of other Web Services
Created by logically merging two previous XML programming languages
  WSFL (IBM)
  XLANG (Microsoft)
Implemented by Microsoft, Oracle, IBM, SAP

Page 73: XML Programming Techniques

WS-BPEL programs

[Diagram: messages exchanged between the process and its partners: place order, order confirmation, ship order, pickup notification, payment confirmation, receive invoice, send invoice response]

receive place order
send ship order
if (shipCompleted)
  send order notice (completed)
else
  send order notice (!completed)
receive update notification
update ship history
receive invoice
send invoice response
receive payment confirmation
send order confirmation

Page 74: XML Programming Techniques

Main concepts
Traditional workflow concepts adapted to the reality of XML and Web Services
Ports, messages and operations (WSDL)
  Describe the external interface of the process
Activities
  Describe how various components are assembled into complex execution logic
Variables
  Internal state of the program
Error and compensation handlers
  Describe the behavior in case of dynamic faults
Correlation sets
  Describe how various process instances participate in complex conversations
Scopes

Page 75: XML Programming Techniques

WS-BPEL query and expression languages
XML data model, query language and expression language are black boxes for the main language
By default Infoset (untyped data) and XPath 1.0
Uses XSLT 1.0 for data transformation (doXslTransform)
Allows other data models and languages
  XDM (XQuery Data Model)
  XPath 2.0
  XQuery

<assign>
  <copy>
    <from>$po/lineItem[@prodCode=$myProd]/amt * $exchRate</from>
    <to>$convertPO/lineItem[@prodCode=$myProd]</to>
  </copy>
</assign>

Page 76: XML Programming Techniques

WS-BPEL simple activities
assign and copy
invoke
receive
throw
wait
empty
exit
user-defined activities (extensibility mechanism)

Page 77: XML Programming Techniques

WS-BPEL structured activities
sequence
if
while
repeatUntil
pick
  selectively choosing an activity
flow
  for parallel and control dependency processing
forEach

Page 78: XML Programming Techniques

WS-BPEL active behavior
Each scope can have event handlers
They execute concurrently
They start when the parent scope starts
onEvent
  Waiting for a particular type of message
onAlarm
  for (duration value), until (specific point in time)
  repeatEvery

Page 79: XML Programming Techniques

WS-BPEL error handling
Support for Long Running Transactions
Mechanism for specifying the compensation logic (sagas)
Compensation handlers associated with scopes

Page 80: XML Programming Techniques

Compensation example
<scope>
  <compensationHandler>
    <invoke partnerLink="Seller" portType="Purchasing"
            operation="CancelPurchase" inputVariable="getResponse"
            outputVariable="getConfirmation">
      <correlations>
        <correlation set="PurchaseOrder" pattern="request"/>
      </correlations>
    </invoke>
  </compensationHandler>
  <invoke partnerLink="Seller" portType="Purchasing"
          operation="Purchase" inputVariable="sendPurchaseOrder"
          outputVariable="getResponse">
    <correlations>
      <correlation set="PurchaseOrder" pattern="request" initiate="yes"/>
    </correlations>
  </invoke>
</scope>
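The saga idea behind compensation handlers can be sketched outside BPEL. The Python below is a deliberately simplified model, written for this example only: each step carries an undo action, and when a later step faults, the undo actions of the completed steps run in reverse order. Real WS-BPEL processes trigger compensation explicitly from fault or compensation handlers; `run_saga` and the step names are hypothetical.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, run the compensations of the already completed
    steps in reverse order, then re-raise the fault.
    """
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception:
        for compensation in reversed(completed):
            compensation()
        raise


log = []


def fail_shipping():
    raise RuntimeError("shipping unavailable")


steps = [
    (lambda: log.append("Purchase"), lambda: log.append("CancelPurchase")),
    (fail_shipping, lambda: log.append("CancelShipping")),
]

try:
    run_saga(steps)
except RuntimeError:
    log.append("fault propagated")
```

Only the completed purchase is compensated; the failed shipping step never registered its undo action, so `CancelShipping` is not run.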

Page 81: XML Programming Techniques

WS-BPEL conclusion
Good
  Easy specification of Web Services orchestration
  High level
  Useful constructs (parallelism, compensation, events, etc.)
Bad
  Separation between control flow and expression/query language
    Impact on static typing, automatic optimization, usability

Page 82: XML Programming Techniques

Overview
Introduction
  Applications & Architectures
Interfaces to existing languages (Java, .NET, …)
  XML APIs: SAX, DOM, StAX
  Code generators: JAXB 2.0, XML Beans, SDO, EMF
Extensions to existing programming languages
  JavaScript (ECMA), AJAX, PHP
  SQL/XML
  Microsoft's XLinq
"Native" XML Programming Languages
  Domain-specific languages: BPEL
  Pure XML Type System: XQuery, XSLT, XQueryP
  Research: Curl, XL, XDuce, Links, XQuery!, SIMKIN
Comparison of existing solutions

Page 83: XML Programming Techniques

W3C: XQuery, XPath, XSLT

[Diagram, timeline from 1999 to 2006]
XSLT 1.0 uses XPath 1.0 as a sublanguage
XPath 2.0 extends XPath 1.0, almost backwards compatible
XSLT 2.0 uses XPath 2.0 as a sublanguage
XQuery 1.0 extends XPath 2.0 with FLWOR expressions, node constructors, validation

Page 84: XML Programming Techniques

XQuery 1.0 vs. XSLT 2.0
Equivalent expressive power
Same data model, type system, function library
Different programming paradigms
  Iteration-based for XQuery
  Recursive template-based for XSLT
Two different syntaxes for the same language
  XQuery easier to use when shape of the data is known
  XSLT easier to use when shape of the data is unknown
Implementations often use the same runtime for both
  Oracle, Saxon
Better language integration in the future

[Diagram: XQuery and XSLT 2.0 sit on top of XPath 2.0, the Function Library, the XML Type System (XML Schema) and the XML Data Model (XDM)]

Page 85: XML Programming Techniques

XML Data Model (XDM)
Abstract (i.e. logical) data model for XML data
Same role for XPath 2.0, XQuery and XSLT 2.0 as the relational data model for SQL
Purely logical: no standard storage or access model (on purpose)
XQuery, XPath 2.0 and XSLT 2.0 are closed with respect to XDM

[Diagram: Infoset and PSVI map to the XML Data Model, on which XQuery, XPath 2.0 and XSLT 2.0 operate]

Page 86: XML Programming Techniques

XML Data Model (XDM)
Instance of the data model:
  a sequence composed of zero or more items
The empty sequence is often considered as the "null value"
Items
  nodes or atomic values
Nodes
  document | element | attribute | text | namespaces | PI | comment
Atomic values
  Instances of all XML Schema atomic types
    string, boolean, ID, IDREF, decimal, QName, URI, ...
  untyped atomic values
Typed (i.e. schema validated) and untyped (i.e. non schema validated) nodes and values

Remember Lisp?
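One property of XDM sequences that the Lisp comparison can hide is that they are flat: a sequence never contains another sequence, and a single item is indistinguishable from a sequence of length one. That behaviour comes from the XDM specification rather than from the slide. The toy Python model below illustrates it; the `sequence` helper is invented for this sketch and uses tuples to stand in for XDM sequences.

```python
def sequence(*parts):
    """Build a flat sequence: nested sequences are spliced in, never nested."""
    flat = []
    for part in parts:
        if isinstance(part, (list, tuple)):
            flat.extend(sequence(*part))
        else:
            flat.append(part)
    return tuple(flat)


empty = sequence()
# Corresponds to the XQuery expression (1, (2, 3), (), "a").
nested = sequence(1, sequence(2, 3), sequence(), "a")
singleton = sequence(42)
```

Wrapping `singleton` in another `sequence(...)` call gives back the same one-item sequence, and the empty sequence simply vanishes when spliced into a larger one.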

Page 87: XML Programming Techniques

XPath 2.0/XQuery/XSLT 2.0 type system
Types are imported from XML Schemas
Standard static typing for XQuery and XPath 2.0
  Optional feature
  Pessimistic/conservative
XSLT 2.0 has no standard static typing rules
  Dynamic dispatch makes dataflow analysis very hard
The goal of the type system is to:
  1. statically detect errors in the queries
  2. infer the type of the result of valid queries
  3. ensure statically that the result of a given query is of a given (expected) type if the input dataset is guaranteed to be of a given type

Page 88: XML Programming Techniques

What is XQuery?
A programming language that can express arbitrary XML to XML data transformations
  Logical/physical data independence
  Declarative
  Side-effect free
  Strongly typed language
"An expression language for XML."
Such expressions are embeddable in a variety of environments (programming languages, APIs, etc.)

Page 89: XML Programming Techniques

XQuery vs. SQL

[Diagram labels: SQL, XQuery; transacted data, declarative processing, large volume, persistent data]

SQL works on the relational data model.
XQuery works on the XML Data Model (XDM).
"XQuery: the XML replacement for SQL?" No. XQuery is not a query language, but a declarative programming language.

Page 90: XML Programming Techniques

XQuery programs
An XQuery program:
  a prolog + an expression
Role of the prolog:
  Populate the context where the expression is compiled and evaluated
Prolog contains:
  namespace definitions
  schema imports
  default element and function namespace
  function definitions
  collation declarations
  function library imports
  global and external variable definitions, etc.
The prolog is the link between the XQuery expression and the environment where the expression is embedded

Page 91: XML Programming Techniques

XQuery expressions
XQuery Expr := Constants | Variable | FunctionCalls | PathExpr |
               ComparisonExpr | ArithmeticExpr | LogicExpr |
               FLWORExpr | ConditionalExpr | QuantifiedExpr |
               TypeSwitchExpr | InstanceofExpr | CastExpr |
               UnionExpr | IntersectExceptExpr |
               ConstructorExpr | ValidateExpr

Expressions can be nested with full generality!
Functional programming heritage.

Page 92: XML Programming Techniques

Path expressions

doc("bibliography.xml")/bib
$x/child::bib/child::book/@year
$x/parent::*
$x/child::*/descendant::comment()
$x/child::element(*, ns:PoType)
$x/attribute::attribute(*, xs:integer)
$x/ancestor::document-node(schema-element(ns:PO))
$x/(child::element(*, xs:date) | attribute::attribute(*, xs:date))
$x/f(.)
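
For readers coming from the "Java + API" side of the comparison, the navigation steps above can be approximated with a general-purpose library. The following is a minimal sketch in Python using the standard `xml.etree.ElementTree` module, which supports only a small XPath subset (no typed tests such as `element(*, ns:PoType)`). The sample document and variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for bibliography.xml
BIB = """
<bib>
  <book year="1967"><title>The politics of experience</title></book>
  <book year="1994"><title>TCP/IP Illustrated</title></book>
</bib>
"""

x = ET.fromstring(BIB)  # plays the role of $x, bound to the <bib> element

# $x/child::book/@year
years = [book.get("year") for book in x.findall("book")]

# $x/descendant::title
titles = [t.text for t in x.iter("title")]

# $x/child::book[@year = "1994"]/child::title
recent = x.findtext("book[@year='1994']/title")

# $x/child::book/child::title/parent::*  (".." is resolved inside the path)
parents = [p.tag for p in x.findall("book/title/..")]
```

Each path step becomes an API call here, which is the "XML as one of the thousands of APIs" style the tutorial contrasts with native path expressions.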

Page 93: XML Programming Techniques

FLWOR expressions

Similar to the Select-From-Where of SQL
Clauses: FOR, LET, WHERE, ORDER BY, RETURN

Example
  for $x in //bib/book                             (: similar to FROM in SQL :)
  let $y := $x/author                              (: no analogy in SQL :)
  where $x/title = "The politics of experience"    (: similar to WHERE in SQL :)
  order by $x/year                                 (: similar to the ORDER BY clause :)
  return count($y)                                 (: similar to SELECT in SQL :)

[Figure: FLWOR syntax diagram: FOR var IN expr, LET var := expr, WHERE expr, ORDER expr, RETURN expr]
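
The clause-by-clause analogy can also be spelled out imperatively. The sketch below mirrors the FLWOR example in Python with `xml.etree.ElementTree`; the sample data, the author names and the function name `author_counts` are assumptions made for illustration, not part of the tutorial.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two editions of the same title plus one other book
BIB = """
<bib>
  <book><title>The politics of experience</title><year>1990</year>
        <author>A</author><author>B</author></book>
  <book><title>The politics of experience</title><year>1967</year>
        <author>Laing</author></book>
  <book><title>Other</title><year>2000</year><author>X</author></book>
</bib>
"""

def author_counts(bib, wanted_title):
    rows = []
    for x in bib.findall("book"):                    # for $x in //bib/book
        y = x.findall("author")                      # let $y := $x/author
        if x.findtext("title") == wanted_title:      # where $x/title = "..."
            rows.append((int(x.findtext("year")), len(y)))
    rows.sort(key=lambda row: row[0])                # order by $x/year
    return [count for _, count in rows]              # return count($y)

counts = author_counts(ET.fromstring(BIB), "The politics of experience")
```

The imperative version fixes the evaluation strategy (scan, filter, sort), whereas the FLWOR expression leaves that choice to the XQuery processor.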

Page 94: XML Programming Techniques

Node constructors

Constructing new nodes:
  Elements, attributes, documents, processing instructions, comments, text

Constant vs. dynamically evaluated content
  <result>
    literal text content
  </result>

  <result>
    { $x/name }
  </result>

  <result>
    some content here {$x/text()} and some more here
  </result>
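
For comparison, here is a rough sketch of the same three constructors using Python's `xml.etree.ElementTree`. The input element bound to `x` is an assumption, and the third case uses the text of `<name>` rather than `$x/text()` to keep the sample small.

```python
import copy
import xml.etree.ElementTree as ET

x = ET.fromstring("<person><name>Jim</name></person>")  # hypothetical $x

# <result>literal text content</result>
constant = ET.Element("result")
constant.text = "literal text content"

# <result>{ $x/name }</result>: the selected nodes are copied into the new element
dynamic = ET.Element("result")
for name in x.findall("name"):
    dynamic.append(copy.deepcopy(name))

# <result>some content here {$x/name/text()} and some more here</result>
mixed = ET.Element("result")
mixed.text = "some content here " + x.findtext("name") + " and some more here"

serialized = [ET.tostring(e, encoding="unicode") for e in (constant, dynamic, mixed)]
```

The explicit `deepcopy` makes visible what the XQuery constructor does implicitly: constructed content is a copy, so later changes to `x` do not affect the result.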

Page 95: XML Programming Techniques

Functions in XQuery

In-place XQuery functions
  declare function ns:foo($x as xs:integer) as element()
  { <a>{ $x + 1 }</a> };
Can be recursive and mutually recursive
Support for external functions
Support for library of modules

XQuery functions play the role of database views

Page 96: XML Programming Techniques

Dynamic dispatch in XSLT

The order in which templates are applied depends on the data
Very useful while dealing with irregular XML structures

<xsl:template match="/">
  <axsl:stylesheet version="2.0">
    <xsl:apply-templates/>
  </axsl:stylesheet>
</xsl:template>
<xsl:template match="elements">
  <axsl:template match="/">
    <axsl:comment select="system-property('xsl:version')"/>
    <axsl:apply-templates/>
  </axsl:template>
</xsl:template>
<xsl:template match="block">
  <axsl:template match="{.}">
    <fo:block> <axsl:apply-templates/> </fo:block>
  </axsl:template>
</xsl:template>
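
The idea of data-driven dispatch can be illustrated outside XSLT. The sketch below is a simplified Python analogue, not an XSLT implementation: a dictionary maps element names to handler functions, and unknown elements fall back to a default rule that just recurses. The element names `para`, `emph` and `note` and the output markup are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def apply_templates(node):
    """Rough analogue of <xsl:apply-templates/>: visit children in document order."""
    parts = [node.text or ""]
    for child in node:
        # The template is chosen by the data (the child's tag), not by call order.
        template = TEMPLATES.get(child.tag, apply_templates)
        parts.append(template(child))
        parts.append(child.tail or "")
    return "".join(parts)

def para(node):
    return "<p>" + apply_templates(node) + "</p>"

def emph(node):
    return "<b>" + apply_templates(node) + "</b>"

# Hypothetical "stylesheet": element name -> template function
TEMPLATES = {"para": para, "emph": emph}

doc = ET.fromstring(
    "<doc><para>Hello <emph>big</emph> world</para><note>plain</note></doc>"
)
html = apply_templates(doc)
```

Because the handler is looked up at run time from the input, the sequence of calls cannot be known from the program text alone, which is the property that later slides cite as making static typing and optimization hard for XSLT.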

Page 97: XML Programming Techniques

XQuery/XPath 2.0 Full Text

XML data frequently contains text
The XQuery/XPath 2.0 Full Text extension provides search capabilities
Use case example: RSS/blog filtering
FTSelections: a special kind of Boolean predicate
  Operators (words, and, or, not, mild not, order, scope, distance, window, times)
  Match options
    Case, diacritics, stemming, thesauri, stop words, language, wildcards
  Scoring

Page 98: XML Programming Techniques

XQuery Full Text Example

for $book in doc("http://bstore1.example.com/full-text.xml")/books/book
let $title := $book/metadata/title[. ftcontains "improving" && "usability"
                                   distance at most 2 words ordered at start]
where count($title) > 0
return $title
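
To make the predicate concrete, the sketch below approximates it in Python on plain strings. It is only a rough reading of the operators: "at start" is taken to mean the first token is "improving", "ordered" that "usability" comes after it, and "distance at most 2 words" that at most two tokens lie between them. Tokenization by `\w+` and case folding are assumptions; stemming, stop words and the other match options are ignored. The function names are hypothetical.

```python
import re

def tokens(text):
    # Simplistic tokenizer: lower-cased runs of word characters
    return re.findall(r"\w+", text.lower())

def title_matches(title, first="improving", second="usability", max_gap=2):
    toks = tokens(title)
    # "at start": the match must begin at the first token
    if not toks or toks[0] != first:
        return False
    # "ordered" + "distance at most 2 words": the second word must appear
    # after the first with at most max_gap tokens in between
    return second in toks[1:max_gap + 2]

# Hypothetical titles
hits = [t for t in [
    "Improving Web Site Usability",
    "On Improving Usability",
    "Improving products for better overall usability",
] if title_matches(t)]
```

A real full-text engine would evaluate the same selection against an index rather than by scanning tokens, and could also return a relevance score.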

Page 99: XML Programming Techniques

XML Update Facility

XML Update Facility: W3C Working Draft
Ability to modify nodes in an XDM instance in a declarative fashion
Primitive update operations
  insert <age>24</age> into $person[name="Jim"]
  delete $book[@year<2000]
  rename $article as "publication"
  replace ($books/book)[1] with <book>….</book>
  replace value of $title with "New Title"
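
As a point of comparison, the primitives map onto in-place tree mutations in an API-based style. The following sketch uses Python's `xml.etree.ElementTree`; the sample document and the element name `employee` are assumptions, and unlike the Update Facility each change here takes effect immediately rather than at the end of a snapshot.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data
doc = ET.fromstring(
    "<people><person><name>Jim</name></person>"
    "<person><name>Ann</name></person></people>"
)

# insert <age>24</age> into $person[name="Jim"]
for person in doc.findall("person[name='Jim']"):
    ET.SubElement(person, "age").text = "24"

# rename $person as "employee"
for person in doc.findall("person"):
    person.tag = "employee"

# replace value of $name with "James" (first employee only)
doc.find("employee/name").text = "James"

# delete $employee[name="Ann"]
for employee in doc.findall("employee[name='Ann']"):
    doc.remove(employee)

result = ET.tostring(doc, encoding="unicode")
```

Note that deletion needs the parent (`doc.remove`), a detail the declarative `delete` expression hides.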

Page 100: XML Programming Techniques

XML Update Facility (2)

Conditional updates
  if ($book/year < 2000)
  then delete $book/year
  else rename $book/year as "publicationTime"
Collection-oriented updates
  for $x in $book
  where $x/year < 2000
  do rename $x as "oldBook"
XML transformations using the update syntax
Single snapshot query

Page 101: XML Programming Techniques

Overview

Introduction
  Applications & Architectures
Interfaces to existing languages (Java, .NET, …)
  XML APIs: SAX, DOM, StAX
  Code generators: JAXB 2.0, XML Beans, SDO, EMF
Extensions to existing programming languages
  JavaScript (ECMA), AJAX, PHP
  SQL/XML
  Microsoft's XLinq
“Native” XML Programming Languages
  Domain-specific languages: BPEL
  Pure XML Type System: XQuery, XSLT, XQueryP
  Research: Curl, XL, XDuce, Links, XQuery!, SIMKIN
Comparison of existing solutions

Page 102: XML Programming Techniques

Procedural extensions to XQuery

Very controversial topic
Old research
  XL project (Florescu, Kossmann, 2001)
New research
  XQuery! (Simeon, Ghelli)
  XQueryP (Carey, Chamberlin, Kossmann, Florescu, Robie)
Industrial pressure
  E.g. MarkLogic's XML application development platform
Long history of adding control flow logic to query languages
  More than 15 years of success of PL/SQL and other procedural extensions for SQL
  SQL might have failed otherwise!

Page 103: XML Programming Techniques

What functionalities are missing in XQuery (after adding updates)?

The ability to “see” the results of side-effects during the computation
The ability to invoke external computations that cannot participate in a snapshot semantics
The ability to preserve state during computation
The ability to recover (in a controlled way) from dynamic errors

Page 104: XML Programming Techniques

XQueryP proposal

Submitted by several companies to W3C
  Oracle, BEA, DataDirect, etc.
Under consideration for standardization
Surprisingly: very small extensions to XQuery can satisfy many new use case scenarios (unfortunately not all)

Page 105: XML Programming Techniques

The XQueryP technical proposal

A well-defined evaluation order for XQuery expressions (“sequential order”)
  Paradigm shift for the database people
  Does not mean that optimizability is reduced!
Reduce the granularity of the snapshot to each individual atomic update expression
Adds three new kinds of expressions:
  Block
  Set
  While

Page 106: XML Programming Techniques

(1) Sequential evaluation order

Slight modification to existing rules:
  FLWOR: the FLWO clauses are evaluated first and result in a tuple stream; then the Return clause is evaluated in order for each tuple. Side-effects made by one row are visible to the subsequent rows.
  COMMA: subexpressions are evaluated in order
  (UPDATING) FUNCTION CALL: arguments are evaluated first, before the body gets evaluated
Required (only) if we add side-effects immediately visible to the program, e.g. variable assignments or single snapshot atomic updates; otherwise the semantics is not deterministic.

Page 107: XML Programming Techniques

(2) Reduce snapshot granularity

Today's update snapshot: the entire query
Change:
  Every single atomic update expression (insert, delete, rename, replace) is executed and made effective immediately
Semantics is deterministic because of the sequential evaluation order (point 1)
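
The difference between the two granularities can be simulated in a few lines. The sketch below is a toy model in Python, not an XQuery engine: in snapshot mode, updates are collected in a pending list and applied when the "query" ends; in immediate mode, each update is applied as soon as it is issued. The function name and the `item` element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def count_then_insert(doc, immediate):
    """Insert two <item> children, recording count(item) before each insert."""
    pending = []   # pending update list, used only in snapshot mode
    seen = []
    for _ in range(2):
        seen.append(len(doc.findall("item")))
        update = lambda: ET.SubElement(doc, "item")
        if immediate:
            update()                 # effective immediately (XQueryP style)
        else:
            pending.append(update)   # deferred to the end of the snapshot
    for update in pending:
        update()
    return seen

snapshot_view = count_then_insert(ET.Element("list"), immediate=False)
immediate_view = count_then_insert(ET.Element("list"), immediate=True)
```

Both runs end with two items, but only the immediate run observes its own first insert, which is why a fixed evaluation order becomes necessary.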

Page 108: XML Programming Techniques

(3) Adding new expressions

Block expressions
Assignment expressions
While expressions

Page 109: XML Programming Techniques

Block expression

Syntax:
  "{" (BlockDecl ";")* Expr (";" Expr)* "}"
  BlockDecl := ("declare" $VarName TypeDecl? (":=" ExprSingle)?)?
               ("," $VarName TypeDecl? (":=" ExprSingle)?)*
Semantics:
  Declare a set of updatable variables, whose scope is only the block expression (in order)
  Evaluate each expression (in order) and make the effects visible immediately
  Return the value of the last expression
Updating if the body contains an updating expression

Page 110: XML Programming Techniques

Assignment expression

Syntax:
  "set" $VarName ":=" ExprSingle
Semantics:
  Change the value of the variable
  The variable has to be external or declared in a block (not bound by let, for or typeswitch)
Updating expression
Semantics is deterministic because of the sequential evaluation order

Page 111: XML Programming Techniques

While expression

Syntax:
  "while" "(" ExprSingle ")" "return" Expr
Semantics:
  Evaluate the test condition
  If “true”, evaluate the return clause; repeat
  If “false”, return the concatenation of the values returned by all previous evaluations of return
Syntactic sugar, mostly for convenience
  Could be written using recursive functions
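
The stated semantics and the claim that the construct is syntactic sugar can both be sketched in Python. Here `test` and `body` are assumed to be zero-argument callables that share some mutable state, and sequences are modeled as Python lists; all names are hypothetical.

```python
def while_expr(test, body):
    """Loop form: concatenate the values of every evaluation of the return clause."""
    result = []
    while test():
        result.extend(body())
    return result

def while_rec(test, body):
    """Equivalent recursive form, showing that 'while' is only a convenience."""
    if not test():
        return []
    first = body()                 # evaluate the return clause once...
    return first + while_rec(test, body)   # ...then repeat on the new state

def make_counter(limit):
    # Hypothetical stateful test/body pair: counts from 1 up to limit
    state = {"i": 0}
    def test():
        return state["i"] < limit
    def body():
        state["i"] += 1
        return [state["i"]]
    return test, body

loop_result = while_expr(*make_counter(3))
rec_result = while_rec(*make_counter(3))
```

Both forms only make sense once side-effects are visible between iterations; otherwise the test could never change its value.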

Page 112: XML Programming Techniques

Atomic Blocks

Syntax:
  "atomic" "{" . . . "}"
Semantics:
  If the evaluation of Expr does not raise errors, then the result is returned
  If the evaluation of Expr raises a dynamic error, then no partial side-effects are performed (all are rolled back) and the result is the error
Only the largest atomic scope is effective
Note: XQuery! had a similar construct
  Snap {…} vs. atomic {…}
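
The all-or-nothing behavior can be imitated on an in-memory tree. The sketch below is one naive way to do it in Python, assuming the whole document fits in memory: take a deep copy before running the body and restore it if an exception escapes. The helper name `atomic` and the sample data are assumptions; a real engine would more likely log or defer updates than copy the document.

```python
import copy
import xml.etree.ElementTree as ET

def atomic(doc, body):
    """Run body(doc); on error, undo every change made to doc and re-raise."""
    backup = copy.deepcopy(doc)
    try:
        return body(doc)
    except Exception:
        # Roll back: reset the element, then restore its saved state
        doc.clear()
        doc.attrib.update(backup.attrib)
        doc.text, doc.tail = backup.text, backup.tail
        doc.extend(list(backup))
        raise

def failing_update(d):
    d.remove(d[0])                       # partial side-effect...
    raise ValueError("dynamic error")    # ...followed by a failure

mail = ET.fromstring("<mail><message/><message/></mail>")
try:
    atomic(mail, failing_update)
    error_seen = False
except ValueError:
    error_seen = True
messages_after_failure = len(mail)
```

The error still reaches the caller, matching the slide's "the result is the error", while the document is left as it was before the block.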

Page 113: XML Programming Techniques

XQueryP: example

declare updating function local:prune($d as xs:integer) as xs:integer
{
  declare $count as xs:integer := 0;
  for $m in /mail/message[date lt $d]
  return { do delete $m;
           set $count := $count + 1 };
  $count
}

Page 114: XML Programming Techniques

More complex example

declare updating function myNs:cumCost($projects) as element()*
{
  declare $total-cost as xs:decimal := 0;
  for $p in $projects[year eq 2005]
  return
    { set $total-cost := $total-cost + $p/cost;
      <project>
        <name>{$p/name}</name>
        <cost>{$p/cost}</cost>
        <cumCost>{$total-cost}</cumCost>
      </project>
    }
}
Today: additional self join, or recursive function
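
For readers more familiar with imperative languages, the running-total pattern looks as follows in Python. This is a sketch under stated assumptions: the input is a list of `<project>` elements with `name`, `year` and `cost` children, the sample values are invented, and text values are copied instead of whole nodes.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

def cum_cost(projects):
    total = Decimal(0)                               # declare $total-cost := 0
    out = []
    for p in projects:
        if p.findtext("year") != "2005":             # $projects[year eq 2005]
            continue
        total += Decimal(p.findtext("cost"))         # set $total-cost := ...
        project = ET.Element("project")
        ET.SubElement(project, "name").text = p.findtext("name")
        ET.SubElement(project, "cost").text = p.findtext("cost")
        ET.SubElement(project, "cumCost").text = str(total)
        out.append(project)
    return out

# Hypothetical sample data
SAMPLE = ET.fromstring(
    "<projects>"
    "<project><name>A</name><year>2005</year><cost>10.5</cost></project>"
    "<project><name>B</name><year>2004</year><cost>99</cost></project>"
    "<project><name>C</name><year>2005</year><cost>4.5</cost></project>"
    "</projects>"
)
cumulative = [p.findtext("cumCost") for p in cum_cost(SAMPLE.findall("project"))]
```

The mutable `total` is exactly the state that plain XQuery cannot carry across iterations, hence the self join or recursive function mentioned on the slide.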

Page 115: XML Programming Techniques

XQueryP conclusion

If successful, can provide a platform for building XML-only applications
  No more SQL, no more Java/C#
Declarative programming and usability
  Good: less code, higher level
  Bad: fewer programmers can do it, harder debugging
Automatic optimization
  Compilers will be very complex to build
  Better chances of success

Page 116: XML Programming Techniques

Research projects

XL
  Web Services implementation
XDuce
  Static typing, pattern matching
Links
  XML programming without tiers
XQuery!
  Make XQuery fully compositional with side-effects
  User-controlled granularity for snapshots

Page 117: XML Programming Techniques

Overview

Introduction
  Applications & Architectures
Interfaces to existing languages (Java, .NET, …)
  XML APIs: SAX, DOM, StAX
  Code generators: JAXB 2.0, XML Beans, SDO, EMF
Extensions to existing programming languages
  JavaScript (ECMA), AJAX, PHP
  SQL/XML
  Microsoft's XLinq
“Native” XML Programming Languages
  Domain-specific languages: BPEL
  Pure XML Type System: XQuery, XSLT, XQueryP
  Research: Curl, XL, XDuce, Links, XQuery!, SIMKIN
Comparison of existing solutions

Page 118: XML Programming Techniques

XML programming: for what kind of application?

Simple XML serialization for communication (XML at the end)
  XLinq, Java+APIs
Web distributed XML communication
  Ajax
Complex XML computations (HealthCare7, XBRL)
  XQuery, XQueryP, XLinq
Orchestration of Web Service messages
  BPEL
Process a mix of relational and XML data
  SQL/XML
Formatting XML content
  XSLT
Unfortunately, many (most) applications have several of those needs at the same time!
Changing paradigms is very costly

Page 119: XML Programming Techniques

What community; what background?

XML is a unifying factor for various CS communities
For the moment, each community wrongly believes it is solving “the XML problem”
The global XML picture is missing in each community

[Figure: XML at the center of four communities: Programming languages, Databases, Workflow, Content management]

Page 120: XML Programming Techniques

XML programming: where in the architecture?

What tier in the architecture?
  Client, server, middle tier?
Same language on all the tiers?
  XQuery can run on all tiers
  ECMAScript and PHP weren't designed to scale on a large server, but for the middle tier
  Which one will run on a mobile phone?
XML might have an impact on the existence of the existing multi-tiered architectures

[Figure: architecture tiers: Storage (supports XML), Application logic (Java/C#/PHP), Communication (XML), Client (XHTML, scripts)]

Page 121: XML Programming Techniques

Programming style

All styles:
  Imperative programming + APIs (Java + DOM/SAX)
  Declarative (XQuery, XQueryP)
  Imperative + declarative (XLinq)
  Workflow (BPEL)
  Recursive template (XSLT)
Choice:
  Usability: based on what people are already used to doing
  Performance: declarative is easier to optimize
None of those alternatives provides a “complete” XML programming solution
  All will evolve in the future
  Which one will provide all the functionality required?

Page 122: XML Programming Techniques

How much weight does XML have in the language?

One of the thousands of APIs
  E.g. Java + DOM
  Language agnostic to the existence of XML
More serious: syntactic extension
  XLinq, SQL/XML
  XML is one feature among others in the language
Nothing but XML
  XQuery, XPath, XSLT, XQueryP, BPEL
  Try to process real XML (complex or not, good or bad), not to simplify it or “fix” it
  XML is a given

Page 123: XML Programming Techniques

Compliance to the W3C family of standards

XML is not an “orphan”; it comes with an Italian-style family of W3C standards
  Infoset, Namespaces, XML Schema, XLink, XForms, XHTML, binary XML, etc.
  Forced to live well together by W3C rules
When you marry XML, do you marry the whole family as such?
  Yes: XQuery, XSLT
    Good: less friction within the XML world
    Bad: complexity, bad design choices
  Choose a subset of relatives, ignore the others
    Most solutions avoid XML Schema
  Try to change the family
    XLinq trying to improve the namespaces, or fix the XML data model

Page 124: XML Programming Techniques

How do we get from here to there? The evolution style

Low-level disruption
  Mixing old programming paradigms with new ones via APIs
    E.g. XSLT and/or XQuery from Java, DOM or SAX from Java
First-degree disruption (affects the compiler writers)
  Add native XML support to existing languages
    Native XML processing in SQL, C#, VB
Second-degree disruption (affects users)
  Adapt existing concepts to XML
    BPEL
High-level disruption (affects everybody)
  Replace existing programming solutions with new solutions
    XQueryP
  Potentially the biggest gain in the long term

Page 125: XML Programming Techniques

XML programming: for what kind of XML data?

Volume
Persistent vs. temporary data
Distributed vs. centralized data
Structured vs. unstructured data
Typed vs. untyped data
Read-only vs. append-only vs. updatable data

Page 126: XML Programming Techniques

Programming for which XML data model?

XML is “just syntax”
No single, clear, standard data model
  Infoset, PSVI, XDM, etc.
Different programming languages choose a different underlying “XML data model”
  XQuery, XSLT, XPath: XDM
  Java + API: Infoset/DOM
  JavaScript: Infoset
  XLinq: proprietary data model
  BPEL: agnostic to the data model

Page 127: XML Programming Techniques

Is static typing helping productivity?

Very religious topic
Should we do static type verification of programs?
  “Static type-based verification is evil”
    XML is for schema-less data
    Why bind the programs to a certain schema?
    PHP and the dynamically typed languages
  “It is impossible to program without static type verification”
    Guarantee that programs will not raise dynamic errors
  Sometimes it is simply impossible or hard
    XSLT, because of the dynamic dispatch nature
    BPEL, because of the separation between control flow and expressions
Additional questions:
  What schema languages?
    XML Schema? RelaxNG? DTD? Proprietary?
  Should the verification be pessimistic or optimistic?
  Should it be a standard feature or implementation-defined?

Page 128: XML Programming Techniques

What kind of computations?

Filter, project, join, create new XML data
  All of the solutions
Update the data
  All except XSLT and XPath
Full-text search
  XQuery, XPath, XSLT and SQL/XML
Error handling
  Try/catch: Java+APIs, XLinq
  Compensation behavior: BPEL
  No error handling in XQuery, XPath, XSLT
Events, alarms and triggers
  SQL/XML, BPEL
Dynamic dispatch based on the structure of the data: XSLT

Page 129: XML Programming Techniques

Unsolved programming requirements

Continuous XML processing
  XML streams, RSS
Semantic XML querying
  //~person, not //person
  Solved by other data models/programming paradigms: RDF/OWL
Integrity constraints and assertions

Page 130: XML Programming Techniques

Automatic optimization

In which programming paradigms is it possible to:
  Do physical reorganization of the data without changing the code?
  Do automatic caching? Cache invalidation?
  Do automatic code parallelization?
  Decide automatically between code shipping and data shipping in a distributed architecture?
  Derive data statistics and do code cost estimates?
  Do code rewritings based on assertions and invariants?
Most of them require global dataflow analysis
  Not easy in XLinq because of the mix of imperative and declarative
  Hard in XSLT because of the dynamic dispatch
  Hard in BPEL because of the control flow/expression separation
The lower the level of abstraction of the language, the harder it is
  Physical aspects of the execution are often hard-coded (which index, streaming or not, where to execute, in which order, etc.)
Controversial topic: is automatic optimization even a good thing!?

Page 131: XML Programming Techniques

Syntax: to XML or not to XML?

Full spectrum of choices
  No XML syntax (XLinq, PHP)
  XML syntax only for the node constructors (VB)
  Partial XML syntax (BPEL, XSLT: XPath is unparsed)
  Dual parallel syntaxes: non-XML + all-XML (XQuery/XQueryX)
Advantages of an XML syntax
  Code becomes data
    Can be stored, indexed, verified, queried, transformed, updated with the same programming paradigms
    The same tools (editors, etc.) can be used
    Several code processing tasks can be more easily automated (e.g. code generation, code rewriting)
  No more custom parsers, only custom grammars
Usability problems…

Page 132: XML Programming Techniques

Conclusion

XML is here to stay. Clear advantages.
Pervasive across architectures, vertical industries, CS fields.
Programming for XML is very different from programming for other data models.
Severe industrial problem today.
  Productivity + performance
Many possible solutions.
No clear general “winner”, now or in the future.
The database community can bring the notion of declarativity and automatic optimization.