Painless OO XML with XML::Pastor - 2009 Remix
DESCRIPTION
How to build Perl classes with roundtrip data binding to XML, painlessly, using W3C XML Schema and XML::Pastor.
Slides from a previous revision of this talk are online at: http://www.slideshare.net/joelbernstein/painless-oo-xml-with-xmlpastorq-presentation/
I will be presenting an expanded, more practical, 2009 version of this talk. Now with more code and less theory!
- XML is hard, right? Some things which are hard.
- XML data binding
- Comparisons of modules
- XML::Twig
- XML::Smart
- XML::Simple
- XML::Pastor
- Pastor howto
- XML schema inference
- Trang, Relaxer
- Relaxer howto
- The future?
For more information on XML::Pastor see: http://search.cpan.org/~aulusoy/XML-Pastor/
Relaxer download: http://www.relaxer.jp/download/relaxer-1.0.zip
Relaxer book (Japanese...): http://www.amazon.co.jp/exec/obidos/ASIN/4894715279/
Trang: http://www.thaiopensource.com/download/trang-20030619.zip
TRANSCRIPT
Painless OO <-> XML with XML::Pastor
(2009 remix)
Joel Bernstein
YAPC::EU 2009
It’s all Greek to me
schema: σχήμα (skhēma), shape, plan
How many of you?
• Use XML
• Hate XML
• Like XML
A Confession
• I do not like XML
• People use it wrong
XML Data Binding
• Binding XML documents to objects specifically designed for the data in those documents.
• I often have to do this.
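To make the idea of data binding concrete, here is a minimal, language-neutral sketch written with Python's standard library rather than Perl. It is not XML::Pastor's API; the `Album` class and the element and attribute names are assumptions invented for illustration. It contrasts walking the parsed tree directly with reading fields from an object designed for the data.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample document; the names are invented for this sketch.
ALBUM_XML = '<album year="1969"><artist>The Beatles</artist><title>Abbey Road</title></album>'


@dataclass
class Album:
    """An object shaped like the data, not like the markup."""
    artist: str
    title: str
    year: int

    @classmethod
    def from_xml(cls, text: str) -> "Album":
        root = ET.fromstring(text)
        # Knowledge of "element vs. attribute" lives in this one place.
        return cls(
            artist=root.findtext("artist"),
            title=root.findtext("title"),
            year=int(root.get("year")),
        )


# Without binding: every caller has to know the document structure.
root = ET.fromstring(ALBUM_XML)
print(root.find("title").text, root.get("year"))

# With binding: callers only see plain fields.
album = Album.from_xml(ALBUM_XML)
print(album.title, album.year)
```

The point of the second form is that a change in the markup only has to be handled inside `from_xml`, not in every caller.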
XML is hard, right?
Some hard things:
• Roundtripping data
• Manipulating XML via DOM API
• Preserving element sibling order, comments, XML entities etc.
Typical horrendous XML document
Sales Order XML Logical data model
XML DOM
I shouldn’t need to care about this
How this makes me feel:
Fundamental problem
• I don’t think in elements and attributes
• I think about my data, not how it’s stored
• This is Perl. DWIM.
Solution
Tools should make both the syntax and the details of the manipulation of XML invisible
Do you write XML
• By hand?
• Programmatically?
• Schemata?
• Validation?
• Transformation?
XML::Pastor is for all of you.
XML::Pastor
• Available on CPAN
• Abstracts away some of the pain of XML
• Ayhan Ulusoy is the author
• I am just a user
What does it do?
• Generates Perl code from W3C XML Schema (XSD)
• Roundtrip and validate XML to/from Perl without loss of schema information
• Lets you program without caring about XML structure
pastorize
• Automates codegen process
• Conceptually similar to DBIx::Class::Schema::Loader
• TMTOWTDI - offline or runtime
• Works on multiple XSDs (caveat: name collisions)
pastorize in use
pastorize --mode offline --style multiple \
    --destination /tmp/lib/perl \
    --class_prefix MyApp::Data \
    /some/path/to/schema.xsd
Very simple contrived Album XML demo
Album XML document
Album XML schema
Pastorize the Album XML schema:
Resulting code tree like:
Modify some XML
Roundtrip and modify XML data using Pastor:
# Load XML
# Accessors
# Modify
# Write XML
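The code for this slide did not survive extraction, so what follows is only a sketch of the four steps named above (load, accessors, modify, write). It uses Python's standard library, not Perl or XML::Pastor; the `Album` class, its `from_xml` and `to_xml` methods, and the document layout are all hypothetical.

```python
import xml.etree.ElementTree as ET


class Album:
    """Hypothetical bound class: one plain field per element or attribute."""

    def __init__(self, artist, title, year):
        self.artist = artist
        self.title = title
        self.year = year

    @classmethod
    def from_xml(cls, text):
        root = ET.fromstring(text)
        return cls(root.findtext("artist"), root.findtext("title"), int(root.get("year")))

    def to_xml(self):
        # Rebuild the document from the fields; "year" goes back to an attribute.
        root = ET.Element("album", {"year": str(self.year)})
        ET.SubElement(root, "artist").text = self.artist
        ET.SubElement(root, "title").text = self.title
        return ET.tostring(root, encoding="unicode")


# Load XML (note the deliberate typo in the title)
album = Album.from_xml(
    '<album year="1969"><artist>The Beatles</artist><title>Abey Road</title></album>'
)
# Accessors
print(album.artist, album.title)
# Modify
album.title = "Abbey Road"
# Write XML
xml_out = album.to_xml()
print(xml_out)
```

Reading the output back with `Album.from_xml(xml_out)` yields the corrected title, which is the roundtrip property the talk is about.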
The result!
Real world Pastor
$HASH1 = {
    1 => 'Vodafone UK',
    2 => 'O2 UK',
    3 => 'Orange UK',
    4 => 'T-Mobile UK',
    8 => 'Hutchinson 3 UK'
};
Country XML
Dynamic schema parsing of Country XML
Query the Country object
Modify elements and attributes with uniform syntax
Manipulate array-like data
Create new City data and combine with existing Country object
Validate modified data against the stored schema
Turn Pastor objects back into XML, or transform to XML::LibXML DOM
Parsing with Pastor
• Parse entire XML into XML::LibXML::DOM object
• Convert XML DOM tree into native Perl objects
• Throw away DOM, no longer needed
Reasons to not use XML::Pastor
• When you have no XML Schema (although several tools can infer XML schemata from documents)
• It’s a code-generator
• No stream parsing
XML::Pastor Scope
• Good for “data XML”
• Unsuitable for “mixed markup”, e.g. XHTML
• Unsuitable for “huge” documents
XML::Pastor known limitations
• Mixed elements unsupported
• Substitution groups unsupported
• ‘any’ and ‘anyAttribute’ elements unsupported
• Encodings (only UTF-8 officially supported)
• Default values for attributes - help needed
Other XML modules
• XML::Twig
• XML::Compile
• XML::Simple
• XML::Smart
XML::Twig
• Manipulates XML directly
• Calling code is closely coupled to document structure
• Optimised for processing huge documents as trees
• No schemata, no validation
XML::Compile
• Original design rationale is to deal with SOAP envelopes and WSDL documents
• Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures
• More like XML::Simple with Schema support
XML::Compile pt. 2
• Schema support incomplete
• Shaky support for imports, includes
• Include restriction on targetNamespace
• I haven’t used it yet but it looks good
XML::Simple
• Working roundtrip binding for simple cases, e.g. XMLout(XMLin($file)) works
• Simple API
• Produces single deep data structure
• Gotchas with element multiplicity
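The multiplicity gotcha is easiest to see in code. The sketch below is not XML::Simple; it is a small Python function, written for this illustration, that converts XML to nested dictionaries the way a schema-less converter typically does: a child that occurs once becomes a scalar, and a child that occurs several times becomes a list. The `album` and `track` names are assumptions.

```python
import xml.etree.ElementTree as ET


def naive_to_dict(elem):
    """Convert an element to nested dicts, collapsing single children to scalars."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        value = naive_to_dict(child)
        if child.tag in result:
            # Second occurrence of the same tag: promote the value to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


one = naive_to_dict(ET.fromstring("<album><track>Come Together</track></album>"))
two = naive_to_dict(ET.fromstring(
    "<album><track>Come Together</track><track>Something</track></album>"))

print(type(one["track"]).__name__)  # a string when there is one track
print(type(two["track"]).__name__)  # a list when there are two
```

Code that iterates over `track` works for one document and misbehaves for the other, because without a schema the converter cannot know that the element may repeat. XML::Simple's `ForceArray` option exists to pin such elements to array form.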
XML::Simple pt. 2
• No schemata, no validation
• Can be teamed with a SAX parser
• More suitable for configuration files?
XML::Smart
• Similar implementation to XML::Pastor
• Uses tie() and lots of crac^H^H^H^Hmagic
• Gathers structure information from XML instance, rather than schema
• No code generation!
XML::Smart pt. 2
• No schemata, so no schema validation
• Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB
• Like Pastor, overloads array/hashref access to the data - promotes decoupling
• Reasonable docs, some community growing
Any questions?
Thanks for coming
See you next year
http://search.cpan.org/dist/XML-Pastor/
Bonus Material
If we have enough time
XML::Pastor Supported XML Schema Features
• Simple and Complex Types
• Global Elements
• Groups, Attributes, AttributeGroups
• Derive simpleTypes by extension
• Derive complexTypes by restriction
• W3C built-in Types, Unions, Lists
• (Most) Restriction Facets for Simple types
• External Schema import, include, redefine
XML Schema Inference
• Create an XML schema from an XML document instance
• Every document has an (implicit) schema
• Tools like Relaxer and Trang, as well as the System.Xml.Serializer of the .NET Framework, can all infer XML Schemata from document instances
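As a rough illustration of what "every document has an implicit schema" means, the following Python sketch walks one document instance and records, per element name, which attributes and child elements were seen and how often a child repeated. It is a toy under stated assumptions, far simpler than Relaxer or Trang, and the `country`/`city` sample is invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_structure(xml_text):
    """Return {element: {"attributes": set, "children": {name: max count seen}}}."""
    model = defaultdict(lambda: {"attributes": set(), "children": {}})

    def visit(elem):
        entry = model[elem.tag]
        entry["attributes"].update(elem.attrib)
        counts = defaultdict(int)
        for child in elem:
            counts[child.tag] += 1
            visit(child)
        for name, seen in counts.items():
            # Remember the largest number of occurrences under one parent.
            entry["children"][name] = max(entry["children"].get(name, 0), seen)

    visit(ET.fromstring(xml_text))
    return dict(model)


SAMPLE = """<country code="fr">
  <name>France</name>
  <city><name>Paris</name></city>
  <city><name>Lyon</name></city>
</country>"""

structure = infer_structure(SAMPLE)
for tag, info in structure.items():
    repeated = sorted(n for n, c in info["children"].items() if c > 1)
    print(tag, sorted(info["attributes"]), sorted(info["children"]), "repeated:", repeated)
```

A real inference tool would go on to emit XSD and guess data types; the sketch only shows that element nesting, attributes, and repetition can be read off the instance.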
Simple D::HA object
Rekeying data
Rekeying data deeper
Warning, boring bit
XML::Pastor Code Generation
• Write out static code to tree of .pm files
• Write out static code to single .pm file
• Create code in a scalar in memory
• Create code and eval() it for use
How Pastor works
Code generation
• Parse schemata into schema model
• Perl data structures containing all the global elements, types, attributes, ...
• “Resolve” Model - determine class names, resolve references, etc
• Create boilerplate code, write out / eval
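The pipeline above (schema model, then generated source, then write out or eval) can be sketched in a few lines. This is a Python analogy, not Pastor's implementation: the schema model is a hand-written dictionary standing in for a parsed XSD, and `generate_source`, the `MyApp` prefix, and the field names are hypothetical.

```python
# Hypothetical schema model: type name -> field names (elements and attributes).
SCHEMA_MODEL = {
    "Country": ["name", "code", "population"],
    "City": ["name", "population"],
}


def generate_source(model, class_prefix=""):
    """Return source text that defines one class per type in the model."""
    chunks = []
    for type_name, fields in model.items():
        args = ", ".join(f"{field}=None" for field in fields)
        body = "\n".join(f"        self.{field} = {field}" for field in fields)
        chunks.append(
            f"class {class_prefix}{type_name}:\n"
            f"    FIELDS = {fields!r}\n"
            f"    def __init__(self, {args}):\n"
            f"{body}\n"
        )
    return "\n".join(chunks)


source = generate_source(SCHEMA_MODEL, class_prefix="MyApp")

# Offline use would write `source` to a file; runtime use evaluates it directly.
namespace = {}
exec(source, namespace)

country = namespace["MyAppCountry"](name="France", code="fr")
print(country.name, country.code, namespace["MyAppCountry"].FIELDS)
```

Keeping generation (a string) separate from loading (file or eval) is what makes the offline and runtime modes two endings of the same process.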
How Pastor works
Generated classes
• Each generated class (i.e. type) has classdata “XmlSchemaType” containing schema model
• If the class isa SimpleType it may contain restriction facets
• If the class isa ComplexType it will contain info about child elements and attributes
How Pastor works
In use
• If the classes were generated offline, “use” them; if they were generated at runtime, they are already loaded
• These classes have methods to create objects, load them from XML, and save them to XML
• Manipulate/query data using OO API to complexType fields
• Validate modified objects against schema
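One way to picture classes that carry their own schema information and validate against it is the sketch below. It is a Python analogy under stated assumptions, not Pastor's generated code: the `XML_SCHEMA_TYPE` attribute stands in for the “XmlSchemaType” class data mentioned earlier, and the `Code` and `Country` classes, their rules, and the `validation_errors` method are invented for illustration.

```python
import re


class Code:
    """Hypothetical simple type: a restriction facet stored as class data."""
    XML_SCHEMA_TYPE = {"base": "string", "pattern": r"[a-z]{2}"}

    @classmethod
    def check(cls, value):
        return re.fullmatch(cls.XML_SCHEMA_TYPE["pattern"], value) is not None


class Country:
    """Hypothetical complex type: class data describes elements and attributes."""
    XML_SCHEMA_TYPE = {
        "elements": {"name": {"required": True}, "population": {"required": False}},
        "attributes": {"code": {"required": True, "type": Code}},
    }

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def validation_errors(self):
        """Check the current field values against the stored schema information."""
        errors = []
        for kind in ("elements", "attributes"):
            for field, rule in self.XML_SCHEMA_TYPE[kind].items():
                value = getattr(self, field, None)
                if value is None:
                    if rule["required"]:
                        errors.append(f"missing {field}")
                elif "type" in rule and not rule["type"].check(value):
                    errors.append(f"invalid {field}: {value!r}")
        return errors


country = Country(name="France", code="fr")
before = country.validation_errors()
country.code = "FRANCE"  # modify the object so that it violates the pattern facet
after = country.validation_errors()
print(before, after)
```

Because the rules travel with the class, an object can be checked after modification without going back to the schema file.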