TRANSCRIPT
The Role of XML in Cloud Data Integration
Presenter: David RR Webber, Oracle Corporation
October 15th, 2010
Introduction
Cloud services introduce new challenges for information
sharing. While certain aspects are familiar and yet still
unresolved, new techniques can be utilized. A canonical
approach underpinning Cloud data exchanges is vital to ensure
consistent understanding and enhanced interoperability.
Developers face many challenges and complexities in using
today’s industry standards for information exchanges. How can
we simplify this and rapidly develop consistent and conforming
information exchanges in the Cloud?
Open source tools will be discussed and government and
industry example exchanges presented.
Agenda
Challenges, New and Old
  Why a canonical approach?
  Adaptive, agile and context aware infrastructure
  Avoiding the n(n-1)/2 dilemma
Ensuring Simplicity at the Foundation
  Never underestimate the ability of engineers to add complexity
Open source and open standard solutions
  Examples from emergency management domain with NIEM*
Summary and Q&A
* National Information Exchange Model (NIEM) approach
Challenges, New and Old
  Why a Canonical Approach?
  Adaptive, agile and context aware infrastructure
  Avoiding the n(n-1)/2 dilemma
Why a Canonical Approach?
Traditional XML information exchange has been schema driven; many issues for Cloud integration:
  W3C XML Schema is inflexible, static, brittle, localized, expensive
Canonical dictionaries exploit Cloud approach with distributed availability, flexible collaboration and dynamic updating and referencing
  Amazon Web Services (AWS) catalog is an example of a dynamic approach
Canonical dictionaries provide the components that underpin the exchanges while leaving the precise exchange formatting open for implementers; the “what” not the “how”; content can change hourly on AWS!
Neutral XML-based syntax is future proof
Canonical XML dictionary
A collection of distinct components that represent discrete business information for an application domain
Includes singleton components and combinations of related components together as sub-assemblies
Information is represented in a simple neutral conceptual data format that captures the critical concepts about the data, e.g. name, description, content type, contextual usage pattern, hierarchy
Wikipedia definition: http://en.wikipedia.org/wiki/Canonical#Computer_science
Baking in Interoperability
Using consistent component definitions dramatically improves interoperability and reuse
Having formal design methods makes development faster, easier, predictable and repeatable
Aligning local practice to industry domain dictionary can reduce complexity and reinforce best practices
Dictionary definitions can be automatically evaluated for common mistakes and this reduces the opportunity for errors during design phase
Generating software artifacts from neutral dictionary definitions ensures reliable information exchange results across user communities and their particular systems, platforms and tools
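The automatic evaluation of dictionary definitions mentioned above might look something like the following minimal sketch. The rule set (UpperCamelCase names, non-empty definitions, no duplicate names) and the `check_dictionary` helper are illustrative assumptions, not the actual NIEM NDR rules or CAM toolkit checks.

```python
import re

# Hypothetical rule: component names are UpperCamelCase, no spaces or underscores.
UPPER_CAMEL = re.compile(r"^[A-Z][A-Za-z0-9]*$")


def check_dictionary(components):
    """Return a list of (name, problem) pairs for a list of component dicts."""
    problems = []
    seen = set()
    for comp in components:
        name = comp.get("name", "")
        if not UPPER_CAMEL.match(name):
            problems.append((name, "name is not UpperCamelCase"))
        if not comp.get("definition", "").strip():
            problems.append((name, "missing definition"))
        if name in seen:
            problems.append((name, "duplicate name"))
        seen.add(name)
    return problems


# Made-up sample entries, chosen to trip each rule once.
sample = [
    {"name": "PersonName", "definition": "The name of a person."},
    {"name": "person_name", "definition": "Same concept, inconsistent naming."},
    {"name": "LanguageCode", "definition": ""},
    {"name": "PersonName", "definition": "Defined twice."},
]

for name, problem in check_dictionary(sample):
    print(f"{name}: {problem}")
```

Running checks like these at design time is one way the opportunity for errors could be reduced before any exchange schema is generated.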
Neutral Content Model Representation
Neutral representations allow business stakeholders to participate in dictionary development without technology barriers
Concise neutral formats can be viewed as simple spreadsheets as they have no special syntax dependencies
Based on open public standard specifications, semantic concepts and leading knowledge domain techniques
Neutral representation prevents lock-in by vendor, syntax, tooling or platforms
Maximizes flexibility and future proofing of dictionary definitions
Linguistic and Semantic Alignment
Formal community domain naming and design rules provide consistency of definitions
Consistency of definitions minimizes duplication and overlapping of dictionary components
Dictionaries allow collaboration on component development to improve the overall results
Formal component content detail drives alignment
Design best practices ensure logical self-contained components that can be selected contextually
Avoids explosion of complexity and excessive over definition (e.g. “kitchen-sink” schema)
What is a Canonical Approach?
There are several flavors of canonical approaches; some more complex than others – e.g. UBL vs. OAGi vs. CCTS
Avoid dependence on W3C XML Schema mechanisms
Core Components Technical Specification (CCTS): simple components with basic hierarchy
  Parent components with child entities, and/or components
  Associated attributes that denote context and related factors
In CCTS parlance these are ABIE, BBIE and ASBIE
  Parent = Aggregate Business Information Entity
  Child = Basic Business Information Entity
  Attribute = Association Business Information Entity
Conceptual Information Model
[Diagram: Parent (ABIE) items and Child (BBIE) items feed the canonical components dictionary (XML)]
Follows Naming and Design Rule (NDR) principles and guidelines
ebXML CCTS terms (ABIE, BBIE, ASBIE)
Parent = Aggregate Business Information Entity
Child = Basic Business Information Entity
Attribute = Association Business Information Entity
[Diagram: Parent (ABIE) items (each compound component), Child (BBIE) items (each atomic component) and Attribute (ASBIE) entries (optional attributes of component)]
* CCTS – Core Components Technical Specification
Example – Person Name
Person Name (ABIE)
Language Code (ASBIE)
Verified Details? (ASBIE)
Has Alias? (ASBIE)
First Name (BBIE)
Middle Name (BBIE)
Last Name (BBIE)
Previous Name? (ASBIE)
Language Code may exist independently of Person Name
Verified Details and Previous Name are flags that denote additional
information about the entity they are associated with
Component items have three aspects: structure relationships; content rules; definitions
Naming and Design Rules (NDR) are also important in ensuring shorter names that are not tied to a specific context, e.g. compare PersonName to IncidentPersonName
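As a rough illustration, the Person Name example can be written down as a small data structure. This is only a sketch: the class names, the reading of "?" as "optional", and the choice to hang Previous Name off Last Name (with the other attributes on Person Name itself) are assumptions about the slide, not part of CCTS or the CAM toolkit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:  # "Attribute (ASBIE)" in the slide's terminology
    name: str
    optional: bool = False


@dataclass
class Child:  # "Child (BBIE)": an atomic component
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Parent:  # "Parent (ABIE)": a compound component
    name: str
    attributes: List[Attribute] = field(default_factory=list)
    children: List[Child] = field(default_factory=list)


# Assumed placement of attributes; the slide does not spell this out.
person_name = Parent(
    name="PersonName",
    attributes=[
        Attribute("LanguageCode"),
        Attribute("VerifiedDetails", optional=True),
        Attribute("HasAlias", optional=True),
    ],
    children=[
        Child("FirstName"),
        Child("MiddleName"),
        Child("LastName", attributes=[Attribute("PreviousName", optional=True)]),
    ],
)


def outline(parent):
    """Return an indented text outline of a parent component."""
    lines = [f"{parent.name} (ABIE)"]
    for attr in parent.attributes:
        lines.append(f"  @{attr.name} (ASBIE)")
    for child in parent.children:
        lines.append(f"  {child.name} (BBIE)")
        for attr in child.attributes:
            lines.append(f"    @{attr.name} (ASBIE)")
    return lines


print("\n".join(outline(person_name)))
```

Note that the component is named PersonName rather than IncidentPersonName, so the same definition can be reused in any context.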
Methods for creating Canonical Dictionary
Harvest from collection of domain exchange schema
Export from SQL database to schema; harvest; rename
Export from modelling tool to schema; harvest; rename
Create manually in XML or spreadsheet
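The "harvest" step in the first three methods could be sketched as follows, using only the Python standard library. The sample schema is made up for illustration, and treating any element with a complexType as a parent (ABIE) and everything else as a child (BBIE) is a deliberate simplification, not how the CAM toolkit classifies components.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# A tiny made-up exchange schema used only for illustration.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="PersonName">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string"/>
        <xs:element name="LastName" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def harvest(xsd_text):
    """Collect every named xs:element as a candidate dictionary entry."""
    root = ET.fromstring(xsd_text)
    entries = []
    for elem in root.iter(XS + "element"):
        name = elem.get("name")
        if name is None:
            continue  # skip references such as <xs:element ref="..."/>
        has_children = elem.find(XS + "complexType") is not None
        entries.append({
            "name": name,
            "content_type": elem.get("type", ""),
            "component_type": "ABIE" if has_children else "BBIE",
        })
    return entries


for entry in harvest(SAMPLE_XSD):
    print(entry)
```

The harvested entries would still need the renaming and analyst review steps before they are fit to go into a shared dictionary.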
Sample Dictionary Building Processes
[Diagram: two dictionary building flows, with automated and manual (analyst review) steps marked in the legend]
Option 1 – From Enterprise Data Model (EDM)
  Export components in XSD syntax (collection of objects from model)
  Import the model components XSD schema into an OASIS CAM template and refactor for use with OASIS CAM
  Apply Naming and Design Rule (NDR) checks and edits (NDR evaluation, refactor, renaming tool)
  Generate standard components dictionary XML
Option 2 – Derive from existing exchange XSD schema
  Import each exchange XSD schema into an OASIS CAM template and merge into CAM dictionary
  Apply Naming and Design Rule (NDR) checks and edits (NDR evaluation, refactor, renaming tool)
  Merge and generate dictionary XML
Result: dictionary of exchange components, ebXML CCTS compatible (ABIE, BBIE, ASBIE)
Ensuring Simplicity at the Foundation
Never underestimate the ability of engineers to add complexity
Adaptive, agile and context aware infrastructure
XML validation framework that is configurable dynamically
through the use of XML templates and rules.
“In today's complex information exchanges with XML and
associated large XSD schema, coupled with an array of trading
partners, it becomes a significant challenge to support and maintain
accurate handling of all incoming transactions”.
“With a more adaptive and fault tolerant process, the application is
able to handle a wider variation in content and, hence, more easily
support a broad set of interaction partners with reduced support and
maintenance costs”.
http://www.ibm.com/developerworks/library/x-camval/index.html
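A rough sketch of the idea of rule-driven, context-aware validation follows. The rule format, the `validate` function and the sample order message are all hypothetical; this is not the CAM template syntax or the CAMV engine API, only an illustration of rules that are held as data and switched on by context.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: each names an element path, and may apply only in some contexts.
RULES = [
    {"path": "Part/Number", "required": True},
    {"path": "Part/Warranty", "required": True, "context": {"partner": "dealer"}},
]


def validate(xml_text, context, rules=RULES):
    """Return a list of error messages; unknown extra elements are tolerated."""
    root = ET.fromstring(xml_text)
    errors = []
    for rule in rules:
        wanted = rule.get("context", {})
        if any(context.get(key) != value for key, value in wanted.items()):
            continue  # rule does not apply in this context
        found = root.find(rule["path"])
        if rule.get("required") and (found is None or not (found.text or "").strip()):
            errors.append(f"missing {rule['path']}")
    return errors


# Made-up message; <Color> is not covered by any rule and is simply ignored.
order = "<Order><Part><Number>A-100</Number><Color>red</Color></Part></Order>"

print(validate(order, {"partner": "supplier"}))
print(validate(order, {"partner": "dealer"}))
```

Because the rules are data rather than a compiled schema, they can be changed or extended per partner without redeploying the application.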
Avoiding the n(n-1)/2 dilemma
New XML validation framework
  Automotive parts repair with STAR BOD example
Utilizing validation framework with singleton validation templates that are context rule driven
Source: http://www.ibm.com/developerworks/library/x-camval/index.html#figure2
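Taking the dilemma in this slide's title to be the familiar n(n-1)/2 count of pairwise links between n partners (an assumption about the intended formula), a quick calculation shows why a shared canonical form matters. The function names are illustrative.

```python
def point_to_point_links(n):
    """Pairwise links needed when each of n partners maps directly to every other."""
    return n * (n - 1) // 2


def canonical_links(n):
    """Mappings needed when each partner maps once to a shared canonical form."""
    return n


for n in (2, 5, 10, 50):
    print(n, point_to_point_links(n), canonical_links(n))
```

With 10 partners the pairwise approach needs 45 links against 10 canonical mappings; with 50 partners it is 1225 against 50.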
Agile Solution Components
[Diagram: agile solution components, with automated and manual steps marked in the legend]
Definitions repository (XML): industry dictionary formatted as XML; domain dictionary formatted as XML
Exchange Designer Tool user interface: pick components (wantlist); review structure (assembly); business context rules; WSDL actions (optional)
Blueprint toolkit builds: templates; exchange structure schema (XML Schema); realistic XML exchange test examples; content hints
Test: unit test harness using the CAMV engine (agile validation engine)
Interchanges with domain applications
Canonical Dictionaries
Leveraging Cloud Deployment strengths
Collaboration tools for sharing canonical component
dictionaries
Repositories of templates and code lists
Fault tolerant deployment architectures with redundancy
Machine accessible APIs to allow real time updates and
propagation of changes
Standards based implementations that provide open access
Open source resources for shared implementation support
Open Source and Open Standard solutions
Examples from Emergency Management domain with NIEM, OASIS EDXL, LEXS
Example Emergency Management Scenario
Emergency Response Services Workflow using OASIS EDXL exchanges
Haiti demonstrated need for agile exchanges to rapidly cope with unfamiliar scenario and environment changes
Cloud-based sharing of open adaptive common infrastructure components
Top Down Solution Approach
[Diagram: top down solution approach in seven numbered steps, with automated and manual steps marked in the legend]
Enterprise Data Model (EDM): import and refactor for use with CAM
Components definition (XML): industry dictionaries formatted as XML; local domain dictionary formatted as XML; dictionary repository
Exchange Blueprint Designer user interface: pick components (structure outline blueprint); expand structure (exchange structure)
Exchange generator tools (CAM) build the exchange package and exchange components
Target applications
Assembling Components from dictionaries
Determine your business information exchange components at conceptual level
Search and locate candidate components from appropriate domain dictionary collections
Catalogue the parts to be used
  Dictionary components can be referenced individually or as collections by an assembly blueprint that puts them all together to create a complete information exchange
  Components can be selected from multiple dictionaries
Note any new extension pieces as needed
Select components from multiple physical dictionary files
Blueprints themselves also have high re-use value
  Can be sub-assemblies and patterns not just exchange models
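The assembly steps above can be sketched in a few lines. The two in-memory dictionaries, their contents and the `assemble` function are made up for illustration; real CAM blueprints and dictionaries are XML files with their own structure.

```python
# Hypothetical in-memory dictionaries, keyed by component name.
niem_core = {
    "PersonName": {"type": "ABIE", "children": ["FirstName", "LastName"]},
    "FirstName": {"type": "BBIE"},
    "LastName": {"type": "BBIE"},
}
emergency = {
    "IncidentLocation": {"type": "ABIE", "children": ["Latitude", "Longitude"]},
    "Latitude": {"type": "BBIE"},
    "Longitude": {"type": "BBIE"},
}


def assemble(blueprint, dictionaries):
    """Resolve each blueprint entry against the dictionaries, in order.

    Returns (resolved, extensions): resolved maps each component name to the
    name of the dictionary it was found in; extensions lists names found in
    no dictionary, which would need to be defined as new extension pieces.
    """
    resolved, extensions = {}, []
    for wanted in blueprint:
        for dict_name, components in dictionaries.items():
            if wanted in components:
                resolved[wanted] = dict_name
                break
        else:
            extensions.append(wanted)
    return resolved, extensions


blueprint = ["PersonName", "IncidentLocation", "ShelterCapacity"]
resolved, extensions = assemble(
    blueprint, {"niem-core": niem_core, "emergency": emergency}
)
print(resolved)
print(extensions)
```

Here ShelterCapacity is found in neither dictionary, so it is flagged as an extension to be noted and defined.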
Example Assembly Blueprint Outlines
LEXS messaging blueprint
Reusable messaging envelope constructs
OASIS EDXL HAVE message
Business functional components
[Diagram callouts: message handling, delivery and control; payload goes here; top level sets of business information components; individual component]
These examples are available from the CAM editor install package ~ CAMeditor\eclipse\workspace\CAMEditor\dictionary\blueprints\
LEXS – Law Enforcement eXchange System – http://www.lexs.gov
Exchange Development Process Tools
[Diagram: exchange development process in five numbered steps]
1 Blueprint designer
2 Search tools (web tool, Excel) over the industry dictionary and domain dictionary
3 Insert dictionary parent components
4 Expander tool adds component definitions
5 Completed exchange template
Summary and Q & A
Review Resource links
Summary
Canonical XML component dictionaries
Neutral representation of components
Deployment to target environments and architectures
Collaborative development and open source
Uses open public standards and government guidelines (NIEM)
Available resources and tools
Illustrative use cases
Leverage strengths of cloud-based collaboration resources
Resources
Resource links
Supporting supplemental slides
Links and Resources
DOWNLOADS - CAM Toolkit download
https://sourceforge.net/projects/camprocessor
SUPPORTING MATERIALS - NIEM Naming and Design Rules (NDR) 1.3
http://www.niem.gov/pdf/NIEM-NDR-1-3.pdf
RESOURCES – UN/CEFACT Core Components Technical Specification
http://www.unece.org/cefact/ebxml/CCTS_V2-01_Final.pdf
Tutorials - wiki.oasis-open.org/cam/CAM_Tutorials
Specifications - www.oasis-open.org/committees/cam
docs.oasis-open.org/cam
www.oasis-open.org/committees/emergency
NIEM site - www.niem.gov
LEXS site – www.lexs.gov
Available XML Dictionaries
LEXS 3.1.4 dictionary
OASIS EDXL dictionary
OASIS EML dictionary
NIEM 2.1 dictionaries
  CBRN dictionary
  Emergency dictionary
  Family dictionary
  Immigration dictionary
  Infrastructure dictionary
  Intelligence dictionary
  Justice dictionary
  Maritime dictionary
  Screening dictionary
  Trade dictionary
  NIEM core dictionary
  Immigration blueprint
Available from download site, direct link: http://sourceforge.net/projects/camprocessor/files
  + includes spreadsheets and sample blueprint
Packaged with CAM editor, see dictionary folder of install
  + spreadsheet
  + blueprint samples
Note: Those marked in bold are model style dictionaries with recursive components.
Conceptual Information View
[Diagram: CAM toolkit processing; apply tools in desktop CAM toolkit editor]
CAM Template (domain data components): Structure, Rules, Definitions
Dictionary components: Items
Item (ABIE, BBIE, ASBIE) properties: Name, Unique ID, Component Type, Cardinality, Content Type, Content Mask, Children, Group, Structure Context, Where from, Definition, Rules, Language, Label, Notes
* Required items in Blue
XML View of Dictionary Content
[Screenshot callouts: items; name, unique ID, component type, cardinality; content type, content mask; parent / child linkage; where referenced]
* See slide notes for explanation
Excel Spreadsheet View
An item per row
properties as columns
Type (ABIE, BBIE) children
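The "item per row, properties as columns" layout is easy to reproduce with any CSV writer. The sketch below uses a subset of the item properties as column headings; the IDs, cardinalities and other sample values are invented for illustration and do not come from a real dictionary.

```python
import csv
import io

# Column headings taken from the item properties; a subset, for brevity.
COLUMNS = ["Name", "Unique ID", "Component Type", "Cardinality", "Content Type", "Children"]

# Made-up sample items, one dict per dictionary item.
items = [
    {"Name": "PersonName", "Unique ID": "P001", "Component Type": "ABIE",
     "Cardinality": "1..1", "Content Type": "", "Children": "FirstName LastName"},
    {"Name": "FirstName", "Unique ID": "P002", "Component Type": "BBIE",
     "Cardinality": "1..1", "Content Type": "string", "Children": ""},
    {"Name": "LastName", "Unique ID": "P003", "Component Type": "BBIE",
     "Cardinality": "1..1", "Content Type": "string", "Children": ""},
]


def to_csv(rows):
    """Write one row per item with properties as columns; return the CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(items))
```

A flat layout like this is what lets business stakeholders review a dictionary in a spreadsheet without any XML tooling.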
Mapping to Dictionaries
You can compare a template of components to a dictionary
  check within a domain for alignment to dictionary
  check between domains for interoperability
  merge new/existing components with dictionary
Matches on physical names
Reports matching items and details
Reports statistics and percentages of matching
Generates crosswalk xml file
  Compatible with Microsoft Excel
Report can be used to do spell checking
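A minimal sketch of the name-matching comparison follows, assuming exact matching on physical names. The `crosswalk` function and the sample names are illustrative only; the real tool's report carries more detail and is written out as an XML file.

```python
def crosswalk(template_names, dictionary_names):
    """Match template items to dictionary items on exact physical name."""
    known = set(dictionary_names)
    matched = [name for name in template_names if name in known]
    unmatched = [name for name in template_names if name not in known]
    if template_names:
        percent = 100.0 * len(matched) / len(template_names)
    else:
        percent = 0.0
    return {"matched": matched, "unmatched": unmatched, "percent_matched": percent}


# Made-up names; "LastNme" is a deliberate misspelling.
template = ["PersonName", "FirstName", "LastNme", "IncidentLocation"]
dictionary = ["PersonName", "FirstName", "LastName", "IncidentLocation", "LanguageCode"]

report = crosswalk(template, dictionary)
print(report)
```

The misspelled LastNme shows up as unmatched, which is how such a report can double as a spell check.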
Example cross-reference spreadsheet
Formatted view in Microsoft Excel of import of cross-reference report details (from generated XML file)
Matched details: item and alignment, definition