© 2005-2006 the athena consortium. business documents – concepts and techniques ulrike greiner,...
2© 2005-2006 The ATHENA Consortium.
Course Structure
1. Introduction
2. Business Document Standards
3. Business Document Modelling
4. Business Document Mapping
Definition
• A business document is a set of information components that are interchanged as part of a business activity (Definition from ebXML).
• Possible components are:
– Information (data)
– Meaning of that information (meta-data)
– Presentation information (layout)
– Links to other information components
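The four kinds of components can be pictured as fields of a small data structure. The following is only a sketch: the class names, field names and sample values are assumptions made for illustration and do not come from ebXML.

```python
from dataclasses import dataclass, field


@dataclass
class InformationComponent:
    data: str                                   # the information itself
    meta_data: str = ""                         # meaning of that information
    layout: str = ""                            # presentation information
    links: list = field(default_factory=list)   # names of linked components


@dataclass
class BusinessDocument:
    activity: str                               # business activity the document belongs to
    components: dict = field(default_factory=dict)

    def add(self, name, component):
        self.components[name] = component


# Hypothetical sample document
order = BusinessDocument(activity="ordering")
order.add("total", InformationComponent("129.90", meta_data="order total in EUR",
                                        layout="bold", links=["items"]))
order.add("items", InformationComponent("3 x widget", meta_data="ordered line items"))
```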
Information Contained
• Information in business documents can be of different types:
– Structured:
    • e.g. XML documents or databases
– Unstructured:
    • e.g. text files, Word documents, emails, most Web pages
– Semi-structured:
    • e.g. Web pages with known fields of content (annotations)
[Figure: an example of structured information (<xml>…</xml>) next to an example of unstructured information]
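The practical difference shows up when a program has to read a value. The sketch below uses invented sample content: the structured XML lets the quantity be addressed by name, while the unstructured text has to be searched with a deliberately naive pattern.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample content for both kinds of information
structured = "<order><item>widget</item><quantity>3</quantity></order>"
unstructured = "Hello, please send us 3 widgets by Friday. Thanks!"

# Structured: the field is addressed by its name.
root = ET.fromstring(structured)
quantity_structured = int(root.findtext("quantity"))

# Unstructured: the value is guessed from the text by taking the
# first number that appears, which is fragile by design.
found = re.search(r"\d+", unstructured)
quantity_unstructured = int(found.group()) if found else None

print(quantity_structured, quantity_unstructured)
```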
Business Example
Goal of this course: Show methods for efficient and easy management of business documents exchanged in a cross-organisational business process.
Business documents represent the information exchanged in cross-organisational business processes.
[Figure: RETAILER, MANUFACTURER and SUPPLIER exchanging the business documents Request for Quotation, Quotation, Order and Order Conf.]
Questions
No | Question | Option A | Option B | Option C | Option D
1.1 | A business document is | Set of information components | Set of characters | Exchanged as part of a business activity | Exchanged during a phone call
1.2 | A business document consists of | Information | Layout | Meta-data | Process information
1.3 | Information can be | Structured | Unstructured | Semi-structured |
1.4 | Unstructured information can be | Text files | Word document | XML document | Annotated web page
1.5 | Structured information can be | XML document | Data from relational database | Data from object-relational database | Image file
1.6 | Information in cross-organizational business processes | Is represented in business documents | Is not represented in business documents | Is stored in Word documents | Is represented in XML documents
Course Navigation
Recommended next section:
● Business Document Standards
You can also continue with:
● Business Document Modeling
● Business Document Mapping
Classification Categories
• Collaboration Agreement:
– agree on a document standard and how to implement it
• Collaboration:
– exchange information and data between organisations, specified e.g. in protocols or cross-organisational business processes
• Business Process / Service Definition:
– define organisation-internal business processes and business services
• Information Definition:
– define business documents and data models
• Infrastructure Services:
– specify infrastructure necessary to model and exchange business documents
Classification of Standards
[Figure: standards arranged by the five classification categories (Collaboration Agreement, Collaboration, Business Process / Service Def., Information Def., Infrastructure Services). Entries shown: ebXML CPPA, Impl. Guide, Variant Problem, RosettaNet PIPs, STEP, EDI, STAR, OAGI, WS-CDL, ebXML BPSS, ebXML CCTS, RosettaNet Data Dictionary, W3C transport protocols (HTTP, SOAP, etc.), WSDL, Discovery, IEEE FIPA, OGSA, OGSI, UML, UBL, standard product attributes, WS-BPEL, XPDL.]
Selected Standards
• Detailed description and analysis of the following standards:
– ebXML CCTS
– RosettaNet data dictionary and schemas
– STEP
– OAGI
– DFDL
[Figure: classification of standards, repeated from the previous slide]
ebXML CCTS (1)
• General information:
– Core Components Technical Specification (CCTS) / Part 8 of the ebXML Framework
– Defined and maintained by the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT)
– CCTS is fixed; extensions and modifications are performed by UN/CEFACT
– ebXML CCTS can be used in all industries
– CCTS does not provide implementation guidelines
• Repository:
– CCTS describes a repository structure that should be used to store CCTS-based business documents
– No information about repository interfaces is provided
ebXML CCTS (2)
• Business document modeling:
– Component-oriented approach to model business documents on the business level (i.e. business experts are involved in modelling), including different variants of one document
– No transformation to more technical representations is specified
– Business documents can be used for company-internal and -external communications
– Specifications are done in a semantically standardized, syntax-neutral way
– Normative rules in CCTS allow for checking correctness of business documents
• Transformations / Mapping:
– CCTS defines a vocabulary for common concepts that are used in different business documents
– No specification provided for mapping CCTS-based documents to other formats
RosettaNet (1)
• General information:
– RosettaNet Business Dictionary (RNBD), RosettaNet Technical Dictionary (RNTD), RosettaNet Implementation Framework (RNIF)
– Mainly developed by industrial member organizations of RosettaNet
– Definitions follow the RosettaNet Standards Methodology (RSM)
– Initially targeted at high-tech industry, extended to other industries
– Provides Excel-based tools to support implementation projects
• Repository:
– No specifications for a repository are provided
RosettaNet (2)
• Business document modeling:
– Component-oriented XML specifications for business documents on technical and execution level are provided
– Business documents can be used for company-external information exchange
– Variants of a document are supported through implementation guides describing which elements are generic and can be specialized to meet the specific needs of trading partners
– Software programs are available to test the validity of RosettaNet business documents
• Transformations / Mapping:
– RosettaNet provides dictionaries for both business terms and technical terms that can be used to create documents
– No specifications provided for mapping RosettaNet documents to other formats
STEP (1)
• General information:
– Standard for the Exchange of Product Model Data
– Defined by TC184/SC4 at ISO
– STEP is fixed; extensions and modifications are performed by TC184/SC4
– STEP is used in manufacturing industry
– Provides implementation guidelines for business documents
• Repository:
– ISO 13584 specifies a repository structure, the Parts Library Structure
– Also specifies how documents should be stored and retrieved
STEP (2)
• Business document modeling:
– Component-oriented approach to specify technical level business documents for internal as well as external communication
– Business documents are specified in EXPRESS
– Variants of documents can be specified using specialization and generalization of entities
– EXPRESS to XML transformations are described to generate execution level document representations
– Validation process for STEP implementations supported by conformance testing methodology and framework
• Transformations / Mapping:
– STEP defines a vocabulary / data dictionary for common concepts
– No specifications provided for mapping STEP documents to other formats
OAGI (1)
• General information:
– OAGIS = Open Applications Group Integration Specification
– Defined by OAGi (Open Applications Group, Inc.) together with AIAG (Automotive Industry Action Group) and AAIA (Automotive Aftermarket Industry Association)
– Standard is defined in ISO 10303 documents and can be extended or modified following a dedicated procedure
– Standard is open for all industries
– OAGi provides implementation guidelines and support services
• Repository:
– OAGi does not describe a repository structure
– Business documents are usually stored on standard but structured file systems
OAGI (2)
• Business document modeling:
– Component-oriented specification of company-internal and -external business documents on technical and execution level
– Business documents are specified using XML, XSD
– No explicit support for handling variants of documents
– XML schemas available to check correctness of business documents
• Transformations / Mapping:
– No specifications provided for mapping OAG business documents to other formats
DFDL (1) – Why and Who
• Format Description for Non-XML data
– Need for a mechanism bringing the benefits of formal schema definition to legacy or other non-XML formats
– Description, rather than prescription, of formats, to allow use with existing technology alongside definition of new formats
– Uses in integration of new and legacy systems, creation of high-performance formats, and mapping and transformation tooling
• Standard for use in implementing mapping tools
– DFDL – Data Format Description Language
– Something fulfilling this role already exists in many proprietary systems (e.g. WebSphere Message Broker, Microsoft BizTalk)
– Common way of describing physical format desirable for interoperability
– DFDL Working Group within Open Grid Forum developing specification
– First revision to be available in near future
DFDL (2) – What and How
• Schema based approach
– XML schema used to describe logical data format
– Annotations contain physical format information, e.g.
    <xs:sequence dfdl:separator=",">
      <xs:element name="y" type="double" dfdl:initiator="baseQ" dfdl:tagSeparator="=" />
    </xs:sequence>
– Use of XML Schema gives several benefits
    • Existing body of tooling
    • Can apply prior knowledge
    • Useful document model and implementation libraries
• Implementation and status
– Provided properties should support description of a wide variety of formats
    • Support for fixed length formats, binary and text encodings, field delimiters
    • Support for 'variables', e.g. a field specifying the length of another
– Parsers and serializers can make use of physical annotations to read and write data in the described format
– Prototype making use of the current version of the specification available (within Virtual XML Framework from IBM)
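The idea of driving a parser and a serializer from a format description can be sketched in a few lines. This is not DFDL and uses no DFDL library: the `FORMAT` dictionary, its keys and the sample data are assumptions that only mimic the split between logical structure (field names and types) and physical properties (separators).

```python
# Hypothetical format description: logical part (fields) plus
# physical properties (separator, tag separator).
FORMAT = {
    "separator": ",",
    "tag_separator": "=",
    "fields": [("x", float), ("y", float)],
}


def parse(text, fmt):
    """Read text in the described physical format into a dict."""
    parts = text.split(fmt["separator"])
    if len(parts) != len(fmt["fields"]):
        raise ValueError("unexpected number of fields")
    record = {}
    for (name, convert), part in zip(fmt["fields"], parts):
        tag, _, value = part.partition(fmt["tag_separator"])
        if tag != name:
            raise ValueError(f"expected field {name!r}, found {tag!r}")
        record[name] = convert(value)
    return record


def serialize(record, fmt):
    """Write a dict back out in the described physical format."""
    return fmt["separator"].join(
        f"{name}{fmt['tag_separator']}{record[name]}" for name, _ in fmt["fields"]
    )


record = parse("x=1.5,y=2.0", FORMAT)
print(record)
print(serialize(record, FORMAT))
```

Changing only the entries of `FORMAT` (for example the separator) changes what the same parser and serializer accept and produce, which is the point of describing a format rather than hard-coding it.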
Questions
No | Question | Option A | Option B | Option C | Option D
2.1 | Which categories have been used to classify standards? | Collaboration | Infrastructure services | Information Definition | Database definition
2.2 | STEP belongs to the following categories: | Collaboration | Business process Definition | Information Definition | Infrastructure services
2.3 | STAR belongs to the following categories: | Collaboration | Collaboration Agreement | Infrastructure services | Information Definition
2.4 | UBL is related to | CCTS | STAR | OAGI |
2.5 | ISO stands for | International Standards Organization | Internal Standards Organization | International Sunshine Organization |
2.6 | Which might be suitable situations for applying DFDL: | Designing a new XML based message exchange format | Designing a highly optimized (for size) RFQ format | Describing a legacy message format when interfacing with a new system | Describing the SOAP headers for a web service call
2.7 | DFDL Annotations: | Describe a format's logical structure | Describe a format's physical structure | Are embedded in the XML schema for the document | Are kept discrete / separate from the XML schema
2.8 | DFDL Properties can support physical formats containing: | Fixed length fields | Binary data | Comma separated fields | Length prefixed (variable 'fixed' length) fields
Course Navigation
Recommended next section:
● Business Document Modeling
You can also continue with:
● Business Document Mapping
Modeling Requirements
• Requirements for modeling of business documents:
– Re-use of model types that are modeled once and can then be used in different document models
– Model representation targeted at business experts
    • Semi-automatic transformation to technical specification
– Support for handling variants of business documents:
    • Share most of their data fields
    • Differ in a limited number of data fields that depend on the context in which the document is used
    • Example: a purchase order that differs slightly if used in different European countries
Modeling Approach
• Based on Core Components Technical Specification (CCTS)
• Component-based, thus supporting re-use
• Graphical representation to support business experts
• Export functionality to create e.g. XML representations
• Provides the concept of a business context:
– Defines a specific context in which a document is used
– Can be assigned to mark a particular variant of a business document
Types of Models
• Primitive Type Model
• Context Category Model
• Code List Model
• Core Component Type Model
• Core Component Model
• Business Context Model
• Data Type Model
• Business Information Entity Model
Relationships between Models
[Figure: dependencies between the Primitive Type Model, Context Category Model, Core Component Type Model, Code List Model, Core Component Model, Data Type Model, Business Context Model and Business Information Entity Model]
Primitive Type Model
• Models all primitive types
• Examples: string, integer, URL
• Represented by nodes
• Primitive type nodes can be connected by edges:
– An edge from primitive type x to primitive type y means that x can be substituted by y
– e.g. a URL can be substituted by a string
[Figure: primitive type integer; primitive types string and URL]
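The substitution edges form a small directed graph. A minimal sketch, assuming that substitution is followed transitively and that every type can stand in for itself (neither assumption is stated on the slide):

```python
# Edge (x, y): primitive type x can be substituted by primitive type y.
SUBSTITUTION_EDGES = {("URL", "string")}


def can_substitute(x, y, edges=SUBSTITUTION_EDGES):
    """Return True if x can be substituted by y, following edges transitively."""
    seen, frontier = set(), [x]
    while frontier:
        current = frontier.pop()
        if current == y:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Follow every outgoing substitution edge of the current type.
        frontier.extend(b for a, b in edges if a == current)
    return False


print(can_substitute("URL", "string"), can_substitute("string", "URL"))
```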
Core Component Type Model
• Specifies the data fields of business documents
• Groups multiple data fields, each represented by a primitive type:
– exactly 1 content component: primary data field with the actual value
– 1 to n supplementary components: describe the value
• Examples: Price, Text
[Figure: Core Component Type Price]
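The cardinality rule (exactly one content component, one or more supplementary components) can be captured in a small data structure. In this sketch the component names `Amount` and `CurrencyCode` and the primitive type names are hypothetical; the slide does not list the components of Price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    primitive_type: str


@dataclass(frozen=True)
class CoreComponentType:
    name: str
    content: Component      # exactly one content component
    supplementary: tuple    # 1 to n supplementary components

    def __post_init__(self):
        if len(self.supplementary) < 1:
            raise ValueError("a CCT needs at least one supplementary component")


# Hypothetical component names, chosen only for illustration
price = CoreComponentType(
    name="Price",
    content=Component("Amount", "decimal"),
    supplementary=(Component("CurrencyCode", "string"),),
)
```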
Core Component Model
• Represents a template of a business document:
– contains all possible data fields
• Examples: order, quotation
• Aggregate Core Component (ACC) aggregates core components
• Association Core Component (ASCC) connects two ACCs
• Basic Core Component (BCC) connects an ACC with a CCT
• Property Terms specify the child CC
[Figure: Core Component Type, Basic Core Component, Aggregate Core Component, Association Core Component]
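How ACCs, BCCs and ASCCs fit together can be sketched as a tree. The class below is an illustrative assumption, not a CCTS data model: BCCs are stored as property term to CCT name, ASCCs as property term to child ACC, and the sample names (`Order`, `Address`, `TotalPrice`, `City`, `Delivery`) are invented.

```python
from dataclasses import dataclass, field


@dataclass
class AggregateCoreComponent:
    name: str
    bccs: dict = field(default_factory=dict)    # property term -> CCT name
    asccs: dict = field(default_factory=dict)   # property term -> child ACC

    def add_bcc(self, property_term, cct_name):
        self.bccs[property_term] = cct_name

    def add_ascc(self, property_term, child):
        self.asccs[property_term] = child

    def data_fields(self, prefix=""):
        """List all data fields of the template, including nested ACCs."""
        names = [f"{prefix}{self.name}.{term}" for term in self.bccs]
        for term, child in self.asccs.items():
            names += child.data_fields(prefix=f"{prefix}{self.name}.{term}.")
        return names


address = AggregateCoreComponent("Address")
address.add_bcc("City", "Text")

order = AggregateCoreComponent("Order")
order.add_bcc("TotalPrice", "Price")
order.add_ascc("Delivery", address)

print(order.data_fields())
```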
Data Type Model
• Represents data fields of a business document
– similar to CCTs but more restrictive
• Is based on a CCT or on a primitive type model
• Specifies a Data Type Restriction (DTR) for each content and supplementary component of a CCT
– limits the possible values
• Several Data Types can be based on the same CCT
[Figure: Data Type A7_Number (based on CCT Number)]
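A data type restriction can be thought of as a predicate per component. The sketch below is an assumption-laden illustration: the data type `SevenDigit_Number`, the component name `Content` and the seven-digit rule are hypothetical and are not taken from the A7_Number example in the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataType:
    name: str
    based_on: str        # name of the CCT or primitive type it is based on
    restrictions: dict   # component name -> predicate acting as the DTR

    def accepts(self, values):
        """Check the given component values against every restriction."""
        return all(check(values[component])
                   for component, check in self.restrictions.items())


# Hypothetical data type: a Number whose content is a non-negative
# integer with at most seven digits.
seven_digit_number = DataType(
    name="SevenDigit_Number",
    based_on="Number",
    restrictions={"Content": lambda v: isinstance(v, int) and 0 <= v < 10**7},
)

print(seven_digit_number.accepts({"Content": 1234567}))
```

Several `DataType` instances can share the same `based_on` value, mirroring the point that several Data Types can be based on the same CCT.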
Context Category Model
• Classifies the business circumstances that define a business context
• Examples: industry, geopolitical
• Represented by nodes
• Edges define a hierarchy of categories
[Figure: Context Category Geopolitical with two sub-categories]
Code List Model
• Provide values for business contexts
• Restrict the values of data types
• Example: country code
• Represented by nodes
• Code values of a code list are specified textually as an attribute value
• Code list authority: organization that wants to define code lists (e.g. ISO)
[Figure: code list authority ISO and four code lists defined by ISO]
Business Context Model
• Describes the business circumstances in which a variant of a business document is used
• Specified by an enumeration of context values
– Context values are code values of a code list
– All necessary code lists are put into a business context node
– All required code values are selected
• Examples: geopolitical region
[Figure: business context CountryContext with the value selected from a code list]
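Building a business context from code lists amounts to selecting values and checking that they belong to the list. A minimal sketch, assuming a dictionary layout and a function name (`make_business_context`) invented for this example; the country codes are a small excerpt of ISO 3166-1 alpha-2.

```python
# Code lists provide the admissible values. The layout of this dictionary
# is an assumption made for the example.
CODE_LISTS = {
    "CountryCode": {"authority": "ISO", "category": "Geopolitical",
                    "values": {"DE", "FR", "IT"}},
}


def make_business_context(name, selections, code_lists=CODE_LISTS):
    """Build a business context from {code list name: selected code values}."""
    for list_name, selected in selections.items():
        unknown = set(selected) - code_lists[list_name]["values"]
        if unknown:
            raise ValueError(f"{sorted(unknown)} not in code list {list_name}")
    return {"name": name,
            "values": {k: set(v) for k, v in selections.items()}}


country_context = make_business_context("CountryContext", {"CountryCode": {"DE"}})
print(country_context)
```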
Business Information Entity Model
• Represents a concrete business document used in a cross-organizational business process
• Is a variant of a Core Component
• Is created in three steps:
– Assign a business context
– Select the required data fields from the data fields of the core component
– Add a qualifier
• Examples: quotation
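The three steps can be sketched as one function that derives a variant from a core component template. Everything concrete here is hypothetical: the function name `derive_bie`, the field names of the quotation, the context label and the `Qualifier_Name` naming pattern.

```python
def derive_bie(core_component, fields, business_context, selected_fields, qualifier):
    """Derive a Business Information Entity from a core component template."""
    missing = [f for f in selected_fields if f not in fields]
    if missing:
        raise ValueError(f"not part of {core_component}: {missing}")
    return {
        "context": business_context,                # step 1: assign a business context
        "fields": list(selected_fields),            # step 2: select the required data fields
        "name": f"{qualifier}_{core_component}",    # step 3: add a qualifier
    }


# Hypothetical template fields of a Quotation core component
quotation_fields = ["Price", "ValidUntil", "TaxNumber", "DeliveryTerms"]

german_quotation = derive_bie(
    "Quotation", quotation_fields,
    business_context="CountryContext(DE)",
    selected_fields=["Price", "ValidUntil", "TaxNumber"],
    qualifier="German",
)
print(german_quotation)
```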
Questions
No | Question | Option A | Option B | Option C | Option D
3.1 | Which of the following are requirements for business document modeling? | Re-use of models | Representation targeted at business experts | Handling variants of business documents | Creating XML documents
3.2 | Data Type models can be based on | Primitive type models | Core component type models | Context category model | Code list model
3.3 | Business context models are based on | Primitive type models | Code list models | Business Information Entity Model |
3.4 | Basic core components connect | Aggregate core components and core component types | Aggregate core components and association core components | Primitive types and core component types |
3.5 | Aggregate core component | Is a template for a business document | Contains all possible value fields | Specifies the business context |
Course Navigation
Recommended next section:
● Business Document Mapping
You can also continue with:
● Business Document Standards
Mapping Requirement
• Requirement for document mapping
– Business processes and services are developed by different groups and use different interfaces
– Standards (ebXML, RosettaNet, etc.) are too complicated for applications to implement
– Document mapping bridges between requester's service definition and provider's service definition
[Figure: Requester 1 … Requester n, Service Doc 1, MAP, Service Doc 2, Server]
Mapping Architecture (1)
• A mapping generator
• An optional automatic map generator
• A transformation generator
• A runtime that executes the transformation
[Figure: mapping architecture with Source Schema, Target Schema, Map Generator, Automatic matching, Maps (save), Transformation generator (generate), and Runtime Transformation (XQuery, XSLT, Java, proprietary) between Source and Target documents that conform to their schemas]
Mapping Architecture (2)
• A mapping generator
– Is usually a graphical component that is used to define the relationship between the source and target schema
• An optional automatic map generator
– Automatically populates the mapping generator based on computed similarities between source and target
• A transformation generator
– Generates the runtime instantiation of the map in the target mapping language, for example XSLT, XQuery, Java, SQL
• A runtime that executes the transformation against business documents
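The division of labour between map, transformation generator and runtime can be sketched compactly. Here a Python closure stands in for generated XSLT, XQuery, Java or SQL, and the field correspondences in `FIELD_MAP` as well as the sample document are assumptions made for the example.

```python
# A saved map: target field <- source field (hypothetical correspondences).
FIELD_MAP = {"Qty": "amount", "clientId": "accntId"}


def generate_transformation(field_map):
    """Transformation generator: turn a saved map into an executable function."""
    def transform(source_doc):
        # Copy each mapped source value into its target field.
        return {target: source_doc[source] for target, source in field_map.items()}
    return transform


# Runtime: execute the generated transformation against a business document.
transform = generate_transformation(FIELD_MAP)
target_doc = transform({"amount": 3.0, "accntId": "C-42", "clientName": "ACME"})
print(target_doc)
```

Keeping the map as plain data means it can be produced either by hand in a graphical tool or by an automatic map generator, while the generator and the runtime stay unchanged.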
Automatic Map Generator
• Automatically discovers mappings between elements and attributes in the source and target schema using
– Examples of source and target documents (instance level matching)
– Names and structure defined in the schema only (schema level matching)
[Figure: Source DeliveryAddress (AddrLine1, City, State) matched to Target CustomerAdress (AddrLine1, City, State)]
Schema Level Matching
• Schema level matching can use a number of matching algorithms or a combination of algorithms
– A lexical matcher looks for schema elements with equal or similar names
– A thesaurus matcher makes use of an external, non-domain-specific thesaurus to find common synonyms and hyponyms
– A type matcher makes use of the simple and complex types of the elements
– A structure matcher looks for similar structures and sub-structures within the source and target
– An ontology matcher makes use of an external ontology which provides a domain-specific vocabulary
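Two of these matchers are easy to sketch. The example below combines a lexical matcher, built on the standard library's `difflib.SequenceMatcher`, with a thesaurus matcher; the thesaurus entries, the 0.8 threshold and the order in which the matchers are tried are assumptions, not part of any described tool.

```python
from difflib import SequenceMatcher


def lexical_score(a, b):
    """Similarity of two element names in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical thesaurus: each set groups names treated as synonyms.
THESAURUS = [{"client", "customer"}, {"amount", "quantity", "qty"}]


def thesaurus_match(a, b, thesaurus=THESAURUS):
    a, b = a.lower(), b.lower()
    return any(a in group and b in group for group in thesaurus)


def match(source_names, target_names, threshold=0.8):
    """Pair each source name with a target, trying lexical then thesaurus matching."""
    pairs = {}
    for s in source_names:
        best = max(target_names, key=lambda t: lexical_score(s, t))
        if lexical_score(s, best) >= threshold:
            pairs[s] = (best, "lexical")
            continue
        for t in target_names:
            if thesaurus_match(s, t):
                pairs[s] = (t, "thesaurus")
                break
    return pairs


result = match(["deliveryAddr", "amount", "UPC"], ["deliverAddress", "Qty", "EAN"])
print(result)
```

Names that neither matcher can pair, such as `UPC` here, are left out of the result; this is the gap a domain ontology is meant to close.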
Example
• Source Schema: Order
– amount float
– UPC string
– dueDate datetime
– accntId string
– deliveryAddr address
– clientName string
• Target Schema: PurchaseOrder
– EAN string
– Qty float
– deliverydate dateTime
– clientId string
– deliverAddress address
[Figure: correspondences between source and target fields found by lexical matching, thesaurus matching and ontology matching; see ontology on next foil]
Ontology
[Figure: ontology fragment. Due Date and Delivery Date are linked by EquivalentClass; NumberOfItems and Quantity are linked by EquivalentClass; EANCode, UPC, EAN 8, EAN 13 and PartNumber are linked by subClassOf and type edges.]
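An ontology matcher can use such relations to pair names that share no lexical similarity. In this sketch the two equivalences are taken from the figure, while the direction of the subclass edges (UPC below EANCode, EANCode below PartNumber) is an assumed reading of the diagram.

```python
# Equivalences as shown in the figure; subclass edges are an assumption.
EQUIVALENT = [{"DueDate", "DeliveryDate"}, {"NumberOfItems", "Quantity"}]
SUBCLASS_OF = {"UPC": "EANCode", "EANCode": "PartNumber"}


def ancestors(concept):
    """Return the concept followed by all of its superclasses."""
    chain = [concept]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def ontology_related(a, b):
    """Classify how two concepts are related, or return None."""
    if a == b or any(a in group and b in group for group in EQUIVALENT):
        return "equivalent"
    if b in ancestors(a) or a in ancestors(b):
        return "subclass"
    return None


print(ontology_related("DueDate", "DeliveryDate"), ontology_related("UPC", "EANCode"))
```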
Questions
No | Question | Option A | Option B | Option C | Option D
4.1 | Which of the following are requirements for business document mapping? | Match different service definitions | Match to standard documents | Match between XML and non-XML | Match between communication protocols
4.2 | Map generator can be | A graphical interface | A text interface | Generate runtime transformation | Can map XML and non-XML documents
4.3 | Runtime transformation language can be | XSLT | XQuery | Java | C
4.4 | Lexical matching | Matches elements with the same name | Matches elements with similar names | Matches elements that are synonyms | Matches elements that are subclasses
4.5 | An ontology can be used | To provide a domain vocabulary | To describe synonyms | To describe subclasses | Are defined outside a mapping system