An Introduction to Track 4: SOA and Metadata (Semantics). Chuck Mosher, Senior Enterprise Architect, cmosher@metamatrix.com. 2nd SOA for E-Government Conference, 30-31 October 2006


Page 1: An Introduction to Track 4:  SOA and Metadata  (Semantics)

An Introduction to Track 4: SOA and Metadata

(Semantics)

Chuck Mosher

Senior Enterprise Architect

cmosher@metamatrix.com

2nd SOA for E-Government Conference
30-31 October 2006

Page 2: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Agenda

• The drivers for data (& metadata) integration
• Metadata in an SOA
• Data services: using active metadata to drive data integration
• Beyond metadata: dictionaries, vocabularies, domain models, ontologies (semantics)
• Why ontologies?
• Overview of Track 4 Presentations
• Q & A

Page 3: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Acknowledgements

• Dave McComb*, Semantic Arts

• Atif Kureishy*, Booz | Allen | Hamilton

• John Salasin*, NIST

• Jeff Pollock, Oracle

• Brand Niemann, EPA

• Andy Evans, Revelytix

* Track 4 Speaker, 2:45-4:15 pm tomorrow

Page 4: An Introduction to Track 4:  SOA and Metadata  (Semantics)

One of the three enablers which drives domain-wide visibility: “… is a standard enterprise data architecture — the foundation for effective and rapid data transfer and the fundamental building block to enable a common logistical picture.”

Army Lt. Gen. Claude Christianson

“If you look at all the trends in the IT arena over the past 30 to 40 years, we’ve moved into an environment where we’ve got faster networks, more powerful processors, but it really comes down to the data”

Michael Todd, DOD CIO office

Data Interoperability Lies At The Very Core of DoD Transformation

Page 5: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Dr. Linton Wells, as quoted in September’s NDIA Magazine, “…data compatibility may be an issue. Enabling digital interaction with nontraditional partners may require middleware or other programs that convert data from totally different formats …”

Page 6: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Problem Scope

• Incompatible data meanings are the largest, most expensive, and most time-consuming portion of IT visibility and IT interoperability projects:
– Gartner… Forrester… NIST…
– IDC… CIO Magazine…
• The classic “n-squared” problem of interfaces is even more severe at the data layer:
– Data-to-data interfaces outnumber “pipes”
– Tightly-coupled is brittle, and requires code
• Information growth is accelerating – FAST!
– 2002-2005 – more new data than all of history
– 5 exabytes of new digital data created in 2002 – enough for 0.5 million new Libraries of Congress

Jeff Pollock – 2004 White House Conference on Semantic Technology
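
The “n-squared” point can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes every system needs a directed mapping to every other system, and compares that with mapping each system to and from one shared model. The function names are hypothetical.

```python
def point_to_point_mappings(n: int) -> int:
    """Directed mappings needed when every system translates to every other."""
    return n * (n - 1)


def shared_model_mappings(n: int) -> int:
    """Directed mappings needed when each system maps to and from one shared model."""
    return 2 * n


for n in (5, 20, 100):
    print(n, point_to_point_mappings(n), shared_model_mappings(n))
```

The gap between the two counts widens quickly as systems are added, which is the argument for mediating through shared metadata rather than pairwise code.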

Page 8: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Why Does SOA Need Metadata?

• An architectural style enabling loose-coupling
• Cornerstone of E-Government reengineering
• Web Services and their related standards (SOAP, WSDL, UDDI) provide an implementation framework for several key features of SOA
• BUT: Web Service technologies do not provide all the requirements for Dynamic USE of Discoverable Services
• Discovery – Yes – UDDI/ebXML
• Use – No – requires service consumers and providers to agree on a pre-defined standard interface for the service

Page 9: An Introduction to Track 4:  SOA and Metadata  (Semantics)

SOA is Easy, It’s Metadata That’s Hard

• SOA focuses on the interoperability between application interfaces & protocols

• Data (and service) meaning, integrity, and transformation have to be addressed elsewhere

• This information is found in the metadata

• SOA makes getting control over the metadata critical to success
– Or you will end up with SOA silos!

Page 10: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Metadata Is Everywhere

• Integration
– Syntactic
– Semantic
– Application
– Process
• Accessibility
• Visibility
• Discoverability
• Management
– Governance
– Auditing
– Lineage
– Quality
– Compliance
– Change Management
– Impact Analysis
– Performance

Many of the problems & issues around SOA implementations & governance boil down to getting a solid handle on all of the types & forms of metadata involved

Page 11: An Introduction to Track 4:  SOA and Metadata  (Semantics)

What Are Semantic Conflicts?

• Data Type: Different primitive or abstract types for same information
• Labeling: Synonyms/antonyms have different text labels
• Aggregation, Structure, Cardinality: Different conceptions about the relationships among concepts in similar data sets. Collections or constraints have been modeled differently for same information
• Generalization: Different abstractions are used to model same domain
• Value Representation: Different choices are made about what concepts are made explicit
• Impedance Mismatch: Fundamentally different data representations are used
• Naming: Synonyms/antonyms exist in same/similar concept instance values
• Scaling and Unit: Different units of measure with incompatible scales
• Confounding: Similar concepts with different definitions
• Domain: Fundamental incompatibilities in underlying domains
• Integrity: Disparity among the integrity constraints

Jeff Pollock – 2004 White House Conference on Semantic Technology
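
A few of these conflict types are easy to show side by side. The sketch below is a minimal illustration, not taken from the talk: the two records, their field names, and the target names `location_id` and `height_m` are all hypothetical. It shows a labeling conflict (different field names), a data type conflict (string versus integer identifier), and a scaling and unit conflict (feet versus meters), along with the per-source mediation needed before the records can be compared.

```python
FEET_PER_METER = 3.28084

# Two hypothetical records describing the same building.
record_a = {"bldg_id": "0042", "height_ft": 30.0}
record_b = {"Facility_ID": 42, "height_m": 9.144}


def normalize_a(rec: dict) -> dict:
    # Data type conflict: the identifier is a zero-padded string in source A.
    # Scaling and unit conflict: height is stored in feet.
    return {
        "location_id": int(rec["bldg_id"]),
        "height_m": round(rec["height_ft"] / FEET_PER_METER, 2),
    }


def normalize_b(rec: dict) -> dict:
    # Labeling conflict only: the same concepts under different names.
    return {
        "location_id": rec["Facility_ID"],
        "height_m": round(rec["height_m"], 2),
    }


print(normalize_a(record_a) == normalize_b(record_b))
```

Without the mediation step, a naive comparison of the raw records would find no fields in common at all.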

Page 12: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Metadata Management Maturity

• Level 1: Inventory of information assets
– Necessary 1st step – what data do we have
– Typically stored in repositories, registries, spreadsheets, implicit in data itself (relational DBs)
• Level 2: Impact analysis
– Develop domain vocabularies and data models
– Discover or create relationships between system artifacts
• Level 3: Metadata-driven integration
– Design-time metadata repository + run-time integration
– Example of Model-Driven Architecture
• Level 4: Semantic Web
– Dynamic, machine-based inferencing at the concept level

Page 13: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Data Evolution Timeline

• 1945 - 1970: Age of Programs (Program-Data, GIGO)
• 1970 - 1994: Age of Proprietary Data (minis/micros): Text, Office Docs, Databases (proprietary schema)
• 1994 - 2000: Age of Open Data (www / Netscape): HTML, XML (open schema)
• 2000 - 2003: Age of Open Metadata (Web services): Namespaces, Taxonomies, RDF
• 2003 - : Age of Semantic Models (OWL): Ontologies & Inference

• Procedural Programming: “Data is less important than code”
• Object-Oriented Programming: “Data is as important as code”
• Model-Driven Programming: “Data is more important than code”

Michael Daconta, Creating Relevance and Reuse with Targeted Semantics, XML 2004 Conference Keynote, November 16, 2004.

Page 15: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Information Challenges

Program Challenges
• Multiple sources
• Different interfaces/drivers
• Different physical structures
• Different semantics
• Single interface to data desired
• Real-time access to data
• Performance
• Maintainability as data changes
• Maintainability as apps change

Mission Challenges
• Time-to-deploy
• Agility - Responsiveness to change
• Automation – Reduce cost of new development and operations
• ROI of enterprise information

Agency Challenges
• 100’s/1000’s of data sources
• 100’s/1000’s of applications
• Multiple access points/modes for apps
• Understanding relationships/semantics
• Data consistency
• Data reuse – bridging data silos
• Support for Web Services & SQL
• Control & manageability, compliance
• Security & auditing

[Figure: Information Resources – ? – Communities of Interest]

Page 16: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Information Virtualization

[Figure: Information Resources – Information Virtualization Layer – Communities of Interest]

Page 17: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Information Virtualization

Information Virtualization Layer:
• Unified Semantic Layer: Unification of different concepts across systems
• Data Federation Layer: Single-query access to heterogeneous systems
• Data Access/Connectivity Layer: Uniform, standardized access to any system

Enterprise Data Sources

Page 18: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Metadata-Based Data Service

• Decouple data sources from application
– Data implementation shielded from application
• Semantic/Format Mediation
– Standard vocabulary
• Single access point
– Web Service/XML
– SQL
• Federation
– Single source or multi-source
• Scalability
– Security, performance

[Figure: “Bridge the Gap”. Elements: Agency Application, Data Service, Master Data, Operational Data Store. Connections labeled XML/SOAP, SQL, API Call]
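
The decoupling, mediation, and federation points can be sketched in a few lines. This is a toy illustration under stated assumptions, not MetaMatrix's implementation: the `DataService` class, the two source functions, and every field name are hypothetical.

```python
class DataService:
    """Single access point that hides where and how each source stores data."""

    def __init__(self):
        self._sources = []  # (fetch function, field mapping) pairs

    def register(self, fetch, mapping):
        # mapping: source field name -> standard vocabulary name
        self._sources.append((fetch, mapping))

    def query(self, **criteria):
        """Federate all sources and filter on standard vocabulary names."""
        results = []
        for fetch, mapping in self._sources:
            for row in fetch():
                std = {mapping[k]: v for k, v in row.items() if k in mapping}
                if all(std.get(k) == v for k, v in criteria.items()):
                    results.append(std)
        return results


def master_data():
    # Hypothetical master data source with terse column names.
    return [{"cust_no": 1, "nm": "Acme"}]


def operational_data_store():
    # Hypothetical operational store with its own naming.
    return [{"CustomerId": 2, "CustomerName": "Globex"}]


service = DataService()
service.register(master_data, {"cust_no": "customer_id", "nm": "name"})
service.register(operational_data_store,
                 {"CustomerId": "customer_id", "CustomerName": "name"})

print(service.query())
print(service.query(name="Globex"))
```

The application only ever sees `customer_id` and `name`; swapping or adding a source changes the registration, not the caller.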

Page 19: An Introduction to Track 4:  SOA and Metadata  (Semantics)

FEA DRM View on Data Services

DRM Version 2 Data Access Services
• Context Awareness Services
• Structural Awareness Services
• Transactional Services
• Data Query Services
• Content Search and Discovery Services
• Retrieval Services
• Subscription Services
• Notification Services

Service Types include:
• Metadata / Data
• Structured / Unstructured
• Read / Write
• Push / Pull

Page 20: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Designing data services

Modeling Information Services for SOA

[Figure: Enterprise Information Sources (EIS), including XML, databases, warehouses, spreadsheets, services, geo-spatial, rich media, …, are modeled as reusable, integrated data objects (for example Logistics, Intelligence). These are exposed as data services, each with a <WSDL> contract, over ODBC, JDBC, and SOAP to information consumers: custom apps; web services and business processes; packaged apps; reporting and analytics; EAI and data warehouses. Sample XML: <sale> <value/> </sale>]

Page 21: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Data Service Abstraction Layers

• Transformations from one or more sources
• Transformations defined with:
– Joins/unions
– Criteria
– Functions
• Elements mapped to dictionary
• Business definitions captured
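
One familiar way to picture a transformation defined with a join, a criterion, and a function is a SQL view. The sketch below uses Python's built-in `sqlite3` module purely as an illustration; the tables, data, and view name are hypothetical, and a real data service would span separate physical sources rather than one in-memory database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE depot (depot_number INTEGER, depot_name TEXT);
    CREATE TABLE stock (depot_number INTEGER, item TEXT, qty INTEGER);
    INSERT INTO depot VALUES (1, 'north yard'), (2, 'south yard');
    INSERT INTO stock VALUES (1, 'fuel', 40), (1, 'tires', 0), (2, 'fuel', 15);

    -- The view plays the role of the transformation: a join, a criterion,
    -- and a function, with columns renamed to dictionary terms.
    CREATE VIEW available_stock AS
    SELECT d.depot_number      AS Location_ID,
           UPPER(d.depot_name) AS Location_Name,
           s.item              AS Item,
           s.qty               AS Quantity
    FROM depot d JOIN stock s ON s.depot_number = d.depot_number
    WHERE s.qty > 0;
""")

rows = conn.execute(
    "SELECT Location_ID, Location_Name, Item, Quantity "
    "FROM available_stock ORDER BY Location_ID, Item"
).fetchall()
print(rows)
```

Consumers query `available_stock` using dictionary terms and never see the underlying table layout.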

Page 22: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Data Service Layer in SOA

[Figure: SOA layer stack. Layers: Client Process & Applications, Business Process Services, Business Services, Message Services (ESB), Data Services Layer, Data Sources. The diagram also shows multiple App and Data Service boxes.]

Page 23: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Data Services Approaches

Data Services for Multiple Purposes:
• Simplified access to value-added (tagged) data in real-time
• Value-added (tagged) data materialized & staged
• Phased-in migration from legacy to new
• Managed archiving via classification, retention tags
• Enhanced search via consistent content tags

[Figure: Model-Driven Integration Layer. Data and content sources are transformed (T) into a Logical Data Model (Organization, Customer/Person, Imagery, Location) and a Materialized Logical Model, alongside Agile Information Services and an Enriched Data/Content Store]

Page 24: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Leveraging COI Data Dictionaries

• Multiple Internal/External Information Sources
• Authoritative Sources:
– Mapped to logical
• Logical Data Model:
– Agency or COI-specific
– Rationalize, harmonize, mediate
– C2, Logistics, Intelligence, …
• Application views of information:
– Relational, XML

[Figure: source elements bldg_id, SITENUM, Facility_ID, bldg_type, Depot_Number are mapped through transformations (T) to the logical elements Location_ID and Location_Type. Web Services, Search Applications, and Business Intelligence Applications consume relational views and XML documents over ODBC/JDBC, JDBC, and SOAP]
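
A COI data dictionary of this kind can be pictured as a lookup from source element names to logical element names. The sketch below is an assumption-laden illustration: the grouping of source elements under `Location_ID` and `Location_Type` is read off the element names on the slide, and the `to_logical` helper is hypothetical.

```python
# Hypothetical COI dictionary: source element -> logical element.
# The grouping below is an assumption based on the slide's element names.
DATA_DICTIONARY = {
    "bldg_id": "Location_ID",
    "SITENUM": "Location_ID",
    "Facility_ID": "Location_ID",
    "bldg_type": "Location_Type",
}


def to_logical(record: dict) -> dict:
    """Rename known source elements; keep unknown elements unchanged."""
    return {DATA_DICTIONARY.get(name, name): value
            for name, value in record.items()}


print(to_logical({"bldg_id": 17, "bldg_type": "warehouse"}))
print(to_logical({"SITENUM": 17}))
```

Once every source is expressed in the logical vocabulary, records from different systems can be compared or joined on `Location_ID` directly.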

Page 26: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Beyond Mere Metadata

• Vocabularies/lexicons, Domain Models, Taxonomies, Ontologies

• All are means of beginning to define the context and scope of the domain of interest

• All specify artifacts in some way

• The word “Semantics” often means that the relationships between artifacts are also specified

Page 27: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Semantics = Meaning = Relationships

• Humans (and therefore our machines) only ever understand anything insofar as it is related to other things

[Figure: the term ID shown alone]

Page 28: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Semantics = Meaning = Relationships

• Humans (and therefore our machines) only ever understand anything insofar as it is related to other things

[Figure: the term ID shown alongside VA, NY, MD]

Page 29: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Semantics = Meaning = Relationships

• Humans (and therefore our machines) only ever understand anything insofar as it is related to other things

[Figure: the term ID shown alongside SUPEREGO, EGO, ANALYSIS]

Page 30: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Semantics = Meaning = Relationships

• Humans (and therefore our machines) only ever understand anything insofar as it is related to other things

[Figure: the term ID shown alongside LICENSE, CARD, BADGE]
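
The same idea can be sketched in code: a bare token such as ID carries no meaning until the terms around it are known. The example below is a toy, and the sense labels are an interpretation of the slides rather than something they state. It picks the sense whose related terms overlap most with the surrounding context.

```python
# Hypothetical senses of "ID", each with the related terms from the slides.
SENSES = {
    "Idaho (US state)": {"VA", "NY", "MD"},
    "id (psychoanalysis)": {"SUPEREGO", "EGO", "ANALYSIS"},
    "identification": {"LICENSE", "CARD", "BADGE"},
}


def interpret(term_context: set) -> str:
    """Pick the sense whose related terms overlap most with the context."""
    best = max(SENSES, key=lambda sense: len(SENSES[sense] & term_context))
    return best if SENSES[best] & term_context else "unknown"


print(interpret({"NY", "MD"}))
print(interpret({"BADGE"}))
print(interpret(set()))
```

With no context at all the function can only answer "unknown", which is the situation a machine is in when it receives a field named ID with no metadata.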

Page 31: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Data Dictionary -> Vocabulary

• The data alone does not have sufficient context
• Using metadata is not enough - you must be able to leverage domain concepts and terminologies
• Example problem – potentially similar data elements, but dissimilar constructs/datatypes/descriptions
– How do we relate common constructs with uncommon datatypes?
– Solution requires that vocabulary relate those constructs across models with transformation relationships, logic
• Define business use/semantics of similar information
– Datatypes describe a set of values
– Defines the technical constraints on values
– Enables integrating information, as datatypes can be referenced by any models (relational, XML, object, …)
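
As a rough sketch of a vocabulary entry that relates one construct across models with different datatypes, consider a date stored as a compact string in one model and as an ISO string in another. Everything here is hypothetical: the term `ship_date`, the model names `legacy_db` and `xml_feed`, and the `canonical` helper are invented for illustration.

```python
from datetime import date


def parse_compact(raw: str) -> date:
    """Convert a 'YYYYMMDD' string to a date."""
    return date(int(raw[:4]), int(raw[4:6]), int(raw[6:8]))


# Hypothetical vocabulary term with one transformation per physical model.
VOCABULARY = {
    "ship_date": {
        "definition": "Calendar date on which the shipment left the depot",
        "to_canonical": {
            "legacy_db": parse_compact,       # stored as 'YYYYMMDD'
            "xml_feed": date.fromisoformat,   # stored as 'YYYY-MM-DD'
        },
    }
}


def canonical(term: str, model: str, raw: str) -> date:
    """Apply the vocabulary's transformation for the given model."""
    return VOCABULARY[term]["to_canonical"][model](raw)


print(canonical("ship_date", "legacy_db", "20061030")
      == canonical("ship_date", "xml_feed", "2006-10-30"))
```

The business definition and the transformations live together in the vocabulary, so any model that references the term inherits both.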

Page 32: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Benefits of Building a Vocabulary

• Develop reusable information models and schemas

• Capture business and technology requirements in a single vocabulary

• Capture institutional knowledge

• Enables semantic mining techniques for deeper data discovery and information sharing

• Accelerate interoperability, web services and SOA development and deployment

• Establish and maintain a common relationship across data sources

• Establish and maintain compliance with industry exchange models

• Reduce IT expenses by leveraging data in its native source

• Reduce IT expenses associated with building and maintaining partner integration

• Improved information sharing directly enhances decision making

Page 33: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Example Vocabulary Development Process

[Figure: process steps shown include Determine Pilot Demonstration, Develop UML Use-Case, Class Relationship Diagram, Auto Generate XSD - XML, Vocabulary Handbook]

UNCLASSIFIED

MDA DS COI Pilot - John Shea PEO C4I, PMW180 ISR/IO NMCI

Page 35: An Introduction to Track 4:  SOA and Metadata  (Semantics)

“Ideal” Semantics

• Formal definition of meaning
– Unambiguous
– Machine process-able
– Decidable
• Automated classification
– Membership based on properties
• Inference
– Can increase what you know based on classification

Page 36: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Ontologies

• An ontology is an explicit formal specification of the terms in a domain and the relationships between them
– Others are special cases
– Formal conceptual model
– W3C standard (OWL/RDF) implementation

• Concepts, definitions, properties, relationships

• Machines can draw inferences from the properties and relationships captured in the model

Page 37: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Ontologies

• Ontologies bring rigorous definitions of meaning to (meta)data

• More abstraction from lower levels of detail

• Key to loose-coupling

• With OWL/RDF, part of the W3C Semantic Web vision

Page 38: An Introduction to Track 4:  SOA and Metadata  (Semantics)

W3C Semantic Web Stack

Page 39: An Introduction to Track 4:  SOA and Metadata  (Semantics)

RDF

• Resource Description Framework
• A mechanism to make assertions about things
• In the form of a triple:
subject -> predicate -> object
Resource (URI) -> Property (URI) -> Resource (URI or literal)
• URIs establish unique namespace; do not have to be addressable

Page 40: An Introduction to Track 4:  SOA and Metadata  (Semantics)

RDF Examples

Business345 -> closestTo -> Airport123
Airport123 -> name -> “ORD”
Airport123 -> locatedIn -> “Chicago, IL”
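
The airport example can be written down as data. The sketch below deliberately avoids any RDF library and models triples as plain Python tuples; the `http://example.org/` namespace is a placeholder, and the direction of each arrow is an assumption read from the diagram labels.

```python
# Triples as plain tuples; the example.org namespace is a placeholder.
EX = "http://example.org/"

triples = {
    (EX + "Business345", EX + "closestTo", EX + "Airport123"),
    (EX + "Airport123", EX + "name", "ORD"),
    (EX + "Airport123", EX + "locatedIn", "Chicago, IL"),
}


def objects(subject: str, predicate: str) -> set:
    """Return every object asserted for a subject/predicate pair."""
    return {o for s, p, o in triples if s == subject and p == predicate}


# Follow two triples: in which city is the airport closest to Business345?
airport = next(iter(objects(EX + "Business345", EX + "closestTo")))
print(objects(airport, EX + "locatedIn"))
```

Because the object of one triple is the subject of another, simple assertions chain into a graph that can be traversed.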

Page 41: An Introduction to Track 4:  SOA and Metadata  (Semantics)

OWL

• OWL extends RDF by allowing us to create and make assertions about classes of things

[Figure: Feline “is a” Mammal; “has” links to Hair and Retractable Claws]
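
What class-level assertions buy you can be shown with a small inheritance walk. This is plain Python, not OWL, and it does not implement OWL semantics. It assumes the natural reading of the diagram: Feline is a Mammal, Mammal has Hair, and Feline has Retractable Claws.

```python
# Class-level assertions; which class "has" which feature is an assumption.
SUBCLASS_OF = {"Feline": "Mammal"}
CLASS_HAS = {"Mammal": {"Hair"}, "Feline": {"Retractable Claws"}}


def ancestors(cls):
    """Yield the class itself, then each superclass up the hierarchy."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def inferred_features(cls: str) -> set:
    """Union of features asserted on the class and all of its superclasses."""
    features = set()
    for c in ancestors(cls):
        features |= CLASS_HAS.get(c, set())
    return features


print(inferred_features("Feline"))
```

Nothing states directly that a Feline has Hair; the fact follows from the class relationship, which is the kind of inference the slide refers to.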

Page 42: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Semantic Mapping Challenge

• Multiple Internal/External Information Sources
• Authoritative Sources:
– Mapped to logical
• Logical Data Model:
– Agency or COI-specific
– Rationalize, harmonize, mediate
– C2, Logistics, Intelligence, …
• Application views of information:
– Relational, XML

[Figure: source elements bldg_id, SITENUM, Facility_ID, bldg_type, Depot_Number are mapped through transformations (T) to the logical elements Location_ID and Location_Type. Web Services, Search Applications, and Business Intelligence Applications consume relational views and XML documents over ODBC/JDBC, JDBC, and SOAP]

Page 43: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Contextualize (Interpret)

Automated term tokenization

Automated semantic linking using the default knowledge-base contained within MatchIT

[Figure: the term ArticleAmount is tokenized into Article and Amount, which are linked to Sum, Assets, and Creation through Synonym and Type-of relationships]
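
Term tokenization of element names can be approximated with a regular expression. The sketch below is a generic approach and makes no claim about how MatchIT tokenizes terms; the `tokenize` function is hypothetical.

```python
import re


def tokenize(term: str) -> list:
    """Split a CamelCase or snake_case element name into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", term)
    return [p.lower() for p in parts]


print(tokenize("ArticleAmount"))
print(tokenize("bldg_id"))
print(tokenize("SITENUM"))
```

Names such as SITENUM show the limit of purely syntactic splitting: recognizing "site" and "num" inside it needs a dictionary or knowledge base, not a pattern.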

Page 44: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Semantic Matching (Mediate)

• With relationships pre-established within the knowledge-base…

• Identify the Target and the Source(s) and run the match.

[Figure: ArticleAmount and ProductShares, automatically linked by a specific % distance]

Page 45: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Facilitate Decision Making (Mediate)

Helps facilitate rapid decision making

[Figure callouts: target element for matching; source candidate for matching; automatically calculated semantic distance between terms]
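
One simple way to think about a semantic distance between two terms is the number of relationship links separating their tokens in a knowledge base. The sketch below is a toy under loud assumptions: the links are invented, the scoring rule (average hops to the nearest target token) is arbitrary, and it is not the measure MatchIT computes.

```python
from collections import deque

# Toy knowledge base: undirected links between tokens (all hypothetical).
LINKS = {
    ("article", "product"), ("amount", "sum"),
    ("sum", "quantity"), ("quantity", "shares"),
}


def neighbors(token: str) -> set:
    return {b for a, b in LINKS if a == token} | {a for a, b in LINKS if b == token}


def hops(start: str, goal: str):
    """Breadth-first search; returns the number of links, or None if unrelated."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        token, dist = queue.popleft()
        if token == goal:
            return dist
        for nxt in neighbors(token) - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None


def term_distance(source_tokens, target_tokens):
    """Average, over source tokens, of the hops to the nearest target token."""
    best = []
    for s in source_tokens:
        found = [h for t in target_tokens if (h := hops(s, t)) is not None]
        if not found:
            return None
        best.append(min(found))
    return sum(best) / len(best)


print(term_distance(["article", "amount"], ["product", "shares"]))
```

A reviewer could then rank source candidates by this score and confirm or reject the closest ones, which is the decision-support role the slide describes.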

Page 46: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Integration Driven By Semantics

[Figure: layers shown are Physical Sources, Data Models (Relational, XML), Enterprise Model (UML), and Ontology Models (e.g. OWL, RDF)]

• Model & relate information within any domain
• Relate information in different domains/models
• Search within and across domains for related information

Page 47: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Ontology-Driven Integration Example

[Figure: an ontology (Transportation > Land > 4 Wheel, 2 Wheel; 4 Wheel > Truck, Bus, Car; Truck > Fuel Truck, Cargo Truck) is linked by equivalence relationships to Logical Views, which are mapped by transformations (T) to Physical Sources]
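
The value of linking an ontology to logical views is that a request phrased at any level of the hierarchy can be resolved to the views that hold matching data. The sketch below assumes the class hierarchy suggested by the diagram labels; the view names and both helper functions are hypothetical.

```python
# Ontology fragment: class -> parent. The hierarchy is an assumption.
PARENT = {
    "Land": "Transportation", "4 Wheel": "Land", "2 Wheel": "Land",
    "Truck": "4 Wheel", "Bus": "4 Wheel", "Car": "4 Wheel",
    "Fuel Truck": "Truck", "Cargo Truck": "Truck",
}

# Equivalence links from ontology classes to hypothetical logical views.
EQUIVALENT_VIEW = {
    "Fuel Truck": "logistics.fuel_trucks",
    "Cargo Truck": "logistics.cargo_trucks",
    "Bus": "motorpool.buses",
    "Car": "motorpool.cars",
}


def subclasses(cls: str) -> set:
    """Return cls plus every class below it in the hierarchy."""
    found = {cls}
    for child, parent in PARENT.items():
        if parent == cls:
            found |= subclasses(child)
    return found


def views_for(cls: str) -> list:
    """Logical views that must be queried to answer a request about cls."""
    return sorted(EQUIVALENT_VIEW[c] for c in subclasses(cls)
                  if c in EQUIVALENT_VIEW)


print(views_for("Truck"))
print(views_for("4 Wheel"))
```

A consumer asking about trucks never needs to know that fuel trucks and cargo trucks live in different views; the ontology supplies that expansion.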

Page 49: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Track 4 Talks Tomorrow: 2:45-4:15pm

• Predictive Metrics To Guide SOA-Based System Development
– John Salasin, NIST
• Integrating SOA and Ontologies for Information Sharing
– Atif Kureishy, BAH
• SOA & Semantics
– Dave McComb, Semantic Arts

Page 50: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Predictive Metrics To Guide SOA Development

John Salasin, NIST
• Will propose a set of metrics (vocabulary) to characterize SOA-based systems
• These metrics can be assessed at different points in the development lifecycle
– Early stage (concept development)
– Architecture/Construction (system characteristics)
– Operations (robustness, performance, usage, governance)
– Evolution (extensibility, change management)
• Analysis can lead to ongoing refinement at every stage
• Quantitative, incremental Verification & Validation

Page 51: An Introduction to Track 4:  SOA and Metadata  (Semantics)

Integrating SOA and Ontologies for Information Sharing

Atif Kureishy, BAH

• Will discuss approaches for dynamic use of discoverable services

• Leverage semantic understanding/ definition of application domain

• Ontology-driven application case study

Page 52: An Introduction to Track 4:  SOA and Metadata  (Semantics)

SOA & Semantics – Dave McComb

Dave McComb, Semantic Arts

• How firms are using semantic web standards & technology to assist their SOA efforts

• Semantics for service discovery

• Enterprise message modeling

• Dynamic classification of messages
