Ontologies and DB Schema: What's the Difference?
DESCRIPTION
"What is the difference between an ontology and a database schema"? Since the early days of the now maturing ontology field, this has been a persistent question that has never been adequately answered. We define each concept for a high level comparison and then ask the following questions about each concept: 1) What is it for? 2) What does it look like? 3) How do you build one? 4) How is it implemented and used? and 5) Where are the semantics?This gives rise to many other questions. For example: What is the role of constraints? of instances? Is there an analogy in ontology development for the process or database schema normalization? How is change management handled?The differences between database schema and ontologies are many, varied and illuminating. Most arise from their different purposes and historical origins. There are also striking similarities. We wondered whether database schema and ontologies were more alike than different. We reached a surprising conclusion!TRANSCRIPT
Ontologies and Database Schema: What’s the Difference?
Michael Uschold, PhD
Semantic Arts
Engineering, Operations & Technology | Phantom Works E&IT | Mathematics and Computing Technology
Copyright © 2011 Michael Uschold. All rights reserved.
Objective
To settle once and for all the question:
What is the difference between an
ontology and a database schema?
• Often asked.
• Never adequately answered.
Page 2
What is the same?
What is different?
Do we need both?
Example Ontology
Page 3
Hydraulics
[Figure: example hydraulics ontology. Node labels: Mechanical Device, Pump, Engine, Hydraulic Pump, Fuel Pump, Aircraft Engine Driven Pump, Fuel System, Hydraulic System, Jet Engine, Fuel Filter, Pumping. Relationship labels: part-of, connected-to, supplies-fuel-to, has-part, done-by.]
Example: Logical DB Schema (IDEF1X)
Page 4
Hydraulics
Example: Hydraulics
Is this similarity just superficial?
[Figure: two panels for comparison. Ontology (the hydraulics diagram from the Example Ontology slide) and Logical DB Schema: IDEF1X.]
Page 5
The Basic Idea for Each
Ontology:
• defines a set of concepts and relationships
• that represent the content and structure
• of some subject matter
• in a formal language.
Database Schema:
• defines the structure of a database in a formal language.
(loosely refers to any of: conceptual, logical, physical)
Ontology: for shared understanding.
Database Schema: for a database.
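The two basic ideas above can be made concrete with a minimal sketch. The class and relationship names come from the hydraulics example, but the specific triples, the `is-a` relation name, and the table and column names are assumptions chosen for illustration; they are not taken from the slides' diagrams.

```python
import sqlite3

# Ontology view: concepts and relationships stated as explicit triples.
# The individual triples are illustrative assumptions, not the slide's diagram.
ontology = {
    ("Pump", "is-a", "Mechanical Device"),
    ("Engine", "is-a", "Mechanical Device"),
    ("Hydraulic Pump", "is-a", "Pump"),
    ("Fuel Pump", "is-a", "Pump"),
    ("Jet Engine", "is-a", "Engine"),
    ("Fuel Pump", "part-of", "Fuel System"),
    ("Fuel Pump", "supplies-fuel-to", "Jet Engine"),
}

# Schema view: the structure of a database (hypothetical tables and columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fuel_system (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE pump (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,
    fuel_system_id INTEGER REFERENCES fuel_system(id)
);
""")

# List the tables the schema defines.
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

In this sketch the triples say what a fuel pump is and how it relates to other things, while the schema only says where pump rows are stored and which row they point to.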
Page 6
Five Questions to get a Better Understanding
1. What is it for?
2. What does it look like?
3. How do you build one?
4. How is it implemented and used?
5. Where are the semantics?
“it”: ontology or database schema
Are they more alike or different?
Quick Poll: more alike? more different?
Page 7
What is it for?
Focus
• DB Schema: Data
• Ontology: Meaning
Core Purpose(s)
• DB Schema: Structure instances for efficient storage and querying. Single purpose.
• Ontology: Shared understanding: human communication, interoperability, search, software engineering, … Also for structuring instances.
By the way…
• DB Schema: Meaning lost.
• Ontology: Instances optional.
Page 8
What does it look like? Notation
Notation: syntax
• DB Schema: ER diagrams; no standard serialization syntax.
• Ontology: Logic; no standard diagram notation syntax.
Notation: semantics
• DB Schema: Minimal focus on formal semantics.
• Ontology: Strong focus on formal semantics.
Page 9
What does it look like? Expressivity
Expressivity overlap
• DB Schema: Entities, Attributes, Relations, Constraints
• Ontology: Classes, Properties, Axioms
Expressivity differences
• DB Schema: No taxonomy. Constraints for integrity, foreign key, delete.
• Ontology: Taxonomy is backbone. Constraints for meaning, consistency & integrity.
Cardinality constraints.
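To illustrate what "taxonomy is backbone" buys you, here is a small sketch of subclass reasoning over the hydraulics classes. The `SUBCLASS_OF` links are assumed for illustration, and single inheritance is a simplification; ontology languages generally allow a class to have several superclasses.

```python
# Hypothetical subclass links; each class has at most one parent in this sketch.
SUBCLASS_OF = {
    "Pump": "Mechanical Device",
    "Engine": "Mechanical Device",
    "Hydraulic Pump": "Pump",
    "Fuel Pump": "Pump",
    "Jet Engine": "Engine",
}


def ancestors(cls):
    """Return every superclass of cls, nearest first."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_a(cls, candidate):
    """True if cls is candidate or a (transitive) subclass of it."""
    return cls == candidate or candidate in ancestors(cls)
```

A query for all mechanical devices can then pick up fuel pumps and jet engines without either being listed as a mechanical device directly. A schema with no taxonomy has to encode that knowledge in table design or application code.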
Page 10
How to Build One?
Starting point:
• DB Schema: Scratch, rarely reuse.
• Ontology: Reuse if possible.
Normalization:
• DB Schema: Standard rules in natural language, little tool support. Fundamental step.
• Ontology: No standard rules or guidelines. OntoClean.
Optimization:
• DB Schema: Manual, geared to specific queries for specific DB. Toss expensive constraints. Lost meaning.
• Ontology: Ontology independent; inference engine developers. Some tuning may be required. A black art.
Page 11
How is it Implemented and Used?
Change Management, Agility, Flexibility
• DB Schema: Locked into specific set of queries per DB. Tight coupling. Lost meaning. Hard to evolve and maintain. ETL tools to help. Semantics hardwired in procedural code.
• Ontology: No query lock-in. Queries usable on other systems. Looser coupling. Semantics explicit. Potentially easier to evolve & maintain. Few tools.
Still no picnic!
Page 12
How is it Implemented and Used?
Processing Engines
• DB Schema: SQL Engines. Standardized on SQL.
• Ontology: Theorem Provers. Less standardization.
Primary Focus:
• DB Schema: Queries. Reasoning with Views. Data integrity.
• Ontology: Derive new information from existing information. Consistency and integrity.
Both handle complex logical expressions.
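"Derive new information from existing information" can be sketched as a tiny forward-chaining loop. This is only an illustration, not how a production theorem prover works. It assumes two rules that the slides do not state: that `part-of` is transitive and that `has-part` is its inverse. The starting facts are also made up for the example.

```python
def infer(facts):
    """Apply two assumed rules until no new triples appear:
    1. part-of is transitive.
    2. has-part is the inverse of part-of.
    """
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            if p != "part-of":
                continue
            new.add((o, "has-part", s))          # inverse rule
            for s2, p2, o2 in facts:
                if p2 == "part-of" and s2 == o:  # transitivity rule
                    new.add((s, "part-of", o2))
        if new <= facts:                         # fixed point reached
            return facts
        facts |= new


# Illustrative starting facts (assumptions, not read off the slide's diagram).
stated = {
    ("Fuel Filter", "part-of", "Fuel Pump"),
    ("Fuel Pump", "part-of", "Fuel System"),
}
derived = infer(stated)
```

Only two facts were stated, yet `derived` also contains facts such as the fuel filter being part of the fuel system. A SQL engine would need a view or a recursive query written for that specific question.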
Page 13
How is it Implemented and Used?
Performance
• DB Schema: Highly tuned for performance and scale. Does not work well with too many joins.
• Ontology: Full inferencing: much smaller scale. Reduced inferencing: reaching large scale.
RDB and Triple Stores both have a Niche.
Tradeoff: Performance vs. Function & Flexibility
Page 14
How is it Implemented and Used?
Building Databases & Applications
• Gather Requirements
• Build Conceptual Schema
(e.g., ER or UML model)
• Refine to Logical Schema
(e.g., normalize, still ER or UML)
• Refine to Physical Schema…
Page 15
How is it Implemented and Used?
Refine to Physical Schema
• Define tables, columns, keys.
• Optimize to Specific Kinds of Queries.
• Create Data Dictionary (semantics for humans).
• Integrity Constraints
  • Domain, Referential & Semantic integrity
• Do the best you can, little automation.
• Where are the Semantics?
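The three kinds of integrity named above can be sketched declaratively in SQLite. The tables, columns, and the pressure rule are hypothetical. In practice, as the slides note elsewhere, many such rules end up in procedural application code rather than in the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE engine (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE pump (
    id INTEGER PRIMARY KEY,
    -- Domain integrity: only known kinds of pump are allowed.
    kind TEXT NOT NULL CHECK (kind IN ('hydraulic', 'fuel')),
    -- Referential integrity: every pump must point at an existing engine.
    engine_id INTEGER NOT NULL REFERENCES engine(id),
    -- Semantic integrity: an assumed business rule about pressure.
    max_pressure_psi REAL CHECK (max_pressure_psi > 0)
);
""")
conn.execute("INSERT INTO engine (id, name) VALUES (1, 'Jet Engine')")

INSERT_PUMP = "INSERT INTO pump (kind, engine_id, max_pressure_psi) VALUES (?, ?, ?)"


def rejected(kind, engine_id, max_pressure_psi):
    """Return True if the database refuses the row because of a constraint."""
    try:
        conn.execute(INSERT_PUMP, (kind, engine_id, max_pressure_psi))
    except sqlite3.IntegrityError:
        return True
    return False
```

Each constraint guards the data, but none of them records why the rule exists. That explanation lives, if anywhere, in the data dictionary.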
Page 16
Five Questions to get a Better Understanding
1. What is it for?
2. What does it look like?
3. How do you build one?
4. How is it implemented and used?
5. Where are the semantics?
“it”: ontology or database schema
Page 17
Where are the Semantics for Database Schema?
• Mainly in Conceptual Schema and Data Dictionary
• Designed for Humans
• Semantics don’t evolve as DB and applications change
• Conceptual Schema semantics thrown away when building Physical Schema.
• Integrity constraints hardwired in procedural code
Semantics are hardwired, lost, tossed, or out of date.
You cannot maintain what you do not understand!
Page 18
Quick Summary
Database Schema
Focus on Data
DB Constraints
• to ensure integrity
• may hint at meaning
No ISA Hierarchy
SQL Engines
• querying, views
• data integrity
Instances Central
Data Dictionary
• separate artifact
Ontology
Focus on Meaning
Ontology Axioms
• to specify meaning
• maybe for integrity
ISA Hierarchy is Backbone
Theorem Provers
• infer new information
• ensure consistency
Instances Optional
'Comments'
• part of the ontology
Page 19
More Detailed Summary: 24 Features/Aspects
Core for DB Schema, Core for Ontology
1. Represented using a formal language.
2. Expressivity: types, properties, constraints.
3. Constraints for consistency checking.
Secondary for DB Schema, Core for Ontology
4. Shared meaning of some subject matter.
5. Taxonomy.
6. Multi-purpose.
7. Embedded natural language definitions.
8. Constraints for meaning.
9. Constraints for ensuring self-consistency (not data).
Core for DB Schema, Secondary for Ontology
10. Efficient querying and storage for data.
11. Standardized diagram notation.
12. Separate natural language definitions (data dictionary).
13. Constraints for data integrity.
14. Industry-wide construction guidelines (normalization).
15. Scale to huge sizes.
Unimportant for DB Schema, Core for Ontology
16. Abstract types w/no instances.
17. Reused to build new ones.
18. Reused in unexpected ways.
19. Formal model-theoretic semantics.
Core for DB Schema, Unimportant for Ontology
20. Cardinality constraints for getting foreign keys right and ensuring tables created for many-to-many relationships.
21. Toss semantics after conceptual modeling.
22. Optimization for specific set of queries.
23. Sophisticated tool support for migrating data when schema evolve (ETL).
24. SQL for querying data.
More Alike?
• Over 60% of features are common to both.
• The 3 features core to both are the most important: what is expressed and how.
More Different?
• Only 12% of features are core to both.
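The two percentages can be checked with a few lines of arithmetic. The grouping below is an assumption reconstructed from the slide layout: each of the 24 features sits in a cell keyed by its importance for DB schema and for ontology, and a feature counts as "common to both" when it is not marked unimportant for either.

```python
# (importance for DB schema, importance for ontology) -> feature numbers.
# This assignment is a reading of the slide's grid, not something it states outright.
GRID = {
    ("core", "core"): range(1, 4),            # features 1-3
    ("secondary", "core"): range(4, 10),      # features 4-9
    ("core", "secondary"): range(10, 16),     # features 10-15
    ("unimportant", "core"): range(16, 20),   # features 16-19
    ("core", "unimportant"): range(20, 25),   # features 20-24
}

total = sum(len(features) for features in GRID.values())
common = sum(
    len(features)
    for importance, features in GRID.items()
    if "unimportant" not in importance
)
core_to_both = len(GRID[("core", "core")])

share_common = common / total          # fraction relevant to both
share_core_both = core_to_both / total # fraction core to both
```

Under this reading, 15 of 24 features are common to both and 3 of 24 are core to both, which matches the "over 60%" and "12%" figures on the slide.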
Can you convert one into the other?
Page 20
More Alike? or More Different?
Conversion?
• Conceptual Schema & Ontology
No harder than between two different ontology languages
• Ontology & Logical Schema
Some loss of information, a design artifact
• Ontology & Physical Schema
Much loss of information, an implementation artifact
My Big Fat Greek Wedding Toast:
“… even though we are apples and oranges, we are all fruit.”
Page 21
Conclusion
Speaking of weddings… What do you get if you cross:
• DB schema technology/community?
• Ontology technology/community?
• They are similar beasts.
• They evolved in different communities.
Two cultures divided by a common idea?
Page 22
Ontology/Model-Driven Architecture & Development
Basic Idea:
• Explicitly capture the semantics as formal ontologies.
• Base design- and implementation-level artifacts on the ontologies.
• Ontologies drive the applications, directly or indirectly.
Benefits:
• Looser coupling.
• Semantics is explicit.
• Integration/Interoperability by design.
• Inference gives automated consistency checking.
• Easier to evolve and maintain:
  • Less hardwiring of semantics means easier to understand and change.
  • Ontologies evolve with the DB and applications.
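One way to picture "base design- and implementation-level artifacts on the ontologies" is to generate table definitions from a small class model. This is a rough sketch under stated assumptions: the `MODEL` dictionary, its property names, and the one-table-per-class mapping are all hypothetical and are not the approach described in the talk.

```python
import sqlite3

# Hypothetical mini-ontology: class -> (superclass, {property: SQL type}).
MODEL = {
    "MechanicalDevice": (None, {"name": "TEXT"}),
    "Pump": ("MechanicalDevice", {"flow_rate": "REAL"}),
    "FuelPump": ("Pump", {"fuel_type": "TEXT"}),
}


def all_properties(cls):
    """Collect properties declared on cls plus those inherited from superclasses."""
    parent, props = MODEL[cls]
    inherited = all_properties(parent) if parent else {}
    return {**inherited, **props}


def ddl(cls):
    """Generate a CREATE TABLE statement for one class of the model."""
    columns = ", ".join(
        f"{name} {sql_type}" for name, sql_type in all_properties(cls).items()
    )
    return f"CREATE TABLE {cls} (id INTEGER PRIMARY KEY, {columns})"


conn = sqlite3.connect(":memory:")
conn.execute(ddl("FuelPump"))
columns = [row[1] for row in conn.execute("PRAGMA table_info(FuelPump)")]
```

When the model changes, the table definition is regenerated rather than edited by hand, so the semantics stay in one place. Note that the generated table flattens the class hierarchy, which is an instance of the information loss mentioned for ontology-to-physical-schema conversion.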
Conceptual, Real world
Logical, Design world
Physical, Implementation world
Page 23
Acknowledgements
For answering countless questions as a cube-mate:
• John Thompson
For reviews of a companion unpublished paper:
• Phil Bernstein,
• Tim Wilmering,
• Jun Yuan,
• Anhai Doan,
• Bill Andersen,
• Amit Sheth.
For ideas on model-driven development:
• Simon Robe.
Page 24