ontology and semantic web (2011)

© 2012 IBM Corporation Ontologies and the Semantic Web July 2011 [email protected]

Upload: craig-trim

Post on 17-Dec-2014

DESCRIPTION

An Ontology is a description of things that exist and how they relate to each other. Ontologies and Natural Language Processing (NLP) can often be seen as two sides of the same coin.

TRANSCRIPT

Page 1: Ontology and semantic web (2011)

© 2012 IBM Corporation

Ontologies and the Semantic Web

July 2011

[email protected]

Page 2: Ontology and semantic web (2011)

Outline

Triples
– Reification
– Confidence Levels

Ontology
– Design
– Architecture (big picture)
– SPARQL
– Inferencing

Methodology
– Creating a Semantic Network

Page 4: Ontology and semantic web (2011)

Triples

Subject Predicate Object

“The author of Hamlet is Shakespeare”

Shakespeare authorOf Hamlet

Hamlet hasAuthor Shakespeare
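A minimal Python sketch of the two triples above, assuming a triple is nothing more than a (subject, predicate, object) tuple. The `Triple` type, the `INVERSES` table, and the `inverse` helper are illustrative names, not part of any RDF library.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

# Assumed for illustration: authorOf and hasAuthor are inverses of each other.
INVERSES = {"authorOf": "hasAuthor", "hasAuthor": "authorOf"}

def inverse(t: Triple) -> Triple:
    """Swap subject and object and use the inverse predicate."""
    return Triple(t.obj, INVERSES[t.predicate], t.subject)

fact = Triple("Shakespeare", "authorOf", "Hamlet")
flipped = inverse(fact)  # the same fact, read from Hamlet's side
```

Both triples carry the same information; which direction to store is a modelling choice.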

Page 5: Ontology and semantic web (2011)

Triples

“Shakespeare wrote Hamlet in 1876”

Shakespeare authorOf Hamlet

Hamlet writtenIn 1876

Page 6: Ontology and semantic web (2011)

Triples (Reification)

Wikipedia states “Shakespeare wrote Hamlet in 1876”

Wikipedia states Shakespeare

Shakespeare authorOf Hamlet

Hamlet writtenIn 1876

Page 7: Ontology and semantic web (2011)

Triples (Reification)

Wikipedia states “Shakespeare wrote Hamlet in 1876”

Wikipedia states (Hamlet writtenIn 1876)

Shakespeare authorOf Hamlet

Page 8: Ontology and semantic web (2011)

Triples (Confidence Levels)

ShakespeareOnline states (Hamlet writtenIn 1599)

Wikipedia states (Hamlet writtenIn 1876)

When was Hamlet written?
– 1599
– 1876

Page 9: Ontology and semantic web (2011)

Triples (Confidence Levels)

Go from this:
– ShakespeareOnline states (Hamlet writtenIn 1599)

To this:
– (ShakespeareOnline states (Hamlet writtenIn 1599)) hasConfidenceLevel 90
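One way to picture reified statements with confidence levels is as nested tuples. The sketch below is only an illustration: the confidence of 90 comes from the slide, while the value 40 for Wikipedia and the `best_answer` helper are assumptions.

```python
# A reified statement: (source, "states", inner_triple), paired with a confidence.
# 90 is taken from the slide; 40 for Wikipedia is an assumed value.
statements = [
    (("ShakespeareOnline", "states", ("Hamlet", "writtenIn", 1599)), 90),
    (("Wikipedia", "states", ("Hamlet", "writtenIn", 1876)), 40),
]

def best_answer(subject, predicate):
    """Return the object of the matching inner triple with the highest confidence."""
    candidates = [
        (confidence, inner[2])
        for (_source, _states, inner), confidence in statements
        if inner[0] == subject and inner[1] == predicate
    ]
    return max(candidates)[1] if candidates else None

year = best_answer("Hamlet", "writtenIn")  # "When was Hamlet written?"
```

Both claims stay in the store; the confidence level only decides which one is returned.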

Page 10: Ontology and semantic web (2011)

Triples (Confidence Levels)

Page 11: Ontology and semantic web (2011)

What is an Ontology?

Description of the kinds of entities there are and how they are related (Chris Welty)

Page 12: Ontology and semantic web (2011)

Ontology

“Shakespeare wrote Hamlet in 1876”

How many “types” of things are there in this statement?
– Authors
– Books
– Plays
– Years
– Sources
– Characters

What relationships could exist between these types?

Page 13: Ontology and semantic web (2011)

Ontology

Author
– Playwright {Shakespeare, Marlowe}

Book
– Play {Hamlet, Macbeth, Faustus}

RDF:
– Shakespeare a Playwright
– Shakespeare a Author
– Hamlet a Play
– Hamlet a Book

Page 15: Ontology and semantic web (2011)

William Shakespeare [en2:Playwright] was an English poet and playwright, widely regarded as the greatest writer in the English language and the world's pre-eminent dramatist.

Page 20: Ontology and semantic web (2011)

Semantic Chains

AIX hasCommand topas monitors (process uses (CPU hasComponent resources))

Page 21: Ontology and semantic web (2011)

SPARQL

SELECT ?command
WHERE {
  AIX hasCommand ?command .
  ?command monitors/uses CPU
}
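The query above can be emulated over an in-memory set of triples. This is a rough sketch, not a SPARQL engine: the `objects` and `commands_monitoring` helpers are hypothetical, and the `ls` row is an assumed extra command added only for contrast.

```python
# Triples for the semantic chain; "process" and "CPU" are treated as plain names.
triples = {
    ("AIX", "hasCommand", "topas"),
    ("topas", "monitors", "process"),
    ("process", "uses", "CPU"),
    ("AIX", "hasCommand", "ls"),  # assumed extra command that monitors nothing
}

def objects(subject, predicate):
    """All o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def commands_monitoring(os_name, resource):
    """Emulate '?command monitors/uses <resource>' over os_name's commands."""
    return sorted(
        command
        for command in objects(os_name, "hasCommand")
        if any(resource in objects(mid, "uses") for mid in objects(command, "monitors"))
    )

result = commands_monitoring("AIX", "CPU")
```

The `monitors/uses` path is followed in two hops: first `monitors`, then `uses` from each intermediate node.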

Page 23: Ontology and semantic web (2011)

Inference

Ontology Model (Classes):

Product
– SupportedProduct (x hasMaker IBM)

Company
– IBM
– NonIBM (disjoint with IBM)
  • { Microsoft, Oracle, Teradata }

Ontology Model (Predicates):

<Product> hasMaker <Company>

Triple Store data:

Rational Software Architect hasMaker IBM

Inferred:

Rational Software Architect a SupportedProduct
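The class rule on this slide (anything with `hasMaker IBM` is a `SupportedProduct`) can be sketched as a single pass over the store. The `infer_supported` function and the `SQL Server` row are assumptions made for illustration; a real reasoner would derive this from the OWL class definition.

```python
triples = {
    ("Rational Software Architect", "hasMaker", "IBM"),
    ("SQL Server", "hasMaker", "Microsoft"),  # assumed extra row for contrast
}

def infer_supported(store):
    """Class rule from the slide: x is a SupportedProduct if x hasMaker IBM."""
    return {
        (s, "a", "SupportedProduct")
        for s, p, o in store
        if p == "hasMaker" and o == "IBM"
    }

inferred = infer_supported(triples)
```

Only the product made by IBM is classified; the other row yields no new triple.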

Page 26: Ontology and semantic web (2011)

Tivoli Monitoring hasSynonym ITM

Page 27: Ontology and semantic web (2011)

Tivoli Monitoring hasSynonym ITM

ITM hasComponent ITM Agent

Page 28: Ontology and semantic web (2011)

Tivoli Monitoring hasSynonym ITM

ITM hasComponent ITM Agent

Tivoli Monitoring hasComponent Tivoli Monitoring Agent

Tivoli Monitoring Agent hasSynonym ITM Agent

Page 34: Ontology and semantic web (2011)

“Agent” analysis

itm agent 54

db2 agent 32

os agent 32

ul agent 31

monitoring agent 29

oracle agent 22

agent needs 21

itm ul agent 16

windows os agent 15

agent left 14

agent system 14

citrix agent 14

mysap agent 14

unix os agent 13

linux agent 13

Page 35: Ontology and semantic web (2011)

Proximal Verbs (normalized)

monitor

support

configure

run

start

show

build

appear

Page 36: Ontology and semantic web (2011)

Events

Situation Event

Omnibus Event

ITM Event

Minor Event

Triggering Event

Console Event

System Event

TBSM Event

JMX Event

TEC Event

Page 37: Ontology and semantic web (2011)

Blank Nodes

Explicit Characterization vs Implicit (Predicate-driven) Identification

Page 38: Ontology and semantic web (2011)

Blank Nodes

What are blank nodes?
– A way of profiling entities
– A way of identifying entities without explicit identification
– Implicit identification
– Predicate-driven identification of data (rather than explicit characterization)

Examples:
– “That person has a child”
– “That person has a child and a husband”

Page 39: Ontology and semantic web (2011)

Anonymous (Anon) Nodes

What is the difference between an Anon Node and a Blank Node?

An “anonymous node” is an existentially quantified variable

A typical RDF node has an identifier to which it is useful to refer

Page 40: Ontology and semantic web (2011)

Appendix A - Resources

Glossary

Books

Common OWL Editors

Triple Stores

Page 41: Ontology and semantic web (2011)

Glossary

OWL – Web Ontology Language

RDF – Resource Description Framework

SPARQL – SPARQL Protocol and RDF Query Language (a recursive acronym)

Page 42: Ontology and semantic web (2011)

Books

Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL
– Author(s): Dean Allemang and Jim Hendler
– Second Edition

Page 43: Ontology and semantic web (2011)

Common OWL Editors

TopBraid Composer (TBC)

Free Edition (also Standard + Maestro Editions) http://www.topquadrant.com/products/TB_Composer.html

Protege

Free, open source ontology editor and knowledge-base framework http://protege.stanford.edu/

Page 44: Ontology and semantic web (2011)

Triple Stores

Comparison and links here:

http://www.w3.org/wiki/LargeTripleStores

Sesame - scalable and transactional
– May be more suited to web environments
– Setup slightly more complex than Jena TDB

Jena TDB - scalable and very simple set up
– Code samples and API introduction here: http://cattail.boulder.ibm.com/cattail/#[email protected]/files/53A1E4007F0F3DDB8C12752E093F23B6
– The latest version of Jena TDB (0.90) is transactional. Past versions of TDB were not transactional, and may not be suited for web environments.

DB2-RDF – builds on top of the Jena Graph SPI.

https://www.ibm.com/developerworks/mydeveloperworks/blogs/nlp/entry/db2_rdf_nosql_graph_support13

Page 45: Ontology and semantic web (2011)

Appendix B - OWL

OWL (Web Ontology Language)
– Built on top of RDF (same syntax as RDF)

Open World vs Closed World assumption

Parts of an Ontology:
– Header
– Classes and Individuals
– Properties
– Annotations
– Datatypes

Instance vs Subclass

Page 46: Ontology and semantic web (2011)

OWL – Subclasses and Types

alpha rdfs:subClassOf Thing
– a rdf:type alpha
– b rdf:type alpha

beta rdfs:subClassOf alpha
– c rdf:type beta
– d rdf:type beta
– c rdf:type alpha
– d rdf:type alpha
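The type propagation above (c and d are typed as beta, so they are also alpha) can be sketched by walking up the subclass chain. This assumes each class has at most one parent, which is a simplification; `SUBCLASS_OF`, `ASSERTED`, and `types_of` are illustrative names.

```python
SUBCLASS_OF = {"beta": "alpha", "alpha": "Thing"}  # child class -> parent class
ASSERTED = {"a": "alpha", "b": "alpha", "c": "beta", "d": "beta"}  # individual -> class

def types_of(individual):
    """Asserted type plus every superclass reached via rdfs:subClassOf."""
    found = []
    cls = ASSERTED.get(individual)
    while cls is not None:
        found.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return found

c_types = types_of("c")  # asserted beta, inferred alpha and Thing
```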

Page 47: Ontology and semantic web (2011)

OWL – Subclasses and Types

President rdfs:subClassOf Dignitary

Dignitary rdfs:subClassOf Person

This model states:
– All dignitaries are people
– All presidents are dignitaries (and thus, people)

John Smith rdf:type Person

Queen Elizabeth rdf:type Dignitary
– Queen Elizabeth rdf:type Person

GW Bush rdf:type President
– GW Bush rdf:type Dignitary
– GW Bush rdf:type Person

Barack Obama rdf:type President
– Barack Obama rdf:type Dignitary
– Barack Obama rdf:type Person

How do we expand this model to classify actively-serving American presidents?

Page 49: Ontology and semantic web (2011)

Appendix C – OWL Properties

Transitive Property

Functional Property

Inverse Functional Property

Symmetric Property

Asymmetric Property

Reflexive Property

Irreflexive Property

Property Chains

Putting it all together

Others

Page 50: Ontology and semantic web (2011)

Transitive Property

hasVersion rdf:type owl:TransitiveProperty

Windows hasVersion Windows XP

Windows XP hasVersion Windows XP SP2

Windows hasVersion Windows XP SP2
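The third triple above is not asserted; it follows from transitivity. A small sketch of that closure, assuming the relation is held as a set of (subject, object) pairs and using a hypothetical `transitive_closure` helper:

```python
def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are both present."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new

has_version = {("Windows", "Windows XP"), ("Windows XP", "Windows XP SP2")}
closed = transitive_closure(has_version)
```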

Page 51: Ontology and semantic web (2011)

Functional Property

ssn-name rdf:type owl:FunctionalProperty

123-45-6789 ssn-name Bob Smith

123-45-6789 ssn-name Robert Smythe

Bob Smith owl:sameAs Robert Smythe
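A functional property allows one value per subject, so two values for the same subject must denote the same thing. The sketch below derives that `owl:sameAs` link; `same_as_from_functional` is a hypothetical helper, not a reasoner API.

```python
from itertools import combinations

def same_as_from_functional(triples, prop):
    """For a functional property, two objects of one subject must be the same thing."""
    by_subject = {}
    for s, p, o in triples:
        if p == prop:
            by_subject.setdefault(s, set()).add(o)
    return {
        (a, "owl:sameAs", b)
        for objs in by_subject.values()
        for a, b in combinations(sorted(objs), 2)
    }

data = [
    ("123-45-6789", "ssn-name", "Bob Smith"),
    ("123-45-6789", "ssn-name", "Robert Smythe"),
]
links = same_as_from_functional(data, "ssn-name")
```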

Page 52: Ontology and semantic web (2011)

Inverse Functional Property

hasSpeKey rdf:type owl:InverseFunctionalProperty

File Net Web Services hasSpeKey 5724S03

FN WS hasSpeKey 5724S03

File Net Web Services owl:sameAs FN WS
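An inverse functional property works like a unique key: subjects that share a value are the same thing. A sketch of the grouping, with `same_as_from_inverse_functional` as an assumed helper name:

```python
from collections import defaultdict
from itertools import combinations

def same_as_from_inverse_functional(triples, prop):
    """Subjects sharing an object of an inverse functional property are the same thing."""
    by_object = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            by_object[o].add(s)
    return {
        (a, "owl:sameAs", b)
        for subjects in by_object.values()
        for a, b in combinations(sorted(subjects), 2)
    }

data = [
    ("File Net Web Services", "hasSpeKey", "5724S03"),
    ("FN WS", "hasSpeKey", "5724S03"),
]
links = same_as_from_inverse_functional(data, "hasSpeKey")
```

The pair comes out in sorted order; since `owl:sameAs` is symmetric, the direction does not matter.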

Page 53: Ontology and semantic web (2011)

Symmetric Property

siblingOf rdf:type owl:SymmetricProperty

Tim siblingOf Jim

Jim siblingOf Tim

Page 54: Ontology and semantic web (2011)

Asymmetric Property

hasParent rdf:type owl:AsymmetricProperty

Stewie hasParent Peter

Peter does not have parent Stewie

Page 55: Ontology and semantic web (2011)

Reflexive Property

Page 56: Ontology and semantic web (2011)

Irreflexive Property

Page 57: Ontology and semantic web (2011)

Property Chain

hasGrandfather owl:propertyChainAxiom ( hasFather hasFather ) .

John III hasFather John JR

John JR hasFather John SR

John III hasGrandfather John SR
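The chain can be read as "follow hasFather twice, then assert hasGrandfather". A minimal sketch of that composition over a list of triples, with `apply_chain` as a hypothetical helper:

```python
def apply_chain(triples, chain, derived):
    """Derive (x, derived, z) whenever the predicates in `chain` link x to z in order."""
    frontier = {(s, s) for s, _p, _o in triples}  # (start, current) pairs
    for predicate in chain:
        frontier = {
            (start, o)
            for start, current in frontier
            for s, p, o in triples
            if s == current and p == predicate
        }
    return {(start, derived, end) for start, end in frontier}

family = [
    ("John III", "hasFather", "John JR"),
    ("John JR", "hasFather", "John SR"),
]
grandfathers = apply_chain(family, ["hasFather", "hasFather"], "hasGrandfather")
```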

Page 58: Ontology and semantic web (2011)

Putting it all together …

hasSynonym
– Transitive, Symmetric
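Declaring hasSynonym both transitive and symmetric means every name in a synonym group ends up linked to every other. A sketch of that closure follows; `synonym_closure` is a hypothetical helper, and the third name is an assumed synonym added for illustration.

```python
def synonym_closure(pairs):
    """Close hasSynonym under symmetry and transitivity (self-pairs excluded)."""
    closure = set(pairs) | {(b, a) for a, b in pairs}  # symmetric step
    while True:
        new = {
            (a, d)
            for a, b in closure
            for c, d in closure
            if b == c and a != d
        } - closure  # transitive step
        if not new:
            return closure
        closure |= new

asserted = {
    ("Tivoli Monitoring", "ITM"),
    ("ITM", "IBM Tivoli Monitoring"),  # assumed extra synonym for illustration
}
closed = synonym_closure(asserted)
```

Two asserted triples expand to six, which is why only one direction needs to be stored.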

Page 59: Ontology and semantic web (2011)

Appendix D - Classic Mereology

Transitive Axiom

Reflexive Axiom

Antisymmetric Axiom

Page 60: Ontology and semantic web (2011)

Transitive Axiom

parts of parts are parts of the whole

If A is part of B and B is part of C, then A is part of C

Page 61: Ontology and semantic web (2011)

Reflexive Axiom

everything is part of itself
– A is part of A

Page 62: Ontology and semantic web (2011)

Antisymmetric Axiom

nothing is a part of its parts
– if A is part of B and A != B then B is not part of A
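Taken together, the three axioms say that part-of is a partial order. The sketch below checks them on a small relation; the `is_partial_order` helper and the car example data are assumptions chosen for illustration.

```python
def is_partial_order(elements, part_of):
    """Check the reflexive, transitive and antisymmetric axioms on a part-of relation."""
    reflexive = all((x, x) in part_of for x in elements)
    transitive = all(
        (a, d) in part_of
        for a, b in part_of
        for c, d in part_of
        if b == c
    )
    antisymmetric = all(a == b for a, b in part_of if (b, a) in part_of)
    return reflexive and transitive and antisymmetric

parts = {"Crankcase", "Engine", "Car"}
part_of = {(x, x) for x in parts} | {
    ("Crankcase", "Engine"),
    ("Engine", "Car"),
    ("Crankcase", "Car"),
}
ok = is_partial_order(parts, part_of)
```

Adding ("Car", "Crankcase") would break antisymmetry, and dropping ("Crankcase", "Car") would break transitivity.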

Page 63: Ontology and semantic web (2011)

Appendix E - Partonomy

Can you distinguish parts from kinds?

Why is this important?

This is often the difference between a taxonomy and an ontology
– A taxonomy doesn’t need to distinguish between parts and kinds
– An ontology must make this distinction

Vehicle
- Car
-- Engine
--- Crankcase
---- Aluminum Crankcase

Page 64: Ontology and semantic web (2011)

Partonomy

Page 65: Ontology and semantic web (2011)

Partonomy

Page 66: Ontology and semantic web (2011)

Appendix F – Common Predicates

hasPart
– hasPart owl:inverseOf partOf
– hasPart rdf:type owl:TransitiveProperty
– partOf rdf:type owl:TransitiveProperty

hasLocus
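The `owl:inverseOf` declaration for hasPart and partOf means each stored triple implies its mirror image. A short sketch, assuming a set-of-tuples store and a hypothetical `add_inverses` helper:

```python
def add_inverses(triples, prop, inverse_prop):
    """Add (o, inverse_prop, s) for every (s, prop, o), and the other way round."""
    result = set(triples)
    for s, p, o in triples:
        if p == prop:
            result.add((o, inverse_prop, s))
        elif p == inverse_prop:
            result.add((o, prop, s))
    return result

store = {("Car", "hasPart", "Engine"), ("Crankcase", "partOf", "Engine")}
expanded = add_inverses(store, "hasPart", "partOf")
```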

Page 67: Ontology and semantic web (2011)

Appendix G

Blank nodes

Anonymous (Anon) nodes

Quads

Page 68: Ontology and semantic web (2011)

Quads

(Reference Jena Tutorial with TDB.ppt)

Page 69: Ontology and semantic web (2011)

Maintenance*

The relational model has relations between entities established through explicit keys (primary, foreign) and associative entities.

– Changing relationships in this case is cumbersome, as it requires changes to the base model structure itself.

– Changes in an RDBMS can be difficult for a populated database.

Hierarchical models have similar limitations.

The graph model (RDF) makes it much easier to maintain the model once it is deployed.

– A critical point is that relations are part of the data, not part of the database structure.

– If a new relationship needs to be added that was not anticipated, a new triple is simply added to the datastore.

– A graph model can be traversed from any perspective. In contrast, other types of database designs might require structural changes to answer new questions that arise after initial implementation.

Page 70: Ontology and semantic web (2011)

Design Styles

Avoid proliferating owl:inverseOf [1]