Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Copyright © 2010 Michael Uschold. All rights reserved. Avoiding a Semantic Web Road Block: URI Management and Ontology Evolution. Michael Uschold, PhD: Independent Consultant. Friday 25 June 2010, Semantic Technology Conference, San Francisco, CA.


DESCRIPTION

We highlight the importance of creating a set of guidelines for managing URIs during ontology evolution and when linking open data. We examine some potential and actual negative impacts of making the wrong decision. For example, the new version of SKOS changes the semantics of existing terms without changing their URIs. This places a heavy burden on developers of ontology-driven applications, who must keep those applications from breaking. Alternatively, minting a whole set of new URIs when the meaning of most of the terms is unchanged causes an unnecessary proliferation of URIs that adds computational and conceptual overhead. We suggest a way forward based on examining two root causes of the problem: 1) URIs are overloaded and 2) there is no good technology for change management. As linked data grows and as applications are driven more and more by ontologies, the negative impacts of inadequate URI management could severely slow the growth of the semantic web.

TRANSCRIPT

Page 1: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Copyright © 2010 Michael Uschold. All rights reserved.

Avoiding a Semantic Web Road Block:

URI Management and Ontology Evolution

Michael Uschold, PhD: Independent Consultant

Friday 25 June 2010

Semantic Technology Conference, San Francisco, CA

Page 2: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Engineering, Operations & Technology | Phantom Works E&IT | Mathematics and Computing Technology


Outline

• Examples of linked data in the wild

• Problems

• Root Causes

• What to do?

Page 3: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution


Ontologies and Linked Data in the Wild: SKOS

Simple Knowledge Organization System (SKOS)

• Small vocabulary (20 terms)

• Evolve to new version

Changes:

• Majority of terms are the same

• Change semantics of broader: no longer transitive
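To make the impact of the broader change concrete, here is a minimal sketch in plain Python (no RDF library). The concept names and both helper functions are hypothetical. The only point is that the same query over the same data gives different answers depending on whether broader is read as transitive.

```python
# Toy data: (narrower, broader) pairs. The concept names are made up.
BROADER = {
    ("poodle", "dog"),
    ("dog", "mammal"),
    ("mammal", "animal"),
}


def direct_broader(concept, pairs):
    """Non-transitive reading: only explicitly asserted links count."""
    return {b for (n, b) in pairs if n == concept}


def transitive_broader(concept, pairs):
    """Transitive reading: follow broader links until nothing new appears."""
    found = set()
    frontier = {concept}
    while frontier:
        step = set()
        for c in frontier:
            step |= direct_broader(c, pairs)
        frontier = step - found
        found |= frontier
    return found


old_answer = transitive_broader("poodle", BROADER)
new_answer = direct_broader("poodle", BROADER)
print(sorted(old_answer))  # ['animal', 'dog', 'mammal']
print(sorted(new_answer))  # ['dog']
```

An application written against the transitive reading would silently lose two of its three answers if it switched to the non-transitive one, even though the URI it queries is unchanged.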

Page 4: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Ontologies and Linked Data in the Wild: WordNet

WordNet: lexical database for English language

• Large vocabulary

• Evolve to new version

Changes:

• Majority of terms are the same

• Significant number of updates and changes

Page 5: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Ontologies and Linked Data in the Wild: Open Biomedical Ontologies

Open Biomedical Ontologies

• Very large vocabulary

• Interconnected ontologies

• Undergoing continual evolution (daily)

Changes:

• Majority of terms are the same

• Significant number of updates and changes

Page 6: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Versioning and URIs: Options

A. Mint all new URIs, even for unchanged terms.

B. Keep URIs the same, even when semantics changes.

C. Mint new URIs only for changed terms.

(including the ontology URI)

    a. Throw away old terms.

    b. Deprecate old terms for backwards compatibility.
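The three options can be compared side by side with a small sketch. Everything here is an illustrative assumption: the `example.org` namespaces, the `evolve` function, and the two-term vocabulary are not from the talk. For option C the sketch follows sub-option (b), keeping the old URI and marking it deprecated.

```python
OLD_NS = "http://example.org/vocab/v1#"  # hypothetical namespace
NEW_NS = "http://example.org/vocab/v2#"  # hypothetical namespace


def evolve(terms, changed, option):
    """Assign URIs for the next version under option "A", "B", or "C".

    terms:   dict mapping term name -> URI in the current version
    changed: set of term names whose semantics changed
    Returns (uris, deprecated): the new name -> URI map, plus the set of
    old URIs kept only for backwards compatibility (sub-option C.b).
    """
    if option not in {"A", "B", "C"}:
        raise ValueError(f"unknown option: {option}")
    uris, deprecated = {}, set()
    for name, uri in terms.items():
        if option == "A":
            uris[name] = NEW_NS + name   # everything is re-minted
        elif option == "C" and name in changed:
            uris[name] = NEW_NS + name   # re-mint only what changed
            deprecated.add(uri)          # old URI stays, marked deprecated
        else:
            uris[name] = uri             # option B, or unchanged under C
    return uris, deprecated


v1 = {"broader": OLD_NS + "broader", "prefLabel": OLD_NS + "prefLabel"}
for option in "ABC":
    uris, deprecated = evolve(v1, changed={"broader"}, option=option)
    print(option, sorted(uris.values()), sorted(deprecated))
```

Under A every client-visible URI changes, under B none do (so the semantic change is invisible), and under C only the changed term gets a new URI.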

Page 7: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

(A) Mint all new URIs: Impacts

Usage Scenario

1. Application loads ontology O1 and data D1

2. New version: O2

    1. All new URIs

    2. No idea which terms have different semantics

3. New dataset D2, created and loaded into application

4. Query using old URIs

WRONG ANSWERS: Ignores data from new URIs

Maintenance headaches: find semantic matches

Performance problems: if owl:sameAs is used

Broken applications

Convenient for first time users.
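The usage scenario above can be played out in a few lines. This is a sketch under stated assumptions: the namespaces, the `Employee` class, the instance names, and the `instances_of` helper are all invented for illustration, and the equivalence table only imitates what an owl:sameAs-style mapping would provide.

```python
O1 = "http://example.org/onto/1.0#"  # hypothetical namespace for O1
O2 = "http://example.org/onto/2.0#"  # hypothetical namespace for O2

D1 = [("alice", O1 + "Employee"), ("bob", O1 + "Employee")]
D2 = [("carol", O2 + "Employee")]  # same meaning, freshly minted URI


def instances_of(class_uri, data, same_as=None):
    """Return subjects typed with class_uri or any URI declared equivalent."""
    accepted = {class_uri}
    if same_as:
        accepted |= same_as.get(class_uri, set())
    return sorted(s for (s, c) in data if c in accepted)


loaded = D1 + D2

# Step 4 of the scenario: the query still uses the old URI.
print(instances_of(O1 + "Employee", loaded))  # ['alice', 'bob']

# Workaround: a hand-maintained equivalence table, in the spirit of owl:sameAs.
same_as = {O1 + "Employee": {O2 + "Employee"}}
print(instances_of(O1 + "Employee", loaded, same_as))  # ['alice', 'bob', 'carol']
```

The first query quietly omits carol. The second is correct, but someone has to find and maintain every such match, and every lookup now pays for the extra indirection.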

Page 8: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

(B) Same URIs, Different Semantics: Impacts

Usage Scenario

1. Application loads ontology O1 and data D1

2. Create application functionality that depends on O1

3. New version: O2

    1. Some terms now have different semantics, but the same URIs

    2. No idea which terms have different semantics

4. New dataset D2, created and loaded into application

5. Run functionality that depends on O1 semantics

WRONG ANSWERS: mixing different semantics

Maintenance Headaches: find semantic matches

Broken Applications

Convenient for first time users.

Page 9: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

(C) New URIs only for changed terms: Impacts

Usage Scenarios

1. No broken applications

2. No performance problems

3. No maintenance headaches

Inconvenience of having same ontology with multiple namespaces.

Page 10: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Pros and Cons

Option                                     Maintenance  Performance  Broken  Multiple namespaces,  Convenient for
                                           headaches    problems     apps    same ontology         first-time users
A: All new URIs                            x            x            x                             x
B: Same URIs, changed semantics            x                         x                             x
C: New URIs only for new or changed terms                                    x


What would YOU do?

What did THEY do?

WordNet, SKOS, Open Biomedical Ontologies

Page 11: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution


What Actually Happened?

Open Biomedical Ontologies: (C)

New URIs only for new terms, deprecate old terms

SKOS: (B) Same URIs, Different Semantics

WordNet: (A) Mint all new URIs, multiple times!

http://wordnet.princeton.edu/~agraves/wordnet/0.9/

http://xmlns.com/wordnet/1.6/

http://www.w3.org/2006/03/wn/wn20/instances/

http://www.loa-cnr.it/wn30/instances/

But wait, there’s more:

http://purl.org/vocabularies/princeton/wn30/

http://www.ontologyportal.org/WordNet.owl#WN30-200662589

Page 12: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Why no Uproar?

• SKOS is not a standard

• SKOS is not used by that many people

• It’s just life, people get by

• Few ontology-driven applications

• BUT: this is changing, and business as usual could result in a Semantic Web Roadblock down the road.

Page 13: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Another Example: DBpedia and Yago

• DBpedia published, without any ontology

• YAGO team created ontology from DBpedia

    • Subset of Wikipedia category hierarchy

    • Only when aligned with WordNet hierarchy

    • http://www.mpii.de/yago/resource/wordnet_calculator_102938886

• DBpedia team added Yago Classes to their datasets, but different URIs were used.

    • http://dbpedia.org/class/yago/Calculator102938886

ISSUES:

• Proliferation of URIs.

• A lot of semantics hidden in names.
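A short sketch shows how much is hidden in the two names. The parsing pattern below is an assumption inferred only from the two URIs on this slide, and `hidden_parts` is a hypothetical helper. The trailing digits are assumed here to be a WordNet identifier. Nothing in either URI declares any of this to a machine; the link between the two is recoverable only by string surgery.

```python
import re

YAGO_URI = "http://www.mpii.de/yago/resource/wordnet_calculator_102938886"
DBPEDIA_URI = "http://dbpedia.org/class/yago/Calculator102938886"


def hidden_parts(uri):
    """Split the last path segment into (lower-cased word, trailing digits)."""
    local = uri.rsplit("/", 1)[-1]
    # Pattern guessed from the two example URIs only; other names may not fit.
    match = re.fullmatch(r"(?:wordnet_)?([A-Za-z]+)_?(\d+)", local)
    if match is None:
        raise ValueError(f"unrecognised local name: {local}")
    word, digits = match.groups()
    return word.lower(), digits


print(hidden_parts(YAGO_URI))     # ('calculator', '102938886')
print(hidden_parts(DBPEDIA_URI))  # ('calculator', '102938886')
print(hidden_parts(YAGO_URI) == hidden_parts(DBPEDIA_URI))  # True
```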

Page 14: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Problems and Root Causes

[Diagram, box labels only]

• Ontology-driven applications break

• Maintenance Issues

• Performance Issues

• URIs Overloaded (especially w/ UIDs)

Page 15: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution


URI Overloading

http://wordnet.princeton.edu/~agraves/wordnet/0.9/

1. Owning / Controlling organization

2. File directory structure

3. Human readable names (ontology and terms)

4. Version number

5. Unique Identifier

6. Web location (URL)
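The six roles listed above can be read off the example URI mechanically. This is only a sketch: the mapping from path positions to roles is an assumption that fits this one URI, not a general rule.

```python
from urllib.parse import urlsplit

uri = "http://wordnet.princeton.edu/~agraves/wordnet/0.9/"
parts = urlsplit(uri)
segments = [s for s in parts.path.split("/") if s]  # ['~agraves', 'wordnet', '0.9']

# Positional indexing below is specific to this URI (an assumption).
roles = {
    "owning organization": parts.netloc,
    "directory structure": "/".join(segments[:-1]),
    "human-readable name": segments[1],
    "version number": segments[-1],
    "unique identifier": uri,
    "web location": uri,
}

for role, value in roles.items():
    print(f"{role}: {value}")
```

A change to any one of these concerns (a new host, a reorganized directory, a new version) forces a change to the identifier itself, which is the overloading problem in miniature.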

Contributed to the SKOS problem. If URIs were only UIDs:

• Non-transitive broader: create a new resource with a new UID

• Transitive broader: change the human-readable term name to broaderTransitive, same UID

• Voilà!
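The SKOS thought experiment can be sketched with a toy registry in which the identifier carries no name at all. `TermRegistry`, its methods, and the `uid-N` format are hypothetical; no such registry is described in the talk.

```python
import itertools


class TermRegistry:
    """Toy registry in which opaque UIDs are separate from term names."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.terms = {}  # uid -> {"name": ..., "semantics": ...}

    def mint(self, name, semantics):
        """Create a new resource with a fresh UID."""
        uid = f"uid-{next(self._ids)}"
        self.terms[uid] = {"name": name, "semantics": semantics}
        return uid

    def rename(self, uid, new_name):
        """Change only the human-readable name; UID and semantics stay."""
        self.terms[uid]["name"] = new_name


reg = TermRegistry()
old_uid = reg.mint("broader", "transitive")      # the original term
reg.rename(old_uid, "broaderTransitive")         # same UID, new label
new_uid = reg.mint("broader", "non-transitive")  # new resource, new UID

print(old_uid, reg.terms[old_uid])
print(new_uid, reg.terms[new_uid])
```

Data that referred to the old UID keeps its transitive meaning, while the friendly name "broader" is free to point at the new, non-transitive resource.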

Page 16: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution


Problems and Root Causes

[Diagram, box labels only]

• Ontology-driven applications break

• Maintenance Issues

• Performance Issues

• Overuse of owl:sameAs

• Proliferation of URIs

• URIs Overloaded (especially w/ UIDs)

• Poor change mgmt. infrastructure

Page 17: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution


Change Management Infrastructure

• Inadequate to non-existent

    • Stopgap: annotation properties for versioning

    • Technologies immature

    • Purposely delayed by W3C

HENCE: no versioning guidelines

Page 18: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Problems and Root Causes

[Diagram, box labels only]

• Ontology-driven applications break

• Maintenance Issues

• Performance Issues

• Overuse of owl:sameAs

• Proliferation of URIs

• URIs Overloaded (especially w/ UIDs)

• Poor change mgmt. infrastructure

• No versioning guidelines

• Change semantics of URIs

• Semantic infidelity

• Overloading owl:sameAs

Page 19: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution


What can be done?

1. Imagine a future:

• Change management and versioning is solved.

• Specify exactly WHAT that would mean (Don’t worry about HOW)

• Ontology-driven applications are the norm.

2. Build guidelines that will work in this future.

Page 20: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

Change Management & Versioning Solved

• Unique IDs are separated from URLs and all the rest.

• Automatic tracking and detection of dependencies

• Automatic minting of new UIDs when semantics changes

    • Don’t change name if semantics is the same

    • Don’t change semantics if name is the same
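The two "don't" rules above can be encoded literally as a release check. This is a sketch, not a proposal from the talk: `check_release` and the data layout are hypothetical, and comparing semantics by string equality is a stand-in for the genuinely hard part, which is detecting a semantic change automatically.

```python
def check_release(old, new):
    """Compare two versions and report violations of the two rules.

    old, new: dicts mapping uid -> (name, semantics).
    Semantics are compared as plain strings here (a placeholder assumption).
    """
    problems = []
    for uid, (old_name, old_sem) in old.items():
        if uid not in new:
            continue  # dropped or deprecated terms are out of scope here
        new_name, new_sem = new[uid]
        if new_sem != old_sem:
            problems.append(f"{uid}: semantics changed; mint a new UID instead")
        elif new_name != old_name:
            problems.append(f"{uid}: name changed but semantics did not")
    return problems


v1 = {
    "uid-1": ("broader", "transitive"),
    "uid-2": ("prefLabel", "preferred label"),
}
v2 = {
    "uid-1": ("broader", "non-transitive"),          # same name, new meaning
    "uid-2": ("preferredLabel", "preferred label"),  # new name, same meaning
}

for problem in check_release(v1, v2):
    print(problem)
```

A tool like this could run before every release, turning the guidelines into something that is enforced rather than merely recommended.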

Page 21: Avoiding a Semantic Web Roadblock: URI Management and Ontology Evolution

SUMMARY: Problems and Root Causes

[Diagram, box labels only]

• Proliferation of URIs

• Overuse of owl:sameAs

• Ontology-driven applications break

• Maintenance Issues

• Performance Issues

• Poor change mgmt. infrastructure

• URIs Overloaded (especially w/ UIDs)

• Change semantics of URIs

• No versioning guidelines

• Overloading owl:sameAs

• Semantic infidelity