New York State Center of Excellence in Bioinformatics & Life Sciences

VUB Leerstoel 2009-2010
Theme: Ontology for Ontologies, theory and applications

Inaugural Oration: The quest for semantic interoperability

May 17, 2010; 16h30-19h00
Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels
Room D2.01

Prof. Werner CEUSTERS, MD
Ontology Research Group, Center of Excellence in Bioinformatics and Life Sciences, and Department of Psychiatry, University at Buffalo, NY, USA

Buffalo, NYC, Chicago

Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY

Short personal history

1959 - 2010

Timeline: 1977, 1989, 1992, 1993, 1995, 1998, 2002, 2004, 2006

Short personal history

1959 - 2030?

Timeline: 1977, 1989, 1992, 1993, 1995, 1998, 2002, 2004, 2006

A trajectory of mixes and mingles …

(Diagram of fields: Biology, Translational Research, Defense & Intelligence, Pharmacology, Pharmacogenomics, Performing Arts, Linguistics, Computational Linguistics, Medical Natural Language Understanding, Informatics, Medicine, Knowledge Representation, Electronic Health Records, Referent Tracking, Philosophy, Ontology, Realism-Based Ontology)

… provides the context for this lecture series (1)

• May 17: the quest for semantic interoperability
– what is it?
– what are the building blocks?
– why do only few systems exhibit it?
– Take home message:
  • good ontologies are badly needed

(Fields: Informatics, Knowledge Representation, Electronic Health Records, Philosophy, Ontology)

… provides the context for this lecture series (2)

• May 18: the need for realism-based ontology development
– What ontology should be
  • philosophical realism, applied to …
  • … ‘knowledge representation’
– Generic/specific distinction
  • relation with Referent Tracking
– Target audience:
  • ontology developers and evaluators
  • philosophers who want a real job
  • technology scouts
– Take home message:
  • good ontology = realism-based ontology

(Fields: Referent Tracking, Philosophy, Ontology, Realism-Based Ontology)

… provides the context for this lecture series (3)

• May 19: ontologies in healthcare and the vision of personalized medicine
– An ontologist’s view on data and information models
– Open Biomedical Ontologies Foundry
– Example ontologies for eHealth

(Fields: Biology, Translational Research, Pharmacology, Pharmacogenomics, Medicine, Electronic Health Records, Referent Tracking, Realism-Based Ontology)

… provides the context for this lecture series (4)

• May 20: ontologies and Natural Language Understanding
• Target audience:
– computational linguists
– semantic engineers

(Fields: Linguistics, Computational Linguistics, Medical Natural Language Understanding, Informatics, Medicine, Electronic Health Records, Realism-Based Ontology)

… provides the context for this lecture series (5)

• May 21: Referent Tracking: why Big Brother was just a little baby.
– theory of Referent Tracking:
  • give a unique identifier to everything
– implementation of RT systems
– application in situational awareness (in the broadest sense)
– Target audience:
  • everybody who wants to survive after 2012

(Fields: Defense & Intelligence, Performing Arts, Referent Tracking, Realism-Based Ontology)

Semantic Interoperability

Interoperability of Information Systems

The capacity of distinct information systems to exchange ‘stuff’

From ‘Wargames’

Gradations in interoperability

• Level 0: no interoperability at all
• Level 1: technical and syntactical interoperability (no semantic interoperability)
• Level 2: two orthogonal levels of partial semantic interoperability
– Level 2a: unidirectional semantic interoperability
– Level 2b: bidirectional semantic interoperability of meaningful fragments
• Level 3: full semantic interoperability, sharable context, seamless co-operability

Semantic Interoperability for Better Health and Safer Healthcare. SemanticHEALTH Report. January 2009

One often used definition

Semantic Interoperability (SI) = the ability of two or more computer systems to exchange information in such a way that the meaning of that information can be automatically interpreted by the receiving system accurately enough to produce useful results to the end users of both systems.

‘Full interoperability’

• ‘Neither language nor technological differences prevent the system to seamlessly integrate the received information into the local record and provide a complete picture of someone’s health as if it would have been collected locally.’

• ‘Further, the anonymized data feeds directly into the tools of public health authorities and researchers.’

Stroetmann et al. Semantic Interoperability for Better Health and Safer Healthcare. SemanticHEALTH report. Jan 2009

What this practically means …

(Domains shown: Healthcare; Finance; Intelligence and Command & Control; Digital collections and IP rights; Enterprise & supply chain management)

Biggest SI endeavor: the Semantic Web

The standard web: end users are humans

The Semantic Web: end-users are maximally assisted by agents

Where is a web … is usually a spider

The core issue:

Semantic Interoperability (SI) = the ability of two or more computer systems to exchange information in such a way that the meaning of that information can be automatically interpreted by the receiving system accurately enough to produce useful results to the end users of both systems.

Highlighted: meaning

And meaning is, of course, the problem

• ‘I know that you believe that you understood what you think I said, but I am not sure you realize that what you heard is not what I meant.’

– Robert McCloskey, State Department spokesman (attributed).

• http://www.quotationspage.com/quotes/Robert_McCloskey/

Goal of the Semantic Web

• to make it possible for software to find the data it needs on the Web, understand it, cross-reference it and apply it to a particular task.

• “I should be able to tell my Web-enabled handheld device to schedule an appointment with a dentist within 20 miles of home and let the computer do the rest.”

“I should be able to tell my Web-enabled handheld device to schedule an appointment with a dentist within 20 miles of home and let the computer do the rest.”

• So the SW must understand natural language?
• So the SW must know when the requester is free?
• So the SW must understand that it is to take care of the requester’s teeth, and not to have a nice dinner date?
• So the SW can then deduce what the actual length of “20 miles” is for this particular person?
• So the SW must understand where the requester lives?

If it were just that simple

Pray your computer isn’t Irish

X: “Hallo stranger, you appear to be travelling?”
Y: “Yes, I always travel when on a journey.”

X: “And pray, what might your name be?”
Y: “It might be Sam Patch, but it isn't.”

X: “Have you been long in these parts?”
Y: “Never longer than at present - 5 feet 9.”

X: “Do you get anything new?”
Y: “Yes, I bought a new whetstone this morning.”

Copyright © 1996 Electronic Historical Publications

The linguistic perspective (1)

(Diagram labels: characters, lexemes, words, syntax, semantics, word categories, pragmatics, discourse, morphology, phrases, sentences, prose)

The linguistic perspective (2)

• Words:
– ‘in’, ‘hepatitis’, ‘the’, ‘virus’, ‘sit’, ‘bank’, ‘river’, ‘money’
• We combine them in phrases and sentences:
– ‘hepatitis virus’, ‘virus hepatitis’
– ‘money in the bank’, ‘bank in the river’
• We combine sentences:
– ‘First I removed the skin from the fish. Then I fried it. It was delicious.’
• We know what (not) to use under which circumstances:
– ‘girl’ – ‘chick’, ‘man’ – ‘guy’ – ‘dude’, …

Building the Semantic Web requires this too

But it seems that only dummies are involved

• there must be a lot of dummies, or
• don’t they still get it?
• a lot does seem to mean nothing at all

The clever (?) business man and his XML card

<business-card>

<name> John Nitwit </name>

<address>

<street> 524 Moon base avenue </street>

<city> Utopia </city>

</address>

<phone> … </phone>

</business-card>
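One way to see why the XML card by itself may not buy much: to a parser, the tag names are only labels. The sketch below is an illustration added here, not part of the talk. It parses the card from the slide, replaces every tag with an opaque code (`x1` … `x6`, invented for this example), and checks that the machine-visible structure is unchanged. The helper names `shape` and `rename_tags` are hypothetical, and the elided phone number is kept as "…".

```python
import xml.etree.ElementTree as ET

# The card from the slide; the phone number stays elided, as in the original.
CARD = """<business-card>
  <name> John Nitwit </name>
  <address>
    <street> 524 Moon base avenue </street>
    <city> Utopia </city>
  </address>
  <phone> … </phone>
</business-card>"""

# Hypothetical opaque codes: they mean as much to a machine as the real tags.
OPAQUE = {
    "business-card": "x1", "name": "x2", "address": "x3",
    "street": "x4", "city": "x5", "phone": "x6",
}


def shape(element):
    """Return the nested text structure of an element, ignoring tag names."""
    return [((child.text or "").strip(), shape(child)) for child in element]


def rename_tags(element, mapping):
    """Rename every tag in place according to mapping."""
    element.tag = mapping[element.tag]
    for child in element:
        rename_tags(child, mapping)


original = ET.fromstring(CARD)
renamed = ET.fromstring(CARD)
rename_tags(renamed, OPAQUE)

# Same structure, same strings: nothing the parser relies on has changed.
same_structure = shape(original) == shape(renamed)
```

Whatever makes `<city>` mean "city" lives in the head of the human reader, or in an agreement outside the document; the markup alone does not carry it.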

Is anything gained this way?

Eric Miller. Weaving Meaning: The Semantic Web. 2002. www.w3.org/Talks/2002/10/16-sw/

Are mails like this one surprising?

At 10:13 PM 3/22/2010, you wrote:

Dear Prof Smith,

just a quick email to express my sincerest gratitude - the learning materials you made available are being of enormous value to me. After a PhD in the Semantic Web area at [a well known knowledge management institute], I came out so disgusted with the general lack of scientific & philosophical grounding in the community around me, that I felt I totally lost sight of my research path.

But your systematic and thorough presentation of the field is helping me see where I stand, without all the usual technical buzzwords and marketing pitches. At the same time, this gives me hope of finding more solid grounds for my future research.

The heart of the evil …

What it was …

… and became

What is (an) Ontology?

Without buzzwords and marketing pitches, but with adequate philosophical thinking

“What is … ?” questions are problematic

• How would you answer the following questions:
– what is a human being?
– what is JFK?
– what is yellow?
– what is a unicorn?
– what is a drug?

What do the following juxtapositions reveal?

what is a human being? | what does ‘human being’ mean?
what is JFK? | what does ‘JFK’ mean?
what is yellow? | what does ‘yellow’ mean?
what is a unicorn? | what does ‘unicorn’ mean?
what is a drug? | what does ‘drug’ mean?

What do the following juxtapositions reveal?

Ontology | Terminology
what is a human being? | what does ‘human being’ mean?
what is JFK? | what does ‘JFK’ mean?
what is yellow? | what does ‘yellow’ mean?
what is a unicorn? | what does ‘unicorn’ mean?
what is a drug? | what does ‘drug’ mean?

The Ontology-Terminology divide

• Ontology is about what things are.

• Terminology is about how to name things, without caring about whether what is named exists.

• Sadly, many people who call themselves ‘ontologists’ or build ‘ontologies’ either do not understand this distinction at all, or apply it in the wrong way.

Terminological versus Ontological approach

• The terminologist defines:
– ‘a clinical drug is a pharmaceutical product given to (or taken by) a patient with a therapeutic or diagnostic intent’. (RxNorm)
• The (good, real) ontologist thinks:
– Does ‘given’ include ‘prescribed’?
– Is ‘manufactured with the intent to …’ not sufficient?
  • Are newly marketed products – available in the pharmacy, but not yet prescribed – not clinical drugs?
  • Are products stolen from a pharmacy not clinical drugs?
  • What about such products taken by persons that are not patients?
    – e.g. children mistaking tablets for candies.

This dichotomy is also present in simple words

Carl Austin Weiss, MD (Dec 6, 1906 – Sept 8, 1935)

Huey Pierce Long, Jr. (Aug 30, 1893 – Sept 10, 1935)

Solving crimes through Referent Tracking

A double mystery

• (It is argued that) On September 9th, 1935, Carl Austin Weiss shot Senator Huey Long in the Louisiana State Capitol with a .35 calibre pistol. Long died from this wound thirty hours later on September 10th. Weiss, on the other hand, received between thirty-two and sixty .44 and .45 calibre hollow point bullets from Long's agitated bodyguards and died immediately.

Sorensen, R., 1985, "Self-Deception and Scattered Events", Mind, 94: 64-69.

• Questions:
– Did Weiss kill Senator Long?
– If so, when did he kill him?

The events on a time line

(Tracks along the time axis: Senator Long’s living; Weiss’ shooting of Long; Carl Weiss’ living; Bodyguards’ shooting of Weiss; Weiss’ pathological body reactions; Long’s pathological body reactions)

When did the killing happen?

(Tracks along the time axis: Senator Long’s living; Weiss’ shooting of Long; Carl Weiss’ living; Bodyguards’ shooting of Weiss; Weiss’ pathological body reactions; Long’s pathological body reactions)

Candidate times: t1? t2?

If at t1: Long was not dead after he was killed

If at t2: Long was killed by a dead person
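The two objections can be checked mechanically once the events are put on a time axis. The following is a minimal sketch added for illustration, not something from the talk. Hours are counted from the shot, "died immediately" is modelled as 0.1 hours (an assumed figure), and the names `Life`, `objections`, `T1` and `T2` are invented here, with t1 taken as the moment of the shot and t2 as the moment of Long's death.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Life:
    """A person's life, ending at death_hour (hours after the shot)."""
    name: str
    death_hour: float

    def alive_at(self, hour: float) -> bool:
        return hour < self.death_hour


# Assumed numbers: the shot is hour 0; "immediately" is modelled as 0.1 h.
weiss = Life("Weiss", death_hour=0.1)
long_ = Life("Long", death_hour=30.0)   # "thirty hours later"

T1 = 0.0    # candidate: the moment of the shot
T2 = 30.0   # candidate: the moment Long dies


def objections(kill_hour: float) -> list:
    """List what goes wrong if 'the killing' is placed at kill_hour."""
    found = []
    if long_.alive_at(kill_hour + 1.0):   # an hour later the victim still lives
        found.append("Long was not dead after he was killed")
    if not weiss.alive_at(kill_hour):     # the killer is already dead
        found.append("Long was killed by a dead person")
    return found
```

Under these assumptions `objections(T1)` yields only the first objection and `objections(T2)` only the second, so neither single instant fits the word ‘kill’ without complaint.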

What this demonstrates

• What things are and how things are named are two different issues,
• (Natural) language does not fit nicely with reality,
– formed at a time when insight in reality was crippled,
– did not evolve with our insight,
• Human brains have the capacity not to be bothered too much by the unfaithfulness of natural language.

Ambiguous phrasings

warning on plastic bag

Hotel semantics

in Miami hotel lobby

in A’dam hotel elevator

Good philosophers lack this capacity

Computers lack this ability too, but that is a problem

• Knowledge representation and semantic interoperability are for machines, not humans;

• Computer languages and knowledge representations must at least be unambiguous, and preferably also faithful to (our best understanding of) reality.

• Unfortunately, the majority of them are neither, precisely because of the confusion between terminology and ontology.

The terminology of ‘ontology’:

Google ‘define: ontology’:
• the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects;
• an explicit representation of the meaning of terms in a vocabulary, and their relationships;
• a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them;
• specification of a conceptualisation of a knowledge domain;
• a structured information model of a domain capable of supporting reasoning by human users and software agents;
• a data model that represents a set of concepts within a domain and the relationships between those concepts;
• …

One term, many definitions

This raises some (philosophical?) questions:

1. Is it possible for a term to have so many meanings?

2. Can the authors of these definitions all be right at the same time?

3. Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ?

Q1: Is it possible for a term to have so many meanings?

• Clearly: yes!
• This phenomenon is called: Homonymy
• and is usually explained in terms of the semantic or semiotic or meaning triangle.

Standard Semiotic/Semantic Triangle

Useful, but nevertheless wrong!

Useful to build multi-lingual dictionaries

Concept ‘cat’: cat, chat, kat, Katze, …

Problem: several interpretations of the Semiotic/Semantic triangle

Sign: Language / Term / Symbol

Referent: Reality / Object

Reference: Concept / Sense / Model / View / Partition

Aristotle’s triadic meaning model

(Triangle labels: semeia; gramma / phoné; pragma; pathema)

Words spoken are signs or symbols (symbola) of affections or impressions (pathemata) of the soul (psyche); written words (graphomena) are the signs of words spoken (phoné). As writing (grammata), so also is speech not the same for all races of men. But the mental affections themselves, of which these words are primarily signs (semeia), are the same for the whole of mankind, as are also the objects (pragmata) of which those affections are representations or likenesses, images, copies (homoiomata).

Aristotle, 'On Interpretation', 1.16.a.4-9, translated by Cooke & Tredennick, Loeb Classical Library, William Heinemann, London, UK, 1938.

Richards’ semantic triangle

• Reference (“concept”): “indicates the realm of memory where recollections of past experiences and contexts occur”.
• Hence: as with Aristotle, the reference is “mind-related”: thought.
• But: not “the same for all”, rather individual mind-related

(Triangle labels: symbol, referent, reference; my understanding, your understanding)

Don’t confuse with homonymy!

“mole”:
R1: mole (animal)
R2: mole (unit)
R3: mole (skin lesion)
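The difference between homonymy (one term, several referents) and the multilingual dictionary case (several terms, one concept) can be written down as a small lookup table. This is a sketch added for illustration only. The function names and the label "cat (animal)" are assumptions made for this example; the entries themselves come from the ‘mole’ and ‘cat’ slides.

```python
from collections import defaultdict

# term -> set of referents the term can stand for (R1, R2, R3 on the slide)
referents_of = defaultdict(set)
for term, referent in [
    ("mole", "mole (animal)"),
    ("mole", "mole (unit)"),
    ("mole", "mole (skin lesion)"),
    ("cat", "cat (animal)"),     # "cat (animal)" is a label chosen here
    ("chat", "cat (animal)"),
    ("kat", "cat (animal)"),
    ("Katze", "cat (animal)"),
]:
    referents_of[term].add(referent)


def is_homonym(term: str) -> bool:
    """A term is used homonymously if it stands for more than one referent."""
    return len(referents_of.get(term, ())) > 1


def terms_for(referent: str) -> set:
    """All terms, in any language, that stand for the given referent."""
    return {t for t, refs in referents_of.items() if referent in refs}
```

Here `is_homonym("mole")` holds while `is_homonym("cat")` does not, and `terms_for("cat (animal)")` collects the four dictionary entries. Two people having different thoughts about the same referent would add nothing to this table, which is why that situation should not be confused with homonymy.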

Different thoughts vs. Homonymy

“mole”:
R1: mole “animal”
R2: mole “unit”
R3: mole “skin lesion”

(Triangle labels: symbol, referent; one concept; understanding of x, understanding of y)

And by the way, synonymy...

the Aristotelian view: “perspiration”, “sweat”

Richards’ view: “sweat”, “perspiration”

Frege’s view

• “sense” is an objective feature of how words are used and not a thought or concept in somebody’s head

• 2 names with the same referent can have different senses
– morning star
– evening star

• 2 names with the same sense have the same referent (synonyms)

• a name with a sense does not need to have a referent (“Beethoven’s 10th symphony”)

referent

sense

name
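Frege's three claims can be mirrored in two lookup tables, name to sense and sense to referent. The sketch below is an added illustration under stated assumptions. The wording of each sense and the referent "Venus" are supplied here and do not come from the slides, and the sweat / perspiration pair is borrowed from the synonymy slide.

```python
from typing import Dict, Optional

# name -> sense (how the name is used); the sense wordings are assumptions.
SENSE_OF: Dict[str, str] = {
    "morning star": "the bright star seen at dawn",
    "evening star": "the bright star seen at dusk",
    "sweat": "fluid secreted by the sweat glands",
    "perspiration": "fluid secreted by the sweat glands",
    "Beethoven's 10th symphony": "the tenth symphony composed by Beethoven",
}

# sense -> referent; None marks a sense to which nothing corresponds.
REFERENT_OF: Dict[str, Optional[str]] = {
    "the bright star seen at dawn": "Venus",
    "the bright star seen at dusk": "Venus",
    "fluid secreted by the sweat glands": "sweat (the fluid)",
    "the tenth symphony composed by Beethoven": None,
}


def referent(name: str) -> Optional[str]:
    """Follow name -> sense -> referent."""
    return REFERENT_OF[SENSE_OF[name]]


def same_sense(a: str, b: str) -> bool:
    """Two names are synonyms in Frege's terms if they share a sense."""
    return SENSE_OF[a] == SENSE_OF[b]
```

Because the referent is reached only through the sense, names with the same sense cannot end up with different referents, while names with different senses may still meet at the same referent.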

Homonymous use of the term ‘ontology’

• the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects;

• an explicit representation of the meaning of terms in a vocabulary, and their relationships;

• a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them;

• specification of a conceptualisation of a knowledge domain;

• a structured information model of a domain capable of supporting reasoning by human users and software agents;

• a data model that represents a set of concepts within a domain and the relationships between those concepts;

• …

Q2: Can the authors of these definitions all be right at the same time?

• Yes, if we are dealing with a case of homonymy.

• But in that case, they are all talking about distinct things.

Q3: Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply?

(Diagram: is ‘that’ thing at once a study, a representation, a vocabulary, a specification, an information model and a data model?)

(hint on next slide)

Remember the “what is yellow?” question

• Answers could have been:
– a color
– a banana
• Thus:
– can something which is a color be a banana?
– can something which is a banana be a color?

Q3: Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply?

Not for all! Only for some.

Homonymous use of the term ‘ontology’: at least one clear cut distinction

• the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects;
• an explicit representation of the meaning of terms in a vocabulary, and their relationships;
• a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them;
• specification of a conceptualisation of a knowledge domain;
• a structured information model of a domain capable of supporting reasoning by human users and software agents;
• a data model that represents a set of concepts within a domain and the relationships between those concepts;
• …

‘Ontology’ as the study of what exists

• Key questions:
– What exists?
– How do things that exist relate to each other?
• Some hypotheses:
– An external reality, time, space
– Ideas, concepts
– Particulars, universals, objects, processes
– God
• Ontologists from distinct ‘schools’ differ in opinion about the existence of some of the above:
– Realism, nominalism, conceptualism, monism, …

An ontology as a representation

Key question: of what?

• Terms: WordNet, MedDRA, RxNorm
• Concepts: the majority of ‘ontologies’
  But … overwhelming lack of clarity about what ‘concepts’ are:
  • meaning shared in common by synonymous terms?
  • idea shared in common in the minds of those who use these terms?
  • unit of knowledge describing meanings?
  • feature or property or characteristic shared in common by entities in the world?
• Universals: Realism-based ontology

Most ontologies are ‘concept’-based

• But what the word ‘concept’ denotes is never clarified, and users of it often refer to different entities in a haphazard way:
– meaning: shared in common by synonymous terms
– idea: shared in common in the minds of those who use these terms
– unit of knowledge: describing meanings
– universal: that which is shared by all and only all entities in reality of a similar sort

Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, Biomedical Ontology in Action, November 8, 2006, Baltimore MD, USA


Concepts in ISO?

• A unit of thought constituted through abstraction on the basis of properties common to a set of objects. (ISO 1087:1990)
  – Object: anything perceivable or conceivable. Objects may also be material (e.g. an engine, a sheet of paper, a diamond), immaterial (e.g. a conversion ratio, a project plan) or imagined (e.g. a unicorn). [Adapted from ISO 1087-1:2000, 3.1.1]
• A unit of knowledge created by a unique combination of characteristics. [ISO 1087-1:2000, 3.2.1]
  – Characteristic: abstraction of a property of an object or of a set of objects. Characteristics are used for describing concepts. [ISO 1087-1:2000, 3.2.4]

What knowledge is there to have about unicorns?


Most terminologies are ‘concept’-based

• But what the word ‘concept’ denotes is never clarified, and users of it often refer to different entities in a haphazard way:
  – meaning shared in common by synonymous terms
  – idea shared in common in the minds of those who use these terms
  – unit of knowledge describing meanings
  – universal: that which is shared by all and only the entities in reality of a similar sort
• The first three views require the involvement of a cognitive entity.
• The last view (universals) does not presuppose cognition at all.


Therefore: a multi-disciplinary approach to ontology

• In philosophy:
  – Ontology (no plural) is the study of what entities exist and how they relate to each other;
• In mainstream computer science and biomedical informatics:
  – An ontology (plural: ontologies) is a shared and agreed upon conceptualization of a domain;
• Our ‘realist’ view within the Ontology Research Group combines the two:
  – We use realism, a specific theory of ontology, as the basis for building high quality ontologies, using reality as benchmark.


Realism-based Ontology

• Accepts the existence of:
  – a real world outside mind and language,
  – a structure in that world prior to mind and language (universals / particulars).
• Rejects ontology as a matter of agreement on ‘conceptualizations’.
• Uses reality as a benchmark for testing the quality of ontologies as artifacts by building appropriate logics with referential semantics (rather than model-theoretic).


A realism-based ontology is …

• a representation of some pre-existing domain of reality which:
  – (1) reflects the properties of the entities within its domain in such a way that there obtains a systematic correlation between reality and the representation itself,
  – (2) is intelligible to a domain expert,
  – (3) is formalized in a way that allows it to support automatic information processing.


Compare with Alberti’s grid

[Figure: Alberti’s grid, with the labels ‘reality’, ‘representation’ and ‘ontological theory’]


BFO Top-Level Ontology (partial)

[Diagram: partial BFO class hierarchy. Node labels: Continuant; Occurrent; Independent Continuant; Dependent Continuant; SDC; GDC; Spatial Region; Temporal Region; Process; Functioning; Quality; Realizable; Role; Function; Disposition; Information Content Entity. Annotation on one node: “always dependent on one or more independent continuants”.]


Rise and fall of the concept-based approach


No serious scholar should work with ‘concepts’


Slow penetration of the idea …


More serious scholars become convinced …

What is a concept description a description of?


Eugen Wüster

• 1935
• Professor of Woodworking Machinery in the Vienna Agricultural College
• Terminology-hobbyist
• founder of ISO-TC 37: terminology standards


Eugen Wüster’s psychological view of concepts

• concepts are inside people’s brains
  – a concept is a mental surrogate of a plurality of objects grouped together on the basis of perceived similarities
  – what makes those objects similar is itself a concept
• object = def. anything to which human thought is or can be directed, whether material or immaterial, real or purely imagined
• ISO: ‘In the course of producing a terminology, philosophical discussions on whether an object actually exists in reality … are to be avoided’.


Concept-based approaches are top-down

• FIRST concepts (meanings, words, terms)
• THEN (if you’re lucky) real-world phenomena
• Reasons:
  – Wüsterianism and the ISO terminology standards
  – needs of programmers (and of third-party payers)
  – hold-overs from the era of electronic dictionaries

Smith B, Ceusters W, Temmerman R. Wüsteria. In: Engelbrecht R. et al. (eds.) Medical Informatics Europe, IOS Press, Amsterdam, 2005:647-652


Typical reasoning patterns for Wüsterians

• If domain experts use some term
  – then there must be a concept,
  – whether or not there is some referent.
• If observations reveal the existence of ‘objects’ which are of a similar kind,
  – then, even if we don’t know yet what that kind is,
  – there must be an associated concept.


Observations and similarities

Are these pictures of concepts or of horses?

Is this a sensible question: ‘What concepts have tails and do …?’


Observations and similarities

Are these pictures of concepts?

Are these pictures of anything at all?

If concepts are in brains, those must be awfully big brains!


Concepts = confusions!

• Use/mention confusions:
  – Brussels is a nice city and has eight letters.
  – Disentangled: Brussels is a nice city, and Brussels’ name is ‘Brussels’, and ‘Brussels’ has eight letters.
• Kantian confusions:
  – what exists is what we believe exists
  – horses exist because we have the concept of horse and we see in reality things that fit that concept.
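The use/mention distinction in the Brussels example can be made concrete in a few lines of code. The following is only a sketch: the `City` class and its fields are hypothetical names chosen for illustration, and the point is merely that a statement about the city and a statement about the city's name are statements about two different things.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class City:
    """Hypothetical stand-in for the city itself (the thing that is 'used')."""
    name: str       # the term by which the city is called (the thing 'mentioned')
    is_nice: bool   # a property of the city, not of its name


brussels = City(name="Brussels", is_nice=True)

# A statement about the city: the term is used to refer to the entity.
city_is_nice = brussels.is_nice

# A statement about the word: the term itself is mentioned.
letters_in_name = len(brussels.name)

# Asking the city (rather than its name) for a number of letters makes no sense.
try:
    len(brussels)
    city_has_letters = True
except TypeError:
    city_has_letters = False

print(city_is_nice, letters_in_name, city_has_letters)
```

Keeping the entity and its name as separate things is what prevents the sentence ‘Brussels is a nice city and has eight letters’ from being representable at all.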


And confusion is thus everywhere in terminologies, classifications and ‘ontologies’

• SNOMED:
  – ‘Disorders are concepts in which there is an explicit or implicit pathological process causing a state of disease which tends to exist for a significant length of time under ordinary circumstances.’
  – And also: “Concepts are unique units of thought”.
  – Thus: Disorders are unique units of thought in which there is a pathological process …???
  – And thus: to eradicate all diseases in the world at once we simply should stop thinking?


SNOMED International (1995, V3.1)

• T Topography 12,385
• M Morphology 4,991
• F Function 16,352
• L Living Organisms 24,265
• C Drugs & Biological Products 14,075
• A Physical Agents, Forces and Activities 1,355
• D Disease/ Diagnosis 28,623
• P Procedures 27,033
• S Social Context 433
• J Occupations 1,886
• G General Modifiers 1,176
• TOTAL RECORDS 132,641 ?
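One thing that can be checked mechanically is whether the axis counts listed on this slide add up to the stated total. The sketch below simply sums the figures as they appear here. Whether the question mark after the total refers to this arithmetic is an assumption, as the slide itself does not say.

```python
# Axis counts exactly as listed on the slide (SNOMED International 1995, V3.1).
axis_counts = {
    "T Topography": 12_385,
    "M Morphology": 4_991,
    "F Function": 16_352,
    "L Living Organisms": 24_265,
    "C Drugs & Biological Products": 14_075,
    "A Physical Agents, Forces and Activities": 1_355,
    "D Disease/ Diagnosis": 28_623,
    "P Procedures": 27_033,
    "S Social Context": 433,
    "J Occupations": 1_886,
    "G General Modifiers": 1_176,
}

stated_total = 132_641                        # 'TOTAL RECORDS' as given on the slide
computed_total = sum(axis_counts.values())    # sum of the eleven listed axes
difference = stated_total - computed_total

print(f"computed: {computed_total:,}  stated: {stated_total:,}  difference: {difference}")
```

With the figures as transcribed, the eleven axes sum to 132,574, which is 67 fewer than the stated total.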


Diagnosis versus disease

The disease is here

The diagnosis is here


Border’s classification (of medicine?)

• Medicine
  – Mental health
  – Internal medicine
    • Endocrinology
      – Oversized endocrinology
    • Gastro-enterology
    • ...
  – Pediatrics
  – ...
  – Oversized medicine

(‘Oversized’ refers to the size of the books that do not fit on a normal Border’s Bookshop shelf.)


MeSH: Geographic Locations

• Africa [Z01.058] +
• Americas [Z01.107] +
• Antarctic Regions [Z01.158]
• Arctic Regions [Z01.208]
• Asia [Z01.252] +
• Atlantic Islands [Z01.295] +
• Australia [Z01.338] +
• Cities [Z01.433] +
• Europe [Z01.542] +
• Historical Geographic Locations [Z01.586] +
• Indian Ocean Islands [Z01.600] +
• Oceania [Z01.678] +
• Oceans and Seas [Z01.756] +
• Pacific Islands [Z01.782] +

• mereological mess
• mixture of geographic entities with socio-political entities
• mixture of space and time


MeSH: Geographic Locations [Z01]

• Africa [Z01.058] +
• Americas [Z01.107] +
• Antarctic Regions [Z01.158]
• Arctic Regions [Z01.208]
• Asia [Z01.252] +
• Atlantic Islands [Z01.295] +
• Australia [Z01.338] +
• Cities [Z01.433] +
• Europe [Z01.542] +
• Historical Geographic Locations [Z01.586] +
  – Ancient Lands [Z01.586.035] +
  – Austria-Hungary [Z01.586.117]
  – Commonwealth of Independent States [Z01.586.200] +
  – Czechoslovakia [Z01.586.250] +
  – European Union [Z01.586.300]
  – Germany [Z01.586.315] +
  – Korea [Z01.586.407]
  – Middle East [Z01.586.500] +
  – New Guinea [Z01.586.650]
  – Ottoman Empire [Z01.586.687]
  – Prussia [Z01.586.725]
  – Russia (Pre-1917) [Z01.586.800]
  – USSR [Z01.586.950] +
  – Yugoslavia [Z01.586.980] +
• Indian Ocean Islands [Z01.600] +
• Oceania [Z01.678] +
• Oceans and Seas [Z01.756] +
• Pacific Islands [Z01.782] +
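The dotted tree numbers in these listings can be read as paths: dropping the last segment of a tree number gives the tree number of its parent. The sketch below uses only codes and labels that appear on the slides. The helper names `parent_of` and `ancestors_of` are made up for this illustration and are not part of any MeSH tooling.

```python
from typing import List, Optional

# Labels copied from the slides; only the entries needed for the example.
LABELS = {
    "Z01": "Geographic Locations",
    "Z01.542": "Europe",
    "Z01.586": "Historical Geographic Locations",
    "Z01.586.300": "European Union",
    "Z01.586.725": "Prussia",
}


def parent_of(tree_number: str) -> Optional[str]:
    """Drop the last dotted segment; a top-level number has no parent."""
    head, dot, _tail = tree_number.rpartition(".")
    return head if dot else None


def ancestors_of(tree_number: str) -> List[str]:
    """All ancestors of a tree number, nearest first."""
    chain = []
    parent = parent_of(tree_number)
    while parent is not None:
        chain.append(parent)
        parent = parent_of(parent)
    return chain


for code in ("Z01.586.300", "Z01.586.725"):
    path = " -> ".join(LABELS[a] for a in ancestors_of(code))
    print(f"{LABELS[code]} -> {path}")
```

Read this way, ‘European Union’ sits under ‘Historical Geographic Locations’ next to ‘Prussia’ and not under ‘Europe’, which is one instance of the mixture of geographic, socio-political and temporal entities noted on the previous slide.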


Diabetes Mellitus in MeSH 2008

A different set of more specific terms appears when a different path from the top is taken.


MeSH: some paths from top to Wolfram Syndrome

[Diagram: several paths through the MeSH hierarchy from ‘All MeSH Categories’ down to ‘Wolfram Syndrome’. Node labels: All MeSH Categories; Diseases Category; Nervous System Diseases; Cranial Nerve Diseases; Optic Nerve Diseases; Optic Atrophy; Optic Atrophies, Hereditary; Neurodegenerative Diseases; Heredodegenerative Disorders, Nervous System; Eye Diseases; Eye Diseases, Hereditary; Male Urogenital Diseases; Urologic Diseases; Kidney Diseases; Diabetes Insipidus; Female Urogenital Diseases and Pregnancy Complications; Female Urogenital Diseases; Wolfram Syndrome.]
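To make the idea of ‘several paths from the top’ tangible, the sketch below enumerates all upward paths from a term in a hierarchy where a term may have more than one parent. The node names are those on the slide, but the edges in `PARENTS` are an assumed reading of the diagram, since the transcript does not preserve which node is linked to which. Treat them as illustrative only, not as MeSH data.

```python
from typing import Dict, List

# ASSUMPTION: child -> parents edges guessed from the node labels on the slide.
PARENTS: Dict[str, List[str]] = {
    "Wolfram Syndrome": ["Optic Atrophies, Hereditary", "Diabetes Insipidus"],
    "Optic Atrophies, Hereditary": [
        "Optic Atrophy",
        "Heredodegenerative Disorders, Nervous System",
        "Eye Diseases, Hereditary",
    ],
    "Optic Atrophy": ["Optic Nerve Diseases"],
    "Optic Nerve Diseases": ["Cranial Nerve Diseases", "Eye Diseases"],
    "Cranial Nerve Diseases": ["Nervous System Diseases"],
    "Heredodegenerative Disorders, Nervous System": ["Neurodegenerative Diseases"],
    "Neurodegenerative Diseases": ["Nervous System Diseases"],
    "Eye Diseases, Hereditary": ["Eye Diseases"],
    "Diabetes Insipidus": ["Kidney Diseases"],
    "Kidney Diseases": ["Urologic Diseases"],
    "Urologic Diseases": ["Male Urogenital Diseases", "Female Urogenital Diseases"],
    "Female Urogenital Diseases": [
        "Female Urogenital Diseases and Pregnancy Complications"
    ],
    "Female Urogenital Diseases and Pregnancy Complications": ["Diseases Category"],
    "Male Urogenital Diseases": ["Diseases Category"],
    "Nervous System Diseases": ["Diseases Category"],
    "Eye Diseases": ["Diseases Category"],
    "Diseases Category": ["All MeSH Categories"],
}


def paths_to_top(term: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Every path from `term` upwards to a term that has no parent."""
    ups = parents.get(term, [])
    if not ups:
        return [[term]]
    return [[term] + rest for up in ups for rest in paths_to_top(up, parents)]


paths = paths_to_top("Wolfram Syndrome", PARENTS)
ancestors = {t for path in paths for t in path[1:]}

for path in paths:
    print(" < ".join(path))
```

Under these assumed edges, both ‘Male Urogenital Diseases’ and ‘Female Urogenital Diseases’ turn up among the ancestors of ‘Wolfram Syndrome’, which is the kind of result that becomes troubling once the hierarchy is applied to an individual patient.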


What would it mean if used in the context of a patient?

[Diagram: the same MeSH paths to ‘Wolfram Syndrome’ as on the previous slide, annotated with two ‘has’ links and ‘???’.]


Description logics are no guarantee of getting parthood right

SNOMED-RT (2000)

SNOMED-CT (2003)


Mistakes due to inappropriate lexical mapping?


Find the problem

concept

terms


SNOMED CT (July 2007): “fractured nasal bones”


SNOMED-CT: abundance of false synonymy

nose

bones

fracture


Coding / Classification confusion

A patient with a fractured nasal bone
= A patient with a broken nose
= A patient with a fracture of the nose



UMLS: Metathesaurus: merging terminologies

Cycles in hierarchical relationships
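How merging can introduce cycles is easy to show on a toy case. In the sketch below, two hypothetical source terminologies are each acyclic, but they disagree about which of two terms is the broader one, so the merged hierarchy contains a cycle. The term names and the `merge` and `find_cycle` helpers are invented for this illustration and do not reflect how the UMLS Metathesaurus is actually built or checked.

```python
from typing import Dict, List, Optional

Hierarchy = Dict[str, List[str]]  # term -> list of broader (parent) terms


def merge(*sources: Hierarchy) -> Hierarchy:
    """Union of the 'broader than' links of several source terminologies."""
    merged: Hierarchy = {}
    for source in sources:
        for term, broader in source.items():
            merged.setdefault(term, []).extend(broader)
    return merged


def find_cycle(hierarchy: Hierarchy) -> Optional[List[str]]:
    """Depth-first search; returns one cycle as a list of terms, or None."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {}
    stack: List[str] = []

    def visit(term: str) -> Optional[List[str]]:
        state[term] = IN_PROGRESS
        stack.append(term)
        for broader in hierarchy.get(term, ()):
            seen = state.get(broader, UNSEEN)
            if seen == IN_PROGRESS:  # back edge: we returned to an open term
                return stack[stack.index(broader):] + [broader]
            if seen == UNSEEN:
                found = visit(broader)
                if found:
                    return found
        stack.pop()
        state[term] = DONE
        return None

    for term in list(hierarchy):
        if state.get(term, UNSEEN) == UNSEEN:
            found = visit(term)
            if found:
                return found
    return None


# Hypothetical sources that disagree on the direction of one link.
source_a: Hierarchy = {"term_x": ["term_y"]}
source_b: Hierarchy = {"term_y": ["term_x"]}

cycle = find_cycle(merge(source_a, source_b))
print(find_cycle(source_a), find_cycle(source_b), cycle)
```

Each source passes the check on its own; only the merged result fails it, which is why such cycles tend to go unnoticed by the maintainers of the individual terminologies.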


Conclusion

• Concept-based terminology (and standardisation thereof) is there as a mechanism to improve understanding of messages by humans.
• It is NOT the right device
  – to explain why reality is what it is, how it is organised, etc. (although it is needed to allow communication),
  – to reason about reality,
  – to make machines understand what is real,
  – to integrate across different views, languages, conceptualisations, ...


Why not?

Because there is no valid benchmark!


Why not?

• Conceptualism does not take care of the structure of reality.
• Concepts do not necessarily correspond to something that exists, existed or will exist
  – Sorcerer, unicorn, leprechaun, ...
• Definitions set the conditions under which terms may be used, and may not be abused as conditions an entity must satisfy to be what it is (Kantian constructivism).
• Language can make strings of words look as if they were terms
  – “Middle lobe of left lung”


Today’s biggest problem: a confusion between “terminology” and “ontology”

• The conditions to be agreed upon for when to use a certain term to denote an entity are often different from the conditions which make an entity what it is.
  – Trees would still be different from rabbits if there were no humans to agree on what these things should be called.
• “ontos” means “being”. The link with reality tends to be forgotten: one concentrates on the models instead of on the reality.


What to do about it? (1)

• Research:
  – Revision of the appropriateness of concept-based terminology for specific purposes;
  – Relationship between models and that part of reality that the models want to represent;
  – Adequacy of current tools and languages for representation;
  – Boundaries between terminology and ontology and the place of each in semantic interoperability.


What to do about it? (2)

• Training and awareness
  – Make people more critical with respect to the promises of terminology and ontology
    • What is needed must be based on needs, not on the popularity of a new paradigm
    • But in a system, it’s not just your own needs, it is each component’s needs!
  – Towards “an ontology of ontologies”
    • First description
    • Then quality criteria


Ontology based on Unqualified Realism

• Accepts the existence of
  – a real world outside mind and language
  – a structure in that world prior to mind and language (universals / particulars)
• Rejects nominalism, conceptualism, ontology as a matter of agreement on ‘conceptualizations’
• Uses reality as a benchmark for testing the quality of ontologies as artifacts by building appropriate logics with referential semantics (rather than model-theoretic)


How does that work?

Come and see tomorrow


Did you get us tickets for tomorrow?

Sure, for the train out of here. Boo, this was awful!!