1
VT
2
Medical Ontology
Barry Smith
http://ifomis.de
3
IFOMIS
Institute for Formal Ontology and Medical Information Science
Faculty of Medicine
University of Leipzig
4
Reference Ontology
An ontology is a theory of a domain of entities in the world
Ontology is outside the computer
seeks maximal expressiveness and adequacy to reality
and sacrifices computational tractability for the sake of representational adequacy
5
Reference Ontology
a theory of the tertium quid
– called reality –
needed to hand-calibrate database/terminology systems
6
Methodology
Get ontology right first
(realism; descriptive adequacy; rather powerful logic);
solve tractability problems later
7
The Reference Ontology Community
IFOMIS (Leipzig)
Laboratories for Applied Ontology (Trento/Rome, Turin)
Foundational Ontology Project (Leeds)
Ontology Works (Baltimore)
Ontek Corporation (Buffalo/Leeds)
Language and Computing (L&C) (Belgium/Philadelphia)
8
Domains of Current Work
IFOMIS Leipzig: Medicine, Bioinformatics
Laboratories for Applied Ontology
Trento/Rome: Ontology of Cognition/Language
Turin: Law
Foundational Ontology Project: Space, Physics
Ontology Works: Genetics, Molecular Biology
Ontek Corporation: Biological Systematics
Language and Computing: Natural Language Understanding
9
Ontology as a Branch of Philosophy
the science of the kinds and structures of objects, qualities, processes, events, functions and relations in all domains of reality
10
Aristotle
The first ontologist
11
A biological ontology
12
Linné
1763: Genera Morborum
(nosology, or ontology of the kinds of disease)
13
Q: Why “ontology” in medical informatics?
A: The Tower of Babel problem of information systems
14
Tower of Babel
Every information system is based on its own terminology
How can we resolve the incompatibilities that arise when data from different sources are combined?
Cf. How can we integrate anatomy and physiology?
15
How do medical students solve this problem?
Often only through the encounter with the patient
The patient and the processes taking place within the patient serve as a crystallization point for a meaningful ordering of (learned) facts that would otherwise stand in isolation.
(Knowing-that becomes knowing-how)
16
The computer lacks practical knowledge
How can isolated data artifacts in medical information systems be integrated into consistent and usable knowledge?
17
The original dream of ontology in computer science
A single all-encompassing taxonomy of all kinds of objects, serving as the central integrating category system for all information systems.
This dream is over ...
18
Current Solutions
Standardized terminologies
UMLS
SNOMED
ICD-10
Gene Ontology
Digital Anatomist
etc.
19
Standardized terminologies
are meant to ease access to the biomedical literature and to factual databases
For example, to make it possible to find connections between specific genes and specific bodily reactions
This is meant to enable a new kind of medical research
20
Database and terminology standardization
is desperately needed in medical informatics and bioinformatics
to enable the huge amounts of existing data to be fused together automatically
21
To reap the benefits of standardization
we need to make ONE SYSTEM out of many different terminologies
But how?
Through government edict? (Scandinavia)
Through efforts of international standards bodies (ISO, CEN …)?
Through UMLS Metathesaurus?
22
Central hub
UMLS
Unified Medical Language System
National Library of Medicine
Bethesda, MD
23
UMLS Metathesaurus
a huge combination of different machine-readable source terminologies
800,000 concepts
10 million relations
24
Examples of source terminologies
SNOMED-RT
Systematized Nomenclature of Medicine
MeSH
Medical Subject Headings
25
is_a trees
hormone
  peptide hormone
    adrenocorticotropin
    glycopeptide hormone
      follicle-stimulating hormone
  digestive hormone
26
is_a = is a / is a kind of
Diabetes Mellitus is_a Disease
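The is_a relation on these slides can be read as a set of links from a term to its parent. The following is a minimal Python sketch under two assumptions: that is_a is treated as transitive, and that the nesting of the hormone terms is the one suggested by the slide layout. The names `PARENT`, `ancestors` and `is_a` are illustrative and do not come from any terminology system.

```python
# Direct is_a links (child -> parent). The nesting of the hormone terms is an
# assumption read off the slide layout, not a claim about a source terminology.
PARENT = {
    "peptide hormone": "hormone",
    "digestive hormone": "hormone",
    "adrenocorticotropin": "peptide hormone",
    "glycopeptide hormone": "peptide hormone",
    "follicle-stimulating hormone": "glycopeptide hormone",
    "Diabetes Mellitus": "Disease",
}


def ancestors(term):
    """Return every term reachable from `term` by following is_a links upward."""
    found = []
    while term in PARENT:
        term = PARENT[term]
        found.append(term)
    return found


def is_a(child, parent):
    """True if `child` is_a `parent`, directly or by transitivity."""
    return parent in ancestors(child)


print(ancestors("follicle-stimulating hormone"))
print(is_a("Diabetes Mellitus", "Disease"))
print(is_a("digestive hormone", "peptide hormone"))
```

A single-parent dictionary is the simplest possible choice here; a terminology that allows several parents per term would need a mapping from each term to a set of parents.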
27
Bad Coding
deriving from over-simplification
and from failure to pay attention to ontological principles
E.g. SNOMED
both_testes is_a testis
(the pair of testes is classified as a testis)
28
Terminological Incompatibilities
29
Representation of Blood in SNOMED
Blood is_a Tissue
30
Representation of Blood in MeSH
Blood is_a Bodily Fluid
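The disagreement between the two representations of Blood can be found mechanically once both terminologies are available as is_a assertions. The sketch below assumes each terminology is reduced to a child-to-parent mapping containing only the two statements quoted on the slides; `conflicting_parents` is a hypothetical helper name. It only flags candidates for review, since differing parents amount to a real incompatibility only if the parent terms exclude one another.

```python
# Direct is_a assertions per terminology, limited to the two statements quoted
# on the slides. The real terminologies are far larger.
SNOMED = {"Blood": "Tissue"}
MESH = {"Blood": "Bodily Fluid"}


def conflicting_parents(first, second):
    """Map each shared term to its two parents when the terminologies disagree."""
    return {
        term: (first[term], second[term])
        for term in first.keys() & second.keys()
        if first[term] != second[term]
    }


print(conflicting_parents(SNOMED, MESH))
```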
31
Bad Coding
Incompatibilities
Context-Dependence
Standardized Terminologies must be used properly
32
People are lazy and idiosyncratic
They make typing errors
Everyone maintains their own terminology, which differs more or less from that of other actors
They use different natural-language representations of the same medical phenomena
33
The codes are not formulated on the basis of clear principles
Therefore inconsistent
Unintuitive
Difficult to train people to use them
Application often depends on context-dependent knowledge
34
The IFOMIS Contribution
help to improve standardizations through constructive criticism based on robust ontological principles
35
UMLS Metathesaurus
a huge combination of different machine-readable source terminologies
UMLS Semantic Network
consisting of 134 Semantic Types
is meant to bring order to this jumble
36
UMLS Semantic Network
entity
  physical entity
  conceptual entity
event
37
conceptual entity
Organism Attribute
Finding
Idea or Concept
Occupation or Discipline
Organization
Group
Group Attribute
Intellectual Product
Language
39
Idea or Concept
  Functional Concept
  Qualitative Concept
  Quantitative Concept
  Spatial Concept
    Body Location or Region
    Body Space or Junction
    Geographic Area
    Molecular Sequence
      Amino Acid Sequence
      Carbohydrate Sequence
      Nucleotide Sequence
40
INNSBRUCK
is an Idea or Concept
42
Confusion of Ontology and Epistemology
Physical Entity
  Chemical Entity
    Chemical Viewed Structurally
    Chemical Viewed Functionally
43
Confusion of Ontology and Epistemology
the hydraulic equation:
BP = CO*PVR
arterial blood pressure is directly proportional to the product of blood flow (cardiac output, CO) and peripheral vascular resistance (PVR).
Cardiac Output in UMLS = A Finding
44
UMLS-Semantic Types:
blood pressure is an Organism Function,
cardiac output is a Laboratory or Test Result or Diagnostic Procedure
BP = CO*PVR thus asserts that
blood pressure is proportional either to a laboratory or test result or to a diagnostic procedure
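The mismatch can be made concrete by checking the semantic types of the terms that enter the equation. The sketch below assumes the types quoted on the slides and groups three of them as "epistemic", meaning they record how something is observed and not what it is; that grouping, the helper names, and the numeric values are assumptions made for illustration.

```python
# Semantic types as quoted on the slides. Cardiac output is given several
# candidate types there, so a set is used. The type of PVR is not stated.
SEMANTIC_TYPES = {
    "blood pressure": {"Organism Function"},
    "cardiac output": {"Finding", "Laboratory or Test Result", "Diagnostic Procedure"},
}

# Assumption for this sketch: these types record how something is observed,
# not what it is.
EPISTEMIC_TYPES = {"Finding", "Laboratory or Test Result", "Diagnostic Procedure"}


def epistemically_typed(terms):
    """Return the terms whose semantic types are all epistemic ones."""
    return [t for t in terms if SEMANTIC_TYPES[t] <= EPISTEMIC_TYPES]


def blood_pressure(cardiac_output, resistance):
    """BP = CO * PVR; with CO in L/min and PVR in mmHg*min/L the result is mmHg."""
    return cardiac_output * resistance


print(epistemically_typed(["blood pressure", "cardiac output"]))
print(blood_pressure(5.0, 18.0))  # illustrative values only
```

The equation itself is unproblematic as arithmetic over physical quantities; the check shows that the trouble lies in typing one of its operands as a result or procedure.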
45
The goal
Formulate clear principles for building ontologies
Reconstitute the UMLS Semantic Types on the basis of these principles
46
Collaboration with the National Library of Medicine
Revision of the UMLS Semantic Types and of the Gene Ontology
47
GO: the Gene Ontology
3 large telephone directories of standardized designations for gene functions and products
organized into hierarchies via is_a and part_of
48
GO
can in practice be used only by trained biologists (with know-how)
whether a GO-term truly stands in the is_a relation depends e.g. on the type of organism involved
glycosome is part-of cytoplasm only for Kinetoplastidae
Computers have no counterpart of such context-dependent know-how
49
GO divided into three disjoint term hierarchies
the cellular component ontology,
e.g. flagellum, chromosome, cell
the molecular function ontology,
e.g. ice nucleation, binding, protein stabilization
the biological process ontology,
e.g. glycolysis, death
50
Definition of Molecular Function
“the action characteristic of a gene product.”
In March 2003 all nodes in the Molecular Function ontology (except the root) had ‘activity’ added to their names
-- confusion of function with functioning
(how to deal with dormant/suppressed functions?)
51
Definition of Biological Process
“A phenomenon marked by changes that lead to a particular result, mediated by one or more gene products”
52
How are the 3 ontologies related?
Function = “the action characteristic of a gene product.”
Process = “phenomenon marked by changes that lead to a particular result, mediated by one or more gene products”
No part-whole relations across ontologies?
53
The GO isa relation
in its intended meaning indicates a necessary relationship.
That is, when we say “eukaryotic cell isa cell”, we mean that every eukaryotic cell is a cell.
Confusion of necessarily, universally, and permanently
(No time in GO)
54
part_of
The Relation part-of: The intended meaning of part-of as explained in the GO Usage Guide is: “can be a part of, not is always a part of”
55
Uses of part_of
– membrane part-of cell, intended to mean “a membrane is a part-of any cell”
– flagellum part-of cell, intended to mean “a flagellum is part-of some cells”
– replication fork part-of cell cycle, intended to mean: “a replication fork is part-of the nucleoplasm only during certain times of the cell cycle”
– regulation of sleep part-of sleep, should be corrected to: “regulation of sleep is co-located with and is causally involved with the sleep process”.
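The first two uses differ only in quantification: "every cell has a membrane" versus "some cells have a flagellum". A minimal Python sketch of how the two readings could be told apart at the level of instances follows. The instance data and the names `TYPE_OF`, `PART_OF` and `reading` are hypothetical, and the sketch ignores time altogether.

```python
# Hypothetical instance data: two cells, each with a membrane, one with a flagellum.
TYPE_OF = {
    "cell1": "cell",
    "cell2": "cell",
    "m1": "membrane",
    "m2": "membrane",
    "f1": "flagellum",
}
PART_OF = {("m1", "cell1"), ("m2", "cell2"), ("f1", "cell1")}


def reading(part_type, whole_type):
    """Say whether every, some, or no instance of whole_type has a part of part_type."""
    wholes = {i for i, t in TYPE_OF.items() if t == whole_type}
    covered = {
        whole
        for part, whole in PART_OF
        if TYPE_OF[part] == part_type and TYPE_OF[whole] == whole_type
    }
    if wholes and covered == wholes:
        return "every"
    return "some" if covered else "none"


print(reading("membrane", "cell"))
print(reading("flagellum", "cell"))
```

A single unquantified part_of link between terms cannot record which of these readings is meant, which is the ambiguity the slide points to.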
56
Need to find ways to deal with time in medical informatics
Functions vs. Realizations of Functions
Function is still there even when not being realized
need to be clear about the distinction between continuants and occurrents
57
SNAP and SPAN
58
SNAP and SPAN
Substances and processes
Continuants and occurrents
In preparing an inventory of reality
we keep track of these two different categories of entities in two different ways
59
Substances and processes exist in time in different ways
[Figure: a substance and a process, each drawn against a time axis]
60
Need for different perspectives
Not one ontology, but a multiplicity of complementary ontologies
Cf. Quantum mechanics: particle vs. wave ontologies
61
SNAPshot ontology
Video (SPAN) ontology
[Figure: a substance and a process, each drawn against a time axis]
62
SNAP and SPAN
stocks and flows
commodities and services
product and process
anatomy and physiology
synchrony and diachrony
63
SNAP and SPAN
SNAP entities
- have continuous existence in time
- preserve their identity through change
- exist in toto if they exist at all
SPAN entities
- have temporal parts
- unfold themselves phase by phase
- exist only in their phases/stages
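The contrast can be sketched as two data types: one for entities that are wholly present at each instant of their existence, and one for entities that consist of temporal phases. This is a minimal Python illustration, not an implementation of any published ontology; the class names, fields, and time values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Continuant:
    """SNAP entity: present as a whole at every instant at which it exists."""
    name: str
    begins: float
    ends: float

    def exists_at(self, t: float) -> bool:
        return self.begins <= t <= self.ends


@dataclass(frozen=True)
class Occurrent:
    """SPAN entity: made up of temporal parts (phases) that unfold in order."""
    name: str
    phases: tuple  # each phase is (label, start, end), covering start <= t < end

    def phase_at(self, t: float):
        for label, start, end in self.phases:
            if start <= t < end:
                return label
        return None


# Hypothetical example data; the time values are arbitrary units.
heart = Continuant("heart", begins=0.0, ends=80.0)
disease = Occurrent(
    "course of disease",
    phases=(("onset", 40.0, 41.0), ("acute phase", 41.0, 43.0), ("recovery", 43.0, 45.0)),
)

print(heart.exists_at(42.0))
print(disease.phase_at(42.0))
print(disease.phase_at(50.0))
```

Asking a continuant "do you exist at t?" returns the whole entity or nothing, while asking an occurrent about t returns only the phase under way at that time.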
64
SNAP: Entities existing in toto at a time
65
Three kinds of SNAP entities
• Substances
• Dependent SNAP entities (qualities, functions, roles, powers …)
• Spatial regions, Contexts, Niches
66
Functions
The function of the heart is to pump blood
68
SNAP
Fiat part of substance
Extremity (hand, arm)
Bodily System
69
SPAN: Entities extended in time
SPAN: Entity extended in time
Portion of Spacetime
Fiat part of process *: first phase of a clinical trial
Spacetime worm of 3 + T dimensions: occupied by life of organism
Temporal interval *: projection of organism’s life onto temporal dimension
Aggregate of processes *: clinical trial
Process [±Relational]: circulation of blood, secretion of hormones, course of disease, life
Processual Entity [exists in space and time, unfolds in time phase by phase]
Temporal boundary of process *: onset of disease, death
71
SPAN: Entities extended in time
Functioning
The heart’s pumping of blood
72
Granularity
spatial region substance
parts of substances are always substances
73
Granularity
spatial region substance
parts of spatial regions are always spatial regions
74
Granularity
process
parts of processes are always processes
75
MORAL
Relations crossing the SNAP/SPAN border are never part-relations
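This rule lends itself to a simple consistency check on proposed part-relations. The sketch below assumes every term has already been assigned to SNAP or SPAN; the term names are taken from the slides where possible, and the assumption that the hand is John's own is made only for the example. Being on the same side is treated as a necessary condition for parthood, not a sufficient one.

```python
# Category assignments follow the slides; that the hand is John's own hand is
# an assumption made for the example.
CATEGORY = {
    "John": "SNAP",
    "John's hand": "SNAP",
    "John's life": "SPAN",
    "clinical trial": "SPAN",
    "first phase of a clinical trial": "SPAN",
}


def may_be_part_of(part, whole):
    """A part-relation is admissible only if both terms are on the same side."""
    return CATEGORY[part] == CATEGORY[whole]


print(may_be_part_of("John's hand", "John"))
print(may_be_part_of("first phase of a clinical trial", "clinical trial"))
print(may_be_part_of("John's life", "John"))
```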
76
Relations crossing the SNAP/SPAN border are never part-relations
John’s life
substance John
physiological processes sustaining in existence
77
DIGESTIVE SYSTEM
78
RESPIRATORY SYSTEM
79
URINARY SYSTEM
80
IMMUNE SYSTEM
81
CIRCULATORY SYSTEM
82
CIRCULATORY SYSTEM (Principal Organs)
83
The autonomous part of the nervous system (regulatory links to other systems)
84
ENDOCRINE SYSTEM
85
Bodily Systems are Component Parts of Bodies
respiratory, digestive, skeletal, circulatory, musculatory, immune
87
A system for keeping your jewels safe
88
Bodily Systems interconnect
89
Systems are SNAP entities
They are dependent continuants
We can take photographs of them
90
The Monarchic System of Government