Towards an Ontological Treatment of Disease and Diagnosis
Barry Smith
New York State Center of Excellence in Bioinformatics and Life Sciences
University at Buffalo
http://ontology.buffalo.edu/smith
Anders Grimsmo, “Patients, diagnoses and processes in general practice in the Nordic countries. An attempt to make data from computerised medical records available for comparable statistics”, Scandinavian Journal of Primary Health Care, 2001
“The major obstacle to extracting more epidemiological data from computerised medical records is caused by information in the databases not being uniquely linked to episodes of care.”
What is to be linked with what?
What is information in the databases about?
To answer this question (to assign numbers to discrete entities), we need a good ontology of the care domain, including episodes of care on the one hand and entities on the side of the patient on the other.
and we need to take account of context
– of multiple diseases
– of the patient’s style of life
– of the patient’s environment
– of specific aspects of the presentation
we do this by paying attention to natural language
but the more we succeed in this, the more difficult it is to aggregate the data
disease of UMLSitis
OBO Foundry
with acknowledgements to NLM: 1R21LM009824-01A1
Buffalo Longitudinal Cancer Data
Even with the best of intentions, and even if we just use one coding system, results are not always what they seem
Problem of SNOMEDitis
Why does SNOMED change so much?
SNOMED CT: Anaplasma marginale (organism)
infectious agent is_a navigational concept
with acknowledgements to Werner Ceusters NLM: 1R21LM009824-01A1
Why does SNOMED change so much?
• Problems with ‘concept’: no real coherence as to what SNOMED is representing
Why does SNOMED change so much?
• No proper hierarchy (of more and less general)
• Confusion of disorders (continuants) with etiological and diagnostic processes (occurrents) and of both with information entities (‘findings’)
• Confusion of ‘disorders’ with ‘morphological abnormalities’
SNOMED CT
128477000 Abscess (disorder)
44132006 Abscess (morphologic abnormality)
Epistemology and Combinatorial Explosion
• Epistaxis/nosebleed
– Epistaxis (disorder)
– Nosebleed/epistaxis symptom (finding)
– On examination - epistaxis (disorder)
– Has nosebleeds - epistaxis (disorder)
– Evidence of recent epistaxis (finding)
from Bill Hogan
Epistemology and Combinatorial Explosion
• Rash
– Cutaneous eruption (morphologic abnormality), with synonym Rash
– Eruption of skin (disorder), with synonym Rash
– Complaining of a rash (finding)
– On examination - a rash (finding)
• Dry skin
– Dry skin (finding)
– Complaining of dry skin (finding)
– On examination - dry skin (finding)
– Dry skin dermatitis (disorder)
from Bill Hogan
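One way to read the examples above: when the epistemic context (complaining of, on examination, evidence of) is baked into each term, the number of terms grows with the product of entities and contexts. The following is a minimal Python sketch of that counting argument. The split into two separate axes, and the names ENTITIES, CONTEXTS and Observation, are illustrative assumptions and not part of SNOMED CT or of the slides.

```python
from dataclasses import dataclass
from itertools import product

# Entities and contexts loosely taken from the examples above. Treating them
# as two independent axes is an assumption made only for illustration.
ENTITIES = ["epistaxis", "rash", "dry skin"]
CONTEXTS = ["present", "complaining of", "on examination", "evidence of recent"]


def precoordinated_terms(entities, contexts):
    """One term per (context, entity) pair, as when context is part of the term."""
    return [f"{context} - {entity}" for context, entity in product(contexts, entities)]


@dataclass(frozen=True)
class Observation:
    """Alternative: record the entity and the epistemic context in separate fields."""
    entity: str
    context: str


precoordinated = precoordinated_terms(ENTITIES, CONTEXTS)
separate_vocabulary = ENTITIES + CONTEXTS

# The combined vocabulary grows multiplicatively, the separated one additively.
print(len(precoordinated), len(separate_vocabulary))

# The same clinical statement, expressed with separated fields.
obs = Observation(entity="rash", context="on examination")
```

With three entities and four contexts the combined vocabulary needs twelve terms, while the separated one needs seven, and the gap widens as either list grows.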
An Alternative: Basic Formal Ontology
Ontology Timeline
360 BC: Aristotle’s Metaphysics
1879: Invention of modern logic (Boole, Frege)
1920: The problem of the Unity of Science (Logical Positivism)
1940: Birth of computing (Turing)
1970: AI, Robotics (J. McCarthy, P. Hayes)
1980: KIF: Knowledge Interchange Format
1990: Description Logics
2000: Semantic Web (OWL), Protégé
2007: National Center for Biomedical Ontology (NCBO) BioPortal
Uses of ‘ontology’ in PubMed abstracts
[Bar chart: Biomedical Ontology in PubMed, 2000 to 2009; vertical axis 0 to 1000. Values by year: 2000: 35, 2001: 37, 2002: 69, 2003: 143, 2004: 283, 2005: 412, 2006: 501, 2007: 618, 2008: 860, 2009: 900]
By far the most successful: GO (Gene Ontology)
Ontology Timeline
1990: Human Genome Project
1999: The Gene Ontology (GO) – Model Organism Research
2005: The Open Biomedical Ontologies (OBO) Foundry
2010: Ontology for General Medical Science
The GO is a controlled vocabulary for use in annotating data
multi-species, multi-disciplinary, open source
contributing to the cumulativity of scientific results obtained by distinct research communities
compare use of kilograms, meters, seconds … in formulating experimental results
NIH Mandates for Data Sharing
Organizations such as the NIH now require use of common standards in a way that will ensure that the results obtained through funded research are more easily accessible to external groups. ODR will be created in such a way that its use will address the new NIH mandates. It will also be designed to allow information presented in its terms to be usable in satisfying other regulatory purposes, such as submissions to FDA.
GO provides answers to three types of questions:
for each gene product (protein ...)
in what parts of the cell has it been identified? Cell Constituent Ontology
exercising what types of molecular functions? Molecular Function Ontology
with what types of biological processes? Biological Process Ontology
[Figure legend: part_of, subtype_of]
Gene Product Associations
$100 million invested in literature curation using GO
over 11 million annotations relating gene products described in the UniProt, Ensembl and other databases to terms in the GO
ontologies provide the basis for capturing biological theories in computable form
in contrast to terminologies and thesauri – which focus on socially diverse uses of language – the GO method focuses on commonly shared results of basic biological science
A new kind of biological research
based on analysis and comparison of the massive quantities of annotations linking ontology terms to raw data, including genomic data, clinical data, public health data
What took multiple groups of researchers months of data comparison effort 10 years ago can now be performed in milliseconds
The GO covers only generic (‘normal’) biological entities of three sorts:
– cellular components
– molecular functions
– biological processes
It does not provide representations of diseases, symptoms, genetic abnormalities …
How to extend the GO methodology to other domains of biology and medicine?
The Open Biomedical Ontologies (OBO) Foundry
Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows: GRANULARITY.
ORGAN AND ORGANISM
– Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
– Independent continuant: Cell (CL); Cellular Component (FMA, GO)
– Dependent continuant: Cellular Function (GO)
MOLECULE
– Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– Dependent continuant: Molecular Function (GO)
– Occurrent: Molecular Process (GO)
OBO Foundry ontologies
all follow the same principles to ensure interoperability
– GO Gene Ontology
– ChEBI Chemical Ontology
– PRO Protein Ontology
– CL Cell Ontology
– ...
– OGMS Ontology for General Medical Science
Basic Formal Ontology: GO at a high level
Basic Formal Ontology (BFO)
A simple top-level ontology to support information integration in scientific research
– No abstracta
– Nothing propositional
– Clear hierarchy
– No overlap with domain ontologies
– No confusion of ontology with epistemology
– No confusion of terms with what terms represent in reality
Basic Formal Ontology
– Continuant
  – Independent Continuant
  – Dependent Continuant
– Occurrent (Process, Event)
http://ifomis.uni-saarland.de/bfo/
BFO and the 3 Gene Ontologies (GO)
– Independent Continuant: Cell Component
– Dependent Continuant: Molecular Function
– Occurrent: Biological Process
Kumar A., Smith B, Borgelt C. Dependence relationships between Gene Ontology terms based on TIGR gene product annotations. CompuTerm 2004, 31-38.
Bada M, Hunter L. Enrichment of OBO Ontologies. J Biomed Inform. 2006 Jul 26
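The alignment of the three GO ontologies with the BFO categories can be sketched as two small lookup tables. This is only an illustrative Python encoding of the slide: the dictionary names BFO_PARENT and GO_TO_BFO and the helper functions are assumptions, not part of BFO, GO, or any ontology library.

```python
# Parent links for the fragment of BFO shown on the slides (None marks a top-level category).
BFO_PARENT = {
    "Continuant": None,
    "Occurrent": None,
    "Independent Continuant": "Continuant",
    "Dependent Continuant": "Continuant",
}

# Where each of the three GO ontologies sits under BFO, as stated on the slide.
GO_TO_BFO = {
    "Cell Component": "Independent Continuant",
    "Molecular Function": "Dependent Continuant",
    "Biological Process": "Occurrent",
}


def ancestors(category):
    """Return the BFO categories above `category`, nearest first."""
    chain = []
    parent = BFO_PARENT[category]
    while parent is not None:
        chain.append(parent)
        parent = BFO_PARENT[parent]
    return chain


def is_continuant(go_ontology):
    """True if the GO ontology is aligned with Continuant or one of its subtypes."""
    category = GO_TO_BFO[go_ontology]
    return category == "Continuant" or "Continuant" in ancestors(category)


for name in GO_TO_BFO:
    print(name, "->", GO_TO_BFO[name], "| continuant:", is_continuant(name))
```

Keeping the upper-level hierarchy and the domain-level alignment in separate tables mirrors the idea that each domain ontology extends a single shared upper level.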
Users of BFO
• NCI BiomedGT
• SNOMED CT
• Ontology for General Medical Science (OGMS)
• ACGT Clinical Genomics Trials on Cancer – Master Ontology / Formbuilder (Case Report Forms for Cancer Clinical Trials)
• MediCognos / Microsoft Healthvault
• Cleveland Clinic Semantic Database in Cardiothoracic Surgery
• Major Histocompatibility Complex (MHC) Ontology (NIAID)
• Neuroscience Information Framework Standard (NIFSTD) and Constituent Ontologies
• Interdisciplinary Prostate Ontology (IPO)
• Nanoparticle Ontology (NPO): Ontology for Cancer Nanotechnology Research
• Neural Electromagnetic Ontologies (NEMO)
• ChemAxiom – Ontology for Chemistry
• Ontology for Risks Against Patient Safety (RAPS/REMINE) (EU FP7)
• IDO Infectious Disease Ontology (NIAID)
Infectious Disease Ontology Consortium
• MITRE, Mount Sinai, UTSouthwestern – Influenza
• IMBB/VectorBase – Vector borne diseases (A. gambiae, A. aegypti, I. scapularis, C. pipiens, P. humanus)
• Colorado State University – Dengue Fever
• Duke University – Tuberculosis, Staph. aureus
• Case Western Reserve – Infective Endocarditis
• University of Michigan – Brucellosis
The OBO Foundry
• GO Gene Ontology
• CL Cell Ontology
• SO Sequence Ontology
• ChEBI Chemical Ontology
• PATO Phenotype (Quality) Ontology
• FMA Foundational Model of Anatomy
• ChEBI Chemical Entities of Biological Interest
• CARO Common Anatomy Reference Ontology
• PRO Protein Ontology
• Infectious Disease Ontology
• Plant Ontology
• Environment Ontology
• Ontology for Biomedical Investigations
• RNA Ontology
rationale of OBO Foundry coverage (homesteading principle)
Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows: GRANULARITY.
ORGAN AND ORGANISM
– Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– Occurrent: Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT
– Independent continuant: Cell (CL); Cellular Component (FMA, GO)
– Dependent continuant: Cellular Function (GO)
– Occurrent: Cellular Process (GO)
MOLECULE
– Independent continuant: Molecule (ChEBI, SO, RNAO, PRO)
– Dependent continuant: Molecular Function (GO)
– Occurrent: Molecular Process (GO)
OBO Foundry organized in terms of Basic Formal Ontology
Methodology of downward population
Each Foundry ontology can be seen as an extension of a single upper level ontology (BFO)
Example: The Cell Ontology
Ontology for General Medical Science
BFO-based ontology for clinical medicine
– Independent Continuant: Anatomical Component, Disorder
– Dependent Continuant: Disease, Bodily Quality
– Occurrent: Pathological Process, Clinical Encounter
Continuant
– Independent Continuant: .....
– Dependent Continuant: ....., Quality, Disposition
realization depends_on realizable
– Independent Continuant: bearer
– Dependent Continuant: disposition
– Occurrent: process of realization
this particular case of redness (of a particular fly eye) instantiates the universal red
an instance of eye (in a particular fly) instantiates the universal eye
this particular case of redness depends_on this instance of eye
the particular case of redness (of a particular fly eye) instantiates red
an instance of an eye (in a particular fly) instantiates eye
the particular case of redness depends on the instance of an eye
red is_a color
eye is_a anatomical structure
Phase transitions
portion of water
this portion of H2O
– instantiates portion of ice at t1
– instantiates portion of liquid water at t2
– instantiates portion of gas at t3
human
John (exists continuously)
– instantiates embryo at t1
– instantiates fetus at t2
– instantiates neonate at t3
– instantiates infant at t4
– instantiates child at t5
– instantiates adult at t6
in nature, no sharp boundaries here
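The idea that one particular exists continuously while instantiating different universals at different times can be sketched as a time-indexed lookup. This is a minimal Python illustration under stated assumptions: the class Particular and its methods are hypothetical names, the times t1 to t6 are stood in for by the placeholder numbers 1 to 6, and the sharp cut-offs between stages are a modelling convenience that the slide explicitly says does not exist in nature.

```python
from bisect import bisect_right


class Particular:
    """A continuant that persists while instantiating different universals over time."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind      # universal instantiated throughout, e.g. "human"
        self._starts = []     # increasing times at which a stage begins
        self._stages = []     # stage universals, parallel to _starts

    def begin_stage(self, time, stage):
        """Record that the particular starts instantiating `stage` at `time`."""
        if self._starts and time <= self._starts[-1]:
            raise ValueError("stages must be added in increasing time order")
        self._starts.append(time)
        self._stages.append(stage)

    def instantiates_at(self, time):
        """Return the stage instantiated at `time`, or None before the first stage."""
        i = bisect_right(self._starts, time) - 1
        return self._stages[i] if i >= 0 else None


# t1..t6 are represented by the placeholder numbers 1..6 (hypothetical values).
john = Particular("John", kind="human")
stages = ["embryo", "fetus", "neonate", "infant", "child", "adult"]
for t, stage in enumerate(stages, start=1):
    john.begin_stage(t, stage)

print(john.kind, john.instantiates_at(3))
```

The same structure would fit a quality such as a temperature, or a disease instance passing through successive stages: the particular stays the same while the universal it instantiates changes.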
temperature
John’s temperature (exists continuously)
– instantiates 37ºC at t1
– instantiates 37.1ºC at t2
– instantiates 37.2ºC at t3
– instantiates 37.3ºC at t4
– instantiates 37.4ºC at t5
– instantiates 37.5ºC at t6
in nature, no sharp boundaries here
in nature, no sharp boundaries here
coronary heart disease
John’s coronary heart disease (exists continuously)
asymptomatic (‘silent’) infarction
early lesions and small fibrous plaques
stable angina
surface disruption of plaque
unstable angina
instantiates at t1
instantiates at t2
instantiates at t3
instantiates at t4
instantiates at t5
time
OGMS
Ontology for General Medical Science
http://code.google.com/p/ogms/
OGMS: The Big Picture
Disposition (potentiality)
A disposition is a realizable entity which is such that, if it ceases to exist, then its bearer is physically changed, and whose realization occurs, in virtue of the bearer’s physical make-up, when this bearer is in some special physical circumstances
Disorder
independent continuant
– that is part of an organism
– that deviates from the canonical anatomy of the organism in a way that gives rise to pathological processes
Disorder
serves as the bearer of a disposition to pathological processes
A part of the body that typically gets larger over time
Disease course
• the totality of all disease processes through which a given disease instance is realized.
• multiple disease courses will be associated with the same disorder type, for example in reflection of the presence or absence of pharmaceutical or other interventions, of differences in environmental influence, and so forth.
The Big Picture
A disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes.
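etiological process produces disorder
disorder bears disposition
disposition realized_in pathological process
pathological process produces abnormal bodily features
abnormal bodily features recognized_as signs & symptoms
signs & symptoms used_in interpretive process
interpretive process produces diagnosis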
Definitions - Foundational Terms
Disorder =def. – A causally linked combination of physical components that is clinically abnormal.
Pathological Process =def. – A bodily process that is a manifestation of a disorder and is clinically abnormal.
Disease =def. – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism.
Influenza - infectious
• Etiological process - infection of airway epithelial cells with influenza virus
– produces
• Disorder - viable cells with influenza virus
– bears
• Disposition (disease) - flu
– realized_in
• Pathological process - acute inflammation
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - weakness, dizziness
• Signs - fever
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out influenza
– suggests
• Laboratory tests
– produces
• Test results - elevated serum antibody titers
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease flu
But the disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).
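The chain from etiological process to diagnosis can be sketched as a list of typed links, with a disease such as influenza supplied as one filling of the slots. This is a hedged Python illustration only: the names BIG_PICTURE, INFLUENZA, describe and chain_is_connected are hypothetical, the chain is the shortened core of the slide (the hypothesis and laboratory test steps are left out), and slots for which the slide gives no value fall back to the type name.

```python
# Core links of the chain as (subject type, relation, object type).
BIG_PICTURE = [
    ("etiological process", "produces", "disorder"),
    ("disorder", "bears", "disposition"),
    ("disposition", "realized_in", "pathological process"),
    ("pathological process", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "signs & symptoms"),
    ("signs & symptoms", "used_in", "interpretive process"),
    ("interpretive process", "produces", "diagnosis"),
]

# Slot values copied from the influenza slide; the dict layout is an assumption.
INFLUENZA = {
    "etiological process": "infection of airway epithelial cells with influenza virus",
    "disorder": "viable cells with influenza virus",
    "disposition": "flu",
    "pathological process": "acute inflammation",
    "signs & symptoms": "weakness, dizziness, fever",
    "diagnosis": "patient X has a disorder that bears the disease flu",
}


def chain_is_connected(links):
    """Check that each link starts where the previous one ended."""
    return all(a[2] == b[0] for a, b in zip(links, links[1:]))


def describe(case):
    """Render each link, substituting the case's value for a type where one is given."""
    return [
        f"{case.get(subject, subject)} --{relation}--> {case.get(obj, obj)}"
        for subject, relation, obj in BIG_PICTURE
    ]


for line in describe(INFLUENZA):
    print(line)
```

Separating the generic chain from the per-disease values means the other worked cases in this deck could be expressed as further dictionaries without changing the chain itself.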
Huntington’s Disease - genetic
• Etiological process - inheritance of >39 CAG repeats in the HTT gene
– produces
• Disorder - chromosome 4 with abnormal mHTT
– bears
• Disposition (disease) - Huntington’s disease
– realized_in
• Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - anxiety, depression
• Signs - difficulties in speaking and swallowing
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out Huntington’s
– suggests
• Laboratory tests
– produces
• Test results - molecular detection of the HTT gene with >39 CAG repeats
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease
HNPCC - genetic pre-disposition
• Etiological process - inheritance of a mutant mismatch repair gene
– produces
• Disorder - chromosome 3 with abnormal hMLH1
– bears
• Disposition (disease) - Lynch syndrome
– realized_in
• Pathological process - abnormal repair of DNA mismatches
– produces
• Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)
– bears
• Disposition (disease) - non-polyposis colon cancer
– realized_in
• Symptoms (including pain)
Dispositions and Predispositions
All diseases are dispositions; not all dispositions are diseases.
A predisposition is a disposition to acquire a disposition.
Predisposition to Disease of Type X =def. – A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing the disease X.
HNPCC is caused by a disorder (mutation) in a DNA mismatch repair gene that disposes to the acquisition of additional mutations from defective DNA repair processes, and thus is a predisposition to the development of colon cancer.
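The definition of a predisposition as a disposition to acquire a further disposition can be sketched with a small recursive data type. This is an illustrative Python encoding only: the class Disposition, its fields, and the two helper functions are hypothetical names, and the is_disease flag is simply one way to reflect that not all dispositions are diseases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Disposition:
    name: str
    realized_in: str                           # process or outcome in which it is realized
    acquires: Optional["Disposition"] = None   # further disposition its realization leads to
    is_disease: bool = False                   # not all dispositions are diseases


def is_predisposition(d):
    """A predisposition is a disposition to acquire a disposition."""
    return d.acquires is not None


def is_predisposition_to_disease(d):
    """True when the acquired disposition is itself a disease."""
    return d.acquires is not None and d.acquires.is_disease


# Values taken from the HNPCC example; the encoding itself is an assumption.
colon_cancer = Disposition(
    name="non-polyposis colon cancer",
    realized_in="symptoms (including pain)",
    is_disease=True,
)
hnpcc = Disposition(
    name="HNPCC (Lynch syndrome)",
    realized_in="abnormal repair of DNA mismatches",
    acquires=colon_cancer,
    is_disease=True,
)

print(is_predisposition_to_disease(hnpcc), is_predisposition(colon_cancer))
```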
Cirrhosis - environmental exposure
• Etiological process - phenobarbital-induced hepatic cell death
– produces
• Disorder - necrotic liver
– bears
• Disposition (disease) - cirrhosis
– realized_in
• Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - fatigue, anorexia
• Signs - jaundice, splenomegaly
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out cirrhosis
– suggests
• Laboratory tests
– produces
• Test results - elevated liver enzymes in serum
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease cirrhosis
Systemic arterial hypertension
• Etiological process – abnormal reabsorption of NaCl by the kidney
– produces
• Disorder – abnormally large scattered molecular aggregate of salt in the blood
– bears
• Disposition (disease) - hypertension
– realized_in
• Pathological process – exertion of abnormal pressure against arterial wall
– produces
• Abnormal bodily features
– recognized_as
• Symptoms -
• Signs – elevated blood pressure
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out hypertension
– suggests
• Laboratory tests
– produces
• Test results -
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease hypertension
Type 2 Diabetes Mellitus
• Etiological process –
– produces
• Disorder – abnormal pancreatic beta cells and abnormal muscle/fat cells
– bears
• Disposition (disease) – diabetes mellitus
– realized_in
• Pathological processes – diminished insulin production, diminished muscle/fat uptake of glucose
– produces
• Abnormal bodily features
– recognized_as
• Symptoms – polydipsia, polyuria, polyphagia, blurred vision
• Signs – elevated blood glucose and hemoglobin A1c
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out diabetes mellitus
– suggests
• Laboratory tests – fasting serum blood glucose, oral glucose challenge test, and/or blood hemoglobin A1c
– produces
• Test results -
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease type 2 diabetes mellitus
Type 1 hypersensitivity to penicillin
• Etiological process – sensitizing of mast cells and basophils during exposure to penicillin-class substance
– produces
• Disorder – mast cells and basophils with epitope-specific IgE bound to Fc epsilon receptor I
– bears
• Disposition (disease) – type I hypersensitivity
– realized_in
• Pathological process – type I hypersensitivity reaction
– produces
• Abnormal bodily features
– recognized_as
• Symptoms – pruritus, shortness of breath
• Signs – rash, urticaria, anaphylaxis
• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis -
– suggests
• Laboratory tests –
– produces
• Test results – occasionally, skin testing
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease type 1 hypersensitivity to penicillin
Next steps in OGMS
• classification of distinct types of disease courses for instances of each disease type
– in different typical environments
– with and without treatment
– with treatment plan that is or is not realized by the patient
– where the disease exists in combination with other diseases
Next steps in OGMS
• modify the Big Picture to take account of differences between primary care and specialist care
The Big Picture
Definitions - Clinical Evaluation Terms
Sign =def. – A bodily feature of a patient that is observed in a physical examination and is deemed by the clinician to be of clinical significance. (Objectively observable features)
Symptom =def. – An experienced bodily feature of a patient that is observed by and observable only by the patient and is of the type that can be hypothesized by a patient to be a realization of a disease. (A restricted family of phenomena including pain, nausea, anger, drowsiness, which are of their nature experienced in the first person)
Symptoms are subjective. But this does not mean that there is no objective fact of the matter whether a given symptom exists
Definition: Etiology
Etiological Process =def. – A process in an organism that leads to a subsequent disorder.
Examples: toxic chemical exposure resulting in a mutation in the genomic DNA of a cell; infection of a human with a pathogenic virus; inheritance of two defective copies of a metabolic gene
The etiological process creates the physical basis of that disposition to pathological processes which is the disease.
Definitions - Diagnosis
Clinical Picture =def. – A representation of a clinical phenotype that is inferred from the combination of laboratory, image and clinical findings about a given patient.
Diagnosis =def. – A conclusion of an interpretive process that has as input a clinical picture of a given patient and as output an assertion to the effect that the patient has a disease of such and such a type.
Definitions - Qualities
Manifestation of a Disease =def. – A bodily feature of a patient that is (a) a deviation from clinical normality that exists in virtue of the realization of a disease and (b) is observable. Observability includes observable through elicitation of response or through the use of special instruments.
Preclinical Manifestation of a Disease =def. – A manifestation of a disease that exists prior to its becoming detectable in a clinical history taking or physical examination.
Clinical Manifestation of a Disease =def. – A manifestation of a disease that is detectable in a clinical history taking or physical examination.
Phenotype =def. – A (combination of) bodily feature(s) of an organism determined by the interaction of its genetic make-up and environment.
Clinical Phenotype =def. – A clinically abnormal phenotype.
For an ontology to succeed,
• potential users should be incentivized to use it
• it should be populated using the terms that they need and using definitions that conform to their understanding of these terms
• it should be easily correctable in light of new research discoveries
• it should enable the data annotated in its terms to be easily integrated with legacy data from related fields
• it should be easily extendable to new kinds of data.
A new kind of Electronic Health Record
resting on the use of the same (public domain) ontologies in mapping proprietary EHR vocabularies to yield patient data annotated in consistent ways that support
• integrated care and continuity of care
• comparison and integration for diagnosis and meta-analysis
• secondary uses for research