Shanghai, Nov. 2004
Michel SIMONET
TIMC-IMAG research laboratory
Université Joseph Fourier, Grenoble - France
ONTOLOGIES beyond fashion
A short introduction to ontologies and the Semantic Web

FASHION: the Semantic Web
1998: Tim Berners-Lee: “Machine and machine, man and machine can communicate”
Communication / Understanding
based on ONTOLOGIES

ONTOLOGIES: a short introduction
• An example through Information Retrieval
• History, definition, examples
• Ontologies and data integration
• Ontologies and Information systems
  – Ontology as the starting point of an Information System
  – From ontology to database and software
  – GENNERE example

An example through Information Retrieval

The Medical Information Gap*
[Figure: Medical Professionals & Users reach Heterogeneous Medical Literature Databases and the Internet (TOXLINE, CancerLit, EMIC, Hazardous Substances Databank, MEDLINE) through Current Information Interfaces.]
* Aronson AR, Rindflesch TC. Query Expansion Using the UMLS Metathesaurus. In: AMIA Annual Fall Symposium; 1997. p. 485-89.

Information search on the Internet
• Google
  + Easy to use – Natural language
  ? Quality of result
  – Time-consuming
• Medline - PubMed
  + High quality of scientific content
  ? Ease of use – Controlled vocabulary (MeSH thesaurus)
  – Time-consuming
• Objective: Concept-based rather than word-based search

Word-based search
Query: Breast removal
Searching the Google index with "Breast and Removal" or "Breast or Removal":
• Breast removal: pertinent
• Breast augmentation and Laser hair removal: noise
• The removal of breast cancer: noise
• Mastectomy: silence (not retrieved)
Mastectomy is a synonym of Breast Removal

Concept-based search
Query: Breast removal
Searching by concept (MASTECTOMY, RADICAL MASTECTOMY):
• Breast removal / Ablation du sein: pertinent
• Mastectomy / Mastectomie: pertinent
• Mammectomy / Mammectomie: pertinent
• Radical mastectomy / Mastectomie radicale: pertinent

What is an ontology? Graphical representation
[Figure: concept graph. Concepts: TREATMENT, CHEMOTHERAPY, SURGERY, RADIOTHERAPY, ABLATION, MASTECTOMY, RADICAL MASTECTOMY, TUMORECTOMY, HUMAN_BODY, ORGAN, BREAST. Relationships: IS-A, Part-Of, Remove, followed_by. Terms shown: Mastectomy, Mammectomy, Breast removal, Ablation du sein, Μαστεκτομή.]

Benefits of Concept-based Information Retrieval
• Search is automatically extended to synonyms
  – E.g., query: « breast removal » → mastectomy, mammectomy, …
• Independence from query language
  – E.g., query in French: « mastectomie » → answer = documents in any language (e.g., English, French, Spanish, German, Greek, Chinese, …)
• Query expansion using the concept hierarchy
• Result presentation using the ontology's organization
• General orientation of the Semantic Web
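
The first three benefits can be illustrated with a minimal sketch. The tables `TERMS` and `NARROWER` and the functions `concept_of`, `expand` and `search` are hypothetical names chosen for this illustration, and the plain substring matching is an assumed simplification, not how a real retrieval engine works. A query term is mapped to its concept, then expanded to every term of that concept and of its narrower concepts, whatever the language.

```python
# Hypothetical toy data: concept -> terms (any language), concept -> narrower concepts.
TERMS = {
    "MASTECTOMY": ["breast removal", "mastectomy", "mammectomy",
                   "mastectomie", "mammectomie", "ablation du sein"],
    "RADICAL_MASTECTOMY": ["radical mastectomy", "mastectomie radicale"],
}
NARROWER = {"MASTECTOMY": ["RADICAL_MASTECTOMY"]}


def concept_of(term):
    """Return the concept a term denotes, or None if the term is unknown."""
    wanted = term.strip().lower()
    for concept, terms in TERMS.items():
        if wanted in terms:
            return concept
    return None


def expand(term):
    """Expand a term to all terms of its concept and of its narrower concepts."""
    root = concept_of(term)
    if root is None:
        return [term]  # unknown term: fall back to a word-based search
    expanded, stack, seen = [], [root], set()
    while stack:
        concept = stack.pop()
        if concept in seen:
            continue
        seen.add(concept)
        expanded.extend(TERMS[concept])
        stack.extend(NARROWER.get(concept, []))
    return expanded


def search(term, documents):
    """Return the documents that contain any expanded term (case-insensitive)."""
    terms = expand(term)
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]


documents = [
    "Mastectomy outcomes",
    "Breast augmentation and laser hair removal",
    "Mastectomie radicale: indications",
    "The removal of breast cancer",
]
hits = search("breast removal", documents)
```

With this toy data the two noisy documents of the word-based search are no longer returned, and the French document is found from an English query.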

Definition, examples, history

What is an ontology?
• The Origins: Plato and Aristotle
• A need to organize knowledge
• History in Computer Science
• Various definitions
• Consensus definition
• Example
• W3C hierarchy of languages
• Ontology usages

The Origins: Plato and Aristotle
Aristotle: the study of beings insofar as they exist
Reality: Individuals (Plato, John) vs Concepts (Human)
What is universal, beyond particular representations?
Categories of being
- Physical objects
- Minds
- Classes
- Properties
- Relations
Porphyry (3rd century): Porphyry's trees
Categorization by identity and difference
The basis of contemporary ontologies

History: a need to organize knowledge
Classifications in biology: Linné (1707-1778)
Thesaurus in Information Retrieval: consensus about names and structure
Philosophy
- Russell, Wittgenstein, Frege, Husserl, Peirce
- Nicola Guarino
- Barry Smith
A.I.
- Gruber (1990), Stanford: KIF, Ontolingua
- Sowa: Conceptual Graphs
- Description Logics (DL)
- Semantic Web

History in Computer Science
Semantic Networks (Shank - 1968)
- Concepts and relationships
- Confusion between concepts and individuals: STUDENT IS-A PERSON, John IS-A PERSON
Conceptual Graphs (Sowa – 1980)
- Formalization of semantic networks
- First-Order Logic
Gruber (1990 – Stanford)
- KIF: Knowledge Interchange Format
- Ontolingua: a language and a platform for ontology exchange
- 1st use of the term Ontology in Computer Science

Consensus definition
CONCEPTS
RELATIONSHIPS between concepts
- IS-A relationships (generic/specific)
- part-of relationships
- other relationships
VOCABULARY + preferred term for a concept
DEFINITIONS
- informal, in natural language
- formal (e.g., Description Logics)

Example: Breast Cancer* (1)
CONCEPTS: BREAST SURGERY MASTECTOMY
RELATIONSHIPS between concepts
- LEFT MASTECTOMY IS-A BREAST SURGERY MASTECTOMY
- ENTIRE LEFT BREAST is-proper-material-part-of ENTIRE LEFT THORAX
- LEFT MASTECTOMY has-theme ENTIRE LEFT BREAST
VOCABULARY + preferred term for a concept: breast surgery mastectomy
* From the INFACE Ontology by Language and Computing (www.landc.be)

Example: Breast Cancer* (2)
CONCEPTS: e.g., MASTECTOMY
RELATIONSHIPS between concepts
- MASTECTOMY IS-A ABLATION
- ORGAN part-of HUMAN BODY
- TUMORECTOMY followed-by RADIOTHERAPY
VOCABULARY + preferred term
- mastectomy, mammectomy, breast removal, mastectomie, mammectomie, ablation du sein, μαστεκτομή, …
DEFINITIONS
- Surgical removal of the breast
* From a Patient-oriented ontology by Radja Messai – TIMC (UJF)
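
As an illustration only, the four components of the consensus definition (concepts, relationships, vocabulary with a preferred term, definitions) can be held in a small data structure. The class names `Concept` and `Ontology` and the method `related` are hypothetical; the data comes from the example above. This is a sketch of one possible in-memory representation, not the format used by the ontology cited here.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    preferred_term: str
    terms: list = field(default_factory=list)  # vocabulary, in any language
    definition: str = ""                       # informal definition


@dataclass
class Ontology:
    concepts: dict = field(default_factory=dict)       # name -> Concept
    relationships: list = field(default_factory=list)  # (source, relation, target)

    def add_concept(self, concept):
        self.concepts[concept.name] = concept

    def relate(self, source, relation, target):
        self.relationships.append((source, relation, target))

    def related(self, name, relation):
        """Follow one relation transitively, e.g. all IS-A ancestors of a concept."""
        found, frontier = [], [name]
        while frontier:
            current = frontier.pop()
            for source, rel, target in self.relationships:
                if source == current and rel == relation and target not in found:
                    found.append(target)
                    frontier.append(target)
        return found


onto = Ontology()
onto.add_concept(Concept(
    name="MASTECTOMY",
    preferred_term="mastectomy",
    terms=["mastectomy", "mammectomy", "breast removal",
           "mastectomie", "mammectomie", "ablation du sein", "μαστεκτομή"],
    definition="Surgical removal of the breast",
))
onto.relate("MASTECTOMY", "IS-A", "ABLATION")
onto.relate("ORGAN", "part-of", "HUMAN BODY")
onto.relate("TUMORECTOMY", "followed-by", "RADIOTHERAPY")
```

Keeping relationships as (source, relation, target) triples treats IS-A, part-of and other relationships uniformly, which is also the shape RDF uses.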

What is an ontology? Graphical representation
[Figure: concept graph linking TREATMENT, CHEMOTHERAPY, SURGERY, RADIOTHERAPY, ABLATION, MASTECTOMY, RADICAL MASTECTOMY, TUMORECTOMY, HUMAN_BODY, ORGAN and BREAST through IS-A, Part-Of, Remove and followed_by relationships.]

W3C hierarchy of languages*
• XML (eXtensible Markup Language)
• XML Schema (XSD)
• RDF (Resource Description Framework)
• RDF Schema
• OWL (Web Ontology Language)
* In the framework of the Semantic Web

ONTOLOGIES: Data Integration

Problem
• Various types of data
  – Structured: databases
  – Informal: texts
  – Semi-structured: XML
• Heterogeneity
  – Query languages
  – Structure, vocabulary, …
[Figure: Database, XML, Texts: ?]

Query through an Ontology
[Figure: the User queries an ontology (concepts shown: Anatomo-functional concept, Anatomical concept, Functional concept; e.g., Hindbrain, Midbrain). Correspondences link the ontology, through Adaptor1, Adaptor2 and Adaptor3, to the sources: Databases, XML, Texts.]
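
The architecture in this figure can be sketched as follows. Everything named here is an assumption made for illustration: the classes `DatabaseAdaptor` and `TextAdaptor`, the `correspondence` tables, the `region` column and the sample data do not come from the slides. The point shown is that the user asks for a concept once, and each adaptor translates it into the vocabulary and structure of its own source.

```python
class DatabaseAdaptor:
    """Hypothetical adaptor over structured rows with a 'region' column."""
    # Correspondence: ontology concept -> local code (assumed values).
    correspondence = {"MIDBRAIN": "mesencephalon"}

    def __init__(self, rows):
        self.rows = rows

    def find(self, concept):
        code = self.correspondence.get(concept)
        if code is None:
            return []  # this source knows nothing about the concept
        return [row["title"] for row in self.rows if row["region"] == code]


class TextAdaptor:
    """Hypothetical adaptor over free texts, matching any term of the concept."""
    correspondence = {"MIDBRAIN": ["midbrain", "mesencephalon"]}

    def __init__(self, texts):
        self.texts = texts

    def find(self, concept):
        terms = self.correspondence.get(concept, [])
        return [t for t in self.texts if any(term in t.lower() for term in terms)]


def query(concept, adaptors):
    """Send one concept-level query to every source and merge the answers."""
    results = []
    for adaptor in adaptors:
        results.extend(adaptor.find(concept))
    return results


db = DatabaseAdaptor([
    {"title": "Study A", "region": "mesencephalon"},
    {"title": "Study B", "region": "cerebellum"},
])
texts = TextAdaptor(["Lesions of the midbrain", "Hindbrain development"])
answers = query("MIDBRAIN", [db, texts])
```

Adding a new source then means writing one adaptor and its correspondence table; the user-facing query does not change.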

Ontologies and Information systems

Ontologies and Information Systems in the Health Field
• Common understanding? / Consensus?
• Communication / Sharing
ONTOLOGIES
- WHAT do we speak about? CONCEPTS, DEFINITIONS
- HOW do we speak of it? VOCABULARY
Information Systems / Ontologies

Information Systems
« The Information System is a support for communication inside the enterprise and between the enterprise and its environment »
G. Panet & R. Letouche: Modèles techniques Merise avancés.
• The Real Organization
  – transforms itself, acts
  – communicates
  – memorises
• The system which is built to REPRESENT
  – Actions
  – Communication
  – Memorisation
[Figure: HUMAN, Software, Databases, Data warehouses, Ontologies]

[Figure: example ontology relating cognitive tests, cognitive functions and brain anatomy.
CONCEPTS: Cognitive test (Verbal agility, Picture naming, Object assembly); Cognitive Function (Vision, Memory, Language, Retrograde memory, Anterograde memory, Semantic retrograde memory); Cerebral Cortex (Temporal Lobe, Occipital Lobe, Parietal Lobe, Frontal Lobe).
Relationship types in the legend: ISA relationship, part-of relationship, transversal relationship (validates, responsible for, above).
Vocabulary: {Memory, memory function, mémoire}
Definitions]

What is an ontology?
• CONCEPTS
• RELATIONSHIPS between concepts
  • ISA relationships (generic/specific)
  • part-of relationships
  • other relationships
• VOCABULARY + preferred term for a concept
• DEFINITIONS
  • informal, in natural language
  • formal
CONSENSUS

Example in mycology
ISA concepts hierarchy
Formal definitions of constraints
- Object classification / identification
- Detection of inconsistent definitions
Knowledge Base: Diagnosis aid
[Figure: ISA hierarchy with Gilled mushroom (Champignon à lames), Russula (Russule), Amanita (Amanite), Virescens, Cyanoxantha. Constraints shown on concepts: colour: {white, cream}, flesh: {grainy, brittle}, …; colour: white, flesh: grainy, fleshColour: white; fleshColour: pinkish, … Instance to classify: colour: white, flesh: grainy, fleshColour: white.]

Formal Ontologies and Knowledge Bases
CONCEPT CLASSIFICATION
- PERSON
  - ADULT (age ≥ 18)
    - SENIOR (age ≥ 65)
  - MINOR (age < 18)
INSTANCE CLASSIFICATION
- an instance with age = 70 is classified under PERSON, ADULT and SENIOR
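
Instance classification can be sketched in a few lines, assuming each concept is written as a predicate on the instance. The dictionary `CONCEPTS` and the function `classify` are hypothetical names; a Description Logic reasoner works from formal definitions rather than executable tests, so this only illustrates the result, not the mechanism.

```python
# Concept definitions from the slide, written as predicates on an instance.
# Assumption: an instance is a dict, and a PERSON is anything that has an age.
CONCEPTS = {
    "PERSON": lambda p: "age" in p,
    "ADULT": lambda p: p["age"] >= 18,
    "SENIOR": lambda p: p["age"] >= 65,
    "MINOR": lambda p: p["age"] < 18,
}


def classify(instance):
    """Return every concept whose definition the instance satisfies."""
    if not CONCEPTS["PERSON"](instance):
        return []  # all other concepts specialise PERSON
    return [name for name, test in CONCEPTS.items() if test(instance)]


memberships = classify({"age": 70})
```

Here `memberships` holds PERSON, ADULT and SENIOR for the instance with age = 70, matching the figure.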

Formal Ontologies and Knowledge Bases
Formal Ontology: representation in a logical formalism
E.g.: Description Logic (DL)
Inference
- Subsumption: Concept Classification, Consistency
- Instance Classification
DefConcept PERSON name:STRING and age:INT
DefConcept ADULT = PERSON and age ≥ 18
DefConcept SENIOR = PERSON and age ≥ 65
DefConcept SENIOR1 = SENIOR and age < 60
CONSISTENCY: SENIOR1 ≡ EMPTY_CONCEPT
SUBSUMPTION: SENIOR ⊑ ADULT
SENIOR subsumed by ADULT
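
For this example, where every concept is PERSON restricted by bounds on age, subsumption and consistency reduce to interval reasoning. The sketch below rests on that assumption; `interval`, `is_empty` and `subsumed_by` are hypothetical helpers, and a real Description Logic reasoner handles far more general definitions.

```python
import math


def interval(*constraints):
    """Intersect (operator, bound) constraints on age into a range [low, high)."""
    low, high = -math.inf, math.inf
    for op, bound in constraints:
        if op == ">=":
            low = max(low, bound)
        elif op == "<":
            high = min(high, bound)
        else:
            raise ValueError(f"unsupported operator: {op}")
    return low, high


def is_empty(concept):
    """CONSISTENCY check: a concept is empty if no age can satisfy it."""
    low, high = concept
    return low >= high


def subsumed_by(a, b):
    """SUBSUMPTION check: every instance of a is also an instance of b."""
    return is_empty(a) or (b[0] <= a[0] and a[1] <= b[1])


PERSON = interval()
ADULT = interval((">=", 18))
SENIOR = interval((">=", 65))
SENIOR1 = interval((">=", 65), ("<", 60))  # SENIOR and age < 60
```

`subsumed_by(SENIOR, ADULT)` holds while the converse does not, which is what places SENIOR under ADULT in the classified hierarchy; `is_empty(SENIOR1)` detects the inconsistent definition.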

Database Design
First Step: Micro-ontology of the domain
• Identify the concepts of the domain
• Determine relationships and their cardinalities
[Figure: ontological schema. Concepts: PERSON, PATIENT, DOCTOR, ACT, HOSPITAL, DISEASE. Attributes: NAME, 1st Name, SEX, NHS, SPEC, CC. Relationships: prescribes, pays, consults, coding. Cardinalities shown: 1,*; 1,*; 1,1.]

Database Design
Ontological Schema: name things and their relationships
  Choice
  - model constraints (associations, …)
  - cultural choices
  - object / value
E-R, UML Schema: classes, methods
  Choice
  - model constraints
  - atomic attributes
  - normalization
Relational Schema: tables
  Choice (optimize)
  - index
  - buffers
  - DBMS specific features
Physical level: files, …
Evolutions
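
To make the step from ontological schema to relational schema concrete, here is one possible mapping of a fragment of the micro-ontology (PERSON, PATIENT, DOCTOR and the consults relationship), sketched with Python's standard `sqlite3` module. The table and column names, the placement of SEX on patient and SPEC on doctor, and the mapping of ISA to one table per sub-concept are all assumptions made for illustration; they are examples of the "choices" listed above, not the GENNERE schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    first_name TEXT NOT NULL
);
-- Choice: ISA mapped to one table per sub-concept, sharing the parent's key.
CREATE TABLE patient (
    person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
    sex       TEXT
);
CREATE TABLE doctor (
    person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
    spec      TEXT
);
-- Choice: the 'consults' relationship mapped to a table holding both keys.
CREATE TABLE consults (
    patient_id INTEGER NOT NULL REFERENCES patient(person_id),
    doctor_id  INTEGER NOT NULL REFERENCES doctor(person_id),
    PRIMARY KEY (patient_id, doctor_id)
);
"""


def build():
    """Create the sketch schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = build()
tables = sorted(
    row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
```

The ISA link between PATIENT and PERSON survives only as a shared key, which illustrates how part of the initial semantics is no longer explicit once the schema is relational.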

Database Design
Evolution level / Ontological level
Corrective & evolutive maintenance (80% cost)
Loss of initial semantics
Maintain links between levels
Understanding, Mastering

GENNERE Project: achievements and perspectives
• GENNERE database and software for 2 fields
  – Nephrology (ESRD: End-Stage Renal Disease)
  – Rheumatology (RA: Rheumatoid Arthritis)
  User testing and validation are ongoing at Rui Jin hospital
• Perspectives
  – Data Warehouse for epidemiological studies
  – Geographical Information Systems
  – Improve tools and methods for genericity

Genericity: Achievements and Limits
• Genericity
  – Common core ontology: PATIENT – FOLLOW-UP – TREATMENT
    Common schema concepts and attributes
  – Common set of events: New Patient, New Treatment, Patient Transfer, Decease, …
  – Automatic generation of database (ISIS CASE tool)
  – Database access through views (with limitations due to DBMS)
  – Intensive use of metadata
    • Domain values: Comorbidities, …
    • Multilingualism (UTF-8): GUI items, domain values
  – Standard medical classifications (thesaurus)

Genericity: Achievements and Limits
• Limits
  – Disease-specific data
    • e.g., vaccinations for RA
    • core concepts (DISEASE, TREATMENT) have to be derived according to each disease
      Rheumatology TREATMENT is more complex
  – Country-specific data
    • Culture and health care system are different
    • Patient identification, addresses
  – No standard multilingual version of ICD-10
  – Thesaurus translation into Chinese
  – No framework for automatic GUI generation

Conclusions
Genericity was partly achieved
  – Gain around 50% for 2nd disease (RA) although more complex
  – Multilingualism: Chinese, English, French
Extend the ISIS CASE tool
  – To deal explicitly with Generic and Specific concepts
    Through an XML (OWL?) description of the domain
  – To perform automatic GUI generation
    So as to ease a strong interaction with users

Contact 联系方式
Shanghai / Grenoble
• Michel SIMONET
• Didier GUILLON
• Dr Haijin YU 俞海瑾 [email protected]

GENNERE partners
Institutions: Paris - NECKER; Belgium - RAMIT; Grenoble - AGDUC; Grenoble – TIMC IMAG; Shanghai – RUI JIN
People: Paul LANDAIS, Michel & Ana SIMONET, Didier GUILLON, Georges de MOOR, Nan CHEN, Wen ZHANG, Michel FORET, Philippe GAUDIN

Results (1)

Results (2)