Can there be such a thing as Ontology Engineering?
Invited talk at Carleton University, Ottawa
Robert Stevens
BioHealth Informatics Group
University of Manchester
Introduction
A bit of ontology introduction if required
What is engineering?
Predictability in ontology engineering
The application of deterministic principles
The role of strict semantics
The role of philosophy
Acquiring some level of reproducibility
A World of Instances
The world (of information) is made up of things and lots of them
Instances, individuals, objects, tokens, particulars
The Earth is a kind of Planet
Robert Stevens (NE 67 41 58 A) is a Person
All the individual Alpha Haemoglobins in my many Instances of Red Blood Cell
Each cell instance in my Body has copies of some 30,000 Genes
A Word, language, idea, etc.
This Table, those Chairs, Any Thing with “A”, “The”, “That”, etc. before it…
We Put things into Categories
All these instances hang about making our world
Putting these things into categories is a fundamental part of human cognition
Psychologists study this as concept formation
The same instances are put into a category
The capitalised and italicised words in the slide before last
We have Labels for the Categories and their Instances
We label categories with symbols: Words
“Lion” is a category of big cat with big teeth
Gene, Protein, Cell, Person, Hydrolase Activity, etc.
…and, as we’ve already seen, each category can have many labels and any particular label can refer to more than one category
Semantic Heterogeneity
“A lion” is an instance in that category
Does the category “Lion” exist?
Lions exist, but the category could just be a human way of talking about lions
… we like putting things into categories
A Controlled Vocabulary
A specified set of words and phrases for the categories in which we place instances
Natural language definitions for those words and phrases
A glossary defines, but doesn’t control
The UniProt keywords define and control
Control is placed upon which labels are used to represent the categories (concepts) we’ve used to describe the instances in the world
…, but there is nothing about how things in these categories are related
Biopolymer
DNA
Enzyme
Nucleic acid
mRNA
Polypeptide
snRNA
tRNA
We also like to Relate Things Together
Categories have subcategories
Instances in one category can be related in some way to instances in another
Can relate instances to each other in many different ways
Is-a, part-of, develops-from, etc. axes
We can use these relationships to classify categories
Things in category A are part is
If all instances in category A are also in category B then As are kinds of Bs
Biopolymer
  Nucleic Acid
    DNA
    RNA
      tRNA
      mRNA
      snRNA
  Polypeptide
    Enzyme
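The rule "if all instances in category A are also in category B then As are kinds of Bs" can be sketched with plain sets. This is only an illustration: the instance identifiers and the helper name `is_kind_of` are made up for the example and do not come from the talk.

```python
# Categories modelled as sets of instance identifiers (hypothetical instances).
rna = {"tRNA-1", "mRNA-1", "snRNA-1"}
dna = {"DNA-1"}
nucleic_acid = rna | dna            # every RNA and DNA instance
polypeptide = {"enzyme-1"}
biopolymer = nucleic_acid | polypeptide


def is_kind_of(a: set, b: set) -> bool:
    """As are kinds of Bs if every instance of A is also an instance of B."""
    return a <= b


print(is_kind_of(rna, nucleic_acid))   # RNA is a kind of Nucleic Acid
print(is_kind_of(nucleic_acid, rna))   # but not the other way round
```

The subset test only speaks about the instances we happen to have listed; an ontology makes the stronger claim that the relationship holds for all possible instances.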
Categories and sub-categories
biopolymer
  polypeptide
    enzyme
  nucleic acid
    DNA
    RNA
Describing Category Membership
We can make conditions that any instance must fulfil in order to be a member of a particular category
A Phosphatase must have a phosphatase catalytic domain
A Receptor must have a transmembrane domain
A codon has three nucleotide residues
A limb has part that is a joint
A man has a Y chromosome and an X chromosome
A woman has only an X chromosome
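As a rough sketch, conditions like these can be written as checks that an instance must pass before it is allowed into a category. The `Instance` class, the `CONDITIONS` table and `can_be_member` are hypothetical names chosen for the illustration, and describing an instance only by a list of its parts is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A thing in the world, described only by the parts it has (assumed model)."""
    label: str
    parts: list = field(default_factory=list)


# One necessary condition per category, taken from the statements above.
CONDITIONS = {
    "Phosphatase": lambda i: "phosphatase catalytic domain" in i.parts,
    "Receptor": lambda i: "transmembrane domain" in i.parts,
    "Codon": lambda i: i.parts.count("nucleotide residue") == 3,
}


def can_be_member(instance: Instance, category: str) -> bool:
    """True if the instance fulfils the condition stated for the category."""
    return CONDITIONS[category](instance)


codon = Instance("AUG", ["nucleotide residue"] * 3)
dimer = Instance("AU", ["nucleotide residue"] * 2)
print(can_be_member(codon, "Codon"))
print(can_be_member(dimer, "Codon"))
```

Passing the check does not by itself make something a member; these are conditions any member must fulfil, not a full definition.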
Relationships
These conditions are made from a property and a successor relationship
isPartOf, hasPart
isDerivedFrom
developsFrom
isHomologousTo
…and many, many more
A Structured Controlled Vocabulary
Not only can we agree on the labels we give categories
Can also agree on how the instances of categories are related
And agree on the labels we give the relations
Structure aids querying and captures knowledge with greater fidelity
Biopolymer
  Nucleic Acid
    DNA
    RNA
      tRNA
      mRNA
      snRNA
  Polypeptide
    Enzyme
Gene
Relations: regionOf, transcribedFrom, translatedFrom
Manchester Mercury, January 1st 1754
List of diseases & casualties this year
Aged 1456
Consumption 3915
Convulsion 5977
Dropsy 794
Fevers 2292
Smallpox 774
Teeth 961
Bit by mad dogs 3
Broken Limbs 5
Bruised 5
Burnt 9
Drowned 86
Excessive Drinking 15
Executed 18
Found Dead 34
Frighted 2
Kill'd by falls and other accidents 55
Kill'd themselves 36
Murdered 3
Overlaid 40
Poisoned 1
Scalded 5
Smothered 1
Stabbed 1
Starved 7
Suffocated 5
19276 burials
15444 christenings
Deaths by centile
Uses of Ontology in Bioinformatics
What is engineering?
The American Engineers' Council for Professional Development defines "engineering" as:
“The creative application of scientific principles to design or develop structures, machines, apparatus, or manufacturing processes, or works utilizing them singly or in combination; or to construct or operate the same with full cognizance of their design; or to forecast their behavior under specific operating conditions; all as respects an intended function, economics of operation and safety to life and property.”
Taken from http://en.wikipedia.org/wiki/Engineering
What Type of Artefact? The Rise of the Computer Science Ontology
A term borrowed from philosophy
Not supposed to be the same thing, but…
Meant to deliver formal, computational semantics to applications and humans
Necessarily involves consensus
Software engineering life cycle
http://www.samsvb.co.uk
Ontology
Where are we in the Development of Ontology Engineering?
At about 1975…
There’s a lot of craft involved
Too much reliance on gurus
Could two independent sets of ontologists develop two ontologies for the same domain with the same utility?
Can we cost ontology building?
Do we know when we have success?
The Waterfall Method
Requirements
Conceptualisation
Development + Coding
Quality + Testing
Maintenance + Support
Getting it right first time
Something a bit more agile
Requirements, scoping, Competency questions
Knowledge acquisition
Conceptualisation, pattern forming
Axiomatization
Testing / evaluation?
Repeated, small iterations
Users always involved
Four Broad Areas of Ontology Engineering
1. Technical aspects: Code repositories, issue trackers, editors, and so on
2. Coding styles and naming conventions, etc.
3. Choosing a class, placing it in a hierarchy and choosing relationships and entities by which it is described.
4. The rhetoric behind how (2) and (3) are done. One can have philosophical justification for any decision, or it can just be practically useful….
Getting the Requirements Right
Truth and beauty is an easy requirement to state
Just model the world as it is and all else will flow from this
Not necessarily helpful
Have to set a scope
Have to set priorities – what do we most need to represent?
Competency questions – what do I need to be able to answer?
Separating “what the ontology must answer” and “what the ontology must enable to be answered”
Requirements change; keeping it “agile”
Setting priorities
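One way to make competency questions concrete is to keep them as executable checks that are re-run as the requirements change. The sketch below is an assumption-laden toy: the triples, the question wording and the helper names (`TRIPLES`, `ancestors`, `unanswered`) are invented for illustration, and a real project would query the ontology itself rather than a hand-written set of tuples.

```python
# Toy knowledge base of (subject, relation, object) triples; contents are hypothetical.
TRIPLES = {
    ("mRNA", "is_a", "RNA"),
    ("RNA", "is_a", "Nucleic Acid"),
    ("Nucleic Acid", "is_a", "Biopolymer"),
    ("mRNA", "transcribedFrom", "Gene"),
}


def ancestors(term: str) -> set:
    """All terms reachable from `term` by following is_a links."""
    found, frontier = set(), [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in TRIPLES:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


# Each competency question is paired with a check that must succeed.
COMPETENCY_QUESTIONS = {
    "Is mRNA a kind of biopolymer?": lambda: "Biopolymer" in ancestors("mRNA"),
    "What is mRNA transcribed from?": lambda: ("mRNA", "transcribedFrom", "Gene") in TRIPLES,
}


def unanswered() -> list:
    """Questions the current content cannot yet answer."""
    return [q for q, check in COMPETENCY_QUESTIONS.items() if not check()]


print(unanswered())
```

An empty list means every stated question is answerable; it says nothing about questions nobody thought to write down.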
Strict Semantics
Languages such as OWL have a strict semantics
Statements have a precise and interpretable meaning
Deductions can follow from a series of statements
Can be used to aid development and use of the ontology
Correct, but Wrong…
An automated reasoner for OWL can make sure all your axioms are coherent
One can make sure the ontology is structurally robust
The statements in the ontology can still be rubbish though…
A strict semantics lends some kind of predictability to an ontology
A pure description logic approach of all defined classes has some appeal…
Total Definition
In OWL a defined class can find its own place in the hierarchy
A parent is any person that has a child
A mother is any woman that has a child
As a woman is a kind of person, we can infer a mother to be a kind of parent
Do this for all classes; press the button and you have an ontology
Definition is hard (but that may be a good thing) and the tools may lack
Requires discipline from the authors
…and it all grounds out to a primitive somewhere along the line…
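The parent/mother inference can be imitated with a very small structural check. This is a toy under stated assumptions, not how an OWL reasoner works: each defined class is reduced to a genus plus a set of "property some filler" restrictions, the filler `Person` for `hasChild` is assumed, and all names (`PRIMITIVE_PARENT`, `DEFINED`, `subsumed_by`) are hypothetical.

```python
# Asserted hierarchy of primitive classes: child -> parent.
PRIMITIVE_PARENT = {"Woman": "Person", "Person": "Thing"}

# Defined classes: name -> (genus, restrictions), read as
# "genus and (property some filler)" for each (property, filler) pair.
DEFINED = {
    "Parent": ("Person", {("hasChild", "Person")}),
    "Mother": ("Woman", {("hasChild", "Person")}),
}


def primitive_subsumed(sub, sup) -> bool:
    """True if `sub` equals `sup` or reaches it via the asserted hierarchy."""
    while sub is not None:
        if sub == sup:
            return True
        sub = PRIMITIVE_PARENT.get(sub)
    return False


def subsumed_by(sub: str, sup: str) -> bool:
    """Infer that defined class `sub` is a kind of defined class `sup`."""
    sub_genus, sub_restrictions = DEFINED[sub]
    sup_genus, sup_restrictions = DEFINED[sup]
    if not primitive_subsumed(sub_genus, sup_genus):
        return False
    # Every restriction on the superclass must be matched by one on the subclass.
    return all(
        any(p == q and primitive_subsumed(f, g) for q, f in sub_restrictions)
        for p, g in sup_restrictions
    )


print(subsumed_by("Mother", "Parent"))
print(subsumed_by("Parent", "Mother"))
```

Nobody asserted that a mother is a parent; it follows from the two definitions plus the single primitive statement that a woman is a person.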
Normalisation
An “engineering” method to manage polyhierarchies in ontology through reasoning
Make a strict tree of primitive classes using one criterion
Put all other criteria as restrictions upon those classes
Re-establish the polyhierarchy through defined classes with the “other” criteria…
http://ontogenesis.knowledgeblog.org/49
Authoring Tools
These are really just axiom editors
Support for the surrounding processes is nascent
Lots of “hand-crafting” of even large ontologies
Knowledge gathering tools; organising tools; axiom generation tools; checking and validation tools; …
Protégé 4
Patterns and Components
Software Design Patterns: Accepted design solutions to common problems
Application building at the level of components
Design pattern analogy in ontologies
Patterns or regularities that are not ODP
Ontologies tend to be repetitious and humans tend to be bad at repetition – tedium kicks in…
Calls for automation
Ontology Pre-Processor Language
Pattern
A cell type is equivalent to a cell type that is part of some anatomy
OPPL Script
Variables:
?cell:CLASS, ?anatomyPart:CLASS, ?anatomy:CLASS =
(CL:0000000 part_of some ?anatomyPart)
Create axioms:
BEGIN
ADD ?cell equivalentTo ?anatomy
END;
Variable mapper
?cell -> ‘Kidney Cell’ [CL:0003523]
?anatomyPart -> ‘Kidney’ [FMA:629093]
Example
A ‘Kidney Cell’ is equivalent to a cell that is part of the ‘Kidney’
Generated OWL (Manchester Syntax)
Class: CL:0003523
Annotation: rdfs:label ‘Kidney Cell’
EquivalentTo: CL:0000000 and OBO_REL:part_of some FMA:629093
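The idea of filling a pattern from a variable mapping can be mimicked with ordinary string templating. To be clear, this is not OPPL and does not use any OWL library: `PATTERN` and `instantiate_pattern` are hypothetical names, and the output is just text laid out like the Manchester Syntax frame in the example above.

```python
# Template for the "cell that is part of some anatomy" pattern.
# CL:0000000 and OBO_REL:part_of are the identifiers used in the example.
PATTERN = (
    "Class: {cell}\n"
    "    Annotation: rdfs:label '{label}'\n"
    "    EquivalentTo: CL:0000000 and OBO_REL:part_of some {anatomy_part}\n"
)


def instantiate_pattern(cell: str, label: str, anatomy_part: str) -> str:
    """Fill the pattern with one row of the variable mapping."""
    return PATTERN.format(cell=cell, label=label, anatomy_part=anatomy_part)


# Each row plays the role of one variable mapping: (?cell, label, ?anatomyPart).
ROWS = [
    ("CL:0003523", "Kidney Cell", "FMA:629093"),
]

for row in ROWS:
    print(instantiate_pattern(*row))
```

Adding a cell type then means adding a row rather than hand-writing another axiom, which is where the consistency comes from.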
Automation
Moving from hand-crafting to production line
Can try things out and then re-model (as long as the entities involved don’t change)
Documents what has been done
Ruthlessly consistent
Also need support in repetitious knowledge gathering as well as axiom generation
Populous
Generic tool for populating ontology templates
Spreadsheet style interface
Supports validation at the point of data entry
Expressive Pattern language for OWL Ontology generation
http://www.e-lico.eu/populous
Evaluation
A big “can of worms”
Closely linked to requirements
Closely linked to what one believes an ontology to be…
“Just do what I say and it will be OK” isn’t an evaluation strategy
Nor is saying “just model reality” and that’s all you need to evaluate
No really convincing way of doing it
The Role of philosophy
Biology
Computer Science
Philosophy
Angels on the head of a pin
Can we have Ontology Engineering?
Probably, but you’ll have to wait
Not much predictability, except to say “it’s hard” and “people will disagree with you”
So, much like software engineering
Much to learn from SE and it should be quicker
Programming is not software engineering
Axiom authoring is not ontology engineering
At the moment we’re writing axioms, but realise we need to engineer
Once we can demonstrate, with predictability, that two independent groups can take a method and each produce an ontology that meets some needs then I’ll begin to relax.