Post on 21-Dec-2015

Why a Credit Card Number is Not a Number

Barry Smith, http://ontology.buffalo.edu/smith

1

Why Lite Ontologies will Not Even Work for Cataloging Your Collection of Favorite Rock Bands

2

Ontology for the Intelligence Community: A Strategy for the Future

3

Semantic Web, wikis, statistical textmining, etc.

let a million flowers bloom

How do we create broad-coverage semantic annotation systems that will enable sharing of gigantic bodies of heterogeneous data?

4

let a million flowers (weeds) bloom

How do we create broad-coverage semantic annotation systems that will enable sharing of gigantic bodies of heterogeneous data?

5

Unified Medical Language System (National Library of Medicine)

built by trained experts

massively useful for information retrieval and information integration

creates out of PubMed literature a huge semantically searchable space (much better than Semantic Wiki …)

6

for UMLS

local usage respected, regimentation frowned upon

mappings between ‘synonyms’ full of noise

is_synonymous_with is not transitive

no cross-framework consistency

no concern to establish consistency with basic science

different grades of formal rigor, different degrees of completeness, different update policies
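
The point that is_synonymous_with is not transitive can be made concrete with a small sketch. The two mappings below are invented for illustration and are not taken from UMLS; each is plausible on its own, but chaining them as if synonymy were transitive yields an equivalence nobody asserted.

```python
from itertools import combinations

# Hypothetical 'synonym' mappings; the terms are invented for illustration.
asserted = {
    frozenset({"cold", "common cold"}),
    frozenset({"cold", "low temperature"}),
}

def synonym_clusters(pairs):
    """Group terms into clusters by treating every mapping as transitive."""
    clusters = []
    for pair in pairs:
        merged = set(pair)
        remaining = []
        for cluster in clusters:
            if cluster & merged:
                merged |= cluster      # overlapping clusters collapse into one
            else:
                remaining.append(cluster)
        clusters = remaining + [merged]
    return clusters

def spurious_pairs(pairs):
    """Return term pairs implied by transitivity but never asserted."""
    implied = set()
    for cluster in synonym_clusters(pairs):
        for a, b in combinations(sorted(cluster), 2):
            implied.add(frozenset({a, b}))
    return implied - set(pairs)

for pair in spurious_pairs(asserted):
    print(" is_synonymous_with ".join(sorted(pair)))
```

Here the only implied but unasserted pair links "common cold" with "low temperature", which is the kind of noise the slide describes.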

7

with UMLS-based annotations

we can know what data we have (via term searches)

we can map between data at single granularities (via ‘synonyms’)

can’t combine data

can’t resolve (or even identify) logical conflicts

can’t reason with data

8

with UMLS, Web 2.0, ...

no evolutionary path towards improvement

9

We will be able to use ontologies to help us share data

only if the ontologies represent the world correctly

are humanly intelligible

and computationally tractable

and work well (and thus evolve) together, under adult supervision

10

A new approach

prospective standardization based on objective measures of what works

bring together selected groups to agree on and commit to good terminology / annotation habits preemptively

11

Requirements

for science

ensure legacy annotation efforts are not wasted

create an evolutionary path towards improvement, of the sort we find in science

a collaborative, community effort to ensure buy-in

with rewards for participation

12

for science

Create a consensus core of interoperable domain ontologies

starting with low hanging fruit and working outwards from there

built and validated by trained experts

backed by persons of influence in different communities

13

This solution is already being implemented in the domain of biomedicine

14

Uses of ‘ontology’ in PubMed abstracts

15

By far the most successful: GO (Gene Ontology)

16

The OBO Foundry

a family of interoperable gold standard biomedical reference ontologies, based on the GO, designed to serve the annotation of

scientific literature

biological research data

clinical data

public health data

17

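The Open Biomedical Ontologies (OBO) Foundry

RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT

GRANULARITY: ORGAN AND ORGANISM, CELL AND CELLULAR COMPONENT, MOLECULE

ORGAN AND ORGANISM

Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)

Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)

Occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT

Independent continuant: Cell (CL), Cellular Component (FMA, GO)

Dependent continuant: Cellular Function (GO)

MOLECULE

Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)

Dependent continuant: Molecular Function (GO)

Occurrent: Molecular Process (GO)

18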

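OBO Foundry ontology modules

RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT

GRANULARITY: ORGAN AND ORGANISM, CELL AND CELLULAR COMPONENT, MOLECULE

ORGAN AND ORGANISM

Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)

Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)

Occurrent: Organism-Level Process (GO)

CELL AND CELLULAR COMPONENT

Independent continuant: Cell (CL), Cellular Component (FMA, GO)

Dependent continuant: Cellular Function (GO)

Occurrent: Cellular Process (GO)

MOLECULE

Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)

Dependent continuant: Molecular Function (GO)

Occurrent: Molecular Process (GO)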

19

geospatial, transport, religion, biometrics, demographics, ethnics, politics, law

use common rules drawing on best practices for creating these ontologies ... and for linking them together

20

in the intelligence domain, too:

for science

geospatial, transport, religion, weather, bacteria, chemicals, politics, law

... exploiting the division of labor

... relying on champions in dispersed communities to spread the word

21

Obstacles to the realization of Ontology Modularity in the Intelligence Domain

22

Too few knowledgeable folks, and fewer cleared.

Computer scientists are teaching people ontology tools and ...

Paris has_temperature 62°

Mohammed is_a string

Amount of money is_a integer

Currency has_unit $

Nuclear weapon is_a concept

with thanks to Jen Williams, Ontology Works Inc.

23

What we need 1

thoroughly tested, mandated, common top-level ontology to promote interoperability

institutions for ontology standardization

24

What we need 2

Professional training for ontologists

to teach people to CREATE ONTOLOGY CONTENT

to teach people to USE ONTOLOGY CONTENT

25

What we need 3

Greater organization:

Division of labor for ontology modules plus

Authorities governing

rules for ontology development, versioning, modularity

ensuring interoperability

filling in gaps

sustainability

26

What we need 4

ontology evaluation with teeth

if ontology (science) is to be born, ontologies must die

27

Ontology needs to become more like a science

basis in evidence

established results – authoritative ontologies*

credit for good ontology work

28

UCore Conceptual Data Model

In process of adoption by DoD, DoJ, DHS

http://www.gcn.com/print/27_20/46900-1.html?page=1

Army-Funded NCOR project to create UCore Semantic Layer

29

Treat ontologies like publications

Nature Signaling, Nature Pathway Interactions

Nature Ontologies ?

Ontologies subjected to a process of expert peer review

Peer review methodology being tested within the OBO Foundry

30

Peer review evaluation process

Required where the quality of inputs cannot be evaluated mechanically

31

Peer review assessment tasks

Is the ontology consistent with the policies on modularity?

Does the ontology provide adequate coverage of the defined domain?

To what level is inferencing supported in the ontology relations structure?

Does the ontology interoperate with other ontologies in the system?

32

Is the ontology being developed collaboratively through the engagement and participation of relevant domain stakeholders and developers of neighboring ontologies?

Does the ontology have a tracker for submissions of new terms and notification of errors?

Does the ontology have a help desk which has prompt response times?

33

Verify syntactic correctness in OBO Format, OWL-DL, FOL, or some combination.

Is a URI assigned to each term of the ontology? Does the URI point to required metadata for this term (including definition)?

Verify uniqueness of all identifiers and preferred terms 

Verify correctness of all asserted subclass relations
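
Some of these checks can be run mechanically before a reviewer ever looks at the ontology. The following is a minimal sketch over a toy term list; the record fields (`id`, `label`, `uri`, `definition`, `is_a`) and the `EX:` identifiers are assumptions made for this example, not an OBO or OWL schema. It flags duplicate identifiers, duplicate preferred terms, missing URIs or definitions, and subclass assertions that point at unknown parents. Whether an asserted subclass relation is actually correct is not something this code can decide; that still needs expert review.

```python
from collections import Counter

# Toy ontology records; field names and identifiers are hypothetical.
terms = [
    {"id": "EX:1", "label": "entity", "uri": "http://example.org/EX_1",
     "definition": "Anything that exists.", "is_a": None},
    {"id": "EX:2", "label": "cell", "uri": "http://example.org/EX_2",
     "definition": "A placeholder definition of cell.", "is_a": "EX:1"},
    {"id": "EX:2", "label": "cell", "uri": None,
     "definition": "", "is_a": "EX:9"},
]

def duplicates(values):
    """Return the values that occur more than once, sorted."""
    return sorted(v for v, n in Counter(values).items() if n > 1)

def check(terms):
    """Collect mechanically detectable problems in a list of term records."""
    problems = []
    for dup in duplicates(t["id"] for t in terms):
        problems.append(f"duplicate identifier: {dup}")
    for dup in duplicates(t["label"] for t in terms):
        problems.append(f"duplicate preferred term: {dup}")
    known = {t["id"] for t in terms}
    for t in terms:
        if not t["uri"]:
            problems.append(f"{t['id']}: no URI")
        if not t["definition"]:
            problems.append(f"{t['id']}: no definition")
        # Only dangling parents are detected; correctness needs a reviewer.
        if t["is_a"] is not None and t["is_a"] not in known:
            problems.append(f"{t['id']}: unknown parent {t['is_a']}")
    return problems

for problem in check(terms):
    print(problem)
```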

34

http://code.google.com/p/information-artifact-ontology/

Information Artifact Ontology

35

What is a credit card number?

– not a mathematical object

– not a contingent object with physical properties, taking part in causal relations

– but a historical object, with a very special provenance, relations analogous to those of ownership, existing only within a nexus of working financial institutions of specific kinds
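
One way to see the difference in practice is to model a card number as an identifier with provenance rather than as an integer. The sketch below is an illustration only: the class name, the `issuer`, `holder` and `issued_on` fields, and the sample digits are all invented for this example. It shows that comparison for sameness makes sense, that converting to an integer silently changes the value, and that arithmetic is simply undefined.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class CreditCardNumber:
    """An identifier: a digit string plus the provenance that makes it one."""
    digits: str        # kept as text; leading zeros and length matter
    issuer: str        # hypothetical field: the issuing institution
    holder: str        # hypothetical field: who it was issued to
    issued_on: date    # hypothetical field: when it was issued

    def __post_init__(self):
        if not (self.digits.isascii() and self.digits.isdigit()):
            raise ValueError("a card number is written with digits only")

# Made-up sample data.
a = CreditCardNumber("0000123412341234", "Example Bank", "A. Holder", date(2015, 1, 1))
b = CreditCardNumber("0000123412341234", "Example Bank", "A. Holder", date(2015, 1, 1))

print(a == b)            # asking whether two records name the same card is meaningful
print(int(a.digits))     # treating the digits as an integer drops the leading zeros
try:
    a + b                # adding two card numbers means nothing, so it is not defined
except TypeError as err:
    print("TypeError:", err)
```

Storing the digits as text and leaving arithmetic undefined is the design choice that mirrors the slide: the digits are a pattern with a provenance, not a quantity.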

36

Basic Formal Ontology (BFO)

Continuant

IndependentContinuant: thing

DependentContinuant: quality, role, function …

Occurrent: process

37

Blinding Flash of the Obvious

Continuant

IndependentContinuant: thing

DependentContinuant: quality

Occurrent: process

quality depends on bearer

38

What is a datum?

Continuant

IndependentContinuant: laptop, book

DependentContinuant: quality

Occurrent: process

datum: a pattern in some medium with a certain kind of provenance

39

Continuant

IndependentContinuant

DependentContinuant: InformationEntity

Occurrent: Action (creating a datum)

40

Generically Dependent Continuants

GenericallyDependentContinuant

Information Entity

Sequence

if one bearer ceases to exist, then the entity can survive, because there are other bearers (copyability)

the pdf file on this laptop

the DNA (sequence) in that chromosome
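
The contrast between generic dependence (copyability) and the dependence of a quality on its one bearer can be sketched in a few lines. This is an illustration under assumptions, not BFO or IAO code: the class names `Bearer`, `InformationEntity` and `Quality`, the `copy_to` method, and the backup server and weight examples are invented here.

```python
class Bearer:
    """An independent continuant that can carry copies, e.g. a laptop."""
    def __init__(self, name):
        self.name = name
        self.exists = True

class InformationEntity:
    """Generically dependent: needs some bearer, but no particular one."""
    def __init__(self, name, bearers):
        self.name = name
        self.bearers = list(bearers)

    def copy_to(self, bearer):
        self.bearers.append(bearer)

    def exists(self):
        # Survives as long as at least one bearer still exists.
        return any(b.exists for b in self.bearers)

class Quality:
    """Depends on exactly one bearer and cannot migrate to another."""
    def __init__(self, name, bearer):
        self.name = name
        self.bearer = bearer

    def exists(self):
        return self.bearer.exists

laptop, server = Bearer("this laptop"), Bearer("a backup server")
pdf = InformationEntity("the pdf file", [laptop])
weight = Quality("the weight of this laptop", laptop)

pdf.copy_to(server)       # the file now has a second bearer
laptop.exists = False     # the laptop is destroyed

print(pdf.exists(), weight.exists())
```

After the laptop is gone the file still exists on the other bearer, while the laptop's weight does not.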

41

Generically Dependent Continuants

GenericallyDependentContinuant

Information Artifact

Gene Sequence

.pdf file, .doc file

instances

42

IAO adopted, and being violently tested, inter alia, by:

Transcriptomics (MIAME Working Group)

Proteomics (Proteomics Standards Initiative)

Metabolomics (Metabolomics Standards Initiative)

Genomics and Metagenomics (Genomic Standards Consortium)

In Situ Hybridization and Immunohistochemistry (MISFISHIE Working Group)

Phylogenetics (Phylogenetics Community)

RNA Interference (RNAi Community)

Toxicogenomics (Toxicogenomics WG)

Environmental Genomics (Environmental Genomics WG)

Nutrigenomics (Nutrigenomics WG)

Flow Cytometry (Flow Cytometry Community)

43