why lexgrid?

75
The Lexical Grid

Upload: toya

Post on 16-Jan-2016

28 views

Category:

Documents


0 download

DESCRIPTION

Why LexGrid?. Existing medical records Various forms of coding and classification in use since the early 1500’s ‘Modern’ records from the 1960’s to present include various forms of codes Medical records are still on a “per-institution basis”. Why LexGrid?. Emerging medical records - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Why LexGrid?

The Lexical Grid

Page 2: Why LexGrid?

Why LexGrid?

Existing medical records• Various forms of coding and

classification in use since the early 1500’s

• ‘Modern’ records from the 1960’s to present include various forms of codes

• Medical records are still on a “per-institution basis”

Page 3: Why LexGrid?

Why LexGrid?

Emerging medical records

Multiple factors forcing new levels of interoperability

• Economic• Regulatory• Technical

Page 4: Why LexGrid?

Why LexGrid?

Bioinformatics• Large volumes of information

• Large cross sections• Detailed – what is important may not (and

cannot) be anticipated• Interoperability of

• Medical (Phenomics)• Genomics• Environmental• GeoSpatial

Page 5: Why LexGrid?

The GAP (In Western Medicine)“Terminologies”

Coding and Classification

“Ontologies”Computable DL Frameworks

ICD-9-CM

CPT-4

ICD-10-PCS

MESH

SNOMED-IIISNOMED CT

GO...

Many, many more to comeCountries

Languages

Mime Types

SNOP

FMA

ChEBI

MGED

GMOD

Page 6: Why LexGrid?

LexGridThe purpose behind LexGrid

Communication

Page 7: Why LexGrid?

Language and the Communication Process

• Language - a “specification” that enables communication

• Semantics - the association between signs or symbols and their intended “meaning”

• Syntax - the rules for ordering and structuring the signs into phrases and sentences

• Pragmatics - the relationship between signs and symbols and the recipient. Broadly, the shared context.

Page 8: Why LexGrid?

Ogden’s Semiotic Triangle

Thought or Reference

Referent Symbol

SymbolisesRefers to

Stands for

C.K Ogden and I. A. Richards. The Meaning of Meaning.

Page 9: Why LexGrid?

Ogden’s Semiotic Triangle

Thought or Reference

Referent Symbol

SymbolisesRefers to

Stands for

C.K Ogden and I. A. Richards. The Meaning of Meaning.

“Rose”, “ClipArt”

Page 10: Why LexGrid?

The Communication Process

CONCEPT

Referent

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

CONCEPT

Symbol Symbol

“I see a ClipArt image of a rose”

Page 11: Why LexGrid?

The Communication Process

CONCEPT

Referent

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

CONCEPT

Symbol Symbol

“I see a ClipArt image of a rose”

Semantics

Page 12: Why LexGrid?

The Communication Process

CONCEPT

Referent

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

CONCEPT

Symbol Symbol

“I see a ClipArt image of a rose”

Semantics

Syntax

Page 13: Why LexGrid?

The Communication Process

CONCEPT

Referent

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

CONCEPT

Symbol Symbol

“I see a ClipArt image of a rose”

Semantics

Syntax

Context Context

Shared Context

Page 14: Why LexGrid?

Shared Context

Impacts how much information can be contained in a symbol.

NoSharedContext

SharedUniverse

CommonLanguage

SharedSpecies

SharedPlanet

SharedSun

CommonCulture

SimilarEducation

CommonProfession

CommonSpecialty

Information / Symbol

Page 15: Why LexGrid?

Minimum Shared Context

Page 16: Why LexGrid?

The impact of context on communication

Shared context:• Allows information to be communicated in larger,

more succinct “chunks”.• Drug, analgesic and NSAID are all “chunks”,

yet differ markedly in conceptual complexity.• Enables specialized symbol sets:

• Contrast the amount of information contained in the formula E=MC2 versus that contained in this presentation...

Page 17: Why LexGrid?

Contextual Formalism

The degree of formality in a shared context can vary across a wide spectrum:

• Tacit context which is simply presumed• Contextual negotiation proceeding the

actual message• Rigorous and formal rules and documents

describing the form and possible meanings behind every message and phrase.

Page 18: Why LexGrid?

Factors Effecting the Degree Contextual Formalism

• Number of participating parties• Formalism needs to increase as number of

participants increase

• Geographic, cultural and temporal proximity of communicators

• The further apart communicators are, the less they can assume

• Amount of shared context• The more you have, the more important it

becomes to be organized

Page 19: Why LexGrid?

Factors Effecting the Degree Contextual Formalism

• The cost of imprecise communication• Poetry and literature - low cost (some may argue

actual gain)• Technical and professional - high to very high

cost• What is the cost of assuming the units of a

thrust specification?• What is the cost of assuming the dose of a

prescription?• What is the cost of assuming the century in

which the communication originated?

Page 20: Why LexGrid?

Common Forms of Contextual Formalism

• Dictionaries

• Thesauri

• Textbooks, college courses, etc.

• Operations manuals

• Data dictionaries

• Terminologies

Page 21: Why LexGrid?

The Communication Process

CONCEPT

Referent

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

CONCEPT

Symbol Symbol

“I see a ClipArt image of a rose”

Semantics

Syntax

Context Context

Shared Context

Page 22: Why LexGrid?

Making Shared Context Explicit

CONCEPT

Referent

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

CONCEPT

Symbol Symbol

“I see a ClipArt image of a rose”

Context Context

Formal SharedContext

Terminologies Terminologies

Page 23: Why LexGrid?

Shared Context Least Common Denominator

CONCEPT

Referent

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

CONCEPT

Symbol Symbol

“I see a ClipArt image of a rose”

Context Context

Reduce the Shared Context...

Terminologies Terminologies

Terminologies

“I see a ClipArt image of a red flower with ...”

... increase the symbolcomplexity

Context

Page 24: Why LexGrid?

Information vs. Symbol

CONCEPT

Referent

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

Refers ToSymbolises

Stands For“Rose”,“ClipArt”

CONCEPT

Symbol Symbol

“I see a ClipArt image of a rose”

Information

Symbol Symbol

Information – predicate w/ Range of True/False/..Symbol - predicate w/ Range of “Concept”

Page 25: Why LexGrid?

Ontologies serve (at least) two roles

Symbol - Definitional

• Concept -> Symbol

• Symbol -> Concept

• Symbol/Symbol translation

• Symbol validation, organization and mapping

• Are axioms – not verifiable

Information - Propositional

• Statements

• True/False/Unknown

• Convey information

• Are verifiable

Page 26: Why LexGrid?

Sample Description Logic

Symbol

A, B

C, D

R

>

?

: A

C u D

8 R.C

9 R.>

Interpretation

AI, BI

CI, DI

RI µ I x I

I

?

I n AI

CI Å DI

{a 2 I | 8 b.(a,b) 2 RI ! b 2 CI}

{a 2 I | 9 b.(a,b) 2 RI}

Page 27: Why LexGrid?

Interpretations

• An interpretation I satisfies an inclusion C v D if CI µ DI, and it satisfies an equality C ´ D if CI = DI.

• If T is a set of axioms, then I satisfies T iff I satisfies each element of T.

• If I satisfies an axiom (resp. a set of axioms) then we say that it is a model of this axiom (resp. set of axioms).

• Two axioms or two sets of axioms are equivalent if they have the same models.

Page 28: Why LexGrid?

Description Logic

Symbol

A, B

C, D

R

>

?

: A

C u D

8 R.C

9 R.>

Interpretation

AI, BI

CI, DI

RI µ I x I

I

?

I n AI

CI Å DI

{a 2 I | 8 b.(a,b) 2 RI ! b 2 CI}

{a 2 I | 9 b.(a,b) 2 RI}

Much study (DAML+OIL, OWL, CL, …)

But what of this????

Page 29: Why LexGrid?

Interpretation and OWL

OWL:AnnotationProperty

• … in OWL DL one cannot define subproperties or domain/range constraints for annotation properties

• Five annotation properties are predefined by OWL:owl:versionInfo

rdfs:labelrdfs:commentrdfs:seeAlsordfs:isDefinedBy

Page 30: Why LexGrid?

A Rose in OWL?

<owl:Class rdf:ID=“Rose”><rdfs:subClassOf rdf:resource=“#FloweringPlant”/><rdfs:subClassOf>

<owl:restriction>

<owl:onProperty rdf:resource=“#hasRisk”/>

<owl:someValuesFrom rdf:resource=“#Thorn/>

</owl:restriction>

</rdfs:subClassOf>

</owl:Class>

Page 31: Why LexGrid?

A Rose in OWL?

<owl:Class rdf:ID=“C”><rdfs:subClassOf rdf:resource=“#A”/><rdfs:subClassOf>

<owl:restriction>

<owl:onProperty rdf:resource=“#R”/>

<owl:someValuesFrom rdf:resource=“#D/>

</owl:restriction>

</rdfs:subClassOf>

</owl:Class>

Page 32: Why LexGrid?

The Communication Process

CONCEPT

Referent

Refers ToSymbolises

Stands For“rose”

Refers ToSymbolises

Stands For“floweringPlant”,“hasRisk”“thorn”

CONCEPT

Symbol

Context Context

{x 2 I | 9 thorn.(x, thorn) 2 hasRiskI Å x 2 floweringPlantI }

+

Page 33: Why LexGrid?

The Communication Process

CONCEPT

Referent

Refers ToSymbolises

Stands For“A101”

Refers ToSymbolises

Stands For“A102”,“R1”“A103”

CONCEPT

Symbol

Context Context

{x 2 I | 9 A102.(x, A102) 2 R1I Å x 2 A103^\cI }

+

A101I = “rose”

A101I = “flower”

A102I = “sharp spine”

R1I = “possible misfortune”

Page 34: Why LexGrid?

Definitions vs. Propositions

Is this:1. The thing that is defined as a

procedure that involves an excision of a structure of lobe of lung? (Axiom)

2. A statement saying “All procedures that involve an excision of the structure of lobe of lung are pulmonary lobectomy? (Falsifiable proposition)

Page 35: Why LexGrid?

LexGrid Focus

• Definitional Aspects of Ontologies

• Making sure that the information (axioms) that are the basis of propositions are accurate, complete and reproducible

• Making sure that resulting propositions are verifiable – that the terms that come out match the terms that go in

Page 36: Why LexGrid?

(Reference) Ontologies

Reference ontologies are not designed to be nice - they are designed to be big, boring and true.

Barry Smith

Page 37: Why LexGrid?

LexGrid Goal

1) Combine:• Lexical Semantics

• Names• (Textual) Definitions• Comments• Other non-classification property

• Context• Languages and dialects• Communities and specialties• Localizations

• Logical Semantics• Roles and Relations

Page 38: Why LexGrid?

LexGrid Goal

2) Use these to integrate, reason about and report:

• Existing data & codes• Special contexts• Need formalization

• New information• New screens• Metadata

Page 39: Why LexGrid?

The LexGrid Goal

Terminology as a commodity resource• Available whenever and wherever it

is needed• Online or downloadable• Push or pull update mechanism• Available 24x7

• Revised and updated in “real-time”• Cross-linked and indexed

Page 40: Why LexGrid?

LexGrid Three-Pronged Approach

LexGridModel

Page 41: Why LexGrid?

The Heart of the Lexical Grid

The LexGrid Model - a model of terminology that:1) Explicitly names and defines the

things that the LexGrid tools need to reference explicitly

2) Represents “non-semantic” entities as name/value pairs

Page 42: Why LexGrid?

Modeling Extremes

hyperNormalized hyperSpecified

Page 43: Why LexGrid?

hyperNormalized hyperSpecified

hyperNormalized Model

+ Incredibly flexible

- Doesn’t say a heck of a lot about a given domain.

• Specialization is possible• Entity: “Patient”• Attribute: “Name/String”• Relationship: hasName

• Many hyperNormalized models already existER1 / UML / SQL / …

Page 44: Why LexGrid?

hyperSpecified Model

Page 45: Why LexGrid?

hyperNormalized

hyperSpecified Model

hyperSpecified

+ Incredibly precise – you know exactly what you’ve got

- Unwieldy and inflexible

- Difficult to understand

Page 46: Why LexGrid?

hyperNormalized hyperSpecified

Modeling Pragmatics

• Make the differences that are important explicit

• Use terminology to carry the rest

Page 47: Why LexGrid?

The Heart of the Lexical Grid

The LexGrid Model - a formal model of terminology that:1) Explicitly names and defines the

entities and objects used in the LexGrid tooling

2) Supports as many “non-semantic” entities (from the toolkit perspective) as possible via. Name/value pairs

Page 48: Why LexGrid?

The LexGrid Model

Interpretation Computation

Page 49: Why LexGrid?

The LexGrid Model

Page 50: Why LexGrid?

(Short Rave)This is not a model of a concept!!!

It is a model of a symbol!!!

Page 51: Why LexGrid?

(Short Rave)

Thought or Reference

Referent Symbol

SymbolisesRefers to

Stands for

C.K Ogden and I. A. Richards. The Meaning of Meaning.

“Rose”, “ClipArt”

Concept

Symbol

Page 52: Why LexGrid?

Concept, Symbol and Meaning

Human Being

Human / SymbolInteraction

The focus ofLexGrid

cd Model

Concept

Symbol

Meaning

Page 53: Why LexGrid?

Concept vs. Symbol

A thing that is a flower and has thorns

Symbol

Symbolizes a conceptNOT a concept.

Page 54: Why LexGrid?

(short rave)

• Calling a symbol a concept in a model:• Confuses everyone• Makes a mess of the resulting model

• Everything is a concept• And (almost) everything is NOT in

anyone’s database

• Symbols, can be modeled, carried in databases, reasoned with, etc.

Page 55: Why LexGrid?

(end rave)

Page 56: Why LexGrid?

The LexGrid Model

• Source is currently maintained in XML Schema

• First incarnation was LDAP Schema

• (Semi) automatic transformations available to

• Unified Modeling Language (UML)• XML Model Interchange (XMI)• Eclipse Modeling Framework (EMF)• Java• LDAP Schema

Page 57: Why LexGrid?

The LexGrid Node

• A LexGrid Node is software and a backing data store that represents terminological information in a format semantically faithful to the LexGrid Model

LexGridNode

DataStore

Page 58: Why LexGrid?

FunctionalityVirtual Nodes

LexGridNode

DataStoreLexGridNode

DataStore

LexGridNode

DataStore

LexGridNode

DataStore

Mayo

Stanford

UCSF

NCI

Page 59: Why LexGrid?

FunctionalityVirtual Nodes

• Virtual Node Toolkit• Create and load a local node• Publish in web space• Node is treated as part of the larger

grid

Page 60: Why LexGrid?

FunctionalityVirtual Nodes – Cross Node Search

ICD-9

FMA

MeSH

Page 61: Why LexGrid?

FunctionalityReplication / Update

NCIReplica

DataStore

Mayo

NCIReplica

DataStore

Stanford

NCI

DataStore

NCI

Update

Subscribe

ChangeLog

ChangeLog

ChangeLog

“Push”“Pull”

Page 62: Why LexGrid?

FunctionalityIndices

NCI

DataStore

NCI

Update

IndexService

Subscribe

“Push”

ReasoningService

Subscribe

“Push”

Page 63: Why LexGrid?

FunctionalityCross References

NCI

DataStore

UMLS

DataStore

SemanticNET

DataStoreUMLS_CUI = URN:ISO:2.16.840.1.113883.6.56:C0002072

Semantic_Type = URN:ISO:2.16.840.1.113883.6.56.1:T123

T123 – “Biologically Active Substance”

ConceptCode: C222 entityDescription: Alkylsulfonate Compound Semantic_Type: SemNet:T123 UMLS_CUI: C0002072

C0002702 – “Alkanesufonates”

Page 64: Why LexGrid?

FunctionalityNode Directory

Page 65: Why LexGrid?

LexGrid Components

LexGridNode

DataStore

Services

WebClients

Java

.NET

...

Import

Editors

Browsers

Query Tools

OWL

RDFXML

CSV

OWLBrowse and

Edit

Export

Embed

...

OBO

UMLSSKOS

Protege (custom)

Protege

Page 66: Why LexGrid?

LexGrid Components

LexGridNode

DataStore

Services

WebClients

Java

.NET

...

Editors

Browsers

Query Tools

OWL

RDFXML

CSV

Terminology

...

MMFIODM…

20944

XMDRRDF DB’s

SPARQLProlog..

SwoopProtégéDagEditXMDRp

SKOSOWLUMLS…

MDA

Page 67: Why LexGrid?

LexGrid and Metadata

Page 68: Why LexGrid?

Different Data Forms, Same Information

PT# Observation

1110112 F

PT# Tag Value

1110112 Gender Female

PT# Female

1110112 TRUE

PT#

1110112

Table 17: Female Patients

Table Name

Tag/Value Pair

Column Heading

A code in a table

Database Names

Free text

PT# observation

1110112 “…a middle-aged woman…”

Female ResearchClinic

Page 69: Why LexGrid?

Different Vocabulary Same Information

Code Designation

F Female

Code Designation

123.17 Male or Female Adult

Code DesignationAA 17-44 Year Old Female with

no signs of head injury

Code Designation

A17 XX

A13 XX Mosaic

Desired Granularity

Too Coarse

Coupled With OtherInformation

Too Fine

Page 70: Why LexGrid?

Terminology and the Information Model

Information Model

Terminology

?

Page 71: Why LexGrid?

Terminology and the Information Model

Information Model

Terminology

Terminology and structure must be coordinated to achieve consistency and an integrated whole in HL7 standards.

Page 72: Why LexGrid?

Active Application Work

• SNOMED CMWG

• HL7 Terminfo

Page 73: Why LexGrid?

LexGrid Collaborations

• NCI• LexBIG – LexGrid for caGRID

• National Center for Biomedical Ontology• LexBIO – LexGrid for NCBO

• Health Level Seven (HL7)• Tooling

• National Library of Medicine

• ISO JTC1/SC32 (NCITS-L8) - XMDR

Page 74: Why LexGrid?

Acnowledgements

This work was supported in part by a grant from the US National Library of Medicine: LM07319.

Page 75: Why LexGrid?