
Biomedical Informatics

The Lexical Grid Project: LexGrid

Christopher G. Chute, MD DrPH
Professor and Chair, Biomedical Informatics

Mayo Clinic College of Medicine
Rochester, Minnesota

Ontolog Forum
14 December 2006


Acknowledgements:
Harold Solbrig
James Buntrock
Thomas Johnson
Dan Armbrust


© 2006 Mayo Clinic College of Medicine 3


Outline - LexGrid

• Overview
• Functional Features
• Problem Framing
• LexGrid History
• Present Status
• Implementations
• Future


Overview

• The LexGrid package represents a comprehensive set of software and services to load, publish, and access vocabulary or ontological resources.

• The package is based upon an open standard
  • HL7 CTS (CTS II intended as more complete)

• Reference implementations as open source
  • http://informatics.mayo.edu
  • Migration to OHF


LexGrid Interlocking Components

• Standards - access methods (programming APIs) and formats need to be published and openly available.

• Tools - standards based tools must be readily available.

• Content - commonly used vocabularies and ontologies have to be available for access and download.


The Lexical Grid

• Terminology as a commodity resource

• Accessible online
• under a common model
• through a set of common APIs
• in web-space on web-time

• cross-linked
• loosely coupled
• published individually, when ready

• exportable
• locally extendable
• globally revised
• open source tooling to browse, edit, etc.


Overview purposes

• Provides a single information model flexible enough to represent yesterday’s, today’s and tomorrow’s terminological or ontological resources

• Allows resources to be published online, cross-linked, and indexed on demand

• Provides standardized building blocks and tools that allow applications and users to take advantage of the content where and when it is needed

• Provides the consistency and standardization required to support large-scale vocabulary adoption and use


LexGrid Features

• Accommodation of multiple vocabulary and ontology distribution formats.
• Support of multiple data stores to accommodate federated vocabulary distribution.
• Consistent and standardized access across multiple vocabularies.
• Rich API for supporting lexical and graph search and traversal.
• Fully compatible with HL7-CTS implementation.
• Support for programmatic access via Java, .NET, and web services.
• Open source tooling and code to facilitate adoption and use.


LexGrid Users

• Vocabulary service providers. Describes organizations currently supporting externalized API-level interfaces to vocabulary content.

• Vocabulary integrators. Describes organizations that desire to integrate new vocabulary content or relations to be served locally.

• Vocabulary users. Describes persons and organizations desiring common, consistent access to vocabulary content to support multiple application development uses.


LexGrid Conceptual Architecture Components

[Figure: architecture diagram. Labels: LexGrid Node (Data, Services, Index); Import; Export; Browse and Edit; Embed; Editors; Browsers; Query Tools; Web Clients; Java; .NET; CTS; LexBIG; Lex*; LexGrid Service Index; Registry. Formats shown: RRF, OBO, Text, Protégé, OWL, XML.]


LexGrid Node

• The logical persistence layer for storing and managing vocabulary content.

• The LexGrid node utilizes relational database management systems for management of data and indexing functions.

• LexGrid nodes have been successfully installed and tested using MySQL, Postgres, UDB/DB2, Oracle, Hypersonic, and LDAP/BDB.


The Import Toolkit(s)

• Provides an API and a set of administration tools to load, index, publish, and manage vocabulary content for the vocabulary server.

• Standard formats and models that have been developed include:

• Rich Release Format (RRF)
• Web Ontology Language (OWL)
• LexGrid XML
• Text Delimited
• Ontylog XML (Apelon) format
• Open Biomedical Ontologies (OBO)
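
As a rough illustration of what a loader for a text-delimited source might do, the Python sketch below parses tab-separated lines into an in-memory mapping of concept codes to names. The column layout (code, then name), the function name `load_delimited`, and the sample codes are assumptions made for this example; they are not the actual LexGrid text format or the Import Toolkit API.

```python
import csv
import io


def load_delimited(text, delimiter="\t"):
    """Parse 'code<TAB>name' lines into a dict keyed by concept code.

    Hypothetical layout for illustration only. Blank lines and lines
    starting with '#' are skipped; duplicate codes are rejected.
    """
    concepts = {}
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue
        if len(row) < 2:
            raise ValueError(f"expected code and name, got {row!r}")
        code, name = row[0].strip(), row[1].strip()
        if code in concepts:
            raise ValueError(f"duplicate concept code {code!r}")
        concepts[code] = name
    return concepts


# Made-up sample content, not taken from any real vocabulary.
sample = "# code\tname\nC1\tLung\nC2\tNose\n"
loaded = load_delimited(sample)
```

Rejecting duplicate codes at load time reflects the model rule that a concept is unique within the code system that defines it.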


The Export Toolkit(s)

• Provides an API and set of administration tools to export content in a standard format from a LexGrid node.

• Standard formats provided for export include:
  • LexGrid XML
  • OWL
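
To make the idea of exporting content in a standard format concrete, here is a minimal Python sketch that serializes a small set of concepts to XML. The element and attribute names (`codingScheme`, `concepts`, `concept`, `code`) and the function `export_scheme` are placeholders chosen for this example; they do not reproduce the LexGrid XML schema or the Export Toolkit API.

```python
import xml.etree.ElementTree as ET


def export_scheme(name, concepts):
    """Serialize a {code: name} mapping to a small XML document.

    Element names are illustrative placeholders, not the LexGrid schema.
    Codes are written in sorted order so the output is reproducible.
    """
    root = ET.Element("codingScheme", {"name": name})
    container = ET.SubElement(root, "concepts")
    for code in sorted(concepts):
        entry = ET.SubElement(container, "concept", {"code": code})
        entry.text = concepts[code]
    return ET.tostring(root, encoding="unicode")


# Made-up sample content.
xml_text = export_scheme("demo", {"C2": "Nose", "C1": "Lung"})
```

Parsing `xml_text` back with `ET.fromstring` recovers the same codes and names, which is the basic round-trip property an export format needs.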


The LexGrid Editor

• A lightweight editor for creating and modifying vocabulary content.

• The LexGrid Editor is an Eclipse-based application that supports multi-vocabulary query and browsing, interactive views, and logging and auditing.

• Recent enhancements have provided extensions to accommodate value set creation and management.


LexGrid Principles

• LexGrid software is based on a model-driven architecture.
  • The LexGrid model is maintained in XML Schema format
  • Represents a core component of design.

• The LexBIG API
  • Java-based API to LexGrid content is formally modeled
  • Accommodates registration of additional load, index, and search functions
  • Provides a conscious separation of service and data classes in order to support deferred query resolution and software iterators
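
The separation of service and data classes for deferred query resolution can be pictured with a small Python sketch. It is not the LexBIG API (which is Java-based); the class `ConceptQuery` and its methods `restrict_to_text` and `resolve` are hypothetical names invented here. The query object only records restrictions, and data is read through an iterator when the caller asks for results.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptQuery:
    """Service-side object: records restrictions, reads no data yet."""
    source: dict                                  # data side: code -> name
    filters: list = field(default_factory=list)   # accumulated predicates

    def restrict_to_text(self, fragment):
        # Return a new query so earlier queries remain reusable.
        needle = fragment.lower()
        predicate = lambda code, name: needle in name.lower()
        return ConceptQuery(self.source, self.filters + [predicate])

    def resolve(self):
        """Evaluate lazily: each match is produced only when requested."""
        for code, name in self.source.items():
            if all(f(code, name) for f in self.filters):
                yield code, name


# Made-up sample content.
data = {"C1": "Lung", "C2": "Lung disease", "C3": "Nose"}
query = ConceptQuery(data).restrict_to_text("lung")  # nothing scanned yet
first = next(query.resolve())                        # reads only until first match
```

Because `resolve` is a generator, a client that needs only the first few matches never forces the full result set to be built.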


LexGrid Model

• Lexical Semantics
  • Names
  • (Textual) Definitions
  • Comments
  • Other non-classification property

• Context
  • Languages and dialects
  • Communities and specialties
  • Localizations

• Logical Semantics
  • Roles and Relations


LexGrid Model

• Proposal for standard storage of controlled vocabularies and ontologies
• Flexible enough to accurately represent a wide variety of vocabularies and other lexically-based resources

• Defines
  • How vocabularies should be formatted and represented programmatically
  • Several different server storage mechanisms
    • relational database, LDAP and an XML format.


LexGrid Model

[Figure: UML class diagram of the LexGrid model (cd codingSchemes). Panels: Coding Scheme, Concepts, Relations, Properties. Class labels: codingScheme; concepts::concepts; concepts::codedEntry; concepts::property; concepts::comment; concepts::definition; concepts::presentation; relations::relations; relations::association; relations::associationInstance; relations::associationTarget. Stereotype labels: describable; versionableAndDescribable; associatableElement. Role multiplicities: +concepts 0..1; +relations 0..*; +association 1..*; +sourceConcept 0..*; +targetConcept 0..*; +concept 1..*; +property 0..*]


Model: Code Systems

• Each service defined to the LexGrid model can encapsulate the definition of one or more vocabularies.

• Each vocabulary is modeled as an individual code system, known as a codingScheme.

• Each scheme tracks information used to uniquely identify the code system, along with relevant metadata.

• The collection of all code systems defined to a service is encapsulated by a single codingSchemes container.
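
The container relationship described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the LexGrid schema: the field names `registered_id` and `version`, the `urn:example:` identifiers, and the `add`/`get` methods are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CodingScheme:
    name: str             # local name of the code system
    registered_id: str    # hypothetical unique identifier
    version: str = ""     # example of scheme-level metadata


@dataclass
class CodingSchemes:
    """Single container for every code system defined to one service."""
    schemes: dict = field(default_factory=dict)

    def add(self, scheme):
        # The identifier must distinguish one code system from another.
        if scheme.registered_id in self.schemes:
            raise ValueError(f"{scheme.registered_id} already defined")
        self.schemes[scheme.registered_id] = scheme

    def get(self, registered_id):
        return self.schemes[registered_id]


# Made-up schemes for illustration.
service = CodingSchemes()
service.add(CodingScheme("Demo Anatomy", "urn:example:anatomy", "1.0"))
service.add(CodingScheme("Demo Disease", "urn:example:disease"))
```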


Model: Concepts

• A code system may define zero or more coded concepts, encapsulated within a single container.
• A concept represents a coded entity (identified in the model as a codedEntry) within a particular domain of discourse.

• Each concept is unique within the code system that defines it.

• Each concept must be qualified by at least one term or designation, represented in the model as a property.
  • Each property is an attribute, facet, or some other characteristic that may represent or help define the intended meaning of the encapsulating codedEntry.

• A concept may be the source for and/or the target of zero or more relationships.
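
The two constraints stated for concepts (at least one property, and uniqueness within the defining code system) can be expressed as simple checks. The Python below is a sketch under assumptions: the class names echo labels from the model (codedEntry, property, concepts), but the fields, the `kind` values, and the `add` method are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    kind: str    # e.g. "presentation", "definition", "comment"
    text: str


@dataclass
class CodedEntry:
    code: str
    properties: list

    def __post_init__(self):
        # A concept must be qualified by at least one term or designation.
        if not self.properties:
            raise ValueError(f"concept {self.code!r} has no property")


@dataclass
class Concepts:
    """Container of coded entries for one code system."""
    entries: dict = field(default_factory=dict)

    def add(self, entry):
        # Each concept is unique within its code system.
        if entry.code in self.entries:
            raise ValueError(f"concept {entry.code!r} already defined")
        self.entries[entry.code] = entry


# Made-up sample content.
concepts = Concepts()
concepts.add(CodedEntry("C1", [Property("presentation", "Lung"),
                               Property("definition", "Organ of respiration")]))
```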


Model: Relations

• Each code system may define one or more containers to encapsulate relationships between concepts.
• Each named relationship (e.g. “hasSubtype” or “hasPart”) is represented as an association within the LexGrid model.
• Each relations container must define one or more associations.
• Each association may also further define the nature of the relationship in terms of transitivity, symmetry, reflexivity, forward and inverse names, etc.
• Multiple instances of each association can be defined, each of which provides a directed relationship between one source and one or more target concepts.

• Source and target concepts may be contained in the same code system as the association or another if explicitly identified.

• By default, all source and target concepts are resolved from the code system defining the association.

• The code system can be overridden by each specific association, relation source (associationInstance), or relation target (associationTarget).
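
A small Python sketch may help show how a directed association with a transitivity flag could be traversed. Everything here is an assumption for illustration: the `Association` class, its `add` and `targets` methods, and the sample hierarchy are invented, and cross-scheme overrides are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Association:
    name: str
    transitive: bool = False
    # source code -> set of target codes (directed instances)
    instances: dict = field(default_factory=dict)

    def add(self, source, *targets):
        """Record one source related to one or more targets."""
        self.instances.setdefault(source, set()).update(targets)

    def targets(self, source):
        """Direct targets, plus indirect ones if the association is transitive."""
        seen = set()
        stack = list(self.instances.get(source, ()))
        while stack:
            code = stack.pop()
            if code in seen:
                continue          # guards against cycles
            seen.add(code)
            if self.transitive:
                stack.extend(self.instances.get(code, ()))
        return seen


# Made-up hierarchy for illustration.
has_subtype = Association("hasSubtype", transitive=True)
has_subtype.add("Disease", "Pulmonary Disease", "Nasal Disease")
has_subtype.add("Pulmonary Disease", "asthma", "pneumonia")

has_part = Association("hasPart")   # treated as non-transitive here
has_part.add("Airway", "Nose")
has_part.add("Nose", "Nostril")
```

With the flag set, `has_subtype.targets("Disease")` follows the chain down to asthma and pneumonia; without it, `has_part.targets("Airway")` stops at the direct target.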


Available Representations of the LexGrid Model

• The master representation of the LexGrid model is provided in XML Schema Definition (XSD) format.

• Conversions to other formal representations are available, including XML Metadata Interchange (XMI) and Unified Modeling Language (UML).

• Implementation or technology-specific renderings of the model also exist.
  • Relational database schema (MySQL, PostgreSQL, DB2, Oracle, etc.)
  • Lightweight Directory Access Protocol (LDAP) schema
  • Programming interfaces generated from the formal representation include Java bean interfaces based on the Eclipse Modeling Framework (EMF) and Castor frameworks.


Disease Understanding Constrained by Knowledge

• Carolus Linnaeus (Carl von Linné)
  • Genera Morborum (1763)

• Underscored Content Difficulty
  • Pathophysiology vs Manifestation, e.g. Rabies as psychiatric disease


The Genomic Era

• The genomic transformation of medicine far exceeds the introduction of antibiotics and aseptic surgery

• The binding of genomic biology and clinical medicine will accelerate

• The implications for shared semantics across the basic science and clinical communities are unprecedented

• The implications for Public Health surveillance and inference are profound


From Practice-based Evidence to Evidence-based Practice

[Figure: diagram. Labels: Patient Encounters; Clinical Databases, Registries et al.; Clinical Guidelines; Medical Knowledge; Expert Systems; Data; Inference; Knowledge Management; Decision support; Shared Semantics; Ontology; Vocabularies & Terminologies]


The Historical Center of the Health Data Universe

[Figure labels: Clinical Data; Billable Diagnoses]


Copernican Health Data Universe

[Figure labels: Billable Diagnoses; Clinical Data; Guidelines; Scientific Literature; Medical Literature; Genomic Characteristics. Caption: (Niklas Koppernigk)]


Continuum from Nomenclature to Classification

• Patient Data is Highly Detailed
  • Modifiers: Anatomy, Stage, Severity, Extent
  • Qualifiers: Probability, Temporal Status

• Aggregate Uses Require Categorization
  • Granularity of Classifiers
    • Focused Groups and Strata for CQI/Outcomes
    • Broad Statistical/Fiscal Groups


Familiar Points Along Continuum: Modern Health Vocabularies

• Nomenclature – Highly Detailed Descriptions (SNOMED)

• Classification – Organized Aggregation of Descriptions into a Rubric (ICDs)

• Groupings – High Level Categories of Rubrics (DRGs)

[Figure: continuum from Detailed to Grouped, marked Nomenclature, Classification, Groups]


Blois, 1988
Medicine and the nature of vertical reasoning

• Molecular: receptors, enzymes, vitamins, drugs
• Genes, SNPs, gene regulation
• Physiologic pathways, regulatory changes
• Cellular metabolism, interaction, meiosis,…
• Tissue function, integrity
• Organ function, pathology
• Organism (Human), disease
• Sociology, environment, nutrition, mental health…


The Continuum of Biomedical Informatics
Bioinformatics meets Medical Informatics

[Figure: chart with axis ticks 0 to 10, spanning Biology to Medicine, annotated "Chasm of Semantic Despair"]


Feudal Cognition
Intellectual Semantic Baronies

• Genetic variation – Genomics
• Haplotypes – Statistical Genomics
• Molecular – Metabolomics, Proteomics
• Binding – Molecular simulation
• Pathways – Physiology and Systems Biology
• Symptoms – Consumer Health
• Rx and Px – Clinical Medicine
• Risk – Public Health, Epidemiology
• Social impact – Sociology, Health Economics


[Figure: concept graph. Axis labels: Molecular; Clinical; Fine Detail; Highly Aggregated. Node labels: Immunology; Disease; Anatomy; Pulmonary Disease; asthma; Lung; Nose; pneumonia; Nasal Disease; allergic rhinitis; Airway; Nucleotide; Molecule; Amino Acid Sequence; Protein; Enzyme; Amino Acid; TPMT; HNMT; Thr105Ile; allozyme; Lysine; Immunoglobulin; IgE; Peptide; "??". Edge label: has translation]


Aggregation Logics by domain
rule-based aggregations

[Figure labels: Decision Support and Error Detection; Public Health and Surveillance; Reimbursement and Management; Outcome Research and Epidemiology; Findings; Interventions; Events]


Making Shared Context Explicit

[Figure: two semiotic triangles, each with its own Context. Labels: CONCEPT; Symbol; Referent; "Refers To"; "Symbolises"; "Stands For"; “Rose”, “ClipArt”; example utterance “I see a ClipArt image of a rose”; Formal Shared Context; Terminologies]

[From Solbrig]


Proliferation of Content
“Have it your way” Vocabulary Models

• Major ontologies
  • SNOMED CT; Gene Ontology; LOINC; NDF-RT
  • UMLS Metathesaurus; NCI Thesaurus
  • HL7 RIM and Vocabulary; DICOM RadLex
  • CDC bioterrorism PHIN standards
  • caBIG DSR / CDEs (Common Data Elements)

• All created with differing formats and models
• Mechanisms for content sharing
  • Research Area


History of Terminology Services in the US

• YATN: yet another terminology service (1996)
  • Mayo, Kaiser, Lexical Technology

• MetaPhrase – Lexical Technology (1998)
• LQS: Lexicon Query Services; 3M (1998)
• Mayo Autocoder: UI to YATN suite (2000)
• CTS: Common Terminology Services (2003)
  • HL7 balloted standard (2004)
• LexGrid: superset of CTS, reference implementation (2004)
  • http://informatics.mayo.edu


Mayo’s Work with Problem List Interface Design

• Premised upon a Terminology Server
  • MetaPhrase Prototypes on the Network

• Iterative Usability Lab Evaluations
  • Mock-ups in VB, Delphi, Java, …

• Evolve Toward Subset of Functional Needs
  • Problem List Specific
  • Drive Specification and Operation of Terminology Server


Terminology Services for Humans


Common Terminology Services (CTS)

• An HL7 ANSI standard
  • Defines the minimum set of requirements for interoperability across disparate healthcare applications

• A specification for accessing terminology content
  • The CTS identifies the minimum set of functional characteristics a terminology resource must possess for use in HL7.

• A functional model
  • Defining the functional characteristics of vocabulary as a set of Application Programming Interfaces (APIs)


CTS APIs

• Define the necessary functions for healthcare terminology
• Decouple terminology from the terminology service.
• Technology independent
  • Legacy database
  • Institutional infrastructure

• Provide common interface and reference model
  • I know what you mean by
    • Code System
    • Coded Concept
    • Relationship
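
The decoupling idea can be sketched as an abstract interface with an interchangeable backend. The Python below is only an illustration: the class `TerminologyService` and its methods `lookup_designation` and `is_related` are hypothetical names, not operations taken from the CTS specification, and the sample content is made up.

```python
from abc import ABC, abstractmethod


class TerminologyService(ABC):
    """Hypothetical common interface; method names are not from CTS."""

    @abstractmethod
    def lookup_designation(self, code_system, code):
        """Return the display text for a coded concept."""

    @abstractmethod
    def is_related(self, code_system, source, relationship, target):
        """Report whether source has the named relationship to target."""


class InMemoryService(TerminologyService):
    """One possible backend; a legacy database could sit here instead."""

    def __init__(self, designations, relations):
        self._designations = designations  # (system, code) -> text
        self._relations = relations        # {(system, source, rel, target)}

    def lookup_designation(self, code_system, code):
        return self._designations[(code_system, code)]

    def is_related(self, code_system, source, relationship, target):
        return (code_system, source, relationship, target) in self._relations


def describe(service, code_system, code):
    """Client code depends only on the interface, not on the storage."""
    return f"{code}: {service.lookup_designation(code_system, code)}"


# Made-up sample content.
backend = InMemoryService(
    {("demo", "C1"): "Lung", ("demo", "C2"): "Airway"},
    {("demo", "C2", "hasPart", "C1")},
)
label = describe(backend, "demo", "C1")
```

Swapping `InMemoryService` for another subclass leaves `describe` unchanged, which is the practical meaning of technology independence here.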


Mayo LexGrid Project
Ontology Services

• HL7 ANSI Standard
• ISO Standard
• Open specification
• Provide consistency and standardization required to support large-scale vocabulary adoption and use
• Common model, tools, formats, and interfaces
• Standard terminology model (Excel to OWL)
• Grid-nodal architecture
• http://informatics.mayo.edu


Examples and Proof of Concept

• NIH RoadMap: National Center for Biomedical Ontology
  • Mayo LexGrid project [MLG]
  • Clinical and basic science (Gene Ontology) communities

• NCI caBIG – Bioinformatics Grid [MLG]
• HHS/ONC NHIN Nationwide Health Information Network
  • IBM Data Coordination project
  • NLM/HL7 Coordination project [MLG]

• CDC PHIN Public Health Information Network [MLG]
• W3C Semantic Web
  • XML/RDF/OWL
• ISO 11179 metadata standards [MLG]


LexGrid Applications at Mayo for Semantic Annotation and Integration

• Basis for NLP (Natural Language Processing) entity annotation – clinical notes

• Harmonize data elements, value sets
  • Getting the data right

• Information retrieval and navigation
  • Getting the right data

• Grounding for data governance
• Foundation for semantic interoperability


Cancer Biomedical Informatics Grid (caBIG)

• Coordinated infrastructure for Cancer Research
• Clinical Trials, Integrative Cancer Research, Tissue Banking and Pathology Tools
• Vocabulary, Common Data Elements, Architecture


LexBIG Vision

[Figure: caBIG Grid connecting caBIG Nodes. Labels: LexGrid CTS Server; Importer; Local Replica; (Partial) Online Replica; NCI Thesaurus; NCI Meta-Thesaurus; Other Vocabulary; NCI; Import Kit]


LexPHIN
CDC Public Health Information Network

• Adoption of the LexGrid Model
  • Replace PHIN Vocabulary Services (VS)
• Addresses genomic characterization of disease
  • Span semantic chasm with Gene Ontology
• Organized Value Sets
  • Outbreak Management System
  • Biosurveillance and BioSense aggregation


LexPHIN Model

[Figure: UML class diagram of the LexPHIN model. Panels: Concepts; Value Domains; Coding Scheme; Relations; Versions. Class labels: service::service; codingSchemes::codingSchemes; codingSchemes::codingScheme; valueDomains::valueDomains; valueDomains::valueDomain; concepts::concepts; relations::relations; versions::history. Stereotype labels: describable; versionableAndDescribable. Role multiplicities: +history 0..1; +concepts 0..1; +valueDomain 1..*; +codingScheme 1..*; +relations 0..*; +valueDomains 0..1; +codingSchemes 0..1]


Health Level Seven (HL7)

• Vocabulary and value domain management
• Tooling for vocabulary submissions
• Includes change events for HL7 governance process


HL7 Value Domain Editor


NCBO – A Bridge Across the Chasm


NCBO Tools


Ontology List


Ontology Counts

Total Number of Ontologies    52
NCBO Library                  45
Remote                        7
Number of Classes             175296*

*ontologies which have been parsed and indexed


Ontologies by Category


Expanded Categories


GO Biological Process Metadata


Concept Search


Search Results


MeSH Results


MeSH Hindlimb


BioPortal
Stanford University
Archana Vembakam and Lynn Murphy


LexGrid Future Issues

• Federated vocabulary node synchronization and registration/discovery.
• API extensions to support local vocabulary extensions and provider suggestions.
• API extensions to support HL7/CTSII API (currently being defined).
• API extensions to support submission of vocabulary change requests.
• API extensions to load and map between additional vocabulary formats.
• ISO 11179 and LexGrid integration
• Provide additional index services
• Synonymy and normalized search
• Reasoner or classifier adaptation
• Automated coding of medical records
• Provide a light-weight Representational State Transfer (REST) service implementation.
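To make the "synonymy and normalized search" item above concrete, here is a minimal Python sketch of what such an index service might do. It is an illustration only, not LexGrid code: the `normalize` function, the `NormalizedIndex` class, and the concept code `X001` are all hypothetical. The normalization chosen here (case folding, accent and punctuation stripping, word sorting) is one simple option among many.

```python
import re
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def normalize(term: str) -> str:
    """Lowercase, drop accents and punctuation, and sort the words."""
    decomposed = unicodedata.normalize("NFKD", term)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = re.findall(r"[a-z0-9]+", unaccented.lower())
    return " ".join(sorted(words))


class NormalizedIndex:
    """Maps normalized designations (preferred terms and synonyms) to concept codes."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[str]] = defaultdict(set)

    def add(self, code: str, designations: Iterable[str]) -> None:
        # Every synonym is indexed under its normalized form.
        for designation in designations:
            self._index[normalize(designation)].add(code)

    def search(self, query: str) -> List[str]:
        # Exact match on the normalized form; no stemming or fuzzy matching.
        return sorted(self._index.get(normalize(query), set()))


index = NormalizedIndex()
index.add("X001", ["Hindlimb", "Hind limb", "Limb, Hind"])  # hypothetical code
print(index.search("hind LIMB"))
```

Because word order and punctuation are discarded, "Limb, Hind" and "hind LIMB" resolve to the same key, while a query that matches no stored designation returns an empty list.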


Conclusion

• Biomedicine concepts have become complex and intertwined
• Big science model of future research
• 21st Century Medicine will require comparable and consistent data (Clinical and Genomic)
• Ontologies as formal models of concepts provide great opportunity
• Tools, content, and resources are becoming increasingly available
• LexGrid is emerging as an integrating force


Resources

LexGrid Project: http://informatics.mayo.edu/LexGrid

LexBIG Forge Site: http://gforge.nci.nih.gov/projects/lexbig

caBIG LexGrid CVS: http://cabigcvs.nci.nih.gov/viewcvs.cgi/lexgrid

NCBO Project: http://www.bioontology.org