Shared Representation and Community Participation in Standards (including vocabularies) for Human and Machine Interoperability

A Cooperative Effort Between the NCBO, NCRI and NCI CBIIT

Stuart Turner, DVM, MS
Leafpath Informatics
[email protected]

Architecture/VCDE Joint Face-to-Face

Friday 4 June 2010 | St. Louis, Missouri


Participants

- Mark Musen
- Natasha Noy
- Trish Whetzel

National (U.S.) Center for Biomedical Ontology (NCBO)

bioontology.org


Participants

- Alan Hogg
- Stuart Bell

National (U.K.) Cancer Research Institute (NCRI)

Oncology Information Exchange (ONIX) & the Cancer InfoMatrix (CIM)

ncri-onix.org.uk


Participants

- Brian Davis
- Sherri De Coronado
- Robert Freimuth
- Richard Kiefer
- Hua Min
- Michael Riben
- Harold Solbrig
- Grace Stafford
- Larry Wright

National (U.S.) Cancer Institute (NCI)

Center for Biomedical Informatics and Information Technology (CBIIT)


Standards Representation Group

Objectives

1. Identify common features in a shared profile for ontologies for discovery, understanding, aggregation, annotation, rating and evaluation.

2. Avoid or limit the creation of something new.

3. Take a phased and pragmatic approach: vocabularies first, then standards, specifications, projects, people, artifacts, etc.

4. Identify methods for discourse, rating, clustering, certifying, etc.

5. Identify methods for federation and synchronization.

• Satisfy needs for certification, engaging a diverse community, broader entity representation and ultimately a “web of trust”.

1. White paper

2. An implementation


Standards Representation Group

Approach

1. Gather requirements from the three organizations.

2. Identify common features-of-interest as well as those where there may be discordance but which are of high value to an individual organization.

3. Identify existing active metadata models, including those that overlap and are complementary.

• Identify methods for discourse, rating, clustering, etc.

• Identify methods

Where

https://wiki.nci.nih.gov/x/mkNyAQ


NCBO

About*

1. One of three National Centers for Biomedical Computing launched by NIH in 2005

2. Collaboration of Stanford, Mayo, Buffalo, Washington University, Johns Hopkins, and the Medical College of Wisconsin

3. Primary goal is to make ontologies accessible and usable

4. Research will develop technologies for ontology dissemination, use, indexing, alignment, and peer review

Key Activities*

1. Creates and maintains a library of biomedical ontologies.

2. Builds tools and Web services to enable the use of ontologies.

3. Collaborates with scientific communities that develop and use ontologies.

*Adapted from Mark Musen’s presentation to VCDE, 2009


NCBO

Biomedical Resource Ontology

*Adapted from Mark Musen’s presentation to VCDE, 2009


NCBO

Notes in BioPortal

*Adapted from Mark Musen’s presentation to VCDE, 2009


NCRI - ONIX

About

Partnership of more than 20 organizations

Goals: Promote data sharing, describing relevant standards, forming alliances

Projects: Cancer InfoMatrix


NCRI - Cancer InfoMatrix

Illustration: A view of the Cancer InfoMatrix showing Ontologies matched to Clinical and highlighting CTCAE (Common Terminology Criteria for Adverse Events)


NCRI - Cancer InfoMatrix

Illustration: A view of the Cancer InfoMatrix showing Ontologies matched to Clinical and showing details (metadata) for CTCAE (Common Terminology Criteria for Adverse Events)


caBIG Vocabulary Reviews

Evolving class of certified vocabularies

Certification is by formal consensus review (modified Delphi) using approximately 105 evaluation criteria grouped into four categories (structure, content, documentation, and editorial/governance)

Evaluation criteria are based on best practices derived principally from the healthcare community, including Jim Cimino’s Desiderata

Two principal outcomes

1. Environment-agnostic benchmark of the merits of a vocabulary

2. The benchmark in turn is used as a certification vehicle, yielding a more specific measure of how fit for purpose a vocabulary is within the caBIG enterprise

Reviews performed since 2005: NCI Thesaurus, Gene Ontology, CTCAE v3.0, LOINC, SNOMED CT, RadLex, Nanoparticle Ontology, MedDRA, CTCAE v4.0, ICD-9-CM (pending), ICD-10 (pending)

Ref: Cimino, J.J., et al., The caBIG terminology review process. J Biomed Inform, 2008.


caBIG Vocabulary Reviews

Process continues to evolve and be refined

Example: Discrete literature (peer and grey) review

Augments the normative documentation and communication with a vocabulary representative

Attenuates any inherent gaps in knowledge or understanding and bias

Criteria statements addressed: “is the terminology evolving to maintain domain coverage?”, “is there a process for review by independent experts from the field in which the terminology will be used?”, and “is there nothing controversial about the terminology that should be considered?”

Illustration: A view of a vocabulary review subprocess showing discrete activities for literature, tooling, regulatory and use case reviews.


Vocabulary Reviews: Challenges

1. Resolving a certification classification scheme that fairly and consistently abstracts the recommended usage of a vocabulary in caBIG

- Common issues to date: narrative definitions that are absent, incomplete or inadequate (e.g. SNOMED CT) and limited governance

- Expecting any vocabulary to survive all criteria unscathed is a tall order

- Pass/Fail doesn’t work (insufficient and often inappropriate)

- A fully certified, partially certified, uncertified scheme is more approachable

- Partially certified requires qualifying guidance statements (e.g. “for use in value domains only”)

2. Questionable utility of reviews

- Monolithic reports

- Too terminology-centric

- Insufficient perspectives for different users

- “At-a-glance” vs. “in-depth”

- Rapidly obsolete

- Time-consuming, costly, not updated

- Limited community, use-case information

- Unable to aggregate or cluster information (usage or concept domains)
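
The three-level scheme described in this slide can be pictured as a small record type in which a partial certification cannot be recorded without its qualifying guidance. This is only an illustrative sketch: the names `CertificationStatus`, `ReviewOutcome`, `guidance` and `summarize`, and the vocabulary name used in the example, are assumptions made here and do not come from any caBIG tooling.

```python
from dataclasses import dataclass
from enum import Enum


class CertificationStatus(Enum):
    """The three outcomes named on the slide (replacing plain pass/fail)."""
    FULLY_CERTIFIED = "fully certified"
    PARTIALLY_CERTIFIED = "partially certified"
    UNCERTIFIED = "uncertified"


@dataclass(frozen=True)
class ReviewOutcome:
    """Hypothetical record of one vocabulary review outcome."""
    vocabulary: str
    status: CertificationStatus
    guidance: tuple = ()  # qualifying guidance statements, if any

    def __post_init__(self):
        # Assumption: a partial certification is only meaningful when it
        # carries at least one qualifying guidance statement.
        if self.status is CertificationStatus.PARTIALLY_CERTIFIED and not self.guidance:
            raise ValueError("partial certification needs a guidance statement")


def summarize(outcome):
    """Return an at-a-glance, one-line summary of a review outcome."""
    text = f"{outcome.vocabulary}: {outcome.status.value}"
    if outcome.guidance:
        text += " (" + "; ".join(outcome.guidance) + ")"
    return text


# Hypothetical example data, not a real review result.
example = ReviewOutcome(
    "ExampleVocabulary",
    CertificationStatus.PARTIALLY_CERTIFIED,
    ("for use in value domains only",),
)
print(summarize(example))
```

Keeping the guidance statements as structured data rather than prose in a monolithic report is one way such outcomes could later be aggregated or filtered.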


Cochrane Library Style Summaries



Profiles: Candidate Metadata Models

1. Ontology Metadata Vocabulary (OMV | Consortium): Human-readable and comprehensive

2. Terminology Metadata Model (TMM | CBIIT): Includes certification-related attributes important to CBIIT

3. Common Terminology Services 2 (CTS2): Especially important for value domains/value sets, discovery, localizations, machine interoperability

4. Ontology Definition Metamodel (ODM | OMG): Broad coverage, use cases (clustering), lifecycle, engineering (tools), DL and CL, RDFS, Topic Maps

5. Metamodel for Ontology Registration (ISO 19763-3): Ontology registries and tracking evolution, machine interoperability

6. Open Provenance Model (OPM): Complements other models to describe entities (agents), processes and artifacts. Good fit for describing “there”.

7. Dublin Core Metadata: Document and provenance centric. Often included in other models

8. Friend of a Friend (FOAF): Important for social integration, including user ratings (also Expertise Ontology, KSAs)

9. Description of a Project (DOAP): Matching ontologies and standards to projects. NCBO has added project metadata to OMV


Example of one issue and resolution

Issue

Identify a metamodel that captures the salient and common (NCBO, NCRI, CBIIT) attributes to describe and share ontologies (now) as well as standards, projects, people, artifacts (future).

Resolution

1. Gather requirements and rank them

2. Review candidate metamodels

3. Match requirements to features of extant metamodels

4. Resolve to a single model (if possible)
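
The resolution steps above amount to a weighted decision matrix: ranked requirements on one axis, candidate metamodels on the other. The sketch below shows one possible way to score such a matrix. The scoring rule (weighted coverage), the function names, and the example requirements and models are all assumptions for illustration; they are not the group's actual matrix or results.

```python
def rank_models(requirements, models):
    """Rank candidate metamodels by weighted requirement coverage.

    requirements: dict mapping requirement name -> weight (higher = more important)
    models: dict mapping model name -> set of requirement names it satisfies
    Returns a list of (model name, score) pairs, best first, with scores in [0, 1].
    """
    total = sum(requirements.values())
    if total <= 0:
        raise ValueError("requirements must carry a positive total weight")
    scores = []
    for name, features in models.items():
        covered = sum(weight for req, weight in requirements.items() if req in features)
        scores.append((name, covered / total))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def unmet(requirements, features):
    """List the requirements a model does not cover (candidates for extensions)."""
    return sorted(req for req in requirements if req not in features)


# Hypothetical requirements and candidate models, for illustration only.
requirements = {"discovery": 3, "certification status": 2, "user ratings": 1}
models = {
    "Model A": {"discovery", "user ratings"},
    "Model B": {"discovery", "certification status"},
}
for name, score in rank_models(requirements, models):
    print(name, round(score, 2), "missing:", unmet(requirements, models[name]))
```

Listing the unmet requirements alongside each score keeps step 4 honest: when no single model covers everything, the gaps show which extensions would be needed.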

Progress

Current focus is NCBO’s use and extensions of the Ontology Metadata Vocabulary.


Example of one issue and resolution

Illustration: Focused view of the Decision Matrix used by group to rank requirements (on NCI Wiki)


Example of one issue and resolution

Illustration: Focused view of annotating features of the Ontology Metadata Vocabulary (on NCI Wiki)


Example of one issue and resolution

Courtesy: Natasha Noy (NCBO)

• The main class OMV:Ontology
  – Represents metadata about a version of an ontology


Example of one issue and resolution

Courtesy: Natasha Noy (NCBO)

Some OMV properties describing an ontology (properties on OMV:Ontology)

• OMV:acronym
• OMV:name
• OMV:URI
• OMV:naturalLanguage
• OMV:creationDate
• OMV:modificationDate
• OMV:description
• OMV:designedForOntologyTask
• OMV:documentation
• OMV:endorsedBy
• OMV:hasContributor
• OMV:hasCreator
• OMV:hasDomain
• OMV:status
• OMV:containsABox, OMV:containsTBox
• OMV:expressiveness
• OMV:hasFormalityLevel
• OMV:hasLicense
• OMV:keywords
• OMV:keyClasses
• OMV:knownUsage
• OMV:isOfType
• OMV:usedOntologyEngineeringTool
• OMV:usedKnowledgeRepresentationParadigm
• OMV:numberOfClasses
• OMV:numberOfIndividuals
• OMV:numberOfAxioms
• OMV:numberOfProperties
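
To make the property list concrete, the sketch below builds a profile for one ontology version as a plain dictionary keyed by the OMV property names from this slide, rejecting any name that is not on the list. The helper `make_profile`, the dictionary representation, and the example values are assumptions for illustration; OMV itself is an ontology, not a Python API.

```python
# Property names as listed on the slide (without the "OMV:" prefix).
OMV_PROPERTIES = frozenset({
    "acronym", "name", "URI", "naturalLanguage", "creationDate",
    "modificationDate", "description", "designedForOntologyTask",
    "documentation", "endorsedBy", "hasContributor", "hasCreator",
    "hasDomain", "status", "containsABox", "containsTBox",
    "expressiveness", "hasFormalityLevel", "hasLicense", "keywords",
    "keyClasses", "knownUsage", "isOfType", "usedOntologyEngineeringTool",
    "usedKnowledgeRepresentationParadigm", "numberOfClasses",
    "numberOfIndividuals", "numberOfAxioms", "numberOfProperties",
})


def make_profile(**values):
    """Build a profile dict with "OMV:"-prefixed keys.

    Raises KeyError if a keyword is not one of the listed properties,
    which guards against typos such as "cointainsABox".
    """
    unknown = sorted(set(values) - OMV_PROPERTIES)
    if unknown:
        raise KeyError(f"not a listed OMV property: {', '.join(unknown)}")
    return {f"OMV:{key}": value for key, value in values.items()}


# Hypothetical example values, not taken from a real ontology.
profile = make_profile(
    acronym="EX",
    name="Example Ontology",
    hasDomain=["example domain"],
    numberOfClasses=42,
)
print(profile)
```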


Example of one issue and resolution

Adapted from slides by Natasha Noy (NCBO)

Properties of OMV:Ontology that were added by NCBO

• administeredBy
• hasContactEmail
• hasContactName
• uploadDate
• id
• internalVersionNumber
• preferredNameProperty
• synonymProperty
• documentationProperty
• authorProperty
• codingScheme
• fileNames
• filePath
• hasView
• isVersionOfVirtualOntology
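
When profiles are exchanged between organizations, it can help to separate core OMV properties from the NCBO additions, so that a partner who only understands the core can ignore the rest. The sketch below assumes a profile stored as a dictionary in which core keys carry an "OMV:" prefix and NCBO-added keys appear unprefixed, as on this slide. That convention, the function `split_profile`, and the example values are assumptions, not part of BioPortal.

```python
# NCBO-added property names as listed on the slide.
NCBO_ADDED = frozenset({
    "administeredBy", "hasContactEmail", "hasContactName", "uploadDate",
    "id", "internalVersionNumber", "preferredNameProperty",
    "synonymProperty", "documentationProperty", "authorProperty",
    "codingScheme", "fileNames", "filePath", "hasView",
    "isVersionOfVirtualOntology",
})


def split_profile(record):
    """Split a profile dict into (core OMV part, NCBO extension part).

    Keys starting with "OMV:" are treated as core; keys in NCBO_ADDED are
    treated as extensions; anything else raises KeyError.
    """
    core, extension = {}, {}
    for key, value in record.items():
        if key.startswith("OMV:"):
            core[key] = value
        elif key in NCBO_ADDED:
            extension[key] = value
        else:
            raise KeyError(f"unrecognized property: {key}")
    return core, extension


# Hypothetical record, for illustration only.
record = {
    "OMV:acronym": "EX",
    "OMV:name": "Example Ontology",
    "hasContactName": "A. Curator",
    "internalVersionNumber": 3,
}
core, extension = split_profile(record)
print(core)
print(extension)
```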



Virtual Ontology

Adapted from slides by Natasha Noy (NCBO)

• Needed a container for all the versions of the same ontology (e.g., to be able to provide an id that resolves to the latest version)
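
A minimal sketch of the container idea follows: one stable identifier for the virtual ontology, resolving to whichever version is most recent. It assumes "latest" means the highest `internalVersionNumber`. The class and method names are hypothetical and the numbers are made up; this is not BioPortal's implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OntologyVersion:
    """One uploaded version; field names loosely follow the NCBO additions."""
    id: int
    internal_version_number: int


@dataclass
class VirtualOntology:
    """Container for all versions of the same ontology."""
    id: int
    versions: list = field(default_factory=list)

    def add_version(self, version):
        self.versions.append(version)

    def latest(self):
        """Resolve the virtual id to the most recent version.

        Assumption: the most recent version is the one with the highest
        internal version number, regardless of insertion order.
        """
        if not self.versions:
            raise LookupError(f"virtual ontology {self.id} has no versions")
        return max(self.versions, key=lambda v: v.internal_version_number)


# Hypothetical identifiers, for illustration only.
virtual = VirtualOntology(id=1000)
virtual.add_version(OntologyVersion(id=41, internal_version_number=1))
virtual.add_version(OntologyVersion(id=57, internal_version_number=3))
virtual.add_version(OntologyVersion(id=49, internal_version_number=2))
print(virtual.latest())
```

Because clients hold only the virtual id, a new upload changes what `latest()` returns without any reference needing to be updated.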


Other Classes

Adapted from slides by Natasha Noy (NCBO)

• Project – describing ontology-based projects

• View, VirtualView – handling ontology views and subsets

• BioPortalUser (subclass of OMV:Person)


Where are we?

Adapted from slides by Natasha Noy (NCBO)

• Moving towards OMV (core), and extensions (provenance and workflow), including those added by NCBO

• NCBO’s model may become the putative model

• Next: Evaluate ratings and rankings systems (e.g. Amazon style)

• Next: Evaluate methods for federated exchange, synchronization of profiles and community participation

• Next: Proposed phased roll-out, implementation


SAIF and ECCF effects on this process

• Updated ontology profiles are a better fit for our emerging agile and iterative environment

• Aggregation or clustering of ontology information more useful for describing concept domains, regulatory domains, ontology tasks, etc.

• Inclusion of community and “grass roots” participation more useful to discovery of relevant use cases, education and adoption (“web of trust”, evaluation by peers)

• Vocabulary review criteria have potential to be used as a self-assessment tool to “pre-certify”. Criteria may be used as conformance statements.

• The review process and criteria have already proven to help guide ontology development (i.e. CTCAE version 4.0)

• Provide sufficient detail (granularity, context) to assist usage and adoption at various levels in the implementation stack

• Community participation (experiential) important to mitigate presumptions about interoperability (e.g. semantic drift or change)
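
The aggregation and clustering point above can be illustrated by grouping ontology profiles on a shared metadata property such as OMV:hasDomain. The sketch assumes profiles are dictionaries with "OMV:"-prefixed keys. The function `cluster_by` and the example profiles are hypothetical, and real clustering would likely need controlled domain terms rather than free text.

```python
from collections import defaultdict


def cluster_by(profiles, prop):
    """Group ontology acronyms by the value(s) of one metadata property.

    profiles: iterable of dicts, each with an "OMV:acronym" key
    prop: property to cluster on, e.g. "OMV:hasDomain"
    Profiles lacking the property are left out of every cluster.
    """
    clusters = defaultdict(list)
    for profile in profiles:
        values = profile.get(prop, [])
        if isinstance(values, str):
            values = [values]  # accept a single value as well as a list
        for value in values:
            clusters[value].append(profile["OMV:acronym"])
    return {key: sorted(names) for key, names in clusters.items()}


# Hypothetical profiles, for illustration only.
profiles = [
    {"OMV:acronym": "AAA", "OMV:hasDomain": ["anatomy", "disease"]},
    {"OMV:acronym": "BBB", "OMV:hasDomain": "disease"},
    {"OMV:acronym": "CCC"},
]
print(cluster_by(profiles, "OMV:hasDomain"))
```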


Conclusions & Recommendations

• Ontology evaluation should include formal evaluations, self-evaluations, community reviews and case studies, methods of aggregation, viewing varying perspectives and should maintain currency, context, granularity and a web-of-trust

• Identify relevant metadata for ontologies that is also reusable for other entities, processes and artifacts (e.g. other standards, non-standards, people, projects, etc.)

• Be pragmatic (implement now/soon) yet prescient (have sufficient foresight), or “don’t repeat yourself” (DRY)


Questions?