biomedical informatics data governance and normalization with the mayo clinic enterprise christopher...
TRANSCRIPT
Biomedical Informatics
Data Governance and Normalization with the Mayo Clinic Enterprise
Christopher G Chute MD DrPH, Mayo ClinicChair, ICD11 Revision, World Health OrganizationChair, ISO TC215 on Health Informatics
CTSA Ontology WorkshopBaltimore
25 April 20121
Biomedical Informatics
2
From Practice-based Evidenceto Evidence-based Practice
PatientEncounters
ClinicalDatabases Registries et al.
ClinicalGuidelines
Medical Knowledge
ExpertSystems
Data Inference
KnowledgeManagement
Decisionsupport
StandardsShared Semantics
Vocabularies & Terminologies
Biomedical Informatics
Mayo Clinic – Normalization NeedsVariations on Secondary Use
•Clinical Decision Support•Quality measurement and improvement•Best practice discovery and application •New knowledge discovery (research)
•Genotype to phenotype association•Outcomes Research•Comparative Effectiveness Analyses
© 2007 Mayo Clinic College of Medicine 3
Biomedical Informatics
Comparable and Consistent Data
• Inferencing from data to information requires sorting information into categories•Statistical bins•Machine learning features
•Accurate and reproducible categorization depends upon semantic consistency
•Semantic consistency is the vocabulary problem•Almost always manifest as the “value set” problem
© 2007 Mayo Clinic College of Medicine 4
Biomedical Informatics
LOINC, RxNorm, and SNOMEDWe’re not done yet…
•Picking high-level, Meaningful Use conformant, and fashionable vocabularies is not hard
•Finding which subsets (value sets) map to•For each application and interface•For each data element and variable
• Implies thousands of value sets for the average enterprise
• Imposes a normalization and mapping challenge
© 2007 Mayo Clinic College of Medicine 5
Biomedical Informatics
Data Governance
Data Standards
Data Quality
Data Architecture and Infrastructure
Data Governance Support Services
Policies
Biomedical Informatics
Terminology VisionMayo Clinic Production Terminology Support
•To incrementally standardize Mayo Clinic terminology•To access and use standardized terminology in
applications, interfaces, warehouses, and processes across Mayo Clinic
•To reduce cost and effort of developing and maintaining redundant and disparate terminologies and mappings for operations, analytics, and decision support
•To support business needs and strategies that require or benefit from standardized terminology (e.g., ICD-10 conversion, Meaningful Use, Health Information Exchange)
Biomedical Informatics
Principles
•Leverage external standards•Support Mayo needs that are not yet part of an
external standard and promote them to the industry
•Align with the Enterprise Data Model and other types of Mayo Clinic Data Standards
•Maintain a balance between short-term and strategic needs and solutions
Biomedical Informatics
High-Level Context
External Terminologies
LOINC, CPT, ICD-9, ICD-O, SNOMED-CT,UMLS, other
EnterpriseData Model
EnterpriseManagedMetadataEnvironment
Analytic Systems
EDT, Admin,Nursing,DSS, other
Operational Systems
EMRs. Lab,Radiology, other
Research andCollaboratingOrganizations
EnterpriseTerminology
Biomedical Informatics
Solution Vision
EnterpriseTerminology
Tools and Services: Load, Author, Map, Store, Search, Browse, Publish, Access
• Code Systems•Mayo Clinic Backbone Terminology, ICD-9 CM, ICD-10 CM, SNOMED CT, RxNorm, LOINC, etc.
• Mappings•Clinical Problems to ICD-9 CM, ICD-10 CM, and SNOMED CT, Lab Orders/Results to LOINC
• Value Sets•Clinical Problem, Body Structure, Body Side, Administrative Gender, Unit of Measure
• Picklists•Admitting Diagnosis, Lab Unit of Measure
Biomedical Informatics
HypertensionHyperlipidemiaDepressionHypothyroidismObesityOsteoporosisCoronary artery diseaseAnemiaDiabetes mellitusObstructive sleep apnea
Clinical ProblemsClinical Facing
Code Systems
SNOMED CTMaps
Administrative Billing
ICD-9
ICD-10
Payers
Health Information Exchanges
Value Set and Mappings Example
Biomedical Informatics
Problem List Terminology Example
Develop and approve
Provide in an
accessible format
Distribute and Load
Use in clinical and admin processes
EMR’s, Scheduling, Orders, etc.
Biomedical Informatics
Terminology Management Offerings
•Methodologies and Best Practices•Terminology Standardization Facilitation•External Standards Expertise•Terminology Mapping (and 3rd party
recommendations)•External Terminology Management•Mayo Clinic Terminology Standards
Management
Biomedical Informatics
Tools and Terminology Services
•Health Language Inc.• Central authoring tool (LExScape)• Mapping assistance tool (LEAP)• Web-based browser (LExPlorer)• Importer/Exporter and Web Services
• Mayo Developed & Supported• Exports• Web Services (as needed)
Biomedical Informatics
Terminology Resources
• Data Governance Support Services Terminology Management
• Technical support: IT Reference Repositories Operational Systems
• Data Governance Committee
Biomedical Informatics
Approved Data Standards and Policy
•Approved Data Standards on Web• Mayo Clinic Data Standards• Enterprise Data Model
•Data Standards Policy• Data Governance Policy for Data Standards
Biomedical Informatics
17
Data Granularity DissonancePractice vs. Research vs. Public Health
•Clinical data is impressionistic•Not protocol driven – quaint rambling text…•Detail and consistency is a happy coincidence•However, can be holistic; intuit the unanticipated
•Research data is rigid though rigorous•Detailed, structured, complete (ideally), coherent•May miss the forest for the chlorophyll
•Public Health comprehends broader interests•Metrics of life-style, high risk behavior, fitness•Non-clinical circumstances, settings, environment
•Can there be reconciliation among use-cases?
Biomedical Informatics
electronic MEdical Records and GEnomics NHGRI eMERGE (U01) Goals
•GWAS – Genome Wide Association Study•600k Affy chip
•High-throughput Phenotyping•Disease algorithm scans across EMRs• “catch up” with high throughput genomics
•Generalize Phenotypes across the Consortium•Measure reproducibility of algorithms among
members•Vanderbilt, Northwestern, Marshfield, Group
Health Seattle, Mayo1811/16/10
Biomedical Informatics
19
SHARP: Area 4: Secondary Use of EHR DataA $15M National Consortium
•16 academic and industry partners•Develop tools and resources that influence and
extend secondary uses of clinical data•Cross-integrated suite of project and products
•Clinical Data Normalization •Natural Language Processing (NLP)•Phenotyping (cohorts and eligibility)•Common pipeline tooling (UIMA) and scaling•Data Quality (metrics, missing value management)•Evaluation Framework (population networks)
Biomedical Informatics
SHARP Area 4: Secondary Use of EHR Data
• Harvard Univ. • Intermountain Healthcare• Mayo Clinic• Mirth Corporation, Inc.• MIT • MITRE Corp. • Regenstrief Institute, Inc.• SUNY • University of Colorado
• Agilex Technologies• CDISC (Clinical Data Interchange
Standards Consortium)• Centerphase Solutions• Deloitte• Group Health, Seattle• IBM Watson Research Labs• University of Utah• University of Pittsburgh
Biomedical Informatics
Themes & Projects
Biomedical Informatics
Clinical Data Normalization•Data Normalization
•Clinical data comes in all different forms even for the same kind of information.
•Comparable and consistent data is foundational to secondary use
•Clinical Data Models – Clinical Element Models (CEMs)•Basis for retaining computable meaning when data
is exchanged between heterogeneous computer systems.
•Basis for shared computable meaning when clinical data is referenced in decision support logic.
Biomedical Informatics
A diagram of a simple clinical model
# 23
data 138 mmHg
quals
SystolicBPObs
data Right Arm
BodyLocation
data Sitting
PatientPosition
Clinical Element Model for Systolic Blood Pressure
Biomedical Informatics
Biomedical Informatics
Data Element Harmonization
•Stan Huff – CIMI•Clinical Information Model Initiative
•NHS Clinical Statement•CEN TC251/OpenEHR Archetypes•HL7 Templates• ISO TC215 Detailed Clinical Models•CDISC Common Clinical Elements• Intermountain/GE CEMs
Biomedical Informatics
Normalization Pipelines
• Input heterogeneous clinical data•HL7, CDA/CCD, structured feeds
•Output Normalized CEMs•Create logical structures within UIMA CAS
•Serialize to a persistence layer•SQL, RDF, “PCAST like”, XML
•Robust Prototypes Q1 2012•Early version production Q3 2010
Biomedical Informatics
This slide is obvious
Determine payloadIs laboratory data
1. Physical Quantity
2. CodedValues
3. Text field
Biomedical Informatics
Natural Language Processing
• Information extraction (IE): transformation of unstructured text into structured representations and merging clinical data extracted from free text with structured data•Entity and Event discovery•Relation discovery•Normalization template: Clinical Element Model
• Improving the functionality, interoperability, and usability of a clinical NLP system(s), namely the Clinical Text Analysis and Knowledge Extraction System (cTAKES)
Biomedical Informatics
High-Throughput Phenotyping
•Phenotype - identifying a set of characteristics of about a patient, such as:•A diagnosis•Demographics•A set of lab results
•Phenotyping – overload of terms•Originally for research cohorts from EMRs•Obvious extension to clinical trial eligibility•Note relevance to quality metrics
• Numerator and denominator constitute phenotypes•Clinical decision support
• Trigger criteria
Biomedical Informatics
EMR Phenotype Algorithms
•Typical components•Billing and diagnoses codes; Procedure codes•Labs; Medications•Phenotype-specific co-variates (e.g.,
Demographics, Vitals, Smoking Status, CASI scores)
•Pathology; Imaging?•Organized into inclusion and exclusion criteria•Experience from eMERGE Electronic Medical
Records and Genomics Network (http://www.gwas.net)
Biomedical Informatics
SHARP Area 4: More information…
http://informatics.mayo.edu/sharphttp://sharpn.org
Biomedical Informatics
32
Where is This Going?•Standards and interoperability are crucial to the
operation and success of academic medical centers
•The boundaries among clinical and research standards are eroding
• Information standards, especially vocabularies, are the foundation for scientific synergies
•Data Governance, at an enterprise (and arguably a national) level is critical