202 - alan stein - big data presentation
TRANSCRIPT
HPE Big Data Platform
Healthcare Analytics
Alan Stein, MD PhD
Healthcare Practice Lead, SW Big Data
April 2016
Future Sources
• Video
• Biometrics
• Geotracking
• SMS
• Web chat
• Physiologic monitoring
• Social networks
• Mobile apps
• Sensors
• Survey response
• Biochemical Assays
Current Sources
• Revenue management
• Claims
• EMRs
• ICD 9-10
• Genetic Sequences
• Lab values
• Medication records
• Clinician/caretaker notes
• Radiology reports
• Pathology readings
• Clinical quality measures
• Population health data
Traditional HLS data can be structured or unstructured, and limited or voluminous in nature
Nontraditional healthcare data will challenge current methods of data capture and analytics
Current and Future Healthcare Data
We want to turn data into information
Millions of daily transactions
Multitude of Haven Big Data Use-cases
Improving patient care, quality outcomes, and speed to market while reducing overall costs
• Cost, utilization, performance, & quality variable analytics
• Claim and member data analytics
• Fraud detection & prevention
• Medical and pharmaceutical diagnostics
• Compliance testing
• Clinical data analysis
• Patient record analysis
• Internal risk assessment
• Logistics optimization
• Supply chain optimization
• Equipment monitoring
• Customer behavior analysis
• Web application optimization
• Operations analytics
• Marketing campaign optimization
• Brand management
• Social media analytics
• Pricing optimization
• Revenue assurance
• Security analytics
• Defect tracking
• Risk management
• Sentiment analysis
• Clickstream analysis
• Influencer analysis
• IT infrastructure analysis
• Legal discovery
• Enterprise search
• Warranty management
• Social CRM / network analysis
• Churn mitigation
• Brand monitoring
• Cross and Up sell
• Loyalty & promotion analysis
Drug development
Scientific research
Evidence based medicine
Healthcare outcomes analysis
Patient analytics
Clinical data analysis
Use Cases
Sample HC Analytics questions
• What are the top 5 reasons for an ER “frequent flyer” visit? How does this vary for patients who live alone versus those who live with family?
• What percentage of primary Type 1 diabetes care pediatric patients were on an insulin pump in the last calendar year?
• What is the incidence of Pressure Sores / Bed-Hour?
• In the last 30 days, how many patient events involving a Rapid Response or Crash Team occurred? Which hospital units were involved?
• How does a missed cardiology follow-up appointment affect the likelihood of a hospitalisation within the next 30 days?
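The first question above can be sketched in a few lines of Python. This is a minimal illustration only: the record layout, the sample data, and the threshold of four visits for a "frequent flyer" are all assumptions, not definitions from the presentation.

```python
from collections import Counter, defaultdict

# Hypothetical visit records: (patient_id, living_situation, visit_reason).
VISITS = [
    ("p1", "alone", "chest pain"),
    ("p1", "alone", "chest pain"),
    ("p1", "alone", "fall"),
    ("p1", "alone", "shortness of breath"),
    ("p2", "with family", "asthma"),
    ("p2", "with family", "asthma"),
    ("p2", "with family", "chest pain"),
    ("p2", "with family", "asthma"),
    ("p3", "alone", "fall"),  # a single visit, so not a frequent flyer
]


def top_reasons(visits, min_visits=4, top_n=5):
    """Top visit reasons for frequent flyers, split by living situation.

    min_visits is an assumed cut-off for "frequent flyer".
    """
    visits_per_patient = Counter(pid for pid, _, _ in visits)
    frequent = {pid for pid, n in visits_per_patient.items() if n >= min_visits}

    reasons = defaultdict(Counter)
    for pid, living, reason in visits:
        if pid in frequent:
            reasons[living][reason] += 1

    return {living: counts.most_common(top_n) for living, counts in reasons.items()}


result = top_reasons(VISITS)
```

With the sample data, `result["alone"]` is led by chest pain and `result["with family"]` by asthma; the single visit by `p3` is excluded because that patient falls below the assumed threshold.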
Real world use-cases
– Clinical concept based surveillance for VTE events by QI team, including suspect cohort generation and computer facilitated chart abstraction
– Clinical feature extraction from patient narratives, including genetic testing, to identify phenotype/genotype relationships
– Diagnosis related group precoding/postcoding confirmation based on a combination of quantitative and qualitative criteria
– Post-acute discharge referral analysis (CHF and arthroplasty) to evaluate cost, outcomes, and patterns, while segmenting by health status and disease acuity
– Identification of uncoded metabolic conditions by laboratory values and computer facilitated chart review
Extracting value from data through a Platform
Self service analytics
New approach: Self Service Analytics
Accelerates access to comprehensive insights
Business Users Intuitive Data Interaction
Seamless integration of structured and unstructured data
…90% data is untapped for KPIs and analytics
Transaction Records
• Financial & Operational Transactions
Qualitative Human Data
• Admission notes
• Discharge summaries
• Progress notes
• Imaging study results
• Consultant reports
Quantitative Machine Data
• Medication records
• Laboratory results
• Physiologic testing
• Biometric sensors
• RFID tags
David Yachnin
HPE Healthcare Rapid Deploy Solution
[Architecture diagram. Labeled components: Unstructured Data, Transactional Database, Connectors, ETL tools, IDOL Unstructured Index, IDOL Enrichment, Pattern Matchers, Mappers, Ontologies, Vertica Analytic Database, Schema, Hadoop, SQL for Hadoop, HCA, HAFV, Haven App Framework, Dashboard UI, Reporting tools, Statistical analysis tools]
HPE Vertica Technology
• Columnar storage and execution: Achieve best data query performance with unique Vertica column store
• Clustering: Linear scaling by adding more resources on the fly
• Compression: Store more data, provide more views, use less hardware
• Continuous performance: Query and load 24x7 with zero administration
• Database Design: Automated Performance Tuning
• Advanced Analytics: Time-series, geospatial, click-stream and an SDK for more
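To see why columnar storage and compression go together, the sketch below pivots a few rows into columns and run-length encodes each column. It is a generic illustration of the idea, not a description of Vertica internals; the sample rows and function names are hypothetical.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot row-oriented records into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


# Hypothetical admissions table, already sorted by unit and year.
ROWS = [
    {"unit": "ICU", "year": 2015},
    {"unit": "ICU", "year": 2015},
    {"unit": "ICU", "year": 2016},
    {"unit": "NICU", "year": 2016},
    {"unit": "NICU", "year": 2016},
]

columns = to_columns(ROWS)
encoded = {name: run_length_encode(values) for name, values in columns.items()}
```

Here five stored values per column shrink to two `(value, count)` pairs each. Storing a column contiguously puts similar values next to each other, which is what makes this kind of encoding effective.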
HPE IDOL Technology
IDOL is an information processing and indexing layer that:
– Scales massively
– Provides a single integration and access layer for all data types (structured & unstructured)
– Is an open, standards-based platform with REST APIs
– Manages data in-place
– Offers conceptual and machine learning capabilities
– Automatically categorizes & tags content
– Has a granular security model to support HIPAA
Cerner Corporation
– Challenge
• Improve efficiency and quality of patient care with better productivity of clinician users
– Solution
• Cerner Millennium health care platform
• HP HAVEn engines: HP Vertica Analytics Platform, Hadoop
– Expected result
• 6,000% faster analysis of timers helps Cerner gain insight into how physicians and others use Millennium and make suggestions about using it more efficiently, so users become more efficient physicians
• Rapid analysis of 2 million alerts daily enables Cerner to anticipate problems and head them off before they happen
HPE Vertica helps to optimize health information solutions
Human genome sequencing and medical research centers in New York
Diagnose disease and develop more effective treatments for patients
– Challenge
• Scalable platform to handle 16 TB of data output per day
• Solution must be cost-efficient and support cutting-edge research
– Solution
• HP Vertica Analytics Platform
– Expected result
• Ability to house output from deployment of Illumina HiSeq X Ten genome sequencing appliances
• Functionality to support the enormous amounts of data sequencers generate
• Query speed to correlate output for time-sensitive reporting
• Cost-savings with software that runs on off-the-shelf hardware
Big Data vs. “Big BI”
Big BI:
1. Same analyses as before, just more data
2. Batch or warehouse-type processing
3. Informative, but not really actionable
Big Data:
1. Joining data sets never before joined, asking questions never before asked
2. Real-time or near-real-time, leading to predictive/persuasive
3. Action oriented
HPE HCAS genomic related functions
o Automatic abstraction of clinical concepts from the patient summaries
o Correlation between clinical features and genetic variations
o Cluster visualizations to determine variant cohorts for select phenotypes
o Cluster visualizations to determine phenotype cohorts for select variants
Related genomic queries:
o How many variants of type X are in the population?
o How many variants are heterozygous?
o What is the allele distribution of all variants?
o What variant differences do we find when we compare cohort A against cohort B?
o In how many variants is a particular mutation seen?
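Two of the genomic queries above (heterozygous counts and allele distribution) can be sketched over a simple list of genotype calls. The record layout and sample values are hypothetical and stand in for whatever schema the platform actually uses.

```python
from collections import Counter

# Hypothetical genotype calls: (sample_id, variant_id, allele_1, allele_2).
CALLS = [
    ("s1", "v1", "A", "G"),
    ("s2", "v1", "G", "G"),
    ("s3", "v1", "A", "A"),
    ("s1", "v2", "C", "T"),
    ("s2", "v2", "C", "C"),
]


def heterozygous_calls(calls):
    """Calls whose two alleles differ."""
    return [call for call in calls if call[2] != call[3]]


def allele_distribution(calls):
    """Per-variant count of each observed allele across all samples."""
    distribution = {}
    for _, variant, allele_1, allele_2 in calls:
        distribution.setdefault(variant, Counter()).update([allele_1, allele_2])
    return distribution


het = heterozygous_calls(CALLS)
dist = allele_distribution(CALLS)
```

In the sample data two of the five calls are heterozygous, and variant `v1` shows an even split between alleles A and G. A cohort comparison would run the same functions on two subsets of `CALLS` and compare the results.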
Genomic Medicine
DRG Assignments
DRG coding details
• Was the patient receiving antibiotics prior to this admission? [ ] Yes [ ] No
• Presenting symptoms upon admission: ___________________________
• Were positive blood cultures present? [ ] Yes [ ] No
• If YES, list the organism
• If YES, was there physician documentation that the blood culture was contaminated? [ ] Yes [ ] No
• Did the attending physician document urosepsis as the final principal diagnosis? [ ] Yes [ ] No
• If YES, the correct ICD-9-CM code for urosepsis is 599.0.
DRG coding details
SEPTICEMIA / SEPSIS / SIRS INDICATIONS
– Clinical indication of septicemia/sepsis/SIRS:
• Acute mental status changes
• Positive blood culture
• Fever >100.4ºF (38ºC) PR or Hypothermia < 97ºF (36ºC) PR
• Heart rate >100 beats/minute
• Respiratory rate > 24 breaths/minute or pCO2 < 32 mmHg
• WBC > 12,000/cu.mm or < 4,000/cu.mm or > 10% bands
• Physician documentation of decreased urinary output/oliguria
• Arterial pH less than 7.35 (metabolic acidosis)
• Elevated blood lactate levels
Details
If septicemia/sepsis is substantiated:
• A. What did the physician document in the medical record as the condition responsible for the septicemia/sepsis diagnosis?
• Pyelonephritis
• Pneumonia
• Cellulitis
• Meningitis
• Cholangitis
• Peritonitis
• Other (specify):
• B. Does the physician documentation indicate that the septicemia/sepsis is due to an internal device, implant, or catheter? [ ] Yes [ ] No
Analytics queries
• Quick navigation of cohort based on structured codings
• Septicemia (038.x)
• Sepsis (038.x and 995.9x)
• SIRS (995.9x)
• Severe Sepsis (995.92)
• Septic Shock (038.x, 995.92, 785.52)
• Semantic search for chart abstraction
• Blood culture results
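The structured-coding cohorts above amount to simple tests on a patient's set of ICD-9-CM codes. The sketch below is one possible reading: it assumes "038.x and 995.9x" means both code families must be present, and that septic shock requires all three listed codes. The function name is hypothetical.

```python
def sepsis_cohorts(codes):
    """Return the cohort labels that a set of ICD-9-CM codes matches.

    Assumption: combined entries on the slide (Sepsis, Septic Shock)
    require every listed code family to be present.
    """
    has_038 = any(code.startswith("038") for code in codes)
    has_9959 = any(code.startswith("995.9") for code in codes)
    has_severe = "995.92" in codes

    labels = set()
    if has_038:
        labels.add("Septicemia")
    if has_038 and has_9959:
        labels.add("Sepsis")
    if has_9959:
        labels.add("SIRS")
    if has_severe:
        labels.add("Severe Sepsis")
    if has_038 and has_severe and "785.52" in codes:
        labels.add("Septic Shock")
    return labels


example = sepsis_cohorts({"038.9", "995.91"})
```

For the example set, the patient lands in the Septicemia, Sepsis, and SIRS cohorts but not in Severe Sepsis or Septic Shock.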
Analytics queries
• Urosepsis documentation
• Documentation indicates Urosepsis
• Documentation indicates Urosepsis but record does not have ICD-9-CM code for urosepsis (599.0)
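The second query above, documentation that indicates urosepsis without the matching 599.0 code, joins unstructured notes with structured codes. The sketch below uses a plain keyword match as a stand-in for semantic search, so it would not handle negation such as "no evidence of urosepsis". The record layout is hypothetical.

```python
import re

UROSEPSIS = re.compile(r"\burosepsis\b", re.IGNORECASE)


def undercoded_urosepsis(records):
    """Ids of records whose note mentions urosepsis but lacks code 599.0."""
    return [
        record["id"]
        for record in records
        if UROSEPSIS.search(record["note"]) and "599.0" not in record["codes"]
    ]


# Hypothetical records combining a free-text note with coded diagnoses.
RECORDS = [
    {"id": 1, "note": "Final diagnosis: urosepsis.", "codes": {"599.0"}},
    {"id": 2, "note": "Assessment: Urosepsis, started antibiotics.", "codes": {"780.6"}},
    {"id": 3, "note": "Pneumonia, improving.", "codes": {"486"}},
]

flagged = undercoded_urosepsis(RECORDS)
```

Only record 2 is flagged: its note documents urosepsis but its code set does not include 599.0.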
Analytics queries
• Semantic search for clinical indicators
• Acute mental status changes
• Positive blood culture
• Fever >100.4ºF (38ºC) PR or Hypothermia < 97ºF (36ºC) PR
• Heart rate >100 beats/minute
• Respiratory rate > 24 breaths/minute or pCO2 < 32 mmHg
• WBC > 12,000/cu.mm or < 4,000/cu.mm or > 10% bands
• Physician documentation of decreased urinary output/oliguria
• Arterial pH less than 7.35 (metabolic acidosis)
• Elevated blood lactate levels
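The numeric indicators in the list above can also be screened directly from structured observations, complementing semantic search over notes. The sketch below applies the thresholds as stated on the slide; the field names and labels are hypothetical, and the narrative indicators (mental status, documentation of oliguria, lactate without a stated threshold) are left out.

```python
def sirs_indicators(obs):
    """List which numeric indicators from the slide are met.

    obs maps hypothetical field names to measurements. Missing
    measurements are skipped rather than treated as normal.
    """
    checks = [
        ("fever", "temp_f", lambda v: v > 100.4),
        ("hypothermia", "temp_f", lambda v: v < 97.0),
        ("heart rate > 100", "heart_rate", lambda v: v > 100),
        ("respiratory rate > 24", "resp_rate", lambda v: v > 24),
        ("pCO2 < 32", "pco2_mmhg", lambda v: v < 32),
        ("abnormal WBC", "wbc_per_cumm", lambda v: v > 12000 or v < 4000),
        ("bands > 10%", "bands_pct", lambda v: v > 10),
        ("arterial pH < 7.35", "arterial_ph", lambda v: v < 7.35),
    ]
    return [
        label
        for label, key, is_met in checks
        if obs.get(key) is not None and is_met(obs[key])
    ]


met = sirs_indicators({"temp_f": 101.2, "heart_rate": 110, "resp_rate": 18})
```

For the sample observation, fever and the heart-rate criterion are met; the respiratory rate of 18 is not flagged, and the measurements that are absent are simply not evaluated.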
Analytics queries
• Semantic search for clinical indicators
• Arterial pH < 7.30
• Hypotension (SBP < 90 mmHg or SBP decrease > 40 mmHg)
• Arterial hypoxemia (ratio of PaO2 over FIO2 < 300 torr)
• Acute oliguria (urine output < 30 mL/hour for more than 2 hours)
• Creatinine > 2.0, or increase > 0.5 mg/dl
• Coagulation abnormalities (INR > 1.5 or PTT > 60 secs)
• Ileus (absent bowel sounds)
• Thrombocytopenia (platelet count < 100,000/µL)
• Hyperbilirubinemia (plasma total bilirubin > 4 mg/dl or 70 µmol/L)
• Decreased mental status
• Decreased peripheral pulses
Analytics queries
• Semantic search for clinical indicators
• Is the septicemia/sepsis diagnosis clearly substantiated (through physician documentation of clinical indications, positive blood culture, etc.)? [ ] Yes [ ] No
Analytics queries
• Semantic search for clinical indicators
• The condition responsible for the septicemia/sepsis diagnosis:
• Pyelonephritis
• Pneumonia
• Cellulitis
• Meningitis
• Cholangitis
• Peritonitis
• Other (specify):
• Does the physician documentation indicate that the septicemia/sepsis is due to an internal device, implant, or catheter? [ ] Yes [ ] No
Episode Dimension Building
Episode Trigger
Episode Duration
Claims Included in Spend
Non-Risk Adjusted Episode Spend
Identify PAP
Episode Level Exclusions
Identify PAP who pass Quality Metrics
Risk Adjustment
Calculate Risk/Gain Share amounts
Episode Algorithm
Vendor Extracts in Vertica
Visualize Episode Report
Stepwise Episode Model
START
Lucile Packard Children’s Hospital (Stanford)
– History of Partnership
• Development partnership: HP Healthcare Analytics for structured/unstructured data
• POC: Multi-patient Semantic Search
• Pilot: Facilitate USNWR survey
• Quality and clinical effectiveness access to ~115K patients, ~390K encounters, ~3M documents
– Healthcare Analytics powered by HP Haven
• Cohort Identification: Cross patient search, intuitive UI
• Chart Abstraction: Rapid review of individual patient records
• Deeper Analysis: Hypothesis generation
Lucile Packard Children’s Hospital Stanford
– Current State
• EMR conversion (Cerner to Epic) in May 2014
• 750k encounters, 155k patients, ~1M notes
• Preparing for weekly batch updates
– Challenge: Venous Thromboembolism (VTE)
• Hospital Acquired Condition (HAC), incidence about 4/1000 in pediatrics
• Difficult to identify for reporting, much less for mitigation and prevention
• Current process is inefficient and lacks sensitivity
Venous Thromboembolism: Traditional Workflow
Venous Thromboembolism
– Traditional Workflow
• Report identifies 3-5 patients per month, perhaps 1 true positive
• EHR-based chart abstraction takes hours to days
• < 5 VTE patients identified in 2015
– HP Healthcare Analytics
• Semantic search identifies 15-30 potential events per month
• Computer-assisted chart abstraction takes minutes
– Additional Quality/Clinical Effectiveness Use Cases
• Generalization to other Hospital Acquired Conditions (e.g. obstetric adverse events)
• Deeper analysis of identified events, risk factors, development of care protocols
Venous Thromboembolism Analysis
Lucile Packard Children’s Hospital Stanford
– Additional Uses for HPE Healthcare Analytics
• Self service analytics tool for faculty physicians for clinical care
• Support the concept of a Learning Healthcare System
• Insight into past experience (i.e. practice-based evidence)
• Allow for increasingly data driven care decisions
• Complements Epic SlicerDicer (cohort identification for structured Epic data)
• API development for Epic integration
• Analytics tool for Research/Discovery
• Cohort Discovery
• Hypothesis generation
The Big Data Journey
• Business Intelligence: Data warehousing for monitoring
• Analytics: Statistical modeling to extract insights
• Business Apps: Specialized apps to automate business processes
• Discover Insights: Discover insights to identify new business opportunities
• Operationalize Insights: Operationalize insights to transform the business
Enabling Technologies
• Data Warehouses
• Analytics Packages
• Business Systems
• Data Lakes
• Hybrid Data Management
Usage
• Ad Hoc
• Operational
• Predictive
• Prescriptive
• Reporting
Discovery Environments