KUMC Biomedical Informatics Resources for your Research: a focus on HERON
Russ Waitman, PhD
Director of Medical Informatics,
Associate Professor, Department of Biostatistics
Director, Frontiers Biomedical Informatics
Assistant Vice Chancellor, Enterprise Analytics
University of Kansas Medical Center
Kansas City, Kansas
This project is supported in part by NIH grant UL1TR000001 and NSF Award CNS-1258315
• We have tools and expertise to manage data and convert it into information
• REDCap and CRIS – enter and manage data
• HERON – fish for data from the hospital/clinic
• Biweekly Frontiers Clinical Informatics Clinics
– Tuesday 4-5 pm in 1028 Dykes Library.
– Next session April 30, 2013.
Biomedical Informatics Can Help Your Research
Bennett Spring Trout Park, Lebanon, Missouri
http://mdc.mo.gov/regions/southwest/bennett-spring
You’re that fisherman: wanting to land data to answer your research hypothesis
The Fish: Diagnoses, Demographics, Observations, Treatments
Why so many fish? Current Goal: Build Hatchery, Manage the Fishery
Photo Credit: HuntFishGuide.com http://www.flickr.com/photos/huntfishguide/5883317106/
Second Goal: If you need help fishing, get a guide
Photo Credit: S. Klathill http://www.flickr.com/photos/sklathill/505464990/
Prepare and Analyze Data
Photo Credit: Steve Velo http://www.flickr.com/photos/juniorvelo/259888572/
Our shared goal: a tasty publication
• I’ll just enter everything in Excel….
• What if I lose or accidentally sort my spreadsheet?
• How do I let students only review de-identified data?
• Hospital/Clinic is making me use this Electronic Medical Record and I get nothing in return...
Little White Salmon River, Washington State, last Summer in July
Nightmare: looks like a nice river, but can’t catch fish
• https://redcap.kumc.edu
– It uses the same username and password as your KUMC email.
• Non-KUMC researchers can request an affiliate account through Frontiers CTSA office
– Check out the training materials under videos
– Case Report Forms and Surveys
• For consultation and to move project to production: Register your project with us so we can keep track of your request.
– http://biostatistics.kumc.edu/projectReg.aspx
– After you register your project, a CRIS team member, likely Kahlia Ford, will get in touch with you.
• Check out other institutions using REDCap and possibly borrow from the master library.
– http://www.project-redcap.org/
Sometimes, You’re willing to enter data/buy fish: REDCap: Research Electronic Data Capture
REDCap Case Report Form Example
REDCap Survey: Think SurveyMonkey
• For clinical trials, CRIS (Velos) may be a better fit
– Multiple years of experience
– CRIS team builds for you with biostatistics review
– Budget for CRIS team and biostatistics explicitly
• “Investigator driven” REDCap only works if you, the Principal Investigator, take responsibility for your data
– Scalability: informatics provides consultation and responsibility for technical integrity; not your dictionary or data entry.
• Underwritten by CTSA, but you “feed and talk to your fish”
– Middle model where informatics can build for you in REDCap.
• Again, you budget for our team’s time
Option Two: CRIS
REDCap Disclaimer
http://www.flickr.com/photos/wiccked/185270913/lightbox/
REDCap: think Fish Tank you manage
Bonneville Hatchery: Trout, Salmon, Sturgeon, Columbia River, Oregon
I want to go fishing, not fill a fish tank (REDCap) Use HERON: a managed fishery
• Get a License: Develop business agreements, policies, data use agreements and oversight.
• Get a Fishing Rod and Bass Boat: Implement open source NIH funded (i.e. i2b2 https://www.i2b2.org/) initiatives for accessing data.
• Know what you’re catching: Transform data into information using the NLM UMLS Metathesaurus as our vocabulary source.
• Stock Different Tasty Fish: link clinical data sources to enhance their research utility.
Central CTSA Informatics Aim: Create a data “fishing” platform: HERON, https://heron.kumc.edu
• Fill out System Access Agreements to sponsor students/staff
• Fill out Data Use Agreement to request data export
• No Limit!!! IRB Protocol Not Required to view or pull de-identified data
• Must be on campus or use VPN or https://access.kumed.com
• Check http://frontiersresearch.org/frontiers/HERON-Introduction for more information, status, and training videos
Single sign-on using your email username
Real-time check for current human subjects training
HERON: Getting a Fishing License
The i2b2 “Fishing Rod”: build Diabetes cohort
Drag concepts from upper left into panels on the right
Types of “fish” in folders
i2b2: AND in Frontiers Research Registry
Dragging over the second condition
i2b2: AND a high Hemoglobin A1C
When you add a numeric concept, i2b2 asks if you want to set a constraint
i2b2 Result: 497 patients in Cohort
Run the Query
Query took 4 seconds
497 patients in cohort
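The panel logic in the screenshots above can be sketched in a few lines of Python. This is only an illustration of the query semantics (concepts inside one panel are ORed, separate panels are ANDed), not HERON or i2b2 code. The patient IDs, the HbA1c values, and the 9.0 threshold are made-up assumptions, and `run_query` is a hypothetical helper name.

```python
def run_query(panels):
    """Return the patients matching every panel.

    Each panel is a list of patient-ID sets, one set per concept dragged
    into that panel. Concepts within a panel are ORed; panels are ANDed.
    """
    cohort = None
    for panel in panels:
        in_panel = set().union(*panel)  # OR across concepts in the panel
        cohort = in_panel if cohort is None else cohort & in_panel  # AND
    return cohort if cohort is not None else set()


# Toy data (hypothetical): patient IDs carrying each concept.
diabetes = {1, 2, 3, 5}
research_registry = {2, 3, 4, 5}
hba1c_values = {1: 6.1, 2: 9.4, 3: 7.2, 5: 10.3}

# A numeric concept carries a value constraint; 9.0 is an assumed cutoff.
high_hba1c = {pid for pid, value in hba1c_values.items() if value > 9.0}

cohort = run_query([[diabetes], [research_registry], [high_hba1c]])
print(sorted(cohort))
```

Adding a second concept to an existing panel widens that panel, while adding a new panel narrows the cohort.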
i2b2: Explore Cohort, Visualize
http://www.oregon.com/columbia_gorge_attractions/bonneville_hatchery
Catch the data for JAMA, NEJM publication
The dream: landing the big one
Without getting bit
• Goal: stable monthly process, minimal downtime
• Complete rebuild of the repository, not based on HL7 messaging updates.
• Two databases: create new DB while old DB is in use.
• When the new DB is ready, switch over i2b2 to serve customers fresh data.
• Initial Files from Clinical Organizations
• Export KUH Epic Clarity relational database instead of Cache/MUMPS.
• Monthly file from UKP clinic billing system (GE IDX). UHC CDB, NAACCR
• Demographics, services, diagnoses, procedures, and Frontiers research participant flag.
• Extract Transform Load (ETL) processes largely SQL (some Oracle PL/SQL)
• Wrapped in Python scripts.
• Goals for a monthly release (20 months in a row so far):
– Fresh data. Example: another month of visits = millions of facts
– New types of data. Example: family history
– New functionality. Example: link data by encounter across clinical and financial sources; distinguish medication administration from prescription
How the team works: HERON Evolves Every Month
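The two-database release pattern described on this slide can be sketched as follows. This is a toy model under stated assumptions, not the HERON ETL: the `Repository` class, the database labels "A" and "B", and the release strings are all hypothetical, and the real process runs SQL against Oracle rather than filling Python lists.

```python
class Repository:
    """Toy stand-in for the pair of databases behind i2b2 (hypothetical)."""

    def __init__(self):
        self.databases = {
            "A": {"release": None, "facts": []},
            "B": {"release": None, "facts": []},
        }
        self.live = "A"              # database i2b2 currently serves
        self.standby_ready = False   # True once a full rebuild has finished

    def standby(self):
        return "B" if self.live == "A" else "A"

    def rebuild_standby(self, release, source_facts):
        """Full rebuild: discard the standby contents and reload from source."""
        target = self.standby()
        self.databases[target] = {"release": release, "facts": list(source_facts)}
        self.standby_ready = True
        return target

    def switch_over(self):
        """Point i2b2 at the fresh database, only after the rebuild is done."""
        if not self.standby_ready:
            raise RuntimeError("standby database has not been rebuilt")
        self.live = self.standby()
        self.standby_ready = False
        return self.live


repo = Repository()
repo.rebuild_standby("release-1", ["fact 1", "fact 2"])  # users still on A
repo.switch_over()                                        # users now on B
print(repo.live, repo.databases[repo.live]["release"])
```

Because users keep querying the old database until the switch, a failed rebuild costs a delayed release rather than downtime.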
https://informatics.kumc.edu/work/blog
Monthly Release
Blog highlights:
- Features
- Size
- Dates of sources
HERON’s Data Sources, Types of Data
https://informatics.kumc.edu/work/wiki/HeronProjectTimeline#Sep2012Planning - contains current plan for next several monthly releases
• Find a colleague
• Talk with hospital, clinic to understand workflow
• Attend bi-weekly clinics
• Watch the videos: http://frontiersresearch.org/frontiers/informatics-training-videos
• Request a consult http://frontiersresearch.org/frontiers/biomedical-informatics
If you don’t see what you want, or you really like things, let us know:
https://redcap.kumc.edu/surveys/?s=3SBkPg&tool=1
“Who’s Using HERON” and collaboration approaches
• HIPAA Safe Harbor De-identification
– Remove 18 identifiers and randomly shift dates by up to 365 days back in time
    • Downside: can’t do seasonal studies without IRB approval to go back and get actual dates
    • In general, tack on 7 months when wanting volume for the last year.
– Resulting in non-human subjects research data but treated as a limited data set from a system access perspective. System users and data recipients agree to treat as a limited data set (acknowledging re-identification risk)
• To be addressed:
– For now, we won’t add free text such as progress notes with text scrubbers (DeID, MITRE Identification Scrubber toolkit)
• Date Shift example:
– Patient was born August 13, 1968, had their blood pressure measured on November 28, 2012.
– Each month dates are shifted, ex: by -15 for the January release: New birthday is July 29, 1968 and the blood pressure measurement occurred on November 13, 2012.
    • For another patient, their offset might be -278. Next month the Aug 13th patient’s offset might be -192.
HERON De-identification: Remove HIPAA 18 identifiers -> non human subjects research
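The date-shift example above can be reproduced with a short Python sketch. It is a minimal illustration, not the HERON de-identification code: the function names are hypothetical, and drawing the offset uniformly from 1 to 365 days back is an assumption (the slide only says "up to 365 days back in time").

```python
import random
from datetime import date, timedelta


def draw_offset(rng):
    """Pick a per-patient offset in days (assumed range: -365 to -1)."""
    return -rng.randint(1, 365)


def shift_date(d, offset_days):
    """Shift one date by the patient's offset (negative = back in time)."""
    return d + timedelta(days=offset_days)


def deidentify_dates(facts, offsets):
    """Apply each patient's own offset to every date on that patient's record.

    facts: list of (patient_id, label, date); offsets: {patient_id: days}.
    """
    return [(pid, label, shift_date(d, offsets[pid])) for pid, label, d in facts]


facts = [
    ("patient-1", "birth", date(1968, 8, 13)),
    ("patient-1", "blood pressure", date(2012, 11, 28)),
]
offsets = {"patient-1": -15}  # the offset used in the slide's January example

shifted = deidentify_dates(facts, offsets)
for pid, label, d in shifted:
    print(pid, label, d.isoformat())
```

Because one offset is applied to all of a patient's dates, the interval between two events for that patient is unchanged, while calendar position (and so seasonality) is lost.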
Research Context: Medical Informatics Hypotheses
Hypothesis #1: Admin + Clinical -> Better Knowledge?
Hypothesis #2: Computer + Clinical Process-> Better Health?
• Motivation: Build a way to go beyond counting and obtain insight before you need a Data Use Agreement and release patient data.
– Grows out of Dan Connolly’s survival analysis tool for NCI site visit
– Intermediate step of a multi-cohort generalized survival plugin
– R Data Builder plugin in i2b2 and integration with RStudio Server
    • (http://www.rstudio.com/ide/docs/server/getting_started)
Emerging Functionality: From Data Aggregation to Hospital Quality Preliminary Analysis
• Test Case: Antibiotic Administration for Septic patients in the Emergency Room
– Past publication to bring in flowsheet data an important foundation
– University HealthSystem Consortium CDB “gold” standard for KU Hospital
– What can you solve in i2b2 “same financial encounter” versus send to R?
Repurposing i2b2 Clinical Research Infrastructure for Inpatient Quality Improvement
• i2b2 “largely” ambulatory or population/genomics focused
• Is i2b2 version 1.6 with same financial encounter and modifiers now useful for inpatient research?
• Goal: understand medication timing and antibiotic selection
• Suspect vancomycin preferred
• Validate HERON medications
– Especially administration timing
Systems Architecture
[Diagram labels, in extraction order]
• Identified data server: i2b2 compatible star schema; staged source data
• De-identified server: i2b2 compatible star schema
• Application server
• De-identification process, monthly refresh ETL
• Source System files (EMR dump, UHC CDB extract), secure FTP/ETL
• RStudio Server: R scripts, plots, statistics
• Investigator’s client: i2b2 web client in one browser tab, RStudio IDE web client in another tab
• i2b2 Hive
• rgate
R Data Builder Plugin and RStudio Server
Web based for user. Just another tab in the browser
All data stays on the server so there’s no data release and risk of re-identification due to a lost file
i2b2 Plugin invokes a program that creates an Rda file in the user’s directory on the server
UHC, Flowsheets, Medications data sources: what i2b2 could answer versus R analysis
[Flowchart boxes, in extraction order]
• 3513 patients had a UHC-defined septicemia diagnosis
• 2912 patients were an Emergency Admission
• 2861 patients were age 18 years or older
• 2722 patients had an exposure to an Antibiotic in the encounter
• 1839 had ED Triage documentation during the encounter
• A: 1244 patients had 1st antibiotic admin within 24 hours (1474 encounters)
• B: 993 had 1st antibiotic admin given in ED (1140 encounters)
• C: 316 had 1st antibiotic admin not in ED (334 encounters)
• 1836 had the Sepsis Screen used during the encounter
• D: 261 had 1st antibiotic admin before sepsis screening (277 encounters)
• E: 1040 had 1st antibiotic admin after sepsis screening (1197 encounters)
• 1223 had 2 SIRS criteria, organ dysfunction and suspicion/treatment of infection
• 717 MD notified
[Flowchart callouts]
• Average time spent in ED is 8.7 hours, median 7.6
• Average time in ED is 7.9 hours, median 7.1
• Average time spent in ED is 6.7 hours, median 6.6
• Average time to sepsis screening 2.9 hours, median 49 minutes
Note: 28 patients who lacked an ED departure time were excluded from further analysis
Cohorts above line defined with i2b2 (i2b2 could define cohort)
Cohorts below line further refined with R (cohort refinement with R)
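The kind of refinement done after i2b2 defines the cohort (time from arrival to first antibiotic, limited to a 24-hour window, then summarized by mean and median) can be sketched as below. The analysis on these slides was done in R. This Python version is only an illustration, the function names are hypothetical, and the encounter timestamps are made up.

```python
from datetime import datetime
from statistics import mean, median


def hours_between(start, end):
    return (end - start).total_seconds() / 3600.0


def first_admin_lag(arrival, admin_times):
    """Hours from arrival to the earliest later administration, or None."""
    later = [t for t in admin_times if t >= arrival]
    return hours_between(arrival, min(later)) if later else None


def summarize(lags, window_hours=24):
    """Count, mean, and median of the lags that fall within the window."""
    within = [x for x in lags if x is not None and x <= window_hours]
    if not within:
        return {"n": 0, "mean": None, "median": None}
    return {"n": len(within), "mean": mean(within), "median": median(within)}


# Toy encounters (hypothetical): (ED arrival, antibiotic administration times)
encounters = [
    (datetime(2012, 1, 1, 8, 0), [datetime(2012, 1, 1, 10, 0),
                                  datetime(2012, 1, 1, 14, 0)]),
    (datetime(2012, 1, 2, 20, 0), [datetime(2012, 1, 3, 1, 0)]),
    (datetime(2012, 1, 3, 9, 0), [datetime(2012, 1, 4, 15, 0)]),  # past 24 h
    (datetime(2012, 1, 4, 9, 0), []),                             # no admin
]

lags = [first_admin_lag(arrival, admins) for arrival, admins in encounters]
summary = summarize(lags)
print(lags, summary)
```

Working with elapsed hours rather than calendar dates also fits the de-identified data, since a per-patient date shift leaves intervals within a patient's record unchanged.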
Density Plots: Time from Arrival to First Antibiotic
[Figure: four density plots. x-axis: Hours (0 to 25); y-axis: Proportion of Encounters.]
Panel 1 (legend Drug: broad, vanc): Broad Spectrum versus Vancomycin
Panel 2 (legend Drug: broad, vanc): Lag in Broad Spectrum after Vancomycin
Panel 3 (legend When: in.ed, not.in): Lag when given outside Emergency Room
Panel 4 (legend Admin: before, after): Administration relative to RN Sepsis Screen
• REDCap registries into i2b2 allows intuitive exploration
– Researchers may need less abstraction as data is extracted from the EMR.
• i2b2 into REDCap: inherit security model, graphical/export tools
Aligning Clinical Research Informatics for Quality: Registry Abstraction and Data Delivery
• Informatics Research and Systems for Hypothesis #1
– Administrative plus Clinical/Biomedical provides better knowledge
– Current UHC models of administrative data based on linear regression
    • Want to reproduce UHC models with our data in HERON
– Then develop systematic method to evaluate utility of clinical data
    • Perhaps applicability of newer machine learning and statistical methods and methods for validation (ex: bootstrapping)
• Engage with Clinical Researchers and Hospital Quality
– Continue to harvest valuable data: microbiology discrete pathology results
– Advance streamlined methods for self service
    • Recognize though that data driven research is non-trivial and sometimes the effort is underestimated by investigators
• Harvest Epic alerts (best practice, drug interaction), Orderset Utilization to evaluate Hypothesis #2
– Computer + Clinical Process -> Improved Decisions and Better Health
Next Steps
Questions?