Unlocking a national adult cardiac surgery audit registry with R
DESCRIPTION
Following the Bristol Royal Infirmary heart scandal, the Society for Cardiothoracic Surgery in Great Britain & Ireland (SCTS) established a world-leading clinical registry to collect data on all adult cardiac surgery procedures. To date this registry contains >480,000 records and 163 fields. The data include patient demographics, comorbidities and clinical measurements, cardiac and operative details, and post-operative outcomes. We will describe examples of how R has been used recently to interrogate the SCTS registry and run a national governance programme for performance monitoring.

Understanding the data is vital to making decisions. The SCTS have recently used the googleVis package by Gesmann and de Castillo (2011) to visualize hospital- and surgeon-level data longitudinally as Google Motion Charts (SCTS, 2013a). This can be used to interrogate, for example, the risk-adjusted mortality rate of healthcare providers, whilst gaining an understanding of the variation due to sample size or inherent natural variability. It can also be used to understand the multivariate relationships between data; for example, is postoperative length of stay related to patient age and the number of operations performed by each hospital? This tool has already been the instigator of a number of clinical and care-quality investigations.

Monitoring performance of surgeons requires a broad portfolio of tools. First, statistical modelling tools, for example glm or glmer, are required to appropriately ‘risk-adjust’ outcomes. Second, functions to aggregate and summarize the data in different ways over healthcare providers are required. Finally, graphical tools are required to present the results as funnel plots and case mix charts to patients for scrutiny of their healthcare provision (SCTS, 2013b).

“Real-world” databases are messy; the SCTS registry is no exception. Cleaning data can be complicated, especially if there are interdependencies between data frame rows and columns. Synonyms and homonyms required homogenizing; numerical, temporal and clinical conflicts required resolving; and duplicate records required accurate identification and removal. Previously this was a terminal obstacle facing cardiac surgeons in their bid to unlock the potential of this data. A registry-specific R package has been written to fully automate the cleaning in a transparent and reproducible manner, thus enabling analyses of the data.

References
Gesmann M and de Castillo D (2011). googleVis: Interface between R and the Google Visualisation API. The R Journal 3, 40-44.
SCTS (2013a). Dynamic charts, http://www.scts.org/DynamicCharts.
SCTS (2013b). Performance reports, http://www.scts.org/patients/default.aspx.
TRANSCRIPT
Unlocking a national adult cardiac surgery audit registry with R
GL Hickey1,2,3, SW Grant2,3 & B Bridgewater1,2,3
1 Northwest Institute of BioHealth Informatics, University of Manchester
2 University Hospital of South Manchester
3 National Institute for Cardiovascular Outcomes Research, UCL
The R User Conference 2013
University of Castilla-La Mancha, Albacete, Spain
BACKGROUND
Bristol Inquiry
Contributory factors that led to the failings included:
1. Inadequate collection of data
2. Inadequate monitoring of data
National Adult Cardiac Surgery Audit registry
• Up to 166 clinical variables collected on each patient: administrative, demographics, comorbidities, operative factors, outcomes
• 15 years of data
• 465,000 records
• 44 hospitals + >400 consultant surgeons
Flow of data
[Diagram labels: HOSPITALS; DATABASE; CLEANING; ANALYSES; NICOR; NIBHI; AUDIT & GOVERNANCE TOOLS; CLINICAL RESEARCHERS; RESEARCH; NATIONAL DEATH REGISTER*]
* Ability to link with many other national registries
UNLOCKING THE REGISTRY: MESSY DATA
Cleaning the registry in R
DATA EXTRACT → cleaning scripts (VARIABLE 1, VARIABLE 2, VARIABLE 3, …) → EXCLUDE RECORDS → ADD VALUE → CLEANED DATA
• One script per variable
• Some dependencies between scripts
• Exclude records: e.g. duplicates
• Scripts to add: risk scores, combined variables, ‘resolved’ conflicting variables
• Rapidly reproducible
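The staged clean described above can be sketched in base R as a list of small functions applied in a fixed order. This is only an illustrative sketch, not the registry-specific package: the data frame, its column names (`patient_id`, `op_date`, `age`) and the cleaning rules are all invented for the example.

```r
# Toy extract; column names and values are invented for illustration.
extract <- data.frame(
  patient_id = c(1, 2, 2, 3, 4),
  op_date    = as.Date(c("2010-01-05", "2010-02-10", "2010-02-10",
                         "2011-03-15", "2011-07-01")),
  age        = c(67, 72, 72, 190, 81),
  stringsAsFactors = FALSE
)

# One small cleaning function per variable (hypothetical rule).
clean_age <- function(d) {
  d$age[d$age < 18 | d$age > 110] <- NA  # implausible adult ages become missing
  d
}

# Exclusion step: drop repeated patient / operation-date pairs.
drop_duplicates <- function(d) {
  d[!duplicated(d[, c("patient_id", "op_date")]), , drop = FALSE]
}

# Value-adding step: derive a combined variable from a cleaned one.
add_octogenarian <- function(d) {
  d$octogenarian <- !is.na(d$age) & d$age >= 80
  d
}

# Apply the steps in a fixed order so the whole clean can be re-run at will.
steps   <- list(clean_age, drop_duplicates, add_octogenarian)
cleaned <- Reduce(function(d, f) f(d), steps, extract)
print(cleaned)
```

Keeping each rule in its own function makes the dependencies explicit: here `add_octogenarian` must run after `clean_age`, which is the kind of ordering constraint the slide alludes to.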
> with(SCTS, table(X4.04.Discharge.Destination, X4.05.Status.at.Discharge))
                                              X4.05.Status.at.Discharge
X4.04.Discharge.Destination                           0. Alive  1. Dead
                                                 828     48296     2453
  . Another dept within the trust                  0        57        0
  0                                                1         1        0
  0. Not applicable - patient deceased             0         0        1
  1 Home                                           0      4104        0
  1. Home                                        674    370763      374
  2 Convalescence                                  0        63        0
  2. Convalescence                                 8      7347        4
  2. Convalescence (Non acute Hospital)            2      2164        0
  3 Other hospital                                 0         1        0
  3 Other Hospital                                 0       151        0
  3 Other Hospital - wd 6                          0         1        0
  3 Other Hospital wd 2                            0         1        0
  3 Other ward                                     0         1        0
  3. Other Acute hospital                          1      7680        1
  3. Other hospital                              115     22935       37
  4 Patient deceased                               0         0      173
  4. Not applicable - patient deceased            51       412    13286
  4. Patient Deceased                              0         0       19
  5                                                0         7        0
  5. Transferred to different Consultant - NGH     0        42        0
  7                                                0         2        0
  8                                                0        38        4
  9                                              114      3820      518
  Second op                                        0         2        6
Illegal options
Transcriptional discrepancies
Missing data
Conflicts
• Errors are difficult to find and not all can be resolved
• Excluding all imperfect data is not an option
• Balance between a ‘research ready’ dataset and robust audit capability
• Needs to be reproducible
• Without cleaning, the registry stays locked to clinicians & researchers
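The discharge-destination table shows synonyms, illegal options, missing values and conflicts side by side. A minimal sketch of one way such a field might be homogenised and cross-checked follows. The mapping from leading code digit to label and the rule that any other value is treated as missing are assumptions made for this example, not the rules used in the registry package; the input vectors are a handful of invented records using labels of the kind seen in the table.

```r
# A few invented records using labels in the style of the table above.
dest <- c("1 Home", "1. Home", "2 Convalescence",
          "2. Convalescence (Non acute Hospital)", "3 Other Hospital wd 2",
          "3. Other Acute hospital", "4 Patient deceased",
          "4. Patient Deceased", "9", "", "Second op")
status <- c("0. Alive", "0. Alive", "0. Alive", "0. Alive", "0. Alive",
            "0. Alive", "1. Dead", "0. Alive", "0. Alive", "1. Dead", "1. Dead")

# Assumed rule: a leading digit 1-4 followed by "." or a space identifies the
# category; everything else (blank, illegal code, free text) becomes NA.
has_code <- grepl("^[1-4]([. ]|$)", dest)
code     <- ifelse(has_code, substr(dest, 1, 1), NA_character_)

labels_by_code <- c("1" = "Home", "2" = "Convalescence",
                    "3" = "Other hospital", "4" = "Patient deceased")
clean_dest <- unname(labels_by_code[code])

# Flag conflicts between destination and discharge status for expert review.
dead     <- status == "1. Dead"
conflict <- !is.na(clean_dest) & ((clean_dest == "Patient deceased") != dead)

print(table(clean_dest, useNA = "ifany"))
print(data.frame(dest, status, clean_dest, conflict)[conflict, ])
```

Flagging rather than silently overwriting conflicting records keeps the decision with the clinical experts, in line with the warning on the following slide.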
Cleaning the registry in R
Warning: cleaning clinical registries without experts is dangerous*
* Applies to analysing healthcare data also
UNLOCKING THE REGISTRY: MONITORING
Publication of named healthcare provider outcomes
http://www.scts.org/patients/
Publication of named healthcare provider outcomes
FILTER DATA: subset
RISK ADJUSTMENT: glm, glmer {lme4}, mfp {mfp}, predict, auc {pROC}
CLASSIFICATION & PRESENTATION: ggplot {ggplot2}, write.csv
AGGREGATION: summaryBy {doBy}, merge, arrange {plyr}
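The monitoring stages can be sketched end to end on simulated data. This is a rough illustration under stated assumptions, not the SCTS governance code: the data, the two-covariate risk model and the column names are invented, and base `aggregate` stands in for `summaryBy {doBy}` so that the example needs no extra packages. The order follows the abstract: model, then aggregate, then present.

```r
set.seed(1)
n <- 2000
# Simulated operations; every value here is synthetic.
ops <- data.frame(
  hospital = sample(paste0("H", 1:5), n, replace = TRUE),
  year     = sample(2010:2012, n, replace = TRUE),
  age      = round(rnorm(n, 68, 10)),
  urgent   = rbinom(n, 1, 0.3)
)
ops$died <- rbinom(n, 1, plogis(-6 + 0.04 * ops$age + 0.8 * ops$urgent))

# 1. Filter the analysis window.
recent <- subset(ops, year >= 2011)

# 2. Risk adjustment: logistic model giving each patient an expected risk.
fit <- glm(died ~ age + urgent, family = binomial, data = recent)
recent$expected <- predict(fit, type = "response")

# 3. Aggregation: observed and expected deaths per hospital.
by_hosp <- aggregate(cbind(died, expected) ~ hospital, data = recent, FUN = sum)
by_hosp$n <- as.vector(table(recent$hospital)[as.character(by_hosp$hospital)])
by_hosp$oe_ratio      <- by_hosp$died / by_hosp$expected
by_hosp$risk_adj_rate <- by_hosp$oe_ratio * mean(recent$died)

# 4. Presentation: write a summary table that a plotting step could consume.
out_file <- file.path(tempdir(), "hospital_summary.csv")
write.csv(by_hosp, out_file, row.names = FALSE)
print(by_hosp)
```

The risk-adjusted rate used here is the observed/expected ratio scaled by the overall mortality, one common convention; a funnel plot would then plot it against `n` with control limits.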
Exploratory analyses
http://www.scts.org/DynamicCharts/
summaryBy {doBy} + gvisMotionChart {googleVis}
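A hedged sketch of the summarise-then-chart pattern follows. The data are simulated and the column names invented; base `aggregate` and `merge` stand in for `summaryBy {doBy}`. The `gvisMotionChart` call is only attempted if googleVis is installed, and motion charts relied on Flash, so the chart may not render in current browsers.

```r
set.seed(2)
n <- 600
# Simulated operations; all values are synthetic.
ops <- data.frame(
  hospital = sample(c("H1", "H2", "H3"), n, replace = TRUE),
  year     = sample(2008:2012, n, replace = TRUE),
  age      = rnorm(n, 68, 10),
  los      = rexp(n, rate = 1 / 8),   # post-operative length of stay, days
  died     = rbinom(n, 1, 0.03)
)

# One row per hospital per year: mean age, mean stay, crude mortality, volume.
means  <- aggregate(cbind(age, los, died) ~ hospital + year, data = ops, FUN = mean)
counts <- aggregate(died ~ hospital + year, data = ops, FUN = length)
names(counts)[names(counts) == "died"] <- "operations"
yearly <- merge(means, counts, by = c("hospital", "year"))

# Build the motion chart only when the googleVis package is available.
if (requireNamespace("googleVis", quietly = TRUE)) {
  chart <- googleVis::gvisMotionChart(yearly, idvar = "hospital", timevar = "year")
  # plot(chart) would open the chart in a web browser.
}
print(head(yearly))
```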
Monitoring medical devices
• Currently does not happen in UK
• Data: 200 valve types entered 13,000 ways (free text)
• But R is good with regular expressions
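As an illustration of the regular-expression point, free-text valve entries can be mapped to canonical labels with `grepl`. The entries and the two patterns below are invented for this sketch; a real lookup covering some 200 valve types would need clinical input for each pattern.

```r
# Invented free-text entries; the spellings are illustrative, not registry data.
valve_text <- c("CE Perimount 23mm", "carpentier edwards perimount",
                "Perimount Magna", "St Jude mech", "SJM Regent 21",
                "st. jude medical", "unknown")

# Hypothetical lookup: canonical label -> case-insensitive pattern.
patterns <- c(
  "Carpentier-Edwards Perimount" = "perimount",
  "St Jude Medical"              = "st\\.? ?jude|\\bsjm\\b"
)

# First matching pattern wins; unmatched entries stay NA for manual review.
canonical <- rep(NA_character_, length(valve_text))
for (label in names(patterns)) {
  hit <- grepl(patterns[[label]], valve_text, ignore.case = TRUE, perl = TRUE)
  canonical[hit & is.na(canonical)] <- label
}

print(data.frame(valve_text, canonical))
```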
UNLOCKING THE REGISTRY: RESEARCH
Evidence based medicine
Octogenarians having Mitral Valve Surgery ± CABG ± TV repair over 10-year window
survfit + Surv {survival}, kmplot {by Tatsuki Koyama}
Mean 4 patients per unit / year
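A minimal Kaplan-Meier sketch with `survfit` and `Surv` is shown below. The cohort is simulated and the grouping variable `concomitant_cabg` is an invented stand-in; nothing here reproduces the octogenarian analysis itself.

```r
library(survival)  # a recommended package, included with standard R installs

set.seed(3)
n <- 200
# Simulated cohort; all values are synthetic.
pts <- data.frame(
  concomitant_cabg = rbinom(n, 1, 0.5),
  time_to_death    = rexp(n, rate = 0.15)   # years
)
# Administrative censoring at the end of a 10-year window.
pts$followup <- pmin(pts$time_to_death, 10)
pts$dead     <- as.integer(pts$time_to_death <= 10)

# Kaplan-Meier estimate stratified by the grouping variable.
km <- survfit(Surv(followup, dead) ~ concomitant_cabg, data = pts)
print(summary(km, times = c(1, 5)))

# plot(km, lty = 1:2, xlab = "Years since surgery", ylab = "Survival")
# would draw the curves with base graphics.
```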
Contemporary statistical methodology for retrospective data
matchit {MatchIt}
[Figure: probability of receiving a mechanical valve, shown for the mechanical valve and biological valve groups in two panels]
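The quantity on the figure's axis, the probability of receiving a mechanical valve, is a propensity score. A hedged sketch of estimating it with `glm`, and then matching with `matchit` when MatchIt is installed, is given below. The data, the covariates and the assumed age effect are invented for the example.

```r
set.seed(4)
n <- 500
# Simulated patients; all values are synthetic.
pts <- data.frame(
  age    = round(rnorm(n, 65, 10)),
  female = rbinom(n, 1, 0.4)
)
# Assumption for the simulation: younger patients more often get a mechanical valve.
pts$mechanical <- rbinom(n, 1, plogis(6 - 0.1 * pts$age))

# Propensity score: modelled probability of receiving a mechanical valve.
ps_model   <- glm(mechanical ~ age + female, family = binomial, data = pts)
pts$pscore <- fitted(ps_model)

# Compare the score distributions between the two valve groups.
print(tapply(pts$pscore, pts$mechanical, summary))

# Nearest-neighbour matching on the same formula, if MatchIt is available.
if (requireNamespace("MatchIt", quietly = TRUE)) {
  m       <- MatchIt::matchit(mechanical ~ age + female, data = pts, method = "nearest")
  matched <- MatchIt::match.data(m)
  print(table(matched$mechanical))
}
```

Before matching, the two groups' scores typically sit apart; after matching they should overlap closely, which is what a two-panel plot of this kind is usually checking.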
Risk prediction: status quo
[Figure: mortality proportion, annotated with Ratio = 0.37 and Ratio = 0.73]
[Figure: mortality (axis from 2% to 10%) against date of surgery]
Risk prediction: with R
+
logistic.dma {dma}
CONCLUSIONS
Conclusions
• We need to unlock healthcare registries to:
    • Monitor quality & avoid a repeat of Bristol
    • Revalidation of professional credentials
    • Facilitate patient choice
    • Develop & validate evidence based medicine
    • Increase in demand
• We can do it all in R!
Acknowledgements
• Funded by Heart Research UK [Grant Number RG2583]
• Comments & suggestions: Dr Norman Stein, North West e-Health