Medical Data Mining
Lars Juhl Jensen
unstructured data
structured data
Jensen et al., Nature Reviews Genetics, 2012
individual hospitals
central registries
opt-out
opt-in
Danish registries
civil registration system
CPR number
established in 1968
Jensen et al., Nature Reviews Genetics, 2012
national discharge registry
14 years
6.2 million patients
45 million admissions
68 million records
119 million diagnoses
ICD-10
Jensen et al., Nature Reviews Genetics, 2012
reimbursement
not research
diagnosis trajectories
naïve approach
comorbidity
Jensen et al., Nature Reviews Genetics, 2012
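The naïve approach to comorbidity can be illustrated by comparing how often two diagnoses co-occur with how often they would co-occur if they were independent. The following is a minimal sketch under stated assumptions, not the method used in the cited work: the function name `naive_relative_risk` and the toy patients are hypothetical, and no confounders are accounted for.

```python
from collections import Counter
from itertools import combinations


def naive_relative_risk(patients):
    """Observed / expected co-occurrence for every diagnosis pair.

    patients: dict mapping a patient id to the set of ICD-10 codes
    recorded for that patient. "Expected" assumes the two diagnoses
    are independent, which ignores age, gender and other confounders.
    """
    n = len(patients)
    single = Counter()
    pair = Counter()
    for codes in patients.values():
        single.update(codes)
        pair.update(combinations(sorted(codes), 2))
    return {
        (a, b): observed / (single[a] * single[b] / n)
        for (a, b), observed in pair.items()
    }


# Toy data, invented for illustration only.
toy_patients = {
    1: {"E11", "I10"},
    2: {"E11", "I10"},
    3: {"I10"},
    4: {"J45"},
}
rr = naive_relative_risk(toy_patients)
```

A ratio above 1 only says the pair is seen together more often than independence predicts. It says nothing about why.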
confounding factors
“known knowns”
gender
age
type of hospital encounter
Jensen et al., submitted, 2014
“known unknowns”
smoking
diet
“unknown unknowns”
reporting biases
matched controls
temporal correlation
Jensen et al., Nature Communications, 2014
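Matched controls and temporal correlation can be sketched as two small steps: pick comparison patients who share the known confounders (gender, age, type of hospital encounter), then ask whether one diagnosis precedes the other more often than chance. This is a sketch under assumptions, not the published procedure: the field names, the ten-year age bins, the encounter labels and the one-sided binomial test are all choices made here for illustration.

```python
import random
from math import comb


def sample_matched_controls(case, population, k, rng):
    """Pick up to k controls with the same gender, age decade and encounter type."""
    key = (case["gender"], case["age"] // 10, case["encounter"])
    pool = [
        p for p in population
        if p["id"] != case["id"]
        and (p["gender"], p["age"] // 10, p["encounter"]) == key
    ]
    return rng.sample(pool, min(k, len(pool)))


def direction_p_value(n_a_first, n_b_first):
    """One-sided binomial test: P(X >= n_a_first) when both orders are equally likely."""
    n = n_a_first + n_b_first
    return sum(comb(n, i) for i in range(n_a_first, n + 1)) / 2 ** n


# Toy population, invented for illustration only.
population = [
    {"id": 1, "gender": "F", "age": 63, "encounter": "inpatient"},
    {"id": 2, "gender": "F", "age": 67, "encounter": "inpatient"},
    {"id": 3, "gender": "M", "age": 65, "encounter": "inpatient"},
    {"id": 4, "gender": "F", "age": 61, "encounter": "outpatient"},
    {"id": 5, "gender": "F", "age": 45, "encounter": "inpatient"},
]
controls = sample_matched_controls(population[0], population, k=3, rng=random.Random(0))

# Diagnosis A came first in 9 patients, diagnosis B first in 1 patient.
p_direction = direction_p_value(9, 1)
```

Matching only handles the "known knowns". Smoking, diet and reporting biases are not visible in the matching key and stay uncontrolled.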
trajectories
Jensen et al., Nature Communications, 2014
trajectory networks
Jensen et al., Nature Communications, 2014
key diagnoses
Jensen et al., Nature Communications, 2014
direct medical implications
electronic health records
structured data
Jensen et al., Nature Reviews Genetics, 2012
unstructured data
free text
Danish
busy doctors
typos
psychiatric patients
delusions
heavily medicated
Eriksson et al., Drug Safety, 2014
text mining
dictionary-based method
diseases
drugs
adverse drug reactions
expansion rules
typos
“negative modifiers”
negations
delusions
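A dictionary-based tagger with typo tolerance and negative modifiers can be sketched in a few lines. Everything specific here is an assumption: the notes in the source are in Danish, whereas this toy uses English terms; the dictionary entries, the modifier list, the three-token window and the use of `difflib` for fuzzy matching are stand-ins, not the expansion rules of the actual system.

```python
import re
from difflib import get_close_matches

# Hypothetical single-word dictionary mapping terms to identifiers.
DICTIONARY = {
    "nausea": "ADR:nausea",
    "headache": "ADR:headache",
    "psychosis": "DIS:psychosis",
}
# Hypothetical modifiers that negate a following term.
NEGATIVE_MODIFIERS = {"no", "not", "denies", "without"}


def tag_note(text, window=3, cutoff=0.85):
    """Return (identifier, negated) for each dictionary term found in text.

    Fuzzy matching tolerates small typos. A hit is marked negated when a
    negative modifier occurs within `window` tokens before it.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        match = get_close_matches(token, list(DICTIONARY), n=1, cutoff=cutoff)
        if not match:
            continue
        preceding = tokens[max(0, i - window):i]
        negated = any(t in NEGATIVE_MODIFIERS for t in preceding)
        hits.append((DICTIONARY[match[0]], negated))
    return hits


hits = tag_note("Patient denies nausea but reports severe headach")
```

The same negated flag could be used for statements attributed to delusions, given a suitable modifier list. That extension is not shown here.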
detailed disease profiles
Roque et al., PLOS Computational Biology, 2011
Assigned codes
Text-mined codes
pharmacovigilance
structured data
medication
semi-structured data
drug indications
known ADRs
unstructured data
adverse drug reactions
temporal correlation
Eriksson et al., Drug Safety, 2014
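The combination of data types above can be sketched as a simple temporal filter: a text-mined term becomes a candidate adverse event only if it is noted while the patient is on the drug and is not one of the drug's indications. This is a hedged illustration, not the pipeline from the cited paper; the function name, the tuple layouts and the toy records are hypothetical.

```python
from datetime import date


def candidate_adverse_events(prescriptions, mentions, indications):
    """Pair each drug with terms noted during exposure, excluding its indications.

    prescriptions: list of (drug, start_date, end_date) from structured data.
    mentions:      list of (term, date_noted) from text-mined notes.
    indications:   dict mapping a drug to the set of terms it is prescribed for.
    """
    candidates = []
    for drug, start, end in prescriptions:
        excluded = indications.get(drug, set())
        for term, noted_on in mentions:
            if start <= noted_on <= end and term not in excluded:
                candidates.append((drug, term, noted_on))
    return candidates


# Toy records, invented for illustration only.
prescriptions = [("citalopram", date(2010, 3, 1), date(2010, 6, 30))]
mentions = [
    ("depression", date(2010, 3, 5)),   # indication, excluded
    ("nausea", date(2010, 3, 20)),      # during exposure, kept
    ("nausea", date(2010, 1, 10)),      # before exposure, excluded
]
indications = {"citalopram": {"depression"}}
candidates = candidate_adverse_events(prescriptions, mentions, indications)
```

A list of known ADRs could be applied in the same way to separate already documented reactions from potentially new ones.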
known ADRs
ADR frequencies
Eriksson et al., Drug Safety, 2014
new ADRs
Drug substance | ADE | p-value
Chlordiazepoxide | Nystagmus | 4.0e-8
Simvastatin | Personality changes | 8.4e-8
Dipyridamole | Visual impairment | 4.4e-4
Citalopram | Psychosis | 8.8e-4
Bendroflumethiazide | Apoplexy | 8.5e-3
Eriksson et al., Drug Safety, 2014
Acknowledgments
Disease trajectories
Anders Bøck Jensen
Tudor Oprea
Pope Moseley
Søren Brunak
Adverse drug reactions
Robert Eriksson
Thomas Werge
Søren Brunak
EHR text mining
Peter Bjødstrup Jensen
Robert Eriksson
Henriette Schmock
Francisco S. Roque
Anders Juul
Marlene Dalgaard
Massimo Andreatta
Sune Frankild
Eva Roitmann
Thomas Hansen
Karen Søeby
Søren Bredkjær
Thomas Werge
Søren Brunak