Using Electronic Medical Records for Research: Practical Issues and Implementation Hurdles

Prakash M. Nadkarni MD

Upload: peter-lester

Post on 22-Dec-2015


Page 1: Using Electronic Medical Records for Research: Practical Issues and Implementation Hurdles

Prakash M. Nadkarni MD

Page 2: Benefits of EMRs

- Most of the data that you want is often in the EMR
  - Sample size analyses
  - Cohort identification/recruitment
  - Detail data
- You can implement many research-related workflows
  - Appointment scheduling enables interventions at the patient's convenience.

Page 3: EMRs don't do everything

- Even Epic warns you about the need to interoperate with software designed specifically for clinical research (CRIS = Clinical Research Information System).
- Even CRISs are sub-specialized: project management/finance, grant management workflows, federal paperwork (FDA Investigational New Drug applications), general or specialized data capture (e.g., patient diaries, adaptive questionnaires).

Page 4: Challenge: No Study Calendar

- Not all patients are enrolled at the same time.
- Specific evaluations or interventions are done at specific time points ("events") relative to the start of participation in the study (or some arbitrary point, e.g., working backwards from a scheduled MRI scan).
- Each time point may have a permissible range or window (e.g., a "6-month follow-up" may occur between 5 and 7 months).
- Given a protocol/study calendar, a CRIS will *generate* a provisional patient calendar.
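The calendar generation described on this slide can be sketched in a few lines. This is a minimal illustration, not how any particular CRIS implements it; the `ProtocolEvent` class, the `provisional_calendar` function, and the example protocol (day offsets and window widths) are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ProtocolEvent:
    """One time point in a (hypothetical) study protocol."""
    name: str
    target_day: int      # days after the patient's start of participation
    window_before: int   # days the visit may occur early
    window_after: int    # days the visit may occur late


def provisional_calendar(start: date, protocol: list[ProtocolEvent]) -> list[dict]:
    """Project the protocol onto one patient's own start date."""
    rows = []
    for ev in protocol:
        target = start + timedelta(days=ev.target_day)
        rows.append({
            "event": ev.name,
            "target": target,
            "earliest": target - timedelta(days=ev.window_before),
            "latest": target + timedelta(days=ev.window_after),
        })
    return rows


# Illustrative protocol only: the offsets and windows are assumptions.
PROTOCOL = [
    ProtocolEvent("Baseline", target_day=0, window_before=0, window_after=7),
    ProtocolEvent("6-month follow-up", target_day=180, window_before=30, window_after=30),
]

calendar = provisional_calendar(date(2015, 1, 15), PROTOCOL)
for row in calendar:
    print(row["event"], row["earliest"], row["target"], row["latest"])
```

Because every date is computed relative to the individual start date, patients enrolled at different times each get their own calendar from the same protocol.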

Page 5: Study Calendar (2)

- The protocol is worked out based on information yield of the evaluation and expected rate of change in the parameters evaluated, evaluation cost and patient risk. An Event-CRF (case report form) Cross-Table enforces consistency.
- CRISs use "Unscheduled" events to deal with emergency conditions.
- An entire set of reports is calendar-driven, e.g., scheduled events, missing forms, out-of-range visits.
- In Epic, the closest to Calendar functionality is the Chemotherapy module (Beacon).
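Two of the calendar-driven reports mentioned here (missing forms and out-of-range visits) reduce to simple lookups once an Event-CRF cross-table and visit windows exist. The sketch below is an assumption-laden illustration: the `EVENT_CRF` table, the form names, and both helper functions are hypothetical.

```python
from datetime import date

# Hypothetical Event-CRF cross-table: the forms expected at each event.
EVENT_CRF = {
    "Baseline": {"Demographics", "Vitals"},
    "6-month follow-up": {"Vitals", "Adverse Events"},
}


def missing_forms(event, forms_received):
    """Forms the cross-table expects for this event that were not received."""
    return sorted(EVENT_CRF[event] - set(forms_received))


def is_out_of_range(visit_date, earliest, latest):
    """True when the actual visit fell outside the permitted window."""
    return not (earliest <= visit_date <= latest)


missing = missing_forms("6-month follow-up", ["Vitals"])
late = is_out_of_range(date(2015, 9, 1), date(2015, 6, 14), date(2015, 8, 13))
on_time = is_out_of_range(date(2015, 7, 20), date(2015, 6, 14), date(2015, 8, 13))
print(missing, late, on_time)
```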

Page 6: Non-adherence to Standards

- If a vendor ignores national/international controlled terminology standards, data pooling in cross-institutional collaborations is difficult
  - For procedures, Epic does not use Current Procedural Terminology (CPT). Instead, procedures are identified by idiosyncratic abbreviations created by hurried users, which are hard to interpret except by those users, and which vary across institutions.

Page 7: Standards Challenges (2)

- Of the 15,000 laboratory tests in our instance of Epic, only about 8% have currently been mapped to the Logical Observation Identifiers Names and Codes (LOINC) vocabulary.
- Sometimes the same procedure or lab test is defined more than once in a master table
  - The definitions are unhelpful, and one must look at the actual data to determine which are used, e.g., a histogram showing the number of tests performed over a period of time, and the maximum and minimum values.
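One way to look at the actual data behind duplicate definitions is to profile the results filed under each code: how many there are, over what period, and the range of values. The sketch below assumes a simplified result layout of (test code, ISO date, value); the codes and numbers are invented for illustration and do not come from any real Epic table.

```python
# Hypothetical result rows: (test_code, result_date as ISO string, numeric value).
RESULTS = [
    ("GLU_OLD", "2009-03-01", 92.0),
    ("GLU_OLD", "2009-11-17", 110.0),
    ("GLU", "2014-01-05", 88.0),
    ("GLU", "2014-06-30", 140.0),
    ("GLU", "2015-02-11", 101.0),
]


def profile_by_code(rows):
    """Summarize usage of each test code: count, value range, first and last date."""
    stats = {}
    for code, day, value in rows:
        s = stats.setdefault(
            code, {"n": 0, "min": value, "max": value, "first": day, "last": day}
        )
        s["n"] += 1
        s["min"] = min(s["min"], value)
        s["max"] = max(s["max"], value)
        # ISO-formatted dates sort correctly as plain strings.
        s["first"] = min(s["first"], day)
        s["last"] = max(s["last"], day)
    return stats


profile = profile_by_code(RESULTS)
for code, s in sorted(profile.items()):
    print(code, s)
```

A code whose last result is years old is a candidate for a retired definition, while similar value ranges suggest the two codes measure the same thing.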

Page 8: Redundancy and heterogeneity

- The data may have been stored more than once, and in different ways, in different parts of the medical record
  - BMI is recorded in two different places.
- "Uncontrolled" local terminologies
  - Flowsheets where blood pressure is recorded redundantly as text "124/82". (Not in UIHC, fortunately.)
  - Procedure and lab definition lists are also semi-controlled.
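When blood pressure arrives as text such as "124/82", it has to be split into numeric systolic and diastolic values before analysis. A minimal sketch, assuming the simple "systolic/diastolic" form shown on the slide; the pattern and function name are illustrative, and real flowsheet text may need more cases.

```python
import re

# Two or three digits, a slash, two or three digits; surrounding spaces tolerated.
BP_PATTERN = re.compile(r"^\s*(\d{2,3})\s*/\s*(\d{2,3})\s*$")


def parse_blood_pressure(text):
    """Return (systolic, diastolic) as integers, or None if the text does not match."""
    match = BP_PATTERN.match(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_blood_pressure("124/82"))
print(parse_blood_pressure("not taken"))
```

Returning `None` rather than raising keeps unparseable entries visible so they can be counted and reviewed instead of silently dropped.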

Page 9: Duplicate Elements

- Pseudo-redundancy: subtly different data elements that are given the same label in the user interface
  - Baby's birth weight is recorded both at the time of delivery and at the time of admission to a NICU. The two are not semantically the same: with interventions, the former may be significantly more (or less) than the latter.

Page 10: "Wrong" structure

- Much data (discharge summaries, etc.) is stored as text, requiring human abstraction or natural language processing (NLP).
- NLP is not 100% accurate, requiring sensitivity and specificity to be traded off. It is especially hard with progress notes that are replete with abbreviations and that may have little grammatical structure.
- Much of the published NLP work relies on idiosyncrasies of a particular dataset (e.g., the use of Epic templates) to achieve higher accuracy, and is not always generalizable.
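The sensitivity/specificity trade-off can be made concrete with a toy example. Assume an NLP classifier that emits a confidence score per note, plus a human-abstracted gold standard; the scores, labels, and thresholds below are invented purely for illustration.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Compute (sensitivity, specificity) when notes scoring >= threshold are flagged.

    Assumes the labels contain at least one positive and one negative case.
    """
    tp = fn = tn = fp = 0
    for score, is_positive in zip(scores, labels):
        flagged = score >= threshold
        if is_positive and flagged:
            tp += 1
        elif is_positive:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented classifier scores and gold-standard labels (True = condition present).
SCORES = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
LABELS = [True, True, False, True, False, False]

strict = sensitivity_specificity(SCORES, LABELS, threshold=0.7)
lenient = sensitivity_specificity(SCORES, LABELS, threshold=0.2)
print("strict threshold:", strict)
print("lenient threshold:", lenient)
```

With these numbers the strict threshold misses one true case but raises no false alarms, while the lenient threshold catches every true case at the cost of two false positives.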

Page 11: The Needle in the Haystack

- Epic schema contains several thousand tables; many unused, or with empty fields.
- Incomplete or out-of-date documentation.
- The first time, one may spend more time locating a particular data element than actually pulling it out.
- Persons doing data extraction need to add value by providing signposts and tips, to help others who have to do the same task later.
- Even with a data warehouse, this problem will recur as long as data definitions are suboptimal.
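One kind of signpost an extractor can leave behind is a fill-rate profile showing which columns of a table are actually populated. The sketch below uses an in-memory SQLite database as a stand-in; the table `demo_vitals` and its columns are hypothetical, and a real EMR reporting database would need its own connection and, for large tables, aggregate queries instead of fetching every row.

```python
import sqlite3


def column_fill_rates(conn, table):
    """Fraction of rows in which each column holds a non-null, non-empty value."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchall()
    total = len(rows)
    rates = {}
    for i, column in enumerate(columns):
        filled = sum(1 for row in rows if row[i] is not None and row[i] != "")
        rates[column] = filled / total if total else 0.0
    return rates


# Hypothetical table standing in for one of several thousand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_vitals (pat_id INTEGER, bmi REAL, legacy_note TEXT)")
conn.executemany(
    "INSERT INTO demo_vitals VALUES (?, ?, ?)",
    [(1, 22.5, None), (2, None, None), (3, 31.0, None), (4, None, None)],
)

rates = column_fill_rates(conn, "demo_vitals")
print(rates)
```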

Page 12: Real-time cohort identification must be done judiciously

- "Best Practice Alerts" can be a resource drain on responsiveness of systems.
- Do you really need real-time subject identification? Would a 24-hour delay be acceptable?
  - ICU-related clinical studies; transfusion in preemies.

Page 13: Transforming the Data

- The form in which data is recorded in the EMR is not necessarily the form in which it is most conveniently analyzed or reported.
- Registries often require creating derived variables
  - Converting numerical data into categories, e.g., binning children by birth weight
  - Converting numeric values or existence/absence of data into Yes/No: Is the bilirubin > 5 mg/dl? Did the neonate receive nitric oxide inhalation for pulmonary hypertension?
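The derived variables named on this slide can each be written as a small function. The sketch is illustrative: the birth-weight cut-offs follow the commonly used ELBW/VLBW/LBW thresholds (1000 g, 1500 g, 2500 g), but a given registry may define its bins differently, and the function names and the order-text matching are assumptions.

```python
def birth_weight_category(grams):
    """Bin birth weight into categories (cut-offs are illustrative)."""
    if grams < 1000:
        return "ELBW"    # extremely low birth weight
    if grams < 1500:
        return "VLBW"    # very low birth weight
    if grams < 2500:
        return "LBW"     # low birth weight
    return "Normal"


def bilirubin_above_5(mg_dl):
    """Numeric value to Yes/No; None (not measured) stays None."""
    if mg_dl is None:
        return None
    return "Yes" if mg_dl > 5 else "No"


def received_nitric_oxide(order_descriptions):
    """Existence/absence of data to Yes/No, based on hypothetical order text."""
    found = any("nitric oxide" in text.lower() for text in order_descriptions)
    return "Yes" if found else "No"


print(birth_weight_category(1320))
print(bilirubin_above_5(6.2))
print(received_nitric_oxide(["Inhaled Nitric Oxide 20 ppm", "Caffeine citrate"]))
```

Keeping "not measured" distinct from "No" matters for the bilirubin flag, since an absent lab result is not evidence of a normal value.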

Page 14: Interfacing with statistical software

- Before: sample size, randomization
- After: analysis, fitting to models
- Some CRISs (e.g., REDCap, TrialDB) will output SAS/SPSS-formatted data files, with definitions for all variables (including enumerations for all categorical variables; SAS has a procedure called PROC FORMAT for categorical data). EMRs still lag.
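To show what an exported enumeration for a categorical variable looks like, the sketch below turns a code-to-label mapping into a SAS `PROC FORMAT` step. It is not the export code of REDCap or TrialDB; the function name, the format name `bwcat`, and the mapping are hypothetical.

```python
def proc_format(format_name, enumeration):
    """Render a numeric code -> label mapping as a SAS PROC FORMAT step."""
    lines = ["proc format;", f"  value {format_name}"]
    for code, label in sorted(enumeration.items()):
        escaped = label.replace("'", "''")   # SAS doubles quotes inside literals
        lines.append(f"    {code} = '{escaped}'")
    lines.append("  ;")
    lines.append("run;")
    return "\n".join(lines)


# Hypothetical enumeration for a birth-weight category variable.
BW_CATEGORIES = {1: "ELBW", 2: "VLBW", 3: "LBW", 4: "Normal"}

sas_code = proc_format("bwcat", BW_CATEGORIES)
print(sas_code)
```

Shipping the format definition alongside the data file means the analyst sees labels rather than bare codes without re-typing the enumeration by hand.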

Page 15: Data Warehouse

- A database that is optimized for fast query, preferably by end-users, without interactive updates
- Solves some problems, but not others
  - More homogeneous structure, i.e., a handful of tables rather than thousands.
  - However, the problem of locating variables of interest doesn't go away. With indifferent documentation of the variables, the problem of hunting for variables of interest is transferred from the concierge/analyst to the end-user, which may worsen the problem.

Page 16: Special Challenges in EMR Data Interpretation/Reliability

- Data entry errors in source data, often a consequence of "copy and paste".
- Coding of categorical variables does not accommodate nuances in the medical history or diagnostic findings.
- Depending on the source, billing data may have been up-coded (Humana).
- Outcome data may be lacking: absence of return visit data may simply mean that the patient failed to improve and went elsewhere.

Page 17: Special Challenges (2)

- Data fragmentation, especially where healthcare is provided by separate institutions.
- Data is observational: treatments and exposures are not assigned randomly.
- Confounding bias: socioeconomic factors might lead patients to use suboptimal treatments.
- Selection/sampling bias: atypical demographic attributes of the cohort whose data you are seeing may limit the inferences that you can make about the general population.

Page 18: Frontiers: Genetic Data

- There are no technical barriers to the incorporation of limited genetic data for an individual (e.g., SNPs or specific mutations) in structured (i.e., readily analyzable) form.
- The major current issue is the limited understanding of genetic data and definitions by EMR vendors.
- Whole-genome data is still a long way off. A single record would be larger than the bulk of existing non-image EMR data.

Page 19: Conclusions

- None of the challenges are insurmountable, but they take a lot of effort and resources to address
- Most of the fixes are long-term, involving:
  - Manual mapping to controlled vocabulary terms
  - Change in processes
  - Maintaining descriptive documentation that must continually be checked for usability and currency.

Page 20: Further Reading

- Masys DR, et al. Technical desiderata for the integration of genomic data into Electronic Health Records. J Biomed Inform. 2012 Jun;45(3):419-22.
- Nadkarni, Ohno-Machado and Chapman. Natural Language Processing: A Tutorial. Journal of the American Medical Informatics Association, 2011. PMC3168328
- Hoffman & Podgurski. "Big, bad data". Journal of Law, Medicine and Ethics, (2013) 41:1, pp 56-60. http://www.ncvhs.hhs.gov/130430b6.pdf

Page 21: Questions?