Clinical e-Science Framework
De-identifying the EHR: building a resource for research
All Hands Meeting - BOF session
Dr Dipak Kalra, UCL
on behalf of the CLEF Consortium
CLEF’s Goals
• Collect clinical information from multiple sites
• Analyse, structure and integrate it
• Make it available using GRID tools (e.g. myGrid)
• To authorised clinicians and e-Health scientists
• In a secure and ethical collaborative framework
[Diagram labels: GRID; Ethical oversight committee]
The CLEF records resource
• a repository of longitudinal cancer clinical records
  – that has been analysed and semantically indexed
  – to provide a summary of what happened and why at each point in a patient's evolving story of care
  – that can be queried across substantial populations of similar patients through an intuitive query workbench
The CLEF repository has to be:
• scalable to populate
  – capable of incorporating large numbers of fine-grained personal health records
  – from many different clinical systems in primary, secondary and tertiary care
  – each longitudinally linked so that the CLEF record can grow as each actual patient's care progresses
• widely accessible to distributed research teams across the UK and ultimately internationally
• conformant to ethical and legal requirements
All use of personal health data is regulated
• In the UK:
  – Common Law of Confidentiality
  – Data Protection Act 1998
  – Human Rights Act 1998
  – Section 60 of Health & Social Care Act 2001
  – BMA Guidance Oct 1999
  – GMC Guidance Sept 2000
• At a European level:
  – European Community Directive 95/46/EC (1995)
  – Council of Europe Recommendation R(97)5 (1997)
Personal data
• The Data Protection Act defines "personal data" as:
  "data which relate to a living individual who can be identified (a) from those data, or (b) from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller"
• This is likely to apply to any clinically useful information about living patients
• Patient consent would be required for CLEF to acquire the data into its repository, and for each new kind of research access to the data
  – This is not scalable
Anonymised research data
• If legitimately processed for research or statistical purposes, personal data
  "can be kept indefinitely and are exempt from the subject access rights if the results of the work are not made available in a form from which data subjects can be identified"
• If CLEF can make sure the data is anonymous, consent is not required and the data may be used for any reasonable research purpose
  – This is the only scalable approach
• But no anonymisation can be perfect
The CLEF ethics approach
1) de-identify the data
2) depersonalise the parts of the record which are most vulnerable to revealing who the patient is
3) still treat the data as having some small potential risk of re-identification
   – regulate, restrict and monitor access
Architecture Outline
[Diagram showing a Data Acquisition Cycle and a Data Access Cycle. Component labels: Reidentify by Hospital; Pseudonymise in Hospital; Depersonalise; Extract Information; Integrate & Aggregate; Construct 'Chronicle'; Chronicle; Pseudonymised Repository; Knowledge enrichment; Summarise & Formulate; Queries; Individual Summaries & Queries; Privacy Enhancement Technologies; Hazard monitoring; Ethical oversight committee]
1) De-identification
1a) replace real patient identifiers with random keys
  – done securely by the clinical site providing the data
  – consistent longitudinally within an enterprise (across enterprises is more difficult)
  – not known to CLEF
  – but it remains possible for the original site to re-identify a patient if this is warranted
  – "one-way key encryption"
1b) exclude highly identifying data elements from the record extraction
  – e.g. demographics (except postal district, gender, year of birth)
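Steps 1a and 1b can be sketched as below. This is a minimal illustration, not the CLEF implementation: the class `SitePseudonymiser`, the function `extract_for_clef` and the record field names are all hypothetical. It assumes the site keeps a private lookup table from patient identifier to random key, which is one simple way to stay consistent over time while still allowing re-identification by the site.

```python
import secrets
from datetime import date


class SitePseudonymiser:
    """Lookup table held only by the clinical site (never sent to CLEF)."""

    def __init__(self) -> None:
        self._key_by_patient: dict[str, str] = {}
        self._patient_by_key: dict[str, str] = {}

    def entry_key(self, patient_id: str) -> str:
        """Return the same random key every time for the same patient."""
        if patient_id not in self._key_by_patient:
            key = secrets.token_hex(8).upper()  # random, not derived from the ID
            self._key_by_patient[patient_id] = key
            self._patient_by_key[key] = patient_id
        return self._key_by_patient[patient_id]

    def reidentify(self, key: str) -> str:
        """Only the originating site can map a key back to a patient."""
        return self._patient_by_key[key]


def extract_for_clef(site: SitePseudonymiser, record: dict) -> dict:
    """Step 1b: keep only postal district, gender and year of birth."""
    return {
        "entry_key": site.entry_key(record["patient_id"]),
        "postal_district": record["postcode"].split()[0],
        "gender": record["gender"],
        "year_of_birth": record["date_of_birth"].year,
        "clinical_entries": list(record["clinical_entries"]),
    }
```

Because the key is random rather than computed from the identifier, CLEF cannot reverse it; the site's table is the only route back to the patient.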
2) Depersonalisation
Medical narratives (letters, reports, summaries) are
  – rich in useful clinical data
  – most likely to reveal something personal about the patient
2a) CLEF tools will lexically analyse all such narratives
  – to remove occurrences of personal names or references, locations, highly-specified occupations etc.
  – to extract and code the key features of the clinical story and care process
  – records will be stored within a standards-based architecture
    • incorporating formal access control measures
2b) these original depersonalised narratives will not be accessed directly by the query workbench
  – access will be limited to the extracted coded data
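The removal step in 2a can be illustrated with a simple term masker. This is only a sketch under stated assumptions: the function `mask_terms` is hypothetical, and it presumes the identifying terms are already known (for example from the record header). Finding non-obvious terms is the job of the CLEF language technology and is not attempted here.

```python
import re


def mask_terms(text: str, terms: list[str]) -> str:
    """Replace each occurrence of a term with X characters of the same length."""
    # Longest terms first, so "Mrs Smith" is masked as a whole before "Smith".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: "X" * len(m.group()), text)
```

Masking with a run of the same length keeps the layout of the narrative intact, which matches the style of masking used in the example case note in these slides.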
Pseudonymisation at hospital

Original case note:

ROYAL MARSDEN NHS TRUST - PATIENT CASE NOTE
324A621F:MRS Dorothy Smith
DOB: 12/05/44
21, Park Crescent
Basingstoke B12 Q13

16 Dec 1992 Seen in General Surgical
This lady who has had a mastectomy and left open capsulotomy and removal of her prosthesis was seen by me in the clinic today on behalf of Mr Peterson. She has extensive bony lymphoedema in her left arm which does not seem to be getting any better although she is more or less reconciled to the problem. The original problem was that she complained of shooting pain in the direction of ulna nerve and although there does not seem to be any evidence of local, regional or distant recurrence the pain itself warrants management in a pain clinic. Mrs Smith could be seen in the pain clinic at the Marsden but as this would involve a lot of travelling would like to be treated nearer her home. I wonder whether it would be possible for you to investigate if there is a pain clinic available at Basingstoke as I am sure Dotty could be treated and benefit from its management. I have otherwise arranged for her to be seen in the clinic again in a year's time. There are no signs of recurrence at this time.
Mr Thomas Partridge

Annotations on the slide:
• Overt identifying information removed in hospital & ID replaced by CLEF Entry Key
  – 324A621F:MRS Dorothy Smith -> ########:######### ####### (CLEF-RMH-Entry-Key: 52A4F6DB2B46E)
  – 21, Park Crescent, Basingstoke B12 Q13 removed
  – ROYAL MARSDEN -> ##### #######
• Obvious mentions of patient name or hospital name removed
  – Mrs Smith -> XXXXXXXXX
  – Marsden -> XXXXXXX
  – Basingstoke -> XXXXXXXXXXX
• Date of birth reduced to year
  – 12/05/44 -> 1944
• Clinic date blurred preserving sequence
  – 16 Dec 1992 -> AB 1992
• Carer pseudonymised
  – Mr Thomas Partridge -> 5213A4F612F1
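The slide does not say how a blurred date such as "AB 1992" is derived, so the following is purely an assumed scheme for illustration: within each year, a patient's distinct dates are sorted and the day and month are replaced by a two-letter sequence code (AA, AB, AC, ...). The function name `blur_dates` is hypothetical.

```python
from datetime import date
from itertools import product
from string import ascii_uppercase


def blur_dates(dates: list[date]) -> dict[date, str]:
    """Map each date to '<sequence code> <year>', keeping order within a year."""
    # 676 two-letter codes: AA, AB, ..., ZZ (more than the days in a year).
    codes = ["".join(pair) for pair in product(ascii_uppercase, repeat=2)]

    by_year: dict[int, list[date]] = {}
    for d in sorted(set(dates)):
        by_year.setdefault(d.year, []).append(d)

    blurred: dict[date, str] = {}
    for year, days in by_year.items():
        for code, d in zip(codes, days):
            blurred[d] = f"{code} {year}"
    return blurred
```

Under this assumed scheme the exact day and month are lost, but a researcher can still tell which of two events in the same year came first.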
Depersonalisation by CLEF Language Technology…

Non-obvious identifying information removed using language technology

##### ####### NHS TRUST - PATIENT CASE NOTE
########:######### #######
DOB: 1944
CLEF-RMH-Entry-Key: 52A4F6DB2B46E

AB 1992 Seen in General Surgical
This lady who has had a mastectomy and left open capsulotomy and removal of her prosthesis was seen by me in the clinic today on behalf of Mr Peterson. She has extensive bony lymphoedema in her left arm which does not seem to be getting any better although she is more or less reconciled to the problem. The original problem was that she complained of shooting pain in the direction of ulna nerve and although there does not seem to be any evidence of local, regional or distant recurrence the pain itself warrants management in a pain clinic. XXXXXXXXX could be seen in the pain clinic at the XXXXXXX but as this would involve a lot of travelling would like to be treated nearer her home. I wonder whether it would be possible for you to investigate if there is a pain clinic available at XXXXXXXXXXX as I am sure Dotty could be treated and benefit from its management. I have otherwise arranged for her to be seen in the clinic again in a year's time. There are no signs of recurrence at this time.
5213A4F612F1

Annotations on the slide:
• Nick-name "Dotty" spotted by language software & removed
  – Dotty -> XXXXX
• Carer name spotted & pseudonymised
  – Mr Peterson -> XXXXXXXXXXX
##### ####### NHS TRUST - PATIENT CASE NOTE
########:######### #######
DOB: 1944
CLEF-RMH-Entry-Key: 52A4F6DB2B46E

AB 1992 Seen in General Surgical
This lady who has had a mastectomy and left open capsulotomy and removal of her prosthesis was seen by me in the clinic today on behalf of XXXXXXXXXXX. She has extensive bony lymphoedema in her left arm which does not seem to be getting any better although she is more or less reconciled to the problem. The original problem was that she complained of shooting pain in the direction of ulna nerve and although there does not seem to be any evidence of local, regional or distant recurrence the pain itself warrants management in a pain clinic. XXXXXXXXX could be seen in the pain clinic at the XXXXXXX but as this would involve a lot of travelling would like to be treated nearer her home. I wonder whether it would be possible for you to investigate if there is a pain clinic available at XXXXXXXXXXX as I am sure XXXXX could be treated and benefit from its management. I have otherwise arranged for her to be seen in the clinic again in a year's time. There are no signs of recurrence at this time.
5213A4F612F1
Extraction of key information from text

Information Extraction identifies events and relationships between them from the text, based on templates & knowledge resources.

Categories and phrases highlighted in the case note:
• Problems: recurrence; no signs of recurrence; bony lymphoedema; shooting pain in the direction of ulna nerve; pain
• Problem Site: left arm; local, regional or distant
• Interventions: mastectomy; left open capsulotomy; removal of her prosthesis; management
• Locations: General Surgical; clinic; pain clinic
• Time: today; at this time; a year's time
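A toy version of the phrase-spotting part of this step is sketched below. It is an assumption-laden illustration only: the `LEXICON` contents and the function `spot_phrases` are hypothetical, and a plain lexicon lookup does not capture the relationships between events that the CLEF templates and knowledge resources are meant to identify.

```python
import re

# Hypothetical mini-lexicon, using the categories named on the slide.
LEXICON: dict[str, list[str]] = {
    "Problems": ["bony lymphoedema", "shooting pain", "recurrence"],
    "Interventions": ["mastectomy", "capsulotomy", "removal of her prosthesis"],
    "Problem Site": ["left arm"],
    "Locations": ["pain clinic", "General Surgical"],
    "Time": ["today", "a year's time", "at this time"],
}


def spot_phrases(text: str) -> dict[str, list[str]]:
    """Return the lexicon phrases found in the text, grouped by category."""
    found: dict[str, list[str]] = {}
    for category, phrases in LEXICON.items():
        hits = [
            phrase
            for phrase in phrases
            if re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE)
        ]
        if hits:
            found[category] = hits
    return found
```

Only the coded output of a step like this, not the narrative itself, would be exposed to the query workbench.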
3a) Regulation and restriction of access
• Ethical Oversight Board will approve the kinds of organisations, teams and purposes for which the CLEF repository may be queried
  – defining the appropriate security measures to be taken
    • e.g. for authentication, authorisation and encryption
• A research project specific identifier will be used for data extracts to prevent cross-linking
  – the approval process will determine the extent to which longitudinal access to records is required, and the extent of drill-down permitted
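One way a project-specific identifier could be derived is with a keyed hash, sketched below. The slides do not specify the mechanism, so this is an assumption: the function `project_identifier` and the idea of a per-project secret held by the repository are hypothetical.

```python
import hashlib
import hmac


def project_identifier(entry_key: str, project_secret: bytes) -> str:
    """Derive a per-project identifier from a repository entry key.

    The same patient gets the same identifier within one project, but
    different identifiers in different projects, so extracts released to
    two projects cannot be joined on the identifier.
    """
    digest = hmac.new(project_secret, entry_key.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16].upper()
```

For example, calling `project_identifier("52A4F6DB2B46E", secret_a)` and `project_identifier("52A4F6DB2B46E", secret_b)` with two different secrets gives two unrelated values for the same record.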
3b) Monitoring of access
• All accesses will be logged in an audit trail database
• Published algorithms will be used to help detect attempts to combine queries maliciously
• Selected research clients will be requested to help spot personal characteristics that slip through the net
  – the process of depersonalisation is still early R&D
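An audit trail of this kind might look like the sketch below, which uses an SQLite table. The table layout, column names and function names are assumptions made for illustration; the slides do not describe the actual audit database, and the detection of malicious query combinations is not attempted here.

```python
import sqlite3
from datetime import datetime, timezone


def open_audit_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the audit database with a single append-only table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit ("
        "logged_at TEXT NOT NULL, researcher TEXT NOT NULL, "
        "project TEXT NOT NULL, query_text TEXT NOT NULL, "
        "result_count INTEGER)"
    )
    return conn


def log_access(conn, researcher, project, query_text, result_count):
    """Record one query, with a UTC timestamp, before results are released."""
    conn.execute(
        "INSERT INTO audit VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), researcher, project,
         query_text, result_count),
    )
    conn.commit()


def queries_by(conn, researcher):
    """Return a researcher's queries in the order they were logged."""
    rows = conn.execute(
        "SELECT query_text FROM audit WHERE researcher = ? ORDER BY rowid",
        (researcher,),
    )
    return [row[0] for row in rows]
```

A per-researcher history like the one returned by `queries_by` is the raw material that any later check on combined queries would need.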
Privacy Enhancement & authorisation
Queries logged, threats to confidentiality monitored.
CLEF WYSIWYM Query Writer (menu: Login, Query, OMIM, Exit)
Relevant Subjects: [Male] patients with [adenocarcinoma] of [this laterality] of [this part] of [breast] AND [age] at [diagnosis] was [less than 30].
Outcome Measures: Percentage of patients [alive] after [1 year] and after [2 years] and after [5 years].
Treatment Profiles: Patients who received [radiotherapy] [daily], compared with patients who received [radiotherapy] [every other day] and those who received [no radiotherapy].
WARNING: Fewer than 20 male patients diagnosed with adenocarcinoma of the breast were found. Further subanalysis on small groups increases the risk that a patient may be identifiable. Your CLEF security authorisation does not permit your query to be processed.
Queries on small patient groups are blocked or the figures blurred.
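The small-group rule can be sketched as a check applied to every count before it is released. The threshold of 20 comes from the warning in the mock-up; the function `release_count`, the `policy` parameter and the choice to blur by reporting only "fewer than 20" are assumptions for illustration.

```python
from typing import Optional

MIN_GROUP_SIZE = 20  # threshold shown in the mock-up warning


def release_count(count: int, policy: str = "block",
                  min_group: int = MIN_GROUP_SIZE) -> Optional[str]:
    """Return the figure to show the researcher, or None if the query is blocked.

    policy="block": small groups produce no result at all.
    policy="blur":  small groups are reported only as a band, not an exact count.
    """
    if count >= min_group:
        return str(count)
    if policy == "blur":
        return f"fewer than {min_group}"
    return None
```

With this sketch, a group of 1792 is reported exactly, while a group of 12 is either withheld or reported only as "fewer than 20", depending on the policy attached to the researcher's authorisation.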
Subject selector options: male, female
Relevant Subjects: [Female] patients with [adenocarcinoma] of [this laterality] of [this part] of [breast]
QUERY RESULT: 1792 patients diagnosed with adenocarcinoma of the breast were found. 788 had radiotherapy daily, 513 had it on alternate days and 491 had no radiotherapy.
After 5 years, 20% (n=158) of patients who had a daily treatment were alive. After 5 years, 10% (n=49) who had alternate day treatment were alive. After 5 years, 5% (n=27) of the patients who had no treatment were alive.
With special authorisation researchers may examine individual records in anonymised form.
CLEF Patient Chronicle Viewer – L2 (menu: Exit)
Patient selector: # 1 to 31, with patient 17 selected
1974
Grade III infiltrating ductal carcinoma left breast
7/22 sampled nodes positive
Radical Mastectomy Left Breast
Insertion Left Breast Prosthesis
MEFUP Chemotherapy
1982/3
Recurrence Left supraclavicular nodes
Excision biopsy of nodes
Radiotherapy
1992
Replacement of Left Breast Prosthesis
Removal of replacement to left breast prosthesis
1994
Recurrence inside chest (confirmed biopsy)
VAC Chemotherapy aborted (toxicity)
Radiotherapy completed
L5/S1 degeneration
Left phrenic nerve paralysis
1996
Multiple pulmonary emboli
Post-radiation fibrosis left upper lung
Prior rib fractures
Frontal lobe ischaemic atrophy
Teflon injection vocal cord
1997
Recurrence in chest
Pleural effusions
VAC Chemotherapy 6 cycles
1998
Recurrence in chest
Radiotherapy
Normal Left Shoulder Xray
1999
No evidence of recurrence
Congestive cardiac failure
Died June 1999
[Figure: Graphical 'time line' view of CLEF Chronicle, with a time axis from 1975 to 2000. Recoverable labels: Grade III infiltrating ductal carcinoma left breast; Recurrence (R markers); Died; TAMOXIFEN; ARIMIDEX; RADIO; CHEMO; Staging CT (S markers); T1 N3c M0; T1> N3c M1; T1> N1> M0; Stage IIA; Stage IIIc; Stage IV; Nodes, Liver, Spleen, Kidney, Bone]
Captions: Textual summary of CLEF Chronicle for patient #17; Graphical 'time line' view of CLEF Chronicle
Gaining ethical approval
• This depersonalisation process has MREC approval as a valid candidate methodology, but as it has not yet been validated, CLEF cannot yet use live patient data without consent
• However, the project has been approved to use the records of deceased patients as an initial step towards developing, refining and evaluating the depersonalisation approach
• If successful, CLEF hopes to be permitted to migrate to live patients' records next year
Intended final security results
• A validated approach
  – accepted by MREC, PIAG, and other stakeholder groups (BMA, GMC, NHS, etc.)
• Exemplar policies and procedures
  – Ethical Oversight Committee
  – employee/researcher contracts
  – safe data extraction
  – access controls
• Open source tools
  – mechanisms to support security
  – active monitoring of use, limiting risk of inferential attack