Electronic Phenotyping for Genomic Research
George Hripcsak, Columbia University, on behalf of the Phenotyping WG
October 30, 2017
1. How can eMERGE improve upon the current labor-intensive phenotyping toward fully-automated phenotyping methods to increase phenotyping efficiency and validity using EMRs?
Phenotype sharing
• One part of the labor is sharing
– eMERGE adopting OHDSI OMOP Common Data Model
– Convert current eMERGE data warehouses to same schema and vocabulary
– But preserve source information
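The idea of converting to a shared vocabulary while preserving source information can be illustrated with a minimal sketch. It assumes a hypothetical lookup table (`SOURCE_TO_STANDARD`) with placeholder concept ids, and a simplified record loosely modeled on the OMOP `condition_occurrence` table; the `source_vocabulary` field and the unmapped code `999.99` are illustrative only.

```python
from dataclasses import dataclass

# Hypothetical mapping: (source vocabulary, source code) -> standard concept id.
# The id 1001 is a placeholder, not a real OMOP concept id.
SOURCE_TO_STANDARD = {
    ("ICD9CM", "714.0"): 1001,
    ("ICD10CM", "M06.9"): 1001,
}


@dataclass
class ConditionOccurrence:
    person_id: int
    condition_concept_id: int    # standard concept, comparable across sites
    condition_source_value: str  # original code, kept verbatim
    source_vocabulary: str       # illustrative extra field, not an OMOP column


def to_cdm(person_id, vocabulary, code):
    """Map a source code to a standard concept while keeping the source code."""
    # By OMOP convention, concept id 0 marks a record with no matching concept.
    concept_id = SOURCE_TO_STANDARD.get((vocabulary, code), 0)
    return ConditionOccurrence(person_id, concept_id, code, vocabulary)


rows = [
    to_cdm(1, "ICD9CM", "714.0"),
    to_cdm(2, "ICD10CM", "M06.9"),
    to_cdm(3, "ICD9CM", "999.99"),  # made-up code with no mapping
]
for row in rows:
    print(row)
```

Two sites coding the same condition in different vocabularies end up with the same standard concept, and the original code remains available for auditing or remapping.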
Deep Information Model: OMOP v5.2
[Figure: OMOP CDM v5.2 table diagram, grouped as follows]
• Standardized clinical data: Person, Observation_period, Specimen, Death, Visit_occurrence, Procedure_occurrence, Drug_exposure, Device_exposure, Condition_occurrence, Measurement, Note, Note_NLP, Observation, Fact_relationship
• Standardized health system data: Location, Care_site, Provider
• Standardized health economics: Payer_plan_period, Cost
• Standardized derived elements: Cohort, Cohort_attribute, Drug_era, Dose_era, Condition_era
• Standardized meta-data: CDM_source
• Standardized vocabularies: Concept, Vocabulary, Domain, Concept_class, Concept_relationship, Relationship, Concept_synonym, Concept_ancestor, Source_to_concept_map, Drug_strength, Cohort_definition, Attribute_definition
Extensive vocabularies
eMERGE phenotype generation
• eMERGE phenotyping lessons
– [Kho AN, Sci Trans Med 2011]
• Complexity of eMERGE phenotypes
– [Conway M, AMIA 2011]
• Multi-modal approaches
– [Peissig PL, JAMIA 2012]
• Use of NQF Quality Data Model
– [Thompson WK, AMIA 2012]
• Improving validation
– [Newton KM, JAMIA 2013]
• Design patterns
– [Rasmussen LV, JBI 2014]
• PhEMA: Phenotype Execution and Modeling Architecture
– [Pathak et al.]
Phenotype generation lessons
• Challenge of billing codes
• Importance of NLP
– And multimodal in general
• Complexity of effective phenotype definitions
• Possible improvement from tools and reuse, but mostly just slogging it out
• Differing goals:
– Knowledge discovery via GWAS needs high PPV
– Knowledge deployment for decision support also needs sensitivity
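The difference between the two goals can be made concrete with a short sketch that computes PPV and sensitivity from chart-review counts. The counts and the "strict" and "broad" algorithm variants are hypothetical, chosen only to show the trade-off.

```python
def ppv(tp, fp):
    """Positive predictive value: share of algorithm-flagged patients who are true cases."""
    return tp / (tp + fp) if (tp + fp) else float("nan")


def sensitivity(tp, fn):
    """Sensitivity: share of true cases that the algorithm flags."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Hypothetical validation counts for two variants of one phenotype
# applied to a population with 150 true cases.
strict = {"tp": 90, "fp": 5, "fn": 60}    # few false positives, many missed cases
broad = {"tp": 140, "fp": 40, "fn": 10}   # few missed cases, more false positives

for name, counts in (("strict", strict), ("broad", broad)):
    print(
        name,
        "PPV =", round(ppv(counts["tp"], counts["fp"]), 3),
        "sensitivity =", round(sensitivity(counts["tp"], counts["fn"]), 3),
    )
```

Under these assumed counts, the strict variant suits case selection for GWAS, where false positives dilute the signal, while the broad variant is closer to what decision support needs, since missed patients receive no alert.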
Phenotyping for the future
• High-fidelity phenotypes [Hripcsak G, JAMIA 2017]
– Encode degree, severity of condition
  • Redo for past phenotypes?
– Exploit time to create more accurate phenotypes
– Encode time of condition
  • Disease course, response to treatment
– Continuous states (topology, where not dichotomous)
– Hidden physiologic phenotypes (data assimilation)
– Latent abstract states (deep learning)
– Accommodate health care process bias
High-fidelity phenotypes
• Encode degree, severity of condition
Albers, AMIA 2015
High-fidelity phenotypes
• Exploit time to create more accurate phenotypes
• Encode time of condition
Hripcsak, JAMIA 2015
High-fidelity phenotypes
• Continuous states (topology, where not dichotomous)
Nicolau, PNAS 2011
High-fidelity phenotypes
• Hidden physiologic phenotypes (data assimilation)
Albers, PLOS Comp Bio 2017
High-fidelity phenotypes
• Latent abstract states (deep learning)
Miotto, Scientific Reports 2016
High-fidelity phenotypes
• Accommodate health care process bias
Hripcsak, JAMIA 2013
2. How might machine-learning and other advanced computational tools be used to improve electronic phenotyping in the eMERGE network?
Advanced computational tools
• Natural language processing
– Large proportion of phenotypes employ it
– Disparate systems across the network
– Most get by with relatively simple processing
– Working on sharing NLP!
Advanced computational tools
• Machine learning research
– eMERGE research: see following slides
– Anchors, noisy sets to learn from imperfect training data (MIT, Stanford, Columbia)
– Active learning to reduce training set labor (Marshfield, …)
– Deep learning to characterize patients (Mt. Sinai, …)
– Physiologic phenotypes via data assimilation (Columbia)
  • E.g., kidney & liver function, body space, insulin excretion
– Topology for continuous phenotypes (Stanford, Columbia)
Harvard eMERGE – Rheumatoid Arthritis Machine Learning Phenotype Algorithm
Rheumatoid Arthritis Algorithm Development Workflow
– Create a training set using clinician chart review (N=200)
– Train a machine learning algorithm
– Use algorithm to identify cases (and controls)
– Validation based on additional 100 chart review (PPV=0.92)
[Figure: Rheumatoid Arthritis Algorithm Final Feature Betas; AUC: 0.967]
• Machine learning algorithms can be effectively and efficiently applied to a large population to accurately phenotype patients
• Algorithms provide flexibility to adjust sensitivity and specificity to varied use cases compared to pre-defined rules-based algorithms
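The flexibility to adjust sensitivity and specificity comes from the fact that a trained model yields a score per patient, and the case threshold can be moved afterward. The sketch below assumes a logistic model; the feature names, betas, intercept, and patients are all hypothetical and are not the Harvard algorithm's values.

```python
import math

# Hypothetical betas for illustration only; not the published feature weights.
BETAS = {"ra_code_count": 0.8, "anti_ccp_positive": 1.5, "nlp_ra_mention": 0.6}
INTERCEPT = -3.0


def predict_proba(features):
    """Logistic model: probability that a patient is a case."""
    z = INTERCEPT + sum(BETAS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def sens_spec(probs, labels, threshold):
    """Sensitivity and specificity when patients at or above threshold are called cases."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    tn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 0)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


# Toy chart-reviewed patients: (features, label), where 1 = confirmed case.
patients = [
    ({"ra_code_count": 4, "anti_ccp_positive": 1, "nlp_ra_mention": 3}, 1),
    ({"ra_code_count": 2, "anti_ccp_positive": 0, "nlp_ra_mention": 2}, 1),
    ({"ra_code_count": 1, "anti_ccp_positive": 1, "nlp_ra_mention": 0}, 1),
    ({"ra_code_count": 1, "anti_ccp_positive": 0, "nlp_ra_mention": 1}, 0),
    ({"ra_code_count": 0, "anti_ccp_positive": 0, "nlp_ra_mention": 1}, 0),
    ({"ra_code_count": 0, "anti_ccp_positive": 0, "nlp_ra_mention": 0}, 0),
]
probs = [predict_proba(features) for features, _ in patients]
labels = [label for _, label in patients]

for threshold in (0.1, 0.2, 0.5):
    sens, spec = sens_spec(probs, labels, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the threshold can only keep sensitivity the same or lower it, and can only keep specificity the same or raise it, so one model can serve both a high-PPV research cohort and a more inclusive screening use, which a fixed rule set cannot do without being rewritten.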
• Challenges faced in using NLP for computational phenotyping
– Poor portability caused by syntactic, semantic, and process variations
– Semantic gaps among users, experts, and data
– There is no “one size fits all” solution for computational phenotyping
• Solutions proposed
– Improve syntactic interoperability by adopting common data models
– Mitigate the semantic gaps through a combination of deep learning representation, information retrieval, informatics extraction, and late binding NLP and data normalization
– Develop a platform for sharing NLP knowledge artifacts and mapping between data semantics and expert semantics
Mayo
PhEMA
• PhEMA: Phenotype Execution and Modeling Architecture [Pathak et al.]
– Standards-based representation of phenotypes
– Visual tool for authoring phenotypes (PhAT)
– Execution against OMOP or i2b2 (PheX)
– Developing NLP & ML extensions
– Integrates with PheKB
NLP–ML Approach
• Apply exclusion and inclusion criteria based on ICD9 code filtering
• Acquire EMR data for the filtered patients
• Process clinical notes to discover SNOMED-CT and RxNORM concepts with their attributes (Apache cTAKES) and generate feature vectors
• Apply machine learning prediction on feature vectors based on training from expert-provided labels
• Communicate ML model to other sites to run on their data
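The steps above can be sketched end to end in a few lines. This is a simplified illustration, not the sites' implementation: the NLP output is assumed to be already available as (concept code, negated) pairs, the concept codes (`C_RA`, `C_MTX`, `C_OA`) and ICD-9 prefixes are illustrative choices, and a basic perceptron stands in for whatever learner a site actually uses.

```python
import json


def passes_icd9_filter(codes, include_prefixes, exclude_prefixes):
    """Keep a patient with at least one inclusion code and no exclusion code."""
    included = any(c.startswith(p) for c in codes for p in include_prefixes)
    excluded = any(c.startswith(p) for c in codes for p in exclude_prefixes)
    return included and not excluded


def feature_vector(concepts, vocabulary):
    """Binary vector: 1 where a concept is mentioned and not negated."""
    asserted = {code for code, negated in concepts if not negated}
    return [1 if code in asserted else 0 for code in vocabulary]


def predict(model, x):
    score = sum(w * xi for w, xi in zip(model["weights"], x)) + model["bias"]
    return 1 if score > 0 else 0


def train_perceptron(vectors, labels, vocabulary, epochs=20):
    """Stand-in learner trained on expert-provided labels."""
    model = {"vocabulary": vocabulary, "weights": [0.0] * len(vocabulary), "bias": 0.0}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            error = y - predict(model, x)
            model["weights"] = [w + error * xi for w, xi in zip(model["weights"], x)]
            model["bias"] += error
    return model


# Hypothetical concept codes standing in for SNOMED-CT / RxNorm identifiers.
vocabulary = ["C_RA", "C_MTX", "C_OA"]
training_notes = [
    ([("C_RA", False), ("C_MTX", False)], 1),
    ([("C_RA", True), ("C_OA", False)], 0),   # the RA mention is negated
    ([("C_RA", False)], 1),
    ([("C_OA", False), ("C_MTX", True)], 0),
    ([], 0),
]
vectors = [feature_vector(concepts, vocabulary) for concepts, _ in training_notes]
labels = [label for _, label in training_notes]
model = train_perceptron(vectors, labels, vocabulary)

# Share the model as plain JSON; the receiving site rebuilds features locally.
shared = json.dumps(model)
remote_model = json.loads(shared)
new_patient = feature_vector([("C_RA", False)], remote_model["vocabulary"])
print(passes_icd9_filter(["714.0"], ["714"], ["696.0"]), predict(remote_model, new_patient))
```

Shipping the vocabulary together with the weights matters in this sketch: the receiving site must build feature vectors in the same order, and only the model, not patient data, crosses site boundaries.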
Phenotyping using Relational Machine Learning
Marshfield, Castro 2008
3. How can eMERGE assess phenotype comparability across diverse patient populations and diverse health care settings (e.g. academic and county hospitals, community clinics and other national health care systems)?
Diverse populations and settings
• Design specific eMERGE experiments
– Busy now with existing phenotypes
• Collaborate with All of Us Research Program
– Getting up to speed; uses same data model
• Collaborate with OHDSI
– Large, international set for phenotype part