envisioning a world where everyone helps solve disease

Post on 08-Apr-2017

TRANSCRIPT

@monarchinit @ontowonka

“Not everyone can become a great artist, but a great artist can come from anywhere”

Anton Ego, Ratatouille, 2007, Disney/Pixar

Envisioning a world where everyone helps solve disease

Melissa Haendel

SWAT4LS 2015

Cambridge, England

Faith-based research

“I believe that my work on some obscure cell type in some obscure organism will matter to mankind one day”

Well, it can, and it does.

Four things it takes to solve an undiagnosed disease

1. Deep phenotyping the human organism

2. Crossing the language barrier

3. A lot of data from a lot of places

4. Very many people (who have faith)

1. DEEP PHENOTYPING THE HUMAN ORGANISM

[Diagram: Patient; Genome/Exome; Filter; Genomic data; Diagnosis, treatment]

ATCTTAGCACGTTAC

ATCTTAGCACGTGAC ATCTTATCACGTTAC ATCTTAGCACGTTAC

What do all those variations do?

We only know the phenotypic consequences of mutation of <20% of the human coding genome

[Diagram: Patient; Genome/Exome; Filter; Genomic data; Phenotype; Gene-Phenotype Data; Environment; Diagnosis, treatment]

We have a common language for sequence data (…ATCTTAGCACGTTAC…), but not so much for phenotypes

CC2.0 European Southern Observatory https://www.flickr.com/photos/esoastronomy/6923443595

Can we help machines understand phenotypes?

“Palmoplantar hyperkeratosis”

Human phenotype

“I have absolutely no idea what that means”

???

Image credits:

"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG

Marcin Wichary [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

A disease is a collection of phenotypes

Differential diagnosis with similar but non-matching phenotypes is difficult

Patient: Flat back of head; Hypotonia

Disease X: Abnormal skull morphology; Decreased muscle mass

Do we *really* need yet another clinical vocabulary?

Winnenburg and Bodenreider, ISMB PhenoDay, 2014

UMLS, SNOMED CT, CHV, MedDRA, MeSH, NCIT, ICD10-C, ICD9-CM, ICD-10, OMIM, MedlinePlus

Existing clinical vocabularies don’t adequately cover phenotype descriptions

Disease-phenotype associations using an ontology

Once OMIM is rendered computable, are we done yet?

Free text -> HPO enables phenotype semantic similarity matching

Mendelian disease integration

Merges sources together using:

equivalence and subclass axioms derived from xrefs

string matching

manual efforts to fill gaps based on phenotypes and anatomical axioms

Parkinson’s disease subtypes

Different colors = different disease sources

https://github.com/monarch-initiative/monarch-disease-ontology

Why we need all the organisms

Model data can provide up to 80% phenotypic coverage of the human coding genome

We learn different things from different organisms

2. CROSSING THE LANGUAGE BARRIER

Ulcerated paws

Palmoplantar hyperkeratosis

Thick hand skin

Image credits:

"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG

http://www.guinealynx.info/pododermatitis.html

Challenge: Each database uses their own vocabulary/ontology

MP, HP

MGI, HPOA

ZFA, MP, DPO, WPO, HP, OMIA, VT, FYPO, APO, SNOMED, …

WB, PB, FB, OMIA, MGI, RGD, ZFIN, SGD, HPOA, IMPC, OMIM, ICD, QTLdb, EHR

Decomposition of complex concepts allows interoperability

Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2

“Palmoplantar hyperkeratosis” (human phenotype) =

increased (PATO)

stratum corneum layer of skin (Uberon)

autopod (Uberon)

keratinization (GO)

Species-neutral ontologies, homologous concepts
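A minimal sketch of why decomposition helps, assuming a phenotype can be stored as a quality, a process, and a set of species-neutral anatomy terms. The class name, field names, and the decomposition given for "Ulcerated paws" are hypothetical and only illustrate the idea; they are not taken from HP, MP, or the Mungall et al. paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeDefinition:
    """A phenotype broken into species-neutral building blocks (illustrative)."""
    label: str
    quality: str          # PATO-style quality, e.g. "increased"
    process: str          # GO-style process, "" if none
    anatomy: frozenset    # Uberon-style anatomy labels


# Components as listed on the slide for the human phenotype.
human = PhenotypeDefinition(
    label="Palmoplantar hyperkeratosis",
    quality="increased",
    process="keratinization",
    anatomy=frozenset({"stratum corneum layer of skin", "autopod"}),
)

# Hypothetical decomposition of an animal phenotype, for illustration only.
animal = PhenotypeDefinition(
    label="Ulcerated paws",
    quality="ulcerated",
    process="",
    anatomy=frozenset({"autopod"}),
)


def shared_components(a: PhenotypeDefinition, b: PhenotypeDefinition) -> set:
    """Return the building blocks two phenotype definitions have in common."""
    shared = set(a.anatomy & b.anatomy)
    if a.quality == b.quality:
        shared.add(a.quality)
    if a.process and a.process == b.process:
        shared.add(a.process)
    return shared


if __name__ == "__main__":
    # The labels share no words, but the definitions share an anatomy term.
    print(shared_components(human, animal))
```

The two labels have no string in common; the link only appears once both are expressed against the same species-neutral anatomy term.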

Cross-species ontology integration

3. A LOT OF DATA FROM A LOT OF PLACES

[Pipeline diagram labels: Input, Pipeline, Output; Diverse G2P/D source data; Source ontologies; .ttl; OWL loader; Graph views; Monarch App; Faceted browsing; Phenotype matching]

Putting it Together: Data + Ontologies

https://github.com/SciGraph/SciGraph

Data integrated in SciGraph

>25 sources

>100 species

51M triples

4M curated associations

2.2M G-P / G-D associations

Genotype-phenotype integration

[Pie chart legend: One source; Two sources; 3 or more]

91% of our 2.2 million G2P associations required integrating 2 or more data sources; 9% came from one source (this number does not even include orthology (Panther))

Ontology-based phenotype matching

www.owlsim.org
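A rough sketch of ontology-based matching, using the phenotype labels from the differential-diagnosis slide. It scores two phenotype profiles by the Jaccard overlap of their ancestor closures, which is one simple similarity measure and is not presented here as OWLSim's actual algorithm. The tiny hierarchy in `TOY_PARENTS` is an assumption made up for the example, not the real HPO structure.

```python
# Toy is-a hierarchy (child -> parents). Assumed for illustration only.
TOY_PARENTS = {
    "Flat back of head": {"Abnormal skull morphology"},
    "Abnormal skull morphology": {"Phenotypic abnormality"},
    "Hypotonia": {"Abnormality of the musculature"},
    "Decreased muscle mass": {"Abnormality of the musculature"},
    "Abnormality of the musculature": {"Phenotypic abnormality"},
    "Phenotypic abnormality": set(),
}


def ancestors(term: str) -> set:
    """Return the term plus everything reachable by following is-a links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in TOY_PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def profile_closure(terms: set) -> set:
    """Union of the ancestor closures of every term in a profile."""
    closure = set()
    for term in terms:
        closure |= ancestors(term)
    return closure


def jaccard_similarity(profile_a: set, profile_b: set) -> float:
    """Shared ancestors divided by all ancestors of the two profiles."""
    a, b = profile_closure(profile_a), profile_closure(profile_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


patient = {"Flat back of head", "Hypotonia"}
disease_x = {"Abnormal skull morphology", "Decreased muscle mass"}

if __name__ == "__main__":
    print("exact label overlap:", patient & disease_x)
    print("ontology similarity:", jaccard_similarity(patient, disease_x))
```

In this toy setup the two profiles share no exact term, yet they overlap in three of six ancestor terms, which is the kind of "similar but non-matching" signal that exact string comparison misses.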

Combining genotype and phenotype data for variant prioritization

Whole exome

Remove off-target and common variants

Variant score from allele freq and pathogenicity

Phenotype score from phenotypic similarity

PHIVE score to give final candidates

Mendelian filters

https://www.sanger.ac.uk/resources/software/exomiser/
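The steps above can be sketched as a small filter-then-rank pipeline. This is an illustration under stated assumptions, not Exomiser's implementation: the `Variant` fields, the 1% allele-frequency cutoff, the variant-score formula, and the plain average used as the combined score are all hypothetical, as are the example genes other than STIM1 and every number shown.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    on_target: bool         # falls inside the captured exome regions
    allele_freq: float      # population allele frequency, 0..1
    pathogenicity: float    # predicted pathogenicity, 0..1
    phenotype_score: float  # phenotypic similarity for the gene, 0..1


def variant_score(v: Variant) -> float:
    """Assumed scoring: rarer and more pathogenic variants score higher."""
    return v.pathogenicity * (1.0 - v.allele_freq)


def prioritize(variants, max_allele_freq: float = 0.01):
    """Drop off-target and common variants, then rank by a combined score."""
    kept = [v for v in variants
            if v.on_target and v.allele_freq <= max_allele_freq]
    # Assumed combination: simple mean of variant and phenotype scores.
    scored = [(0.5 * (variant_score(v) + v.phenotype_score), v.gene)
              for v in kept]
    return sorted(scored, reverse=True)


# Made-up example values, for illustration only.
variants = [
    Variant("STIM1", True, 0.0, 0.90, 0.80),
    Variant("GENE_B", True, 0.0, 0.95, 0.10),   # pathogenic, poor phenotype fit
    Variant("GENE_C", True, 0.20, 0.90, 0.90),  # too common, filtered out
    Variant("GENE_D", False, 0.0, 0.90, 0.90),  # off-target, filtered out
]

if __name__ == "__main__":
    for score, gene in prioritize(variants):
        print(f"{gene}: {score:.3f}")
```

The point of the sketch is the ordering: a variant with a slightly lower pathogenicity prediction can still rank first when the phenotype evidence for its gene is strong.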

York platelet syndrome and STIM1

Markello T et al. Molecular Genetics and Metabolism 2015, 114: 474

Grosse J, J Clin Invest 2007 117: 3540-50

UDP_2542: Impaired platelet aggregation (HP:0003540); Thrombocytopenia (HP:0001873)

Stim1Sax/Sax: Abnormal platelet activation (MP:0006298); Thrombocytopenia (MP:0003179)

http://www.nature.com/gim/journal/vaop/ncurrent/full/gim2015137a.html

4. VERY MANY PEOPLE (WHO HAVE FAITH)

Who helped solve the STIM1 UDP_2542 case?

Credit extends beyond the publication

Johannes creates stim1 mouse

Melissa annotates patient UDP_2542 with HPO

Will performs analysis of UDP_2542 that includes stim1 mouse to generate a dataset of prioritized variants

Tom writes publication pmid:25577287 about the STIM1 diagnosis

Tom explicitly credits Will as an author but not Melissa.

Credit is connected

Credit to Will is asserted, but credit to Melissa can be inferred
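One way to picture "credit can be inferred" is a walk over a small contribution graph. The sketch below encodes the STIM1 story from the preceding slides; the dictionary names and the exact derived-from links are assumptions made for illustration, not a published data model.

```python
# Who produced each artifact (from the slides).
CREATED_BY = {
    "stim1 mouse": "Johannes",
    "HPO annotation of UDP_2542": "Melissa",
    "prioritized variants": "Will",
    "pmid:25577287": "Tom",
}

# Which artifacts each artifact was built on (assumed links, for illustration).
DERIVED_FROM = {
    "pmid:25577287": ["prioritized variants"],
    "prioritized variants": ["stim1 mouse", "HPO annotation of UDP_2542"],
}

# Credit stated explicitly, e.g. through authorship.
ASSERTED_CREDIT = {"pmid:25577287": {"Tom", "Will"}}


def all_contributors(artifact: str) -> set:
    """Collect creators of the artifact and of everything it derives from."""
    contributors, seen, stack = set(), set(), [artifact]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        contributors.add(CREATED_BY[current])
        stack.extend(DERIVED_FROM.get(current, []))
    return contributors


def inferred_credit(artifact: str) -> set:
    """Contributors reachable through the graph but not explicitly credited."""
    return all_contributors(artifact) - ASSERTED_CREDIT.get(artifact, set())


if __name__ == "__main__":
    print(inferred_credit("pmid:25577287"))
```

Following the derived-from links from the publication reaches the mouse model and the HPO annotation, so their creators surface even though they are not authors.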

Who is in the graph?

Melissa Haendel, Peter Robinson, Chris Mungall, Sebastian Kohler, Cindy Smith, Nicole Vasilevsky, Sandra Dolken

Johannes Grosse, Attila Braun, David Varga-Szabo, Niklas Beyersdorf, Boris Schneider, Lutz Zeitlmann, Petra Hanke, Patricia Schropp, Silke Mühlstedt, Carolin Zorn, Michael Huber, Carolin Schmittwolf, Wolfgang Jagla, Philipp Yu, Thomas Kerkau, Harald Schulze, Michael Nehls, Bernhard Nieswandt

Thomas Markello, Dong Chen, Justin Y. Kwan, Iren Horkayne-Szakaly, Alan Morrison, Olga Simakova, Irina Maric, Jay Lozier, Andrew R. Cullinane, Tatjana Kilo, Lynn Meister, Kourosh Pakzad, Sanjay Chainani, Roxanne Fischer, Camilo Toro, James G. White, David Adams, Cornelius Boerkoel, William A. Gahl, Cynthia J. Tifft, Meral Gunay-Aygun

Melissa Haendel, David Adams, David Draper, Bailey Gallinger, Joie Davis, Nicole Vasilevsky, Heather Trang, Rena Godfrey, Gretchen Golas, Catherine Groden, Michele Nehrebecky, Ariane Soldatos, Elise Valkanas, Colleen Wahl, Lynne Wolfe

Elizabeth Lee, Amanda Links, Will Bone, Murat Sincan, Damian Smedley, Jules Jacobson, Nicole Washington, Elise Flynn, Sebastian Kohler, Orion Buske, Marta Girdea, Michael Brudno, Jeremy Band

Hans Goeble, Karen Balbach, Nadine Pfeifer, Sandra Werner, Christian Linden

Legend: Clinical/care; Pathology; Ontologist; CS/informatics; Curator; Basic research

Tracking Evidence and Provenance of G2P Associations

Evidence is a collection of information that is used to support a scientific claim or association

Provenance is a history of what processes led to the claim being made, and what entities participated in these processes

Value of Evidence and Provenance Metadata

context to evaluate credibility/confidence

support filtering and analysis of data

detailed history for attribution

Evidence and Provenance for a Variant-Phenotype Association

Who is missing?

http://haluzz.deviantart.com/art/Waldo-at-the-hipster-party-273602450

What about patients? Can they help too?

HP:0000252

Pref Label: Microcephaly

Synonyms: Decreased Head Circumference; Reduced Head Circumference; Small head circumference

Suggested Synonyms: Small Head; Little Head; Small Skull; Little Skull; Small Cranium…

Small head = Microcephaly

https://commons.wikimedia.org/wiki/File:Microcephaly.png#/media/File:Microcephaly.png
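A small sketch of how lay synonyms could let patients' own words resolve to an HPO term. The record below copies the identifier, label, and synonyms shown on the slide (the suggested-synonym list is truncated there and is left truncated here); the dictionary layout and the `build_index` and `lookup` helpers are hypothetical.

```python
# Term record as shown on the slide; the structure itself is an assumption.
TERMS = [
    {
        "id": "HP:0000252",
        "label": "Microcephaly",
        "synonyms": [
            "Decreased Head Circumference",
            "Reduced Head Circumference",
            "Small head circumference",
        ],
        # Lay-friendly suggestions listed on the slide (list is truncated there).
        "suggested_synonyms": [
            "Small Head",
            "Little Head",
            "Small Skull",
            "Little Skull",
            "Small Cranium",
        ],
    },
]


def build_index(terms):
    """Map every lowercased label and synonym to its term identifier."""
    index = {}
    for term in terms:
        names = [term["label"], *term["synonyms"], *term["suggested_synonyms"]]
        for name in names:
            index[name.lower()] = term["id"]
    return index


INDEX = build_index(TERMS)


def lookup(text: str):
    """Return the term id for a clinical or lay phrase, or None if unknown."""
    return INDEX.get(text.strip().lower())


if __name__ == "__main__":
    print(lookup("Small head"))    # lay wording
    print(lookup("Microcephaly"))  # clinical wording
```

Both the clinical label and the plain-language phrase resolve to the same identifier, so a patient-entered description can feed the same matching tools as a clinician's annotation.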

Job opening: https://goo.gl/MlcnR5

Focusing on building ontologies and semantic web technologies to represent research, attribution, provenance, and scholarly communication

@ontowonka haendel@ohsu.edu

Funding: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; NCI/Leidos #15X143; BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman)

PIs: Chris Mungall, Peter Robinson, Damian Smedley, Tudor Groza, Harry Hochheiser

www.monarchinitiative.org/page/team
