citizen science and rare disease research

Citizen Science and Rare Disease Research Andrew Su, Ph.D. @andrewsu [email protected] http://sulab.org September 22, 2016 Personalized Health in the Digital Age Symposium Slides: slideshare.net/andrewsu

Upload: andrew-su

Post on 10-Feb-2017


TRANSCRIPT

Citizen Science and Rare Disease Research

Andrew Su, Ph.D.

@andrewsu

[email protected]

http://sulab.org

September 22, 2016

Personalized Health in the Digital Age Symposium

Slides: slideshare.net/andrewsu

2

Credit: http://www.slideshare.net/PhRMA/rare-disease-infographics

3

Credit: http://www.slideshare.net/PhRMA/rare-disease-infographics

Rare disease case study #1

4

Photo: Retta Beery

5

Bainbridge et al., STM, 2011

6

Photo: Retta Beery

Rare disease case study #2

7

8

… but no obvious treatments

9

Bainbridge et al., STM, 2011

SPR

What differentiates SPR and NGLY1?

10

SPR

11

Sarah Olmstead, https://flic.kr/p/364dZW

NGLY1

12

NGLY1 (11 PubMed articles)

Congenital disorders of glycosylation (822)

PNGase (686)

ERAD (1330)

glycosylation (48,862)

alacrima (164)

Genetic interactors (3016)

symptoms (109,928)

25 million articles in PubMed

The biomedical literature is massive…

13

[Figure: Number of new PubMed-indexed articles per year, 1983 to 2013 (y-axis: 0 to 1,200,000)]

… but it is very hard to query and compute

14

… but it is very hard to query and compute

15

Imatinib, Crizotinib, Erlotinib, Gefitinib, Sorafenib, Lapatinib, Dasatinib

AND

Acute myeloid leukemia, Acute lymphoblastic leukemia, Chronic myelogenous leukemia, Chronic lymphocytic leukemia, Hodgkin lymphoma, Non-Hodgkin lymphoma, Myeloma, …
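To make the difficulty concrete, here is a minimal sketch of the boolean query a searcher would have to assemble by hand from the two lists on this slide. Combining the terms within each list with OR is an assumption (the slide only shows AND between the lists), and `build_boolean_query` is a hypothetical helper, not part of any PubMed tooling.

```python
def build_boolean_query(drugs, diseases):
    """Return '(d1 OR d2 ...) AND (x1 OR x2 ...)' with every term quoted."""
    def or_group(terms):
        return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"
    return or_group(drugs) + " AND " + or_group(diseases)

# Terms copied from the slide; the trailing "…" on the slide is left out.
drugs = ["imatinib", "crizotinib", "erlotinib", "gefitinib",
         "sorafenib", "lapatinib", "dasatinib"]
diseases = ["acute myeloid leukemia", "acute lymphoblastic leukemia",
            "chronic myelogenous leukemia", "chronic lymphocytic leukemia",
            "Hodgkin lymphoma", "non-Hodgkin lymphoma", "myeloma"]

query = build_boolean_query(drugs, diseases)
print(query)
```

Every synonym, brand name, and subtype missing from either list is a silent gap in the results, which is the point the slide is making.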

16

Personalized medicine relies on effective KNOWLEDGE MANAGEMENT

Pietro Bellini, https://flic.kr/p/k5jmja

Information extraction from biomedical text

17

1. Identify biomedical concepts in text

… We report a case of familial systemic mastocytosis with the rare KIT K509I germ line mutation. In vitro treatment with imatinib, dasatinib and PKC412 reduced cell viability of primary mast cells harboring KIT K509I mutation. Both patients with familial systemic mastocytosis had remarkable hematological and skin improvement after three months of imatinib treatment.

Leuk Res. 2014 Oct;38(10):1245-51. doi: 10.1016/j.leukres.

GENES

DISEASES

DRUGS

VARIANTS
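Step 1 can be illustrated with a simple dictionary lookup. This is only a sketch: the lexicon below is a hypothetical four-entry dictionary built from the terms highlighted on this slide, and real concept recognition has to cope with synonyms, abbreviations, and ambiguity that exact matching ignores.

```python
import re

# Hypothetical mini-lexicon using the four concept types in the slide legend.
LEXICON = {
    "GENE": ["KIT"],
    "DISEASE": ["familial systemic mastocytosis"],
    "DRUG": ["imatinib", "dasatinib", "PKC412"],
    "VARIANT": ["K509I"],
}

def find_concepts(text, lexicon=LEXICON):
    """Return sorted (start, end, concept_type, surface_text) tuples."""
    hits = []
    for concept_type, terms in lexicon.items():
        for term in terms:
            # Whole-word, case-insensitive match (a simplification).
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(text):
                hits.append((match.start(), match.end(), concept_type, match.group(0)))
    return sorted(hits)

# First two sentences of the abstract excerpt shown on the slide.
abstract = (
    "We report a case of familial systemic mastocytosis with the rare "
    "KIT K509I germ line mutation. In vitro treatment with imatinib, "
    "dasatinib and PKC412 reduced cell viability of primary mast cells "
    "harboring KIT K509I mutation."
)

hits = find_concepts(abstract)
for start, end, concept_type, surface in hits:
    print(f"{start:>3}-{end:<3} {concept_type:<8} {surface}")
```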

Information extraction from biomedical text

18

imatinib

dasatinib

PKC412

Familial systemic mastocytosis

KIT

K509I

1. Identify biomedical concepts in text

2. Identify relationships between concepts

Mutation of

Mutation causes

causes

treats

inhibits
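One way to hold the output of step 2 is as subject, predicate, object triples. The transcript preserves the concept names and the edge labels but not which label connects which pair, so the edges below are an illustrative reading of the slide rather than a copy of it, and `objects_of` and `subjects_of` are hypothetical helpers.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Assumed edge assignments: the pairing of labels to concepts is a guess.
triples = [
    Triple("K509I", "mutation of", "KIT"),
    Triple("K509I", "causes", "familial systemic mastocytosis"),
    Triple("imatinib", "treats", "familial systemic mastocytosis"),
    Triple("imatinib", "inhibits", "KIT"),
]

def objects_of(triples, subject, predicate):
    """All objects linked from `subject` by `predicate`."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]

def subjects_of(triples, predicate, obj):
    """All subjects linked to `obj` by `predicate`."""
    return [t.subject for t in triples
            if t.predicate == predicate and t.object == obj]

print(subjects_of(triples, "treats", "familial systemic mastocytosis"))
print(objects_of(triples, "K509I", "mutation of"))
```

Stored this way, each extracted relationship stays a separate record that can be queried and traced back to the sentence it came from.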

19

Goal: Assemble a network of biomedical knowledge that is comprehensive, current, computable and traceable.

20

http://www.navy.mil/management/photodb/photos/101104-N-6383T-508.jpg

21

Crowdsourcing

is to data

is to text

biomedical

Provide a database of the world’s knowledge that anyone can edit

- Denny Vrandečić

23

Subclass of

Regulates

Physically interacts with

Protein

Neural development

Property:P279

Property:P128

Property:P129

Q8054

Q1345738

VLDL receptor Q1979313

Amyloid beta A4 Q423510

Q13561329

http://www.wikidata.org/wiki/Q13561329

Decreased expression in

Property:P1910

Schizophrenia Q41112

Bipolar disorder Q131755

Property:P279

Property:P128

Property:P129

Q8054

Q1345738

Q1979313

Q423510

Q13561329

https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q13561329&format=json

Property:P1910

Q41112

Q131755
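The machine-readable view on this slide can be sketched as follows. The URL is rebuilt from the parameters shown on the slide; the `sample` dictionary is a hand-written stub shaped like the `claims` part of a `wbgetentities` response and filled with the IDs from the slide, not live data. `entity_url` and `claim_targets` are hypothetical helper names.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def entity_url(qid):
    """Build the wbgetentities URL shown on the slide for one item."""
    params = {"action": "wbgetentities", "ids": qid, "format": "json"}
    return API + "?" + urlencode(params)

def claim_targets(entity_json, qid, prop):
    """Return the item IDs that `qid` points to through property `prop`."""
    claims = entity_json["entities"][qid].get("claims", {})
    targets = []
    for statement in claims.get(prop, []):
        value = statement["mainsnak"].get("datavalue", {}).get("value")
        if isinstance(value, dict) and "id" in value:
            targets.append(value["id"])
    return targets

def _statement(target_qid):
    # Minimal stand-in for one statement; real responses carry more fields.
    return {"mainsnak": {"datavalue": {"value": {"id": target_qid}}}}

# Stub response using the IDs from the slide (assumed shape, trimmed).
sample = {"entities": {"Q13561329": {"claims": {
    "P279": [_statement("Q8054")],
    "P1910": [_statement("Q41112"), _statement("Q131755")],
}}}}

print(entity_url("Q13561329"))
print(claim_targets(sample, "Q13561329", "P1910"))
```

Fetching the URL with any HTTP client and passing the parsed JSON to `claim_targets` should work the same way, provided the live response keeps this structure.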

We are seeding it with biomedical data

• All human, mouse genes and proteins
• All Gene Ontology terms
• All FDA approved drugs
• 9,000+ human diseases
• 120 reference microbial genomes

Burgstaller et al (2016) Database (preprint in BioRxiv)

Mitraka et al (2015) Semantic Web Applications for the Life Sciences (best paper) (preprint in BioRxiv)

Putman et al (2016) Database (preprint in BioRxiv)

Inter-item links form a giant knowledge graph

Everything is connected

Reelin, Heart disease, Barack Obama, everything...

https://query.wikidata.org

SPARQL endpoint for Wikidata
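As a rough illustration of what the endpoint allows, the sketch below builds a request for every item that Q13561329 links to through P1910 ("decreased expression in"), the IDs shown on the earlier slide. The query text and the `sparql_url` helper are illustrative assumptions, not taken from the talk; the request is only constructed, not sent.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"

# Items that Q13561329 points to via P1910, with English labels.
QUERY = """
SELECT ?condition ?conditionLabel WHERE {
  wd:Q13561329 wdt:P1910 ?condition .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

def sparql_url(query):
    """Return a GET URL that asks the endpoint for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})

url = sparql_url(QUERY)
print(url)
```

Opening the printed URL in a browser, or pasting the query into the web form at the address above, is enough to try it.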

28

Crowdsourcing

Question: Can a group of non-scientists collectively perform concept recognition in biomedical texts?
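"Collectively" implies some rule for merging many workers' highlights into one answer. The slides do not say which rule was used, so the sketch below assumes a simple vote threshold: a text span is kept if at least `min_votes` workers marked it. Worker names and character offsets are made up for illustration.

```python
from collections import Counter

def aggregate(annotations_by_worker, min_votes):
    """Keep every (start, end) span marked by at least `min_votes` workers."""
    counts = Counter()
    for spans in annotations_by_worker.values():
        counts.update(set(spans))  # each worker votes once per span
    return {span for span, votes in counts.items() if votes >= min_votes}

# Hypothetical annotations of one abstract by three workers.
annotations = {
    "worker_1": [(20, 50), (64, 67)],
    "worker_2": [(20, 50)],
    "worker_3": [(20, 50), (64, 67), (100, 108)],
}

consensus = aggregate(annotations, min_votes=2)
print(sorted(consensus))
```

Raising the threshold trades recall for precision, so the choice of `min_votes` matters as much as the number of workers.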

29

30

Experts versus crowd for concept identification

593 PubMed abstracts

6,900 mentions of “disease concepts”

F = 0.87

F = 0.78

$$$

31

Experts versus crowd for concept identification

593 PubMed abstracts

6,900 mentions of “disease concepts”

F = 0.87

F = 0.87

$$$

• 9 days
• 145 workers
• Total: $630.96
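For readers unfamiliar with the F values on these slides, the sketch below shows how an F-measure is commonly computed, assuming F here means the balanced F1 score and that an annotation counts as correct only when its span matches the reference exactly. Both are assumptions; the slides do not state the matching rule. The spans are invented; only the cost and abstract counts come from the slide.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall and F1 between two sets of spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Invented example: three of four crowd spans agree with the reference.
gold = {(0, 5), (10, 18), (30, 41), (50, 58)}
crowd = {(0, 5), (10, 18), (30, 41), (60, 70)}
precision, recall, f1 = precision_recall_f1(crowd, gold)
print(precision, recall, f1)

# Figures from the slide: total paid cost and number of abstracts.
cost_per_abstract = 630.96 / 593
print(round(cost_per_abstract, 2))
```

On the slide's own numbers, the paid crowd works out to roughly one dollar per abstract.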

32

http://mark2cure.org

33

Paid crowdsourcing ($$$)

• F = 0.87
• 9 days
• 145 workers
• Total: $630.96

Citizen Science (“Help science, please”)

• F = 0.84
• 28 days
• 212 workers
• Total cost: $0

Mapping the biomedical network around NGLY1

34

NGLY1

35

http://mark2cure.org

36

A preliminary view of the NGLY1-focused biological network

1,200 contributors

3,200 documents

787,400 annotations

37

Personalized medicine relies on effective KNOWLEDGE MANAGEMENT

Pietro Bellini, https://flic.kr/p/k5jmja

38

If I have seen further than others, it is by standing on the shoulders of giants.

- Sir Isaac Newton

39

Other group members

Jake Bruggeman, Karthik G, Ramya Gamini, Louis Gioia, Toby Li, Greg Stupp

Mark2Cure

Jennifer Fouquier, Max Nanis, Ginger Tsueng

AMT volunteers and Mark2Curators!

Matt and Cristina Might

NGLY1 community

Gene Wiki

Ben Good, Sebastian Burgstaller, Tim Putman, Núria Queralt Rosinach, Julia Turner, Andra Waagmeester

BioThings API

Chunlei Wu, Julee Adesara, Cyrus Afrasiabi, Sebastien Lelong, Mike Mayers, Kevin Xin

Funding and Support

BioGPS: GM83924

Gene Wiki: GM089820

BD2K COE: GM114833

Contact

http://sulab.org

[email protected]

@andrewsu

Slides: slideshare.net/andrewsu

Icon credits (Noun Project, Wikimedia Commons): Zach VanDeHey, hunotika, Viktorvoigt, Alberto Rojas, Lloyd Humphreys

Why do I Mark2Cure?

40

I am retired, have a doctorate in medical humanities, and have two children with Gaucher disease. I am just looking for some way to put my education to use.

My 4 year old daughter Phoebe is living with and battling rare disease.

I have Ehlers Danlos Syndrome. I hope to help people learn about this painful and debilitating disorder, so that others like me can receive more effective medical care.

Take part in something that helps humanity.

I Mark2Cure in memory of my son Mike who had type 1 diabetes.

Studied biology in college and I really miss it!

In memory of my daughter who had Cystic Fibrosis

Give back