opensnp - crowdsourcing genome wide association studies

Crowdsourcing Genome Wide Association Studies 23.01.12, Bastian Greshake

Upload: bastian-greshake

Post on 15-Apr-2017


some words about me

• BSc in Life Sciences (2010)

• Working at Biodiversity & Climate Research Center (since 2010)

• MSc studies at the Goethe University in Frankfurt/Main (since 2011)

• Not exactly a biologist with much professional background in human genetics, but...

some words about me

• some background in data mining (mainly transcriptomics)

• some experience with web applications

• interest in social media & crowd-sourcing

• customer of DTC genetic testing myself

finding DTC results up to now

mining DTC genetic tests

• results are hidden somewhere on the web

• often no phenotypic annotation

• not easily re-usable
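
To make "not easily re-usable" concrete: raw exports from DTC providers such as 23andMe are typically tab-separated text with `#` comment lines and four columns (rsid, chromosome, position, genotype). The following is a minimal sketch of reading such a file into a dictionary. The function name, the `Genotype` type and the sample rows (including the rsIDs and positions) are made up for illustration and are not part of the talk.

```python
import io
from typing import Dict, Iterable, NamedTuple


class Genotype(NamedTuple):
    chromosome: str
    position: int
    genotype: str


def parse_raw_genotypes(lines: Iterable[str]) -> Dict[str, Genotype]:
    """Parse tab-separated rsid/chromosome/position/genotype rows."""
    calls = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip malformed rows rather than guessing
        rsid, chromosome, position, genotype = fields
        calls[rsid] = Genotype(chromosome, int(position), genotype)
    return calls


# Illustrative data only: these rsIDs and positions are invented.
SAMPLE = """# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAA
rs0000002\t1\t2000\tAG
rs0000003\tX\t3000\t--
"""

calls = parse_raw_genotypes(io.StringIO(SAMPLE))
print(calls["rs0000002"].genotype)
```

Other providers use different column layouts, so a shared repository would need one such parser per export format.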

let’s code it: openSNP

• wants to be a central repository for sharing DTC results

• enables users to share phenotypes as well

• lowers barrier to participate

• motivation to share through benefits for users

• can we take it a step further and provide data for GWAS?
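
As a rough illustration of what "providing data for GWAS" means at the level of a single SNP, here is a sketch of a basic allelic association test: count one allele in cases and controls, build a 2x2 table and compute a chi-square statistic with one degree of freedom. This is a textbook test, not necessarily what openSNP or any cited study uses, and the genotype lists are invented.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def allele_counts(genotypes: Iterable[str], allele: str) -> Tuple[int, int]:
    """Return (copies of `allele`, copies of any other called allele)."""
    counts = Counter()
    for genotype in genotypes:
        for base in genotype:
            if base in "ACGT":  # ignore no-calls such as "--"
                counts[base == allele] += 1
    return counts[True], counts[False]


def allelic_chi_square(cases, controls, allele):
    """Chi-square test (1 degree of freedom) on the 2x2 allele table."""
    a, b = allele_counts(cases, allele)
    c, d = allele_counts(controls, allele)
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented example data: 100 cases and 100 controls for one SNP.
cases = ["AA"] * 30 + ["AG"] * 50 + ["GG"] * 20
controls = ["AA"] * 20 + ["AG"] * 40 + ["GG"] * 40

chi2, p_value = allelic_chi_square(cases, controls, "A")
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

A real GWAS repeats such a test across hundreds of thousands of SNPs, which is why a much stricter threshold (conventionally around 5e-8) is used, and why sample size and phenotype accuracy matter so much.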

mining DTC genetic tests

• lots of potential for open data (100k+ customers)

• cheap data source for scientists

Figure (pie chart): Would you share DTC test results? (n=226)

Legend: Yes, Only with DTC company, No. Slice values: 6 %, 26 %, 68 %.

the front

technical implementation

• framework: Ruby on Rails

• database: PostgreSQL

• background job processing via Resque (known from GitHub)

• basic API via JSON queries
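
A JSON API like the one mentioned above could be consumed along these lines. This is only a sketch: the host, the URL layout in `snp_url` and the field names in the sample response are assumptions made for illustration, so check the project's API documentation for the real endpoints and schema.

```python
import json
import urllib.request
from collections import Counter

BASE_URL = "https://opensnp.org"  # assumed host


def snp_url(rsid: str) -> str:
    """Build a URL for one SNP (the path layout is an assumption)."""
    return f"{BASE_URL}/snps/{rsid}.json"


def fetch_json(url: str, timeout: float = 10.0):
    """Download and decode a JSON document."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def genotype_frequencies(records) -> Counter:
    """Count genotypes in records that carry a 'genotype' field
    (hypothetical schema)."""
    return Counter(r["genotype"] for r in records if r.get("genotype"))


# Offline stand-in for a server response; the field names are made up.
sample_response = json.loads(
    '[{"user": "u1", "genotype": "AA"},'
    ' {"user": "u2", "genotype": "AG"},'
    ' {"user": "u3", "genotype": "AG"}]'
)
frequencies = genotype_frequencies(sample_response)
print(frequencies.most_common())
```

In real use, `fetch_json(snp_url(...))` would replace the hard-coded sample, with the parsing adapted to whatever structure the server actually returns.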

Personal Genome Project

other resources

• Personal Genome Project

    • data is open

    • participation is not

    • no easy way to download data, no API etc.

• genomera

    • participation will be open (currently invite-only beta)

    • focus on small-scale studies/experiments

genomera

problems & potential of patient-driven/crowd-sourced research

• problems

    • sample sizes

    • bias in participants

    • motivation of participants

    • accuracy of data

• potential

    • possible sample sizes

    • low costs

    • "warm fuzzy feeling inside" for patients

positive examples: PatientsLikeMe

• around since ~2006

• published a dozen studies since then

• famous example: ALS research on lithium carbonate intake (149 patients, 447 controls)

Paul Wicks et al. (2011) Accelerated clinical discovery using self-reported patient data collected online and a patient-matching algorithm, Nature Biotechnology 29, 411–414

positive examples: 23andMe

• published some studies in 2010/2011

• done with self-reported data

• studies include 10,000+ to 30,000+ participants

positive examples: 23andMe – general traits

“Replications of associations [...] for hair color, eye color, and freckling validate the Web-based, self-reporting paradigm. The identification of novel associations for hair morphology [...], freckling [...], the ability to smell the methanethiol produced after eating asparagus [...], and photic sneeze reflex [...] illustrates the power of the approach.”

Nicolas Eriksson et al. (2010) Web-Based, Participant-Driven Studies Yield Novel Genetic Associations for Common Traits. PLoS Genet 6(6): e1000993. doi:10.1371/journal.pgen.1000993

positive examples: 23andMe – Parkinson’s Disease

“We discovered two novel, genome-wide significant associations with [Parkinson’s Disease]—both replicated in an independent cohort. We also replicated 20 previously discovered genetic associations (including LRRK2, GBA, SNCA, MAPT, GAK, and the HLA region), providing support for our novel study design.”

Chuong B. Do et al. (2011) Web-Based Genome-Wide Association Study Identifies Two Novel Loci and a Substantial Genetic Component for Parkinson's Disease. PLoS Genet 7(6): e1002141. doi:10.1371/journal.pgen.1002141

Quantified Self and Science

Quantified Self Movement

QS projects

• tracking health in response to workouts (minimizing impacts of disease/genetic predisposition)

• tracking response to different drugs

• tracking well-being in response to eating habits (butter vs. arithmetic)

butter vs. arithmetic

source: Seth Roberts - quantifiedself.com

my conclusions

• technology enables new kinds of research

• DTC results and patient-driven research can lead to new scientific knowledge

• can be a valuable addition to traditional research

openSNP: now & future

• won the Mendeley/PLoS Binary Battle in 2011

• received some funding from the German Wikimedia foundation to get more people genotyped

• collaborating with Consent to Research to get an IRB-approved consent process

• working on implementing the Distributed Annotation System

thanks for your attention

source: xkcd.com, CC-BY-NC