The $0 Genome & PersonalGenomes - Harvard University (arep.med.harvard.edu/gmc/ppt/09Jun9_CGC_Hynes.pdf)

1

3:50-4:20 PM GC at CGC 9-Jun-2009

Thanks to:

The $0 Genome & PersonalGenomes.org

Azco

RBH

2

What does $0 to the consumer mean?

1991 Linux
1993 WWW
2001 Wikipedia
1998 Google Search, Maps, Translate, Health..

3

Specifications other than cost
1. Speed (really real-time)
2. No reagents or stable in harsh conditions
3. Portability (Instrument size)
4. Read length (Mbp)
5. Keep DNA parts together in mixtures
6. Subsequence targeting (e.g. drug resistance)

[Figure labels: emulsion (oil, H2O); microbe chromosomes; barcode]

4

DNA Explorer, $80 (Ages 10 and up) www.discovery.com

Genographic Project $99

DIY Bio

23andme $399 (Time Magazine Nov 2008 invention of the year)

5

DTC SNP chips : Breast Cancer

deCODEme: “does not include the high-risk but rare BRCA1 and BRCA2 breast cancer risk variants”.

Navigenics: “Mutations in BRCA1 or BRCA2 are less common in the population and are only present in approximately 5 – 10% of families with breast and ovarian cancer.”

23andme: “Hundreds of cancer-associated BRCA1 and BRCA2 mutations have been documented, but three specific BRCA mutations are worthy of note because they are responsible for a substantial fraction of hereditary breast cancers and ovarian cancers among women with Ashkenazi Jewish ancestry”.

1M vs 3G

6

“Genes Show Limited Value in Predicting Diseases”

Nicholas Wade April 15, 2009

David B. Goldstein, Ph.D.: “We must therefore turn more sharply toward the study of rare variants.”

(Common SNP backlash)

7

Valuable Personal Genome Sequences

1464 genes are highly predictive & medically actionable (inherited & cancer) at ~$2K per gene.

**Very few of these are on SNP chips.** Why?
PKU, Tay Sachs, Cystic Fibrosis, BRCA1/2, etc.

Pharmacogenomic drug/allele combinations: Herceptin, Iressa, ..

Also: Ancestry, Forensics, Social Networking, Education, Research
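
As a rough arithmetic check on the figures on this slide, the sketch below totals the cost of testing all 1464 genes one at a time at about $2K each. The $5K whole-genome figure used for comparison comes from the cost slide of this talk ("2009:Lig:$5K"); the variable names are illustrative assumptions.

```python
# Figures quoted on the slides; names are illustrative only.
GENES = 1464
COST_PER_GENE_USD = 2_000   # "~$2K per gene"
GENOME_COST_USD = 5_000     # "2009:Lig:$5K" on the cost slide

# Cost of testing every gene separately, and how that compares to one genome.
per_gene_total = GENES * COST_PER_GENE_USD
ratio = per_gene_total / GENOME_COST_USD

print(f"Gene-by-gene total: ${per_gene_total:,}")
print(f"Ratio to one $5K genome: {ratio:.0f}x")
```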

8

Multigenic rare causative alleles can yield strong or weak GWA with a common allele

[Figure: two panels of cases vs controls, one with strong GWA and one with weak GWA. Red = haplotype block]

9

[Figure: Seq bp/$ (log scale, 0.01 to 10,000,000) vs year, 1980 to 2010]

(Moore’s law) 1.5x/yr for electronics vs 10x/yr for DNA Sequencing: 4 logs in 4 years

1995: gel: $3G
2005: capil: $50M
2009: Lig: $5K
Pol: $50K
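
The rates quoted on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes constant yearly improvement; the function and variable names are illustrative, and the only inputs are the numbers on the slide.

```python
import math

ELECTRONICS_RATE = 1.5   # fold improvement per year (slide's Moore's law figure)
SEQUENCING_RATE = 10.0   # fold improvement per year (slide's sequencing figure)
YEARS = 4


def fold_change(rate_per_year: float, years: float) -> float:
    """Compound improvement after `years` at a constant yearly rate."""
    return rate_per_year ** years


seq_logs = math.log10(fold_change(SEQUENCING_RATE, YEARS))
elec_logs = math.log10(fold_change(ELECTRONICS_RATE, YEARS))

# Cost points quoted on the slide: 2005 at $50M and 2009 at $5K.
cost_drop_logs = math.log10(50_000_000 / 5_000)

print(f"Sequencing: {seq_logs:.1f} logs in {YEARS} years")
print(f"Electronics: {elec_logs:.2f} logs in {YEARS} years")
print(f"2005 -> 2009 genome cost drop: {cost_drop_logs:.1f} logs")
```

Under these assumptions the 2005 and 2009 cost points agree with the "4 logs in 4 years" label, while 1.5x/yr compounds to roughly a 5-fold gain over the same period.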

10

Ultra-low-cost sequencing
1. Polonator SbL/P Open-source $170K device, haplotypes
2. Roche-454 SbP Long reads (>0.4 kb)
3. Illumina-GA SbP Fluorescent read-length 2*110 bp
4. AB-SOLiD SbL Longest ligation reads
5. Helicos SbP High parallelism & quantitation
6. CGI SbL Rolony grid & 100Kb haplotypes $5K genome
7. Ion Torrent SbP Potentially small device
8. Genizon BioSci SbH In situ sequencing
9. LightSpeed SbL 16X higher density, >10X speed
10. Intelligent Bio SbP Hexagonal grid
11. Pacific Bio SbP Long reads (>2.0 kb)
12. Bionanomatrix SbP Fluorescent mapping (>300kb)
13. Visigen SbP
14. OxfordNanopore Pore Potentially small device
15. Nabsys Pore Potentially small device
16. Halcyon EM Long reads (>300kb)
17. ZS Genetics EM Long reads (>300kb)

Polonator Polonator

11

Sequencing mC

G T A C

Clarke, Bayley, et al. Nature Nanotech 2009

12

Electron Microscopy

YOYO labeled stretched ss-M13-DNA on PDMS

15 μm = 30 kb

Pt-G-ssDNA 0.5 nm = 1 base

William Andregg, et al. unpublished, 2009

.gg...gg....g..g.....gg.n....g...g...g..ggg.....gg.gg....n..
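
The two scale labels on this slide appear consistent with each other, which a one-line conversion shows. This sketch assumes the 15 μm length corresponds to all 30 kb of stretched single-stranded DNA; the variable names are illustrative.

```python
# Scale labels from the slide; names are illustrative only.
STRETCHED_LENGTH_UM = 15.0   # "15 μm"
BASES = 30_000               # "30 kb"

# Convert micrometres to nanometres, then divide by the number of bases.
nm_per_base = STRETCHED_LENGTH_UM * 1_000 / BASES

print(f"{nm_per_base} nm per base")
```

The result matches the "0.5 nm = 1 base" label given for the Pt-G-ssDNA image.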

13

Electron Microscopy

Osmium T

William Andregg, et al. unpublished, 2009

10 vs 10K fps

14

Why open-architecture hardware, software, wetware?

Polonator

1999-2009

$170K

2 billion reads per run

Precedents:
1981 IBM PC
1991 Linux
1993 WWW
2001 Wikipedia

Rich Terry

Figure 4.6.1 Polonator instrument

A shared resource: Pol & Ligase chemistries

15

Anonymity vs Open-access? Are we in denial?

Trends in laws to make data public (not just at elite institutions): e.g. H.R. 2764, SEC. 218. 26-Dec-07 open-access publishing for all NIH-funded research.

(12) Identify individual case/control status from pooled SNP data (Homer et al. PLoS Genetics 2008). As this became known, NCBI pulled dbGaP data.

(11) Re-identification after “de-identification” using public data. Group Insurance list of birth date, gender, zip code sufficient to re-identify medical records of Governor Weld & family via voter-registration records (1998)

Self identification trend
(10) Unapproved self-identification. e.g. Celera IRB. (Kennedy Science. 2002)
(9) Obtaining data about oneself via FOIA or sympathetic researchers.
(8) DNA data CODIS data in the public domain.

even if acquitted
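
Item (11) describes a linkage attack: a "de-identified" table and a public table are joined on the fields they share. The sketch below illustrates the idea on entirely synthetic records. The field names, the `reidentify` helper, and all values are assumptions made for illustration, not data from the 1998 case.

```python
# Entirely synthetic records; every name and value here is made up.
medical = [
    {"birth_date": "1950-01-02", "gender": "M", "zip": "00001", "diagnosis": "dx-A"},
    {"birth_date": "1962-03-04", "gender": "F", "zip": "00002", "diagnosis": "dx-B"},
    {"birth_date": "1962-03-04", "gender": "F", "zip": "00003", "diagnosis": "dx-C"},
]
voters = [
    {"name": "Person One", "birth_date": "1950-01-02", "gender": "M", "zip": "00001"},
    {"name": "Person Two", "birth_date": "1962-03-04", "gender": "F", "zip": "00002"},
    {"name": "Person Three", "birth_date": "1971-05-06", "gender": "F", "zip": "00003"},
]

# The three fields named on the slide.
QUASI_IDENTIFIERS = ("birth_date", "gender", "zip")


def key(record):
    """Tuple of quasi-identifier values used as the join key."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(deidentified, public):
    """Link records whose quasi-identifiers match exactly one public entry."""
    index = {}
    for person in public:
        index.setdefault(key(person), []).append(person["name"])
    matches = {}
    for rec in deidentified:
        names = index.get(key(rec), [])
        if len(names) == 1:  # a unique match means the record is re-identified
            matches[names[0]] = rec["diagnosis"]
    return matches


linked = reidentify(medical, voters)
print(linked)
```

In this toy data two of the three medical records link to a unique name. No names were present in the medical table, which is the point of the example.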

16

Anonymity vs Open Access? Are we in denial?

Accessing “Secure data”
(7) Laptop loss. 26 million Veterans' medical records, SSN & disabilities stolen Jun 2006.
(6) Hacking. A hacker gained access to confidential medical info at the U. Washington Medical Center -- 4000 files (names, conditions, etc, 2000)
(5) Combination of surnames from genotype with geographical info. An anonymous sperm donor traced on the internet 2005 by his 15 year old son who used his own Y chromosome data.
(4) Identification by phenotype. If CT or MR imaging data is part of a study, one could reconstruct a person’s appearance. Even blood chemistry can be identifying in some cases.
(3) Inferring phenotype from genotype. Markers for eye, skin, and hair color, height, weight, geographical features, dysmorphologies, etc. are known & the list is growing.
(2) “Abandoned DNA” bearing samples (e.g. hair, dandruff, hand-prints, etc.)
(1) Government subpoena. False positive IDs and/or family coercion

index

17

Who can contribute to cures?

Huntington's: Nancy Wexler (psychologist)
Adrenoleukodystrophy: Odone (World Bank)
Parkinson’s: Brin family (LRRK2 G2019S)
Hugh Rienhoff (MD): MyDaughtersDNA.org
ALS: Jamie Heywood (engineer), PatientsLikeMe.com
HFE: Aull (engineer)

Motivating, donating data ... access to data?

18

Genes, environments, traits, cells
1) First/only open access data
2) Avoid over-promising on de-identification
3) 100% on Exam to assure informed consent (*Educate pre-consent rather than post-discovery*)
4) Low cost coding sequence + regulatory data
5) Multi-traits: images, iPS-etc. RNA, microbe/VDJ
6) Cells available for personal functional genomics
7) IRB approval for 100,000 diverse volunteers

501(c)(3)

[Figure labels (numeric IDs): 0431, 1070, 1660, 1677, 1687, 1833, 1846, 1731, 1730, 1781]

19

Traitomatic: 7 diploid +10 PGP sequences: hypertrophic cardiomyopathy allele

20

Diagnostics Systems Biology Challenge

TRAITS(Phenome)

Genome 6 Gbp

3M Alleles

NOT going from ONLY Genome Sequence to Prediction

21

PersonalGenomes.org
Inherited, Somatic, Environmental Genomics

VDJ-ome

TRAITS (Phenome)

Personal stem-cells
epigenome (RNA, mC)

PERSONAL GENOME
6 Gbp, 3M alleles

Once-in-a-lifetime genome + yearly (to daily) tests

Public Health Bio-weathermap.org : Allergens, Microbes, Viruses

Microbiome
~5 new non-synonymous alleles per generation

22

Microbiome vs VDJ-ome

Microbe tests:
Detect drug resistance spectrum
Earlier warning (e.g. meningitis)

Immune tests:
Focus on response to exposure
Longer times to detect exposure (e.g. HIV, TB)

23

Multiple Phyla Subsisting on 18 Antibiotics

Dantas, Sommer, Church. Science 2008

(& lignin)

24

PersonalGenomes.org
Inherited, Somatic, Environmental Genomics

VDJ-ome

TRAITS (Phenome)

Personal stem-cells
epigenome (RNA, mC)

PERSONAL GENOME
3M alleles

Once-in-a-lifetime genome + yearly (to daily) tests

Public Health Bio-weather map : Allergens, Microbes, Viruses

Microbiome

25

Epigenome: DNA - RNA - Protein

Regulatory RNA & Proteins

26

Selective genome sequencing

Shendure, et al. Science 309:1728
Porreca et al. 2007 Nat Methods 4:931
Nilsson et al. (2006) Trends Biotechnol 24:83.

Red=Synthetic; Yellow=genome/cDNA

Optimize 258K oligos: 148,949 exons, 20,065 CCDS genes.

3 ways to capture alleles from genomic or c-DNA

[Figure: three capture methods, numbered 1 to 3. Labels: In vitro Paired-end-tags (PET), for rearrangements (Science 2005); Hybridiz. selection; GapFill (Nat Methods 2007)]

Zhang, Chou, Shendure, Li, Leproust, Dahl, Davis, Nilsson, Church

27

Array Synthesis of Padlock Probes

barcodes

28

Barcoding RNAs

Efficient microRNA capture and barcoding via enzymatic oligonucleotide adenylation. Vigneault et al. Nature Methods 2009

[Figure: ligation scheme. Labels: PO4, App, 5’, 3’-OH, X 3’, T4 RNA ligase, ATP]

29

RNA editing: A to I (G)

# of known cases increased from 10 to 569

Erez Levanon

Genomic DNA

RNA - intestine

RNA - kidney

RNA - diencephalon

RNA - frontal lobe

RNA - corpus callosum

RNA - cerebellum

Li, Levanon, Yoon, Aach, Xie, LeProust, Zhang, Gao, Church (Science 2009)

e.g. VEZF1
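
Because inosine is read as G during sequencing, an A-to-I edit shows up as an A in genomic DNA paired with a G in the RNA-derived sequence from the same individual. The sketch below shows that comparison on toy, made-up sequences. The function name is an assumption, the sequences are not real VEZF1 data, and this is not the published pipeline, which would also have to exclude SNPs and sequencing errors.

```python
def candidate_a_to_i_sites(genomic: str, cdna: str) -> list[int]:
    """Return 0-based positions where genomic A is read as G in cDNA."""
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        i
        for i, (g, r) in enumerate(zip(genomic.upper(), cdna.upper()))
        if g == "A" and r == "G"
    ]


# Toy, made-up sequences for two of the tissues listed on the slide.
genomic_dna = "TTACGATAGC"
rna_reads = {
    "intestine": "TTACGATAGC",
    "cerebellum": "TTGCGATGGC",
}

for tissue, seq in rna_reads.items():
    print(tissue, candidate_a_to_i_sites(genomic_dna, seq))
```

Comparing several tissues against the same genomic sequence, as the slide's panel does, separates tissue-specific editing from genomic variants, which would appear in every tissue and in the DNA itself.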

30

Regulation & Methylation

High Expression = High Gene-Body to Promoter Ratio

Ball, Li, Gao, Lee, LeProust, Park, Xie, Daley, Church. (Nature Biotech 2009)

Genome wide bisulfite & enzyme assays unrestricted by CpG Island bias

31

Allele-specific expression (ASE)

N=1: Combine all cis element variants & eliminate environmental & trans-acting variation among individuals.

Cis: Copy number, enhancer, promoter, splicing, polyA, termination, transport, decay.

Allele-specific transcription factor binding

Causality: Synthetic homologous allele-replacement

Zhang, Li, Church unpublished
Forton et al. Genome Res. 2007

[Figure labels: alleles G/A, TC, GA, TT, GG; polyA tail; TF]
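
One common way to ask whether expression at a heterozygous site is allele-specific is to count reads carrying each allele and test the counts against an expected 50:50 split. The sketch below uses an exact two-sided binomial test from the standard library. The choice of test, the function names, and the read counts are assumptions for illustration; the slides do not state which statistic the authors used.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: sum of outcomes no likelier than k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so ties in floating point are counted.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the reference allele."""
    return ref_reads / (ref_reads + alt_reads)


# Made-up read counts at one heterozygous site.
ref, alt = 30, 10
ratio = allelic_ratio(ref, alt)
p_value = binom_two_sided_p(ref, ref + alt)

print(f"ratio={ratio:.2f}, p={p_value:.4g}")
```

Because both alleles sit in the same cell, this within-individual comparison is what lets the N=1 design cancel environmental and trans-acting effects.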

32

PersonalGenomes.org: skin to stem cells to many types

Park & Lee

Hair or skin sample

33

Clustering stat-significant allele-specific expression in reprogrammed cells, ~50% of ASE invariant among cell types

Lee, Zhang, Park, Daley, Church

34

PersonalGenomes.org
Inherited, Somatic, Environmental Genomics

VDJ-ome

TRAITS (Phenome)

Personal stem-cells
epigenome (RNA, mC)

PERSONAL GENOME
6 Gbp, 3M alleles

Once-in-a-lifetime genome + yearly (to daily) tests

Public Health Bio-weather map : Allergens, Microbes, Viruses

Microbiome