Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Teri A. Manolio, M.D., Ph.D., Director, Office of Population Genomics; Senior Advisor to the Director, NHGRI, for Population Genomics
National Human Genome Research Institute, National Institutes of Health, U.S. Department of Health and Human Services
November 20, 2008


Page 1: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Teri A. Manolio, M.D., Ph.D.
Director, Office of Population Genomics
Senior Advisor to the Director, NHGRI, for Population Genomics
November 20, 2008

National Human Genome Research Institute
National Institutes of Health
U.S. Department of Health and Human Services

Page 2: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

We Live in Interesting Times…

“‘May he live in interesting times.’ Like it or not we live in interesting times.”

--Robert Kennedy, June 7, 1966

May you come to the attention of those in authority.

May you find what you are looking for.

Wikipedia, accessed 11Sep07

Page 3: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

[Figure legend, publication period: 2005; 2006; 2007 first quarter; 2007 second quarter; 2007 third quarter; 2007 fourth quarter; 2008 first quarter; 2008 second quarter; 2008 third quarter]

Manolio, Brooks, Collins, J. Clin. Invest., May 2008

Page 4: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Pennisi E, Science 2007; 318:1842-43.

2007: The Year of GWA Studies

Page 5: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Diseases and Traits with Published GWA Studies (n = 76, 11/17/08)

• Macular Degeneration
• Exfoliation Glaucoma

• Lung Cancer
• Prostate Cancer
• Breast Cancer
• Colorectal Cancer
• Bladder Cancer
• Neuroblastoma
• Melanoma
• TP53 Cancer Predispos’n
• Chr. Lymph. Leukemia

• Inflamm. Bowel Disease
• Celiac Disease
• Gallstones
• Irritable Bowel Syndrome

• QT Prolongation
• Coronary Disease
• Coronary Spasm
• Atrial Fibrillation/Flutter
• Stroke
• Subarachnoid Hemorrhage
• Intracranial Aneurysm
• Hypertension
• Hypt. Diuretic Response
• Peripheral Artery Disease

• Syst. Lupus Erythematosus
• Sarcoidosis
• Pulmonary Fibrosis
• Psoriasis
• HIV Viral Setpoint
• Childhood Asthma

• Type 1 Diabetes
• Type 2 Diabetes
• Diabetic Nephropathy
• End-St. Renal Disease
• Obesity, BMI, Waist, IR
• Height
• Osteoporosis
• Osteoarthritis
• Male Pattern Baldness

• F-Cell Distribution
• Fetal Hgb Levels
• C-Reactive Protein
• ICAM-1
• Total IgE Levels
• Uric Acid Levels, Gout
• Protein Levels
• Vitamin B12 Levels
• Recombination Rate
• Pigmentation

• Lipids and Lipoproteins
• Warfarin Dosing
• Ximelagatran Adv. Resp.

• Parkinson Disease
• Amyotrophic Lat. Sclerosis
• Multiple Sclerosis
• MS Interferon-β Response
• Prog. Supranuclear Palsy
• Alzheimer’s Disease in ε4+
• Cognitive Ability
• Memory
• Hearing
• Restless Legs Syndrome
• Nicotine Dependence
• Methamphetamine Depend.
• Neuroticism
• Schizophrenia
• Sz. Iloperidone Response
• Bipolar Disorder
• Family Chaos
• Narcolepsy
• Attention Deficit Hyperactivity
• Personality Traits

• Rheumatoid Arthritis
• RA Anti-TNF Response

Page 6: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Hunter DJ and Kraft P, N Engl J Med 2007; 357:436-439.

“There have been few, if any, similar bursts of discovery in the history of medical research…”

Page 7: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

What is a Genome-Wide Association Study?

• Method for interrogating all 10 million variable points across human genome

• Variation inherited in groups, or blocks, so not all 10 million points have to be tested

• Blocks are shorter (so need to test more points) the less closely people are related

• Technology now allows studies in unrelated persons, assuming 5,000 – 10,000 base pair lengths in common (300,000 – 1,000,000 markers)

Page 8: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

DNA on Chromosome 7

GAAATAATTAATGTTTTCCTTCCTTCTCCTATTTTGTCCTTTACTTCAATTTATTTATTTATTATTAATATTATTATTTTTTG

AGACGGAGTTTC/ACTCTTGTTGCCAACCTGGAGTGCAGTGGCGTGATCTCAGCTCACTGCACACTCCGCTTTCCTGGTTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGACTACAGTCACACACCACCACGCCCGGCTAATTTTTGTATTTTTAGTAGAGTTGGGGTTTCACCATGTTGGCCAGACTGGTCTCGAACTCCTGACCTTGTGATCCGCCAGCCTCTGCCTCCCAAAGAGCTGGGATTACAGGCGTGAGCCACCGCGCTCGGCCCTTTGCATCAATTTCTACAGCTTGTTTTCTTTGCCTGGACTTT

ACAAGTCTTACCTTGTTCTGCC/TTCAGATATTTGTGTGGTCTCATTCTGGTGTGCCAGTAGCTAAAAATCCATGATTTGCTCTCATCCCACTCCTGTTGTT

CATCTCCTCTTATCTGGGGTCACA/CTATCTCTTCGTGATTGCATTCTGATCCCCAGTACTTAGCATGTGCGTAACAACTCTGCCTCTGCTTTCCCAGGCTGTTGATGGGGTGCTGTTCATGCCTCAGAAAAATGCATTGTAAGTTAAATTATTAAAGATTTTAAATATAGGAAAAAAGTAAGCAAACATAAGGAACAAAAAGGAAAGAACATGTATTCTAATCCATTATTTATTATACAATTAAGAAATTTGGAAACTTTAGATTACACTGCTTTTAGAGATGGAGATGTAGTAAGTCTTTTACTCTTTACAAAATACATGTGTTAGCAATTTTGGGAA

GAATAGTAACTCACCCGAACAGTG/TAATGTGAATATGTCACTTACTAGAGGAAAGAAGGCACTTGAAAAACATCTCTAAACCGTATAAAAACAATTACATCATAATGATGAAAACCCAAGGAATTTTTTTAGAAAACATTACCAGGGCTAATAACAAAGTAGAGCCACATGTCATTTATCTTCCCTTTGTGTCTGTGTGAGAATTCTAGAGTTATATTTGTACATAGCATGGAAAAATGAGAGGCTAGTTTATCAACTAGTTCATTTTTAAAAGTCTAACACATCCTAGGTATAGGTGAACTGTCCTCCTGCCAATGTATTGCACATTTGTGCCCAGATCCAGCATAGGGTATGTTTGCCATTTACAAACGTTTATGTCTTAAGAGAGGAAATATGAAGAGCAAAACAGTGCATGC

TGGAGAGAGAAAGCTGATACAAATATAAAT/GAAACAATAATTGGAAAAATTGAGAAACTACTCATTTTCTAAATTACTCATGTATTTTCCTAGAATTTAAGTCTTTTAATTTTTGATAAATCCCAATGTGAGACAAGATAAGTATTAGTGATGGTATGAGTAATTAATATCTGTTATATAATATTCATTTTCATAGTGGAAGAAATAAAATAAAGGTTGTGATGATTGTTGATTATTTTTTCTAGAGGGGTTGTCAGGGAAAGAAATTGCTTTTT

SNPs 1 / 300 bases
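The sequence on this slide marks each variable position with a slash, for example "C/A". As a rough illustration, the Python sketch below locates such sites in a short fragment copied from the slide. It assumes that "X/Y" always denotes two alternative bases at one position; the function names are illustrative and not part of the presentation.

```python
import re

# Assumption: a SNP is written as two single bases separated by "/", e.g. "C/A".
SNP_PATTERN = re.compile(r"([ACGT])/([ACGT])")


def find_snps(annotated):
    """Return (position, first_allele, second_allele) for each marked site.

    Positions are 0-based and refer to the sequence with the "/" and the
    second allele removed.
    """
    snps = []
    removed = 0  # characters dropped so far ("/" plus the second allele)
    for match in SNP_PATTERN.finditer(annotated):
        snps.append((match.start() - removed, match.group(1), match.group(2)))
        removed += 2
    return snps


def first_allele_sequence(annotated):
    """Return the plain sequence, keeping only the first allele at each site."""
    return SNP_PATTERN.sub(r"\1", annotated)


# Fragment copied from the slide's chromosome 7 sequence.
fragment = "AGACGGAGTTTC/ACTCTTGTTGCC"
print(find_snps(fragment))
print(len(first_allele_sequence(fragment)))
```

Dividing the plain sequence length by the number of marked sites gives the local SNP spacing, which can be compared with the slide's figure of roughly one SNP per 300 bases.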

Page 9: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Christensen and Murray, N Engl J Med 2007; 356:1094-97.

Mapping the Relationships Among SNPs

Page 10: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Chromosome 9p21 Region Associated with MI

Samani N et al, N Engl J Med 2007; 357:443-453.

Page 11: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Distances Among East Coast Cities

              Boston  Providence  New York  Philadelphia  Baltimore
Providence        59
New York         210         152
Philadelphia     320         237        86
Baltimore        430         325       173            87
Washington       450         358       206           120         34

Page 12: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Distances Among East Coast Cities

              Boston  Providence  New York  Philadelphia  Baltimore
Providence        59
New York         210         152
Philadelphia     320         237        86
Baltimore        430         325       173            87
Washington       450         358       206           120         34

Shading key: < 100; 101-200; 201-300; 301-400; > 400

Page 14: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Distances Among East Coast Cities

[Figure: shaded distance map with city labels Boston, Providence, New York, Philadelphia, Baltimore, Washington]

Page 15: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Distances Among East Coast Cities

[Figure: shaded distance map with city labels Boston, Providence, New York, Philadelphia, Baltimore, Washington]

Page 16: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Christensen and Murray, N Engl J Med 2007; 356:1094-97.

Mapping the Relationships Among SNPs

Page 17: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

One Tag SNP May Serve as Proxy for Many

CAGATCGCTGGATGAATCGCATCTGTAAGCAT

CGGATTGCTGCATGGATCGCATCTGTAAGCAC

CAGATCGCTGGATGAATCGCATCTGTAAGCAT

CAGATCGCTGGATGAATCCCATCAGTACGCAT

CGGATTGCTGCATGGATCCCATCAGTACGCAT

CGGATTGCTGCATGGATCCCATCAGTACGCAC  

[Diagram: arrows mark SNP1 through SNP8 on the sequences above, with Block 1 and Block 2 indicated]

Page 18: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

One Tag SNP May Serve as Proxy for Many

CAGATCGCTGGATGAATCGCATCTGTAAGCAT

CGGATTGCTGCATGGATCGCATCTGTAAGCAC

CAGATCGCTGGATGAATCGCATCTGTAAGCAT

CAGATCGCTGGATGAATCCCATCAGTACGCAT

CGGATTGCTGCATGGATCCCATCAGTACGCAT

CGGATTGCTGCATGGATCCCATCAGTACGCAC

[Diagram: arrows mark SNP1 through SNP8 on the sequences above, with Block 1 and Block 2 indicated]

Page 19: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

One Tag SNP May Serve as Proxy for Many

CAGATCGCTGGATGAATCGCATCTGTAAGCAT

CGGATTGCTGCATGGATCGCATCTGTAAGCAC

CAGATCGCTGGATGAATCGCATCTGTAAGCAT

CAGATCGCTGGATGAATCCCATCAGTACGCAT

CGGATTGCTGCATGGATCCCATCAGTACGCAT

CGGATTGCTGCATGGATCCCATCAGTACGCAC

[Diagram: arrows mark SNP3, SNP5, SNP6, SNP7, and SNP8 on the sequences above, with Block 1 and Block 2 indicated]

Page 20: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

One Tag SNP May Serve as Proxy for Many

CAGATCGCTGGATGAATCGCATCTGTAAGCAT

CGGATTGCTGCATGGATCGCATCTGTAAGCAC

CAGATCGCTGGATGAATCGCATCTGTAAGCAT

CAGATCGCTGGATGAATCCCATCAGTACGCAT

CGGATTGCTGCATGGATCCCATCAGTACGCAT

CGGATTGCTGCATGGATCCCATCAGTACGCAC

[Diagram: arrows mark SNP3, SNP6, and SNP8 on the sequences above, with Block 1 and Block 2 indicated]

Page 21: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

One Tag SNP May Serve as Proxy for Many

Block 1   Block 2   Singleton   Frequency
G         T         T           35%
C         T         C           30%
G         T         T           10%
G         A         T           8%
C         A         T           7%
C         A         C           6%
other haplotypes                4%
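The proxy idea on these slides can be checked numerically. The Python sketch below takes the six example sequences from the slides, finds the positions where they differ, and groups positions whose pairwise r² (a standard linkage disequilibrium measure) equals 1, so that any one member of a group predicts the others. It is a simplified illustration under stated assumptions: the six sequences are weighted equally rather than by population frequency, positions are 0-based, and the function names are made up for this example.

```python
# The six example haplotypes shown on the "One Tag SNP" slides.
HAPLOTYPES = [
    "CAGATCGCTGGATGAATCGCATCTGTAAGCAT",
    "CGGATTGCTGCATGGATCGCATCTGTAAGCAC",
    "CAGATCGCTGGATGAATCGCATCTGTAAGCAT",
    "CAGATCGCTGGATGAATCCCATCAGTACGCAT",
    "CGGATTGCTGCATGGATCCCATCAGTACGCAT",
    "CGGATTGCTGCATGGATCCCATCAGTACGCAC",
]


def variable_sites(haps):
    """Positions (0-based) where the haplotypes do not all agree."""
    return [i for i in range(len(haps[0])) if len({h[i] for h in haps}) > 1]


def r_squared(haps, i, j):
    """r^2 between two biallelic sites, with each haplotype weighted equally."""
    a, b = haps[0][i], haps[0][j]  # use the first haplotype's alleles as labels
    n = len(haps)
    p_a = sum(h[i] == a for h in haps) / n
    p_b = sum(h[j] == b for h in haps) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haps) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def ld_groups(haps, threshold=1.0):
    """Greedily group variable sites whose r^2 with a group's first site meets the threshold."""
    groups = []
    for site in variable_sites(haps):
        for group in groups:
            if r_squared(haps, group[0], site) >= threshold - 1e-9:
                group.append(site)
                break
        else:
            groups.append([site])
    return groups


for group in ld_groups(HAPLOTYPES):
    print("tag position", group[0], "stands in for", group)
```

Under these assumptions the eight variable positions fall into three groups of four, three, and one, which corresponds to Block 1, Block 2, and the singleton on the slides: three tag SNPs carry the information of all eight.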

Page 22: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

www.hapmap.org

Nature 2005; 437:1299-320.Nature 2007; 449:851-61.

Page 23: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

A HapMap for More Efficient Association Studies: Goals

• Use just the density of SNPs needed to find associations between SNPs and diseases

• Do not miss chromosomal regions with disease association

• Produce a tool to assist in finding genes affecting health and disease

• Use more SNPs for complete genome coverage in populations of recent African ancestry, because of shorter LD

Page 24: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Progress in Genotyping Technology

[Figure: cost per genotype (cents, USD; axis 1 to 10^2) versus number of SNPs (axis 1 to 10^6), 2001 to 2005. Platforms shown: ABI TaqMan, ABI SNPlex, Illumina GoldenGate, Illumina Infinium/Sentrix, Affymetrix 10K, Affymetrix 100K/500K, Affymetrix MegAllele, Perlegen]

Courtesy S. Chanock, NCI

Page 25: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Continued Progress in Genotyping Technology

[Figure: cost per person (USD; axis 0 to 1,800), July 2005 to October 2006, for Affymetrix 500K, Illumina 317K, Illumina 550K, and Illumina 650Y]

Courtesy S. Gabriel, Broad/MIT

Page 26: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Association of Alleles and Genotypes of rs1333049 with Myocardial Infarction

            C, N (%)       G, N (%)       χ² (1 df)   P-value
Cases       2,132 (55.4)   1,716 (44.6)   55.1        1.2 x 10^-13
Controls    2,783 (47.4)   3,089 (52.6)

Allelic Odds Ratio = 1.38

Samani N et al, N Engl J Med 2007; 357:443-53.

Page 27: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Association of Alleles and Genotypes of rs1333049 with Myocardial Infarction

            C, N (%)       G, N (%)       χ² (1 df)   P-value
Cases       2,132 (55.4)   1,716 (44.6)   55.1        1.2 x 10^-13
Controls    2,783 (47.4)   3,089 (52.6)

Allelic Odds Ratio = 1.38

            CC, N (%)    CG, N (%)      GG, N (%)    χ² (2 df)   P-value
Cases       586 (30.5)   960 (49.9)     378 (19.6)   59.7        1.1 x 10^-14
Controls    676 (23.0)   1,431 (48.7)   829 (28.2)

Heterozygote Odds Ratio = 1.47

Homozygote Odds Ratio = 1.90

Samani N et al, N Engl J Med 2007; 357:443-53.
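The three odds ratios on this slide follow directly from the genotype counts. The Python sketch below recomputes them, taking GG as the reference genotype and G as the reference allele (an assumption that matches the slide's figures). The variable and function names are illustrative only.

```python
# Genotype counts for rs1333049 as shown on the slide.
cases = {"CC": 586, "CG": 960, "GG": 378}
controls = {"CC": 676, "CG": 1431, "GG": 829}


def allele_counts(genotypes):
    """Each person carries two alleles: CC gives two C, CG one of each, GG two G."""
    return {
        "C": 2 * genotypes["CC"] + genotypes["CG"],
        "G": 2 * genotypes["GG"] + genotypes["CG"],
    }


def odds_ratio(case_risk, case_ref, control_risk, control_ref):
    """Odds of the risk category in cases divided by the same odds in controls."""
    return (case_risk / case_ref) / (control_risk / control_ref)


case_alleles = allele_counts(cases)        # C: 2,132 and G: 1,716
control_alleles = allele_counts(controls)  # C: 2,783 and G: 3,089

allelic_or = odds_ratio(case_alleles["C"], case_alleles["G"],
                        control_alleles["C"], control_alleles["G"])
heterozygote_or = odds_ratio(cases["CG"], cases["GG"],
                             controls["CG"], controls["GG"])
homozygote_or = odds_ratio(cases["CC"], cases["GG"],
                           controls["CC"], controls["GG"])

print(f"allelic OR      = {allelic_or:.2f}")
print(f"heterozygote OR = {heterozygote_or:.2f}")
print(f"homozygote OR   = {homozygote_or:.2f}")
```

The allele counts derived from the genotype table agree with the allele table on the slide, which is a useful consistency check when transcribing such data.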

Page 28: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Klein et al, Science 2005; 308:385-389.

P Values of GWA Scan for Age-Related Macular Degeneration

Page 29: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Bierut LJ et al, Hum Molec Genet 2007; 16:24-35.

Nicotine Dependence among Smokers

Page 30: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

http://www.broad.mit.edu/diabetes/scandinavs/type2.html

Genome-Wide Scan for Type 2 Diabetes in a Scandinavian Cohort

Page 31: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Libioulle C et al, PLoS Genet; 2007 Apr 20;3(4):e58.

Genome-Wide Scan for Crohn Disease in Belgian Cases and Controls

Page 32: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Sladek R et al, Nature 2007; 445, 881-885.

Genome-Wide Scan for Type 2 Diabetes in French Case-Control Study

Page 33: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

WTCCC, Nature 2007; 447:661-678.

Wellcome Trust Genome-Wide Association Study of Seven Common Diseases

Page 34: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Hunter DJ et al, Nat Genet 2007; 39:870-874.

Genome-Wide Scan for Breast Cancer in Postmenopausal Women

Page 35: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

-Log10 P Values for SNP Associations with Myocardial Infarction

Samani N et al., N Engl J Med 2007; 357:443-53.

Page 36: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Association Signal for Coronary Artery Disease on Chromosome 9

Samani N et al., N Engl J Med 2007; 357:443-53.

Page 37: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Region of Chromosome 1 Showing Strong Association with Inflammatory Bowel Disease

Duerr R et al., Science 2006; 314:1461-63.

Page 38: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Unique Aspects of GWA Studies

• Permit examination of inherited genetic variability at unprecedented level of resolution

• Permit "agnostic" genome-wide evaluation

• Once genome measured, can be related to any trait

• Most robust associations in GWA studies have not been with genes previously suspected of association with the disease

• Some associations in regions not even known to harbor genes

“The chief strength of the new approach also contains its chief problem: with more than 500,000 comparisons per study, the potential for false positive results is unprecedented.”

Hunter DJ and Kraft P, N Engl J Med 2007; 357:436-439.

Page 39: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Larson, G. The Complete Far Side. 2003.

Page 40: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Number of New, Significant Gene-Disease Associations by Year, 1984 - 2000

Hirschhorn J et al, Genet Med 2002; 4:45-61.

Page 41: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Of 600 Gene-Disease Associations, Only 6 Significant in > 75% of Identified Studies

Disease/Trait               Gene    Polymorphism     Frequency
DVT                         F5      Arg506Gln        0.015
Graves’ Disease             CTLA4   Thr17Ala         0.62
Type 1 DM                   INS     5’ VNTR          0.67
HIV/AIDS                    CCR5    32 bp Ins/Del    0.05-0.07
Alzheimer’s                 APOE    Epsilon 2/3/4    0.16-0.24
Creutzfeldt-Jakob Disease   PRNP    Met129Val        0.37

Hirschhorn J et al., Genet Med 2002; 4:45-61.

Page 42: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Reports For and Against Associations of Variants with Carotid Atherosclerosis

POLYMORPHISM       PRESENT                  ABSENT   SUMMARY
ACE I/D            13 with D; 1 with I      18       favors none
APOE               8 with ε4, 2 with ε2     9        equivocal
AGT M235T          0                        8        none
AGTR1 A1166C       0                        7        none
MTHFR              7 with T, 1 with non-T   8        equivocal
PON1 Q192R         3 with R                 10       none
PON1 L55M          5 with L (subgroups)     1        weak
NOS3 G894T         1 with T                 4        none
MMP3 -1516 5A/6A   4 with 6A                0        association
IL-6 G-174C        1 with G                 3        none

Manolio et al., ATVB 2004; 24:1567-77.

Page 43: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

May 1999

J. Hirschhorn and D. Altshuler J Clin Endo Metab 2002

Am J Hum Genet July 2004

Am J Hum Genet July 2004

PLoS Biol Sept 2005

Nat Genet July 2006

Page 44: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Chanock S, Manolio T, et al., Nature 2007; 447:655-60.

Page 45: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication, Replication, Replication

Initial study: Sufficient description to permit replication
• Sources of cases and controls
• Participation rates and flow chart of selection
• Methods for assessing affected status
• Standard “Table 1” including rates of missing data
• Assessment of population heterogeneity
• Genotyping methods and QC metrics

Replication study:
• Similar population, similar phenotype
• Same genetic model, same SNP, same direction
• Adequately powered to detect postulated effect

Chanock S, Manolio T, et al., Nature 2007; 447:655-60.

Page 46: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy for Prostate Cancer Study in CGEMS

Initial Study: 1,150 cases / 1,150 controls
Replication Study #1: 3,000 cases / 3,000 controls
Replication Study #2: 2,400 cases / 2,400 controls
Replication Study #3: 2,500 cases / 2,500 controls

SNP counts shown across stages: >500,000 Tag SNPs; ~24,000 SNPs; ~1,500 SNPs; 200+ New ht-SNPs; 25-50 Loci

Hoover R, Epidemiology 2007; 18:13-17.

Page 47: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in Easton Breast Cancer Study

Stage Cases Controls SNPs

1 408 400 266,722

Easton et al, Nature 2007; 447:1087-93.

Page 48: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in Easton Breast Cancer Study

Stage Cases Controls SNPs

1 408 400 266,722

2 3,990 3,916 13,023

Easton et al, Nature 2007; 447:1087-93.

Page 49: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in Easton Breast Cancer Study

Stage Cases Controls SNPs

1 408 400 266,722

2 3,990 3,916 13,023

3 23,734 23,639 31

Easton et al, Nature 2007; 447:1087-93.

Page 50: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in Easton Breast Cancer Study

Stage   Cases    Controls   SNPs
1       408      400        266,722
2       3,990    3,916      13,023
3       23,734   23,639     31
Final                       6

Study labels shown on the slide:
• ABCFS • BCST • COPS • GENICA • HBCS • HBCP
• MEC-W • MEC-J • NHS • PBCS • RBCS • SASBAC
• SEARCH2 • SEARCH3 • SBCP • SBCS • CNIOBCS • USRT
• TBCS • KConFab/AOCS • KBCP • LUMCBCS • MCBCS • MCCS

Easton et al, Nature 2007; 447:1087-93.

Page 51: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Larson, G. The Complete Far Side. 2003.

Page 52: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in CGEMS Prostate Cancer Study

Stage Cases Controls SNPs

1 1,172 1,157 527,869

Thomas et al, Nat Genet 2008; 40:310-15.

Page 53: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in CGEMS Prostate Cancer Study

Stage Cases Controls SNPs

1 1,172 1,157 527,869

2 3,941 3,964 26,958*

Thomas et al, Nat Genet 2008; 40:310-15.

Page 54: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in CGEMS Prostate Cancer Study

Stage Cases Controls SNPs

1 1,172 1,157 527,869

2 3,941 3,964 26,958*

* Selected for p < 0.068

Thomas et al, Nat Genet 2008; 40:310-15.

Page 55: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in CGEMS Prostate Cancer Study

Stage   Cases   Controls   SNPs
1       1,172   1,157      527,869
2       3,941   3,964      26,958*

* Selected for p < 0.068

SNP          Gene    Stage 1+2 P-value
rs4962416    MSMB    7 x 10^-13
rs10896449   11q13   2 x 10^-9
rs10993994   CTBP2   2 x 10^-7
rs10486567   JAZF1   2 x 10^-6

Thomas et al, Nat Genet 2008; 40:310-15.

Page 56: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in CGEMS Prostate Cancer Study

Stage   Cases   Controls   SNPs
1       1,172   1,157      527,869
2       3,941   3,964      26,958*

* Selected for p < 0.068

SNP          Gene    Stage 1+2 P-value   Initial Rank
rs4962416    MSMB    7 x 10^-13          24,223
rs10896449   11q13   2 x 10^-9           2,439
rs10993994   CTBP2   2 x 10^-7           319
rs10486567   JAZF1   2 x 10^-6           24,407

Thomas et al, Nat Genet 2008; 40:310-15.

Page 57: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Replication Strategy in CGEMS Prostate Cancer Study

Stage   Cases   Controls   SNPs
1       1,172   1,157      527,869
2       3,941   3,964      26,958*

* Selected for p < 0.068

SNP          Gene    Stage 1+2 P-value   Initial Rank   Initial P-value
rs4962416    MSMB    7 x 10^-13          24,223         0.042
rs10896449   11q13   2 x 10^-9           2,439          0.004
rs10993994   CTBP2   2 x 10^-7           319            4 x 10^-4
rs10486567   JAZF1   2 x 10^-6           24,407         0.042

Thomas et al, Nat Genet 2008; 40:310-15.
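The table suggests why a liberal first-stage cutoff matters. The Python sketch below takes the initial ranks and p-values shown on the slide and asks which of the four SNPs would have been carried forward under three selection rules: the p < 0.068 rule used in the study, a hypothetical "top 1,000 by rank" rule, and a hypothetical Bonferroni cutoff at an assumed significance level of 0.05. The last two rules are illustrative assumptions, not part of the study design.

```python
STAGE1_SNPS = 527_869  # SNPs tested in stage 1, from the slide

# SNP: (initial rank, initial p-value), as shown on the slide.
INITIAL_RESULTS = {
    "rs4962416": (24_223, 0.042),
    "rs10896449": (2_439, 0.004),
    "rs10993994": (319, 4e-4),
    "rs10486567": (24_407, 0.042),
}


def carried_forward(results, rule):
    """Return the SNPs whose (rank, p_value) satisfy the selection rule."""
    return sorted(snp for snp, (rank, p) in results.items() if rule(rank, p))


study_rule = carried_forward(INITIAL_RESULTS, lambda rank, p: p < 0.068)
top_1000 = carried_forward(INITIAL_RESULTS, lambda rank, p: rank <= 1_000)
bonferroni = carried_forward(
    INITIAL_RESULTS, lambda rank, p: p < 0.05 / STAGE1_SNPS
)

print("p < 0.068 :", study_rule)
print("top 1,000 :", top_1000)
print("Bonferroni:", bonferroni)
```

With the slide's numbers, all four SNPs pass the study's rule, only one would survive a top-1,000 rule, and none would survive the Bonferroni cutoff in stage 1 alone.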

Page 58: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Published GWA Reports, 3/2005 - 9/2008

[Figure: total number of publications (axis 0 to 200) by calendar quarter, 2005 to 2008; value labeled on the chart: 191]

Page 59: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

NHGRI Catalog of GWA Studies: http://www.genome.gov/gwastudies/

Page 60: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

NHGRI GWA Catalog - Objectives

• Identify and track all GWA publications attempting to assay > 100,000 SNPs

• Extract key information regarding associations

• Provide widely as scientific resource, including downloadable datafile

• Seek commonalities across associations genome-wide rather than disease by disease

• Describe approach clearly so others can replicate or expand upon it

• Maintain consistency in approach

• Adapt to evolving technologies: CNVs?

Page 61: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

NHGRI GWA Catalog - Methods

• Survey NIH “e-clips” daily, PubMed weekly
• Identify all GWA publications attempting to assay > 100,000 SNPs
• Describe top 5 novel associations significant at p < 10^-6
• Expand to all associations < 10^-6
• Extract information on:
  - Disease/trait
  - Rs number/risk allele
  - Sample size
  - Risk allele frequency
  - Genomic region
  - P-value, OR [95% CI]
  - Reported genes
  - Platform, #SNPs
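The extraction fields and the p < 10^-6 inclusion rule above can be pictured as a simple record plus a filter. The Python sketch below is hypothetical: the class and field names are invented for illustration and do not describe the catalog's actual file format. The two example records use values that appear elsewhere in this presentation, with fields not shown on those slides left empty.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CatalogEntry:
    """One reported association; field names are illustrative, not the catalog's schema."""
    trait: str
    rs_number: str
    p_value: float
    risk_allele: Optional[str] = None
    risk_allele_frequency: Optional[float] = None
    sample_size: Optional[str] = None
    genomic_region: Optional[str] = None
    odds_ratio: Optional[float] = None
    reported_genes: List[str] = field(default_factory=list)
    platform: Optional[str] = None
    n_snps: Optional[int] = None


P_THRESHOLD = 1e-6  # inclusion rule from the slide: p < 10^-6


def meets_threshold(entries, threshold=P_THRESHOLD):
    """Keep only associations with p-values strictly below the threshold."""
    return [e for e in entries if e.p_value < threshold]


entries = [
    CatalogEntry(trait="Myocardial infarction", rs_number="rs1333049",
                 p_value=1.2e-13, risk_allele="C", genomic_region="9p21",
                 odds_ratio=1.38),
    CatalogEntry(trait="Prostate cancer", rs_number="rs10486567",
                 p_value=2e-6, reported_genes=["JAZF1"]),
]

for entry in meets_threshold(entries):
    print(entry.rs_number, entry.trait, entry.p_value)
```

Only the first record passes, because 2 x 10^-6 is not below the 10^-6 cutoff.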

Page 62: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Reports Included in this Analysis

• 180 published papers through 9/18/2008

• 34 did not report SNP

• 1 reported haplotypes without specific SNPs

• 145 reports

– 782 unique (“index”) SNPs

– 3,841 unique perfect LD SNPs (“linked”)

– 4,623 index + linked

• 83 index SNPs reported 2-7 times

• 10 multiple reports in “unrelated” traits

Page 63: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Functional Classification of 782 Index SNPs Associated with Complex Traits

[Figure: bar chart, percent of SNPs (axis 0 to 60) by category: Intergenic, 3' (0.5kb), 5' (2kb), miRTS, 3' UTR, 5' UTR, Intronic, Synonymous, Missense. Bar labels as extracted, in source order: 37, 11, 340, 11, 2, 6, 22, 20, 354]

Page 64: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Functional Classification of 782 Index SNPs and 4,623 Index + Linked SNPs

[Figure: bar chart, percent of SNPs (axis 0 to 60) by category (Intergenic, 3' (0.5kb), 5' (2kb), miRTS, 3' UTR, 5' UTR, Intronic, Synonymous, Missense), comparing Index SNPs with Index + Linked SNPs]

Page 65: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Odds Ratios of Discrete Associations

[Figure: histogram of number of associations (axis 0 to 120) by odds ratio (upper inclusive bound; bins 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 3.0, 4.0, 5.0, 6.0, 9.0, 13.0, 20.0, 30.0)]

Median = 1.28

Page 66: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Percent of Variance in Disease Risk Explained by 32 Established CD Risk Loci

Barrett et al., Nat Genet 2008 Jun 29.

Power to detect risk loci

Page 67: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Odds Ratios of Discrete Associations

[Figure: histogram of number of associations (axis 0 to 120) by odds ratio (upper inclusive bound; bins 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 3.0, 4.0, 5.0, 6.0, 9.0, 13.0, 20.0, 30.0)]

Page 68: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Reported Risk Allele Frequencies by Odds Ratios for Discrete Traits

[Figure: scatter plot of odds ratio (axis 1 to 30) against risk allele frequency (axis 0.0 to 1.0)]

Page 71: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Reported Risk Allele Frequencies by Odds Ratios for Discrete Traits

[Figure: scatter plot of odds ratio (axis 1 to 30) against risk allele frequency (axis 0.0 to 1.0). Labeled points: Thorleifsson, Exfoliation Glaucoma; Sarasquete, Osteonecrosis; Hakonarson, Type 1 DM; van Heel, Celiac Disease; WTCCC, Type 1 DM]

Page 72: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Characteristics of SNPs Associated with Odds Ratios > 4.5

Author           Trait               # Ca/Co       RAF    OR      P-value
Thorleifsson     Exfol’n glaucoma    75/14,747     0.85   20.10   3 x 10^-21
Sarasquete       Osteonecrosis       21/64         0.12   12.75   1 x 10^-6
Hakonarson (M)   Type 1 diabetes     561/1,143     0.13   8.30    1 x 10^-16
van Heel (M)     Celiac disease      991/1,489     0.14   7.04    1 x 10^-19
Matarin*         Stroke              259/269       NR     5.62    6 x 10^-6
WTCCC (M)        Type 1 diabetes     1,963/2,938   0.39   5.49    5 x 10^-134
Behrens          Juvenile arthritis  130/1,952     NR     5.37    2 x 10^-10
Fung             Parkinson’s dis.    267/270       NR     5.00    7 x 10^-6
Klein            Macular degen.      96/50         0.70   4.60    4 x 10^-8
SEARCH           Statin myopathy     85/90         0.13   4.50    2 x 10^-9

* 2 other SNPs in this study also associated, OR > 2.0.
(M) SNPs in MHC region.

Page 73: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

FST Values: Index SNPs and HapMap SNPs

[Figure: distribution of FST values (x axis 0.00 to 0.80; y axis 0.0 to 0.5, labeled "Number of Loci") for HapMap CEU SNPs and Index SNPs]

Median = 0.069

Page 74: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Phenotype Relationships of SNPs with Highest FST Values

[Figure: bar chart (y axis 0.0 to 0.6, labeled "Number of Loci") for the top 5% of FST values (0.49) and the top 1% of FST values (0.69), by trait category: Immune Related, Pigment Traits, Obesity Related, Neuro., Height, BMD, Cancer]

Page 75: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Lessons Learned from Initial GWA Studies

Signals in Previously Unsuspected Genes
Macular Degeneration   CFH
Coronary Disease       CDKN2A/2B
Childhood Asthma       ORMDL3
Type II Diabetes       CDKAL1
Crohn’s Disease        ATG16L1

Page 76: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Lessons Learned from Initial GWA Studies

Signals in Previously Unsuspected Genes
Macular Degeneration   CFH
Coronary Disease       CDKN2A/2B
Childhood Asthma       ORMDL3
Type II Diabetes       CDKAL1
Crohn’s Disease        ATG16L1

Signals in Gene “Deserts”
Prostate Cancer        8q24
Crohn’s Disease        5p13.1, 1q31.2, 10p21

Page 77: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Lessons Learned from Initial GWA Studies

Signals in Previously Unsuspected Genes        Signals in Common
Macular Degeneration   CFH
Coronary Disease       CDKN2A/2B               Diabetes, Melanoma
Childhood Asthma       ORMDL3                  Crohn’s Disease
Type II Diabetes       CDKAL1                  Prostate Cancer
Crohn’s Disease        ATG16L1

Signals in Gene “Deserts”
Prostate Cancer        8q24
Crohn’s Disease        5p13.1, 1q31.2, 10p21

Page 78: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Lessons Learned from Initial GWA Studies

Signals in Previously Unsuspected Genes        Signals in Common
Macular Degeneration   CFH
Coronary Disease       CDKN2A/2B               Diabetes, Melanoma
Childhood Asthma       ORMDL3                  Crohn’s Disease
Type II Diabetes       CDKAL1                  Prostate Cancer
Crohn’s Disease        ATG16L1

Signals in Gene “Deserts”                      Signals in Common
Prostate Cancer        8q24                    Breast, Colorectal Cancer; Crohn’s
Crohn’s Disease        5p13.1, 1q31.2, 10p21

Page 79: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Lessons Learned from Initial GWA Studies

Signals in Previously Unsuspected Genes        Signals in Common
Macular Degeneration   CFH
Coronary Disease       CDKN2A/2B               Diabetes, Melanoma
Childhood Asthma       ORMDL3                  Crohn’s Disease
Type II Diabetes       CDKAL1                  Prostate Cancer
Crohn’s Disease        ATG16L1

Signals in Gene “Deserts”                      Signals in Common
Prostate Cancer        8q24                    Breast, Colorectal Cancer; Crohn’s
Crohn’s Disease        5p13.1, 1q31.2, 10p21

                                               Signals in Common
Multiple Sclerosis     IL7R                    Type 1 Diabetes
Sarcoidosis            C10orf67                Celiac Disease
RA, T1DM               PTPN2, PTPN22           Crohn’s

Page 81: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Study Crohn’s Disease!

Barrett et al., Nat Genet 2008 Jun 29.

Page 82: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

Conclusions

• Nearly half of GWA-identified SNPs are intergenic

• Only 8.4% of index SNPs are in coding regions, 5’ or 3’ UTR, or miRTS

• Potential selection bias in genotyped SNPs for excess of missense variants

• Most associated odds ratios are < 1.5

• Risk allele frequencies do not appear skewed toward rare alleles or large FST values

• Highly-differentiated SNPs enriched for immune-related, pigmentation, and obesity traits

• Examination of loci at extremes of these characteristics may yield interesting insights

Page 83: Genome-Wide Association Studies: Hunting for Genes in the New Millennium

“The more we find, the more we see, the more we come to learn.

The more that we explore, the more we shall return.”

Sir Tim Rice, Aida, 2000