CONTEMPORARY RESEARCH IN HUMAN GENOMICS
Genetics, Ethics and the Law
May 29-31, 2009
Josyf Mychaleckyj, D.Phil.
Center for Public Health Genomics
University of Virginia
Slide 2
Joe Mychaleckyj
Today we’ll review…
• Genome Wide Association Studies (GWAS)
• Copy Number Variants (CNVs)
• Medical Resequencing
• Direct-to-Consumer Services (DTC)
Slide 3
Genome Wide Association Studies (GWAS)
Slide 4
Single Nucleotide Polymorphisms: SNPs (‘SNiPs’)
Chromosome #1: A C C G C G T G T C
Chromosome #2: A C C G T G T G T C
C, T are the 2 different alleles for this SNP
Mutation = Rare variant
Polymorphism = Frequent (> 1% prevalence)
Slide 5
3 Possible SNP Genotypes
Each person carries pairs of chromosomes with a separate allele at the SNP position on each chromosome
Homozygote A A: frequency f(AA)
Heterozygote A G: frequency f(AG)
Homozygote G G: frequency f(GG)
f(AA) + f(AG) + f(GG) = 1
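As a minimal sketch of the genotype frequencies above, the following Python turns genotype counts into f(AA), f(AG), f(GG) and derives the frequency of the A allele. The counts and function names are hypothetical and are not taken from the slides.

```python
def genotype_frequencies(n_aa, n_ag, n_gg):
    """Return (f_AA, f_AG, f_GG) from genotype counts; the three sum to 1."""
    total = n_aa + n_ag + n_gg
    if total == 0:
        raise ValueError("no genotypes observed")
    return n_aa / total, n_ag / total, n_gg / total


def allele_frequency_a(f_aa, f_ag):
    """Each AA person carries two A alleles, each AG person carries one."""
    return f_aa + 0.5 * f_ag


# Hypothetical counts for 1,000 genotyped people (illustration only).
f_aa, f_ag, f_gg = genotype_frequencies(360, 480, 160)
p_a = allele_frequency_a(f_aa, f_ag)
print(f_aa, f_ag, f_gg, p_a)
```

The allele frequency follows directly from the genotype frequencies, which is why association studies can be reported at either the genotype or the allele level.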
Slide 6
Case Control Association study
Cases = Clinical Disease
Controls = Disease Free
                    Cases         Controls
eg Blue Allele:     0.48 (48%)    0.41 (41%)
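One common way to summarize such a difference is the allelic odds ratio. The sketch below assumes that 0.48 is the allele frequency in cases and 0.41 the frequency in controls (the column order on the slide); the function name is illustrative.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Assumed reading of the slide: 48% in cases, 41% in controls.
odds_ratio = allelic_odds_ratio(0.48, 0.41)
print(round(odds_ratio, 2))  # about 1.33
```

A real study would also test whether the difference is statistically significant, which depends on the sample sizes not given here.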
Quantitative Trait Locus (QTL) Association Study
Slide 8
Genome Wide Association Study
• SNPs are the most common type of human genome variant by number (10-15 Million)
• Stable, easy to assay, accurately genotyped
• Able to multiplex 1000’s of SNPs into the same assay
Affymetrix Human 6.0: 906,000 SNPs, 946,000 probes for CNV
Illumina 1M-Duo
Slide 9
GWAS
• SNPs are present in genes (and can affect proteins), but since coding sequence is ~2% of the genome, the vast majority of human SNPs are outside exons or introns
• Genotype a dense map of SNPs across all chromosomes of the human genome
• Studies with 500,000 SNPs are becoming routine and 1 Million SNP panels are available
• Do not have to test all 10M SNPs because of SNP-SNP correlations (linkage disequilibrium)
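The SNP-SNP correlation mentioned in the last bullet is usually measured with the statistics D and r². The following is a minimal sketch using the standard definitions; the haplotype and allele frequencies are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d, d * d / denom


# Hypothetical frequencies: the two alleles always travel together (r^2 = 1),
# so genotyping one SNP captures the other.
d_tagged, r2_tagged = linkage_disequilibrium(0.3, 0.3, 0.3)
# Hypothetical frequencies: the alleles are independent (r^2 = 0).
d_indep, r2_indep = linkage_disequilibrium(0.09, 0.3, 0.3)
print(r2_tagged, r2_indep)
```

When r² between a genotyped SNP and an untyped SNP is close to 1, the genotyped SNP acts as a proxy, which is why a panel of 500,000 to 1 Million SNPs can cover much of the common variation.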
Slide 10
GWAS approach
Does not assume a knowledge of genes or biology
Hardy J, Singleton A. N Engl J Med. 2009 Apr 23;360(17):175
Slide 11
Genome wide Association Analysis of Coronary Artery Disease, NEJM 2007
Slide 12
But Common Diseases are Complex
[Diagram: Clinical Monogenic Disease]
Gene: HFE, variant C282Y
Protein sequence: VPPGEEQRYT[C/Y]QVEHPGLD
rs1800562: GGGGAAGAGCAGAGATATACGT[A/G]CCAGGTGGAGCACCCAGGCCTG
P( Hemochromatosis+ | CC homozygote) ~ 60-100%
[Diagram: Clinical Complex Disease]
Gene 1, Gene 2, Gene 3, Gene 4, Gene 5 (diagram labels: OR, OR, OR)
Environment 1, Environment 2, Environment 3
Slide 13
Monogenic vs Complex Disease
Monogenic                                   Complex
1 or small # of genes                       Many
Often etiologic (severe phenotype)          Susceptibility / molecular pathology ?
Highly penetrant                            Modest penetrance
High Odds Ratio                             Modest/Low Odds Ratio
Strong selection => Low frequency/Rare      Weak/No selection => High frequency/Common
Coding Sequence                             Non-coding/regulation (?)
Slide 14
What are GWAS Studies Finding?
• Typically detected variants are common (allele freq >10%)
• Low genotype risk, odds ratio (1.1-1.5)
• Small sibling relative risk
• Causal variants have not been mapped: function unknown and major signals occur in non-coding regions
• Penetrance model not well known
Slide 15
Example: Crohn Disease
First susceptibility gene NOD2 for Crohn Disease
SNP: rs17221417
• GRR (het) = 1.29, GRR (homo) = 1.92
• Allele frequency 0.287
• Sibling Risk Ratio = 1.02
• Familial risk in NOD2 has been estimated at 1.19-1.49 but varies with population
Lewis J Med Genet 2007, Economou Am J Gastroenterol 2004
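The small sibling risk ratio can be reproduced from the genotype relative risks and allele frequency. The sketch below uses the standard single-locus variance-components formula, λs = 1 + (VA/2 + VD/4)/K². It assumes Hardy-Weinberg proportions and that 0.287 is the frequency of the risk allele; neither assumption is stated on the slide.

```python
def sibling_risk_ratio(risk_allele_freq, grr_het, grr_hom):
    """Single-locus sibling relative risk from genotype relative risks.

    Risks are expressed relative to the non-risk homozygote (risk = 1).
    Assumes Hardy-Weinberg genotype proportions (an assumption of this sketch).
    """
    p = risk_allele_freq
    q = 1.0 - p
    # Mean risk in the population, in units of the baseline genotype risk.
    k = q * q + 2 * p * q * grr_het + p * p * grr_hom
    # Additive and dominance variance of the genotype risks.
    alpha = p * (grr_hom - grr_het) + q * (grr_het - 1.0)
    v_a = 2 * p * q * alpha ** 2
    v_d = (p * q * (grr_hom - 2 * grr_het + 1.0)) ** 2
    return 1.0 + (v_a / 2 + v_d / 4) / k ** 2


lambda_s = sibling_risk_ratio(0.287, 1.29, 1.92)
print(round(lambda_s, 2))  # 1.02
```

Even a variant that nearly doubles risk in homozygotes contributes very little to the clustering of disease in families, which anticipates the "little variance explained" theme later in the talk.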
Slide 16
Hindorff, PNAS 2009
>200 GWAS studies published as of December 2008
Slide 17
Genome-wide association study identifies eight loci associated with blood pressure
Nature Genetics 41, 666 - 676 (2009). Published online: 10 May 2009
Slide 18
The GWAS conundrum: Little variance/risk is explained by GWAS alleles
• Obesity
  – FTO and MC4R <2% of variance
• Lipids
  – 30 gene loci, proportion of variance explained in each trait:
  – 9.3% for HDL cholesterol
  – 7.7% for LDL cholesterol
  – 7.4% for triglycerides
• Diabetes
  – 18 replicated loci: combined sibling relative risk ~1.07
Slide 19
Example: Height
• Highly heritable (heritability ~0.8)
• Combined sample of ~63,000
• 54 validated variants in multiple genes
• Each locus explains ~0.3% - 0.5% of the phenotypic variance
• Total variance explained < 5% overall
Slide 20
What are we missing?
• Population differences
• Alleles with small effect sizes
• Copy number variants
• Rare variants
• Epigenetic effects
Slide 21
• Genotype and phenotype datasets made available as rapidly as possible to a wide range of scientific investigators
• Grantees are expected to develop a sharing plan consistent with the GWAS policy.
• Plan should include data submission to the NIH GWAS data repository (dbGaP).
(http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html)
Pezzolesi et al Diabetes 2009
Slide 22
http://www.ncbi.nlm.nih.gov/gap
Slide 24
NIH GWAS Data Sharing Issues
• Sharing of individual genotype & phenotype data with any approved researcher worldwide (*Public access to genetic summary statistics)
• Review by a central NIH data use committee (DUC) not constituted by the study
• Informed consent templates for new GWAS
• ‘Retrofitting’ existing cohorts to conform to NIH Policy
  – Adequacy of consents
  – Data sharing clauses
  – Use of data for research purposes not intended or foreseen
• Ancestry, ethnic origins – harm to community
http://grants.nih.gov/grants/gwas/
Slide 25
PLoS Genetics Aug 2008
[Figure: Example Results for one SNP. Allele Frequency axis from 0.0 to 1.0 showing the Mixture, the Reference Sample and the Personal Genome, with a region marked "More Likely to be in mixture"]
By summing over all SNPs, one can infer with very high confidence whether the Person (or a close relative) is more likely to be in the Mixture than in a Reference Sample
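The per-SNP comparison and the summation over all SNPs can be sketched as follows. This is a small simulation, not the published analysis: for each SNP it computes |person - reference| - |person - mixture| and then standardizes the mean across SNPs. The group sizes, SNP count, random seed and helper names are all assumptions made for illustration.

```python
import math
import random
import statistics


def membership_statistic(person, mixture_freq, reference_freq):
    """Standardized mean of per-SNP distances; large positive values
    suggest the person is closer to the mixture than to the reference."""
    d = [abs(y - r) - abs(y - m)
         for y, m, r in zip(person, mixture_freq, reference_freq)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def random_genotype(rng, p):
    """Allele fraction (0, 0.5 or 1) for one person at a SNP with frequency p."""
    return ((rng.random() < p) + (rng.random() < p)) / 2


rng = random.Random(0)          # fixed seed so the sketch is repeatable
n_snps, group_size = 10000, 50  # hypothetical sizes
pop_freq = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]


def sample_group(size):
    return [[random_genotype(rng, p) for p in pop_freq] for _ in range(size)]


mixture_people = sample_group(group_size)
reference_people = sample_group(group_size)
outsider = sample_group(1)[0]

# Only summary allele frequencies are "published" for each group.
mixture_freq = [sum(col) / len(col) for col in zip(*mixture_people)]
reference_freq = [sum(col) / len(col) for col in zip(*reference_people)]

t_in = membership_statistic(mixture_people[0], mixture_freq, reference_freq)
t_out = membership_statistic(outsider, mixture_freq, reference_freq)
print("participant:", round(t_in, 1), "non-participant:", round(t_out, 1))
```

Each SNP carries almost no information on its own; the signal appears only after accumulating thousands of SNPs, which is why public release of summary allele frequencies became a privacy concern.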
Slide 26
Copy Number Variants (CNVs)
Slide 27
Copy Number Variants
• Submicroscopic structural genome rearrangements (cf cytogenetics, FISH)
  – ~10 to 10,000 base pairs in length
  – Insertions, deletions, duplications (2+ copies), inversions
• Copy number variant or polymorphism
  – polymorphism = more common CNV (> 1% frequency = CNP)
• Common feature of the genome
• Frequency >1% => polymorphism (CNPs)
• Assay using genome wide SNP or CNV arrays
  – Electronic FISH study
Slide 28
Copy number variants (CNVs)
The Copy Number Variation (CNV) Project
http://www.sanger.ac.uk/humgen/cnv/
Slide 29
~11kb deletion on chromosome 8 revealed by ultra-high resolution CGH. Blue lines: individuals with two copies. Red line: individual with zero copies.
The Copy Number Variation (CNV) Project
http://www.sanger.ac.uk/humgen/cnv/
Points are SNPs or probes from GWAS Array
Slide 30
Location and frequency of CNVs in the genome
Nature. 2006 Nov 23;444(7118):444-54
Slide 31
Medical Resequencing: Next Generation Sequencing (NGS)
Slide 32
Public Reference Human Genome Sequence (2001, 2004) is Haploid and Chimeric
DNA Library 2, Individual 2
DNA Library 1, Individual 1
DNA Library 3, Individual 3
Slide 33
Next Generation Sequencing (NGS) enables Diploid Sequencing of an individual
Positions of variants: SNPs, CNVs, etc.
Hundreds of Millions of small random sequence ‘reads’
Slide 34
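Mapping of Individual Variants (SNPs, CNVs)
N = 1 individual
[Figure: Shotgun Reads aligned against the Reference Genome. Read bases shown at variant positions: A and T alternating across reads at one position; G across reads at another position where the Reference Genome has C]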
Slide 35
Mapping of Individual Variants
• Random reads from diploid genome sequencing
  – Align random shotgun reads from single individual diploid library & look for high quality mismatches
  – Find heterozygous positions
• Medical Sequencing (to determine disease risk profile)
  – Incorporation of sequence and variants in the Medical Record
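The "look for high quality mismatches" and "find heterozygous positions" steps can be illustrated with a deliberately naive genotype caller for a single aligned position. The quality cutoff, minimum depth and allele-fraction thresholds are hypothetical defaults chosen for this sketch; production variant callers use probabilistic models instead.

```python
from collections import Counter


def call_genotype(ref_base, pileup, min_qual=20, min_depth=4,
                  het_low=0.2, het_high=0.8):
    """Call a diploid genotype at one position.

    pileup: list of (base, quality) pairs from reads covering the position.
    Returns a (allele, allele) tuple, or None if coverage is too low.
    All thresholds are illustrative assumptions.
    """
    # Keep only high quality base calls.
    bases = [base for base, qual in pileup if qual >= min_qual]
    if len(bases) < min_depth:
        return None
    counts = Counter(bases)
    alt_counts = {b: c for b, c in counts.items() if b != ref_base}
    if not alt_counts:
        return (ref_base, ref_base)
    # Consider the most frequent mismatching base as the candidate variant.
    alt_base, n_alt = max(alt_counts.items(), key=lambda item: item[1])
    alt_fraction = n_alt / len(bases)
    if alt_fraction < het_low:
        return (ref_base, ref_base)   # mismatches look like sequencing errors
    if alt_fraction > het_high:
        return (alt_base, alt_base)   # homozygous for the variant
    return (ref_base, alt_base)       # heterozygous position


# Toy pileups (hypothetical qualities).
het_reads = [("A", 30), ("T", 30), ("A", 30), ("T", 30), ("A", 30), ("T", 30)]
hom_reads = [("G", 30)] * 6
print(call_genotype("A", het_reads))  # ('A', 'T')
print(call_genotype("C", hom_reads))  # ('G', 'G')
```

Reads with roughly half mismatches indicate a heterozygous position, while reads that all disagree with the reference indicate a homozygous variant.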
Slide 36
ABBA00000000
Slide 37
‘Project Jim’
Bio-IT World June 2007
1.3 percent of Watson’s genome did not match the existing reference genome.
> 600,000 novel SNPs
< 68,000 insertions and deletions compared to the reference sequence, 3 bp - 7 kbases
Slide 38
NGS of Diploid Genomes
5 Completely Sequenced (as of May 2009):
• J. Craig Venter
• James Watson
• Yoruban (West Africa, HGVS)
• Chinese (YH)
• Korean (SJK May 2009)
Levy et al, PLoS Biology, 2007
Slide 39
Scientific American 2006
Slide 40
Slide 41
2008: Announcement of the $5,000 Genome
Slide 42
Direct-to-Consumer Services
Slide 43
Company      Launch   Platform     List Cost                  Counselor
deCODEme     Nov-07   Illumina     $985                       Referrals
23andMe      Nov-07   Illumina     $399                       No
Navigenics   Apr-08   Affymetrix   $2500 + $250 annual sub    On staff
SeqWright    Jan-08   Affymetrix   $998                       No
Bio-IT World November 2008
Slide 44
Slide 45
Rival genetic tests leave buyers confused
Firms that offer to predict your risk of disease give worryingly varied results
Nic Fleming
(September 7, 2008)
Slide 46
Different Companies produce differing assessments of risk
• Different genetic variants reviewed and included – threshold for inclusion
• Level of expertise in companies to review literature
• Different statistical models for risk prediction – no ‘right’ answer
• How frequently updated – new findings in literature