Motivations to study human genetic variation


Upload: emele

Post on 02-Feb-2016


TRANSCRIPT

  • Motivations to study human genetic variation

    The evolution of our species and its history.

    Understand the genetics of diseases, especially the more common complex ones such as diabetes, cancer, cardiovascular disease, and neurodegenerative disorders.

    To allow pharmaceutical treatments to be tailored to individuals (for example, to avoid adverse reactions that depend on genetics).

  • Haplotype Map of the Human Genome

    Goals:

    Define patterns of genetic variation across the human genome

    Guide selection of SNPs to tag common variants efficiently

    Public release of all data (assays, genotypes)

    Phase I: 1.3 M markers in 269 people

    Phase II: +2.8 M markers in 270 people

  • HapMap Project

    The HapMap Project tests linkage between SNPs in various sub-populations.

    For a group of linked SNPs recombination may be rare over tens of thousands of bases

    A few "tag SNPs" can be used to identify genotypes for groups of linked SNPs

    Makes it possible to survey the whole genome with fewer markers (one-third to one-tenth as many)

  • Haplotype

    Linkage is common in the human population, particularly in genetically isolated sub-populations. A group of alleles for neighboring genes on a segment of a chromosome is very often inherited together.

    Such a combination of linked alleles is known as a haplotype.

    When linked alleles occur together in members of a population more often than expected by chance, the association is called linkage disequilibrium.
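Linkage disequilibrium between two sites can be quantified from haplotype counts. The following is a minimal sketch, not taken from the slides: it computes the standard coefficient D (observed haplotype frequency minus the product of the allele frequencies) and r² for two biallelic sites. The function name and the toy haplotypes are hypothetical.

```python
def linkage_disequilibrium(haplotypes, site_a, site_b):
    """Return (D, r2) for two biallelic sites in a list of haplotype strings.

    The allele seen on the first haplotype is used as the reference allele
    at each site (an arbitrary choice; r2 does not depend on it).
    """
    n = len(haplotypes)
    ref_a = haplotypes[0][site_a]
    ref_b = haplotypes[0][site_b]
    # Allele frequencies at each site and frequency of the joint haplotype.
    p_a = sum(h[site_a] == ref_a for h in haplotypes) / n
    p_b = sum(h[site_b] == ref_b for h in haplotypes) / n
    p_ab = sum(h[site_a] == ref_a and h[site_b] == ref_b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either site is monomorphic in the sample.
    r2 = d * d / denom if denom > 0 else None
    return d, r2


# Toy data (hypothetical): two sites that are always inherited together.
linked = ["AT", "AT", "GC", "GC"]
# Toy data (hypothetical): two sites that assort independently.
unlinked = ["AT", "AC", "GT", "GC"]
print(linkage_disequilibrium(linked, 0, 1))
print(linkage_disequilibrium(unlinked, 0, 1))
```

An r² close to 1 means one site predicts the other almost perfectly, which is the property that lets a tag SNP stand in for its neighbors.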

  • Haplotypes (example)

    A chromosome region with only the SNPs shown. Three haplotypes are shown. The two SNPs in color are sufficient to identify (tag) each of the three haplotypes. For example, if a chromosome has alleles A and T at these two tag SNPs, then it has the first haplotype.

    ..A..C..A..T..G..T..

    ..A..C..C..G..C..T..

    ..G..T..C..G..G..A..
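The tagging idea in this example can be checked by brute force. The sketch below is an illustration only: it enumerates sets of SNP positions whose alleles give every haplotype a distinct signature. The transcript does not preserve which two SNPs were colored on the slide, so the code lists all candidate pairs rather than assuming one; the function names are hypothetical.

```python
from itertools import combinations

# The three haplotypes from the slide, SNP alleles only.
HAPLOTYPES = ["ACATGT", "ACCGCT", "GTCGGA"]


def tag_sets(haplotypes, size):
    """Return all sets of `size` SNP positions (0-based) that distinguish every haplotype."""
    n_sites = len(haplotypes[0])
    found = []
    for sites in combinations(range(n_sites), size):
        signatures = {tuple(h[i] for i in sites) for h in haplotypes}
        if len(signatures) == len(haplotypes):
            found.append(sites)
    return found


def smallest_tag_sets(haplotypes):
    """Return the tag sets of the smallest size that works, or [] if none exists."""
    for size in range(1, len(haplotypes[0]) + 1):
        found = tag_sets(haplotypes, size)
        if found:
            return found
    return []


def identify(haplotypes, sites, alleles):
    """Return the index of the single haplotype matching `alleles` at `sites`, else None."""
    matches = [i for i, h in enumerate(haplotypes)
               if tuple(h[s] for s in sites) == tuple(alleles)]
    return matches[0] if len(matches) == 1 else None


print(tag_sets(HAPLOTYPES, 1))         # no single SNP separates all three
print(smallest_tag_sets(HAPLOTYPES))   # pairs of positions that do
print(identify(HAPLOTYPES, (0, 3), "AT"))
```

For these three haplotypes no single SNP is enough, but several pairs are. With the pair at positions 0 and 3, for instance, alleles A and T pick out the first haplotype, consistent with the slide's description.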

  • HapMap Samples

    90 Yoruba individuals (30 parent-parent-offspring trios) from Ibadan, Nigeria (YRI)

    90 individuals (30 trios) of European descent from Utah (CEU)

    45 Han Chinese individuals from Beijing (CHB)

    45 Japanese individuals from Tokyo (JPT)

  • Make Genetic Profiles

    Scan these populations with a large number of SNP markers.

    Find markers linked to drug response phenotypes.

    It is interesting, but not necessary, to identify the exact genes involved.

    Can work with associated populations; does not require detailed information on disease in family history (pedigree).
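One common way to test whether a marker is associated with a drug response is a chi-square test on a 2x2 table of allele counts. The slides do not name a specific test, so the sketch below is an assumption: it computes the Pearson chi-square statistic (no continuity correction) for one marker, using made-up counts and a hypothetical function name.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table

                     allele 1   allele 2
        responders       a          b
        non-responders   c          d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical allele counts for a single SNP marker.
stat = chi_square_2x2(60, 40, 35, 65)
print(round(stat, 2))
```

With one degree of freedom, a statistic above about 3.84 corresponds to p < 0.05 for a single test. A genome-wide scan tests many markers, so a much stricter threshold would be needed in practice.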

  • The SNP database today

    March 2010: 105,098,087

    The 1000 Genomes Project submitted 17.3M SNPs

    The 2008 SNP submissions for the James Watson genome totaled 3,542,364

    The 2008 SNP submissions for the J. Craig Venter genome totaled 4,018,050

    The 2008 SNP submissions for the individual Chinese genome totaled 5,077,954

    The 2008 SNP submissions for the individual Korean genome totaled 1,750,224

    Derived from dbSNP release 130, http://www.ncbi.nlm.nih.gov/SNP/

  • SNPs aren't everything: Introducing Copy Number Variations

    Redon et al., Nature 2006

  • Copy Number Variation Dataset

    Genome Structural Variation Consortium

    Array-CGH using a whole genome tile path array

    Median clone size ~170 kb

    All 270 HapMap individuals

    Measures amount of DNA, not RNA

    Comparison between two samples: test sample vs. reference sample

  • Array-CGH technology

  • Typical Analysis Procedure

    Values are typically normalized so that the mean log2 value for the entire array (or an individual chromosome) is 0

    Analysis consists of identifying segments where the test and reference samples have unequal copy number
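The two steps above can be sketched in a few lines. This is a simplified illustration, not the consortium's pipeline: it computes log2(test / reference) per clone, centers the values so their mean is 0, and reports runs of consecutive clones beyond a cutoff. The cutoff of 0.3, the minimum run length, and all function names are assumptions chosen for the example.

```python
import math


def log2_ratios(test, reference):
    """log2(test / reference) for each clone on the array."""
    return [math.log2(t / r) for t, r in zip(test, reference)]


def center(values):
    """Shift values so that their mean is 0."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def call_segments(values, threshold=0.3, min_clones=2):
    """Return (start, end, state) runs with unequal copy number; `end` is exclusive."""
    def state(v):
        if v > threshold:
            return "gain"
        if v < -threshold:
            return "loss"
        return "neutral"

    segments = []
    start = 0
    for i in range(1, len(values) + 1):
        # Close the current run at the end of the data or when the state changes.
        if i == len(values) or state(values[i]) != state(values[start]):
            s = state(values[start])
            if s != "neutral" and i - start >= min_clones:
                segments.append((start, i, s))
            start = i
    return segments


# Hypothetical intensities: clones 6 and 7 have twice the signal in the test sample.
reference = [100.0] * 14
test = [100.0] * 6 + [200.0] * 2 + [100.0] * 6
ratios = center(log2_ratios(test, reference))
print(call_segments(ratios))
```

Real data are noisy, so published analyses use dedicated segmentation methods rather than a fixed cutoff; the sketch only shows the shape of the computation.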

  • 1,447 CNVRs from 270 HapMap samples

  • Structural Variation Project

    Nature 447: 161-165, 2007

  • Copy Number Variations are ubiquitous in the human genome

    The number of genome structural variants (>1 kb) that distinguish genomes of different individuals is at least on the order of 600-900 per individual.

    J.O. Korbel et al., Science 318 (2007), pp. 420-426

  • HapMap 3

    Merged the results from Affymetrix and Illumina chips

    Genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations

    Sequenced ten 100-kilobase regions in 692 of these individuals

  • Centre d'Etude du Polymorphisme Humain collected in Utah, USA, with ancestry from northern and western Europe (CEU)

    Han Chinese in Beijing, China (CHB)

    Japanese in Tokyo, Japan (JPT)

    Yoruba in Ibadan, Nigeria (YRI)

    African ancestry in the southwestern USA (ASW)

    Chinese in metropolitan Denver, Colorado, USA (CHD)

    Gujarati Indians in Houston, Texas, USA (GIH)

    Luhya in Webuye, Kenya (LWK)

    Maasai in Kinyawa, Kenya (MKK)

    Mexican ancestry in Los Angeles, California, USA (MXL)

    Tuscans in Italy (Toscani in Italia, TSI)

    CEU, ASW, MXL, MKK, and YRI

  • Computational detection of structural genomic variation

    Direct comparison of genomes through sequence alignments

    Advantages:

    All types of genomic variation can be identified, including balanced variants (inversions or translocations)

    No limit on resolution; breakpoints can be defined at the nucleotide level

    Problems:

    Generates many false positives due to sequence misassembly and gaps

  • Out of Africa

    Modern humans arose in Africa and replaced other human species across the globe. (Scientific American, August 1999)

  • Out of Africa again and again

    Templeton, A. Nature 416 (2002): 45-51

    Itai Yanai, 2003

  • The Human Genome Project cost ~USD 3,000,000,000

    Illumina now offers a complete genome sequence from USD 50,000

    Complete Genomics will soon offer a complete genome sequence from USD 5,000

    There are now an estimated 50 complete human genome sequences

  • James Watson: 454, $70 million

    Craig Venter: Sanger, $1 million

    African (HapMap): Illumina & SOLiD, $100,000

    Five Africans: Penn State University

    Chinese: Illumina

    Two Koreans

    Prof. Quake (Stanford): Nature Genetics paper, $50,000, 1 week, Helicos

    Stanford team: clinical annotation of genome from "patient Zero"

  • The 10-gen data set

  • Log2 ratio = log2(test / reference)