single nucleotide polymorphisms and large scale variation

Lecture-6 Single nucleotide polymorphisms and Large scale variation Huseyin Tombuloglu, Phd GBE423 Genomics & Proteomics


Post on 08-Jan-2018



Page 1: Single nucleotide polymorphisms and Large scale variation

Lecture-6

Single nucleotide polymorphisms and Large scale variation

Huseyin Tombuloglu, Phd

GBE423 Genomics & Proteomics

Page 2: Single nucleotide polymorphisms and Large scale variation

►How can we identify and study all the genetic changes that occur in so many different diseases?

►How can we explain why some people respond to treatment and not others?

Page 3: Single nucleotide polymorphisms and Large scale variation

‘SNP’ is the answer to these questions…

• So what exactly are SNPs?
• How are they involved in so many different aspects of health?

Page 4: Single nucleotide polymorphisms and Large scale variation

What is a SNP?

►A SNP is defined as a single base change in a DNA sequence that occurs in a significant proportion (more than 1 percent) of a large population.

Page 5: Single nucleotide polymorphisms and Large scale variation

Some Facts
• In human beings, 99.9 percent of bases are the same.
• The remaining 0.1 percent makes a person unique.
– Different attributes / characteristics / traits
• how a person looks,
• diseases he or she develops.

Page 6: Single nucleotide polymorphisms and Large scale variation

• These variations can be:
– Harmless (change in phenotype)
– Harmful (diabetes, cancer, heart disease, Huntington's disease, and hemophilia)
– Latent (variations found in coding and regulatory regions that are not harmful on their own; the change in each gene only becomes apparent under certain conditions, e.g. susceptibility to lung cancer)

Page 7: Single nucleotide polymorphisms and Large scale variation

SNP facts
►SNPs are found in coding and (mostly) noncoding regions.
►They occur at a very high frequency: estimates range from about 1 in 1,000 bases to 1 in every 100 to 300 bases.
►The abundance of SNPs and the ease with which they can be measured make these genetic variations significant.
►A SNP close to a particular gene acts as a marker for that gene.
►SNPs in coding regions may alter the structure of the protein encoded by that region.

Page 8: Single nucleotide polymorphisms and Large scale variation

SNPs may / may not alter protein structure

Page 9: Single nucleotide polymorphisms and Large scale variation

Types of genetic variation
• Substitutions
ACTGACTGACTGACTGACTG
ACTGACTGGCTGACTGACTG
– Single Nucleotide Polymorphisms (SNPs)
– Single Nucleotide Variations (SNVs)
• Insertions/deletions (INDELS)
ACTGACTGACTGACTGACTG
ACTGACTGACTGACTGACTGACTG
– Copy Number Variants (CNVs)
• Indels > 1Kb in size
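The substitution example on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function name `find_substitutions` is hypothetical, and it assumes the two sequences are already aligned and of equal length, so it does not handle the insertion/deletion case.

```python
def find_substitutions(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are already aligned and have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("equal-length sequences required; indels need an alignment step")
    return [
        (i + 1, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# The two substitution sequences shown on the slide.
reference = "ACTGACTGACTGACTGACTG"
sample = "ACTGACTGGCTGACTGACTG"
print(find_substitutions(reference, sample))  # [(9, 'A', 'G')]
```

The single difference is an A to G substitution at position 9. Whether it counts as a SNP or an SNV depends on how often it is seen in the population, not on the sequence comparison itself.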

Page 10: Single nucleotide polymorphisms and Large scale variation

Variant Calling 10

SNPs vs. SNVs
• Really a matter of frequency of occurrence
• Both are concerned with aberrations at a single nucleotide
• SNP
– Aberration expected at the position for any member in the species (well-characterized)
– Occur in population at some frequency so expected at a given locus
– Validated in population
– Catalogued in dbSNP (http://www.ncbi.nlm.nih.gov/snp)
• SNV
– Aberration seen in only one individual (not well characterized)
– Occur at low frequency so not common
– Not validated in population

9/12/2012

Page 11: Single nucleotide polymorphisms and Large scale variation

Human genetic variation
• Variation can have an effect on function
– Non-synonymous substitutions can change the amino acid encoded by a codon or give rise to premature stop codons
– Indels can cause frame-shifts
– Mutations may affect splice sites or regulatory sequence outside of genes or within introns

Page 12: Single nucleotide polymorphisms and Large scale variation

Identifying a causative de novo mutation

Patient with idiopathic disorder

Veltman and colleagues - Nat Genet. 2010 Dec;42(12):1109-12

(1) Sequence genome: ~22,000 variants (exome re-sequencing)

(2) Select only coding mutations: ~5,640 coding variants

(3) Exclude known variants seen in healthy people: ~143 novel coding variants

(4) Sequence parents and exclude their private variants: ~5 de novo novel coding variants

(5) Look at affected gene function and mutational impact

[Figure: example protein sequences MSGTCASTTR and MSGTNASTTR]

For 6/9 patients, they were able to identify a single likely-causative mutation
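The filtering steps on this slide amount to a sequence of successive exclusions. The sketch below illustrates that idea only; it is not the pipeline used in the cited study. The `Variant` record, its field names, and the four toy variants are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical record; a real pipeline would read these from annotated variant calls.
    variant_id: str
    is_coding: bool         # falls in a protein-coding region
    known_in_healthy: bool  # already catalogued in healthy people
    in_parent: bool         # also present in at least one parent


def candidate_de_novo(variants):
    """Apply the slide's filters in order and return each intermediate list."""
    coding = [v for v in variants if v.is_coding]
    novel = [v for v in coding if not v.known_in_healthy]
    de_novo = [v for v in novel if not v.in_parent]
    return coding, novel, de_novo


# Toy data, invented for illustration only.
variants = [
    Variant("v1", is_coding=False, known_in_healthy=False, in_parent=False),
    Variant("v2", is_coding=True, known_in_healthy=True, in_parent=True),
    Variant("v3", is_coding=True, known_in_healthy=False, in_parent=True),
    Variant("v4", is_coding=True, known_in_healthy=False, in_parent=False),
]

coding, novel, de_novo = candidate_de_novo(variants)
print(len(variants), len(coding), len(novel), len(de_novo))  # 4 3 2 1
```

Each filter shrinks the candidate list, mirroring the drop from roughly 22,000 variants to about 5 on the slide. The final step, judging gene function and mutational impact, is a manual assessment and is not modelled here.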

Page 13: Single nucleotide polymorphisms and Large scale variation


Catalogs of human genetic variation

• The 1000 Genomes Project
– http://www.1000genomes.org/
– SNPs and structural variants
– genomes of about 2500 unidentified people from about 25 populations around the world will be sequenced using NGS technologies
• HapMap
– http://hapmap.ncbi.nlm.nih.gov/
– identify and catalog genetic similarities and differences
• dbSNP
– http://www.ncbi.nlm.nih.gov/snp/
– Database of SNPs and multiple small-scale variations that include indels, microsatellites, and non-polymorphic variants
• COSMIC
– http://www.sanger.ac.uk/genetics/CGP/cosmic/
– Catalog of Somatic Mutations in Cancer


Page 14: Single nucleotide polymorphisms and Large scale variation
Page 15: Single nucleotide polymorphisms and Large scale variation
Page 16: Single nucleotide polymorphisms and Large scale variation

SNP or Mutation?

Call it a SNP IF the single base change occurs in a population at a frequency of 1% or higher.

Call it a mutation IF the single base change occurs in less than 1% of a population.

A SNP is a polymorphic position where the point mutation has become established in the population (frequency of 1% or higher).
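The 1% rule above can be written as a simple threshold check. This is a minimal sketch; the function name `classify_by_frequency` and the constant name are hypothetical, and the frequency is assumed to be given as a fraction between 0 and 1.

```python
SNP_FREQUENCY_THRESHOLD = 0.01  # the 1% cutoff stated on the slide


def classify_by_frequency(frequency):
    """Label a single base change as 'SNP' or 'mutation' from its population frequency."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be a fraction between 0 and 1")
    return "SNP" if frequency >= SNP_FREQUENCY_THRESHOLD else "mutation"


print(classify_by_frequency(0.25))   # SNP
print(classify_by_frequency(0.01))   # SNP (exactly 1% counts as a SNP)
print(classify_by_frequency(0.002))  # mutation
```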

Page 17: Single nucleotide polymorphisms and Large scale variation

From a Mutation to a SNP

Page 18: Single nucleotide polymorphisms and Large scale variation

SNPs Classification

SNPs can occur anywhere in a genome; they are classified based on their location.

Intergenic region
Gene region: can be further classified by where the SNP falls within the gene (promoter region, intronic, exonic, UTR, etc.)

Page 19: Single nucleotide polymorphisms and Large scale variation

Coding Region SNPs

Synonymous: do not result in a change of amino acid in the protein, but can still affect its function in other ways

Non-Synonymous
Missense – amino acid change
Nonsense – changes an amino acid codon to a stop codon

Geospiza Green Arrow™ tutorial by Sandra Porter, Ph.D.

Page 20: Single nucleotide polymorphisms and Large scale variation
Page 21: Single nucleotide polymorphisms and Large scale variation

Synonymous
An example is a seemingly silent mutation in the multidrug resistance gene 1 (MDR1), which codes for a cellular membrane pump that expels drugs from the cell. The mutation can slow down translation and allow the peptide chain to fold into an unusual conformation, making the mutant pump less functional.

Missense
e.g. the c.1580G>T SNP in the LMNA gene: at position 1580 (nt) in the DNA sequence (CGT codon), guanine is replaced with thymine, yielding a CTT codon. At the protein level this replaces arginine with leucine at position 527; at the phenotype level it manifests as overlapping mandibuloacral dysplasia and progeria syndrome.

Nonsense
e.g. cystic fibrosis caused by the G542X mutation in the cystic fibrosis transmembrane conductance regulator gene.
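The three categories can be told apart by translating the codon before and after the change. The following is a sketch under stated assumptions: it uses the standard genetic code, the function name `classify_codon_change` is hypothetical, and only the CGT to CTT pair comes from the LMNA example; the other two codon pairs are generic illustrations rather than documented variants.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Classify a single-codon change as synonymous, missense, or nonsense.

    Simplification: a stop codon changing into an amino acid codon (stop loss)
    is not treated separately and is reported as 'missense'.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


print(classify_codon_change("CGT", "CTT"))  # missense (Arg -> Leu, as in the LMNA example)
print(classify_codon_change("GGA", "GGG"))  # synonymous (Gly -> Gly, illustrative)
print(classify_codon_change("GGA", "TGA"))  # nonsense (Gly -> stop, illustrative)
```

As the MDR1 example shows, a "synonymous" label from this kind of check does not guarantee that the change is functionally silent.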

Page 22: Single nucleotide polymorphisms and Large scale variation

The Consequences of SNPs

The phenotypic consequence of a SNP is significantly affected by the location where it occurs, as well as the nature of the mutation.

• No consequence
• Affect gene transcription quantitatively or qualitatively.
• Affect gene translation quantitatively or qualitatively.
• Change protein structure and functions.
• Change gene regulation at different steps.

Page 23: Single nucleotide polymorphisms and Large scale variation

Simple/Complex Genetic Diseases and SNPs

• Simple genetic diseases (Mendelian diseases) are often caused by mutations in a single gene.
-- e.g. Huntington's, cystic fibrosis, PKU, etc.

• Many complex diseases are the result of mutations in multiple genes, the interactions among them, and their interactions with environmental factors.
-- e.g. cancers, heart diseases, Alzheimer's, diabetes, asthma, etc.

• The majority of SNPs may not directly cause any disease.
• SNPs are ideal genomic markers (dense and easy to assay) for locating disease loci in association studies.

Page 24: Single nucleotide polymorphisms and Large scale variation
Page 25: Single nucleotide polymorphisms and Large scale variation

A single base mutation in the APOE (apolipoprotein E) gene is associated with a higher risk for Alzheimer's disease

A single SNP may cause a Mendelian disease. In complex diseases, however, SNPs do not usually act individually; they work in coordination with other SNPs to manifest a disease condition, as has been seen in osteoporosis.

rs6311 and rs6313 are SNPs in the Serotonin 5-HT2A receptor gene on human chromosome 13.

rs3091244 is an example of a triallelic SNP in the CRP gene on human chromosome 1.

TAS2R38 codes for PTC tasting ability, and contains 6 annotated SNPs.

Page 26: Single nucleotide polymorphisms and Large scale variation

Main Genetic Variation Resources

NCBI dbSNP
http://www.ncbi.nlm.nih.gov/SNP/index.html

NCBI Online Mendelian Inheritance in Man (OMIM)
http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM

International HapMap Project
http://www.hapmap.org/

Perlegen
http://genome.perlegen.com

Genome Variation Server (Seattle SNPs)
http://gvs.gs.washington.edu/GVS/