Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium, November 12, 2012
![Page 1: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/1.jpg)
Lecture 22: Signatures of Selection and Introduction to
Linkage Disequilibrium
November 12, 2012
![Page 2: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/2.jpg)
Last Time
Sequence data and quantification of variation
Infinite sites model
Nucleotide diversity (π)
Sequence-based tests of neutrality
Tajima’s D
Hudson-Kreitman-Aguade
Synonymous versus Nonsynonymous substitutions
McDonald-Kreitman
![Page 3: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/3.jpg)
Today
Signatures of selection based on synonymous and nonsynonymous substitutions
Multiple loci and independent segregation
Estimating linkage disequilibrium
![Page 4: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/4.jpg)
Using Synonymous Substitutions to Control for Factors Other Than
Selection
dN/dS or Ka/Ks Ratios
![Page 5: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/5.jpg)
Types of Mutations (Polymorphisms)
![Page 6: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/6.jpg)
Synonymous versus Nonsynonymous SNP
SNPs at the first and second codon positions often change the amino acid
SNPs at the third codon position are often synonymous
UCA, UCU, UCG, and UCC all code for Serine
The majority of positions are nonsynonymous
Not all amino acid changes affect fitness: allozymes
![Page 7: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/7.jpg)
Synonymous & Nonsynonymous Substitutions
Synonymous substitution rate can be used to set the neutral expectation for the nonsynonymous rate
dS is the relative rate of synonymous mutations per synonymous site
dN is the relative rate of nonsynonymous mutations per nonsynonymous site
ω = dN/dS
If ω = 1, neutral evolution (no selection on amino acid changes)
If ω < 1, purifying selection
If ω > 1, positive Darwinian selection
For human genes, ω ≈ 0.1
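The interpretation rules above can be sketched in a few lines of Python. This is a minimal illustration, not the estimation procedure used in practice: the function names, the tolerance, and the counts are all assumptions made for the example, and the ratio is computed from raw counts with no correction for multiple hits.

```python
def omega(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Crude dN/dS from raw counts (no multiple-hit correction)."""
    d_n = nonsyn_subs / nonsyn_sites  # nonsynonymous changes per nonsynonymous site
    d_s = syn_subs / syn_sites        # synonymous changes per synonymous site
    return d_n / d_s


def interpret(w, tol=1e-9):
    """Map a dN/dS value to the qualitative categories listed on the slide."""
    if abs(w - 1.0) <= tol:
        return "neutral"
    return "purifying" if w < 1.0 else "positive"


# Hypothetical counts for a single gene (made up for illustration).
w = omega(nonsyn_subs=6, nonsyn_sites=300, syn_subs=20, syn_sites=100)
print(round(w, 3), interpret(w))
```

A real analysis would test whether ω differs significantly from 1 rather than compare it against a fixed tolerance.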
![Page 8: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/8.jpg)
Complications in Estimating dN/dS
Multiple mutations in a codon give multiple possible paths
Transitions and transversions, the two types of nucleotide substitution that produce SNPs, are not equally likely
Back-mutations are invisible
Complex evolutionary models using likelihood and Bayesian approaches must be used to estimate dN/dS (also called KA/KS or KN/KS depending on method) (PAML package)
http://www.mun.ca/biology/scarr/Transitions_vs_Transversions.html
Example: CGT(Arg)->AGA(Arg) requires two nucleotide changes, which can occur along two possible paths:
CGT(Arg)->AGT(Ser)->AGA(Arg)
CGT(Arg)->CGA(Arg)->AGA(Arg)
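The multiple-path problem can be made concrete with a short Python sketch that enumerates every shortest path between two codons and labels each step as synonymous or nonsynonymous. The helper name `paths` is hypothetical, the standard genetic code is assumed, and codons are written in the DNA alphabet with one-letter amino acid codes (R = Arg, S = Ser).

```python
from itertools import permutations

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def paths(start, end):
    """List every shortest mutational path from start to end, one base change per step."""
    differing = [i for i in range(3) if start[i] != end[i]]
    result = []
    for order in permutations(differing):
        codon, steps = start, []
        for pos in order:
            nxt = codon[:pos] + end[pos] + codon[pos + 1:]
            same_aa = CODON_TABLE[codon] == CODON_TABLE[nxt]
            steps.append((nxt, CODON_TABLE[nxt], "syn" if same_aa else "nonsyn"))
            codon = nxt
        result.append(steps)
    return result


for path in paths("CGT", "AGA"):
    print("CGT(R)", " ".join(f"-> {c}({aa}) [{kind}]" for c, aa, kind in path))
```

For CGT to AGA, one path passes through Ser and counts two nonsynonymous steps, while the other stays on Arg and counts two synonymous steps, so the same pair of codons yields different dN and dS tallies depending on the path assumed. The sketch does not treat paths through stop codons specially.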
![Page 9: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/9.jpg)
dN/dS ratios for 363 mouse-rat comparisons
interleukin-3: mast cells and bone marrow cells in immune system
Hartl and Clark 2007
Most genes show purifying selection (dN/dS < 1)
Some evidence of positive selection, especially in genes related to immune system
![Page 10: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/10.jpg)
McDonald-Kreitman Test
Conceptually similar to HKA test
Uses only one gene
Contrasts the ratio of synonymous divergence to polymorphism with the ratio of nonsynonymous divergence to polymorphism
Gene provides internal control for evolution rates and demography
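A McDonald-Kreitman comparison reduces to a 2x2 table of counts, which can be sketched as follows. The counts are illustrative values assumed for the example, not data from the lecture, and the function names are hypothetical. The neutrality index and a hand-rolled two-sided Fisher exact test are one common way to summarize such a table; other tests (for example a G-test) are also used.

```python
from math import comb


def neutrality_index(dn, ds, pn, ps):
    """NI = (Pn/Ps) / (Dn/Ds); values below 1 indicate excess nonsynonymous divergence."""
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one (small float tolerance).
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Illustrative counts: rows = divergence, polymorphism; columns = nonsyn, syn.
dn, ds, pn, ps = 7, 17, 2, 42
ni = neutrality_index(dn, ds, pn, ps)
p_value = fisher_exact_two_sided(dn, ds, pn, ps)
print(f"NI = {ni:.3f}, Fisher exact p = {p_value:.4f}")
```

With these counts NI is well below 1, the pattern usually read as evidence for positive selection, whereas NI above 1 reflects an excess of nonsynonymous polymorphism.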
![Page 11: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/11.jpg)
Application of McDonald-Kreitman Test
Aligned 11,624 gene sequences between human and chimp
Calculated synonymous and nonsynonymous substitutions between species (Divergence) and within humans (SNPs)
Identified 304 genes showing evidence of positive selection (blue) and 814 genes showing purifying selection (red) in humans
Bustamante et al. 2005. Nature 437, 1153-1157
Positive selection: defense/immunity, apoptosis, sensory perception, and transcription factors
Purifying selection: structural and housekeeping genes
![Page 12: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/12.jpg)
Genes showing purifying (red) or positive (blue) selection in the human genome based on the McDonald-Kreitman Test
Bustamante et al. 2005. Nature 437, 1153-1157
![Page 13: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/13.jpg)
How can you differentiate between effects of selection and demographic effects on sequence variation?
Will this work for organellar DNA?
![Page 14: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/14.jpg)
Extending to Multiple Loci
So far, only considering dynamics of alleles at single loci
Loci occur on chromosomes, linked to other loci!
“The fitness of a single locus ripped from its interactive context is about as relevant to real problems of evolutionary genetics as the study of the psychology of individuals isolated from their social context is to an understanding of man’s sociopolitical evolution”
Richard Lewontin (quoted in Hedrick 2005)
Size of region that must be considered depends on Linkage Disequilibrium
![Page 15: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/15.jpg)
Gametic (Linkage) Disequilibrium (LD)
Nonrandom association of alleles at different loci into gametes
Haplotype: Genotype of a group of closely linked loci
LD is a major factor in evolution
LD itself provides insights into population history
Estimation of LD is critical for ALL population genetic data
![Page 16: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/16.jpg)
Nomenclature and concepts
Two loci, two alleles
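Frequency of allele i at locus 1 is $p_i$
Frequency of allele i at locus 2 is $q_i$
[Figure: locus 1 with alleles A1 and A2 (frequencies p1, p2) and locus 2 with alleles B1 and B2 (frequencies q1, q2)]
$$\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i = 1$$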
![Page 17: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/17.jpg)
Nomenclature and concepts
Genotype is written as A1B1/A2B2
[Figure: A1 and B1 on one chromosome, A2 and B2 on the homologous chromosome]
A1 and B1 are in coupling phase
A1 and B2 are in repulsion phase
![Page 18: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/18.jpg)
Gametic Disequilibrium
Easiest to think about physically linked loci, but not necessarily the case
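What Are Expected Frequencies of Gametes in a Population Under Independent Assortment?
[Figure: genotype A1B1/A2B2 undergoes meiosis, producing gametes A1B1, A1B2, A2B1, and A2B2]
Expected gamete frequencies: A1B1 = p1q1, A1B2 = p1q2, A2B1 = p2q1, A2B2 = p2q2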
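Under independent assortment each gamete frequency is simply the product of the two allele frequencies. A minimal sketch, assuming two alleles per locus and using a hypothetical function name and made-up allele frequencies:

```python
def expected_gametes(p1, q1):
    """Expected gamete frequencies when alleles at the two loci associate at random."""
    p2, q2 = 1 - p1, 1 - q1
    return {
        "A1B1": p1 * q1,
        "A1B2": p1 * q2,
        "A2B1": p2 * q1,
        "A2B2": p2 * q2,
    }


# Hypothetical allele frequencies: p1 for A1, q1 for B1.
freqs = expected_gametes(p1=0.6, q1=0.3)
for gamete, freq in freqs.items():
    print(gamete, round(freq, 3))
```

The four products always sum to 1, which is a quick check on any table of expected gamete frequencies.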
![Page 19: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/19.jpg)
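What are the expected frequencies of gametes with complete linkage?
[Figure: locus A (alleles A1, A2 with frequencies p1, p2) and locus B (alleles B1, B2 with frequencies q1, q2); genotype A1B1/A2B2 undergoes meiosis]
Gametes A1B1, A1B2, A2B1, and A2B2 have frequencies x11, x12, x21, and x22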
![Page 20: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/20.jpg)
Linkage disequilibrium measure, D
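Independent Assortment: x11 = p1q1, x12 = p1q2, x21 = p2q1, x22 = p2q2
With LD: x11 = p1q1 + D, x12 = p1q2 - D, x21 = p2q1 - D, x22 = p2q2 + D
Substituting from the expressions above:
$$D = x_{11}x_{22} - x_{12}x_{21}$$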
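The measure D = x11·x22 − x12·x21 can be computed directly from the four gamete frequencies. The sketch below uses made-up frequencies and a hypothetical function name, and also checks the equivalent form D = x11 − p1q1.

```python
def linkage_disequilibrium(x11, x12, x21, x22):
    """D as coupling-gamete product minus repulsion-gamete product."""
    return x11 * x22 - x12 * x21


# Hypothetical gamete frequencies for A1B1, A1B2, A2B1, A2B2 (they sum to 1).
x11, x12, x21, x22 = 0.4, 0.1, 0.2, 0.3
d = linkage_disequilibrium(x11, x12, x21, x22)

# Allele frequencies are recovered by summing gametes that carry the allele.
p1 = x11 + x12  # frequency of A1
q1 = x11 + x21  # frequency of B1

print("D =", round(d, 4))
print("x11 - p1*q1 =", round(x11 - p1 * q1, 4))
```

Both expressions give the same value, because D is the deviation of the observed A1B1 frequency from its expectation under independent assortment.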
![Page 21: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/21.jpg)
Problem: D is sensitive to allele frequencies
Example, if D is positive: p1=0.5, q2=0.5, Dmax=0.25
but p1=0.1, q2=0.9, Dmax=0.09
Solution: D' = D/Dmax
ranges from -1 to 1
Dmax Calculation:
If D is positive, Dmax is lesser of p1q2 or p2q1
If D is negative, Dmax is lesser of p1q1 or p2q2
Can’t have negative gamete frequencies
Maximum D set by allele frequencies
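The Dmax rules above translate into a short function. This is a sketch under the two-locus, two-allele setup of the slides; the function name and the example frequencies are assumptions.

```python
def d_prime(x11, x12, x21, x22):
    """Standardized LD: D divided by the largest |D| the allele frequencies allow."""
    p1, q1 = x11 + x12, x11 + x21
    p2, q2 = 1 - p1, 1 - q1
    d = x11 * x22 - x12 * x21
    if d > 0:
        d_max = min(p1 * q2, p2 * q1)
    else:
        d_max = min(p1 * q1, p2 * q2)
    # A monomorphic locus gives d_max = 0; report no disequilibrium in that case.
    return d / d_max if d_max > 0 else 0.0


# Hypothetical gamete frequencies for A1B1, A1B2, A2B1, A2B2.
print(d_prime(0.4, 0.1, 0.2, 0.3))   # partial association
print(d_prime(0.5, 0.0, 0.0, 0.5))   # only coupling gametes present
print(d_prime(0.0, 0.5, 0.5, 0.0))   # only repulsion gametes present
```

Because Dmax is always taken as a positive bound, the sign of D' follows the sign of D, which gives the -1 to 1 range.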
![Page 22: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/22.jpg)
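LD can also be estimated as correlation between alleles
$$r = \frac{D}{\sqrt{p_1 p_2 q_1 q_2}}, \qquad r^2 = \frac{D^2}{p_1 p_2 q_1 q_2}$$
r can also be standardized to a -1 to 1 scale
It is equivalent to D' in this case
$$r' = \frac{D / \sqrt{p_1 p_2 q_1 q_2}}{D_{max} / \sqrt{p_1 p_2 q_1 q_2}} = \frac{D}{D_{max}} = D'$$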
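The correlation form of LD, r = D / sqrt(p1·p2·q1·q2), can be computed from the same gamete frequencies. A minimal sketch with a hypothetical function name and made-up frequencies; it assumes both loci are polymorphic, since the denominator is zero otherwise.

```python
from math import sqrt


def ld_correlation(x11, x12, x21, x22):
    """Correlation between alleles at two biallelic loci."""
    p1, q1 = x11 + x12, x11 + x21
    p2, q2 = 1 - p1, 1 - q1
    d = x11 * x22 - x12 * x21
    return d / sqrt(p1 * p2 * q1 * q2)


# Hypothetical gamete frequencies for A1B1, A1B2, A2B1, A2B2.
r = ld_correlation(0.4, 0.1, 0.2, 0.3)
print("r =", round(r, 4), " r^2 =", round(r ** 2, 4))
```

Unlike D', the unstandardized r reaches 1 or -1 only when the allele frequencies at the two loci match and just two gamete types are present.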
![Page 23: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/23.jpg)
Recombination
Shuffling of parental alleles during meiosis
Occurs for unlinked loci and linked loci
Rate of recombination for linked markers is partially a function of physical distance
[Figure: genotype A1B1/A2B2, with the alleles at loci A and B shuffled by recombination]
![Page 24: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/24.jpg)
What is the expected recombination rate for unlinked loci?
[Figure: genotype A1B1/A2B2 undergoes meiosis, producing gametes A1B1 (coupling), A1B2 (repulsion), A2B1 (repulsion), and A2B2 (coupling)]
$$c = \frac{n_r}{n_r + n_c}$$
Where $n_r$ is the number of repulsion phase gametes, and $n_c$ is the number of coupling phase gametes
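The recombination rate c = n_r / (n_r + n_c) is a simple proportion of gamete counts from an A1B1/A2B2 parent. The sketch below uses a hypothetical function name and made-up counts.

```python
def recombination_rate(n_coupling, n_repulsion):
    """Fraction of gametes in repulsion phase (A1B2 or A2B1) from an A1B1/A2B2 parent."""
    total = n_coupling + n_repulsion
    if total == 0:
        raise ValueError("no gametes counted")
    return n_repulsion / total


# Hypothetical counts: 45 A1B1 + 43 A2B2 coupling gametes, 7 A1B2 + 5 A2B1 repulsion gametes.
linked = recombination_rate(n_coupling=45 + 43, n_repulsion=7 + 5)

# Unlinked loci assort independently, so all four gamete types are equally frequent.
unlinked = recombination_rate(n_coupling=25 + 25, n_repulsion=25 + 25)

print(linked, unlinked)
```

Equal numbers of coupling and repulsion gametes give c = 0.5, the value for independently segregating loci.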
![Page 25: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/25.jpg)
LD is partially a function of recombination rate
Expected proportions of gametes produced by various genotypes over two generations
Where c is the recombination rate and D0 is the initial amount of LD
First generation (Second generation)
![Page 26: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/26.jpg)
Recombination degrades LD over time
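$$D_1 = x'_{11}x'_{22} - x'_{12}x'_{21} = (x_{11} - cD_0)(x_{22} - cD_0) - (x_{12} + cD_0)(x_{21} + cD_0)$$
Expanding, and using $x_{11} + x_{12} + x_{21} + x_{22} = 1$ with $D_0 = x_{11}x_{22} - x_{12}x_{21}$:
$$D_1 = (1 - c)D_0$$
$$D_t = (1 - c)^t D_0 \approx e^{-ct} D_0$$
Where t is time (in generations) and e is the base of the natural log (2.718); the exponential form is an approximation that holds for small c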
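The decay of LD under recombination, D_t = (1 − c)^t · D_0, and its exponential approximation can be tabulated with a few lines of Python. The function names, the starting value D_0 = 0.25, and the recombination rates are assumptions chosen for illustration.

```python
from math import exp


def ld_after(d0, c, t):
    """Exact decay: LD remaining after t generations with recombination rate c."""
    return (1 - c) ** t * d0


def ld_after_approx(d0, c, t):
    """Exponential approximation, reasonable when c is small."""
    return exp(-c * t) * d0


d0 = 0.25  # hypothetical starting LD (the maximum possible when all allele frequencies are 0.5)
for c in (0.5, 0.1, 0.01):
    for t in (1, 10, 50):
        exact = ld_after(d0, c, t)
        approx = ld_after_approx(d0, c, t)
        print(f"c={c:<5} t={t:<3} exact={exact:.6f} approx={approx:.6f}")
```

The two columns agree closely for c = 0.01 and diverge for c = 0.5, which is why the exponential form is best treated as a small-c approximation.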
![Page 27: Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012](https://reader030.vdocuments.mx/reader030/viewer/2022032612/56649ec75503460f94bd32ac/html5/thumbnails/27.jpg)
Effects of recombination rate on LD
Decline in LD over time with different theoretical recombination rates (c)
Even with independent segregation (c=0.5), multiple generations are required to break up allelic associations
Genome-wide linkage disequilibrium can be caused by demographic factors (more later)
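The point that allelic associations take several generations to break up, even at c = 0.5, can be checked by solving (1 − c)^t ≤ f for t. This is a rough sketch with a hypothetical function name; the 1% threshold is an arbitrary choice, and the calculation ignores drift, selection, and demography.

```python
from math import ceil, log


def generations_to_fraction(c, fraction):
    """Smallest whole number of generations t with (1 - c)**t <= fraction."""
    return ceil(log(fraction) / log(1 - c))


# Generations needed for LD to fall to 1% of its starting value.
for c in (0.5, 0.1, 0.01):
    print(f"c={c}: {generations_to_fraction(c, 0.01)} generations")
```

For unlinked loci LD halves every generation, so several generations pass before the association is effectively gone, and tightly linked loci take far longer.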