felsenstein_theor evolutionary genetics.pdf

Upload: mallory

Post on 02-Jun-2018


  • 8/11/2019 Felsenstein_Theor Evolutionary Genetics.pdf

    1/489

    THEORETICAL

    EVOLUTIONARY GENETICS

    JOSEPH FELSENSTEIN


    Theoretical Evolutionary Genetics

    GENOME 562

    Joseph Felsenstein

    Department of Genome Sciences and Department of Biology
    University of Washington
    Box 355065
    Seattle, Washington 98195-5065

    January, 2013

    Copyright © 1978, 1983, 1988, 1991, 1992, 1994, 1995, 1997, 1999, 2001, 2003, 2005, 2007, 2009, 2011, 2013 by Joseph Felsenstein. All rights reserved.

    Not to be reproduced without author's permission.


    Contents

    PREFACE xvii

    I RANDOM MATING POPULATIONS 1
    I.1 Asexual inheritance . . . . . 1
        Two genotypes . . . . . 1
        Multiple genotypes . . . . . 2
    I.2 Haploid inheritance . . . . . 3
    I.3 Diploids with two alleles: Hardy-Weinberg laws . . . . . 4
        Derivation . . . . . 4
        Meaning . . . . . 6
        History . . . . . 7
        Equilibrium? . . . . . 7
        Assumptions . . . . . 7
    I.4 Where the rare alleles are found . . . . . 8
    I.5 Multiple alleles . . . . . 9
        An intuitive argument . . . . . 10
    I.6 Overlapping generations . . . . . 11
    I.7 Different Gene Frequencies in the Two Sexes . . . . . 14
        Genotype frequencies . . . . . 15
    I.8 Sex linkage . . . . . 16
        Haploids . . . . . 16
        Diploids . . . . . 16
        Long-term behavior . . . . . 17
        Genotype frequencies . . . . . 19
    I.9 Linkage . . . . . 20
        Independence of genotypes at two loci . . . . . 20
        A retrospective derivation . . . . . 20
        A measure of nonindependence . . . . . 21
        Simplifying population genetics: the gene pool . . . . . 22
        A more direct derivation . . . . . 23
        Haplotype frequencies in terms of D . . . . . 24
        History . . . . . 25


        What causes linkage disequilibrium . . . . . 25
    I.10 Other Measures of Linkage Disequilibrium . . . . . 26
    I.11 Estimating Gene Frequencies . . . . . 27
        Maximum likelihood estimation of gene frequencies . . . . . 27
        Box 1: The method of maximum likelihood . . . . . 28
        Confidence intervals . . . . . 29
        Gene counting (EM algorithm) . . . . . 31
    I.12 Testing Hypotheses about Frequencies . . . . . 33
        Chi-Square goodness-of-fit test . . . . . 33
        The likelihood ratio test . . . . . 33
        Testing linkage disequilibrium . . . . . 35
    Exercises . . . . . 36
    Complements/Problems . . . . . 39

    II NATURAL SELECTION 45
    II.1 Introduction . . . . . 45
    II.2 Selection in Asexuals - Discrete Generations . . . . . 46
        Fitness and population density . . . . . 47
        Selection coefficients . . . . . 50
        Change of genotype (or gene) frequency . . . . . 51
        Haploids . . . . . 52
        History . . . . . 52
    II.3 Selection in Asexuals - Continuous Reproduction . . . . . 52
    II.4 Selection in Diploids . . . . . 54
        Viabilities and fertilities . . . . . 54
        The basic two-allele selection formula . . . . . 55
        Multiplicative (geometric) fitnesses . . . . . 57
        Additive fitnesses . . . . . 58
        A recessive gene . . . . . 59
        A dominant gene . . . . . 59
        Overdominance and underdominance . . . . . 60
    II.5 Rates of Change of Gene Frequency . . . . . 61
        Asexuals and haploids . . . . . 61
        Asexuals and haploids - continuous generations . . . . . 64
        Multiplicative fitnesses . . . . . 65
        Additive fitnesses . . . . . 66
        An approximation . . . . . 66
        The recessive case . . . . . 67
        The dominant case . . . . . 69
        Dominance, recessiveness, and gene frequency change . . . . . 70
        History . . . . . 72
    II.6 Overdominance and Underdominance . . . . . 74


        Overdominance . . . . . 75
        Analyzing stability . . . . . 76
        Stability of overdominant equilibria . . . . . 78
        Underdominance . . . . . 78
        Protected polymorphism . . . . . 80
        History . . . . . 81
    II.7 Selection and Fitness . . . . . 82
        Asexuals and haploids . . . . . 82
        Diploidy: two alleles . . . . . 84
        Fitness optimization . . . . . 86
        Segregational load . . . . . 86
    II.8 Selection and Fitness: Multiple Alleles . . . . . 89
        Effect of selection . . . . . 89
        Equilibrium . . . . . 90
        Stability and mean fitness . . . . . 91
    II.9 Selection Dependent on Population Density . . . . . 94
        Asexuals and haploids . . . . . 94
        Diploids . . . . . 95
        Oscillations and chaos . . . . . 97
        Some particular growth laws . . . . . 97
        Additional work . . . . . 98
    II.10 Temporal Variation in Fitnesses . . . . . 99
        Asexuals and haploids . . . . . 99
        Diploids . . . . . 101
        Fitnesses varying within a generation . . . . . 103
        References . . . . . 103
    II.11 Frequency-Dependent Fitnesses . . . . . 104
        Asexuals and haploids . . . . . 105
        Diploids . . . . . 108
        References . . . . . 110
    II.12 Kin selection: a case of frequency-dependence . . . . . 110
        Hamilton's Rule . . . . . 111
        Kin and group selection . . . . . 111
        A model of pairwise interaction . . . . . 111
        Pitfalls . . . . . 115
        References . . . . . 116
    Exercises . . . . . 117
    Complements/Problems . . . . . 120

    III MUTATION 123
    III.1 Introduction . . . . . 123
    III.2 Effect of Mutation on Gene Frequencies . . . . . 124


        Two alleles . . . . . 124
        Approach to equilibrium . . . . . 125
    III.3 Mutation with Multiple Alleles . . . . . 126
        Forward and back mutation . . . . . 126
        Multiple alleles . . . . . 127
        A distinction . . . . . 128
    III.4 Mutation versus Selection: Haploids . . . . . 129
    III.5 Mutation versus selection: Diploids . . . . . 131
        Implications . . . . . 131
    III.6 Mutation vs. Selection: Effects of Dominance . . . . . 132
        Recessive mutants . . . . . 132
        Turnover of alleles . . . . . 134
        Rate of approach to equilibrium . . . . . 134
        Effect of back mutation . . . . . 135
        A computational example . . . . . 135
        Dominance . . . . . 136
        Polyploidy . . . . . 137
    III.7 Mutational Load . . . . . 139
        A heuristic approach . . . . . 140
        Weak selection and mutational load . . . . . 142
        Meaning of the mutational load . . . . . 142
        The c paradox and mutational load . . . . . 143
    III.8 Quasispecies . . . . . 144
    III.9 Mutation and Linkage Disequilibrium . . . . . 145
    III.10 History and References . . . . . 146
    Exercises . . . . . 147
    Complements/Problems . . . . . 148

    IV MIGRATION 151
    IV.1 Introduction . . . . . 151
    IV.2 The Effect of Migration on Gene Frequencies . . . . . 151
    IV.3 Migration and Genotype Frequencies: Gene Pools . . . . . 152
        The Wahlund effect . . . . . 152
        Effects of random mating . . . . . 154
        Linkage disequilibrium created by migration . . . . . 154
        Gene flow . . . . . 155
    IV.4 Estimating Admixture . . . . . 156
        References . . . . . 157
    IV.5 Recurrent Migration: Models of Migration . . . . . 157
        The one-island model . . . . . 157
        The island model . . . . . 157
        The stepping-stone model . . . . . 158


        The general migration matrix model . . . . . 159
    IV.6 Recurrent Migration: Gene Frequencies . . . . . 160
        The one-island model . . . . . 160
        The island model . . . . . 161
        More general models . . . . . 161
    IV.7 History and References . . . . . 163
    IV.8 Migration vs. Selection: Patches of Adaptation . . . . . 163
        A one-island haploid model . . . . . 163
        Diploid models . . . . . 165
        References . . . . . 168
    IV.9 Two-Population Models . . . . . 169
        A symmetric equilibrium . . . . . 169
        Asymmetry and patch-swamping . . . . . 170
        References . . . . . 171
    IV.10 The Levene Model: High Migration . . . . . 172
        References . . . . . 176
    IV.11 Selection-Migration Clines . . . . . 176
        Clines in a stepping-stone model . . . . . 176
        Approximate solutions of clines: a differential equation . . . . . 179
        Approximate solutions of clines: their shape . . . . . 181
        Approximate solutions of clines: their slope . . . . . 181
        The characteristic length of a cline . . . . . 182
        Characteristic length and swamping of patches . . . . . 183
    IV.12 The Wave of Advance of an Advantageous Allele . . . . . 185
    Exercises . . . . . 186
    Complements/Problems . . . . . 187

    V INBREEDING 189
    V.1 Introduction . . . . . 189
    V.2 Inbreeding Coefficients and Genotype Frequencies . . . . . 191
    V.3 The Loop Calculus: A Simple Example . . . . . 192
    V.4 The Loop Calculus: A Pedigree With Several Loops . . . . . 194
    V.5 The Loop Calculus: Sex Linkage . . . . . 196
    V.6 The Method of Coefficients of Kinship . . . . . 197
    V.7 The Complication of Linkage . . . . . 199
    V.8 More Elaborate Probabilities of Identity . . . . . 202
    V.9 Regular Systems of Inbreeding: Selfing . . . . . 202
    V.10 Regular Systems of Inbreeding: Full Sib Mating . . . . . 204
        History . . . . . 207
    V.11 Regular Systems of Inbreeding: Matrix Methods . . . . . 208
        Other matrix approaches . . . . . 210
    V.12 Repeated double first cousin mating . . . . . 210


    V.13 Avoiding Inbreeding . . . . . 212
    V.14 The Effects of Inbreeding . . . . . 213
        Multiple loci . . . . . 214
    V.15 Some Comments About Pedigrees . . . . . 215
    Exercises . . . . . 216
    Complements/Problems . . . . . 217

    VI FINITE POPULATION SIZE 221
    VI.1 Genetic Drift and Inbreeding: their relationship . . . . . 221
    VI.2 Inbreeding due to finite population size . . . . . 222
    VI.3 Diploids . . . . . 224
    VI.4 Genetic drift: the Wright-Fisher model . . . . . 226
        Haploidy . . . . . 226
        Diploidy . . . . . 227
        A Markov process . . . . . 227
        Eigenvalues and eigenvectors . . . . . 229
    VI.5 Inbreeding, variances, and fixation probabilities . . . . . 230
        Fixation probability . . . . . 230
        Variance . . . . . 231
        Genetic drift and the Wahlund effect . . . . . 233
    VI.6 Effective population number: avoidance of selfing, two sexes, monogamy . . . . . 233
        Effective population number . . . . . 233
        Selfing not allowed . . . . . 235
        Separate sexes . . . . . 236
        Monogamy . . . . . 238
        History . . . . . 238
    VI.7 Varying population size and offspring number . . . . . 238
        Varying population size . . . . . 238
        Variation in fitness . . . . . 239
    VI.8 Other effects on effective population number . . . . . 242
        Overlapping generations . . . . . 242
        Linkage . . . . . 243
    VI.9 Hierarchical population structure . . . . . 244
    Exercises . . . . . 246
    Complements/Problems . . . . . 248

    VII GENETIC DRIFT AND OTHER EVOLUTIONARY FORCES 251
    VII.1 Introduction . . . . . 251
    VII.2 Drift versus mutation . . . . . 252
        The infinite isoalleles model . . . . . 252
        Finite numbers of alleles . . . . . 253
        The electrophoretic ladder . . . . . 255


        Rate of substitution of alleles . . . . . 255
        Response to population size bottlenecks . . . . . 257
        References . . . . . 257
        Box 2: Application: heterogeneity of deleterious mutations . . . . . 258
    VII.3 Genetic distance . . . . . 259
        Changes of heterozygosity and homozygosity . . . . . 260
        Divergence by genetic drift only . . . . . 260
        Divergence by drift and mutation . . . . . 261
        Some widely-used measures . . . . . 262
    VII.4 Drift versus migration . . . . . 263
        A one-island model . . . . . 263
        Variation of gene frequency . . . . . 264
    VII.5 Drift vs. Migration: the Island Model . . . . . 266
        The model . . . . . 266
        Equilibrium solution . . . . . 268
        A numerical example . . . . . 269
        Rate of loss of variability without mutation . . . . . 271
        References . . . . . 273
    VII.6 Drift vs. Migration: the stepping stone model . . . . . 273
        The model . . . . . 273
        An aside: the general migration matrix model . . . . . 274
        The stepping stone model again . . . . . 275
        Further results . . . . . 276
        Rate of loss of variability . . . . . 277
        Relation to the equilibrium with mutation . . . . . 279
        References . . . . . 279
    VII.7 Probability of Fixation of a Mutant . . . . . 280
        Probability of fixation . . . . . 280
        The branching process . . . . . 280
        The Wright-Fisher model with selection . . . . . 281
        An approximation . . . . . 283
        Number of copies present . . . . . 284
        References . . . . . 285
    VII.8 The Diffusion Approximation to Fixation Probabilities . . . . . 285
        The Wright-Fisher model with selection . . . . . 285
        Exact fixation probabilities . . . . . 287
        The diffusion approximation . . . . . 287
        A specific case . . . . . 289
        Numerical examples . . . . . 290
        Departures from the Wright-Fisher model: effective population number . . . . . 292
        Selection against a mutant . . . . . 293
        Accuracy of the branching process approximation . . . . . 293


        Weak selection: an intuitive result . . . . . 294
        Dominance, recessiveness, and overdominance . . . . . 295
        History and references . . . . . 297
    VII.9 Approximation to Equilibrium Distributions . . . . . 297
        Introduction . . . . . 297
        The Wright-Fisher model . . . . . 298
        The diffusion approximation . . . . . 299
        Mutation and drift . . . . . 301
        Migration and drift . . . . . 304
        Selection versus drift: a general formula . . . . . 306
        Some numerical examples of selection, mutation and drift . . . . . 308
        History and references . . . . . 309
        Box 3: The diffusion scaling rules . . . . . 313
    VII.10 The Relative Strength of Evolutionary Forces . . . . . 314
    Exercises . . . . . 315
    Complements/Problems . . . . . 317

    VIII MULTIPLE LINKED LOCI 321
    VIII.1 Introduction . . . . . 321
    VIII.2 A Haploid 2-locus Model . . . . . 322
        Selection with no recombination . . . . . 322
        Epistasis . . . . . 324
        Selection and recombination . . . . . 325
        Interaction and Linkage - An Example . . . . . 327
    VIII.3 Linkage and Selection in Diploids . . . . . 329
    VIII.4 Linked polymorphisms . . . . . 332
        Lewontin and Kojima's symmetric model . . . . . 332
        Fitness and Disequilibrium: Moran's Counterexample . . . . . 336
        Coadapted Gene Complexes and Recombination . . . . . 337
        The General Symmetric Model . . . . . 340
        Multiplicative overdominant loci . . . . . 343
        Some perspective on interacting polymorphisms . . . . . 343
    VIII.5 Intermediate optimum models . . . . . 344
    VIII.6 Selection on modifiers . . . . . 344
        Modification of dominance . . . . . 345
        Modification of recombination . . . . . 346
        Modification of mutation rate . . . . . 346
        General reduction principle . . . . . 346
    VIII.7 Genetic drift and linkage . . . . . 347
        A numerical example . . . . . 348
        Why this is not quite right . . . . . 348
    VIII.8 Drift, linkage disequilibrium, and selection . . . . . 349


        Hitchhiking, selective sweeps, and periodic selection . . . . . 349
        Box 4: Haplotype blocks? . . . . . 350
        The Hill-Robertson effect . . . . . 352
        Implications of the Hill-Robertson effect . . . . . 355
    VIII.9 Migration and linkage disequilibrium . . . . . 357
    Exercises . . . . . 358
    Complements/Problems . . . . . 359

    IX QUANTITATIVE CHARACTERS 361
    IX.1 What is a Quantitative Character? . . . . . 361
    IX.2 The Model . . . . . 362
        Scale transformations . . . . . 365
    IX.3 Means . . . . . 367
        A one-locus analysis . . . . . 367
        Inbreeding effects . . . . . 368
        Means of crosses and backcrosses . . . . . 369
    IX.4 Additive and Dominance Variance . . . . . 371
        Variances and covariances . . . . . 371
        Phenotypic variance . . . . . 373
        Additive effects . . . . . 374
        Additive and dominance variances . . . . . 379
    IX.5 Covariances Between Relatives . . . . . 382
        Correlations . . . . . 384
        Parents and offspring: heritability . . . . . 385
        Full sibs and half-sibs . . . . . 385
        Other relationships . . . . . 386
    IX.6 Regression of Offspring on Parents . . . . . 387
        Regression on the midparent . . . . . 389
    IX.7 Estimating variance components and heritability . . . . . 390
        Pitfalls and limitations . . . . . 391
    IX.8 History and References . . . . . 394
    IX.9 Response to artificial selection . . . . . 396
        Normally distributed phenotypes . . . . . 396
        Response to selection . . . . . 398
        Selection effects at a single locus . . . . . 401
        Effects of repeated selection . . . . . 402
        Complications and limitations . . . . . 403
    IX.10 History and References . . . . . 405
    Exercises . . . . . 405
    Complements/Problems . . . . . 407


    X MOLECULAR POPULATION GENETICS
    X.1 Introduction
    X.2 Mutation models
    The Jukes-Cantor model
    Kimura's 2-parameter model
    The HKY model
    The General Time-Reversible model
    X.3 Approximate mutation models
    The infinite-alleles model
    The infinite-sites model
    X.4 The Coalescent
    The approximation
    X.5 Coalescents with migration
    X.6 Coalescents with population growth
    Recombination
    Trees and D's
    The ancestral recombination graph
    Further reading on coalescents
    X.7 Some summary statistics
    Nucleotide diversity
    Number of segregating sites
    Tajima's test
    X.8 Likelihood calculations
    The likelihood
    Summing over trees
    Monte Carlo integration
    Importance sampling
    Computing likelihoods
    Bayesian samplers
    Griffiths-Tavaré independent sampling
    Markov chain Monte Carlo sampling
    Approximate Bayesian computation
    An objective
    Exercises

    XI POLYGENIC CHARACTERS IN NATURAL POPULATIONS
    XI.1 Phenotypic Evolution Models
    Effect of optimizing selection
    XI.2 Kimura's model
    XI.3 Lande's model
    A symmetrized version
    Effect of mutation


    Environmental variance
    The mean
    Strengths and limitations
    XI.4 Bulmer's model
    XI.5 Other models
    A philosophical difference
    Some further references
    Complements/Problems

    REFERENCES


    PREFACE

    These are chapters I-XI of a set of notes which serve as a text for Genome Sciences 562 (Population Genetics). The material omitted will complete chapter VIII on the interaction of linkage and selection and cover some additional topics in chapters X and XI, with more on quantitative characters in natural populations and material on the population genetics of transposons and tandemly repeated sequences.

    Each chapter ends with two sets of problems. Those labeled Exercises are intended to be relatively straightforward applications of principles given in the text. They usually involve numerical calculation or simple algebra. Those labeled Problems/Complements are more algebraic, and often involve extension or re-examination of the material in the text.

    The level of mathematics required to read this text is not high, although the volume of algebra is sometimes heavy. It is probably sufficient to know elementary calculus, and parts of elementary statistics and probability. Matrix algebra is used in several places, but these can be skipped without much loss. The most relevant mathematical technique for population genetics is probably factorization of simple polynomial expressions, which most people are taught in high school (and then, unfortunately, forget how to do).

    These notes have been developed over the last 34 years. They were not finished rapidly and published primarily because I got interested in phylogenies and was less interested in theoretical population genetics. Nevertheless I needed these to teach my theoretical population genetics course, and so they were gradually expanded. At first they were encoded on magnetic card storage for an IBM word processor. Later we had them transferred to magnetic tape, and hand-edited that into text for the Runoff family of text-formatting programs. They finally became a LaTeX file with Postscript figures.

    Many of the references are from the 1970s and earlier. There are two reasons for this. First, population genetics theory had its major development in the 1920s-1940s (at the hands of Fisher, Wright, and Haldane) and was finally rigorized in the 1960s and 1970s under the influence of people like Richard Lewontin, James Crow, Motoo Kimura, Sam Karlin, Geoff Watterson, and Warren Ewens. Second, I simply have not yet had the time to update the references and include much later work. You may find citation searches successful in finding later work that follows up on the work cited here.

    Many people have contributed to the production of these notes, particularly students in earlier years of the course who caught many errors in earlier versions. The presentations were heavily influenced by lecture notes and courses on this subject by J. F. Crow and R. C. Lewontin. The cover illustration is adapted from an original by Helen Leung. Sean Lamont wrote the plotting program that produced the majority of the figures. I am indebted to many people for suggestions and corrections, particularly to Jarle Tufto and his students, and to Eric Anderson, Max Robinson, Weiva Sieh, Tim Reluga, Marissa La Madrid, Norman Ehrentreich, Rich Neapolitan, Phil Hedrick, Eric Rynes, Pui Yee Fong, Fred Allendorf, Sterling Sawaya, and Alirio Rosales. I am especially grateful, for many corrections, to Jeff Thorne and his students, and to Wenying Shou. But most of all, I must thank Nancy Gamble and Martha Katz for doing the enormous job of typing out most of these notes, and Nancy Gamble for drawing some of the figures for earlier editions.

    I am still hoping to complete this set of notes one day. It is not clear whether it will ever be a printed book: the market for texts in theoretical population genetics is small, as there are only a few schools offering courses in this subject. Having it available as a freely downloadable document is effective, and makes it available to students who would have difficulty affording a book.

    Joe Felsenstein
    Department of Genome Sciences and Department of Biology
    University of Washington
    Seattle

    joe (at) gs.washington.edu


    Chapter I

    RANDOM MATING POPULATIONS

    Theoretical population genetics (or theoretical evolutionary genetics) is arguably the area of biology in which mathematics has been most successfully applied. Other areas such as theoretical ecology model phenomena which are intrinsically more important to human welfare, and which have a much larger base of observations to work with, but are nevertheless not as successfully modeled. The major reason why theory is more readily applied to population genetics is that there is a framework, Mendelian segregation, on which to hang it. The Mendelian mechanism is a highly regular process with strong geometric and algebraic overtones.

    The other reason why Mendelian segregation is particularly important to population genetics is that it occurs whether or not natural selection is present, whether or not mutation is present, and whether or not migration is present. In this chapter we examine the consequences of Mendelian segregation for the genetic composition of a population. That there can be consequences that are not intuitively obvious follows from one property of Mendelian segregation: that the composition of offspring for some matings differs from the composition of the parents. For example, a cross of AA × aa yields, not half AA and half aa, but instead Aa.

    Normal Mendelian segregation is diploid and sexual. To understand it we must start with an examination of the simpler cases in which populations are asexual or haploid. In doing so we hope to make the results of this chapter intuitively obvious after the fact.

    I.1 Asexual inheritance.

    TWO GENOTYPES. The first case we cover is one so simple that there is virtually nothing to report. Consider a mixed population of two strains which reproduce asexually (as do many bacteria, dandelions, and maybe bdelloid rotifers). The offspring of this form of uniparental inheritance have genotypes which are exact copies of their parents' genotypes (we are deliberately ignoring the possibility of mutation). Suppose that the population is undergoing synchronous reproduction with nonoverlapping generations. Let the two strains be numbered 1 and 2, and suppose that the number of strain $i$ in some generation $t$ is $N_i$, for $i = 1$ or 2. Now if each individual has $W_t$ offspring in generation $t$, irrespective of its genotype, and we denote the number of strain $i$ in the next generation as $N_i'$, then

    $$N_1' = W_t N_1 \quad \text{and} \quad N_2' = W_t N_2. \tag{I-1}$$

    The number of offspring of each genotype is simply the number of parents of that genotype, multiplied by the number of offspring each has. Note that we have assumed that the individuals of type 1 have exactly the same number of offspring as the individuals of type 2. If the populations are small this is very unlikely to be true, since random environmental circumstances will cause some individuals to have more surviving offspring than others. If there are a very large number of individuals, these circumstances should average out, and the average number of offspring from each strain will be nearly equal.

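    Consider the fraction of all individuals that are of genotype 1. This is, in generation $t+1$ (primes denoting numbers in the next generation),

    $$\frac{N_1'}{N_1' + N_2'} = \frac{W_t N_1}{W_t N_1 + W_t N_2} = \frac{W_t}{W_t}\,\frac{N_1}{N_1 + N_2} = \frac{N_1}{N_1 + N_2}. \tag{I-2}$$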

    This establishes the fact that when different genotypes reproduce equally well, the proportion of any one of them does not change. We can make the same point by calculating the ratio of the numbers of one genotype to the other:

    $$\frac{N_1'}{N_2'} = \frac{W_t N_1}{W_t N_2} = \frac{N_1}{N_2}. \tag{I-3}$$

    Thus, the proportions and ratios of different genotypes are not changed by asexual reproduction in a large population.

    MULTIPLE GENOTYPES. If we had not two, but $k$ different genotypes, the picture is the same. If we denote by $p_i$ the frequency of the $i$-th genotype in generation $t$, then if

    $$N = N_1 + N_2 + \cdots + N_k,$$

    we find that the total number in the next generation is

    $$N' = W_t N_1 + W_t N_2 + \cdots + W_t N_k = \sum_{i=1}^{k} W_t N_i = W_t \sum_{i=1}^{k} N_i = W_t N,$$

    and we have

    $$p_i' = \frac{N_i'}{N'} = \frac{W_t N_i}{W_t N} = \frac{N_i}{N} = p_i, \tag{I-4}$$

    so that the frequencies of the different genotypes do not change, even though their numbers may increase or decrease (depending on whether $W_t$ is greater or less than 1).

    [Figure 1.1: Diploid stage of a predominantly haploid organism. Diagram labels: fertilization, meiosis.]

    We will have frequent recourse to the conclusions of this section. In sexual diploids the effect of Mendelian segregation is felt only as one moves from one generation to the next. Within a generation the population is effectively asexual. Thus the logic of this section applies perfectly to the genotypic composition of a single generation in which each individual has probability $W_t$ of surviving to adulthood. From now on we will leave out the factor $W_t$ and simply assume that genotypic compositions are not changed by random survival in infinite populations, provided that survival is unaffected by genotype.

    Similarly, when we have a set of sexual offspring and ask who their parents were, we will assume that the composition of the parents is unaffected by differences between individuals in the amount of reproduction they do, provided that the differences in reproduction are independent of genotype, and provided that there are an infinite number of parents.
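The central conclusion of this section, that equal reproduction changes numbers but not frequencies, can be checked with a short sketch. The strain counts, the sequence of $W_t$ values, and the function names below are made up for illustration; exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction


def next_generation(counts, offspring_per_individual):
    """Apply N_i' = W_t * N_i to every strain (equation I-1)."""
    return [offspring_per_individual * n for n in counts]


def frequencies(counts):
    """Return each strain's share of the total, p_i = N_i / N."""
    total = sum(counts)
    return [Fraction(n, total) for n in counts]


# Hypothetical starting numbers for k = 3 strains.
counts = [600, 300, 100]
start = frequencies(counts)

# W_t may differ between generations, but not between strains.
for w_t in [Fraction(2), Fraction(1, 2), Fraction(3)]:
    counts = next_generation(counts, w_t)

end = frequencies(counts)
print("frequencies unchanged:", start == end)
print("final numbers:", [int(n) for n in counts])
```

If the same multiplier were not applied to every strain, the comparison would fail, which is exactly the case of differential reproduction that this section sets aside.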

    I.2 Haploid inheritance

    There are many cases, particularly among microorganisms, of organisms which are haploid during most of their life cycle, having only the briefest of diploid phases. Figure 1.1 shows a typical generation in such an organism.

    Suppose that we have a population of haploid organisms of two genotypes, A and a. Let the proportions of these genotypes be $p$ and $1-p$ in generation $t$. If the organisms mate at random, we can easily compute the proportions of the three resulting diploid genotypes. When mating is random, the genotypes of the two mates are independent of one another. So an AA diploid will be formed in $p \times p = p^2$ of the matings. An aa will be formed $(1-p) \times (1-p)$ of the time. There will be two ways of forming heterozygotes: Aa, with probability $p \times (1-p)$, and aA, with probability $(1-p) \times p$. Since we cannot normally tell these apart, the proportions of the diploid genotypes are:

    $$\begin{array}{ll} AA & p^2 \\ Aa & 2p(1-p) \\ aa & (1-p)^2. \end{array} \tag{I-5}$$

    These are the so-called Hardy-Weinberg proportions, actually only a simple case of a binomial expansion. To obtain the proportions of A and a in the next generation, we must consider the results of meiosis in these diploids. It is, of course, assumed that all three genotypes are equally likely to undergo meiosis. Then $p^2$ of the haploids in the next generation come from AA diploids. All of these haploids must be A, since there is no mutation in this idealized case. A fraction $2p(1-p)$ of the haploids will come from Aa diploids, and half of these will be A. All of the $(1-p)^2$ of the gametes which come from aa diploids will be a. The total proportions of A and a among the offspring generation are then

    $$\begin{aligned} A:&\quad p^2 + \tfrac{1}{2} \times 2p(1-p) = p^2 + p(1-p) = p\,[p + (1-p)] = p, \\ a:&\quad (1-p)^2 + \tfrac{1}{2} \times 2p(1-p) = (1-p)^2 + p(1-p) = (1-p)\,[(1-p) + p] = 1-p. \end{aligned} \tag{I-6}$$

    So once again, if we denote the gene frequency in generation $t$ by $p$, and the frequency in generation $t+1$ as $p'$, we have

    $$p' = p, \tag{I-7}$$

    so that genotype frequencies remain unchanged from their initial values. It is tempting to consider haploids as exactly equivalent to asexuals. But this is not true when we consider recombination, as we shall see later. We have ignored sex determination. It has been implicitly assumed that, even if there is a mating type system as in yeast, where two alleles, a and α, determine the mating types, the genotype frequencies are the same among both a and α haploids, so that we need not take mating types into account. We will shortly see the consequences of relaxing this assumption. Many of the phenomena of population genetics can be seen most clearly in haploid cases, and we will return to the haploid case more frequently than its biological importance alone warrants.
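The haploid life cycle just described (random fusion into diploids, then meiosis back to haploids) can be traced step by step in a small sketch. The starting frequency and the function names are hypothetical; the two functions simply transcribe equations (I-5) and (I-6).

```python
from fractions import Fraction


def diploid_proportions(p):
    """Random union of haploids gives AA, Aa, aa as in equation (I-5)."""
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def frequency_of_A_after_meiosis(diploids):
    """All gametes from AA and half of those from Aa carry A (equation I-6)."""
    return diploids["AA"] + Fraction(1, 2) * diploids["Aa"]


p = Fraction(3, 10)  # hypothetical starting frequency of A
history = [p]
for _ in range(5):
    diploids = diploid_proportions(history[-1])
    history.append(frequency_of_A_after_meiosis(diploids))

print([str(x) for x in history])
```

Every entry of `history` is the same fraction, which is the content of equation (I-7).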

    I.3 Diploids with two alleles: Hardy-Weinberg laws.

    DERIVATION. We now consider a random-mating population of diploids in which two alleles are segregating. We assume that there is no difference in genotype proportions between the sexes. Suppose that in generation $t$ the population contains the three genotypes AA, Aa, and aa in proportions $P_{AA}$, $P_{Aa}$, $P_{aa}$. These we henceforth call the genotype frequencies. Consider a haploid gamete produced by one individual chosen at random. The individual has chance $P_{AA}$ of being an AA, and $P_{Aa}$ of being an Aa. In the latter case, the gamete is A one half of the time. The chance that the gamete produced by a randomly chosen individual is A is then $p_1$ and the chance that it is a is $p_2$, where

    $$p_1 = P_{AA} + \tfrac{1}{2} P_{Aa}, \qquad p_2 = \tfrac{1}{2} P_{Aa} + P_{aa}. \tag{I-8}$$

    $p_1$ and $p_2$ will be referred to as the gene frequencies of the two alleles. (Allele frequencies would be a more consistent term, but gene frequencies is solidly entrenched in the literature.) They are not only the frequencies of the two types of gametes, but also the proportion of all genes in generation $t$ which are each of the two alleles. We can see this by indirect argument, as follows: $P_{AA}$ of all copies of this gene are in AA individuals, and all of these are A. $P_{Aa}$ of the copies are in Aa individuals, and half of these are A alleles. So the total fraction of all copies which are A is $P_{AA} + \tfrac{1}{2} P_{Aa}$, which is just the gene frequency $p_1$. More directly, a randomly chosen haploid gamete contains a copy of a gene chosen at random from the parental diploids. So the probability that such a gamete is A is just the gene frequency, $p_1$. An alternative approach to this point, involving direct counting of A and a alleles, is given in the next section.
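As a quick numerical illustration of equation (I-8), using made-up genotype frequencies that do not come from the text, suppose $P_{AA} = 0.5$, $P_{Aa} = 0.4$ and $P_{aa} = 0.1$. Then

$$p_1 = 0.5 + \tfrac{1}{2}(0.4) = 0.7, \qquad p_2 = \tfrac{1}{2}(0.4) + 0.1 = 0.3.$$

Counting copies gives the same answer: among 100 such individuals there are 200 copies of the gene, of which $2 \times 50 + 40 = 140$ are A, and $140/200 = 0.7$.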

    If it happened to be true that random mating of individuals gave the same results as random combination of the pool of gametes, then the following would be true, as a consequence of the results of the previous section:

    1. The diploid genotypes in the next generation would occur in the frequencies $p_1^2$, $2 p_1 p_2$, $p_2^2$.

    2. The gametes which they produce would be in the same frequencies as the gametes of generation $t$. So if we use the superscript $(t)$ to indicate which generation a gene frequency is from, $p_1^{(t+1)} = p_1^{(t)} = p_1^{(t-1)} = \cdots = p_1^{(0)}$.

    It follows from these two principles that not only will the gene frequencies remain constant from one generation to the next, so will the genotype frequencies, with the exception of the initial generation. In fact, it turns out to be true that random mating is equivalent to random union of gametes. This is simply the result of the fact that choosing a gamete at random from the pool of gametes is equivalent to sampling a parent at random, and then having it produce a gamete containing one of its two genes (at this locus), chosen at random by the mechanism of Mendelian segregation. The reader who doubts that this is so can consult Table 1.1, which enumerates the possible matings, their probabilities, and the resulting offspring genotype frequencies. The Table makes use of the independence of the genotypes of the two mates under random mating, so that the probability of an AA × AA mating is $P_{AA} \times P_{AA}$. The genotype frequencies from Table 1.1 are:

    $$\begin{aligned} AA:&\quad P_{AA}^2 + P_{AA} P_{Aa} + \tfrac{1}{4} P_{Aa}^2 = \left(P_{AA} + \tfrac{1}{2} P_{Aa}\right)^2 \\ Aa:&\quad P_{AA} P_{Aa} + \tfrac{1}{2} P_{Aa}^2 + 2 P_{AA} P_{aa} + P_{Aa} P_{aa} = 2 \left(P_{AA} + \tfrac{1}{2} P_{Aa}\right) \left(\tfrac{1}{2} P_{Aa} + P_{aa}\right) \\ aa:&\quad \tfrac{1}{4} P_{Aa}^2 + P_{Aa} P_{aa} + P_{aa}^2 = \left(\tfrac{1}{2} P_{Aa} + P_{aa}\right)^2 \end{aligned} \tag{I-9}$$


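    Table 1.1: Mating types, their frequencies, their contribution to the offspring genotype frequencies, and the resulting genotype frequencies under random mating.

    Mating     Frequency                   Contribution to offspring generation
                                           AA                            Aa                            aa
    AA × AA    $P_{AA} \times P_{AA}$      $P_{AA}^2$                    -                             -
    AA × Aa    $P_{AA} \times P_{Aa}$      $\tfrac12 P_{AA} P_{Aa}$      $\tfrac12 P_{AA} P_{Aa}$      -
    AA × aa    $P_{AA} \times P_{aa}$      -                             $P_{AA} P_{aa}$               -
    Aa × AA    $P_{Aa} \times P_{AA}$      $\tfrac12 P_{Aa} P_{AA}$      $\tfrac12 P_{Aa} P_{AA}$      -
    Aa × Aa    $P_{Aa} \times P_{Aa}$      $\tfrac14 P_{Aa}^2$           $\tfrac12 P_{Aa}^2$           $\tfrac14 P_{Aa}^2$
    Aa × aa    $P_{Aa} \times P_{aa}$      -                             $\tfrac12 P_{Aa} P_{aa}$      $\tfrac12 P_{Aa} P_{aa}$
    aa × AA    $P_{aa} \times P_{AA}$      -                             $P_{aa} P_{AA}$               -
    aa × Aa    $P_{aa} \times P_{Aa}$      -                             $\tfrac12 P_{aa} P_{Aa}$      $\tfrac12 P_{aa} P_{Aa}$
    aa × aa    $P_{aa} \times P_{aa}$      -                             -                             $P_{aa}^2$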
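The bookkeeping of Table 1.1 can be reproduced mechanically. The sketch below, with hypothetical parental frequencies and helper names of my own choosing, enumerates the nine ordered mating types, adds up their Mendelian contributions, and compares the totals with the squared-sum forms on the right-hand side of equation (I-9).

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Probability that a parent of each genotype transmits allele A.
PROB_A = {"AA": Fraction(1), "Aa": HALF, "aa": Fraction(0)}


def offspring_distribution(mother, father):
    """Mendelian offspring proportions for one mating type."""
    a, b = PROB_A[mother], PROB_A[father]
    return {
        "AA": a * b,
        "Aa": a * (1 - b) + (1 - a) * b,
        "aa": (1 - a) * (1 - b),
    }


def next_genotype_frequencies(freqs):
    """Sum the contributions of all nine ordered mating types (Table 1.1)."""
    total = {"AA": Fraction(0), "Aa": Fraction(0), "aa": Fraction(0)}
    for mother, father in product(freqs, repeat=2):
        mating_frequency = freqs[mother] * freqs[father]  # mates are independent
        for genotype, share in offspring_distribution(mother, father).items():
            total[genotype] += mating_frequency * share
    return total


# Hypothetical parental genotype frequencies, deliberately not in
# Hardy-Weinberg proportions.
parents = {"AA": Fraction(1, 2), "Aa": Fraction(1, 5), "aa": Fraction(3, 10)}
offspring = next_genotype_frequencies(parents)

p1 = parents["AA"] + HALF * parents["Aa"]  # equation (I-8)
p2 = HALF * parents["Aa"] + parents["aa"]
expected = {"AA": p1 * p1, "Aa": 2 * p1 * p2, "aa": p2 * p2}
print(offspring == expected)
```

Enumerating ordered pairs (mother, father) mirrors the table, where AA × Aa and Aa × AA appear as separate rows.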

    MEANING. The two principles given above are often known as the Hardy-Weinberg Law. They have two important impacts on population genetics. The first implies that genotype frequencies can (under appropriate conditions) be predicted from gene frequencies. Together with the second, it implies that we can carry through an analysis in terms of gene frequencies instead of genotype frequencies. The second part of the Hardy-Weinberg Law implies that Mendelian reproduction in a random-mating population has no inherent tendency to favor one allele or the other: it will not tend to lose genotypic variability. This is a dramatic difference from the pre-Mendelian scheme of blending inheritance, in which the offspring's genotype (supposed to be contained in its blood) was a mixture of the parents', without any mechanism of segregation. Blending inheritance would tend to lose half of the genotypic variability each generation, with dramatic consequences for evolution. A Scottish professor of engineering, Fleeming Jenkin (1867), made this point in response to Darwin's Origin of Species. It led him to the conclusion that the response to natural selection would shortly stall for lack of variation. Darwin was unable to convincingly rebut Jenkin. In later editions of the Origin, he raised the origin of new variation by direct effects of the environment to a greater importance than he had hitherto assigned it, in order to provide the continuous torrent of new variation necessary to keep evolution operating. With the rise of Mendelian genetics, and the realization of its consequences, the problem vanished.


    HISTORY. The Hardy-Weinberg law was discovered by the famous English mathematician G. H. Hardy (1908), and simultaneously and independently by the German obstetrician and human geneticist Wilhelm Weinberg (1908), whose proof was more general. Hardy seems to have deliberately buried his paper in an obscure American journal so that his mathematical colleagues would not realize that he had strayed into applied mathematics. It has sometimes been claimed that William Ernest Castle made use of it in an earlier paper (1903), but a careful reading of that paper will show that Castle worked in terms of genotypes rather than gene frequencies. The Hardy-Weinberg Law is as close to being trivially obvious as it can be, but it had a major impact on the practice of population genetics. Before it, calculations of the effect of natural selection required one to keep track of three variables, the genotype frequencies, and the algebra required to do even simple cases was quite complicated. By focusing attention on the gene frequencies, and establishing the constancy of gene frequencies in the absence of perturbing forces, the Hardy-Weinberg Law greatly simplified calculations. The advances of the next two decades would have come much more slowly and tortuously had it not been true. For a more detailed history of population genetics during the decade of the 1900s, the reader should consult the book by Provine (1968).

    EQUILIBRIUM? The Hardy-Weinberg Law is sometimes referred to as the Hardy-Weinberg Equilibrium. It is an equilibrium in only a restricted sense. If we change the gene frequency of a population, there is nothing inherent in the Law which will restore the gene frequency to its original value. It will remain indefinitely at the new gene frequency. But if we perturb the genotype frequencies in such a way that the gene frequency is not changed, then in the next generation Hardy-Weinberg proportions will be restored. If we take a population in Hardy-Weinberg proportions 0.81 AA : 0.18 Aa : 0.01 aa, and alter the genotype frequencies to 0.88 AA : 0.04 Aa : 0.08 aa, then the gamete frequencies will be 0.9 A : 0.1 a, and the offspring generation will once again have genotype frequencies 0.81 AA : 0.18 Aa : 0.01 aa. But had we altered the gene frequency, the genotype frequencies of the offspring would be in Hardy-Weinberg proportions, but in those dictated by the new gene frequency.
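The numerical example in the preceding paragraph can be replayed directly. In this sketch the first perturbation uses the figures quoted in the text; the second perturbation, which changes the gene frequency, uses values I have invented for contrast, and the function names are likewise hypothetical.

```python
from fractions import Fraction


def gene_frequency(genotypes):
    """Frequency of allele A among gametes, equation (I-8)."""
    p_AA, p_Aa, _ = genotypes
    return p_AA + p_Aa / 2


def hardy_weinberg(p):
    """Offspring genotype frequencies (AA, Aa, aa) under random mating."""
    q = 1 - p
    return (p * p, 2 * p * q, q * q)


original = (Fraction(81, 100), Fraction(18, 100), Fraction(1, 100))

# Perturbation from the text: genotype frequencies change, gene frequency does not.
perturbed = (Fraction(88, 100), Fraction(4, 100), Fraction(8, 100))
restored = hardy_weinberg(gene_frequency(perturbed))

# Invented perturbation that lowers the gene frequency of A to 0.8.
shifted = (Fraction(70, 100), Fraction(20, 100), Fraction(10, 100))
after_shift = hardy_weinberg(gene_frequency(shifted))

print("restored to original:", restored == original)
print("after gene-frequency change:", [float(x) for x in after_shift])
```

In the second case the offspring are again in Hardy-Weinberg proportions, but at the new gene frequency rather than the old one.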

    ASSUMPTIONS. To maintain the Hardy-Weinberg principles, we have made many assumptions. Among these are:

    1. Random mating.

    2. No differential fertility of the genotypes, so that the contribution a mating type makes to the next generation is simply its frequency among all mating types.

    3. Equal genotype frequencies in the two sexes, which we have assumed since we use the same three genotype frequencies for both parents.

    4. No mutation, so that the offspring of any mating are simply those expected from Mendel's laws.

    5. No immigration, so that all members of the next generation come from the present generation. It is also assumed that there is

    6. No differential emigration, so that any emigration which occurs does not change the genotype frequencies.

    7. No differential viability, so that any mortality between newly fertilized zygote and adult stages does not alter the genotype frequencies.

    8. Infinite population size, so that the proportions of mating types expected from random mating, as well as the proportions of offspring expected from Mendelian segregation, are exactly achieved.

    Much of the remainder of these notes will be devoted to the consequences of relaxing one or more of these assumptions. We will not be able to cover all possibilities, even superficially, but we should be able to arrive at some intuitive understanding of the effects, singly and in combination, of these various evolutionary forces.

    I.4 Where the rare alleles are found.

    Hardy-Weinberg proportions imply that homozygotes for rare alleles will be uncommon. This must be emphasized, since it makes it much easier for us to intuitively understand the behavior of natural selection in diploids. The algebra is simple if we calculate the proportion of copies of a rare allele that are found in homozygotes. If the gene frequency of allele A is $p$, we know that $p^2$ of all individuals in the population are expected to be homozygotes for that allele. If there are $N$ individuals in the population, we expect that $Np^2$ of them will be AA homozygotes, and in these there will be a total of $2Np^2$ copies of the A allele. Overall, there are $2N$ copies of this gene, of which a fraction $p$ are copies of A, so that there are $2Np$ copies of that allele.

    The fraction of all copies of A that are expected to be found in AA homozygotes is then

    $$\frac{2Np^2}{2Np} = p,$$

    which is a dramatic and simple result. If an allele has gene frequency 0.0003, only a fraction 0.0003 of the copies of that allele will occur in homozygotes. Fully 0.9997 of them will be found in heterozygotes. This has strong implications for the effectiveness of selection for or against recessive alleles, and also for the relative importance of the fitness effects of a rare allele when it is heterozygous and when it is homozygous.

    It will be helpful to keep in mind that

        Rare alleles occur mostly in heterozygotes; common alleles occur mostly in homozygotes.

    because this will help us understand the results of natural selection on rare alleles.

    There is an even simpler way to obtain the result of this section. Imagine that you are a copy of a rare allele, and you have been segregated into a gamete (say an egg). What is the probability that you will end up in a homozygote, paired with another allele like yourself? That is simply the probability that the sperm will contain that rare allele. If mating is at random, the probability of this is your allele frequency, $p$.
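The counting argument of this section is easy to tabulate for a few allele frequencies. The sketch below assumes Hardy-Weinberg proportions and works per individual rather than with an explicit $N$ (which cancels anyway); the function name and the list of frequencies, apart from the 0.0003 used in the text, are illustrative choices.

```python
from fractions import Fraction


def share_of_copies_in_homozygotes(p):
    """Fraction of all copies of an allele of frequency p that are carried
    by homozygotes, assuming Hardy-Weinberg proportions."""
    q = 1 - p
    copies_in_homozygotes = 2 * (p * p)        # two copies in each homozygote
    copies_in_heterozygotes = 1 * (2 * p * q)  # one copy in each heterozygote
    return copies_in_homozygotes / (copies_in_homozygotes + copies_in_heterozygotes)


for p in [Fraction(3, 10000), Fraction(1, 100), Fraction(1, 2), Fraction(9, 10)]:
    share = share_of_copies_in_homozygotes(p)
    print(f"p = {float(p):<7} in homozygotes: {float(share):.4f}  "
          f"in heterozygotes: {float(1 - share):.4f}")
```

The share in homozygotes always comes out equal to $p$ itself, so the two halves of the boxed statement are the same fact read at small and at large $p$.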

    I.5 Multiple alleles.

    If, instead of 2 alleles, a population contains $n$ alleles, the principles stated in the previous section either apply or generalize naturally. In a haploid population, we have $n$ different haploid genotypes $A_1, A_2, \ldots, A_n$, whose frequencies in generation $t$ we call $p_1, p_2, \ldots, p_n$. When diploids are formed by random mating, the frequencies of the diploid genotypes are simply the products of the respective haploid frequencies. Thus the frequency of the $A_1A_1$ diploid genotypes is $p_1^2$, since each of the two haploid genotypes independently has probability $p_1$ of being $A_1$. In general (if we count genotype $A_iA_j$ as being distinct from genotype $A_jA_i$ for $i \neq j$),

    $$\begin{aligned} A_iA_i:&\quad P_{ii} = p_i^2 && i = 1, 2, \ldots, n \\ A_iA_j:&\quad P_{ij} = p_i p_j && i = 1, 2, \ldots, n,\; j = 1, 2, \ldots, n,\; (i \neq j). \end{aligned} \tag{I-10}$$

    To keep the notation straight, you must keep in mind that, although we cannot tell $A_iA_j$ and $A_jA_i$ genotypes apart, we count their genotype frequencies $P_{ij}$ and $P_{ji}$ separately, as if we could distinguish them in practice. Thus, the total genotype frequency of $A_iA_j$ and $A_jA_i$ heterozygotes is

    $$p_i p_j + p_j p_i = 2 p_i p_j. \tag{I-11}$$

    If we had a population of diploid genotypes, in which we knew the numbers $N_{ii}$ of $A_iA_i$ homozygotes, and the numbers $N_{ij} + N_{ji}$ of $A_iA_j$ or $A_jA_i$ heterozygotes, we could compute the gene frequencies directly, by counting $A_i$ genes. There are two $A_i$ genes in each $A_iA_i$ homozygote and one in each $A_iA_j$ heterozygote. If we have $N$ individuals in all, there are $2N$ copies of the A gene, so that the fraction of them which are $A_i$ is

    $$\begin{aligned} p_i &= \left[\, 2N_{ii} + (N_{1i} + N_{2i} + \cdots + N_{i-1,i} + N_{i+1,i} + \cdots + N_{ni}) + (N_{i1} + N_{i2} + \cdots + N_{i,i-1} + N_{i,i+1} + \cdots + N_{in}) \,\right] / (2N) \\ &= \left[\, (N_{1i} + N_{2i} + \cdots + N_{ni}) + (N_{i1} + N_{i2} + \cdots + N_{in}) \,\right] / (2N). \end{aligned} \tag{I-12}$$

    Dividing each term of the numerator by $2N$, and noticing that $N_{ij}/N = P_{ij}$,

    $$\begin{aligned} p_i &= \tfrac{1}{2} (P_{1i} + \cdots + P_{ni}) + \tfrac{1}{2} (P_{i1} + \cdots + P_{in}) \\ &= \tfrac{1}{2} \sum_{j=1}^{n} P_{ji} + \tfrac{1}{2} \sum_{j=1}^{n} P_{ij}. \end{aligned} \tag{I-13}$$
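The gene counting in equations (I-12) and (I-13) amounts to adding a row sum and a column sum of the table of ordered genotype counts. A minimal sketch, assuming a made-up table of counts for three alleles and a hypothetical function name:

```python
from fractions import Fraction


def gene_frequencies_from_counts(counts):
    """counts[i][j] is N_ij, the number of ordered A_i A_j genotypes.
    Heterozygotes may be split between N_ij and N_ji in any way."""
    n = len(counts)
    individuals = sum(sum(row) for row in counts)
    freqs = []
    for i in range(n):
        column_sum = sum(counts[j][i] for j in range(n))  # N_1i + ... + N_ni
        row_sum = sum(counts[i])                          # N_i1 + ... + N_in
        freqs.append(Fraction(column_sum + row_sum, 2 * individuals))
    return freqs


# Hypothetical counts for three alleles. All heterozygotes are recorded
# above the diagonal here, which is just one arbitrary bookkeeping choice.
counts = [
    [10, 20, 30],
    [0, 15, 10],
    [0, 0, 15],
]
freqs = gene_frequencies_from_counts(counts)
print([str(f) for f in freqs])
```

The homozygote count $N_{ii}$ appears once in the row sum and once in the column sum, which is how the factor of 2 in the first line of (I-12) is absorbed in the second.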

    In producing the next generation of haploids from a diploid generation with genotype frequencies $P_{ij}$, the proportion of haploid offspring of genotype $A_i$ is just the gene frequency of $A_i$ in the diploids of the previous generation:

    $$p_i' = p_i = \tfrac{1}{2}\sum_{j=1}^{n} (P_{ji} + P_{ij}). \tag{I-14}$$

    If generation $t$ was itself formed by random mating, then $P_{ij} = p_i p_j$, so if we denote by $p_i'$ the gene frequency in the next generation,

    $$
    \begin{aligned}
    p_i' &= \tfrac{1}{2}\sum_{j=1}^{n} (2 p_i p_j) \\
         &= \sum_{j=1}^{n} p_i p_j \\
         &= p_i p_1 + \cdots + p_i p_n = p_i (p_1 + p_2 + \cdots + p_n),
    \end{aligned}
    \tag{I-15}
    $$

    which clearly equals $p_i$, since the sum of all of the haploid genotype frequencies is 1. So if $p_i^{(t)}$ is the gene frequency in generation $t$,

    $$p_i^{(t+1)} = p_i^{(t)} = \cdots = p_i^{(0)}, \tag{I-16}$$

    for all $n$ values of $i$. Thus the gene frequencies of all $n$ alleles remain constant through time and, by equations (I-10), the diploid genotype frequencies can be predicted from the gene frequencies.
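As a numerical check on (I-10), (I-14) and (I-15), the following Python sketch forms the ordered genotype frequencies under random mating and then recovers the gene frequencies from them. The function names and the starting frequencies are hypothetical, chosen only for illustration.

```python
def random_mating_genotypes(p):
    """Ordered diploid genotype frequencies P_ij = p_i * p_j, as in (I-10)."""
    return [[pi * pj for pj in p] for pi in p]


def gene_frequencies(P):
    """Gene frequencies recovered from ordered genotype frequencies, as in (I-14)."""
    n = len(P)
    return [0.5 * sum(P[j][i] + P[i][j] for j in range(n)) for i in range(n)]


p0 = [0.5, 0.3, 0.2]                  # hypothetical allele frequencies in generation t
P = random_mating_genotypes(p0)       # diploid genotype frequencies
p1 = gene_frequencies(P)              # gene frequencies in the next generation
heterozygote_12 = P[0][1] + P[1][0]   # total A1A2 frequency, 2 * p_1 * p_2 as in (I-11)
```

Up to floating-point rounding, `p1` equals `p0`, which is the constancy stated in (I-16).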

    All of the above has been for a haploid organism. The results for diploids are identical. All we need to do is note that the principle that random mating is equivalent to random union of gametes is still valid, unaffected by the number of alleles present. Therefore, under the assumptions of the Hardy-Weinberg Law (random mating, no differential fertilities, no sex differences, no mutation, no migration, no differential viabilities, infinite population size), the Hardy-Weinberg Laws still hold. In fact, Weinberg (1908) made his derivation in terms of multiple alleles at the outset.

    AN INTUITIVE ARGUMENT. At least part of the results of this section can be seen intuitively. If we classify alleles into two classes, one containing the $A_1$ allele and the other containing all other alleles, we can consider the resulting population as having two alleles.

    The gene frequency of $A_1$ cannot depend on whether or not the geneticist can perceive differences among the other alleles. Neither can the frequency of $A_1A_1$ homozygotes. It follows immediately that the gene frequency of $A_1$ (or of any other allele we choose) must remain constant through time, and that the genotype frequency of $A_1A_1$ must become the square of the frequency of the $A_1$ allele. Only the genotype frequencies of the heterozygotes are not predicted by this analogy between two and many alleles.
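The lumping argument can be illustrated with a small Python sketch. The helper name and the frequencies are hypothetical. The lumped, two-allele view gives the same $A_1A_1$ frequency as the full calculation, and it gives the total frequency of heterozygotes carrying $A_1$, but not how that total is divided among the individual heterozygote classes.

```python
def lump_all_but_first(p):
    """Collapse alleles A_2, ..., A_n into a single class, leaving a two-allele system."""
    return [p[0], sum(p[1:])]


p = [0.5, 0.3, 0.2]                    # hypothetical frequencies of A1, A2, A3
lumped = lump_all_but_first(p)         # frequencies of A1 and "not A1"

freq_A1A1_full = p[0] * p[0]           # from (I-10) with all alleles distinguished
freq_A1A1_lumped = lumped[0] ** 2      # two-allele Hardy-Weinberg value

# Total frequency of heterozygotes carrying exactly one A1 allele:
freq_A1_other_lumped = 2 * lumped[0] * lumped[1]
freq_A1_other_full = sum(2 * p[0] * pj for pj in p[1:])
```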

    I.6 Overlapping generations.

    So far, the generations have been discrete. One generation gives rise to another, whereupon the parents do not reproduce again, and are no longer counted as part of the population. In that case, the population moves into Hardy-Weinberg proportions in one generation. This life cycle is reasonable only for organisms which breed synchronously and only once in their lifetime (such as annual plants). If there is repeated reproduction and overlapping generations it is not a good representation of the life cycle. A realistic model for continuous reproduction and/or overlapping generations would be quite complex. As a start towards considering such cases, in this section we consider a very simple continuous-time model.

    We assume overlapping generations, continuous time, but not age-dependent reproduction. The discrete-generation model is one with perfect memory: organisms remember exactly when they were born, and reproduce exactly on schedule. But the present model is the opposite: in each small interval of time, a small fraction of the population, chosen irrespective of age, dies. These individuals are replaced by newborns formed by random mating among all existing individuals, again irrespective of age. Since we wish to consider a case parallel to the Hardy-Weinberg situation, we here assume that deaths and births occur irrespective of genotype, that there is no difference in genotype frequencies between sexes, no mutation, no migration, and an infinite population size.

    The relationship between clock time and generation time is set once we know what fraction of individuals die in a given amount of time, and therefore how rapidly the population turns over. To equate one unit of time with one generation, we assume that during an amount $\delta t$ of time (assumed to be short), a fraction $\delta t$ of the population dies and is replaced. This scales the situation so that the probability that an organism survives $t$ units of time is $(1 - \delta t)^{t/\delta t}$, which as $\delta t$ is made small approaches $e^{-t}$. (You may remember from a calculus course that $(1 + 1/n)^n$ approaches $e$ as $n \to \infty$, and this is a variant on that result.) So lifespan has an exponential distribution, which turns out to have a mean (average) of 1. The process of allowing $\delta t$ to approach zero is justified by the fact that if the process of death and replacement occurs continuously with constant death rates, the probability of survival for $\delta t$ units of time is $1 - \delta t$ only approximately, the approximation improving as $\delta t$ becomes small.
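The limit used above can be checked numerically. The following Python sketch (the function name and the chosen step sizes are illustrative assumptions) compares the discrete survival probability with $e^{-t}$ for a coarse and a fine time step.

```python
import math


def survival_probability(t, dt):
    """Probability of surviving t time units when a fraction dt dies in each step of length dt."""
    steps = round(t / dt)
    return (1.0 - dt) ** steps


t = 2.0
coarse = survival_probability(t, 0.1)      # large steps: a rough approximation
fine = survival_probability(t, 0.0001)     # small steps: close to the limit
limit = math.exp(-t)                       # the continuous-time value e^(-t)
```

As the step shrinks, the discrete value moves toward $e^{-t}$, which is the sense in which the approximation improves as the time step becomes small.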

    The newborns who replace the deaths constitute a fraction $\delta t$ of the population (again approximately: exactly if we let $\delta t \to 0$). They are the result of random mating in the population under Hardy-Weinberg assumptions, so if the current population gene frequency of $A$ is $p_A(t)$, the newborns are of genotype $AA$ with probability $[p_A(t)]^2$. The $AA$ individuals after $\delta t$ units of time are a mixture of a fraction $\delta t$ of newborns and $1 - \delta t$ of survivors, so if $P_{AA}(t)$ is the frequency of genotype $AA$ at time $t$:

    $$P_{AA}(t + \delta t) = P_{AA}(t)\,(1 - \delta t) + \delta t\,[p_A(t)]^2 \tag{I-17}$$

    and (rearranging)

    $$\frac{P_{AA}(t + \delta t) - P_{AA}(t)}{\delta t} = [p_A(t)]^2 - P_{AA}(t). \tag{I-18}$$

    Taking the limit as $\delta t \to 0$, the left side of (I-18) is simply the derivative of $P_{AA}(t)$:

    $$\frac{dP_{AA}(t)}{dt} = [p_A(t)]^2 - P_{AA}(t). \tag{I-19}$$

    Similarly, it is easy to show that if $P_{Aa}(t)$ is the frequency of heterozygotes $Aa$ (and $aA$)

    $$\frac{dP_{Aa}(t)}{dt} = 2\,p_A(t)\,p_a(t) - P_{Aa}(t). \tag{I-20}$$

    Before attempting to solve these equations to find the way $P_{AA}(t)$ changes through time, it will be instructive to look at the gene frequency $p_A(t)$. This is equal to $P_{AA}(t) + \tfrac{1}{2}P_{Aa}(t)$. We can add together equations (I-19) and (I-20), after multiplying (I-20) by one-half. We get

    $$\frac{d\bigl(P_{AA}(t) + \tfrac{1}{2}P_{Aa}(t)\bigr)}{dt} = [p_A(t)]^2 + p_A(t)\,p_a(t) - P_{AA}(t) - \tfrac{1}{2}P_{Aa}(t), \tag{I-21}$$

    so

    $$\frac{dp_A(t)}{dt} = p_A(t)\,[p_A(t) + p_a(t)] - p_A(t) = 0. \tag{I-22}$$

    So $p_A(t) = p_A(0) = p_A$: the gene frequency does not change, just as we might have expected. Knowing that $p_A$ remains constant, as does $p_a$, means that we can solve equations (I-19) and (I-20) by treating $p_A(t)$ as a constant.
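The constancy of the gene frequency can also be seen by iterating the replacement step (I-17), together with the analogous updates for the other two genotypes, over many small time steps. The Python sketch below is only illustrative: the function name, the step size, and the starting genotype frequencies are all assumptions made for the example.

```python
def replacement_step(P_AA, P_Aa, P_aa, dt):
    """One step of length dt: a fraction dt dies and is replaced by random-mating newborns."""
    p_A = P_AA + 0.5 * P_Aa                          # current gene frequency of A
    p_a = 1.0 - p_A
    new_AA = P_AA * (1 - dt) + dt * p_A ** 2         # equation (I-17)
    new_Aa = P_Aa * (1 - dt) + dt * 2 * p_A * p_a    # analogous update for heterozygotes
    new_aa = P_aa * (1 - dt) + dt * p_a ** 2         # analogous update for aa
    return new_AA, new_Aa, new_aa


state = (0.5, 0.0, 0.5)        # hypothetical start: no heterozygotes, p_A = 0.5
dt = 0.001
p_A_initial = state[0] + 0.5 * state[1]
for _ in range(10000):         # 10 units of time, roughly 10 generations
    state = replacement_step(*state, dt)
p_A_final = state[0] + 0.5 * state[1]
```

In this run the gene frequency stays at its initial value (up to rounding), while the genotype frequencies drift toward the Hardy-Weinberg values 0.25, 0.5 and 0.25.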

    Before going through any algebraic details, we can see from (I-17) and (I-19) what the result will be. Equation (I-17) shows what is happening: as the initial generation of individuals dies out, it is replaced by newborns who are in Hardy-Weinberg proportions at the constant gene frequency $p_A$. Ultimately, when the last of the original individuals has died, the population will be in Hardy-Weinberg proportions. Equation (I-19) verifies this conclusion. If $P_{AA}(t) > p_A^2$, then we have more $AA$ individuals than Hardy-Weinberg proportions would predict. Then the right side of (I-19) is negative, so that $P_{AA}(t)$ decreases. Likewise, when $P_{AA}(t) < p_A^2$, it will increase. Ultimately $P_{AA}(t) = p_A^2$, and $P_{AA}$ will not change further.

    We can solve (I-19) by elementary separation of variables and integration. It first becomes

    $$\frac{dP_{AA}(t)}{[p_A(t)]^2 - P_{AA}(t)} = dt. \tag{I-23}$$

    Then (remembering that $p_A(t) = p_A$ is constant) we can integrate both sides:

    $$\int \frac{1}{p_A^2 - P_{AA}(t)}\,dP_{AA}(t) = \int dt, \tag{I-24}$$

    which yields

    $$-\log_e[\,p_A^2 - P_{AA}(t)\,] = t + C. \tag{I-25}$$

    We can determine the value of the unknown constant $C$ by setting $t = 0$. Then

    $$C = -\log_e(\,p_A^2 - P_{AA}(0)\,). \tag{I-26}$$

    So

    $$\log_e(\,p_A^2 - P_{AA}(t)\,) = -t + \log_e(\,p_A^2 - P_{AA}(0)\,). \tag{I-27}$$

    Taking the exponential function ($e^x$) of both sides of this equation:

    $$p_A^2 - P_{AA}(t) = [\,p_A^2 - P_{AA}(0)\,]\,e^{-t}, \tag{I-28}$$

    which shows that the deviation of $P_{AA}(t)$ from the Hardy-Weinberg proportion $p_A^2$ decays exponentially with time. Solving for $P_{AA}(t)$:

    $$P_{AA}(t) = P_{AA}(0)\,e^{-t} + p_A^2\,(1 - e^{-t}). \tag{I-29}$$

    This confirms precisely the explanation already given. As time passes, a fraction $e^{-t}$ of the population consists of survivors of the original population. A fraction $P_{AA}(0)$ of these are $AA$. All individuals born later are in Hardy-Weinberg proportions, so that a fraction $p_A^2$ of them are $AA$. Analogous equations hold for $P_{Aa}$ and $P_{aa}$. While $P_{AA}(t)$ approaches its limiting value exponentially, and never quite reaches it, all newborns are in Hardy-Weinberg proportions. In that sense, Hardy-Weinberg proportions are reached in one generation.
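One way to check the solution (I-29) is to compare it with a direct numerical integration of (I-19). The Python sketch below uses simple Euler steps; the function names, step size and starting values are hypothetical choices for illustration, not part of the original derivation.

```python
import math


def P_AA_closed_form(t, P_AA0, p_A):
    """Solution (I-29): survivors of the original population plus Hardy-Weinberg newborns."""
    return P_AA0 * math.exp(-t) + p_A ** 2 * (1.0 - math.exp(-t))


def P_AA_euler(t, P_AA0, p_A, dt=1e-4):
    """Integrate dP_AA/dt = p_A^2 - P_AA, equation (I-19), with simple Euler steps."""
    P = P_AA0
    for _ in range(round(t / dt)):
        P += dt * (p_A ** 2 - P)
    return P


P_AA0, p_A = 0.5, 0.5                  # hypothetical starting values
exact = P_AA_closed_form(3.0, P_AA0, p_A)
approx = P_AA_euler(3.0, P_AA0, p_A)
deviation_ratio = (exact - p_A ** 2) / (P_AA0 - p_A ** 2)   # should equal e^(-3), as in (I-28)
```

The two values agree closely, and the remaining deviation from $p_A^2$ after three time units is the fraction $e^{-3}$ of the initial deviation.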

    In the remainder of this book we will rarely make use of the overlapping-generations models, but you should keep in mind that there are overlapping-generations versions of some of the models treated here. However, overlapping-generations models are generally far less tractable than discrete-generations models. This is mostly because Hardy-Weinberg proportions cannot be assumed. As we have seen, they are approached only asymptotically even with random mating. If there is any evolutionary force, such as natural selection, making the population continually depart from Hardy-Weinberg proportions, we will have to follow genotype frequencies rather than gene frequencies, which makes life harder. In discrete-generations models one is usually in Hardy-Weinberg proportions once per generation, when the new generation of zygotes is produced.

    The monograph by Charlesworth (1980) should be consulted for a clear review of the problems involved in extending overlapping-generations models to cases in which birth and death rates are age-dependent.

    I.7 Different Gene Frequencies in the Two Sexes

    We have been assuming that the genotype frequencies are the same in both sexes. We now relax that assumption, in a discrete generations model which otherwise obeys all of the Hardy-Weinberg assumptions. We follow a population in which two alleles segregate. Suppose that in the initial generation the gene frequencies of $A$ in females and in males are, respectively, $p_f$ and $p_m$. Random mating is equivalent to the combination of a random female gamete with a random male gamete. Table 1.2 shows the resulting genotypes:

    Table 1.2: Genotype frequencies when gene frequencies differ in the sexes.

                                      Female gametes:
                                      $A$                   $a$
    Male gametes:                     $p_f$                 $1 - p_f$
      $A$      $p_m$                  $p_m p_f$             $p_m (1 - p_f)$
      $a$      $1 - p_m$              $p_f (1 - p_m)$       $(1 - p_f)(1 - p_m)$

    which give the genotype frequencies:

    $$
    \begin{array}{ll}
    AA & p_f\,p_m \\
    Aa & p_f (1 - p_m) + p_m (1 - p_f) \\
    aa & (1 - p_f)(1 - p_m)
    \end{array}
    \tag{I-30}
    $$

    We are assuming that the gene $A$ is unlinked to the sex chromosome or sex-determining locus. Thus in the offspring generation the genotypes $AA$, $Aa$, and $aa$ are distributed independently of the sex of the offspring. So in that generation, although the genotypes may not be in Hardy-Weinberg proportions, they are the same in both sexes. Therefore the next offspring generation is produced by parents with equal gene frequencies in both sexes, and it will therefore be in Hardy-Weinberg proportions, as will all subsequent generations. Putting primes on the $p_f$'s and $p_m$'s to denote the next generation, the gene frequency in the gametes forming the offspring generation is

    $$
    \begin{aligned}
    p_m' = p_f' &= p_f p_m + \tfrac{1}{2}\,[\,p_f (1 - p_m) + p_m (1 - p_f)\,] \\
                &= p_f p_m + \tfrac{1}{2} p_f - \tfrac{1}{2} p_f p_m + \tfrac{1}{2} p_m - \tfrac{1}{2} p_f p_m \\
                &= \tfrac{1}{2} p_f + \tfrac{1}{2} p_m.
    \end{aligned}
    \tag{I-31}
    $$
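The two-step approach to Hardy-Weinberg proportions can be traced numerically. In the Python sketch below, the function names and the starting frequencies are hypothetical; the first call applies (I-30) with unequal frequencies in the sexes, and the second applies it again with the common frequency given by (I-31).

```python
def offspring_genotypes(p_f, p_m):
    """Genotype frequencies (AA, Aa, aa) from random union of female and male gametes, (I-30)."""
    AA = p_f * p_m
    Aa = p_f * (1 - p_m) + p_m * (1 - p_f)
    aa = (1 - p_f) * (1 - p_m)
    return AA, Aa, aa


def gene_frequency(genotypes):
    """Frequency of A among the gametes produced by a generation with these genotypes."""
    AA, Aa, _ = genotypes
    return AA + 0.5 * Aa


p_f, p_m = 0.8, 0.2                     # hypothetical frequencies of A in females and males
gen1 = offspring_genotypes(p_f, p_m)    # first offspring generation: not Hardy-Weinberg
p1 = gene_frequency(gen1)               # common to both sexes, equals (p_f + p_m) / 2
gen2 = offspring_genotypes(p1, p1)      # second generation: Hardy-Weinberg proportions
```

With these values the first offspring generation has an excess of heterozygotes (0.68 against the Hardy-Weinberg value of 0.5), and the second generation is in Hardy-Weinberg proportions at gene frequency 0.5.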

    It is intuitively obvious why this must be so. Of the genes in the gametes produced by the first generation, half come from the initial females and half from the initial males. This is true even if there is a great inequality of the sex ratio: even if there are very few females (say), the symmetry of mating (the fact that each mating consists of one male and one female) ensures that (I-31) will hold. The totality of male genes is copied into the next generation as many times as the totality of female genes.

    The picture we get from all this is that after starting with unequal male and female gene frequencies, we do not reach Hardy-Weinberg proportions in the offspring. But we do achieve equal gene frequencies in the two sexes of the offspring. In the second generation Hardy-Weinberg proportions are achieved. So the effect of unequal gene frequencies in the two sexes is to delay achievement of Hardy-Weinberg proportions by one generation. We can still say that the overall gene frequency of the population does not change. But we can only say this if we define it as $p = \tfrac{1}{2} p_f + \tfrac{1}{2} p_m$, irrespective of the actual numbers of the two sexes. In other words, we must count the aggregate of all females as contributing as much to the population gene frequency as the aggregate of all males. Any other weighting system, such as counting each individual as equivalent, will lead to the population gene frequency changing during the first generation.
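The point about weighting can be made concrete with a skewed sex ratio. In the Python sketch below, the function names, the gene frequencies and the numbers of each sex are all hypothetical. Weighting each sex equally gives a parental gene frequency that matches the offspring value from (I-31), whereas weighting each individual equally does not.

```python
def offspring_gene_frequency(p_f, p_m):
    """Gene frequency among offspring, (I-31): each sex contributes half of the genes."""
    return 0.5 * p_f + 0.5 * p_m


def per_individual_frequency(p_f, p_m, n_females, n_males):
    """Parental gene frequency when every individual is given equal weight."""
    return (n_females * p_f + n_males * p_m) / (n_females + n_males)


p_f, p_m = 0.8, 0.2
n_females, n_males = 10, 90            # hypothetical, strongly skewed sex ratio

equal_sex_weighting = 0.5 * p_f + 0.5 * p_m
headcount_weighting = per_individual_frequency(p_f, p_m, n_females, n_males)
next_generation = offspring_gene_frequency(p_f, p_m)
```

Here the headcount-weighted parental frequency is 0.26, while the offspring frequency is 0.5, so under that weighting the gene frequency would appear to change in the first generation.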

    In this presentation, $p$ has been the frequency of an allele $A$, and $1 - p$ of $a$. But we could as easily have designated $1 - p$ as being the frequency of all other alleles than $A$. So the above argument applies to the frequency of an allele $A$ irrespective of how many other alleles there are. Having multiple alleles in a population will not alter the conclusions.

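    GENOTYPE FREQUENCIES. Finally, we verify the direction of departure of genotype frequencies from Hardy-Weinberg proportions. Suppose that, instead of having variables $p_f$ and $p_m$, we measure the gene frequency in each sex as the average gene frequency plus (or minus) a deviation $\delta$ from that quantity, so that

    $$
    \begin{aligned}
    p_f &= p + \delta \\
    p_m &= p - \delta.
    \end{aligned}
    \tag{I-32}
    $$

    Then the genotype frequencies in the next generation are:

    $$
    \begin{array}{ll}
    AA & (p + \delta)(p - \delta) \\
    Aa & (p + \delta)(1 - p + \delta) + (p - \delta)(1 - p - \delta) \\
    aa & (1 - p - \delta)(1 - p + \delta)
    \end{array}
    \tag{I-33}
    $$

    or (collecting terms)

    $$
    \begin{array}{ll}
    AA & p^2 - \delta^2 \\
    Aa & 2p(1 - p) + 2\delta^2 \\
    aa & (1 - p)^2 - \delta^2.
    \end{array}
    \tag{I-34}
    $$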
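The pattern in (I-34) can be confirmed numerically. The Python sketch below (the function name and the values of the average frequency and the deviation are hypothetical) computes the departures from Hardy-Weinberg proportions for a positive and a negative deviation of the same size.

```python
def hw_departures(p, delta):
    """Observed minus Hardy-Weinberg frequencies (AA, Aa, aa) when p_f = p + delta, p_m = p - delta."""
    p_f, p_m = p + delta, p - delta
    observed = (
        p_f * p_m,
        p_f * (1 - p_m) + p_m * (1 - p_f),
        (1 - p_f) * (1 - p_m),
    )
    expected = (p ** 2, 2 * p * (1 - p), (1 - p) ** 2)
    return tuple(o - e for o, e in zip(observed, expected))


dev_pos = hw_departures(0.4, 0.1)      # females have the higher frequency of A
dev_neg = hw_departures(0.4, -0.1)     # males have the higher frequency of A
```

Both calls give departures of $-\delta^2$, $+2\delta^2$ and $-\delta^2$ (here $-0.01$, $+0.02$, $-0.01$), so the sign of the deviation does not matter.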

    This demonstrates that in the two allele case, if there is any difference between gene frequencies in the sexes, if $\delta \neq 0$, there will be a departure from Hardy-Weinberg proportions in the next generation. Furthermore, whether $\delta$ is positive or negative, the result is the same: there are fewer homozygotes and more heterozygotes than we would expect from Hardy-Weinberg proportions.

    Figure 1.2: Segregation of an allele completely linked to a sex-determining locus in a haploid organism.

    With multiple alleles, there must also be a deficit of each homozygote class, and also an average excess of heterozygotes compensating for this. But specific heterozygote classes can be in deficit, despite the fact that there is an overall excess of heterozygotes.
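A three-allele example shows how a particular heterozygote class can be in deficit. The Python sketch below is illustrative only: the function name and the gamete frequencies are hypothetical, and alleles are indexed from 0, so the key `(0, 1)` stands for the $A_1A_2$ heterozygote. Hardy-Weinberg expectations are computed at the equally weighted average of the female and male frequencies.

```python
def departures_from_hw(f, m):
    """Observed minus Hardy-Weinberg frequency for each unordered genotype (i, j) with i <= j."""
    n = len(f)
    p = [0.5 * (fi + mi) for fi, mi in zip(f, m)]    # average gene frequencies
    result = {}
    for i in range(n):
        for j in range(i, n):
            if i == j:
                observed, expected = f[i] * m[i], p[i] ** 2
            else:
                observed = f[i] * m[j] + f[j] * m[i]
                expected = 2 * p[i] * p[j]
            result[(i, j)] = observed - expected
    return result


female = [0.4, 0.4, 0.2]   # hypothetical frequencies of A1, A2, A3 in female gametes
male = [0.2, 0.2, 0.6]     # hypothetical frequencies in male gametes
dev = departures_from_hw(female, male)
total_heterozygote_excess = sum(v for (i, j), v in dev.items() if i != j)
```

In this example every homozygote class is in deficit and the heterozygotes as a whole are in excess, yet the $A_1A_2$ class is itself in deficit, because $A_1$ and $A_2$ are both more frequent in females than in males.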

    Biologically, the main implication of the results of this section is that for autosomal loci, we would not expect to see gene frequency differences between the sexes unless some evolutionary force continually created such differences. This has an interesting implication for differentiation of the sexes: it will be difficult to explain it by genotypic differences at loci that are n