Genetic Diversity and Genetic Burden in Humans
Henry Harpending (corresponding author)
Department of Anthropology, University of Utah
Salt Lake City UT 84112 USA
phone: 801 581 3776
fax: 801 581 6252
email: harpend@xmission.com
Gregory Cochran
Department of Anthropology, University of Utah
April 2005
Abstract
We discuss categories of genetic diversity in humans. Neutral diversity, population differences
in frequencies of genetic markers that we think are invisible to natural selection, provides
a passive record of population history but is otherwise of little interest in human biology.
Genetic variation related to disease can be separated into mutational noise and variation
due to selection, either ongoing selection or the effects of a past environment.
We distinguish consequences of genetic diversity for fitness, relevant to evolution, and
consequences for well-being, relevant to medicine and public health. We call genetic varia-
tion that causes impairment of health or well-being of individual humans “apparent genetic
burden” and variation that has effects on fitness but not well-being “unapparent genetic
burden.” We use “burden” to distinguish these notions from the classical concept of “ge-
netic load” that refers to effects on population fitness, a concept formulated by Morton et al.
(1956).
We distinguish adapted genes and adapted genotypes: an adapted gene is a gene that
increases fitness of its bearer either in heterozygous or homozygous state or both, while an
adapted genotype is a genotype that increases fitness of its bearer but is not transmitted
intact to future generations. Balanced polymorphisms in which the heterozygote is superior
in fitness may generate most adapted genotypes. In the face of major rapid environmental
change adapted genotypes appear first but over time they are replaced by adapted genes.
The presence of adapted genotypes is a good indication of recent environmental change:
for example there are apparently many polymorphisms in domestic animals of this nature,
responses to domestication, and many fewer in wild animals (and in humans).
Keywords: genetic burden, mutation, rapid selection, genetic polymorphism, selective sweep,
Ashkenazi Jews
Introduction
Much of human genetic diversity is generally thought to be neutral, that is, invisible to
natural selection. Neutral diversity is of interest because it contains a kind of record of
human history. The global patterns of human neutral diversity are now well known and
understood as a consequence of DNA typing and sequencing technologies that have been
applied to samples from many human populations.
Diversity that is not neutral, selected diversity, may also be important for understanding
history, but a more important question in many fields is about how much variation in health
and well-being is a consequence of mutation and natural selection.
Much selected diversity must be just random damage, deleterious mutations that are
eventually removed by natural selection. Many rare genetic disorders are probably in this
category. Since the highest known spontaneous mutation rates for disorders like this are on
the order of 10^-4, anything with a total fitness cost greater than 10^-4 is almost certainly not
in this category.
Many polymorphic systems that have health and fitness consequences are leftovers, adap-
tations to past environments. For example sickle cell heterozygotes enjoy some protection
from falciparum malaria. Malaria can be eliminated in a region instantly in evolutionary
time, but the genetic variation that evolved in response to the malaria persists as detrimental
or lethal in the new environment.
Neutral diversity
Neutral diversity is DNA diversity that is invisible to natural selection. Much of our DNA
is apparently “junk” in which mutations have no biological consequences. Even synonymous
mutations in coding regions may be close to neutral but apparent exceptions are known.
Neutral diversity is then of only indirect evolutionary and medical interest. On the other
hand it does provide a passive record of population history precisely because it is unnoticed
by natural selection: there is information about past population movements and episodes of
growth and decline in it. Neutral diversity also provides markers in the genome useful for
genetic counseling and for searching for nearby functional genes in linkage disequilibrium.
Discussions of neutral genetic diversity distinguish diversity within and between popu-
lations. Diversity within populations refers directly or indirectly to the average amount of
difference between random DNA sequences chosen from that population. This can be mea-
sured by the mean pairwise number of sequence differences in a collection of sequences, by
the squared difference in length for repeat polymorphisms, by heterozygosity for collections
of simple markers, and so on.
Within-population diversity is greatest in sub-Saharan Africa and gradually declines
away from Africa into the New World. For a collection of repeat polymorphisms (Eller, 1999)
diversity declines approximately 15 to 20% from Africa to northeast Asia and as much as 30%
into the New World. The interpretation is usually that the population of Africa is “older”,
that is, that it was demographically successful earlier, while populations outside Africa
are primarily derived from migrants from the edge of the population of sub-Saharan African
ancestors (Eswaran, 2002). Interestingly the increased diversity in Africa was not apparent
until genetic systems like STRs became available that were not so subject to ascertainment
bias. In the standard encyclopedic summary of classical marker polymorphisms (Cavalli-
Sforza et al., 1994) there is no trace of it. Classical markers were mostly discovered in
Europeans, biasing the sample toward those polymorphic in Europeans.
The standard way of expressing diversity between populations is some variant of a statistic
called Fst. We pick pairs of genes or sequences from within populations, measure how
different they are on average, and call this Ho. We then pick random pairs from the whole
sample without regard to population membership, measure their average difference, and call
this He. The statistic
Fst = (He − Ho) / He        (1)
can then be interpreted as a measure of inter-population diversity because it describes the
fraction of total diversity He that is between populations. For a collection of human groups
from all over the earth this statistic is usually 0.10 to 0.15. This has been well-known for
decades (Lewontin, 1972) and has remained unchanged with the advent of new large datasets.
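
The calculation in equation (1) can be sketched as follows. This is only a minimal illustration, assuming a single biallelic locus, equally weighted populations, and expected heterozygosity 2p(1 − p) as the measure of average difference; the function names and the allele frequencies are hypothetical and do not come from any dataset discussed here.

```python
def heterozygosity(p):
    """Probability that two alleles drawn at random differ, for allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Fst = (He - Ho) / He for equally weighted populations at one biallelic locus.

    freqs: frequency of the same allele in each population (illustrative input).
    """
    # Ho: average difference between pairs drawn within a population
    ho = sum(heterozygosity(p) for p in freqs) / len(freqs)
    # He: difference between pairs drawn from the pooled sample
    p_pooled = sum(freqs) / len(freqs)
    he = heterozygosity(p_pooled)
    return (he - ho) / he


example = fst([0.35, 0.65])  # hypothetical frequencies in two populations
print(round(example, 3))     # 0.09
```

With these made-up frequencies Ho is 0.455 and He is 0.5, so Fst is 0.09, close to the lower end of the range quoted for worldwide samples.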
Lewontin emphasized in his article that 1/8 is a small number, that most neutral genetic
diversity is within populations, and therefore that human group differences were small and
minor. It is not clear why 1/8 should be considered small: the relative differences among
human populations implied by a kinship of 1/8 are the same as the relative differences
among sets of half siblings. Most of us do not think that kinship between half siblings or
grandparents and grandchildren is trivially small. (Note that in behavioral ecology genetic
similarity is often discussed in terms of the coefficient of relationship while in genetics the
coefficient of kinship is more frequently used. In the simple case the coefficient of relationship
is twice the coefficient of kinship (Bulmer, 1994), that is 1/4 between half sibs.)
As Edwards points out (Edwards, 2003), Fst statistics do not take into account the
correlation structure of gene-frequencies in groups. Correlated differences in the frequencies
of alleles that influence a phenotypic trait can cause arbitrarily large differences in that
trait, even while the great majority of genetic diversity is intra-population. These correlated
gene-frequency differences are exactly what we expect from natural selection, but they also
accumulate under pure random drift.
A consequence of human neutral diversity is that it is possible to assign an individual to
an ethnic group correctly nearly every time when a reasonable number of variable sites, 50
to 100, have been typed.
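
A rough sketch of why a modest number of sites suffices, under strong simplifying assumptions that are ours and not from the studies cited: two populations, independent biallelic loci, and a reference allele with frequency p in the individual's own population and 1 − p in the other. Under that symmetry the maximum-likelihood rule reduces to a majority vote over the sampled alleles, so the probability of a correct assignment is an exact binomial sum. The function name is hypothetical.

```python
from math import comb


def prob_correct_assignment(n_loci, p):
    """Chance of assigning a diploid individual to the right one of two populations.

    Assumes (for illustration only) n_loci independent biallelic loci whose
    reference allele has frequency p in the individual's own population and
    1 - p in the other, with p > 0.5. Under that symmetry the likelihood
    rule is a majority vote over the 2 * n_loci sampled alleles.
    """
    n = 2 * n_loci
    total = 0.0
    for k in range(n + 1):
        prob_k = comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if 2 * k > n:
            total += prob_k          # majority points to the true population
        elif 2 * k == n:
            total += 0.5 * prob_k    # tie broken by a coin flip
    return total


for n_loci in (1, 10, 50, 100):
    print(n_loci, round(prob_correct_assignment(n_loci, 0.65), 4))
```

Frequencies of 0.65 and 0.35 correspond to an Fst of 0.09 at each locus, so each site alone is a poor classifier (65% correct), yet the error rate with 50 such sites is already well under 1%.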
The pattern of neutral differences between populations in our species is one of isolation by
distance: populations close to each other on the ground are genetically similar to each other,
while distant populations are more genetically dissimilar at neutral loci. This means that
there is a high correlation between measures of geographical distance and genetic distance
between populations. This has been a surprise to many because the global pattern can be
very different for external appearance. Dark skin color, for example, is found in Africa and
again in Australia and much of Oceania but is not so common in populations in between.
This similarity in skin color is not reflected in the neutral genome, suggesting that skin color
differences are caused by natural selection rather than passive genetic similarity. Similarly
there are small “pygmoid” people in Africa whose appearance is similar to that of groups in
the Indian Ocean, Australia, and Oceania. There is no evidence from neutral genes of any
population kinship of these groups, so again the natural inference is that the morphology
has been generated and maintained by selection.
Diversity generated by random mutation
Many genetic diseases and disorders are thought to be consequences of random damage to
genetic material. In general the total impairment of fitness due to mutation at a locus is,
at equilibrium, close to the mutation rate at the locus. Therefore any condition whose total
effect on population fitness is greater than 10^-4 is almost certainly not purely
mutation-driven, since this is the upper limit of known mutation rates.
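
The equilibrium statement can be checked with the standard textbook approximation for a fully recessive deleterious allele. That model is an assumption made here for illustration (the argument above does not specify one): with mutation rate u and a fitness reduction s in homozygotes, the equilibrium allele frequency is about sqrt(u/s) and the resulting load is s·q², which equals u whatever the value of s.

```python
from math import sqrt


def recessive_equilibrium(u, s):
    """Approximate mutation-selection balance for a fully recessive allele.

    u: mutation rate per locus per generation
    s: fitness reduction of homozygotes (0 < s <= 1)
    Returns (equilibrium allele frequency, fitness load on the population).
    """
    q = sqrt(u / s)      # selection removes roughly s*q^2 per generation, mutation adds u
    load = s * q * q     # fraction of population fitness lost
    return q, load


for s in (1.0, 0.1, 0.01):
    q, load = recessive_equilibrium(1e-4, s)
    print(f"s={s}: q={q:.4f}, load={load:.1e}")
```

Milder alleles settle at higher frequencies, but the load stays at the mutation rate in every case, which is why a condition costing much more than 10^-4 in fitness needs another explanation.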
Sometimes, either accidentally or deliberately, we lump a number of different disease entities
with specific individual causes into a broad symptomatic category: such broad-category
diseases can be considerably more common.
For example, consider congenital deafness, which is made up of many different diseases,
basically anything that in some way interferes with hearing before birth. In the past, perhaps
as many as one in 500 individuals were born deaf. About half of those cases were caused
by prenatal infection, mostly rubella (now prevented by vaccination). The other half were
caused by many different mutations: tens of loci have been identified, although a single locus,
connexin-26, accounts for 40% of genetic deafness. Among Caucasians, a single connexin-26
mutation dominates (35delG): there is reason to think that it became common due to
heterozygote advantage, probably as a defense against some skin disease. About 15% of the
many mutations causing deafness are syndromic—that is, they cause other symptoms as well
as deafness. Waardenburg syndrome causes a white forelock as well as deafness, and other
mutations causing syndromic deafness can have much more serious effects.
Even with a historical fitness load of some 10^-3, not particularly high, and even though it
is a broad category rather than a specific disease entity, most congenital deafness was most
likely a result of pathogen pressures rather than mutational noise. Part was in the form
of direct infection (rubella) and part in the form of the long-term consequences of a selective
response to pathogens, the common connexin-26 mutations. Although we need to be careful
to avoid lumping that leads to this kind of imprecise disease classification, it still seems
that we must look to sources other than mutational pressure to explain familiar disorders
with large effects on population fitness, such as diabetes, obstructive arterial disease, or
schizophrenia. Parenthetically, the existence of a syndromic subset of a given disease is
probably a sign that it is really a broad category rather than a specific disease entity.
The extent to which minor discomforts, aches and pains, and idiosyncratic ill health reflect
random mutational damage is not known. If an average rate of deleterious mutation per
gene per generation is 10^-5, and there are slightly fewer than 10^5 genes in the human genome,
an average gamete may carry less than one new deleterious mutation. Many of these may
have no apparent consequences for fitness in the current environment. On the other hand
this damage accumulates. The Morton, Crow, and Muller genetic load theory attempted
to estimate how much random deleterious noisy diversity we each carried by looking at the
increase of morbidity and mortality with inbreeding in humans: random mutational dam-
age, mostly recessive, should be “revealed” by inbreeding while other diversity, for example
that maintained by higher heterozygote fitness, should not increase much with inbreeding.
Unfortunately the overall results of those efforts were inconclusive: there were deleterious
effects of inbreeding but the slope of the regression of fitness loss on amount of inbreeding
was squarely between the predictions of the deleterious mutants model and the heterozygote
advantage model.
Selectively maintained diversity
Advantageous genes
Although mutation is the ultimate source of all genetic diversity, persistence of genetic varia-
tion that affects health and fitness must usually reflect natural selection. We will distinguish
between selection for advantageous genes and selection for advantageous genotypes. An
advantageous gene is a gene that increases fitness of its bearer either in heterozygous or
homozygous state or both. An advantageous genotype is a genotype that is not directly
transmitted to offspring, for example the heterozygous genotype of the sickle cell polymor-
phism in malarial environments.
Persistence into adulthood of the ability to synthesize lactase is apparently an advan-
tageous gene among dairying people, especially when there is consumption of unfermented
milk. Since those unable to digest lactose suffer from gastrointestinal upset in such environ-
ments, lactase persistence is likely advantageous in both the homozygous and heterozygous
states. Lactase persistence is an advantageous gene, in the sense that it is unconditionally
favored by selection in a certain environment, that of consumption of unfermented dairy
foods.
Another unconditionally advantageous gene may be the CCR5 ∆32 variant that protects
individuals from HIV progression to AIDS (and probably became common in European
populations because it gave protection against smallpox). There are no known ill effects of
this gene, even though it seems to break a part of the immune system. Presumably in an
environment of hyperendemic HIV infection the gene would go to fixation.
We don’t know many cases of such purely advantageous genes, perhaps because they go
to fixation so rapidly that it is difficult to find any in the transient stage of going to fixation:
certainly the more advantageous such a gene is the more rapidly it will displace any other
alleles at the locus.
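
How quickly such a gene passes through the transient stage can be sketched with a deliberately simple model. The assumptions are ours: deterministic genic selection in which carriers of the allele have relative fitness 1 + s, no drift or dominance effects, and arbitrary start and end frequencies of 0.01 and 0.99. The function name is hypothetical.

```python
def generations_to_spread(s, start=0.01, end=0.99):
    """Count generations for an allele with advantage s to go from start to end.

    Uses the deterministic genic-selection recursion p' = p(1 + s) / (1 + p*s);
    drift, dominance and population structure are ignored (illustrative only).
    """
    p = start
    generations = 0
    while p < end:
        p = p * (1.0 + s) / (1.0 + p * s)
        generations += 1
    return generations


for s in (0.01, 0.05, 0.2):
    print(s, generations_to_spread(s))
```

In this model the time spent at intermediate frequencies shrinks roughly in proportion to 1/s, so a strongly favored allele is polymorphic for only a few dozen generations.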
Advantageous genotypes
The best understood advantageous genotype in humans is the carrier state of the sickle cell
gene in areas of hyperendemic falciparum malaria. Carriers of a single sickling mutation
(HbS) of the hemoglobin beta chain enjoy a fitness advantage while bearers of two copies
of HbS are severely ill and usually die in early childhood in the absence of modern medical
care. The heterozygote HbA/HbS is an advantageous genotype in the sense that bearers
of that genotype enjoy health and fitness advantages in that environment. The result is
a balanced polymorphism, in which selection maintains the genetic diversity at the locus.
As HbA becomes more common in a population, more and more HbS alleles are spending
lifetimes in heterozygous HbA/HbS genotypes and the fitness, and hence the frequency, of HbS
increases. Similarly, as HbA declines in frequency, perhaps because of drift, the fitness of HbS
alleles decreases. At the equilibrium, the fitness of each allele is exactly the same, indeed
this is the definition of equilibrium.
The biology in this case is that heterozygotes have slightly impaired red blood cells, and
the impairment hurts the malaria more than it hurts the individual. The net effect of this
adaptation to malaria on population fitness is unfortunately not great. Consider a simple
numerical example: in some environment saturated with falciparum malaria the fitness of
wild-type HbA/HbA homozygotes is 1, the fitness of HbA/HbS heterozygotes is 1.2, and
the fitness of those HbS/HbS homozygotes, born with sickling disease, is 0. In the absence
of the sickling gene the population fitness is 1 since everyone is HbA/HbA, while with the
polymorphism the equilibrium frequency of the wild-type allele is 1.2/1.4 ≈ 0.86 and the
average population fitness is only 1.03, a trivial increase.
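
The numbers in this example follow from the standard equilibrium for heterozygote advantage under random mating. The sketch below reproduces them; the function name is ours, and the fitness values are the ones used in the example.

```python
def overdominant_equilibrium(w_aa, w_as, w_ss):
    """Equilibrium under heterozygote advantage with random mating.

    w_aa, w_as, w_ss: fitnesses of HbA/HbA, HbA/HbS and HbS/HbS.
    Returns (frequency of HbA, frequency of HbS, mean population fitness).
    """
    s_aa = w_as - w_aa   # disadvantage of HbA/HbA relative to the heterozygote
    s_ss = w_as - w_ss   # disadvantage of HbS/HbS relative to the heterozygote
    p = s_ss / (s_aa + s_ss)   # equilibrium frequency of HbA
    q = 1.0 - p
    mean_fitness = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    return p, q, mean_fitness


p, q, w_bar = overdominant_equilibrium(1.0, 1.2, 0.0)
print(round(p, 2), round(w_bar, 2), round(q * q, 2))  # 0.86 1.03 0.02

# Marginal fitness of each allele at equilibrium: both equal the mean fitness
fitness_hba = p * 1.0 + q * 1.2
fitness_hbs = p * 1.2 + q * 0.0
print(round(fitness_hba, 4), round(fitness_hbs, 4))
```

The last two lines illustrate the equilibrium condition stated earlier: the average fitness of an HbA allele and of an HbS allele are identical, and both equal the population mean.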
In this population the infant mortality rate due to sickle cell disease is 0.14^2 ≈ 0.02 or 20
per thousand live births. The tragedy of this adaptation is that malaria can disappear with
environmental change but the genetic adaptation to malaria persists for centuries, initially
causing the deaths soon after birth of about 2% of all babies born but declining slowly
afterward.
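
The slow decline can be sketched under assumptions that go beyond the example: once malaria is gone the heterozygote has the same fitness as HbA/HbA, HbS/HbS remains lethal, mating is random, and there is no new mutation or migration. The HbS frequency q then follows the recessive-lethal recursion q' = q / (1 + q). The function name is hypothetical.

```python
def hbs_decline(q0, generations):
    """Frequency of a recessive lethal allele after heterozygote advantage ends.

    Assumes random mating, HbA/HbS fitness equal to HbA/HbA, and HbS/HbS
    fitness 0, which gives the recursion q' = q / (1 + q).
    """
    q = q0
    history = [q]
    for _ in range(generations):
        q = q / (1.0 + q)
        history.append(q)
    return history


history = hbs_decline(1.0 - 1.2 / 1.4, 20)  # start from the malarial equilibrium
for gen in (0, 5, 10, 20):
    q = history[gen]
    print(gen, round(q, 3), f"{1000 * q * q:.1f} deaths per 1000 births")
```

Starting from q = 1/7 the recursion gives q = 1/(7 + n) after n generations, so after 20 generations, several centuries, the disease still claims about 1.4 per thousand births.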
What if mutation and natural selection hit upon a gene that had the same physiological
consequences as the HbA/HbS heterozygous state? Such a gene would quickly proceed to
fixation in the malarial environment and population fitness would increase by 20% rather
than 3%. An advantageous gene at a locus must usually be superior to an advantageous
genotype but more difficult for mutation and natural selection to “discover.” In general
balanced polymorphisms, advantageous genotypes, are the first responses to drastic envi-
ronmental change, and they are later replaced by advantageous genes. The most familiar
examples of balanced polymorphism in humans are those responsive to malaria, especially
falciparum malaria. Deadly malaria is both a very strong and a new selective agent, and
indeed the presence of numerous balanced polymorphisms is testimony to its novelty.
We expect to find advantageous genotypes in populations that have been exposed to
recent strong selection, and that seems to be the pattern (Orr, 2005). Many are known in
domesticated animals, many fewer in wild animals.
Polymorphisms in domesticates
We know of a number of examples of genes of strong effect in domesticated animals, and in
many cases we understand the selective pressures involved.
Myostatin is a protein that regulates and limits muscle growth, and we find several
different high-frequency myostatin mutations in some breeds of beef cattle, such as Belgian
Blue, Piedmontese, and South Devon. These myostatin mutations render the gene inactive
and result in a phenotype known as double muscling, which increases muscle mass—the
target of selection in these breeds. Homozygotes have calving difficulties, and thus these
myostatin mutations produce adapted genotypes but are not adapted genes in our usage
(McPherron and Lee, 1997; Grobet et al., 1997).
Pigs too have at least one prominent gene of strong effect, a mutated ryanodine receptor
that has a significant effect on carcass lean content (Wendt et al., 2000). It seems to have
become common in the 1970s, as breeders attempted to adjust to changing market tastes
for less lard and more lean meat. Homozygotes for the gene have poorer meat quality than
normal pigs, and are extremely susceptible to stress. Again, this is a mutation that produces
an adapted genotype in heterozygotes at the cost of terribly maladapted homozygotes.
Another clear example is hyperkalemic periodic paralysis in quarter horses, characterized
by sporadic attacks of muscle tremors, weakness, and collapse. This is a dominant muscle
disorder caused by a mutant allele of the skeletal muscle form of the sodium channel, found
only in descendants of the quarter horse Impressive (Naylor, 1997; Cannon et al., 1995). It
has spread rapidly since it originated in 1968, and it now exists in approximately 100,000
quarter horses. It produces a muscular phenotype that has been selected by show judges: of
the top 15 halter horses in 1992, 13 were descendants of Impressive.
Perhaps the most dramatic examples, in terms of a significant life-history change, are the
mutations causing twinning in domestic sheep. Twinning is rare in wild sheep and is still
rare in most breeds of domesticated sheep. It reduced fitness in typical past environments,
since it was difficult for the ewe to take care of two offspring. But in modern conditions,
where sheep experience very favorable environments and considerable human intervention,
twinning increases fitness, and now several different mutations that induce twinning are
found at polymorphic frequencies in some breeds of sheep. We now know of four different
mutations of the same X-linked gene (bone morphogenetic protein 15) that cause twinning
in heterozygotes and sterility in homozygotes (in Inverdale, Hanna, Belclare, and Galway
sheep). We know of another twinning mutation involving the bone morphogenetic protein
1B receptor (Booroola sheep - homozygotes have triplets!) (Davis, 2005; Galloway et al.,
2000; McNatty et al., 2001). This pattern, multiple mutations in the same enzyme path, is
common in cases of recent strong selection. We see the same thing in malaria defenses such
as HbS and the thalassemias, which tweak the hemoglobin molecule in different ways. There
is also a broader clustering, mutations aimed at a common physiological target, including
the malaria-defense examples like G6PD, which changes the environment inside the red cell,
and Melanesian ovalocytosis, which changes the red cell membrane.
In terms of those searching for genotypic correlates of disease in humans, advantageous
genotypes would appear as “major genes” affecting some inherited disease. Random diversity
consequent to deleterious mutation would lead to rare genes that would likely be geograph-
ically local. In fact much of human gene hunting has turned up rare local mutants but very
few major genes of large effect (Orr, 2005). The simple prediction from evolutionary theory
is that there aren’t very many, and those that are present in our population are responses to
environmental change that is both recent and severe. Many such major genes are known in
domestic animals because, we think, the domestication process and later selection have been
precisely the kind of new environment of strong selection that leads initially to the evolution
of advantageous genotypes and only later to the evolution of advantageous genes.
Leftover diversity
As we discussed above, one of the great tragedies of the HbS response to hyperendemic
falciparum malaria is the long persistence of the genetic response when the external agent,
malaria, is eliminated. Thousands of premature deaths and compromised lives in North
America are caused, indirectly, by malaria since they are the costs of an adaptation no
longer relevant in a malaria-free environment. The genetic diversity is “leftover” from a past
environment.
There are large numbers of humans on earth, we are mobile, and we occupy a wide range
of environments. For all these reasons we are especially prone to epidemic infectious diseases.
Many known polymorphisms in our species are thought to be responses to infectious disease,
essentially again “leftovers” from past environments to the extent that we have managed to
control or eradicate the pathogens.
Infections
The importance of expensive genetic defenses to infection varies geographically, depending
on the historical impact of infectious disease pressure. On a worldwide basis, they account
for the majority of cases of genetic disease. Genetic defenses against falciparum malaria
are far and away the most important part of this story. Wherever falciparum malaria has
existed for a long time, mainly the tropical areas of the Old World, we find many expensive
genetic defenses against malaria, and those defenses account for the great preponderance of
all genetic disease in populations originating in those regions. The most important are sickle-
cell (HbS), HbC, HbE, alpha- and beta- thalassemia, Melanesian ovalocytosis, and G6PD
deficiency. They are far more common than “noise” genetic diseases caused by mutational
pressure. For example, about 250,000 children are born with sickle cell anemia each year
worldwide, while about 5,000 boys are born with Duchenne’s muscular dystrophy, one of the
most common mutation-driven genetic diseases (WHO, 1994). These malaria defenses give
heterozygote advantage while causing problems of varying severity in homozygotes. They
are not adapted genes, but instead produce adapted genotypes. This sort of simple, sloppy
adaptation is atypical of species near equilibrium with the selective environment. Normally,
adaptations involve a number of genes that work together in a coordinated way. We think
that this evolutionary sloppiness exists because falciparum malaria, as we know it today,
has not been around very long; perhaps as little as 4,000 years (Carter and Mendis, 2002).
The same appears to be true of the anti-malaria genetic defenses. For example, the main
African variety of G6PD deficiency is roughly 2500 years old (Sabeti et al., 2002). The end
of the ice age and increased population density resulting from the spread of agriculture seem
likely to have favored the spread of this virulent form of malaria. This trend was particularly
unpleasant in Africa, where mosquito strains evolved that prefer humans to animals, which
greatly facilitated malaria transmission. Vivax malaria is milder, propagates over a wider
range of conditions, and is likely much older than falciparum malaria. There is at least one
genetic defense that completely prevents infection—the Duffy negative allele—which does
not appear to cause disease. Duffy negative, which is near fixation among central Africans,
is thus a good example of an adapted gene.
There are a number of other genetic diseases that clearly have been favored by selec-
tion and seem likely to have given protection against some pathogen other than falciparum
malaria. Cystic fibrosis (CF), the most common serious genetic disease among Europeans,
alters the cystic fibrosis transmembrane regulator (CFTR) protein. There is good reason
to believe that it has reached its present high frequency through natural selection (Slatkin
and Bertorelle, 2001). Salmonella typhi, the agent for typhoid fever, uses CFTR to enter
intestinal cells, and inefficiently infects cells heterozygous for the main European mutation,
∆F508 (Pier et al., 1998). That mutation is apparently considerably older than the malaria
defenses. Typhoid has had a smaller impact on human fitness than falciparum malaria, but
it may have been around longer. The typhoid carrier state, caused by a persistent infection
of the gall bladder in a few percent of those infected, would have allowed typhoid propagation
at low population density.
The common European hemochromatosis mutation (C282Y) probably works in a similar
way, altering a pathogen receptor. The HFE protein is normally expressed in intestinal crypt
cells, where it regulates iron absorption. The C282Y-mutant form of HFE fails to reach the
cell surface. In homozygotes this sometimes leads to iron overload and organ damage such as
cirrhosis of the liver. At one time the general opinion was that increased iron absorption in
hemochromatosis carriers yielded heterozygote advantage, but this was always dubious, since
the C282Y HFE is only common in northwestern Europe, a region not particularly prone
to anemia. Recent studies (Hunt and Zeng, 2004) show that heterozygotes show only a tiny
increase in iron absorption. More likely the HFE protein served as the entry port for some
intestinal pathogen (Rochette et al., 1999) which was thwarted by the C282Y mutation.
Many intestinal pathogens have an important role in child mortality and thus impact
fitness. Some of these likely defenses appear to involve other anti-pathogen mechanisms,
for example upregulating inflammatory mechanisms. Familial Mediterranean Fever (FMF),
quite common among populations originating in the Middle East, is caused by a number of
defective alleles of pyrin, a protein that down-regulates granulocyte-mediated inflammation.
Mutant pyrin alleles can result in harmful fever and inflammation in homozygotes, but it is
quite easy to believe that unleashing granulocytes might protect heterozygotes from some
pathogen (Online Inheritance in Man:OMIM, MEFV).
In a similar vein, alpha-1-antitrypsin (AAT) is a protease inhibitor of leukocyte elastase,
and thus deficient AAT alleles might protect heterozygotes against some pathogen. Of course
homozygotes (and heterozygotes to a lesser degree) run very significant risks of emphysema.
Two different low-activity alleles of AAT reach polymorphic frequencies in Europeans (Online
Inheritance in Man:OMIM, AAT).
Congenital deafness is caused by many different mutations (over 100 have been identified)
and is thus, for the most part, a good example of a broad syndrome caused by mutational
pressure. Altogether perhaps 1 in 1500 children have some form of genetic deafness. Most
of the individual mutations are rare, since in the past deaf individuals had very low fitness.
However, mutations of a single gene (connexin-26) account for about 40% of congenital deaf-
ness. In Europeans, the main mutation is 35delG, which has a single origin approximately
10,000 years ago. Other populations have their own characteristic connexin-26 mutation:
R143W in Africa, 167delT among Ashkenazi Jews, 235delC among Japanese and Koreans.
Somehow selection has increased the frequency of certain connexin-26 mutations in a num-
ber of populations, despite the severe negative effects in homozygotes. There is evidence
that these mutations affect the skin (Meyer et al., 2002), resulting in a somewhat thicker
epidermis and saltier sweat, which may act as a barrier to pathogens.
Social and sexual selection
While the majority of our genetic burden from adapted genotypes seems to consist of re-
sponses to infectious agents, recent strong social or sexual selection ought to lead to the
same kind of transient outcomes. An interesting case is that of the Ashkenazi Jews, who are
burdened with an array of inherited disease, especially recessive disease (Risch et al., 2003).
We have calculated (Cochran and Harpending, 2005) that fewer than half of contemporary
Ashkenazi Jews bear none of the ethnic-specific mutations. While the best-known of these
disorders is Tay-Sachs disease, there are several others that affect the same metabolic path-
way. Another cluster of Ashkenazi diseases is the “DNA repair cluster” including mutations
in BRCA1 and BRCA2. This may be a misnomer, since these genes are directly involved in
early brain growth and development and their role in DNA repair may be secondary.
While the presence of the large number of genetic disorders among Ashkenazi has often
been attributed to genetic drift due to a severe bottleneck in their history, several lines
of evidence show that there never was any bottleneck. First, the only evidence for such
a bottleneck is the presence of the disorders: there is no independent record of any such
bottleneck in Ashkenazi history. There is not even any hint of a bottleneck. Second, we were
able to obtain data on allele frequencies of a large number of polymorphisms and examine
overall population genetic diversity of several European and Middle Eastern populations.
While several Middle Eastern groups showed heterozygosity reduction implying either a
bottleneck or a long interval of small size, Ashkenazi showed no such loss of diversity. Any
bottleneck severe enough to have led to the elevated frequency of even one of the Ashkenazi
disorders would have left a signature of diversity loss, and there is no such signature. Finally,
the clustering of the disorders in a few pathways argues against the bottleneck and drift hypothesis,
since drift that led to elevated frequencies of deleterious mutations would not have acted
in only a few specific biochemical pathways.
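The diversity argument above can be made concrete with the standard Wright-Fisher expectation that heterozygosity decays by a factor of 1 - 1/(2N) per generation in a diploid population of effective size N. The following is a minimal sketch of that textbook relationship, not the authors' own calculation; the starting heterozygosity, population sizes, and number of generations are hypothetical values chosen only for illustration.

```python
def expected_heterozygosity(h0, n_effective, generations):
    """Expected heterozygosity after drift at constant effective size.

    Uses the standard Wright-Fisher result H_t = H_0 * (1 - 1/(2N))**t
    for a diploid population. All inputs here are illustrative assumptions.
    """
    if n_effective <= 0:
        raise ValueError("effective population size must be positive")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return h0 * (1.0 - 1.0 / (2.0 * n_effective)) ** generations


if __name__ == "__main__":
    # Hypothetical scenario: 20 generations at three different effective sizes.
    for n in (50, 500, 5000):
        print(n, round(expected_heterozygosity(0.30, n, 20), 4))
```

Under these assumptions only a small effective size sustained for many generations removes a noticeable share of heterozygosity, which is why a bottleneck severe enough to raise disease allele frequencies should also be visible as reduced diversity at neutral markers.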
There has been speculation that these traits are responses to selection by infectious
disease, following the model of the sickle cell gene and falciparum malaria. Adaptation to
tuberculosis was one model that had some currency but failed to find support from family
studies. At any rate the infectious disease hypothesis is extremely implausible since none of
the Ashkenazi disorders rose to appreciable frequencies in their neighbors who lived, literally,
across the street. The selective pressure must have been in some sense social.
Ashkenazi history may furnish clues about what the selective social environment was.
From about the ninth to the seventeenth century they were a nearly completely endogamous
group with extreme occupational specialization in finance, trade, and management. The
amount of gene flow outward from the population is unknown, but there was almost none into
it. Only after this time, as the demographically successful population outgrew its specialized
niche, did Ashkenazi branch out into trades, shopkeeping, and occasionally even farming. In
societies prior to the demographic transition, which began in the eighteenth century and continues to the
present, there was a positive correlation between wealth and Darwinian fitness everywhere
anyone has looked. The extreme occupational specialization of this population and the
possibility that occupational success led to wealth and to differential fitness is the likely context
of strong selection that led to the presence of the numerous Ashkenazi genetic disorders. A
simple implication is that heterozygotes for these largely recessive disorders will be better
at whatever skills or abilities were the target of selection (Cochran and Harpending, 2005).
Detecting ongoing evolution directly from the gene
While past studies of ongoing evolution have been dominated by reasoning from the pheno-
type to the locus, as in sickle cell anemia or lactose tolerance, it has become possible in the
last few years to reason instead from characteristics of the gene directly. There is a lively
literature on this theme (Tishkoff et al., 2001; Slatkin and Rannala, 2000). We will not
review this literature here but will instead describe several particularly interesting examples
of ongoing selective sweeps inferred purely from the pattern of variation at the loci (Wang
et al., 2004; Ding et al., 2002; Harpending and Cochran, 2002; Hardy et al., 2005; Stefans-
son et al., 2005; Mekel-Bobrov et al., 2005; Evans et al., 2005). While similar patterns are
known for genes modulating response to falciparum malaria, all these are happening in genes
involved in behavior and central nervous system function.
At any genetic locus extant alleles or variants are tips of a tree of descent, called a
coalescent tree. The depth of this tree, that is the time back to the most recent common
ancestor at the locus, varies from locus to locus because it is a random process. Most nuclear
loci seem to coalesce between 1 and 2 million years ago, while haploid loci like mtDNA or
the non-recombining part of the Y chromosome (nrY) coalesce much more recently.
[Figure 1 about here.]
Figure 1 shows a typical gene tree from a neutrally evolving locus, in the left panel, and
one from a locus in which a selective sweep has occurred, in the right panel. The dots on
the branches represent mutations that have occurred in the history of the locus. “Selective
sweep” refers to the rapid spread of an advantageous new allele: in figure 1 the subtree on the
right side of the right tree has undergone such a sweep. At some time in the recent (recent
with respect to the scale of the total tree) past an advantageous variant has appeared on
that part of the tree. Many of the alleles at the locus are of the recent type (shown in red in
the figure): they share a recent common ancestor, and they are not very different from each
other. This pattern of allelic descent is called a “star phylogeny”: a recent common ancestor
and little or no differentiation since that ancestor.
A gene tree is of course not directly observable but many important properties of the
tree can be inferred from extant alleles. For example if we had samples from the tree in the
right panel we would notice that one subclade had a high frequency but very little diversity
within the clade. Compare, for example, the red clade in the left and right panels of the
figure: on the left the red clade alleles would be different from each other because there are
old branches that separate them. With the history shown in the right panel the red alleles
would be all very similar to each other. These differences are conventionally measured by
the mean pairwise sequence difference (MPD) among all possible pairs of alleles. A clade at
high frequency with low MPD is a strong suggestion of a selective sweep occurring.
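As an illustration of the mean pairwise difference described above, here is a minimal sketch that counts differing sites for every pair of aligned sequences and averages them. The function names and the toy sequences are assumptions made for this example and do not come from any of the cited studies.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mean_pairwise_difference(sequences):
    """Average number of differences over all possible pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical clades: one recently swept, one with old internal branches.
    swept_clade = ["AAAA", "AAAA", "AAAT"]
    old_clade = ["AAAA", "TTAA", "AATT"]
    print(mean_pairwise_difference(swept_clade))
    print(mean_pairwise_difference(old_clade))
```

In this toy case the swept clade has a much lower MPD than the old clade, which is the contrast the text describes between the red clades of the right and left panels.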
A second consequence of the history shown in the right panel is that much of whatever
sequence diversity due to mutation is present in the red clade occurs as singletons. Any
mutation in the history of the sample since the sweep began will most likely have a single
descendant in the sample. In the tree in the left panel, many mutations have more than one
descendant in the sample. We can evaluate the extent of “starness” in a clade by examining
the ratio of the normalized number of segregating sites, that is mutations, to the average
number of pairwise sequence differences. The familiar Tajima's D statistic is a normalized
difference between these two numbers, designed to assess statistical significance.
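For readers who want to see the comparison spelled out, the sketch below computes Tajima's D from a set of aligned sequences using the standard constants from Tajima's 1989 formulation. It is an illustrative implementation under the assumption of complete, equal-length sequences with no missing data; the function name and the toy samples are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Return Tajima's D for aligned, equal-length sequences (strings)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating sites (columns showing more than one state).
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Normalizing constants that depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # D compares pi with S / a1 and scales by the estimated standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Hypothetical star-like sample: every mutation is a singleton.
    star = ["10000", "01000", "00100", "00010", "00001"]
    print(tajimas_d(star))
```

A star-like sample in which every mutation is a singleton gives a negative D, because the number of segregating sites is large relative to the mean pairwise difference.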
A third consequence of a recent selective sweep is that there is linkage disequilibrium
between the favored type and neighboring parts of the chromosome. Since all the copies
share a recent common ancestor, little time has elapsed for recombination to erase the initial
disequilibrium. Scanning the genome for regions of high disequilibrium is a simple method
for detecting likely targets of recent selection.
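The notion of linkage disequilibrium used here can be sketched for the simplest case of two biallelic sites. The example below computes the classical coefficient D and the squared correlation r^2 from a list of two-site haplotypes, and shows the textbook decay of D by a factor of (1 - c) per generation for recombination fraction c. The haplotype encoding, function names, and numbers are assumptions made for illustration only.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic sites.

    Each haplotype is a two-character string such as "11" or "10", where
    "1" marks the derived allele at the first and second site respectively.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(1 for h in haplotypes if h[0] == "1") / n
    p_b = sum(1 for h in haplotypes if h[1] == "1") / n
    p_ab = sum(1 for h in haplotypes if h == "11") / n

    # D is the excess of the joint frequency over the product of marginals.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denom


def decayed_d(d0, recombination_fraction, generations):
    """Expected D after recombination acts: D_t = D_0 * (1 - c)**t."""
    return d0 * (1.0 - recombination_fraction) ** generations


if __name__ == "__main__":
    # Hypothetical sample in which the two derived alleles always co-occur.
    d, r2 = linkage_disequilibrium(["11", "11", "00", "00"])
    print(d, r2)
    # With c = 0.01, disequilibrium is still substantial after 50 generations.
    print(decayed_d(d, 0.01, 50))
```

Because closely linked sites have a small recombination fraction, disequilibrium created at the start of a sweep persists for many generations, which is what makes genome scans for high disequilibrium informative about recent selection.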
There is a repeat polymorphism in the human D4 dopamine receptor gene that is associated
with variation in personality or behavior of its bearers. Each repeat is 48 bp in length,
corresponding to a change of 16 amino acids in the length of the gene product. The common
worldwide variant has 4 repeats (4R), while the 7-repeat (7R) variant is undergoing a sweep, as
inferred from reduced SNP diversity within the variant and high linkage disequilibrium. There
is little linkage disequilibrium around the ancestral 4R variant.
The literature suggests that carriers of 7R may be at elevated risk of childhood attention-
deficit hyperactivity disorder without the neurological impairments that often are found with
ADHD (Swanson et al., 2000). The bearers may also be more impulsive, more risk-seeking,
and less altruistic in experimental games.
The MAPT locus has two major haplotypes that have been separate for an estimated
three million years. They are distinct because one is an inversion, but maintenance of two
clades for such a long time is highly unlikely. The recent evolutionary success of the H2
clade in Europeans, where it is spreading rapidly, as well as the lack of diversity within the
clade suggest that it was a Neanderthal allele introduced into the modern human population
invading Europe (Hardy et al., 2005). This locus, when damaged, is implicated in tangle
disorders of the brain.
A similar pattern is found in both Microcephalin (MCPH1) and ASPM, related genes
that determine brain size in mammals. In each case a haplotype seemingly only distantly
related to the others at the locus is undergoing strong positive selection. The sweep in
Microcephalin appears to have started about 40,000 years ago, i.e. at the time modern
humans entered Europe. The sweep in ASPM is apparently much more recent, about 6,000
years.
While the effects of these variants are not yet well understood, it is striking that the first
and best-described human genes undergoing vigorous selective sweeps are genes involved
in behavior and central nervous system development. A conventional impression from an-
thropology textbooks is that modern humans appeared about 40,000 years ago and have
remained essentially unchanged since then. These examples show that evolution is ongoing
in our species, especially evolution of the brain. There is even a suggestion in the literature
that Microcephalin regulates BRCA1, one of the loci prominent in our discussion of Ashke-
nazi Jewish evolution. In other words ongoing evolution of the brain and the particular turn
it happened to take among northern European Jews are almost certainly parts of the same
story (Xu et al., 2004).
Consequences of genetic diversity
The viewpoints of an evolutionary biologist and of a health professional on genetic diversity
are quite different. While biologists are interested in differences in fitness and in ongoing evo-
lution, health professionals are interested in well-being of individuals. These two categories
may often not overlap very much.
A medical disease is some trait that impairs well-being or shortens lifespan or both.
Interestingly, the impairment of well-being may be to the bearer of the disease (tuberculosis,
cancer) or even to others (sociopathy, bad breath). A disease in the evolutionary sense is a
trait that lowers fitness. There are then medical diseases that do not impair fitness or some
that, in the case for example of sociopathy, elevate fitness. There are, conversely, disorders
in the strict evolutionary sense, like chastity, left-handedness (Aggleton et al., 1994), or male
homosexuality, that are not considered to be medical diseases. There is speculation in the
literature of human evolutionary ecology that male homosexuality has a genetic basis and
that it is maintained by an inclusive fitness effect in which males may improve the fitness of
their relatives by giving them resources. However there is no support in the data available
(Bobrow and Bailey, 2001) for such behavior. The very weak genetic influence on male
homosexuality might as easily be explained by genetic variation in resistance to a pathogen
that may cause the trait directly.
We can think of genetic burden as the net contribution of genetic diversity to disease
in either sense. This burden may be apparent, meaning that it is responsible for medical
disease, or it may be unapparent, meaning that it does not lead to a diminished quality of
human life. For example sickle cell anemia causes compromised and prematurely-terminated
lives in hundreds of thousands of people because it is expressed after birth: an apparently
healthy baby falters. Melanesian ovalocytosis is a parallel adaptation to malaria in parts
of the Pacific, and a homozygote has never been observed. While the homozygous state is
apparently lethal it is expressed so early in development that its consequences for human
well-being, that is of the mother and the family, are small. The apparent burden of the sickle
cell polymorphism is great while the burden of ovalocytosis is mostly unapparent.
From this perspective a major goal of prenatal diagnosis and selective abortion is to
convert apparent burden to unapparent burden. The ethical problems surrounding this field
are complex, of course, but from the viewpoint of allocating burden, human intervention in
some ways mirrors evolutionary processes.
References
Aggleton, J., Bland, J., Kentridge, R., Neave, N., 1994. Handedness and longevity: archival
study of cricketers. British Medical Journal 309, 1681–1684.
Bobrow, D., Bailey, J., 2001. Is male homosexuality maintained via kin selection? Evolution
and Human Behavior 22, 361–368.
Bulmer, M., 1994. Theoretical Evolutionary Ecology. Sinauer, Sunderland, MA.
Cannon, S., Hayward, L., Beech, J., Jr., R. B., 1995. Sodium channel inactivation is impaired
in equine hyperkalemic periodic paralysis. J. Neurophysiol. 73, 1892–1899.
Carter, R., Mendis, K., 2002. Evolutionary and historical aspects of the burden of malaria.
Clinical Microbiology Reviews 15, 564–594.
Cavalli-Sforza, L. L., Menozzi, P., Piazza, A., 1994. The History and Geography of Human
Genes. Princeton Univ. Press, Princeton, NJ.
Cochran, G., Harpending, H., 2005. The natural history of Ashkenazi intelligence. Jour.
Biosoc. Sci. Published online.
Davis, G., 2005. Major genes affecting ovulation in sheep. Genet. Sel. Evol. 37, 511–523.
Ding, Y.-C., Chi, H.-C., Grady, D., Morishima, A., Kidd, J., Kidd, K., Flodman, P., Spence,
M., Schuck, S., Swanson, J., Zhang, Y.-P., Moyzis, R., 2002. Evidence of positive selection
acting at the human dopamine receptor D4 gene locus. Proc. Nat. Acad. Sci. USA 99,
309–314.
Edwards, A., 2003. Human genetic diversity: Lewontin’s fallacy. Bioessays pp. 798–801.
Eller, E., 1999. Population substructure and isolation by distance in three continental
regions. Amer. J. Phys. Anth. 108, 147–159.
Eswaran, V., 2002. A diffusion wave out of Africa—the mechanism of the modern human
revolution? Current Anthropology 43, 749–774.
Evans, P., Gilbert, S., Mekel-Bobrov, N., Vallender, E., Anderson, J., Vaez-Azizi, L.,
Tishkoff, S., Hudson, R., Lahn, B., 2005. Microcephalin, a gene regulating brain size,
continues to evolve adaptively in humans. Science 309, 1717–1720.
Galloway, S., McNatty, K., Cambridge, L., Laitinen, M., Juengel, J., Jokiranta, T., McLaren,
R., Luiro, K., Dodds, K., Montgomery, G., Beattie, A., Davis, G., Ritvos, O., 2000.
Mutations in an oocyte-derived growth factor gene (BMP15) cause increased ovulation
rate and infertility in a dosage-sensitive manner. Nature Genetics 25, 279–283.
Grobet, L., Martin, L., Poncelet, D., Brouwers, B., Riquet, J., Schoeberlein, A., Dunner, S.,
Menissier, F., Massaband, J., Fries, R., Hanset, R., Georges, M., 1997. A deletion in the
bovine myostatin gene causes the double-muscled phenotype in cattle. Nature Genetics
17, 71–74.
Hardy, J., Pittman, A., Myers, A., Gwinn-Hardy, K., Fung, H., de Silva, R., Hutton, M.,
Duckworth, J., 2005. Evidence suggesting that Homo neanderthalensis contributed the
H2 MAPT haplotype to Homo sapiens. Biochem. Soc. Trans. 33, 582–585.
Harpending, H., Cochran, G., 2002. In our genes. Proc. Nat. Acad. Sci. USA 99, 10–12.
Hunt, J. R., Zeng, H., 2004. Iron absorption by heterozygous carriers of the HFE C282Y
mutation associated with hemochromatosis. American Journal of Clinical Nutrition
80, 924–931.
Lewontin, R. C., 1972. The apportionment of human diversity. In: Hecht, M. (Ed.), Evolu-
tionary Biology, Appleton–Century–Crofts, New York, volume 6, pp. 381–398.
McNatty, K., Juengel, J., Wilson, T., Galloway, S., Davis, G., 2001. Genetic mutations
influencing ovulation rate in sheep. Reprod. Fertil. Dev. 13, 549–555.
McPherron, A., Lee, S.-J., 1997. Double muscling in cattle due to mutations in the myostatin
gene. Proc. Natl. Acad. Sci. U.S.A. pp. 12457–12461.
Mekel-Bobrov, N., Gilbert, S., Evans, P., Vallender, E., Anderson, J., Hudson, R., Tishkoff,
S., Lahn, B., 2005. Ongoing adaptive evolution of ASPM, a brain size determinant in
Homo sapiens. Science 309, 1720–1722.
Meyer, C., Amedofu, G., Brandner, J., Pohland, D., Timmann, C., Horstmann, R., 2002.
Selection for deafness? Nature Medicine 8, 1332–1333.
Morton, N. E., Crow, J. F., Muller, H. J., 1956. An estimate of the mutational damage in
man from data on consanguineous marriages. Proc. Nat. Acad. Sci USA 42, 855–863.
Naylor, J., 1997. Hyperkalemic periodic paralysis in quarter horses. Vet. Clin. North Am.
Equine Pract. 13, 129–144.
Online Mendelian Inheritance in Man: OMIM, AAT. Protease inhibitor 1: PI and Alpha-1-antitrypsin.
March 17, 2004, MIM +107400.
Online Mendelian Inheritance in Man: OMIM, MEFV. Familial Mediterranean fever gene. June 30,
2004, MIM *608107.
Orr, H. A., 2005. The genetic theory of adaptation: a brief history. Nature Reviews Genetics
6, 119–127.
Pier, G., Grout, M., Zaidi, T., Meluleni, G., Mueschenborn, S., Banting, G., Ratcliff, R.,
Evans, M., Colledge, W., 1998. Salmonella typhi uses CFTR to enter intestinal epithelial
cells. Nature 393, 79–82.
Risch, N., Tang, H., Katzenstein, H., Ekstein, J., 2003. Geographic distribution of dis-
ease mutations in the Ashkenazi Jewish Population supports genetic drift over selection.
American Journal of Human Genetics 72, 812–22.
Rochette, J., Pointon, J., Fisher, C., Perera, G., Arambepola, M., Aricchi, D. K., Silva, S. D.,
Vandwalle, J. L., Monti, J., Old, J., Merryweather-Clarke, A., Weatherall, D., Robs, K.,
1999. Multicentric Origin of Hemochromatosis gene (HFE) mutations. Amer. J. Hum.
Genet. 64, 1056–1062.
Sabeti, P., Reich, D., Higgins, J., Levine, H., Richter, D., Schaffner, S., Gabriel, S., Platko,
J., Patterson, N., McDonald, G., Ackerman, H., Campbell, S., Altshuler, D., Cooper, R.,
Kwiatkowski, D., Ward, R., Lander, E., 2002. Detecting recent positive selection in the
human genome from haplotype structure. Nature 419.
Slatkin, M., Bertorelle, G., 2001. The use of intraallelic variability for testing neutrality and
estimating population growth rate. Genetics 158, 865–874.
Slatkin, M., Rannala, B., 2000. Estimating allele age. Annual Review of Genomics and
Human Genetics 1, 225–249.
Stefansson, H., Helgason, A., Thorleifsson, G., Steinthorsdottir, V., Masson, G., Barnard, J.,
Baker, A., Jonasdottir, A., Ingason, A., Gudnadottir, V., Desnica, N., Hicks, A., Gylfason,
A., Gudbjartsson, D., Jonsdottir, G., Sainz, J., Agnarsson, K., Birgisdottir, B., Ghosh, S.,
Olafsdottir, A., Cazier, J., Kristjansson, K., Frigge, M., Thorgeirsson, T., Gulcher, J.,
Kong, A., Stefansson, K., 2005. A common inversion under selection in Europeans. Nature
Genet. 37, 129–137.
Swanson, J., Oosterlaan, J., Murias, M., Schuck, S., Flodman, P., Spence, A., Wasdell, M.,
Ding, Y., Chi, H.-C., Smith, M., Mann, M., Carlson, C., Kennedy, J., Sergeant, J., Leung,
P., Zhang, Y.-P., Sadeh, A., Chen, C., Whalen, C., Babb, K., Moyzis, R., Posner, M., 2000.
ADHD children with a 7-repeat allele of the DRD4 gene have extreme behavior but normal
performance on critical neuropsychological tests of attention. PNAS 97, 4574–4579.
Tishkoff, S., Varkonyi, R., Cahinhinan, N., Abbes, S., Argyropoulos, G., Destro-Bisol,
G., Drousiotou, A., Dangerfield, B., Lefranc, G., Loiselet, J., Piro, A., Stoneking, M.,
Tagarelli, A., Tagarelli, G., Touma, E., Williams, S., Clark, A., 2001. Haplotype diversity
and linkage disequilibrium at human G6PD: recent origin of alleles that confer malarial
resistance. Science 293, 455–462.
Wang, E., Ding, Y.-C., Flodman, P., Kidd, J., Kidd, K., Grady, D., Ryder, O., Spence,
M., Swanson, J., Moyzis, R., 2004. The genetic architecture of selection and the human
dopamine receptor D4 (DRD4) gene locus. Am. J. Hum. Genet. 74, 931–944.
Wendt, M., Bickhardt, K., Herzog, A., Fischer, A., Martens, H., Richter, T., 2000. Porcine
stress syndrome and PSE meat: clinical symptoms, pathogenesis, etiology and animal
rights aspects. Berl. Munch. Tierarztl. Wochenschr. 113, 117–190.
WHO, 1994. Guidelines for control of haemoglobin disorders. Technical Report
WHO/HDP/HB/GL/94.1, World Health Organization.
Xu, X., Lee, J., Stern, D., 2004. Microcephalin is a DNA damage response protein involved
in regulation of CHK1 and BRCA1. J. Biol. Chem. 279, 34091–34094.
List of Figures
1 Typical trees of descent of alleles at a neutral locus, in the left panel, and at a locus where a selective sweep is occurring, in the right panel