Human genetic diversity. ESHG Barcelona

Post on 17-Jul-2015


Human genetic population structure: patterns and underlying processes

Guido Barbujani

Dipartimento di Biologia ed Evoluzione, Università di Ferrara

g.barbujani@unife.it

• Our genome is very small
• Our genome is very large
• Our genomes are very similar
• Our genomes are very different

There are clear morphological differences (“types”)

But each group harbours extensive diversity

Analyses of morphological traits led to inconsistent lists of races

Linnaeus (1758): 4 (europaeus, asiaticus, afer, americanus) [+2]
Blumenbach (1795): 5 (same, + australianus)
Cuvier (1828): 3 (caucasoid, negroid, mongoloid)
Huxley (1875): 4 (mongoloid, xanthochroid, australoid, negroid)
Deniker (1900): 29
Weinert (1935): 17
Von Eickstedt (1937): 38
Museum of Nat. Hist. Chicago (1933): 107
Coon (1967): 5 (negroid, capoid, caucasoid, mongoloid, australoid)
Risch (2002): 5 (different in different articles)

According to Molnar (1975) 20th century lists include from 3 to 200 items

Skin colour

Stature

Variation is continuous and discordant. It is possible to cluster people on the basis of any trait, but the resulting classification does not allow one to predict clustering for other traits

The trouble with morphological traits

1. Estimating variances from sequence comparisons

-TACGAACATCAGGC-
-TATGAACATCAGGC-
-TATGAACATCGGGC-
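
As a minimal illustration of what a sequence comparison measures, the sketch below counts the sites at which each pair of the three aligned fragments on the slide differs. The sequences are copied from the slide; the labels `seq1` to `seq3` and the function name `pairwise_differences` are assumptions made for this example. The variance partitioning itself needs population and continent labels and is not attempted here.

```python
from itertools import combinations

# The three aligned fragments shown on the slide (labels are hypothetical).
sequences = {
    "seq1": "TACGAACATCAGGC",
    "seq2": "TATGAACATCAGGC",
    "seq3": "TATGAACATCGGGC",
}


def pairwise_differences(a, b):
    """Count the aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Number of differing sites for every pair of sequences.
differences = {
    (i, j): pairwise_differences(sequences[i], sequences[j])
    for i, j in combinations(sequences, 2)
}

for pair, n in differences.items():
    print(pair, n)
```

Counts of this kind, averaged within and between groups, are the raw material from which the variance components in the next slides are estimated.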

Independent studies of genetic variances yield very similar results: about 85% within populations, 5% among populations, 10% among continents

Study                              Loci       Within populations   Among populations   Among continents
Lewontin (1972)                    17         85%                  8%                  6%
Latter (1973)                      18         86%                  5%                  9%
Barbujani et al. (1997)            109        85%                  5%                  10%
Jorde et al. (2000)                100        85%                  2%                  13%
Romualdi et al. (2002)             32         83%                  8%                  9%
Rosenberg et al. (2002)            377        93%                  3%                  4%
Excoffier & Hamilton (2003)        377        88%                  3%                  9%
Ramachandran et al. (2005)         17         90%                  5%                  5%
Bastos-Rodriguez et al. (2006)     40         86%                  2%                  12%
Li et al. (2008)                   650 000    89%                  2%                  9%

MEDIAN                                        85%                  5%                  10%

What does it mean, in practice?

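Members of our community are only slightly less different from us than members of distant populations

[Figure: with the average difference between members of distant populations set to 100%, the average difference between members of the same community is 85%]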

Mind the numbers

Humans and chimps share >98% of their genomes

Of the 2% that differs, 1.9% consists of fixed differences between the two species (sites that do not vary within either species)

The remaining fraction, 0.1%, contains all human genomic variation

85% of that 0.1% represents differences among members of the same population

The differences among the main continental groups represent 10% of 0.1% of the total, that is, 0.01%

But 0.01% of <3 billion DNA sites means <300 000 variable sites
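
The chain of percentages above can be checked with a few lines of exact arithmetic. This is only a sketch of the slide's own reasoning: the figures (0.1%, 85%, 10%, 3 billion sites as an upper bound) are taken from the slides, while the variable names are assumptions made for the example.

```python
from fractions import Fraction

genome_sites = 3_000_000_000             # upper bound from the slide ("<3 billion")
human_variation = Fraction(1, 1000)      # 0.1% of the genome varies among humans
within_populations = Fraction(85, 100)   # share of that variation within populations
among_continents = Fraction(10, 100)     # share of that variation among continents

# Shares of the whole genome, obtained by multiplying the two fractions.
within_share = within_populations * human_variation
continental_share = among_continents * human_variation

# Upper bound on the number of sites differing among continental groups.
continental_sites = continental_share * genome_sites

print("within populations:", float(within_share) * 100, "% of the genome")
print("among continents:", float(continental_share) * 100, "% of the genome")
print("sites differing among continents: fewer than", int(continental_sites))
```

Exact fractions are used instead of floats so that the product of the percentages is not blurred by rounding.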

2. Clustering genotypes or haplotypes

Rosenberg et al., 2002

Clustering genotypes by algorithms identifying structure

[Figure panels: K=3, K=4]

[Figure panels: SNPs, haplotypes, CNV]

Jakobsson et al. 2008

Structure inferred from SNPs and haplotypes differs from that inferred from Copy Number Variation

Genes, as well as morphology, suggest inconsistent clusterings of genotypes

[Figure: clusters inferred from three marker sets. Y chromosome: Romualdi et al. 2002. Alu insertions: Romualdi et al. 2002. X chromosome: Wilson et al. 2001]

Cluster labels shown in the figure: Africa / Asia, Europe, Australia, Americas / Americas / Africa, Asia, Americas, Oceania / Asia Europe / Africa, Asia, Europe Oceania / Europe, Ethiopia / S. Africa N. Guinea / Asia

Genes, as well as morphology, suggest inconsistent clusterings of genotypes

[Figure: two clusterings of the same 377 STR loci, from Rosenberg et al. 2005 and from Barbujani and Belle 2006]

Cluster labels shown in the figure: Melanesia Eurasia N Africa N America / Maya / S. Africa / E Africa / C Africa / Piapoco / Suruì / Karitiana / Kalash / W. Eurasia / E. Asia / Africa / Americas / Oceania

Sampling has a large effect on the apparent structuring

Serre and Pääbo 2004

Variation is continuous and discordant. It is possible to cluster people on the basis of any trait, but the resulting classification does not allow one to predict clustering for other traits

The trouble with genetic traits

MCPH D-haplogroup

NAT2 acetylator

3. Identifying genomic boundaries

Sampling points in the geographic space

The sampling points are connected by edges

A genetic distance between neighbours is assigned to each edge of the reticulation

Boundaries are traced perpendicular to the edge showing the highest genetic distance and extended through the adjacent edges

A boundary is completed when it exits the reticulation or closes on a preexisting boundary

The number of boundaries one may detect is arbitrary, but there are methods for choosing it

Four genetic clusters are identified, each separated from the others by a boundary
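
The boundary-tracing steps described on these slides resemble what is usually called Monmonier's maximum-difference algorithm. The sketch below is a simplified version written for illustration, not the author's implementation. It assumes the reticulation is already given as a list of triangles (in practice it is typically built by a triangulation of the sampling points), and the six sampling points, the distances and the function name `trace_boundary` are all hypothetical.

```python
from itertools import combinations


def trace_boundary(triangles, dist, used=frozenset()):
    """Trace one boundary across a triangulated network of sampling points.

    triangles: list of 3-tuples of point labels
    dist:      dict mapping frozenset({a, b}) to the genetic distance on that edge
    used:      edges already crossed by earlier boundaries
    Returns the edges crossed by the new boundary, in order.
    """
    # Map each edge to the triangles that contain it (one or two).
    edge_tris = {}
    for tri in triangles:
        for a, b in combinations(tri, 2):
            edge_tris.setdefault(frozenset((a, b)), []).append(tri)

    # Start across the free edge carrying the highest genetic distance.
    free = [e for e in edge_tris if e not in used]
    start = max(free, key=lambda e: dist[e])
    blocked = set(used) | {start}

    def extend(edge, tri):
        """Extend the boundary through adjacent triangles in one direction."""
        path = []
        while tri is not None:
            # Of the two other edges of this triangle, cross the more distant one.
            others = [frozenset(p) for p in combinations(tri, 2) if frozenset(p) != edge]
            nxt = max(others, key=lambda e: dist[e])
            if nxt in blocked:
                break  # the boundary closes on an existing boundary
            path.append(nxt)
            blocked.add(nxt)
            # Move into the triangle on the far side; None means we left the network.
            beyond = [t for t in edge_tris[nxt] if t != tri]
            edge, tri = nxt, (beyond[0] if beyond else None)
        return path

    sides = edge_tris[start]
    forward = extend(start, sides[0])
    backward = extend(start, sides[1]) if len(sides) > 1 else []
    return backward[::-1] + [start] + forward


# Hypothetical example: six sampling points on a 2 x 3 grid, split into triangles.
triangles = [("A", "B", "D"), ("B", "D", "E"), ("B", "C", "E"), ("C", "E", "F")]
raw = {"AB": 0.9, "AD": 0.1, "BD": 1.0, "BE": 0.1, "DE": 0.8,
       "BC": 0.1, "CE": 0.1, "CF": 0.1, "EF": 0.1}
dist = {frozenset(pair): d for pair, d in raw.items()}

boundary = trace_boundary(triangles, dist)
print([sorted(edge) for edge in boundary])
```

In this toy network the boundary crosses the three edges with the largest distances and separates points A and D from the rest. Passing the edges of a first boundary as `used` when calling the function again would trace a second boundary that stops where it meets the first.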

Genomic boundaries inferred from diversity at 377 STR loci

(Barbujani and Belle 2006)

Eight significant boundaries, defining 9 groups of populations

81% of SNPs cosmopolitan.

Alleles present in one continent only: 0.91% in Africa, 0.75% in Eurasia, practically 0 elsewhere.

Hunting-gathering populations distinct from farmers in Africa

Jakobsson et al. 2008 (525,910 SNPs, 396 CNVs)

12.4% of haplotypes cosmopolitan, 29% continent-specific, 18% of which in Africa. More than 50% present in 1 or 2 continents

Jakobsson et al. 2008

LD decreasing with physical distance between loci and with geographic distance from East Africa

Jakobsson et al. 2008

Models with an African population replacing previous human continental groups explain the data better than any alternative models

Fagundes et al. (2007)

Patterns of morphological and genetic variation are compatible with the effects of dispersal from Africa

Manica et al. 2007

Fitting a model of isolation by distance to human genetic diversity

Liu et al. (2006)

Average coalescence times and gene diversity decline as a function of distance from Africa

Best fit of the model for an African exit 56,000 years ago

Fagundes et al. (2007)

http://info.med.yale.edu/genetics/kkidd/point.html

The best available estimates place our species’ origin and its exit from Africa in a not-so-remote past

Linguistic and genetic differences are often correlated

Genetic variances are significant among language groups

Correlations between distance measures

Comparison      r           r²
GEN-GEO         0.746***    0.557
GEN-LAN         0.311***    0.097
GEO-LAN         0.269***    0.072
GEN-GEO.LAN     0.723***    0.523
GEN-LAN.GEO     0.172***    0.030
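
The last two rows of this table appear to be partial correlations. Assuming that GEN, GEO and LAN stand for genetic, geographic and linguistic distances, and that "X-Y.Z" means the correlation between X and Y with Z held constant, the standard first-order partial correlation formula reproduces the tabulated values to within rounding. The sketch below applies that formula to the three simple correlations from the table; the function and variable names are assumptions made for the example, and the significance tests behind the asterisks are not reproduced.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, holding z constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simple correlations taken from the table.
r_gen_geo = 0.746
r_gen_lan = 0.311
r_geo_lan = 0.269

# Genetics vs geography with language held constant, and vice versa.
gen_geo_given_lan = partial_correlation(r_gen_geo, r_gen_lan, r_geo_lan)
gen_lan_given_geo = partial_correlation(r_gen_lan, r_gen_geo, r_geo_lan)

print(f"GEN-GEO.LAN: r = {gen_geo_given_lan:.3f}, r2 = {gen_geo_given_lan ** 2:.3f}")
print(f"GEN-LAN.GEO: r = {gen_lan_given_geo:.3f}, r2 = {gen_lan_given_geo ** 2:.3f}")
```

Read this way, geography still accounts for about half of the variance in genetic distances once language is held constant, while language accounts for only about 3% once geography is held constant.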

Percentages of the total variance

Genetic distance                           Fst     Rst
Among language phyla                       2.9     6.7
Among populations of the same phylum       2.4     2.9
Within populations                         94.7    90.4

Belle and Barbujani 2007

Origins: Attempting a synthesis

• Human genetic population structure is generally weak, with large differences among members of the same population and discordant variation across loci

• Genetic and morphological data agree in indicating an origin of human dispersal in Africa

• At the large geographic scale, patterns fit a model of repeated founder effects during dispersal from Africa

• Zones of relatively sharp genetic change correspond to reproductive barriers, geographic or cultural
