the origin of domestication genes in goats - biorxiv.org · 1 1 the origin of domestication genes...
TRANSCRIPT
1
The origin of domestication genes in goats 1
2
One Sentence Summary: Goat domestication mainly focused on immune and neural 3
genes, with adaptive leaps driven by introgression and selection. 4
5
Zhuqing Zheng1†, Xihong Wang1†, Ming Li1†, Yunjia Li1†, Zhirui Yang1†, Xiaolong 6
Wang1†, Xiangyu Pan1, Mian Gong1, Yu Zhang1, Yingwei Guo1, Yu Wang1, Jing Liu1, 7
Yudong Cai1, Qiuming Chen1, Moses Okpeku2,3, Licia Colli4,5, Dawei Cai6, Kun 8
Wang7, Shisheng Huang1, Tad S. Sonstegard8, Ali Esmailizadeh9, Wenguang Zhang10, 9
Tingting Zhang1, Yangbin Xu1, Naiyi Xu1, Yi Yang11, Jianlin Han12, Lei Chen7, 10
Joséphine Lesur13, Kevin G. Daly14, Daniel G. Bradley14, Rasmus Heller15, Guojie 11
Zhang2,16,17,18, Wen Wang2,7,18, Yulin Chen1*, Yu Jiang1* 12
13
1Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi 14
Province, College of Animal Science and Technology, Northwest A&F University, 15
Yangling 712100, China. 16
2State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of 17
Zoology, Chinese Academy of Sciences, Kunming 650223, China. 18
3Discipline of Genetics, School of Life Science, University of Kwazulu-Natal, Durban 19
4000, South Africa. 20
4Dipartimento di Scienze Animali, della Nutrizione e degli Alimenti, Facoltà di 21
Scienze Agrarie, Alimentari e Ambientali, Università Cattolica del S. Cuore, via 22
Emilia Parmense n. 84, 29122, Piacenza (PC), Italy. 23
5BioDNA - Centro di Ricerca sulla Biodiversità e sul DNA Antico, Facoltà di Scienze 24
Agrarie, Alimentari e Ambientali, Università Cattolica del S. Cuore, via Emilia 25
Parmense n. 84, 29122, Piacenza (PC), Italy. 26
6Research Center for Chinese Frontier Archaeology, Jilin University, Changchun 27
130012, PR China. 28
7Center for Ecological and Environmental Sciences, Northwestern Polytechnical 29
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
2
University, Xi’an 710072, China. 30
8Recombinetics, Inc., St. Paul, Minnesota, USA. 31
9Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University 32
of Kerman, Kerman, PB 76169-133, Iran. 33
10College of Animal Science, Inner Mongolia Agricultural University, Hohhot 010018, 34
China. 35
11Institute of Preventive Veterinary Medicine, Zhejiang Provincial Key Laboratory of 36
Preventive Veterinary Medicine, College of Animal Sciences, Zhejiang University, 37
Hangzhou 310058, China. 38
12CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute 39
of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 40
China. 41
13UMR 7209 MNHN/CNRS CP 5555, rue Buffon 75005 Paris, France. 42
14Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland. 43
15Section for Computational and RNA Biology, Department of Biology, University of 44
Copenhagen, Copenhagen N 2200, Denmark. 45
16China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China. 46
17Section for Ecology and Evolution, Department of Biology, University of 47
Copenhagen, DK-2100 Copenhagen, Denmark. 48
18Center for Excellence in Animal Evolution and Genetics, Chinese Academy of 49
Sciences, Kunming 650223, China. 50
†These authors contributed equally to this work. 51
*Corresponding author. E-mail: [email protected] (Y.J.), 52
[email protected] (Y.Chen).53
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
3
Abstract 54
Goat domestication was critical for agriculture and civilization, but its underlying 55
genetic changes and selection regimes remain unclear. Here we analyze the genomes 56
of worldwide domestic goats, wild caprid species and historical remains, providing 57
evidence of an ancient introgression event from a West Caucasian tur-like species to 58
the ancestor of domestic goats. One introgressed locus with a strong signature of 59
selection harbors the MUC6 gene which encodes a gastrointestinally secreted mucin. 60
Experiments revealed that the nearly fixed introgressed haplotype confers enhanced 61
immune resistance to gastrointestinal pathogens. Another locus with a strong signal of 62
selection may be related to behavior. The selected alleles at these two loci emerged in 63
domestic goats at least 7,200 and 8,100 years ago, respectively, and increased to high 64
frequencies concurrent with the expansion of the ubiquitous modern mitochondrial 65
haplogroup A. Tracking these archaeologically cryptic evolutionary transformations 66
provides new insights into the mechanism of animal domestication.67
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
4
Introduction 68
Domestication presents an extreme shift of physiological and behavioral stress for 69
free-living animals (1) to a high-density and disease-prone anthropogenic 70
environment, especially for herbivores. The goat (Capra hircus) was one of the first 71
domesticated livestock species, demonstrating remarkable adaptability and versatility 72
(2, 3), and has been intimately associated with human dispersal (4). Recent studies 73
have identified candidate targets of selection during goat domestication, including loci 74
related to pigmentation, xenobiotic metabolism and milk production (5, 6); however, 75
the evolutionary dynamics of key genes involved in adaptation during the early phase 76
of domestication remains unclear. 77
Goat domestication is believed to have begun ~11,000 years ago from a mosaic 78
of wild bezoar populations (Capra aegagrus) (2, 6). However, other wild Capra 79
species are widely distributed and many of them can hybridize with domestic goats 80
(7-9). Their contribution to the goat domestication process remains unexplored. 81
Introgression of adaptive variants has been widely recognized as a significant 82
evolutionary phenomenon in humans (10), sheep (11), and cattle (12), and it may be 83
particularly effective for increasing fitness without negative pleiotropic effects as 84
demonstrated in other species (13). Here, we conducted a comprehensive population 85
genomic survey of modern goat populations distributed across the world, six wild 86
Capra species and previously published ancient goat genomes to investigate temporal 87
shifts in key genetic variants under selection during goat domestication.88
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
5
Results 89
Population structure and origin of domestic goats 90
We sequenced 101 Capra genomes (coverage 3 - 47×, average 12×), including 88 91
domestic goats collected from three different continents, one bezoar, one Alpine ibex 92
(Capra ibex), three Siberian ibexes (Capra sibirica), three Markhors (Capra 93
falconeri), one West Caucasian tur (Capra caucasica) and four Nubian ibex × 94
domestic goat hybrids (Capra nubiana × C. hircus) (Fig. 1A, figs. S1 and S2, tables 95
S1 and S2, and text S1). We also sequenced five ancient goat samples at ~0.04 to 96
13.44× coverage (Fig. 1A, figs. S3 to S6, and table S3), including the earliest known 97
Chinese archaeological samples in Neolithic age (14). Together with the publicly 98
available genomic sequences for modern goat and bezoar (5, 15), as well as ancient 99
genomes (6) (table S4), we compiled a worldwide collection of 164 modern domestic 100
goats, 52 ancient goats, 24 modern bezoars and four ancient bezoars. 101
A whole-genome neighbor-joining phylogenetic tree reveals that all domestic 102
goats form a monophyletic sister lineage to bezoars (Fig. 1B, fig. S7, and table S5), 103
confirming that modern domestic goats are the descendants of bezoar-like ancestors 104
(16). The other four wild Capra species (C. ibex, C. sibirica, C. falconeri and C. 105
caucasica), which are referred to as the ibex-like species (17), fall exclusively in 106
another branch divergent from bezoar-goat (Fig. 1B and fig. S7). Principal 107
component analysis (PCA) shows that the three bezoar populations, which were 108
collected in the Middle East (Fig. 1A) near the domestication center (2), are 109
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
6
structured and cluster corresponding to their geographic origins (Fig. 1C). Within 110
domestic goats, PCA and model-based clustering (k = 3) show that Asian goats are 111
genetically distinct from European (EUR) and African (AFR) samples (Figs. 1D and 112
1E). At k = 6, Asian goats further split into two geographic subgroups: Southwest 113
Asia-South Asia (SWA-SAS) and East Asia (EAS) (Fig. 1E and figs. S8 to S10). 114
This geographic structure is also supported by TreeMix and haplotype-based statistics 115
(figs. S11 and S12), in agreement with the scenario that the ancestors of present-day 116
domestic goat populations followed distinct dispersal routes along the east-west axis 117
of Afro-Eurasia (4) (Fig. 2A, figs. S13 and S14, and table S6). 118
Our demographic analyses using MSMC, SMC++, and ∂a∂i indicate that the 119
divergence times amongst Asian, African and European goat populations predated the 120
archaeologically estimated domestication time by a large margin (figs. S11 to S17, 121
table S7, and text S2). In addition, the three bezoar populations share significantly 122
different levels of alleles with divergent goat populations according to D statistics 123
(Fig. 2B and fig. S22). Therefore, the deep coalescence times of different domestic 124
goat populations can be explained by their origin in structured ancestral bezoar 125
populations (6), or by post-domestication recruitment from different local bezoar 126
populations. 127
128
Gene flow from ibex-like species to pre-domestication bezoars and modern goats 129
D statistics reveal that all four ibex-like species have different signals of allele sharing 130
with ancient and modern goats, with all goat populations showing the greatest sharing 131
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
7
with the West Caucasian tur (Fig. 2C and table S8). Notably, all four pre-132
domesticated bezoars (an Armenian bezoar dating to >47,000 years ago and three 133
Anatolian bezoars dating to ~13,000 years ago) show an excess of allele sharing with 134
the West Caucasian tur (Fig. 2D). Further f3 statistics in the form of f3 (modern bezoar, 135
West Caucasian tur; target) support gene flow from the West Caucasian tur to ancient 136
Armenian bezoar, but do not suggest additional recent gene flow to modern goats 137
(table S9). Together with the fact that three pre-domestication Anatolian bezoars 138
possessed a tur-like mitochondrial haplotype (6), this evidence supports a pre-139
domestication admixture event of bezoar with a tur-like species. This ancient 140
admixture event is further supported by the distribution of West Caucasian tur being 141
sympatric with that of the ancient Armenian bezoar, and closer to the goat 142
domestication center than any other living ibex-like species (Fig. 2A). 143
Although all modern goat genomes suggest some introgression from tur, 144
Neolithic Balkan and modern EUR and AFR goats show more allele sharing with tur 145
compared to Asian goats (fig. S23), probably because of higher genetic affinity with 146
the ancient Armenian and Anatolian bezoars (6) (Fig. 2D). Therefore, pre-147
domestication introgression from a tur-like species may have contributed alleles to 148
ancient bezoars and thereby early domesticated goats and their derived modern 149
populations. 150
We identified a total of 112 putative introgressed regions derived from ibex-like 151
species with a haplotype frequency > 0.1 in domestic goats through D statistics, 152
Identity-By-State (IBS), Sprime, and maximum likelihood (ML) phylogenetic trees 153
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
8
(Fig. 2E, fig. S24, and Data file S1). The 112 regions cover 81 protein-coding genes. 154
A KEGG pathway enrichment analysis for these genes shows that the most 155
significantly enriched category is amoebiasis (hypergeometric test, adjusted P < 5.28 156
× 10-3, table S10), which is related to parasite invasion and immunosuppression, 157
including four genes (SERPINB3, SERPINB4, CD1B, and COL4A4). Three additional 158
introgressed genes (BPI, MAN2A1, and CD2AP) are also involved in immune 159
function (18-20). For half of the introgressed regions, the West Caucasian tur shows 160
the highest similarity with the introgressed alleles among the four ibex-like species 161
(fig. S25 and Data file S1). Furthermore, the EUR-AFR populations have a higher 162
prevalence of introgressed regions than the Asian populations (t-test, P = 3.30×10-10) 163
(Fig. 2E), consistent with the higher genome-wide allele sharing between EUR-AFR 164
and West Caucasian tur. 165
Notably, one introgressed region on chromosome 29 has a nearly fixed 166
introgressed haplotype (frequency = 95.7%) in the modern worldwide domestic goats 167
(Figs. 2D and 2F). This region contains one intact protein-coding gene, MUC6, 168
which encodes a gastrointestinally secreted mucin that forms a protective glycoprotein 169
coat involved in host innate immune responses to the invasion of multiple 170
gastrointestinal pathogens (21-23). We searched for potential donor populations in our 171
sequenced wild caprid genomes and used coalescent-based inference methods to 172
validate this introgression signal. The haplotype network constructed with the non-173
repeated region of MUC6 showed the most common domestic haplotype (MUC6D) is 174
similar to West Caucasian tur, differing by only one mutation, while being strikingly 175
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
9
different from the minor frequency haplotypes of domestic goats and the common 176
haplotype in bezoar (MUC6B) (Fig. 2G). Coalescence time estimates and neutral 177
simulations suggest that such pattern of haplotype differentiation is highly unlikely in 178
the absence of interspecific gene flow (figs. S26 and S27). Therefore, the nearly fixed 179
MUC6D in goats was most likely introgressed from a lineage close to the West 180
Caucasian tur, in agreement with the genome-wide signal. 181
182
Domestication-mediated selection on immune and neural genes 183
To identify key selective sweeps in goat domestication, the worldwide domestic goat 184
populations were compared with all 24 bezoars by estimating pairwise genetic 185
differentiation (FST), differences in nucleotide diversity (π ln-ratio bezoar/domestic) 186
and cross-population extended haplotype homozygosity (XP-EHH) in 50 kb sliding 187
windows along the genome (figs. S28 and S29 and table S11). We defined the 188
windows with significant values (Z test, P < 0.005) in all three statistics (FST > 0.195, 189
π ln-ratio > 0.395 and XP-EHH > 2.10) as putative selective sweeps. After merging 190
consecutive outlier windows, 105 unique regions containing 403 protein-coding genes 191
were identified (Fig. 3A and Data file S2). 18 of these regions have been identified in 192
previous domestication scans, including those associated to phenotypic effects related 193
to immunity, neural pathways/processes, pigmentation, and productivity traits 194
associated with milk composition and hair characteristics (Data file S2). The limited 195
overlap with previous studies is mainly due to differences in sample sets, selection 196
scan methodology and different versions of the reference genome (text S2). 197
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
10
KEGG analysis showed that nine of the 14 significantly enriched pathways 198
(hypergeometric test, adjusted P < 0.01) are immune-related (hypergeometric test, 199
adjusted P = 5×10-3 to 2.35×10-7) (Fig. 3B and table S12), including influenza A, 200
malaria, African trypanosomiasis, Natural killer cell-mediated cytotoxicity, Measles, 201
Herpes simplex infection, Cytokine-cytokine receptor interaction, Hepatitis C, and 202
Adipocytokine signaling pathway. We surveyed the literature and identified 40 203
additional genes with immune function (table S13), obtaining a total of 41 regions 204
that contain 77 immune-related genes. Among these, FST is particularly high at MUC6 205
region and we detected 16 missense mutations showing FST > 0.88 in this gene 206
(windowed FST = 0.89, π ln-ratio = 1.01, XP-EHH = 5.25. Fig. 3C, fig. S30, and 207
table S14). This region contains 228 SNPs nearly fixed for the derived allele in 208
modern goats (frequency > 95%, absent in bezoars) accounting for 93.8% of those in 209
the whole genome (a total of 243, Data file S3), illustrating the singular nature of this 210
selection signal. 211
The selection enrichment analysis furthermore pinpoints 12 genes involved in 212
Neuroactive ligand-receptor interaction (hypergeometric test, adjusted P = 3.70×10-5). 213
We observed an additional 37 genes linked to other functions in the nervous system 214
that were under selection during goat domestication (table S15). One genomic region 215
located on chromosome 15 shows the strongest combined signal of selection (FST = 216
0.90, π ln-ratio = 3.20, and XP-EHH = 8.09) (Fig. 3A). Evidence for large negative 217
Tajima’s D values, high composite likelihood ratio (CLR) scores, and extensive 218
haplotype sharing in domestic goats all suggest strong positive selection at this locus 219
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
11
(Fig. 3D and fig. S31). This locus contains 15 SNPs almost fixed for the derived 220
allele in domestic goats and harbors two protein-coding genes, STIM1 and RRM1 221
(Data file S3). Functionally, STIM1 is an endoplasmic reticulum calcium sensor 222
involved in regulating Ca2+ and metabotropic glutamate receptor signaling in the 223
neural system (24, 25). RRM1 encodes ribonucleoside-diphosphate reductase large 224
subunit, an enzyme essential for the production of deoxyribonucleotides, and 225
influences sensitivity to valproic acid�induced neural tube defects in mice (26). 226
Hence, we hypothesize that this strongly selected locus could be an example of 227
behavioral adaptions in the early phase of animal domestication (27). 228
229
Introgressed MUC6 plays a role in pathogen resistance 230
The MUC6 region constitutes the only intersection between the selection and 231
introgression scans. In sheep and cattle, MUC6 is associated with gastrointestinal 232
parasite resistance (28-31). The results of transcriptome sequencing, qPCR, and 233
immunohistochemistry revealed that MUC6 is specifically expressed in the 234
abomasum and duodenum of goat (Fig. 4A and figs. S32 and S33). Structurally, 235
MUC6 has a long variable number of tandem repeats (VNTRs) rich in Pro, Thr, and 236
Ser residues, which can influence the covalent attachment of O-glycans (32). We 237
therefore used PacBio SMRT transcriptome sequencing to investigate the difference 238
between the MUC6D and MUC6B at the transcriptional level. Besides a number of 239
small indels, an 82 amino acid (aa) deletion containing three copies of the VNTR after 240
the 2,789th aa was uniquely identified in MUC6D compared to MUC6B (figs. S34 and 241
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
12
S35). The number of VNTRs in these two haplotypes may influence the function of 242
MUC6, which is a key to the generation of gastrointestinal mucous gel (22, 33). To 243
examine the potential difference of pathogen resistance of the MUC6D and MUC6B 244
variants, an epidemiological survey was carried out in a polymorphic goat population. 245
By evaluating fecal egg counts (FEC) for gastrointestinal nematodes in 143 246
MUC6D/MUC6D, 111 MUC6D/MUC6B, and 14 MUC6B/MUC6B ewes at 13 months of 247
age (fig. S36), we found that the goats carrying MUC6D/MUC6D exhibited lower FEC 248
than those carrying the other two genotypes (Fig. 4B and Data file S4), implying that 249
the goats carrying MUC6D might be more resistant to gastrointestinal nematodes. 250
These results support that the introgressed MUC6D in domestic goats is most likely 251
under selection due to its advantage in the host innate immune response towards 252
potential gastrointestinal pathogens (22). 253
254
The origin and diffusion of domestic STIM1-RRM1 and MUC6 alleles 255
An in-depth genetic survey of ancient genomes throughout the Near East revealed that 256
two ancient Balkan goats dating to ~8,100 years ago carried STIM1-RRM1D (fig. S37 257
and Data file S5). MUC6D appeared later at ~7,200 years ago in Southwest Iran (fig. 258
S38 and Data file S6), a period characterized by an increased density of settlements 259
in the Fertile Crescent (34). Herding goats at higher densities in a crowded and 260
disease-prone anthropogenic environment would likely have exerted an increased 261
selective pressure for livestock pathogen resistance (35). Interestingly, the first 262
detected ancient animals possessing STIM1-RRM1D or MUC6D both carry 263
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
13
mitochondrial haplogroup A, although this mitochondrial haplogroup had a low 264
frequency and narrow distribution before 7,500 years ago (6). By ~6,500 years ago, 265
the frequency of the STIM1-RRM1D and MUC6D increased to ~60% in the Near East 266
(Fig. 5A) and spread to China ~3,900 years ago (Fig. 5B), concomitant with the 267
consolidation and diffusion of livestock-based economies throughout Eurasia (36, 37). 268
The expansion of these two selected loci was also contemporaneous with the 269
overwhelming spread of mitochondrial haplogroup A. In contrast, the frequencies of 270
the two major Y-chromosome haplogroups remained relatively unchanged over time 271
(Fig. 5B, figs. S38 to S40, and Data file S7).272
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
14
Discussion 273
The present study generated genomic data from a substantial number of domestic and 274
wild Capra species to characterize adaptive introgression and genetic changes during 275
goat domestication. Collectively, both the selective sweep enrichment analyses and 276
the two outstanding selection signals found in MUC6 and STIM1-RRM1 in domestic 277
goats are consistent with the hypothesis that genes related to pathogen resistance and 278
behavior have been particularity targeted during goat domestication. 279
Numerous studies have shown that adaptive introgression can provide beneficial 280
alleles that allow the recipient populations to adapt to new environments (12, 38-40). 281
In humans, immune-related loci acquired substantially advantageous alleles by 282
admixture with archaic humans who were probably well adapted to local 283
environments and pathogens (41-43). In our study, a number of lines of evidence 284
support that gene flow between West Caucasian tur and goats occurred prior to the 285
onset of domestication. The West Caucasian tur is distributed around the Black Sea 286
coast (Fig. 2A), geographically close to the goat domestication center. West 287
Caucasian tur inhabit a humid, subtropical environment where the expected pathogen 288
exposure and parasite load may be considerably higher than in more arid regions of 289
southwest Asia. The rigorous identification of introgression segments in domestic 290
goats shows that the nearly fixed MUC6D was introgressed from a West Caucasian 291
tur-like species. These results indicate that the advantageous MUC6D might enhance 292
innate immune surveillance and reactivity against potential gastrointestinal pathogens 293
under a grazing agro-ecological niche (44) with novel infectious disease pressures. 294
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
15
Our data set including several ancient goat samples allowed us to tentatively 295
track the emergence and spread of the advantageous variants in these two important 296
domestication loci (Fig. 5). The first occurrences of the STIM1-RRM1D and MUC6D 297
locus coincided with the otherwise rare mtDNA haplogroup A, and the three 298
increased in frequency roughly concurrently (Fig. 5A), becoming nearly fixed in 299
modern goat populations. These results suggest the global diffusion of variations 300
central to the domestication process was possibly mediated by a subset of female 301
goats carrying the mtDNA haplogroup A. Despite this, modern goat populations 302
remain clearly differentiated from each other. The most likely explanation we can 303
think of is that even Neolithic goat husbandry entailed some kind of breeding strategy 304
under which immigrant matrilines containing globally advantageous alleles were 305
carefully backcrossed with local populations, which probably carried local genetic 306
adaptations, rather than simply replacing local populations. 307
Overall, our results provide evidence for adaptive introgression and the genetic 308
basis of selected traits during the domestication of goats. We highlight that livestock 309
domestication is a dynamic evolutionary process, with adaptive leaps driven by 310
introgression and selection. Our results indicate that domestication may have focused 311
mainly on neural traits and changes in pathogen resistance, which helped managed 312
herds to adapt to an anthropogenic environment. Our work provides a new insight into 313
goat domestication research and the application of genomics to practical breeding for 314
other domesticated species.315
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
16
Materials and Methods 316
Sample collection 317
We sampled 106 animals including 88 domestic goats (Capra hircus), one bezoar 318
(Capra aegagrus), three Markhors (Capra falconeri), three Siberian ibexes (Capra 319
sibirica), one Alpine ibex (Capra ibex), one West Caucasian tur (Capra caucasica), 320
five ancient goats spanning from the Neolithic to the Middle Ages, and four Nubian 321
ibex hybrids (Capra nubiana × C. hircus) for evolutionary analyses. All procedures 322
involving sample collection were approved by the Northwest A&F University Animal 323
Care Committee (Permit number: NWAFAC1019), making all efforts to minimize 324
invasiveness. The C. caucasica (tooth) sample was obtained from a zoo specimen 325
collected by the Muséum national d’histoire naturelle (MNHN, sample ID MNHN 326
ZM AC 1982-1092) in the year 1982. In addition, we also obtained whole-genome 327
data and their geographical coordinates of the sampling sites from the Nextgen project 328
(http://projects.ensembl.org/nextgen) (5, 15). Details of the samples used in this study 329
are given in tables S1 to S3. 330
331
DNA extraction and sequencing 332
For modern samples, genomic DNA was extracted from whole blood using the 333
standard phenol-chloroform method and checked for quality and quantity on the Qubit 334
2.0 fluorometer (Invitrogen). Library construction for resequencing was performed 335
with 1-3 µg of genomic DNA using standard library preparation protocols and insert 336
sizes from 300 to 500 bp. All libraries were sequenced on either the Illumina HiSeq 337
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
17
2500 or X Ten platforms. 338
For the historical sample (C. caucasica), tooth pulverization, DNA extraction 339
and Illumina sequencing library preparation were performed in Trinity College Dublin, 340
Ireland as described by ref. (6) and ref. (45). The double-stranded DNA sequencing 341
library was constructed without UDG treatment. The sample library was then 342
sequenced using a HiSeq 2500 platform (Single end, 1×101 bp). 343
344
Read alignment and variant calling 345
Filtered reads from all individuals were aligned to the latest goat reference genome 346
(GCF_001704415.1) (46) by the Burrows-Wheeler Aligner (BWA) (47, 48). To 347
obtain high-quality SNPs, SNP calling and genotyping were carried out at two 348
separate stages. Variants were discovered on a population-scale (without outgroup 349
argali Ovis ammon) using SAMtools model (49) implemented in the analysis of next 350
generation sequencing data (ANGSD) (50) with “-only_proper_pairs 1 -uniqueOnly 1 351
-remove_bads 1 -minQ 20 -minMapQ 30 -C 50 -doMaf 1 -doMajorMinor 1 -GL 1 -352
setMinDepth [Total_read_depth/3] -setMaxDepth [Total_read_depth*3] -doCounts 1 353
-dosnpstat 1 -SNP_pval 1”. The sites that could be called in at least 90% of the 354
samples and had a strand bias score below 90% were retained. A P-value threshold of 355
1 × 10-6 and minor allele frequency value >1/2n were used as SNP discovery criteria, 356
where n was the sample size. We also kept the sites at which the sampled individuals 357
were homozygous for the alternative alleles. To get the hard-called genotypes for the 358
candidate variants in all Capra species and outgroups, we called genotypes according 359
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
18
to the GATK-3.7.0 (51) best practice workflows. Finally, we only kept biallelic 360
markers as well as markers with no more than 10% missing data using VCFtools 361
v0.1.15 (52) with “--min-alleles 2 --max-alleles 2 --max-missing 0.9”. Then, all 362
sample sets of filtered variant calls were used for imputation and phasing using 363
BEAGLE v4.1 (53, 54). 364
To retrieve the nucleotides that were accessible to variant discovery (used in the 365
demographic estimation), hard filtering of the invariant sites was carried out with 366
thresholds based on sequencing coverage (1/3 to 3-fold of corresponding mean depth) 367
and missing data rate (no more than 10%) following the same strategies adopted for 368
variant sites. 369
370
Population structure and phylogenetic analysis 371
Neighbor-joining (NJ) tree was constructed for the whole-genome SNP set using 372
MEGA v6.0 (55) based on pairwise genetic distances matrix calculated by plink v1.9 373
(56). We also made use of the 5,043,096 fourfold degenerate sites to build a 374
maximum likelihood (ML) tree using RAxML v8.2.9 (57). PCA was performed using 375
smartpca, part of the EIGENSOFT v6.1 (58). A Tracy-Widom test was used to 376
determine the significance level of the eigenvectors. ADMIXTURE (59) was used to 377
perform an unsupervised clustering analysis. We increased the number of predefined 378
genetic clusters from k = 2 to k = 7 and ran the analysis 20 times for each k. We 379
further compared the individual-level haplotype similarity using fineSTRUCTURE 380
(60). TreeMix v1.12 (61) was also used to infer a population-level phylogeny using 381
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
19
the ML approach. 382
383
Demographic reconstruction 384
To infer patterns of effective population size and population separations over 385
historical time, we used MSMC2 (62), using the statistically phased autosomal 386
genotypes. We also used the SMC++ (63), which does not rely on haplotype phase 387
information. We estimated a mutation rate of 4.32×10-9 per site per generation based 388
on the whole genome alignment between goat and sheep (Ovis aries) and a generation 389
time of 2 years. Additionally, we calculated the likelihood of the observed site 390
frequency spectrum (SFS) conditioned on several demographic models using ∂a∂i 391
(64). For the best model, nonparametric bootstrapping (100 replicates) was performed 392
to determine the variance of each parameter. 393
394
Gene flow analysis 395
We computed f3-statistics (65) and D-statistics (66) to formally test hypotheses of 396
gene flow. To identify regions introgressed into modern domestic goats from ibex-397
related species, we calculated D-statistics using non-overlapping 20 kb sliding 398
windows across the genome. Windows within top 5% of the D-statistics were 399
considered as outliers and were further examined. An introgressed segment should 400
have low sequence divergence from the putative donor species whereas sequence 401
divergence between the donor species and bezoar should be higher. Therefore, we 402
also calculated the Identity-By-State (IBS) distance matrix of the outlier regions using 403
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
20
ANGSD (50). Differences in sequence divergence were determined by a t-test-based 404
approach using a P-value of 0.01 as cutoff. To further investigate whether 405
introgression best explained the outlier regions, we used Sprime (67) to infer the 406
putative introgressed alleles in the significantly diverged regions using 24 modern 407
bezoars as outgroup. Although bezoars may contain a small amount of introgressed 408
variants from ibex-related species, only the frequent (i.e., introgressed allele 409
frequency > 0.1) introgression from other wild Capra species to domestic goats were 410
assessed here. Alleles with frequency > 1/48 in the 24 bezoars were excluded from 411
further analysis. We assumed a score threshold of 150,000 and a mutation rate of 4.32 412
× 10-9 per site per generation to determine an introgressed segment (67). The region 413
boundaries were defined by the first and last introgressed alleles. To determine 414
putative introgression haplotypes, we required that over 50% of introgressed alleles 415
could be found in introgressed haplotypes. Genomic regions containing at least 100 416
putative introgressed variants in domestic goats were further examined by 417
constructing ML trees and by searching for domestic goats located within the ibex-418
related clade. For each region, we also reported a match if the genotype of ibex like 419
species included the putative introgressed allele and a mismatch otherwise. The match 420
rate was calculated as the number of matches divided by the number of compared 421
sites (matches and mismatches). 422
423
Selective sweep analysis 424
We screened for sweeps selected during goat domestication by parsing specific 50-kb 425
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
21
windows that showed low diversity in domestic goats and had high divergence (a high 426
fixation index, FST, and haplotype differences, XP-EHH) between domestic goats and 427
wild goats. After calculating all tests, the windows with P values less than 0.005 (Z 428
test) were considered significant signals. Candidate genes under selection were 429
defined as those overlapped by sweep regions or within 20-kb of the signals. Two 430
additional statistics, the Tajima’s D and composite likelihood ratio (CLR) test were 431
applied to confirm the top signals. 432
433
Functional enrichment analyses 434
We characterized the most relevant functions of the protein-coding genes overlapping 435
with the selective sweeps and introgressed regions by searching for overrepresented 436
Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Goat protein 437
sequences were used to conduct functional enrichment tests on the target genes using 438
KOBAS 3.0 server (68). The P-value was calculated using a hypergeometric 439
distribution. FDR correction was performed to adjust for multiple testing. Pathways 440
with an FDR-corrected P-value < 0.01 were considered statistically significantly 441
enriched. 442
443
Genotyping loci underlying domestication in ancient genomes 444
To investigate the genotypes of the STIM1-RRM1 and MUC6 locus in ancient samples, 445
we used a total of 243 SNPs (15 SNPs within STIM1-RRM1 locus and 228 SNPs 446
within MUC6 locus, respectively), which showed derived allele frequency > 0.95 in 447
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
22
164 modern domestic goats (defined as domestic haplotypes: STIM1-RRM1D and 448
MUC6D) and was absent in 24 modern bezoars and four ancient bezoars (Hovk1, 449
Direkli1-2, Direkli6 and Direkli5) (defined as bezoar haplotypes: STIM1-RRM1B and 450
MUC6B). Due to the low coverage for some ancient genomes, we used allele reads 451
count at SNP positions to determine the genotypes of ancient goats at STIM1-RRM1 452
locus and MUC6 locus. For each ancient goat and SNP, we determined the numbers 453
of reads corresponding to the ancestral and derived allele using GATK 454
UnifiedGenotyper (--min_base_quality_score 15 --output_mode EMIT_ALL_SITES -455
-standard_min_confidence_threshold_for_calling 20) (51). For each locus in each 456
ancient goat, we calculated the proportion of the summed derived allele reads count 457
versus total allele reads count. If the proportion was lower than 10%, the genotype of 458
ancient goat was considered to be homozygous for ancestral allele (bezoar-like). The 459
genotype was set to heterozygous if the proportion was between 10% and 90%. If the 460
proportion was more than 90%, it was classified as homozygous for derived allele 461
(domestic-like). We set the genotype to be missing when the ancient goats only had 462
one single informative read (mapping to the predefined SNPs) mapped to the locus. 463
464
Expression and epidemiologic survey related to MUC6 locus 465
Quantitative real-time PCR analysis 466
Total RNA was extracted using goat tissues from gastrointestinal tract (including 467
abomasum, duodenum, jejunum, ileum, and caecum) using Eastep Super Total RNA 468
Extraction Kit (Promega, Shanghai, China). cDNA was generated via 469
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
23
PrimerScriptTM RT Reagent Kit with genomic DNA Eraser (Perfect Real Time) 470
(TaKaRa, Beijing, China). Quantitative real-time PCR analysis (q-PCR) was 471
performed with FastStart Universal SYBR Green Master (Roche, Shanghai, China) on 472
a Bio-Rad instrument. MUC6 primers (Forward primer: 5’-473
CAGCCAGGACAAAATCATGA-3’, Reverse primer: 5’-474
CTCTGGTCTGGCCTCTGAAC-3’) were designed using NCBI Primer-BLAST 475
(http://www.ncbi.nlm.nih.gov/tools/primer-blast/). GAPDH (Forward primer: 5’-476
ACACCCTCAAGATTGTCAGC-3’, Reverse primer: 5’-477
ATAAGTCCCTCCACGATGC-3’) was used as internal reference. Gene expression 478
results were calculated using the delta-delta cycle threshold (2-ΔΔCt) method. 479
Immunohistochemistry analysis 480
For histological analysis, dissected goat tissues (abomasum pyloric and duodenal bulb) 481
were fixed in 4% paraformaldehyde, and embedded in paraffin for sectioning (5 μm 482
thick). The tissues sections were deparaffinized and rehydration followed by antigen 483
retrieval using a citrate-buffered solution in a microwave at 100� for 15 min. After 484
cooling down to room temperature, quenching of endogenous peroxidase and protein 485
block, the sections were treated with a primary antibody (Cat No. D161001, Sangon 486
Biotech, Shanghai, China) overnight at 4�. Negative control sections were obtained 487
by incubating with the primary antibody diluted buffer. Subsequently, the secondary 488
antibody (Cat No. D110058, Sangon Biotech, Shanghai, China) was used to detect 489
primary antibody. Specific protein immunoreactivity was visualized with the substrate 490
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
24
chromogen 3,3′- diaminobenzidine (DAB). Finally, the slides were rinsed in water, 491
counterstained with haematoxylin and mounted with coverslips. 492
Full-length transcript sequencing 493
To determine the sequence differences between the domestic (MUC6D) and bezoar 494
(MUC6B) MUC6 haplotypes, total RNA was isolated from the abomasum pyloric with 495
high expression of MUC6 from two heterozygous goats and sequenced by PacBio 496
Sequel. Pacbio SMRTbell libraries were sequenced on two separate SMRT cells 497
(Annoroad Gene Technology Co., Ltd., Beijing, China). A total of 23,338,976 498
subreads were generated with a mean accuracy of 80% and an average length of 1755 499
nt. High-quality Circular Consensus Sequences (CCS) were obtained using the Iso-500
Seq 3 application in the PacBio SMRT Analysis v6.0.0 501
(https://github.com/PacificBiosciences/IsoSeq3), with parameters “--noPolish --502
minPasses 1”. Finally, we got a total of 960,271 CCS. These CCS were aligned to the 503
latest goat reference genome using Minimap2 (69) and we picked out 251 CCS that 504
mapped to the MUC6 mRNA sequence. We used the CCS that belonged to the bezoar 505
haplotype to manually assemble the mRNA sequence of the MUC6B. Then, we 506
realigned the MUC6B mRNA sequence to MUC6D mRNA sequence 507
(XM_018042766.1) by MEGA6 (55) and checked the indels carefully. The abundance 508
of tandem repeats in MUC6 (XP_017898255.1) was examined by using the rapid 509
automatic detection and alignment of repeat finding program 510
(https://www.ebi.ac.uk/Tools/pfa/radar/) (70). Three major types of repeat units with 511
distinct sequence features were observed. We detected a 246 bp insertion in the 512
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
25
MUC6B mRNA sequence located at the 32nd exon, containing three copies of type III 513
units between the 2789th amino acid and the 2871th amino acid. We further validated 514
this insertion by aligning short transcriptome reads from homozygous and 515
heterozygous to the MUC6B and MUC6D mRNA sequences, respectively. 516
Test population 517
To further test the genotype-phenotype associations, we first examined the genotypes 518
at the MUC6 locus using blood samples from breeding rams from a company located 519
in Inner Mongolia, China. This company runs more than 9,000 cashmere goats, 520
mainly for superfine cashmere (fiber diameter < 14 μm). The animals are dewormed 521
routinely with Oxfendazole, Ivermectin, Avermectin, and Levamisole three times 522
annually. Of the 20 tested breeding rams, we identified two animals with 523
MUC6D/MUC6B genotype. We sampled their offspring based on breeding records. 524
Blood samples of these 495 animals were collected in January, 2019. 525
Genotypic and phenotypic analysis 526
DNA was extracted from the blood samples. The MUC6 locus was genotyped by PCR 527
(Forward primer: 5’-CAGCACTATCTCCCATACATC-3’, Reverse primer: 5’-528
GTGGAGCTGAGCTGACACTT-3’) and Sanger sequencing. The frequency of 529
MUC6B was 17.3% (n = 338 MUC6D/MUC6D, n = 143 MUC6D/MUC6B, n = 14 530
MUC6B/MUC6B). 531
After genotyping, we collected fresh fecal samples from a subset of 268 animals 532
from the rectum in April, 2019, a time when gastrointestinal nematodes numbers were 533
anticipated to be elevated. The gastrointestinal parasitic infestations were examined 534
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
26
using the McMaster's technique as described in ref. (71). In brief, two grams of fecal 535
samples were transferred into a container and 60 ml of saturated sodium chloride were 536
added. The suspension was thoroughly stirred with a glass stick and poured through a 537
strainer (80-mesh screen) into the new container. The container with the suspension 538
was closed tightly and carefully inverted several times. Then, the suspension was 539
taken up from the container to fill the glass plate with two McMaster counting 540
chambers. The size of the counting chamber was 10 × 10 × 1.5 mm. The nematodes 541
eggs were then counted under microscopic observation at 100× magnification. 542
Observed nematodes in the feces included the common abomasal and intestinal goat 543
nematodes: Hemonchus contortus and Nematodirus sp. (fig. S36). 544
The number of nematodes eggs per gram (EPG) was calculated according to the 545
following formula: EPG = (n1 + n2)/2 ÷ 0.15 × 60 ÷ 2, where (n1 + n2)/2 is the 546
average number of eggs per chamber, 0.15 is the effective volume (ml) for counting 547
chamber, 60 is the total volume (ml) of suspension, 2 is the weight (g) of faeces 548
examined. EPG was measured on three replicates of each fecal sample and the 549
average of the three replicates was used for analysis. 550
Data analysis 551
The average fecal nematodes egg counts were not normally distributed, therefore we 552
used Wilcoxon’s rank sum test to test the null hypothesis that there was no association 553
between genotype and phenotype.554
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
27
References 555
1. G. Larson, D. Q. Fuller, The Evolution of Animal Domestication. Annu Rev 556 Ecol Evol S 45, 115-136 (2014)10.1146/annurev-ecolsys-110512-135813). 557
2. M. A. Zeder, Domestication and early agriculture in the Mediterranean Basin: 558 Origins, diffusion, and impact. Proceedings of the National Academy of 559 Sciences of the United States of America 105, 11597-11604 (2008); published 560 online EpubAug 19 (10.1073/pnas.0801317105). 561
3. F. Pereira, A. Amorim, Origin and Spread of Goat Pastoralism. 562 (2010)10.1002/9780470015902.a0022864). 563
4. L. Colli, M. Milanesi, A. Talenti, F. Bertolini, M. Chen, A. Crisà, K. G. Daly, 564 M. Del Corvo, B. Guldbrandtsen, J. A. Lenstra, Genome-wide SNP profiling 565 of worldwide goat populations reveals strong partitioning of diversity and 566 highlights post-domestication migration routes. Genetics Selection Evolution 567 50, 58 (2018). 568
5. F. J. Alberto, F. Boyer, P. Orozco-terWengel, I. Streeter, B. Servin, P. de 569 Villemereuil, B. Benjelloun, P. Librado, F. Biscarini, L. Colli, M. Barbato, W. 570 Zamani, A. Alberti, S. Engelen, A. Stella, S. Joost, P. Ajmone-Marsan, R. 571 Negrini, L. Orlando, H. R. Rezaei, S. Naderi, L. Clarke, P. Flicek, P. Wincker, 572 E. Coissac, J. Kijas, G. Tosser-Klopp, A. Chikhi, M. W. Bruford, P. Taberlet, 573 F. Pompanon, Convergent genomic signatures of domestication in sheep and 574 goats. Nature communications 9, 813 (2018); published online EpubMar 6 575 (10.1038/s41467-018-03206-y). 576
6. K. G. Daly, P. Maisano Delser, V. E. Mullin, A. Scheu, V. Mattiangeli, M. D. 577 Teasdale, A. J. Hare, J. Burger, M. P. Verdugo, M. J. Collins, R. Kehati, C. M. 578 Erek, G. Bar-Oz, F. Pompanon, T. Cumer, C. Cakirlar, A. F. Mohaseb, D. 579 Decruyenaere, H. Davoudi, O. Cevik, G. Rollefson, J. D. Vigne, R. Khazaeli, 580 H. Fathi, S. B. Doost, R. Rahimi Sorkhani, A. A. Vahdati, E. W. Sauer, H. 581 Azizi Kharanaghi, S. Maziar, B. Gasparian, R. Pinhasi, L. Martin, D. Orton, B. 582 S. Arbuckle, N. Benecke, A. Manica, L. K. Horwitz, M. Mashkour, D. G. 583 Bradley, Ancient goat genomes reveal mosaic domestication in the Fertile 584 Crescent. Science 361, 85-88 (2018); published online EpubJul 6 585 (10.1126/science.aas9411). 586
7. N. Pidancier, S. Jordan, G. Luikart, P. Taberlet, Evolutionary history of the 587 genus Capra (Mammalia, Artiodactyla): discordance between mitochondrial 588 DNA and Y-chromosome phylogenies. Mol Phylogenet Evol 40, 739-749 589 (2006); published online EpubSep (10.1016/j.ympev.2006.04.002). 590
8. S. E. Hammer, H. M. Schwammer, F. Suchentrunk, Evidence for Introgressive 591 Hybridization of Captive Markhor (Capra falconeri) with Domestic Goat: 592 Cautions for Reintroduction. Biochemical Genetics 46, 216-226 593 (2008)10.1007/s10528-008-9145-y). 594
9. C. Grossen, L. Keller, I. Biebach, D. Croll, I. G. G. Consortium, Introgression 595 from domestic goat generated variation at the major histocompatibility 596 complex of alpine ibex. PLoS genetics 10, e1004438 (2014). 597
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
28
10. F. Racimo, S. Sankararaman, R. Nielsen, E. Huerta-Sanchez, Evidence for 598 archaic adaptive introgression in humans. Nature reviews. Genetics 16, 359-599 371 (2015); published online EpubJun (10.1038/nrg3936). 600
11. X. J. Hu, J. Yang, X. L. Xie, F. H. Lv, Y. H. Cao, W. R. Li, M. J. Liu, Y. T. 601 Wang, J. Q. Li, Y. G. Liu, Y. L. Ren, Z. Q. Shen, F. Wang, E. Hehua, J. L. 602 Han, M. H. Li, The Genome Landscape of Tibetan Sheep Reveals Adaptive 603 Introgression from Argali and the History of Early Human Settlements on the 604 Qinghai-Tibetan Plateau. Mol Biol Evol 36, 283-303 (2019); published online 605 EpubFeb 1 (10.1093/molbev/msy208). 606
12. D. D. Wu, X. D. Ding, S. Wang, J. M. Wojcik, Y. Zhang, M. Tokarska, Y. Li, 607 M. S. Wang, O. Faruque, R. Nielsen, Q. Zhang, Y. P. Zhang, Pervasive 608 introgression facilitated domestication and adaptation in the Bos species 609 complex. Nature ecology & evolution 2, 1139-1145 (2018); published online 610 EpubJul (10.1038/s41559-018-0562-y). 611
13. P. W. Hedrick, Adaptive introgression in animals: examples and comparison 612 to new mutation and standing variation as sources of adaptive variation. 613 Molecular ecology 22, 4606-4618 (2013); published online EpubSep 614 (10.1111/mec.12415). 615
14. Z. Sun, J. Shao, L. Liu, J. Cui, M. F. Bonomo, Q. Guo, X. Wu, J. Wang, The 616 first Neolithic urban center on China's north Loess Plateau: The rise and fall of 617 Shimao. Archaeological Research in Asia, (2017). 618
15. Y. Dong, X. Zhang, M. Xie, B. Arefnezhad, Z. Wang, W. Wang, S. Feng, G. 619 Huang, R. Guan, W. Shen, Reference genome of wild goat (capra aegagrus) 620 and sequencing of goat breeds provide insight into genic basis of goat 621 domestication. BMC genomics 16, 431-431 (2015). 622
16. V. Manceau, L. Despres, J. Bouvet, P. Taberlet, Systematics of the genus 623 Capra inferred from mitochondrial DNA sequence data. Mol Phylogenet Evol 624 13, 504-510 (1999); published online EpubDec (10.1006/mpev.1999.0688). 625
17. D. M. Shackleton, Wild sheep and goats and their relatives : status survey and 626 conservation action plan for caprinae. (International Union for Conservation 627 of Nature and Natural Resources, 1997). 628
18. M. A. Miguel, C. N. Mingala, Screening of Pig (Sus scrofa) Bactericidal 629 Permeability-Increasing Protein (BPI) Gene as Marker for Disease Resistance. 630 Animal biotechnology 30, 146-150 (2019). 631
19. W. Yan, C. Zheng, J. He, W. Zhang, X. A. Huang, X. Li, Y. Wang, X. Wang, 632 Eleutheroside B1 mediates its anti-influenza activity through POLR2A and N-633 glycosylation. International journal of molecular medicine 42, 2776-2792 634 (2018). 635
20. A. Omoumi, A. Fok, G.-Y. Hsiung, Analysis of two immune function genes 636 (CR1 and CD2AP) in Alzheimer's disease in two large Canadian cohorts. 637 Alzheimer's & Dementia: The Journal of the Alzheimer's Association 8, P722 638 (2012). 639
21. S. D. Babu, V. Jayanthi, N. Devaraj, C. A. Reis, H. Devaraj, Expression 640 profile of mucins (MUC2, MUC5AC and MUC6) in Helicobacter pylori 641
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
29
infected pre-neoplastic and neoplastic human gastric epithelium. Molecular 642 Cancer 5, 10 (2006). 643
22. M. A. McGuckin, S. K. Lindén, P. Sutton, T. H. Florin, Mucin dynamics and 644 enteric pathogens. Nature Reviews Microbiology 9, 265 (2011). 645
23. M. V. Benavides, T. S. Sonstegard, C. Van Tassell, Genomic Regions 646 Associated with Sheep Resistance to Gastrointestinal Nematodes. Trends in 647 parasitology 32, 470-480 (2016); published online EpubJun 648 (10.1016/j.pt.2016.03.007). 649
24. Hartmann, Jana, Karl, Rosa, nbsp, Alexander, Ryan, nbsp, P.D, Adelsberger, 650 Helmuth, STIM1 Controls Neuronal Ca2+ Signaling, mGluR1-Dependent 651 Synaptic Transmission, and Cerebellar Motor Behavior. Neuron 82, 635-644 652 (2014). 653
25. L. Majewski, F. Maciag, P. M. Boguszewski, I. Wasilewska, G. Wiera, T. 654 Wojtowicz, J. Mozrzymas, J. Kuznicki, Overexpression of STIM1 in neurons 655 in mouse brain improves contextual learning and impairs long-term depression. 656 Biochimica et biophysica acta 1864, 1071-1087 (2017); published online 657 EpubJun (10.1016/j.bbamcr.2016.11.025). 658
26. J. C. Craig, G. D. Bennett, R. C. Miranda, S. A. Mackler, R. H. Finnell, 659 Ribonucleotide reductase subunit r1: A gene conferring sensitivity to valproic 660 acid-induced neural tube defects in mice. Teratology 61, 305-313 661 (2000)10.1002/(sici)1096-9926(200004)61:4<305::aid-tera10>3.0.co;2-8). 662
27. E. O. Price, Behavioral aspects of animal domestication. The quarterly review 663 of biology 59, 1-32 (1984). 664
28. M. Rinaldi, L. Dreesen, P. R. Hoorens, R. W. Li, E. Claerebout, B. Goddeeris, 665 J. Vercruysse, W. Van Den Broek, P. Geldhof, Infection with the 666 gastrointestinal nematode Ostertagia ostertagi in cattle affects mucus 667 biosynthesis in the abomasum. Vet Res 42, 61 (2011); published online 668 EpubMay 11 (10.1186/1297-9716-42-61). 669
29. P. R. Hoorens, M. Rinaldi, R. W. Li, B. Goddeeris, E. Claerebout, J. 670 Vercruysse, P. Geldhof, Genome wide analysis of the bovine mucin genes and 671 their gastrointestinal transcription profile. BMC genomics 12, 140 (2011). 672
30. F. Roeber, A. R. Jex, R. B. Gasser, Impact of gastrointestinal parasitic 673 nematodes of sheep, and the role of advanced molecular tools for exploring 674 epidemiology and drug resistance-an Australian perspective. Parasites & 675 vectors 6, 153 (2013). 676
31. H. V. Simpson, S. Umair, V. C. Hoang, M. S. Savoian, Histochemical study of 677 the effects on abomasal mucins of Haemonchus contortus or Teladorsagia 678 circumcincta infection in lambs. Veterinary parasitology 226, 210-221 (2016); 679 published online EpubAug 15 (10.1016/j.vetpar.2016.06.026). 680
32. N. Moniaux, F. Escande, N. Porchet, J.-P. Aubert, S. K. Batra, Structural 681 organization and classification of the human mucin genes. Front Biosci 6, 682 D1192-D1206 (2001). 683
33. S. Z. Hasnain, A. L. Gallagher, R. K. Grencis, D. J. Thornton, A new role for 684 mucins in immunity: insights from gastrointestinal nematode infection. The 685
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
30
international journal of biochemistry & cell biology 45, 364-374 (2013). 686 34. D. Lawrence, G. Philip, H. Hunt, L. Snape-Kennedy, T. J. Wilkinson, Long 687
Term Population, City Size and Climate Trends in the Fertile Crescent: A First 688 Approximation. PLoS One 11, e0152563 689 (2016)10.1371/journal.pone.0152563). 690
35. K. N. Harper, G. J. Armelagos, Genomics, the origins of agriculture, and our 691
changing microbe‐scape: Time to revisit some old tales and tell some new 692
ones. American journal of physical anthropology 152, 135-152 (2013). 693 36. F. Pereira, S. J. Davis, L. Pereira, B. McEvoy, D. G. Bradley, A. Amorim, 694
Genetic signatures of a Mediterranean influence in Iberian Peninsula sheep 695 husbandry. Molecular Biology and Evolution 23, 1420-1426 (2006). 696
37. B. S. Arbuckle, Pace and process in the origins of animal husbandry in 697 Neolithic Southwest Asia. Bioarchaeology of the Near East 8, 53-81 (2014). 698
38. E. Huerta-Sanchez, X. Jin, Asan, Z. Bianba, B. M. Peter, N. Vinckenbosch, Y. 699 Liang, X. Yi, M. He, M. Somel, P. Ni, B. Wang, X. Ou, Huasang, J. Luosang, 700 Z. X. Cuo, K. Li, G. Gao, Y. Yin, W. Wang, X. Zhang, X. Xu, H. Yang, Y. Li, 701 J. Wang, J. Wang, R. Nielsen, Altitude adaptation in Tibetans caused by 702 introgression of Denisovan-like DNA. Nature 512, 194-197 (2014); published 703 online EpubAug 14 (10.1038/nature13408). 704
39. M. R. Jones, L. S. Mills, P. C. Alves, C. M. Callahan, J. M. Alves, D. J. R. 705 Lafferty, F. M. Jiggins, J. D. Jensen, J. Melo-Ferreira, J. M. Good, Adaptive 706 introgression underlies polymorphic seasonal camouflage in snowshoe hares. 707 Science 360, 1355-1358 (2018); published online EpubJun 22 708 (10.1126/science.aar5273). 709
40. E. M. Oziolor, N. M. Reid, S. Yair, K. M. Lee, S. Guberman VerPloeg, P. C. 710 Bruns, J. R. Shaw, A. Whitehead, C. W. Matson, Adaptive introgression 711 enables evolutionary rescue from extreme environmental pollution. Science 712 364, 455-457 (2019); published online EpubMay 3 (10.1126/science.aav4155). 713
41. M. Dannemann, A. M. Andres, J. Kelso, Introgression of Neandertal- and 714 Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-715 like Receptors. American journal of human genetics 98, 22-33 (2016); 716 published online EpubJan 07 (10.1016/j.ajhg.2015.11.015). 717
42. L. Abi-Rached, M. J. Jobin, S. Kulkarni, A. McWhinnie, K. Dalva, L. Gragert, 718 F. Babrzadeh, B. Gharizadeh, M. Luo, F. A. Plummer, The shaping of modern 719 human immune systems by multiregional admixture with archaic humans. 720 Science 334, 89-94 (2011). 721
43. F. L. Mendez, J. C. Watkins, M. F. Hammer, Neandertal origin of genetic 722 variation at the cluster of OAS immunity genes. Molecular biology and 723 evolution 30, 798-801 (2013). 724
44. K. Junker, I. G. Horak, B. Penzhorn, History and development of research on 725 wildlife parasites in southern Africa, with emphasis on terrestrial mammals, 726 especially ungulates. International Journal for Parasitology: Parasites and 727 Wildlife 4, 50-70 (2015). 728
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
31
45. M. Meyer, M. Kircher, Illumina sequencing library preparation for highly 729 multiplexed target capture and sequencing. Cold Spring Harbor protocols 730 2010, pdb prot5448 (2010); published online EpubJun (10.1101/pdb.prot5448). 731
46. D. M. Bickhart, B. D. Rosen, S. Koren, B. L. Sayre, A. R. Hastie, S. Chan, J. 732 Lee, E. T. Lam, I. Liachko, S. T. Sullivan, Single-molecule sequencing and 733 chromatin conformation capture enable de novo reference assembly of the 734 domestic goat genome. Nature genetics 49, 643 (2017). 735
47. H. Li, R. Durbin, Fast and accurate short read alignment with Burrows-736 Wheeler transform. Bioinformatics 25, 1754-1760 (2009); published online 737 EpubJul 15 (10.1093/bioinformatics/btp324). 738
48. H. Li, Aligning sequence reads, clone sequences and assembly contigs with 739 BWA-MEM. arXiv preprint arXiv:1303.3997, (2013). 740
49. H. Li, A statistical framework for SNP calling, mutation discovery, association 741 mapping and population genetical parameter estimation from sequencing data. 742 Bioinformatics 27, 2987-2993 (2011). 743
50. T. S. Korneliussen, A. Albrechtsen, R. Nielsen, ANGSD: analysis of next 744 generation sequencing data. BMC bioinformatics 15, 356 (2014). 745
51. A. McKenna, M. Hanna, E. Banks, A. Sivachenko, K. Cibulskis, A. Kernytsky, 746 K. Garimella, D. Altshuler, S. Gabriel, M. Daly, The Genome Analysis 747 Toolkit: a MapReduce framework for analyzing next-generation DNA 748 sequencing data. Genome research 20, 1297-1303 (2010). 749
52. P. Danecek, A. Auton, G. Abecasis, C. A. Albers, E. Banks, M. A. Depristo, R. 750 E. Handsaker, G. Lunter, G. T. Marth, S. T. Sherry, The variant call format 751 and VCFtools. Bioinformatics 27, 2156-2158 (2011). 752
53. S. R. Browning, B. L. Browning, Rapid and accurate haplotype phasing and 753 missing-data inference for whole-genome association studies by use of 754 localized haplotype clustering. The American Journal of Human Genetics 81, 755 1084-1097 (2007). 756
54. B. Browning, S. Browning, Genotype Imputation with Millions of Reference 757 Samples. American journal of human genetics 98, 116 (2016). 758
55. K. Tamura, G. Stecher, D. Peterson, A. Filipski, S. Kumar, MEGA6: 759 Molecular Evolutionary Genetics Analysis Version 6.0. Molecular Biology & 760 Evolution, 2725-2729 (2013). 761
56. P. S, N. B, T.-B. K, T. L, F. MA, B. D, M. J, S. P, d. B. PI, D. MJ, S. PC, 762 PLINK: A Tool Set for Whole-Genome Association and Population-Based 763 Linkage Analyses. American journal of human genetics 81, 559-575 (2007). 764
57. A. Stamatakis, RAxML version 8: a tool for phylogenetic analysis and post-765 analysis of large phylogenies. Bioinformatics 30, 1312 (2014). 766
58. N. Patterson, A. L. Price, D. Reich, Population structure and eigenanalysis. 767 PLoS genetics 2, e190 (2006); published online EpubDec 768 (10.1371/journal.pgen.0020190). 769
59. D. H. Alexander, J. Novembre, K. Lange, Fast model-based estimation of 770 ancestry in unrelated individuals. Genome research 19, 1655-1664 (2009). 771
60. D. J. Lawson, G. Hellenthal, S. Myers, D. Falush, Inference of population 772
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
32
structure using dense haplotype data. PLoS genetics 8, e1002453 (2012). 773 61. J. K. Pickrell, J. K. Pritchard, Inference of population splits and mixtures from 774
genome-wide allele frequency data. PLoS genetics 8, e1002967 775 (2012)10.1371/journal.pgen.1002967). 776
62. S. Schiffels, R. Durbin, Inferring human population size and separation history 777 from multiple genome sequences. Nature genetics 46, 919-925 (2014); 778 published online EpubAug (10.1038/ng.3015). 779
63. J. Terhorst, J. A. Kamm, Y. S. Song, Robust and scalable inference of 780 population history from hundreds of unphased whole genomes. Nature 781 genetics 49, 303-309 (2017); published online EpubFeb (10.1038/ng.3748). 782
64. R. N. Gutenkunst, R. D. Hernandez, S. H. Williamson, C. D. Bustamante, 783 Inferring the joint demographic history of multiple populations from 784 multidimensional SNP frequency data. PLoS genetics 5, e1000695 (2009); 785 published online EpubOct (10.1371/journal.pgen.1000695). 786
65. N. Patterson, P. Moorjani, Y. Luo, S. Mallick, N. Rohland, Y. Zhan, T. 787 Genschoreck, T. Webster, D. Reich, Ancient admixture in human history. 788 Genetics 192, 1065-1093 (2012); published online EpubNov 789 (10.1534/genetics.112.145037). 790
66. S. Soraggi, C. Wiuf, A. Albrechtsen, Powerful Inference with the D-Statistic 791 on Low-Coverage Whole-Genome Data. G3 8, 551-566 (2018); published 792 online EpubFeb 2 (10.1534/g3.117.300192). 793
67. S. R. Browning, B. L. Browning, Y. Zhou, S. Tucci, J. M. Akey, Analysis of 794 Human Sequence Data Reveals Two Pulses of Archaic Denisovan Admixture. 795 Cell 173, 53-61 e59 (2018); published online EpubMar 22 796 (10.1016/j.cell.2018.02.031). 797
68. C. Xie, X. Mao, J. Huang, Y. Ding, J. Wu, S. Dong, L. Kong, G. Gao, C. Y. Li, 798 L. Wei, KOBAS 2.0: a web server for annotation and identification of 799 enriched pathways and diseases. Nucleic Acids Res 39, W316-322 (2011); 800 published online EpubJul (10.1093/nar/gkr483). 801
69. H. Li, Minimap and miniasm: fast mapping and de novo assembly for noisy 802 long sequences. Bioinformatics 32, 2103-2110 (2016); published online 803 EpubJul 15 (10.1093/bioinformatics/btw152). 804
70. A. Heger, L. Holm, Rapid automatic detection and alignment of repeats in 805 protein sequences. Proteins: Structure, Function, and Bioinformatics 41, 224-806 237 (2000). 807
71. H. Whitlock, Some modifications of the McMaster helminth egg-counting 808 technique and apparatus. Journal of the Council for Scientific and Industrial 809 Research. Australia. 21, 177-180 (1948).810
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
33
Acknowledgments 811
We thank members of the Nextgen project for sharing their data. We thank Lobna 812
Mohammed for providing Nubian ibex hybrid blood samples. We thank High-813
Performance Computing (HPC) of NWAFU for providing computing resources. 814
Funding: This project was supported by grants from the National Natural Science 815
Foundation of China (31822052 and 31572381 to Y.J., 31572369 to Y.Chen), the 816
Talents Team Construction Fund of Northwestern Polytechnical University (NWPU) 817
and Strategic Priority Research Program of CAS (XDB13000000) to W.W. and the 818
Villum Foundation (VKR 023447) to R.H.. The Tang scholar at Northwest A&F 819
University (NWAFU) to X.-L.W.. Author contribution: Y.J. and Y.Chen conceived 820
the project and designed the research. Z.Z., M.L., and X.-H.W., performed the 821
majority of the analysis with contributions from Z.Y., Y.L., Y.Cai, Q.C., J.Liu, K.W., 822
X.P., Y.W., S.H., M.G., T.Z., Y.Z., Y.G., Y.X., and Y.Y.. X.-L.W., W.Zhang, J.H., 823
L.Chen, A.E., and M.O. prepared the modern DNA samples. J.Lesur provided the 824
historical sample of West Caucasian tur. D.C. prepared the ancient samples. X.-H.W., 825
Z.Z., Y.J., and M.L. drafted the manuscripts with input from all authors, and Y.J., 826
W.W., R.H., G.Z, K.D., D.B., L.Colli, and T.S.S. revised the manuscript. Competing 827
interests: The authors declare that they have no competing interests. Data and 828
materials availability: Individual genome sequence data are available at the 829
Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) under accession code 830
PRJNA387635 and PRJNA361447.831
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
34
Supplementary Materials: 832
Supplementary Materials and Methods 833
Supplementary Text 834
Figs. S1 to S40 835
Tables S1 to S15 836
Legends for Data file S1 to S7837
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
35
Figures 838
839
Fig. 1. Geographic distribution and genetic affinities of wild and domestic goats 840
used in this study. (A) Locations of wild (squares), domestic (circles), and ancient 841
(pentagon) goats for each major geographic group. Domestic goats are colored to 842
mirror their geographic origin. (B) A neighbor-joining tree from the genome 843
sequences used in this study. The branches are colored following the same color code 844
used in panel (A). (C and D) Principal component analysis (PCA) of bezoars and 845
domestic goats (C) and domestic goats only (D). (E) ADMIXTURE results for k = 3 846
and k = 6, which had the low cross-validation error.847
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
36
848
Fig. 2. Gene flow during the early stage of goat domestication and diffusion. (A) 849
Geographic distribution of the wild Capra species and the dispersal routes of 850
domestic goats out of their domestication areas (3, 7). (B) Allele sharing between 851
domestic goats and bezoars. (C) Allele sharing between domestic goats and ibex-like 852
species. Statistically significant results, defined by |Z-scores| ≥ 3, are marked with a 853
red asterisk for panel (B) and panel (C). (D) A heatmap of D statistic testing for the 854
differential affinity between SWA (blue) and H2 (red), where H2 represents the 855
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
37
individual/population of bezoars and domestic goats located in the map. The East 856
Asian goats are depicted in the box in the upper right corner. Positive D (H1, H2; H3, 857
Outgroup) values indicate more allele sharing between H2 and H3 while negative D 858
values indicate more allele sharing between H1 and H3. (E) A scatter plot of 859
introgressed haplotype frequency in EUR-AFR and Asian goat populations. The red 860
dots represent immune-related loci. (F) Haplotype pattern at the MUC6 locus defined 861
by putatively introgressed variants (yellow, predicted introgressed allele; red, 862
predicted non-introgressed allele). (G) Haplotype network from the number of 863
pairwise differences at the MUC6 non-repeat region. All domestic goats are colored in 864
green, and wild Capra species are divided into five subgroups, including the 865
haplotypes from hybrids of Nubian ibex × domestic goat. The radius of the pie chart 866
and the width of edges were log2-transformed.867
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
38
868
Fig. 3. Genomic regions with selection signals in domestic goats. (A) Distribution 869
of the pairwise fixation index (FST) (x-axis), π ln ratio (y-axis) and value of XP-EHH 870
(color) between bezoars and domestic goats. The dashed vertical and horizontal lines 871
indicate the significance threshold (corresponding to Z test P < 0.005, where FST > 872
0.195, π ln ratio > 0.395, and XP-EHH > 2.1) used for extracting outliers. (B) KEGG 873
pathways identified as significantly overrepresented (hypergeometric test, adjusted P 874
< 0.01). (C and D) Selective sweep and distribution of the recombination (RHO) on 875
chromosome 29 (chr29:46.22-46.31 Mb) (C) and selective sweep on chromosome 15 876
(chr15:32.24-32.37 Mb) (D). The putative sweep region is additionally validated by 877
FST, Tajima’s D and composite likelihood ratio (CLR) test. Horizontal dashed lines 878
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
39
represent the whole-genome mean for the corresponding parameters. Gene 879
annotations in the sweep region, nonsynonymous SNPs with FST > 0.88 (C), and 880
SNPs nearly fixed for derived alleles in domestic goats (D) are indicated at the bottom.881
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
40
882
Fig. 4. The association of MUC6 with gastrointestinal nematodes resistance. (A) 883
Immunohistochemistry for MUC6 in abomasum pyloric and duodenal bulb of a goat. 884
Photomicrographs at 4X, 20X are shown on the left and in the middle. The negative 885
controls (20X) are shown on the right. (B) The statistical association between MUC6 886
genotype and the fecal egg counts for gastrointestinal nematodes. Wilcoxon’s rank-887
sum test was employed to compute the P values.888
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint
41
889
Fig. 5. The emergence and diffusion of domestic STIM1-RRM1 and MUC6 890
haplotypes are concurrent with the spread of mtDNA haplogroup A. (A) The 891
temporal changes in the frequency of the STIM1-RRM1D, MUC6D and mtDNA 892
haplogroup A from pre-domestication bezoars to modern domestic goats. The dates 893
are expressed as cal. BP. (B) Genotypes of STIM1-RRM1 and MUC6, and mtDNA 894
and Y-chromosome haplogroups. The presences in homozygous or heterozygous 895
states are shown in green and light green, respectively. The absence of the domestic 896
allele is depicted in light pink. The light grey color symbolizes missing information. 897
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted January 15, 2020. . https://doi.org/10.1101/2020.01.14.905505doi: bioRxiv preprint