parallel concerted evolution of ribosomal protein genes in ... · rajeh1,3, ajay chatrath1, zhenguo...

36
Parallel Concerted Evolution of Ribosomal Protein Genes in Fungi and Its Adaptive Significance Alison Mullis 1# , Zhaolian Lu 1# , Yu Zhan 1# , Tzi-Yuan Wang 2 , Judith Rodriguez 3 , Ahmad Rajeh 1,3 , Ajay Chatrath 1 , Zhenguo Lin 1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103 2 Biodiversity Research Center, Academia Sinica, Nankang, Taipei, Taiwan 3 Program of Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO USA 63103 # These authors contributed equally to this work * To whom correspondence should be addressed Zhenguo Lin, Email: [email protected] Phone: 314-977-9816 Keywords: ribosomal proteins, gene duplication, gene conversion, concerted evolution, fermentation

Upload: others

Post on 09-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

1

Parallel Concerted Evolution of Ribosomal Protein Genes in 1

Fungi and Its Adaptive Significance 2

3

Alison Mullis1#, Zhaolian Lu1#, Yu Zhan1#, Tzi-Yuan Wang2, Judith Rodriguez3, Ahmad 4

Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 5

6

1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103 7

2 Biodiversity Research Center, Academia Sinica, Nankang, Taipei, Taiwan 8

3 Program of Bioinformatics and Computational Biology, Saint Louis University, St. 9

Louis, MO USA 63103 10

11

12

# These authors contributed equally to this work 13

* To whom correspondence should be addressed 14

Zhenguo Lin, Email: [email protected] 15

Phone: 314-977-9816 16

17

Keywords: ribosomal proteins, gene duplication, gene conversion, concerted evolution, fermentation 18

Page 2: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

2

ABSTRACT 19

Ribosomal proteins (RPs) genes encode structure components of ribosomes, the cellular 20

machinery for protein synthesis. A single functional copy has been maintained in most of 78-80 21

RP families in animals due to evolutionary constraints imposed by gene dosage balance. Some 22

fungal species have maintained duplicate copies in most RP families. How the RP genes were 23

duplicated and maintained in these fungal species, and their functional significance remains 24

unresolved. To address these questions, we identified all RP genes from 295 fungi and inferred 25

the timing and nature of gene duplication for all RP families. We found that massive duplications 26

of RP genes have independently occurred by different mechanisms in three distantly related 27

lineages. The RP duplicates in two of them, budding yeast and Mucoromycota, were mainly 28

created by whole genome duplication (WGD) events. However, in fission yeasts, duplicate RP 29

genes were likely generated by retroposition, which is unexpected considering their dosage 30

sensitivity. The sequences of most RP paralogs in each species have been homogenized by 31

repeated gene conversion, demonstrating parallel concerted evolution, which might have 32

facilitated the retention of their duplicates. Transcriptomic data suggest that the duplication and 33

retention of RP genes increased RP transcription abundance. Physiological data indicate that 34

increased ribosome biogenesis allowed these organisms to rapidly consuming sugars through 35

fermentation while maintaining high growth rates, providing selective advantages to these 36

species in sugar-rich environments. 37

Page 3: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

3

INTRODUCTION 38

Gene duplication has served as a driving force for the evolution of new phenotypic traits and 39

contributed to adaptation of organisms to their specific niches (Ohno 1970; Sidow 1996). 40

Duplicate genes are mainly generated by chromosome or whole genome duplication (WGD), 41

unequal crossing-over, and retroposition (Zhang 2013). Similar to other types of mutations, only 42

a small portion of duplicate genes can be eventually fixed in a population, and the survivors are 43

usually advantageous to the organisms (Zhang 2003; Kondrashov and Kondrashov 2006). 44

Highly diverse retention patterns of duplicate genes have been observed among gene families 45

(Hahn, et al. 2005). For instance, tens to hundreds of odorant receptors (ORs) genes can be 46

found in metazoan genomes (Sanchez-Gracia, et al. 2009). In contrast, many genes have been 47

maintained as a single copy since the divergence of eukaryotes, such as the DNA repair genes 48

RAD51, MSH2, and MLH1 (Lin, et al. 2006; Lin, et al. 2007; Zeng, et al. 2014). 49

Another notable example is the gene families encoding for cytosolic ribosomal proteins 50

(RPs), which are the structural components of ribosomes. Ribosomes carry out one of the most 51

fundamental processes of living systems by translating genetic information from mRNA into 52

proteins. In eukaryotes, each ribosome consists of two subunits, the small and large subunit, 53

which consist of 78-80 different RPs and four types of ribosomal RNAs (rRNA) (Wool 1979; 54

Wimberly, et al. 2000). RP genes are highly conserved in all domains of life (Korobeinikova, et 55

al. 2012). Each RP was found to have unique amino acid sequences with very limited to none 56

similarities between each other. For most animals studied, only a single functional copy of RP 57

gene is maintained in each family, although many processed pseudogenes may be found 58

(Dudov and Perry 1984; Kuzumaki, et al. 1987; Kenmochi, et al. 1998). As structural 59

components of the highly expressed macromolecular complex, the evolutionary constraints on 60

duplicate RP genes was believed to be imposed by gene dosage balance (Birchler and Veitia 61

2012). In plants, attributing to polyploidization or WGD events, multiple gene copies are usually 62

present in each RP families in polyploid plants (Vision, et al. 2000; Barakat, et al. 2001). This is 63

probably because all RP genes were duplicated simultaneously by WGD, allowing maintenance 64

of balanced dosage among RPs (Birchler and Veitia 2012). 65

Similar to polyploid plants, most of RP families in the budding yeast Saccharomyces 66

cerevisiae have duplicate copies due to the occurrence of a WGD (Wolfe and Shields 1997; 67

Kellis, et al. 2004). Many RP paralogous genes in S. cerevisiae generated by WGD, or RP 68

ohnologs, are more similar to each other than to their orthologous genes due to interlocus gene 69

Page 4: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

4

conversion (Evangelisti and Conant 2010; Casola, et al. 2012). During a interlocus gene 70

conversion, one gene serves as a DNA donor that replaces the sequences of its paralogous 71

gene (Chen, et al. 2007). As a result, the sequences of paralogous genes have been 72

homogenized, resulted in the ancient duplicate events to appear much more recent, which was 73

called “concerted evolution” (Brown, et al. 1972). One of the best-known examples of concert 74

evolution is the genes encoding RNA component of ribosomes, the rRNA genes, in both 75

prokaryotes and eukaryotes (Arnheim, et al. 1980; Schlotterer and Tautz 1994; Blattner, et al. 76

1997). 77

According to the Ribosomal Protein Gene Database (RPG) (Nakao, et al. 2004), three of ten 78

fungal species listed have multiple gene copies in most RP families, including S. cerevisiae, a 79

fission yeast Schizosaccharomyces pombe, and a pin mold Rhizopus oryzae. The duplicate RP 80

genes in R. oryzae could be generated by a WGD event it is ancestor (Ma, et al. 2009), 81

although it has not been systematically examined. In addition, most RP families have more than 82

four gene copies, which cannot be explained by a single WGD. Unlike S. cerevisiae and R. 83

oryzae, no WGD has been detected during the evolution of Sch. pombe (Rhind, et al. 2011), 84

suggesting that each RP family might be duplicated independently by small scale duplication 85

(SSDs) events. This observation is unexpected because the duplicate genes encoding 86

macromolecules generated by SSDs are much less likely to survive because they are sensitive 87

to gene dosage balance (Li, et al. 1996; Conant and Wolfe 2008). It remains an unexplored 88

dimension about how RP genes have been duplicate and maintained in fungi, particularly in the 89

fission yeast Sch. pombe. 90

The expression of RP genes has been thoroughly linked to growth and proliferation, 91

reflecting their central role in the regulation of growth in yeast (Montagne, et al. 1999; 92

Jorgensen, et al. 2002; Brauer, et al. 2008). In rapid growth yeast cells, ~50% of RNA 93

polymerase II (Pol II) transcription initiation events are devoted to RP expression (Warner 94

1999). Therefore, the duplication and retention of RP genes might have more functional impacts 95

on these microorganisms than animals or plants. Like other types of mutations, the occurrence 96

of gene duplication is largely due to scholastic events, but retention of duplicate genes have 97

been mainly driven by natural selection (Panchy, et al. 2016). A better understanding of 98

evolutionary fates of RP duplicate genes could offer new insights into how gene duplication 99

produced adaptive solutions to microorganisms. 100

Page 5: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

5

To better understand the evolutionary patterns of RP genes and their adaptive significance, 101

we conducted systematic identification and evolutionary analyses of all RP families in all fungal 102

species with well-annotated genomes. We searched for RP genes from 295 fungal species and 103

identified independent duplications of most RP families in three distantly related fungal lineages. 104

We inferred the timing and nature of gene duplication for each RP family in each fungal lineage. 105

We found that a vast majority of RP paralogous genes have experienced repeated gene 106

conversion events that have homogenized their sequences in each species. In aligning with 107

integrative analyses of genomic, transcriptomic data and physiological data, we proposed that 108

the massive duplication, retention and concerted evolution of RP genes have contributed to the 109

evolution of fermentative lifestyle in fungal species. This study offers a classic example 110

illustrating the mechanisms and adaptive significance of maintaining duplicate genes encoding 111

macromolecules. 112

RESULTS 113

Massive duplications of RP genes found in three distantly related fungal lineages 114

To determine the prevalence of gene duplications in RP families in fungi, we first searched 115

for RP homologous genes for all fungal species with NCBI Reference Sequence (RefSeq) 116

protein data (Supplementary table 1). As of March 2019, 285 fungal species were annotated 117

with RefSeq protein data, covering five of the seven fungal phyla. We conducted BLASTP 118

searches against the 285 RefSeq protein datasets using amino acid sequences of RP genes 119

from both S. cerevisiae and Sch. pombe as queries (see Methods and Materials). Based on 120

BLASTP search results, we calculated the gene copy numbers in each RP family for every 121

examined species, and the total number of RP families with duplicate copies (Supplementary 122

table 1). 123

We considered a species with massive RP duplications if more than 50% (≥ 40) of RP 124

families have duplicate copies. Among the 285 fungi examined, only ten species meet the 125

criterion of massive RP duplication. The ten species distributes in three distantly related fungal 126

lineages: three in the class of Saccharomycetes (budding yeast), four in the class of 127

Schizosaccharomycetes (fission yeast), and three in the phylum of Mucoromycota 128

(Supplementary table 1). Although multiple hits of most CRP families were found in a budding 129

yeast Candida viswanathii, because the assembly type of its genome is diploid, the two hits 130

Page 6: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

6

found in most RP families are alleles instead of paralogous gene, it was not considered as 131

massive RP duplication. 132

Because protein annotations of a genome could be incomplete or inaccurate, manual 133

curation is required for a more accurate survey of RP repertoire. It is necessary to carry out a 134

second-round identification of RP genes with manual curation, focusing on the three fungal 135

lineages. We selected 24 species from the three fungal lineages, including the ten species with 136

massive RP duplication. To provide a more even distribution of taxonomic groups in each 137

lineage, we included ten other species whose genomic data are available in NCBI Whole 138

Genome Shotgun (WGS), Yeast Gene Order Browser (YGOB) and JGI (Byrne and Wolfe 2005; 139

Maguire, et al. 2013). In total, our second-round search examined 34 fungal species, which 140

includes 23 Saccharomycetes species, five Taphrinomycotina species (including the four fission 141

yeasts), and six Mucoromycota species (fig. 1, Supplementary table 2). The phylogenetic 142

relationships of the 34 species were inferred using the amino acid sequences of the largest 143

subunit of RNA Pol II proteins (Supplementary fig. 1). Including the 285 species analyzed in our 144

first round analysis, we have examined a total number of 295 fungal genomes in the two rounds 145

of RP gene searches, representing the largest scale of RP gene survey in fungi to our 146

knowledge. 147

To manually curate RP repertoire in a genome, we performed both BLASTP and TBLASTN 148

searches for each of the 34 fungal species. By comparing BLASTP and TBLASTN search 149

results, we identified discrepancies in the number of RP genes and aligned regions. We found 150

that many TBLASTN hits were not present in BLASTP searches, indicating the presence of 151

unannotated RP genes. Thus, we have manually predicted 259 novel RP genes from 32 of the 152

34 species. We also revised the annotations of open reading frame (ORF) for 95 RP genes. In 153

total, we identified a total number of 3950 RP genes from the 34 fungal species (Supplementary 154

tables 2 and 3). 155

We constructed maximum likelihood (ML) phylogenetic trees for each RP family (see 156

Materials and Methods). Similar tree topologies were observed among RP families with 157

duplicate copies (Supplementary File 1). For instances, two copies of RPL6 genes are present 158

in all ten post-WGD budding yeasts and all four fission yeast species (fig. 2A). More copies of 159

RPL6 genes are present in Mucoromycota species Lobosporangium transversale, Phycomyces 160

blakesleeanus, Rhizopus microspores, and Rhizopus delemar (former name R. oryzae), which 161

have 2, 4, 3, and 5 copies of RPL6 genes, respectively. According to the ML tree (fig. 2A), the 162

Page 7: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

7

RPL6 genes are more closely related to their paralogous genes in each species, instead of their 163

orthologous genes. Similar patterns are present in the RPS19 genes, which encode a ribosomal 164

small subunit protein (fig. 2B). These tree topologies suggest that the RP genes have been 165

independently duplicated in each species after their divergence from each other. However, at 166

least in the post-WGD budding yeasts, it has been documented that RPL6 and RPS19 were 167

generated by the WGD occurred prior to the divergence of S. cerevisiae and Vanderwaltozyma 168

polyspora (Conant and Wolfe 2006). Therefore, the phylogenetic trees do not accurately depict 169

the evolutionary history of RPL6 and RPS19 families in budding yeasts. It has been shown that 170

RPL6 and RPS19 genes have experienced gene conversion during the evolution of S. 171

cerevisiae, which explains the discrepancy between the tree topology and their duplication 172

history in the budding yeast (Evangelisti and Conant 2010; Casola, et al. 2012). However, it is 173

not known whether it is the same case in the fission yeast and Mucoromycota species, which 174

requires accurate timing of duplication events in the two lineages. 175

Based on these cases, we cannot infer the evolutionary history of RP genes in fungi solely 176

based on the topology of phylogenetic trees due to the possibility of gene conversion. It is 177

necessary to carry out additional analyses to determine when gene duplication events have 178

occurred. Because only a small number of species have experienced massive RP duplication in 179

each of the three fungal lineages (fig. 1 and Supplementary table 2), the most parsimonious 180

scenario is that the expansion of RP genes in each fungal lineage occurred independently. In 181

our subsequent analyses, we separately inferred the timing and nature of gene duplications for 182

each RP family and determined whether they have experienced gene conversion after gene 183

duplication in each lineage. 184

Duplication and concerted evolution of RP genes in the budding yeasts 185

We manually identified all RP genes for the 23 Saccharomycetes species (budding yeasts). 186

A total number of 59 RP families have duplicate copies in most WGD species (fig. 1 and 187

Supplementary table 2). 55 of them are ohnologs generated by an ancestral WGD (Conant and 188

Wolfe 2006). The other four RP families, including RPP1, RPP2, RPL9, and RPS22, have 189

duplicates in most budding yeasts, including these non-WGD species, suggesting that they 190

have been duplicated before the divergence of all budding yeasts. Therefore, most post-WGD 191

budding yeasts have a significant increase in RP gene number. 135-137 RP genes are present 192

in the four species in the Saccharomyces sensu stricto group, S. cerevisiae, S. mikatae, S. 193

uvarum, and S. eubayanus. The two Naumovozyma species have 134 and 137 RP genes, 194

Page 8: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

8

respectively. The numbers are relatively smaller (106 and 104) in the other two early-diverging 195

post-WGD species, Tetrapisispora phaffii and Vanderwaltozyma polyspora. In contrast, the 196

human opportunistic pathogen Candida glabrata (Nakaseomyces glabrata) and its closely 197

related species, Nakaseomyces bacillisporus, have only 85 and 97 RP genes respectively, 198

suggesting most RP duplicate genes have been lost in these species. 199

Because the timings of these RP duplications in budding yeast have already been 200

determined, it is possible to infer which RP paralogs have experienced gene conversion by 201

comparing their gene tree with their true duplication history. We can also use the tree topology 202

to infer when concerted evolution had terminated, which is the time when the paralogous genes 203

started to accumulate mutations independently in each paralogous gene. To simplify this 204

process, we constructed phylogenetic trees for each duplicate RP family using five 205

representative WGD species with different divergence times, including S. cerevisiae, S. 206

mikatae, S. eubayanus, N. castellii and T. phaffii (fig. 3 and Supplementary File 2). We found 207

that at least 52 RP duplicate pairs in S. cerevisiae have experienced gene conversion, including 208

50 pairs generated by WGD and two pairs (RPL9 and RPS22) generated by the ancient 209

duplication events (fig. 3). Therefore, 88% (52 out of 59) of RP paralogous genes in S. 210

cerevisiae have experienced gene conversion, which is more than that of previously identified 211

(16 and 29) (Evangelisti and Conant 2010; Casola, et al. 2012), suggesting that concerted 212

evolution of RP genes in the budding yeasts is more prevalent than previously recognized. 213

Based on gene tree topologies, we inferred when converted evolution of RP genes had 214

terminated during the evolution of budding yeasts. In 21 RP families, two copies of each RP 215

family gene in S. cerevisiae form a species-specific clade in gene tree (fig. 3A and 216

Supplementary File 2), suggesting that the concerted evolution is still ongoing in S. cerevisiae or 217

has recently terminated after its divergence from S. mikatae. In seven RP families, termination 218

of concerted evolution occurred before the split between S. cerevisiae and S. mikatae (fig. 3B). 219

24 RP families have ended their concerted evolution before the divergence of the 220

Saccharomyces sensu stricto group, which including S. cerevisiae, S. mikatae, S. eubayanus 221

(fig. 3C). Only seven RP gene pairs do not show strong evidence of gene conversion (fig. 3 D). 222

A summary of the termination time of concerted evolution in 59 RP pairs in S. cerevisiae was 223

provided in fig. 3E. 224

Duplication and concerted evolution of RP genes in the fission yeasts 225

Page 9: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

9

We identified all RP genes for five species in the subphylum of Taphrinomycotina, including 226

the four fission yeasts and Pneumocystis murina. P. murina belongs to the class of 227

Pneumocystidomycetes, which is probably the most closely related lineage to the fission yeasts, 228

and it was used as an outgroup to infer the evolutionary history of RP genes. 142 to 145 RPs 229

are present in the four fission yeast species. The number of RP families with duplicate copies 230

range from 58 to 59 in the four species (fig. 1 and Supplementary table 2). Most of them have 231

two gene copies, but three copies of RP genes are present in 6 RP families (fig. 1). Only 78 RP 232

genes were identified in P. murina. Thus, it is reasonable to assume that the massive expansion 233

of RPs genes in the fission yeasts occurred after their divergence with P. murina. 234

Similar to the budding yeasts, most paralogous RP genes in each fission yeast are more 235

similar to each other than to their orthologous genes (fig. 2 and Supplementary File 3). The tree 236

topologies indicate that these RP genes were duplicated independently in each fission yeast 237

after their divergence. However, we should consider the possibility of gene duplication which 238

resulted in underestimation the ages of duplicate genes. To infer when gene duplication 239

occurred, we conducted gene collinearity (microsynteny) analysis for all duplicate RP genes in 240

the four fission yeasts (see Methods and Materials). If an RP gene was duplicated 241

independently in each species after their divergence, the daughter genes are expected to be 242

found in different genomic regions among these species. Under this scenario, only the parental 243

copy of RP genes share microsynteny by them, and it is extremely unlikely that they also share 244

conserved regions of microsynteny around the daughter genes. On the other hand, if these 245

fission yeasts share microsynteny in both copies of RP genes, the two copies should be created 246

by a single gene duplication event in their common ancestor. 247

We first identified all orthologous groups in the four fission yeasts (Supplementary table 4). 248

Based on gene order and ortholog group information, we analyzed microsynteny for each pair of 249

RP genes. Herein, we defined a conserved region of microsynteny as a block containing three 250

or more conserved homologs within five genes downstream and upstream of an RP gene (fig. 251

4A). In Sch. pombe, 58 RP families have at least two gene copies. In each of the RP families, 252

the duplicate genes share microsynteny in Sch. pombe, Sch. cryophilus, and Sch. octosporus 253

(Supplementary table 5). This result suggests that duplicate pairs in the 58 RP families were 254

generated at least before the divergence of the three fission yeasts, which have occurred ~119 255

million years ago (mya) (Rhind, et al. 2011). We then inferred how many of gene duplication 256

events occurred even before the split of Sch. japonicas, approximately 220 mya (Rhind, et al. 257

2011). Thanks to the highly conserved gene order in fission yeasts (Rajeh, et al. 2018), we were 258

Page 10: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

10

able to detect the presence of microsynteny in both RP duplicates in 49 families in Sch. 259

japonicus, suggesting that gene duplication of these RP families have occurred before the 260

divergence of the four fission yeasts (Supplementary table 5). For example, two copies of 261

RPL11 genes are found in each Schizosaccharomyces species. Highly conserved regions of 262

microsynteny surrounding RPL11A genes were found in all four fission yeasts, and so were the 263

RPL11B genes (fig. 4A), supporting that RPL11 was duplicated in their common ancestor and 264

both copies have been maintained in each fission yeast after their divergence. 265

For the rest 9 RP families, we did not obtain conclusive evidence to determine whether they 266

were duplicated before the split of Sch. japonicus. Four of them (RPL10, RPL30, RPS12, and 267

RPS25) has only a single RP copy present in Sch. japonicus. These genes could be duplicated 268

in the common ancestor of fission yeasts, and a duplicate copy has subsequently lost in Sch. 269

japonicus. Alternatively, the duplication events have occurred after the split of Sch. japonicus 270

from the other species. In the other five RP families (RPL3, RPL17, RPL18, RPL21, and 271

RPS19), only one RP copy in Sch. japonicus share microsynteny with the other three species. 272

Similarly, the duplication events of these RP families could be predated to the divergence of 273

fission yeasts, following by genome rearrangements in Sch. japonicus that resulted in the loss of 274

its gene collinearity. However, we cannot exclude the possibility that they were generated by 275

independent duplication events in Sch. japonicus. 276

To determine the number of RP families has an incompatibility between gene phylogenetic 277

tree and duplication history, we constructed a phylogenetic tree for each RP family with 278

duplicates in the fission yeasts. In the case of RPL11, contradict to the gene true duplication 279

history as inferred by microsynteny analysis (fig. 4A), the phylogenetic tree shows that RPL11 280

paralogs from species-specific clades in each fission yeast (Fig. 4B). Such incompatibility 281

suggests that gene conversion has occurred between RPL11 paralogous genes in each fission 282

yeast after their divergence. A total number of 45 RP families (77.6%) in fission yeasts have a 283

similar tree topology to RPL11 (Supplementary File 3). In other families, such as RPS17, the 284

two copies of RP genes from Sch. pombe, Sch. octosporus and Sch. cryophilus form two clades 285

and each clade have one gene copy from the three species. We observed nine RP families 286

similar to RPS17, suggesting that concerted evolution of these RP genes might have been 287

terminated before their divergence (fig. 4C). However, we did not find evidence of gene 288

conversion in four RP families, including RPL30, RPS5, RPS12 and RPS28 (fig. 4D and E). 289

Page 11: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

11

Retroposition as a major mechanism for massive duplication of RP genes in the 290

ancestral fission yeast 291

Because no WGD was detected during the evolution of Sch. pombe (Rhind, et al. 2011), we 292

then inferred other mechanisms that resulted in massive duplication of RP genes in the fission 293

yeasts, such as unequal crossing-over and retroposition. Unequal crossing-over typically 294

generates segmental or tandem gene duplicates. If a pair of genes was generated by segmental 295

duplication, we expect to observe microsynteny between regions of paralogous RP genes within 296

a species. However, we did not find any case in these RP families (Supplementary table 5). 297

Furthermore, we did not detect tandemly arranged RP paralogous genes, suggesting that 298

unequal crossing-over is not a main contributor for RP duplications in the fission yeasts either. 299

Retroposition generates retroduplicates through random insertion of a retrotranscribed 300

cDNA from parental source genes, resulting in intron-less retroduplicate genes (Kaessmann, et 301

al. 2009). We examined the exon-intron structure for all RP paralogous genes in Sch. pombe. 302

Among the 21 singleton RP families in Sch. pombe, only 7 of them (33.3%) are intron-less 303

(Supplementary table 6). In contrast, 33 of 58 duplicate RP families (56.9%) have at least one 304

copy of intron-less gene, which is significantly higher than the group of singleton RPs (p = 305

0.006, Fisher exact test). This ratio is also significantly higher than 27.3% of RP duplicates 306

generated by WGD in S. cerevisiae. Thus, the enrichment of intron-less RP genes in the 307

duplicate RP families in fission yeast suggests that they were likely generated by retroposition. 308

For those RP duplicates with intron in both copies, the possibility that they may be created by 309

retroposition following by insertion of intron cannot be excluded, because the locations and 310

phases of introns between these paralogous RP genes in Sch. pombe are usually different. 311

Duplication and concerted evolution of RP genes in the Mucoromycota species 312

Four Mucoromycota species examined demonstrate massive duplication of RP genes. 313

Three of them belong to the order of Mucorales (pin molds) in subphylum of Mucoromycotina 314

(311 RP genes in R. delemar, 182 in R. microspores, and 217 in P. blakesleeanus). In their 315

distantly related species in the same subphylum, Bifiguratus adelaidae, only 89 RP genes were 316

found. Massive duplication of RP genes (137 RP genes) was also observed in Lobosporangium 317

transversale, which is a distantly related species belonging to another subphylum 318

Mortierellomycotina. The earliest diverging species among all Mucoromycota species examined 319

is Rhizophagus irregularis, which has only 78 RPs genes (fig. 1). 320

Page 12: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

12

Based on RP gene copy numbers and the evolutionary relationships of these Mucoromycota 321

species, it is most parsimonious to conclude that massive expansion of RP genes in the three 322

pin mold species and L. transversale occurred independently. L. transversal is a rare species 323

that having only been reported by a few isolations in North American (Benny and Blackwell 324

2004). The genomic studies and physiological characterizations L. transversale are scarce. Due 325

to lack of genomic data from closely related species, we cannot provide a systematic inference 326

of the timing and nature of massive RP duplication in L. transversale. Thus, our subsequent 327

analysis only focused on the origin and evolution of RP duplicate genes in pin molds. 328

A WGD event has been proposed in ancestral R. delemar (Ma, et al. 2009). Another WGD 329

was speculated to have occurred in P. blakesleeanus prior to its divergence from R. 330

microspores and R. delemar (Corrochano, et al. 2016). Therefore, R. delemar might have 331

experienced two rounds of WGDs, which correlates with the largest RP repertoire (311) 332

identified in R. delemar, while P. blakesleeanus and R. microspores have 217 and 182 RP 333

genes respectively. Based on RP gene numbers, it is reasonable to conclude that the second 334

WGD occurred after the divergence of R. delemar from R. microspores. 335

We conducted microsynteny analysis to infer which RP gene pairs were generated by the 336

WGDs in the pin molds. The estimated divergence time between Phycomyces and Rhizopus is 337

over 750 mya (Mendoza, et al. 2014). Most, if not all, microsynteny blocks generated by the first 338

WGD might have lost during the evolution of these pin molds. Even though we have used a less 339

strict definition of microsynteny (a minimum of 3 shared homologs in a block of ± 10 neighboring 340

genes surrounding RP), we only identified 3 and 10 pairs of microsynteny blocks between 341

paralogous RP genes in R. microspores and P. blakesleeanus respectively. In contrast, in R. 342

delemar, which has experienced a second round of WGD after its divergence from R. 343

microspores, we detected microsynteny for 63 pairs of RP paralogous genes (Supplementary 344

table 7), supporting the recent WGD as a major contributor to the expansion of RP genes in R. 345

delemar. 346

We attempted to identify microsynteny for orthologous RP genes to infer the evolutionary 347

history of each RP family in pin molds (Supplementary table 8). Due to the large divergence 348

times between these species, most RP orthologous genes lack well-supported microsynteny 349

(Supplementary table 7). The most well-supported example is probably the RPL3 family (Fig. 350

5A). Based on the shared gene orders between paralogous and orthologous RPL3 genes, it is 351

reasonable to infer that RPL3 was duplicated before the divergence of the three pin mold 352

Page 13: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

13

species, probably due to the first WGD. In R. delemar, the two RPL3 genes have been further 353

duplicated by the recent WGD, generating four copies. However, their gene tree (fig. 5B) 354

demonstrates that the RPL3 paralogous genes in each species form a species-specific clade, 355

suggesting the occurrence of gene conversion between paralogous RPL3 genes in each 356

species. A total number of 57 RP families have a similar tree topology (Supplementary File 4). 357

Although there is no conclusive microsynteny evidence to support that these RP families have 358

the same evolutionary history as RPL3, we believed that it would be the most likely scenario. In 359

some RP families, such as RPL38 (fig. 5C), the genes from R. delemar and R. microspores 360

form two clades, and each clade has members from both species. There are 17 RP families 361

have a similar tree topology to RPL38. If these RP genes were the product of the ancient WGD, 362

their concerted evolution had terminated prior to the divergence of the two Rhizopus species. 363

The last type of tree topology, such as RPS20 (fig. 5D), whose members form two clades, and 364

each clade include genes from all the three pin mold species. Such tree topology does not 365

support the occurrence of gene conversion. We observed two RP families belonging to this 366

type. In summary, our results implied that most RP paralogous genes in pin molds might have 367

also experienced gene conversion, similar to what happened in budding yeasts and fission 368

yeasts. 369

cDNA as the probable donor for gene conversion between RP paralogous genes 370

During gene conversion, the genomic sequence of the ‘acceptor’ locus is replaced by a 371

'donor' sequence through recombination (Chen, et al. 2007). The donor can be genomic DNA or 372

cDNA derived from an mRNA intermediate (Derr and Strathern 1993; Storici, et al. 2007). If 373

genomic DNA is the donor, the sequences of both intron and exon can be homogenized. In 374

contrast, if cDNA is the donor, only the exon sequences of the acceptor are replaced. 375

Considering that synonymous mutations are largely free from natural selection, it is possible to 376

determine the donor of gene conversion by comparing the substitution rates between intron and 377

synonymous sites of exons. If the synonymous substitution rates (dS) are significantly lower than 378

intron mutation rates (μintron), supporting cDNA as a donor. We calculated dS and μintron for all RP 379

duplicate genes for presentative species from each fungal lineage: S. cerevisiae, Sch. pombe 380

and R. microspores (Supplemental table 9). Overall, the dS values of all paralogous RP genes 381

are significantly lower than μintron in each species examined (fig. 6A-C, Student’s t-test, p < 0.01). 382

Considering that different genomic regions might have different mutation rates, we then 383

compared the dS and μintron between each pair of RP duplicate genes (fig. 6D-E). Consistently, 384

most of RP duplicate gene pairs have lower dS values than μintron. In a small number of cases, 385

Page 14: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

14

high dS are observed, probably because the concerted evolution between a pair of orthologous 386

genes have terminated long time ago, resulting accumulation of many synonymous mutations. 387

These results suggest that, in most cases, only the coding sequences have been homogenized 388

by gene conversion, supporting cDNA as the probable gene conversion donor. 389

The retention of RP gene duplicates was associated with the evolution to fermentative 390

ability in fungi 391

Most eukaryotic species fully oxidize glucose, their primary carbon and energy source, 392

through mitochondrial oxidative phosphorylation in the presence of oxygen for maximum energy 393

production. In contrast, post-WGD budding yeasts and fission yeasts predominantly ferment 394

sugar to ethanol in the presence of excess sugars, even under aerobic conditions, which was 395

called aerobic fermentation (Alexander and Jeffries 1990; Lin and Li 2011a). Aerobic 396

fermentation has independently evolved in the budding yeasts and fission yeasts (de Jong-397

Gubbels, et al. 1996). In addition, the domesticated form of R. microspores has been a widely 398

used starter culture for the production of tempeh from fermented soybean (Hachmeister and 399

Fung 1993). Its close relative, R. delemar, was also well known as efficient ethanol and fumaric 400

acid producer by fermentation (Kito, et al. 2009; Straathof and van Gulik 2012). P. 401

blakesleeanush was known for capable of fermenting sugar into β-carotene at an industrial 402

scale, which is derived from the end product of glycolysis (Kaessmann, et al. 2009). 403

We speculated that the massive duplication and retention of RP genes have contributed to 404

the evolution of fermentative ability in these species. Increased gene dosage could lead to a 405

quantitative increase in gene expression and production of protein. To determine the impact of 406

gene duplication on the production of RP transcripts, we calculated the total transcription 407

abundance of all RP genes using our transcriptomic data generated by Cap Analysis of Gene 408

Expression (CAGE) (McMillan, et al. 2019). The CAGE technique captures and sequences the 409

first 75 bp of transcripts, which quantifies the transcription abundance based on numbers of 410

mapped reads (Murata, et al. 2014). Among the 34 species, 11 of them have available CAGE 411

data, including nine budding yeasts and two fission yeasts (Supplementary table 10). As shown 412

in fig. 7A, the RP copy numbers are positively corrected with total transcription abundance value 413

of all RP genes (Supplemental table 10, Pearson correlation r = 0.72), supporting that the 414

increased RP gene dosage might have increased ribosome biogenesis by generating more RP 415

transcripts. 416

Page 15: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

15

We then infer whether increased RP gene dosage is associated with better fermentative 417

ability. A previous study has measured various physiological characteristics for over 40 yeast 418

species (Hagman, et al. 2013), including 19 species examined in this study. We observed a 419

positive correlation between RP gene number and ethanol production efficiency (r = 0.80), and 420

glucose consumption rate (r = 0.76) (fig. 7B and C). We also observed a significant positive 421

correlation between total RP expression and both ethanol production efficiency (r = 0.87) and 422

glucose consumption rate (r = 0.88) (Supplementary fig. 2). These results suggest that the 423

increased RP expression by gene duplication might have enhanced these organisms’ ability to 424

rapidly consuming glucose through the fermentation pathway. 425

Discussion 426

The preferential retention of RP duplicate genes was selection-driven 427

Our survey of 295 fungal genomes revealed that massive duplications of RP genes are not 428

prevalent. However, a significant increase in RP gene copy numbers had independently 429

occurred a small number of species in three distantly related lineages in fungi. WGD events 430

have played an important role in the expansion RP repertoire in the budding yeasts and pin 431

molds. In the budding yeasts, only ~10% of WGD ohnologs have survived, while 70.5% of RP 432

duplicates generated by WGD have been maintained in S. cerevisiae. As indicated by previous 433

studies, the survival rate of RP ohnologs is significantly higher than the other WGD ohnologs 434

(Papp, et al. 2003). 435

Our results suggested that RP genes in the fission yeasts were likely individually duplicated 436

by small-scale duplication events (SSD), such as retroposition. In general, the gene retention 437

rate of SSDs is much lower than ohnologs (half-life of 4 million years vs 33 million years) 438

(Hakes, et al. 2007), it is even lower in genes encoding macromolecular complexes due tp 439

evolutionary constraints imposed by gene dosage balance (Li, et al. 1996; Conant and Wolfe 440

2008). Similar to the fission yeasts, 99.8% of RP duplicates in mammals were found to be 441

generated by retroposition (Dharia, et al. 2014). However, almost all RP retroduplicates in 442

mammals become pseudogenes (Dharia, et al. 2014). Therefore, the high retention rates of 443

functional RP duplicates generated by SSDs in each fission yeasts are indeed unexpected. A 444

reasonable explanation is that the increased RP gene dosage have provided selective 445

advantages to these species, natural selection favored the retention of RP duplicate genes. 446

Page 16: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

16

There is another line of evidence supporting that the retention of RP duplicate genes in 447

fission yeasts was selection-driven. The fission yeasts have been known to maintain a single 448

copy of genes in most gene families (Rhind, et al. 2011; Rajeh, et al. 2018). Based on our 449

orthologous group data (Supplementary table 4), in 86% (4069/4734) of fission yeast ortholog 450

groups, only a single copy gene is present in each of the four species (or 1:1:1:1 ortholog). Of 451

the gene families with gene duplication or loss in at least one fission yeast species, ribosomal 452

proteins account for 9.3% (62/665) of them, which is significantly overrepresented in this group 453

(p < 10-5, Fisher exact test). 454

How duplication and retention of RP genes contributed to the evolution of fermentative 455

ability in fungi 456

Our data suggested that the retention of RP duplication genes might have been driven by 457

their contributions to the evolution of strong fermentative ability in these organisms. The 458

fermentative yeasts were believed to have gained a growth advantage through rapid glucose 459

fermentation in the presence of excess sugars (Piskur, et al. 2006). It was found that S. 460

cerevisiae often outgrew its non-fermentative competitors in co-culture experiments (Pérez-461

Nevado, et al. 2006; Williams, et al. 2015). Fermentation is a much less efficient way to 462

generate energy. Through fermentation, each glucose molecule only yields 2 ATP from 463

glycolysis, compared to 32 ATP through mitochondrial oxidative phosphorylation pathway. In 464

sugar rich environments, fermentative organisms are able to produce more ATP per unit time by 465

rapidly consuming sugars through fermentation, providing selective advantages (Pfeiffer, et al. 466

2001; Pfeiffer and Morley 2014). The rapid glucose consumption was facilitated by the 467

increased glycolysis flux and more efficiently transporting glucose across cellular membranes. 468

The enhanced glycolytic activity is also present in many tumor cells, known as the “Warburg 469

effect” (Vander Heiden, et al. 2009; Diaz-Ruiz, et al. 2011). It has been shown that increased 470

copy number of genes related to glycolysis (Conant and Wolfe 2007) and glucose transporters 471

(Lin and Li 2011b) have played an important role in the switch of glucose metabolism. Similarly, 472

it is reasonable to propose that the increased RP gene dosage increased their transcript 473

abundance and ribosome biogenesis, resulting in increased biogenesis of glycolysis enzymes, 474

glucose transporters, and other building blocks for cell growth and proliferation. 475

Ribosome biosynthesis, however, comes with the opportunity cost of higher expression of 476

other cellular processes needed for cell viability and function. Maintaining one RP gene per 477

family may be advantageous for most other species to allow for greater Pol II transcription 478

Page 17: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

17

potential for other genes. However, massive duplication of RP genes made it possible to rapidly 479

consume glucose through the low-efficient fermentative pathway, and at the same time maintain 480

a high growth rate. Such physiological chrematistic provided selective advantages to the 481

organisms in sugar rich environments, which well explained the high retention rate of RP 482

duplicates in fungal species. 483

Gene duplication allows further increase of transcription abundance from highly 484

expressed RP genes 485

One may argue that increased of ribosome biosynthesis can also be achieved by elevated 486

transcription activities of RP genes. It is probably true because we also observed elevated 487

expression level of RP genes in the two non-WGD yeasts with an intermediate level of ethanol 488

fermentation ability: Lachancea thermotolerans and Lachancea waltii (Hagman, et al. 2013). 489

Across all eukaryotic species, RP genes are the group of most abundantly transcribed genes in 490

eukaryotic cells, accounting for 50% of RNA polymerase II (Pol II) transcription (Warner 1999). 491

RP genes have the highest density of bound Pol II. 100 RP genes have on average >60% of the 492

maximum Pol II occupancy, while a majority of the genome only has <5% of the maximum Pol II 493

density (Venters and Pugh 2009). Therefore, there is limited room for further increase the 494

transcription abundance of RP genes. However, duplication of RP genes provides an additional 495

substrate on which Pol II can transcribe into RP mRNAs, reducing transcription as a rate-limiting 496

step in the process of ribosome biogenesis. 497

Gene conversion facilitated retentions of RP duplicate genes 498

The sequences of all RP families are highly conserved during the evolution of eukaryotes 499

due to their vital roles in many cellular functions (Korobeinikova, et al. 2012). Because 500

misfolding and misinteractions of highly abundant proteins can be more costly, proteins likely 501

RPs should have been under more functional constraints and evolved even slower than other 502

important proteins (Zhang and Yang 2015). It has shown that there is a strong evolutionary 503

constraint posed on the duplicability of genes encoding core components of protein complexes 504

(Li, et al. 2006). Accumulation of new mutations in duplicate genes could impact the stability of 505

protein complexes, posing selective disadvantages. The other structure component of 506

ribosomes, rRNA, is one of the best-known examples of concerted evolution (Liao 1999; Nei 507

and Rooney 2005). 508

Page 18: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

18

Our study confirms that gene conversion following the duplication of RP genes appears to 509

be a universal path in fungal species. Through gene conversion, the sequences of paralogous 510

genes are homogenized, easing the new mutations accumulated in one of the paralogous 511

genes. It has been shown that highly-expressed genes are more likely to experience mRNA-512

mediated gene conversion (Weng, et al. 2000; Schildkraut, et al. 2006). Thus, as a group of 513

most actively transcribed genes (Warner 1999), the repeated occurrence of gene conversion on 514

RP genes might be due to the highly expressed nature of RP genes. It was also found that the 515

promoter or flanking genomic sequences are much more divergence than coding sequences 516

between paralogous RP genes (Evangelisti and Conant 2010), further supporting gene 517

conversion in RPs was mediated by cDNA. Gene conversion could provide an additional layer 518

of protection on top of purifying selection to remove newly accumulated mutations (Evangelisti 519

and Conant 2010; Scienski, et al. 2015). Increased RP genes copies could be advantageous to 520

fungal species in sugar-rich environments through fermentative growth. Therefore, the repeated 521

occurrence of gene conversions between RP paralogous genes have contributed to the 522

maintenance of functional RP duplicates in these organisms by removing newly accumulated 523

mutations. 524

Materials and Methods 525

Data sources, identification and manual curation of RP repertoire 526

We obtained a list of RP genes from S. cerevisiae and Sch. pombe from the Ribosomal 527

Protein Gene Database (RPG)(Nakao, et al. 2004). We downloaded RefSeq protein sequence 528

data of 285 fungal genomes from NCBI (Table S1). In the first round of homologous sequences, 529

we used RP sequences from S. cerevisiae and Sch. pombe as queries to run BLASTP search 530

against the 285 proteomic data (Camacho, et al. 2009). For BLASTP search, we used e-value 531

cutoff of 1e-10, and only hits with a minimum alignment length of 50% of query sequences were 532

considered as homologous RP genes. 533

In the second round of homologous searches, we obtained the protein and genome 534

sequences of 34 fungal species from NCBI WGS, JGI and Yeast Gene Order Browser (YGOB) 535

(Byrne and Wolfe 2005; Maguire, et al. 2013) (Supplementary table 2). We first used 79 RP 536

protein sequences from S. cerevisiae and Sch. pombe as queries to search for homologous 537

sequences from the ten newly added fungal species using BLASTP. To identify RP sequences 538

not predicted by existing genome annotations, we conducted TBLASTN searches against all 34 539

Page 19: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

19

genomic sequences. Manual inspections were performed to compare the BLASTP and 540

TBLASTN results to identify discrepant hits. For hits obtained by TBLASTN by not BLASTP, we 541

predicted the coding sequences (CDS) based on six frame translations of genomic sequences. 542

The exon-intron boundaries were determined based on the TBLASTN alignments and the 543

presence of GT/AG splice sites in flanking intron sequences. We also revised the predicted 544

protein sequences if there is a discrepancy in aligned regions between BLASTP and TBLASTN 545

results. The same gene prediction method was used to revise misannotated ORF. 546

Construction of phylogenetic tree for RP genes 547

We inferred the phylogeny for each of the 79 RP families using the RP protein sequences 548

collected from the 34 fungal species. Sequences were aligned through MUSCLE (Edgar 2004). 549

The molecular phylogenetic tree was inferred by the Maximum likelihood method using RAxML 550

with 100 bootstrap pseudo-replicates (Stamatakis 2006). The best-fit substitution model was 551

inferred by using ProtTest (Abascal, et al. 2005). As the best substitution models are the LG 552

model were identified for the majority of RP families, it was used in our phylogenetic 553

reconstruction. A discrete Gamma distribution [+G] and invariable sites [+I] was used to model 554

evolutionary rate differences among sites. We also constructed lineage-specific phylogenetic 555

trees for duplicate RP families using representative species from each of the three fungal 556

lineages using Neighbor-Joining method using MEGA 7 (Kumar, et al. 2016). For those RP 557

families with almost identical amino acid sequences, we used their CDS for construction of gene 558

trees to obtain better resolved phylogenetic trees (Supplementary Files 2-4). 559

Homology microsynteny analysis 560

We conducted microsynteny analysis for each RP gene family in the four 561

Schizosaccharomyces species and four Mucoromycotina species, including R. delemar, R. 562

microspores, P. blakesleeanus and B. adelaidae. Gene order information was retrieved from 563

genome annotations of each species obtained from NCBI. The orthologous gene groups in the 564

fours fission yeast species and four Mucoromycotina species were respectively identified using 565

the OrthoDB (Kriventseva, et al. 2014). For fission yeasts, we obtained a list of ten genes 566

surrounding each RP gene (five upstream and five downstream of RP gene). For the Mucorales 567

species, we extended our microsynteny analysis to a block of ten genes upstream and ten 568

downstream of RP gene due to their divergent genome structures. 569

Page 20: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

20

Estimation of substitution rates in intron and synonymous sites 570

We calculated the substitution rates for every pair of duplicate RP genes for three species 571

representing the three fungal lineages with massive RP duplications, including S. cerevisiae, 572

Sch. pombe, R. microspores. The CDS and intron sequences were retrieved from NCBI, and 573

were aligned using MUSCLE. Synonymous substitution rate was calculated using Li-Wu-Luo 574

method with Kimura 2-parameter model (Li, et al. 1985) in MEGA 7 (Kumar, et al. 2016). 575

Nucleotide substitution rates in intron sequences were calculated using the Kimura 2-parameter 576

model in MEGA 7. 577

Analysis of RP gene transcriptomic and physiological data 578

The transcriptomic data of nine budding yeasts and two fission yeasts examined were 579

obtained from (McMillan, et al. 2019) based on CAGE. The expression abundance of an RP 580

gene was defined as the sum of transcripts initiated from all core promoters within 500 base 581

pairs upstream of its annotated start codon, which were normalized as TPM (tags per million 582

mapped reads, and each tag represent one sequenced transcript). The total transcription 583

abundance of RP genes in a species was calculated as the sum of TPM of all RP genes 584

identified in this species. For newly predicted RP genes that were not annotated in CAGE 585

datasets, we performed TBLASTN searches to determine their genomic locations of CDS and 586

obtain the expression abundance data using the same criteria. The ethanol production efficiency 587

and glucose consumption rate of 19 budding and fission yeast species were obtained from 588

(Hagman, et al. 2013). The ethanol production efficiency was measured as grams of ethanol 589

produced per gram of biomass per gram of glucose consumed. The glucose consumption rate 590

was measured as grams of glucose consumed per gram of biomass per hour. If multiple 591

biological replicates were measured for a single species, their average values were used for our 592

analysis. 593

Acknowledgements 594

This study was supported by the start-up fund and President’s Research Fund from Saint 595

Louis University to ZL. We would like to thank Dr. Zhenglong Gu and Dr. Dapeng Zhang for 596

valuable discussions and suggestions to this work. 597

598

Page 21: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

21

References 599

600

Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection of best-fit models of protein 601

evolution. Bioinformatics 21:2104-2105. 602

Alexander MA, Jeffries TW. 1990. Respiratory efficiency and metabolite partitioning as 603

regulatory phenomena in yeasts. Enzyme Microb Technol 12:2-19. 604

Arnheim N, Krystal M, Schmickel R, Wilson G, Ryder O, Zimmer E. 1980. Molecular evidence 605

for genetic exchanges among ribosomal genes on nonhomologous chromosomes in man and 606

apes. Proc Natl Acad Sci U S A 77:7323-7327. 607

Barakat A, Szick-Miranda K, Chang F, Guyot R, Blanc G, Cooke R, Delseny M, Bailey-Serres J. 608

2001. The organization of cytoplasmic ribosomal protein genes in the Arabidopsis genome. 609

Plant Physiol 127:398-415. 610

Benny GL, Blackwell M. 2004. Lobosporangium, a new name for Echinosporangium Malloch, 611

and Gamsiella, a new genus for Mortierella multidivaricata. Mycologia 96:143-149. 612

Birchler JA, Veitia RA. 2012. Gene balance hypothesis: Connecting issues of dosage sensitivity 613

across biological disciplines. Proceedings of the National Academy of Sciences 109:14746-614

14753. 615

Blattner FR, Plunkett G, 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner 616

JD, Rode CK, Mayhew GF, et al. 1997. The complete genome sequence of Escherichia coli K-617

12. Science 277:1453-1462. 618

Brauer MJ, Huttenhower C, Airoldi EM, Rosenstein R, Matese JC, Gresham D, Boer VM, 619

Troyanskaya OG, Botstein D. 2008. Coordination of growth rate, cell cycle, stress response, 620

and metabolic activity in yeast. Mol Biol Cell 19:352-367. 621

Brown DD, Wensink PC, Jordan E. 1972. A comparison of the ribosomal DNA's of Xenopus 622

laevis and Xenopus mulleri: the evolution of tandem genes. J Mol Biol 63:57-73. 623

Byrne KP, Wolfe KH. 2005. The Yeast Gene Order Browser: combining curated homology and 624

syntenic context reveals gene fate in polyploid species. Genome research 15:1456-1461. 625

Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. 2009. 626

BLAST+: architecture and applications. BMC Bioinformatics 10:421. 627

Casola C, Conant GC, Hahn MW. 2012. Very low rate of gene conversion in the yeast genome. 628

Mol Biol Evol 29:3817-3826. 629

Chen JM, Cooper DN, Chuzhanova N, Ferec C, Patrinos GP. 2007. Gene conversion: 630

mechanisms, evolution and human disease. Nat Rev Genet 8:762-775. 631

Conant GC, Wolfe KH. 2006. Functional partitioning of yeast co-expression networks after 632

genome duplication. PLoS Biol 4:e109. 633

Conant GC, Wolfe KH. 2007. Increased glycolytic flux as an outcome of whole-genome 634

duplication in yeast. Mol Syst Biol 3:129. 635

Page 22: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

22

Conant GC, Wolfe KH. 2008. Turning a hobby into a job: how duplicated genes find new 636

functions. Nat Rev Genet 9:938-950. 637

Corrochano LM, Kuo A, Marcet-Houben M, Polaino S, Salamov A, Villalobos-Escobedo JM, 638

Grimwood J, Alvarez MI, Avalos J, Bauer D, et al. 2016. Expansion of Signal Transduction 639

Pathways in Fungi by Extensive Genome Duplication. Curr Biol 26:1577-1584. 640

de Jong-Gubbels P, van Dijken JP, Pronk JT. 1996. Metabolic fluxes in chernostat cultures of 641

Schizosaccharomyces pombe grown on mixtures of glucose and ethanol. Microbiology 642

142:1399-1407. 643

Derr LK, Strathern JN. 1993. A role for reverse transcripts in gene conversion. Nature 361:170. 644

Dharia AP, Obla A, Gajdosik MD, Simon A, Nelson CE. 2014. Tempo and mode of gene 645

duplication in mammalian ribosomal protein evolution. PLoS ONE 9:e111721. 646

Diaz-Ruiz R, Rigoulet M, Devin A. 2011. The Warburg and Crabtree effects: On the origin of 647

cancer cell energy metabolism and of yeast glucose repression. Biochim Biophys Acta 648

1807:568-576. 649

Dudov KP, Perry RP. 1984. The gene family encoding the mouse ribosomal protein L32 650

contains a uniquely expressed intron-containing gene and an unmutated processed gene. Cell 651

37:457-468. 652

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high 653

throughput. Nucleic Acids Res 32:1792-1797. 654

Evangelisti AM, Conant GC. 2010. Nonrandom survival of gene conversions among yeast 655

ribosomal proteins duplicated through genome doubling. Genome Biol Evol 2:826-834. 656

Hachmeister KA, Fung DY. 1993. Tempeh: a mold-modified indigenous fermented food made 657

from soybeans and/or cereal grains. Crit Rev Microbiol 19:137-188. 658

Hagman A, Sall T, Compagno C, Piskur J. 2013. Yeast "make-accumulate-consume" life 659

strategy evolved as a multi-step process that predates the whole genome duplication. PLoS 660

ONE 8:e68734. 661

Hahn MW, De Bie T, Stajich JE, Nguyen C, Cristianini N. 2005. Estimating the tempo and mode 662

of gene family evolution from comparative genomic data. Genome Res 15:1153-1160. 663

Hakes L, Pinney JW, Lovell SC, Oliver SG, Robertson DL. 2007. All duplicates are not equal: 664

the difference between small-scale and genome duplication. Genome Biol 8:R209. 665

Jorgensen P, Nishikawa JL, Breitkreutz B-J, Tyers M. 2002. Systematic identification of 666

pathways that couple cell growth and division in yeast. Science 297:395-400. 667

Kaessmann H, Vinckenbosch N, Long M. 2009. RNA-based gene duplication: mechanistic and 668

evolutionary insights. Nat Rev Genet 10:19-31. 669

Kellis M, Birren BW, Lander ES. 2004. Proof and evolutionary analysis of ancient genome 670

duplication in the yeast Saccharomyces cerevisiae. Nature 428:617-624. 671

Page 23: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

23

Kenmochi N, Kawaguchi T, Rozen S, Davis E, Goodman N, Hudson TJ, Tanaka T, Page DC. 672

1998. A map of 75 human ribosomal protein genes. Genome Res 8:509-523. 673

Kito H, Abe A, Sujaya IN, Oda Y, Asano K, Sone T. 2009. Molecular characterization of the 674

relationships among Amylomyces rouxii, Rhizopus oryzae, and Rhizopus delemar. Biosci 675

Biotechnol Biochem 73:861-864. 676

Kondrashov FA, Kondrashov AS. 2006. Role of selection in fixation of gene duplications. J 677

Theor Biol 239:141-151. 678

Korobeinikova AV, Garber MB, Gongadze GM. 2012. Ribosomal proteins: structure, function, 679

and evolution. Biochemistry (Mosc) 77:562-574. 680

Kriventseva EV, Tegenfeldt F, Petty TJ, Waterhouse RM, Simão FA, Pozdnyakov IA, Ioannidis 681

P, Zdobnov EM. 2014. OrthoDB v8: update of the hierarchical catalog of orthologs and the 682

underlying free software. Nucleic Acids Res 43:D250-D256. 683

Kumar S, Stecher G, Tamura K. 2016. MEGA7: Molecular Evolutionary Genetics Analysis 684

Version 7.0 for Bigger Datasets. Mol Biol Evol 33:1870-1874. 685

Kuzumaki T, Tanaka T, Ishikawa K, Ogata K. 1987. Rat ribosomal protein L35a multigene 686

family: molecular structure and characterization of three L35a-related pseudogenes. Biochimica 687

et Biophysica Acta (BBA)-Gene Structure and Expression 909:99-106. 688

Li B, Vilardell J, Warner JR. 1996. An RNA structure involved in feedback regulation of splicing 689

and of translation is critical for biological fitness. Proceedings of the National Academy of 690

Sciences 93:1596-1600. 691

Li L, Huang Y, Xia X, Sun Z. 2006. Preferential duplication in the sparse part of yeast protein 692

interaction network. Mol Biol Evol 23:2467-2473. 693

Li WH, Wu CI, Luo CC. 1985. A new method for estimating synonymous and nonsynonymous 694

rates of nucleotide substitution considering the relative likelihood of nucleotide and codon 695

changes. Mol Biol Evol 2:150-174. 696

Liao D. 1999. Concerted evolution: molecular mechanism and biological implications. Am J Hum 697

Genet 64:24-30. 698

Lin Z, Kong H, Nei M, Ma H. 2006. Origins and evolution of the recA/RAD51 gene family: 699

evidence for ancient gene duplication and endosymbiotic gene transfer. Proc Natl Acad Sci U S 700

A 103:10328-10333. 701

Lin Z, Li WH. 2011a. The evolution of aerobic fermentation in Schizosaccharomyces pombe 702

was associated with regulatory reprogramming but not nucleosome reorganization. Mol Biol 703

Evol 28:1407-1413. 704

Lin Z, Li WH. 2011b. Expansion of hexose transporter genes was associated with the evolution 705

of aerobic fermentation in yeasts. Mol Biol Evol 28:131-142. 706

Lin Z, Nei M, Ma H. 2007. The origins and early evolution of DNA mismatch repair genes - 707

Multiple horizontal gene transfers and co-evolution. Nucleic Acids Res 35:7591-7603. 708

Page 24: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

24

Ma L-J, Ibrahim AS, Skory C, Grabherr MG, Burger G, Butler M, Elias M, Idnurm A, Lang BF, 709

Sone T. 2009. Genomic analysis of the basal lineage fungus Rhizopus oryzae reveals a whole-710

genome duplication. PLoS Genet 5:e1000549. 711

Maguire SL, ÓhÉigeartaigh SS, Byrne KP, Schröder MS, O’Gaora P, Wolfe KH, Butler G. 2013. 712

Comparative genome analysis and gene finding in Candida species using CGOB. Mol Biol Evol 713

30:1281-1291. 714

McMillan J, Lu Z, Rodriguez JS, Ahn TH, Lin Z. 2019. YeasTSS: an integrative web database of 715

yeast transcription start sites. Database (Oxford) 2019. 716

Mendoza L, Vilela R, Voelz K, Ibrahim AS, Voigt K, Lee SC. 2014. Human Fungal Pathogens of 717

Mucorales and Entomophthorales. Cold Spring Harb Perspect Med 5. 718

Montagne J, Stewart MJ, Stocker H, Hafen E, Kozma SC, Thomas G. 1999. Drosophila S6 719

kinase: a regulator of cell size. Science 285:2126-2129. 720

Murata M, Nishiyori-Sueki H, Kojima-Ishiyama M, Carninci P, Hayashizaki Y, Itoh M. 2014. 721

Detecting expressed genes using CAGE. Methods Mol Biol 1164:67-85. 722

Nakao A, Yoshihama M, Kenmochi N. 2004. RPG: the ribosomal protein gene database. 723

Nucleic acids research 32:D168-D170. 724

Nei M, Rooney AP. 2005. Concerted and birth-and-death evolution of multigene families. Annu 725

Rev Genet 39:121-152. 726

Ohno S. 1970. Evolution by gene duplication Berlin, New York: Springer-Verlag. 727

Panchy N, Lehti-Shiu M, Shiu SH. 2016. Evolution of Gene Duplication in Plants. Plant Physiol 728

171:2294-2316. 729

Papp B, Pal C, Hurst LD. 2003. Dosage sensitivity and the evolution of gene families in yeast. 730

Nature 424:194-197. 731

Pérez-Nevado F, Albergaria H, Hogg T, Girio F. 2006. Cellular death of two non-732

Saccharomyces wine-related yeasts during mixed fermentations with Saccharomyces 733

cerevisiae. Int J Food Microbiol 108:336-345. 734

Pfeiffer T, Morley A. 2014. An evolutionary perspective on the Crabtree effect. Frontiers in 735

molecular biosciences 1:17. 736

Pfeiffer T, Schuster S, Bonhoeffer S. 2001. Cooperation and competition in the evolution of 737

ATP-producing pathways. Science 292:504-507. 738

Piskur J, Rozpedowska E, Polakova S, Merico A, Compagno C. 2006. How did Saccharomyces 739

evolve to become a good brewer? Trends Genet 22:183-186. 740

Rajeh A, Lv J, Lin Z. 2018. Heterogeneous rates of genome rearrangement contributed to the 741

disparity of species richness in Ascomycota. BMC Genomics 19:282. 742

Page 25: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

25

Rhind N, Chen Z, Yassour M, Thompson DA, Haas BJ, Habib N, Wapinski I, Roy S, Lin MF, 743

Heiman DI, et al. 2011. Comparative functional genomics of the fission yeasts. Science 744

332:930-936. 745

Sanchez-Gracia A, Vieira FG, Rozas J. 2009. Molecular evolution of the major chemosensory 746

gene families in insects. Heredity (Edinb) 103:208-216. 747

Schildkraut E, Miller CA, Nickoloff JA. 2006. Transcription of a donor enhances its use during 748

double-strand break-induced gene conversion in human cells. Mol Cell Biol 26:3098-3105. 749

Schlotterer C, Tautz D. 1994. Chromosomal homogeneity of Drosophila ribosomal DNA arrays 750

suggests intrachromosomal exchanges drive concerted evolution. Curr Biol 4:777-783. 751

Scienski K, Fay JC, Conant GC. 2015. Patterns of Gene Conversion in Duplicated Yeast 752

Histones Suggest Strong Selection on a Coadapted Macromolecular Complex. Genome Biol 753

Evol 7:3249-3258. 754

Sidow A. 1996. Gen (om) e duplications in the evolution of early vertebrates. Curr Opin Genet 755

Dev 6:715-722. 756

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with 757

thousands of taxa and mixed models. Bioinformatics 22:2688-2690. 758

Storici F, Bebenek K, Kunkel TA, Gordenin DA, Resnick MA. 2007. RNA-templated DNA repair. 759

Nature 447:338. 760

Straathof AJ, van Gulik WM. 2012. Production of fumaric acid by fermentation. Subcell Biochem 761

64:225-240. 762

Vander Heiden MG, Cantley LC, Thompson CB. 2009. Understanding the Warburg effect: the 763

metabolic requirements of cell proliferation. Science 324:1029-1033. 764

Venters BJ, Pugh BF. 2009. A canonical promoter organization of the transcription machinery 765

and its regulators in the Saccharomyces genome. Genome Res 19:360-371. 766

Vision TJ, Brown DG, Tanksley SD. 2000. The origins of genomic duplications in Arabidopsis. 767

Science 290:2114-2117. 768

Warner JR. 1999. The economics of ribosome biosynthesis in yeast. Trends Biochem Sci 769

24:437-440. 770

Weng Y-s, Xing D, Clikeman JA, Nickoloff JA. 2000. Transcriptional effects on double-strand 771

break-induced gene conversion tracts. Mutation Research/DNA Repair 461:119-132. 772

Williams KM, Liu P, Fay JC. 2015. Evolution of ecological dominance of yeast species in high-773

sugar environments. Evolution 69:2079-2093. 774

Wimberly BT, Brodersen DE, Clemons WM, Morgan-Warren RJ, Carter AP, Vonrhein C, 775

Hartsch T, Ramakrishnan V. 2000. Structure of the 30S ribosomal subunit. Nature 407:327-339. 776

Wolfe KH, Shields DC. 1997. Molecular evidence for an ancient duplication of the entire yeast 777

genome. Nature 387:708-712. 778

Page 26: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

26

Wool IG. 1979. The structure and function of eukaryotic ribosomes. Annu Rev Biochem 48:719-779

754. 780

Zeng L, Zhang Q, Sun R, Kong H, Zhang N, Ma H. 2014. Resolution of deep angiosperm 781

phylogeny using conserved nuclear genes and estimates of early divergence times. Nat 782

Commun 5:4956. 783

Zhang J. 2003. Evolution by gene duplication: an update. Trends in Ecology & Evolution 18:292-784

298. 785

Zhang J. 2013. Gene duplication. In: Losos J, editor. The Princeton Guide to Evolution. 786

Princeton, New Jersey 08540 USA: Princeton University Press. p. 397-405. 787

Zhang J, Yang JR. 2015. Determinants of the rate of protein sequence evolution. Nat Rev 788

Genet 16:409-420. 789

790

Page 27: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

27

Figure Legends: 791

Figure 1: Schematic illustration of gene duplication patterns of 79 RP families in 34 792

fungal species. Each row represents a fungal species, and each column represents an RP 793

gene family. The colors of a cell represent the numbers of gene copies identified in an RP family 794

of a species. The evolutionary relationship for 34 species, inferred based on amino acid 795

sequences of RNA polymerase II were shown to the left side of the matrix. The species names 796

were provided to the right of the matrix. 797

798

Figure 2: Phylogenetic trees of representative RP gene families. Phylogenetic trees of the 799

RPS6 gene family (A) and the RPS19 (B) gene family in 34 fungal species. The phylogenetic 800

trees were inferred by maximum likelihood (ML) method with 100 bootstrap tests. Only 801

bootstrap values above 50 are shown next to each node. The branches of budding yeast, fission 802

yeast, and Mucoromycota species are colored in red, blue and green respectively. The full 803

species names in taxa were provided in Supplementary table 2. 804

805

Figure 3: Major types of tree topologies observed from 55 RP families with duplicates in 806

the budding yeasts. (A) Phylogenetic relationships of RPL12 genes from five representative 807

budding yeast species. In this case, the two copies of RPL12 genes in S. cerevisiae for a 808

species-specific clade. 21 RP gene families demonstrate similar tree topology, as indicated by 809

number “21” in a yellow dot; (B) Phylogenetic relationships of RPL11 genes. The two copies of 810

RPL11 genes in S. cerevisiae and S. mikatae form a well-supported clade, and the orthologous 811

genes between S. cerevisiae and S. mikatae are more closely related to each other. Seven RP 812

gene families have a similar tree topology; (C) Phylogenetic relationships of RPL16 genes. Each 813

of RPL16 duplicate genes in S. cerevisiae is more closely related to their orthologous genes in 814

S. mikatae and S. eubayanus. 24 RP gene families share a similar tree topology; (D) 815

Phylogenetic relationships of RPP1 genes. Each of RPP1 duplicate genes in S. cerevisiae is 816

more closely related to their orthologous genes in the five WGD species. Seven RP gene 817

families demonstrate a similar tree topology. (E) The distribution of RP families with different 818

termination points of concerted evolution based on evolutionary relationships of RP duplicate 819

genes in S. cerevisiae. The numbers on each tree branch represent the numbers of RP families 820

that have terminated concerted evolution at the indicated evolutionary stages. 821

Figure 4: The origin and evolution of RP genes in the fission yeasts. (A) A schematic 822

illustration of microsynteny structures of RPL11 genes in four budding yeasts. The microsynteny 823

blacks of RPL11A are shared by all fission yeasts, so are the RPL11 genes, supporting that the 824

duplication of RPL11 occurred prior to the divergence of the fission yeasts. The number in each 825

box represents its orthologous group ID. (B) Phylogenetic relationships of RPL11 genes in four 826

fission yeast species. The RPL11 duplicate genes in Sch. pombe are more closely related to 827

each other than to any orthologous genes. 45 RP gene families demonstrate similar tree 828

topology; (C) Phylogenetic relationships of RPS17 genes. Each of RPL17 duplicate genes in 829

Sch. pombe are more closely related to their orthologous genes in Sch. octosporus and Sch. 830

cryophilus. Nine RP gene families demonstrate similar tree topology; (D) Phylogenetic 831

relationships of RPS5 genes. Each copy of RPS5 duplicates in Sch. pombe are more closely 832

related to their orthologous genes. Four RP gene families demonstrate similar tree topology. (E) 833

The distribution of RP families with different termination points of concerted evolution in Sch. 834

pombe. The numbers on each tree branch indicate the numbers of RP families that have 835

terminated concerted evolution at the indicated evolutionary stages. 836

Page 28: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

28

Figure 5: The origin and evolution of RP genes in pin molds. (A) A schematic illustration of 837

microsynteny structures of RPL3 genes in three pin molds. The shared microsynteny structure 838

suggested that the first duplication event of RPL3 genes have occurred prior to the divergence 839

of the pin molds species. The two RPL3 genes have experienced a second round of duplication 840

by WGD in R. delemar. (B) Phylogenetic relationships of RPL3 genes in three pin molds. The 841

RPL3 paralogous genes in R. microsporus are more closely related to each other than to any 842

orthologous genes. 57 RP gene families demonstrate similar tree topology; (C) Phylogenetic 843

relationships of RPL38 genes in Mucorales. Each of RPL38 duplicate genes in R. microsporus 844

is more closely related to their orthologous genes in R. delemar. 17 RP gene families 845

demonstrate similar tree topology; (D) Phylogenetic relationships of RPS20 genes in three pin 846

molds. Each RPS20 duplicate gene in R. microsporus is more closely related to their 847

orthologous genes. Two RP gene families demonstrate similar tree topology. (E) The 848

distribution of RP families with different termination points of concerted evolution during the 849

evolution of R. microsporus. The numbers on each tree branch represent the estimated 850

numbers of RP families that have terminated concerted evolution at the indicated evolutionary 851

stage. 852

853

Figure 6: Distinct mutations rates in substitution sites and introns between RP 854

paralogous genes. The distributions of mutation rates in intron and synonymous sites between 855

RP paralogous genes in S. cerevisiae (A), Sch. pombe (B) and R. microspores (C). Scatter plots 856

of mutation rates in intron against synonymous sites between RP paralogous genes in in S. 857

cerevisiae (D), Sch. pombe (E) and R. microspores (F). 858

859

Figure 7: RP gene copy numbers are positively corrected with RP transcription 860

abundance, ethanol production ability, and glucose consumption rates. (A) A scatter plot 861

between the RP copy number and total RP transcription abundance in 11 yeast species. (B) A 862

scatter plot between the RP copy number and ethanol production efficiency in 19 yeast species. 863

(C) A scatter plot between the RP copy number and glucose consumption rate in 19 yeast 864

species. 865

866

Page 29: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

29

Supplementary Figures: 867

Supplementary fig. 1: The phylogenetic relationships of 34 representative fungal species. 868

The phylogenetic tree was inferred based on amino acid sequences of RPA polymerase II using 869

ML method by RAxML with 100 bootstrap replicates. 870

Supplementary fig. 2: The total transcription abundance of RP genes is positively 871

correlated with ethanol production efficiency and glucose consumption rates. (A) A 872

scatter plot between total RP transcription abundance and ethanol production efficiency in 9 873

yeast species. (B) A scatter plot between the total RP transcription abundance and glucose 874

consumption rate in 9 yeast species. 875

Supplementary Tables: 876

Table S1: List of numbers of RP genes identified in 285 fungal species 877

Table S2: List of numbers of RP gene identified in 34 representative fungal species based on 878

manual curation. 879

Table S3: List of RP genes identified in 34 representative fungal species 880

Table S4: List of orthologous groups identified in four fission yeasts 881

Table S5: Microsynteny data of the 79 RP families in fission yeasts 882

Table S6: Statistics of intron number in RP genes in S. cerevisiae and Sch. pombe. 883

Table S7. Number of share homologous genes between microsynteny blocks near the RP 884

paralogous genes in three Mucorales species 885

Table S8. Number of share homologous genes between microsynteny blocks near the RP 886

orthologous genes in three Mucorales species 887

Table S9: Mutation rates of synonymous sites and intron in S. cerevisiae, Sch. pombe and R. 888

microspores 889

Table S10: Transcription abundance of RP genes in 11 species based on CAGE data 890

891

Supplementary Files: 892

893

Supplementary File 1: Phylogenetic trees of RP families from the 34 representatives fungal 894

species. 895

Supplementary File 2: Phylogenetic trees of RP families in the five WGD budding yeast species. 896

Supplementary File 3: Phylogenetic trees of RP families in the four fission yeast species. 897

Supplementary File 4: Phylogenetic trees of RP families in the four Mucoromycotina species. 898

899

900

Page 30: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

Rhizophagus irregularisLobosporangium transversaleBifiguratus adelaidaePhycomyces blakesleeanusRhizopus microsporusRhizopus delemar Pneumocystis murinaSchizosaccharomyces japonicusSchizosaccharomyces pombeSchizosaccharomyces cryophilusSchizosaccharomyces octosporusYarrowia lipolyticaYamadazyma tenuisCandida albicansDekkera bruxellensisKomagataella pastorisEremothecium sinecaudumKluyveromyces marxianusKluyveromyces lactisKluyveromyces dobzhanskiiLachancea kluyverii NRRLLachancea thermotoleransLachancea waltiiZygosaccharomyces rouxiiVanderwaltozyma polysporaTetrapisispora phaffiiNakaseomyces bacillisporusCandida glabrataNaumovozyma dairenensisNaumovozyma castelliiSaccharomyces uvarumSaccharomyces eubayanusSaccharomyces mikataeSaccharomyces cerevisiae

RP

L3

RP

L4

RP

L5

RP

L6

RP

L7

RP

L9

RP

L1

0R

PL

11

RP

L1

2R

PL

13

RP

L1

4R

PL

15

RP

L1

7R

PL

18

RP

L1

9

RP

L2

1R

PL

22

RP

L2

3R

PL

24

RP

L2

6

RP

L2

7

RP

L2

8e

RP

L2

9R

PL

30

RP

L3

1R

PL

32

RP

L3

4R

PL

35

RP

L3

6R

PL

37

RP

L3

8R

PL

39

RP

L4

0R

PL

41

RP

S2

RP

S3

RP

S4

RP

S5

RP

S6

RP

S7

RP

S8

RP

S9

RP

S1

0R

PS

11

RP

S1

2R

PS

13

RP

S1

4R

PS

15

RP

S1

6R

PS

17

RP

S1

8R

PS

19

RP

S2

0R

PS

21

RP

S2

3R

PS

24

RP

S2

5R

PS

26

RP

S2

7R

PS

28

RP

S2

9R

PS

30

RP

P0

RP

P1

RP

P2

RP

L1

RP

L2

RP

L8

RP

L1

6

RP

L2

0

RP

L2

5

RP

L2

8

RP

L3

3

RP

L4

2R

PL

43

RP

S0

RP

S1

RP

S2

2

RP

S3

1

Gene copy number

0 1 2 3 4 5 6Whole genome duplication Fermentative lifestyle

Page 31: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

Fig. 2

Scer YBR181C(RPS6B)Scer YPL090C(RPS6A)Smik 16.147Smik 2.321Suva 16.222Suva 4.432Seub DI49 5372Seub DI49 1207

Nbac 2815Nbac 4740

CAGL0M06303gCAGL0H05643g

TPHA 0I01780TPHA 0A01330

Kpol 1048p47Kpol 397p3

NCAS 0B02020NCAS 0C01970

NDAI 0E02690NDAI RPS6SAKL0H09394gLwal PRS6KLTH0E13178g

KLMA 10786KLLA0E24047g

Kdob CDO95170Esin AW171 hschr42745

ZYRO0F14586gYten XP 006687990.1

CAALFM C401270WADbru 83935

Kpas ANZ75409.1Kpas ANZ77339.1

YALI0F18766gSPOG 02014

SPOG 00464SOCG 04847SOCG 00490

SPAC13G6.07cSPAPB1E7.12

SJAG 03859SJAG 03034

PNEG 02018RO3G 13461T0

RO3G 09645T0RO3G 15797T0RO3G 11073T0RO3G 01861T0Rmic XP 023464232.1Rmic XP 023463716.1Rmic XP 023465388.1

Pbla XP 018287058.1Pbla XP 018290528.1Pbla XP 018291216.1Pbla XP 018286699.1

Bade OZJ04518.1Rirr XP 025187510.1

Ltra XP 021884183.1Ltra XP 021875601.1

98

66

100

7771

5681

100

100

8358

100

51

100

9361

61

100

100

79

100

100

70

10058

63

6661

100

9587

91

64

57

94

82

99

9957

94

77

100

98

0.05

Scer YOL121C(RPS19A)Scer YNL302C(RPS19B)Smik 14.29Smik 15.32

Suva 14.37Seub RPS19aSeub RPS19b

Suva 15.39Nbac 1538

NCAS 0C04630NCAS 0A10000NDAI 0G03990NDAI 0A06160

CAALFM C305200WAKpas RPS19

Yten XP 006687955.1KLLA0A07194g

Kdob CDO91849.1Kmax RPS19

Dbru RPS19Lwal RPS19KLTH0A01122gZYRO0C02530g

SAKL0C12584gEsin AW171 hschr2128

Kpol 1076p10TPHA 0A00470TPHA 0P01300

CAGL0E01991gYALI0B20504g

SJAG 01296SJAG 02335

SOCG 01376SOCG 01790SPOG 01258SPOG 03119

SPBC21C3.13SPBC649.02

PNEG 02439RO3G 02356T0RO3G 11462T0RO3G 14240T0RO3G 02659T0

Rmic XP 023467138.1Rmic XP 023463150.1Pbla XP 018289446.1Pbla XP 018297669.1

Ltra XP 021886335.1Ltra XP 021882438.1

Dabe OZJ06056.1Rirr XP 025187188.1

9095

85

99

75

81

69

59

85

10092

90

99

97

66

100

93

57

53

0.05

A. B.

Page 32: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

Fig. 3

RPL16 YIL133C

Smik 9.36

Seub RPL13Aa

YNL069C

Smik 14.257

Seub RPL13Ab

NCAS 0B06580

NCAS 0G02920

TPHA 0C03040

TPHA 0F01400100

99

90

99

90

60

80

0.01

C.

RPP1 YDL081C

Smik 4.155

Seub RPP1A

NCAS 0D01190

NCAS 0A03990

TPHA 0G02600

YDL130W

Smik 4.107

Seub RPP1B

NCAS 0E03620

TPHA 0P00830

10084

57

94

57

100

9758

0.05

D.

RPL12YDR418W

YEL054C

Smik 4.688

Smik 5.33

Seub DI49 0590

Seub DI49 1339

NCAS0A12170

NCAS0H02540

TPHA0E01990

TPHA0J03070100

98

73

61

51

100

0.005

A. 21

YGR085C

Smik 6.167

Smik 16.342

YPR102C

Seub DI49 2042

Seub DI49 5546

TPHA 0M01740

NCAS 0A11430

95

99

68

86

0.01

B. RPL11 7

24

7

S. cerevisiae

S. mikatae

S. eubayanu

N. castellii

T. phaffii

7

0

24

721E.

WGD

Page 33: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

RPS17SOCG 03986

SPOG 03251

SPCC24B10.09

SOCG 04465

SPOG 01307

SPBC839.05c

SJAG 04204

SJAG 04168

99

96

97

99

99

0.02

RPL11

SPBC17G9.10 (RPL11A)

SPAC26A3.07c (RPL11B)

SOCG_04834 (RPL11B)

SOCG_02588 (RPL11A)

SPOG_01413 (RPL11A)

SPOG_00477 (RPL11B)

SJAG_02280 (RPL11B)

SJAG_00465 (RPL11A)

99

99

99

91

91

0.01

Sch. pombe

Sch. octosporus

Sch. cryophilus

Sch. japonicus

4

9

45

Fig. 4

B.C.

E.

45

A.

9

4

SOCG 00442

SPOG 02061

SPAC328.10c

SJAG 02477

SOCG 03175

SPOG 04715

SPAC8C9.08

SJAG 00523

8380

58

9154

0.005

RPS5D.

3521 RPL11B2976431236 3484 702 26225043969 474

3521 RPL11B2976431236 3484 702 26225043969 474

3521 RPL11B2976431236 3484 702 26225043969 474

3521 RPL11B297643 702 26225043969 474

4122 RPL11A13034302906 2160 3774 42161587125 3294

4122 RPL11A13034302906 2160 3774 42161587125 3294

2160 RPL11A13032906430 4122 3774 42161587125 3294

2160 RPL11A13032906430 4122 3774 42161587125 3294

Sch. pombe

Sch. octosporus

Sch. cryophilus

Sch. japonicus

Sch. pombe

Sch. octosporus

Sch. cryophilus

Sch. japonicus

RPL11B

RPL11A

RPL11

Page 34: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

Fig. 5

RPL38

RPS20

A.

E.

R. microsporus

R. delemar

P. blakesleeanus

17

2

5717

2

WGDWGD

57

RO3G 00672T0 (RPL3Ab)

RO3G 05371T0 (RPL3Aa)

RO3G 07884T0 (RPL3Ba)

RO3G 13200T0 (RPL3Bb)

Rmic XP 023461004.1(RPL3A)

Rmic XP 023467737.1(RPL3B)

Pbla XP 018284708.1 (RPL3A)

Pbla XP 018297202.1(RPL3B)100

100

64

65

0.01

25796460 RPL3A 65344970 667642 3421

65344970 RPL3A 2288642 667

4048 RPL3B 6534

65344970 RPL3Aa 3381642 667 3421 2579 2791 7173

65346460 RPL3Ab3381 2791 7173 4970

5484 RPL3B 40831506 4048

4083 RPL3Ba4048

5484 RPL3Bb1506

R. microsporus

R. delemar

R. microsporus

P. blakesleeanus

P. blakesleeanus

R. delemar

R. delemar

R. delemar

B.

Rmic XP 023466194.1

RO3G 08326T0

RO3G 00209T0

Rmic XP 023461984.1

RO3G 03641T0

RO3G 01788T0

Pbla XP 018297305.1

Pbla XP 018284163.1

Pbla XP 018294690.198

64

80

72

56

96

0.02

C.

D.

RPL3

RO3G 08457T0

RO3G 00395T0

Rmic XP 023462361.1

Pbla XP 018294369.1

RO3G 09377T0

RO3G 14700T0

Rmic XP 023466367.1

Pbla XP 018291291.1

100

100

67

50

57

0.01

Page 35: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

Mutation rate

Num

ber o

f Gen

e Pa

irs

0.0 0.2 0.4 0.6 0.8 1.0

02

46

810

12 Synanymous sitesIntron

Mutation rate

Num

ber o

f Gen

e Pa

irs0.0 0.5 1.0 1.5 2.0

02

46

8Mutation rate

Num

ber o

f Gen

e Pa

irs

0.0 0.4 0.8 1.2

05

1015Synanymous sites

IntronSynanymous sitesIntron

A. B. C.

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75 1.00Mutation rates of synonymous sites

Mut

atio

n ra

tes

of in

tron

0.5

1.0

1.5

2.0

0.5 1.0 1.5 2.0Mutation rates of synonymous sites

Mut

atio

n ra

tes

of in

tron

0.0

0.5

1.0

0.0 0.5 1.0Mutation rates of synonymous sites

Mut

atio

n ra

tes

of in

tron

D. E. F.

Page 36: Parallel Concerted Evolution of Ribosomal Protein Genes in ... · Rajeh1,3, Ajay Chatrath1, Zhenguo Lin1* 1 Department of Biology, Saint Louis University, St. Louis, MO, USA 63103

0

100

200

300

80 100 120 140

(100

0 TP

M)

r = 0.72

Tota

l exp

ress

ion

abun

danc

e

Tota l number of RP genes

0.0

0.3

0.6

0.9

80 100 120 140Et

hano

l pro

duct

ion

effic

ienc

y(g

/gD

W,h

)

r = 0.80

Total number of RP genes

A. B.

1

2

3

80 100 120 140Total RP gene numbers

r = 0.76

Glu

cose

com

sum

ptio

n R

ate

[Glc

/Bio

mas

s/h(

g/gD

W,h

)]

C.