emergence of a novel chimeric gene underlying grain number ... · 15/12/2016  · 106 discovery of...

35
Emergence of a novel chimeric gene underlying grain number in 1 rice 2 Hao Chen # , Yanyan Tang # , Jianfeng Liu, Lubin Tan, Jiahuang Jiang, Mumu Wang, 3 Zuofeng Zhu, Xianyou Sun, Chuanqing Sun* 4 State Key Laboratory of Plant Physiology and Biochemistry, National Center for 5 Evaluation of Agricultural Wild Plants (Rice), Department of Plant Genetics and 6 Breeding, China Agricultural University, Beijing 100193, China 7 8 # These authors contributed equally to this work 9 *Correspondence: Chuanqing Sun ([email protected]) 10 11 First author: Hao Chen e-mail: [email protected] 12 Co-first author: Yanyan Tang e-mail: [email protected] 13 Second author: Jianfeng Liu e-mail: [email protected] 14 Third author: Lubin Tan e-mail: [email protected] 15 Forth author: Jiahuang Jiang e-mail: [email protected] 16 Fifth author: Mumu Wang e-mail: [email protected] 17 Sixth author: Zuofeng Zhu e-mail: [email protected] 18 Seventh author: Xianyou Sun e-mail: [email protected] 19 Corresponding author: Chuanqing Sun e-mail: [email protected] 20 21 Genetics: Early Online, published on December 16, 2016 as 10.1534/genetics.116.188201 Copyright 2016.

Upload: others

Post on 03-Aug-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

Emergence of a novel chimeric gene underlying grain number in 1

rice 2

Hao Chen #, Yanyan Tang #, Jianfeng Liu, Lubin Tan, Jiahuang Jiang, Mumu Wang, 3

Zuofeng Zhu, Xianyou Sun, Chuanqing Sun* 4

State Key Laboratory of Plant Physiology and Biochemistry, National Center for 5

Evaluation of Agricultural Wild Plants (Rice), Department of Plant Genetics and 6

Breeding, China Agricultural University, Beijing 100193, China 7

8

# These authors contributed equally to this work 9 *Correspondence: Chuanqing Sun ([email protected]) 10 11

First author: Hao Chen e-mail: [email protected] 12

Co-first author: Yanyan Tang e-mail: [email protected] 13

Second author: Jianfeng Liu e-mail: [email protected] 14

Third author: Lubin Tan e-mail: [email protected] 15

Forth author: Jiahuang Jiang e-mail: [email protected] 16

Fifth author: Mumu Wang e-mail: [email protected] 17

Sixth author: Zuofeng Zhu e-mail: [email protected] 18

Seventh author: Xianyou Sun e-mail: [email protected] 19

Corresponding author: Chuanqing Sun e-mail: [email protected] 20

21

Genetics: Early Online, published on December 16, 2016 as 10.1534/genetics.116.188201

Copyright 2016.

Page 2: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

Running head: Emergence of a rice functional gene 22

Keywords: GN2, grain number, chimeric gene, gene emergence, rice 23

Corresponding author: 24

Chuanqing Sun, Ph.D 25

Professor, Cheung Kong Scholar 26

State Key Laboratory of Plant Physiology and Biochemistry 27

China Agricultural University 28

Beijing 100193, China 29

Tel: 86-10-62731811 30

Fax: +86-10-62731811 31

E-mail: [email protected] 32

http://sklppb.cau.edu.cn/SteamseView.asp?ID=15&SortID=6 33

34

Total word count for the main body of the text is 4088, including Abstract (184), 35

Introduction (629), Materials and Methods (785), Results (1665), Discussion (776), 36

and Acknowledgements (49). The total count for the Supporting information, 37

Reference and Figure legends is 198, 1459 and 824, respectively. No conflict of 38

interest exists in the submission of this manuscript. 39

40

Page 3: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

ABSTRACT 41

Grain number is an important factor in determining grain production of rice (Oryza 42

sativa L.). The molecular genetic basis for grain number is complex. Discovering 43

new genes involved in regulating rice grain number increases our knowledge 44

regarding its molecular mechanisms and aids breeding programs. Here, we 45

identified GRAINS NUMBER 2 (GN2), a novel gene that is responsible for rice grain 46

number, from ‘Yuanjiang’ common wild rice (Oryza rufipogon Griff.). Transgenic 47

plants over-expressing GN2 showed less grain number, reduced plant height, and 48

later heading date than control plants. Interestingly, GN2 arose through the 49

insertion of a 1,094-bp sequence from LOC_Os02g45150 into the third exon of 50

LOC_Os02g56630, and the inserted sequence recruited its nearby sequence to 51

generate the chimeric GN2. The gene structure and expression pattern of GN2 were 52

distinct from those of LOC_Os02g45150 and LOC_Os02g56630. A sequence analysis 53

showed that GN2 may be generated in the natural population of ‘Yuanjiang’ 54

common wild rice. In this study, we identified a novel functional chimeric gene and 55

also provided information regarding the molecular mechanisms regulating rice 56

grain number. 57

Keywords: GN2, grain number, chimeric gene, gene emergence, rice 58

59

Page 4: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

INTRODUCTION 60

As a widely consumed staple food in the world, rice (Oryza sativa L.) is an important 61

calorie source for humans. With the daily increase in the world’s population and the 62

decrease in potential available farmlands, effectively increasing rice grain production has 63

become a growing challenge (Khush, 1999; Zhang, 2007). Rice grain production is 64

thought to be determined by three major factors, panicle per plant, grain weight and grain 65

number (Xing & Zhang, 2010). Compared with the other two major factors, rice grain 66

number exhibits a wider phenotypic variation, and is regarded as a main selection and 67

improvement target (Yamagishi et al., 2002). Consequently, currently cultivated rice lines 68

display dramatically increased grain number compared with its wild progenitor, the 69

common wild rice (Oryza rufipogon Griff.) (Sun et al., 2001; Kovach et al., 2007). 70

Rice grain number is a complex agronomic trait impacted by many factors, including 71

panicle architecture, and the initiation and outgrowth of branches and spikelets (Xing & 72

Zhang, 2010; Wang & Li, 2011). Recently, a few quantitative trait loci (QTLs) and genes 73

regulating rice grain number have been identified. Among which, DEP1, DEP2, LP, SP1, 74

PAP2 and sped1-D alter panicle architecture (Huang et al., 2009; Zhou et al., 2009; Li et 75

al., 2009; Kobayashi et al., 2010; Li et al., 2010; Li et al., 2011; Jiang et al., 2014), 76

whereas Gn1a, LAX1, SPA1, TAW1, DST and FZP control the initiation and outgrowth of 77

branches and spikelets (Komatsu M et al., 2003; Komatsu K et al., 2003; Ashikari et al., 78

2005; Li et al., 2013; Yoshida et al., 2013). In addition, some genes, such as Ghd7, Ghd8, 79

PROG1, IPA1, FUWA, PAY1 and An1, underlying rice grain number show pleiotropic 80

effects in many significant agronomic- or domestication-related traits, including plant 81

height, heading date, plant architecture, awn habit and grain size (Jin et al., 2008; Tan et 82

Page 5: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

al., 2008; Xue et al., 2008; Jiao et al., 2010; Miura et al., 2010; Luo et al., 2013; Chen et 83

al., 2015; Zhao et al., 2015). 84

Chimeric genes originate from multiple parental loci, and they have been reported in a 85

variety of organisms (Long et al., 2003). Chimeric genes have the potential to evolve 86

novel functions different from those of the parental loci, and provide the opportunity to 87

study gene function and evolution (Long et al., 2003). Some chimeric genes generate 88

novel functions and influence phenotypes. For example, sphinx determines male 89

courtship in Drosophila (Wang et al., 2002), CYPA-TRIM5 affects HIV1-resistance in 90

primates (Sayah et al., 2004), CHANNEL11/12 regulates multiple pathogen-resistance 91

responses in Arabidopsis (Yoshioka et al., 2006), BS1 takes part in the normal 92

reproductive development of maize (Jin & Bennetzen, 1989; Elrouby & Bureau, 2010) 93

and chimeric ATP6 and Rfo lead to male sterility in rice and radish, respectively 94

(Kadowaki et al., 1990; Wang et al., 2006; Brown et al., 2003). In the rice genome, 95

chimeric genes have also been widely detected (Wang et al., 2006). Systematic analysis 96

demonstrated that chimeric genes account for a high percentage of the total newly 97

evolved genes (Wang et al., 2006; Fan et al., 2008; Zhang et al., 2013). However, the 98

phenotypic effects of most rice chimeric genes have not been well investigated. 99

In this study, we identified a novel chimeric gene, GRAINS NUMBER 2 (GN2), which is 100

responsible for rice grain number, as well as plant height and heading date, on the long 101

arm of chromosome 2 in ‘Yuanjiang’ common wild rice. We found that a 1,094-bp 102

sequence from LOC_Os02g45150 was inserted into the third exon of LOC_Os02g56630, 103

and the inserted sequence recruited its nearby sequence to generate the chimeric GN2. 104

GN2 was generated in the natural population of ‘Yuanjiang’ common wild rice. The 105

Page 6: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 106

rice grain number and in uncovering information regarding gene emergence. 107

108

MATERIALS AND METHODS 109

Plant materials 110

‘YIL19’ is a rice introgression line derived from a set of 106 introgression lines 111

developed previously (Tan et al., 2007). This set of introgression lines was derived from a 112

cross between an accession of Chinese common wild rice (Yuanjiang, O. rufipogon Griff.) 113

and a high yield indica cultivar (O. sativa L.) ‘Teqing’ followed by four generations of 114

selfing and three generations of backcrossing. ‘Yuanjiang’ common wild rice was 115

collected from Yuanjiang Country, Yunnan Province, China. 116

117

Map-based cloning 118

A F2 population containing 190 individual plants derived from the cross between 119

‘YIL19’ and ‘Teqing’ was constructed for QTL analysis. QTL analysis was carried out 120

by interval analysis using Map Manager QTXb17 (Manly et al., 2001). A F2 segregating 121

population containing 4,512 plants (including the previous 190 F2 plants) was used for 122

the genetic analysis and fine mapping. The F3 families, derived from the F2 recombinant 123

individuals, were planted to confirm the phenotypes of the F2 recombinant individuals. 124

Some molecular markers used for map-based cloning were published markers (McCouch 125

et al., 2002). The newly developed molecular markers used in this study are listed in 126

Table S4. 127

128

Page 7: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

Gene prediction and annotation 129

The annotation of putative genes in the fine mapping region were obtained from Rice 130

Genome Annotation Project (RGAP, MSU V7; 131

http://rice.plantbiology.msu.edu/downloads.shtml). We also annotated genes according to 132

the sequence information of ‘YIL19’ and ‘Teqing’ from the Rice Genome Automated 133

Annotation System (http://RiceGAAS.dna.affrc.go.jp/). 134

135

Detection of the possible transcripts by strand specific reverse transcription- 136

polymerase chain reaction (RT-PCR) analysis 137

To detect the possible transcript generated by the 1,094-bp DNA fragment insertion in the 138

locus of LOC_Os02g56630, 22 pairs of specific primers covering a 3-kb genomic region 139

surrounding the insertion site were designed to analyze the cDNA of ‘YIL19’ and 140

‘Teqing’. The RT-PCR products were confirmed by gel electrophoresis and sequence 141

analyses. 142

143

Real-time quantitative PCR (RT-qPCR) and rapid-amplifications of cDNA ends 144

(RACE) analysis 145

Total RNA was extracted from various samples, including 7-day-old seedlings, roots, 146

leaves, leaf sheathes, tiller bases, shoot apical meristem, young panicles shorter than 1 cm, 147

young panicles longer than 1 cm and shorter than 5 cm, young panicles longer than 5 cm, 148

mature panicles and spikelet hulls, using TRIzol reagent (Invitrogen, http://www. 149

lifetechnologies.com) and was purified using an RNeasy Micro Kit (Qiagen 150

https://www.qiagen.com). For RT-qPCR analyses, first-strand cDNA was synthesized 151

Page 8: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

using an oligo(dT)18 primer (TaKaRa, http://www.takara.com) and SuperScript® III 152

Reverse Transcriptase (Invitrogen) from 2 µg of total RNA. The expression levels of 153

GN2 and other genes were analyzed using a CFX96 Real Time System (Bio-Rad, 154

http://www.bio-rad.com), and the rice Ubiquitin gene (LOC_Os03g13170) served as the 155

control. Each set of experiments was repeated three times. 5'- and 3'-RACE were carried 156

out using a 5'-Full RACE Kit with tobacco acid pyrophosphatase (TAP) and a 3'-Full 157

RACE Core Set with PrimeScript™ RTase (TaKaRa) according to the manufacturer’s 158

instructions. 159

160

Complementation test 161

The constructs pUbi::GN2.1 and pUbi::GN2.2 harbored the full-length sequence of 162

GN2.1 cDNA and GN2.2 cDNA, respectively, under the control of a maize Ubiquitin 163

promoter. These constructs were introduced into Agrobacterium tumefaciens strain 164

EHA105 and were subsequently transferred into ‘Teqing’. Two independent lines per 165

construct were used for further phenotypic evaluations. 166

167

Detection the presence of GN2 in rice germplasm 168

We detected the presence of GN2 by two methods. First, 169

the nucleotide BLAST search was performed using the genomic sequence and full-length 170

cDNA sequence of GN2 against the corresponding databases. Second, a pair of diagnostic 171

PCR primers flanking the 1,094-bp DNA fragment was designed to detect the presence of 172

GN2 in 236 rice germplasm accessions (Table S7). If GN2 is present, then the expected 173

PCR amplification band size between the diagnostic primers is ~2 kb, while if GN2 is 174

Page 9: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

absent, then the expected PCR amplification band size is ~1 kb. 175

176

Protoplast transfection assays for the regulatory sequence analysis 177

The sequence P1 contains the ancestral sequence of LOC_Os02g56630, including a 178

~2-kb genomic sequence upstream of GN2, and P2 contains the upstream sequence of 179

GN2 from the insert sequence of LOC_Os02g45150. All of the sequences were amplified 180

by PCR from ‘YIL19’ genomic DNA and then inserted in the frame of pGreenII 181

0800-LUC (Hellens et al., 2005). Protoplasts were prepared from 7-day-old seedlings of 182

‘Teqing’. After protoplast preparation and transfection, firefly luciferase (LUC) and 183

renillia luciferase (REN) activities were measured using the Dual-Luciferase Reporter 184

Assay System (Promega, http://www.promega.com), Relative LUC activity was 185

calculated by normalizing against the REN activity following the manufacturer's 186

instructions. In the experiments, 10 µg of each of the constructs was used, and all of the 187

experiments were repeated three biological times with similar results. 188

189

Data availability 190

The plant materials in this study are available upon request. The authors state that all data 191

necessary for confirming the conclusions presented in this article are represented fully 192

within the article. 193

194

RESULTS 195

‘YIL19’ shows less grain number and pleiotropic phenotypes 196

Page 10: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

An introgression line, ‘YIL19’, which originated from a set of introgression lines 197

constructed using ‘Yuanjiang’ common wild rice as the donor parent and the indica 198

variety ‘Teqing’ as the recurrent parent, exhibited less grain number compared with its 199

recurrent parent, ‘Teqing’ (Figure 1a-c). The investigation of panicle traits showed that 200

the grain number per panicle for ‘YIL19’ and ‘Teqing’ were 85.7 and 155.3, respectively 201

(Figure 1d). Significant differences (P < 0.01) were detected in grain number per panicle, 202

panicle length, mean length of primary branch, number of secondary branch and grains 203

on secondary branch between ‘YIL19’ and ‘Teqing’, while no significant difference was 204

present in the number of primary branch (Figure 1e-i; Figure S1). An evaluation of other 205

important agronomic traits between ‘YIL19’ and ‘Teqing’ indicated that there were 206

significant differences (P < 0.01) in plant height, days to heading, panicle number per 207

plant and grain yield per plant; however, no significant differences were detected in grain 208

weight or spikelet fertility (Figure 1j-k; Table S1). In addition, all of the internodes of the 209

stems were shortened in the ‘YIL19’ plants, compared with the ‘Teqing’ plants (Figure. 210

1c). 211

212

Detecting and fine-mapping GN2 213

‘YIL19’ contained four O. rufipogon chromosomal segments, located on chromosome 1, 214

2, 6 and 10 (Figure 2a). The grain number of the F1 hybrid derived from the cross 215

between ‘YIL19’ and ‘Teqing’ was similar to that of ‘YIL19’ (Figure 2b). An F2 216

population containing 190 individuals was genotyped using 14 simple sequence repeat 217

markers distributed on four O. rufipogon introgression regions and evaluated for grain 218

number on the main panicle. QTL analysis revealed that there was a QTL between 219

Page 11: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

markers RM5300 and RM6312 on the long arm of chromosome 2 with a LOD score of 220

12.24 that explained 26% of the phenotypic variance (Table S2; S5). The O. 221

rufipogon-derived allele contributed a decreasing effect on grain number. We referred to 222

this locus as GRAINS NUMBER 2 (GN2). No QTL underlying rice grain number was 223

detected in the other introgression regions, suggesting that GN2 was responsible for the 224

less grain number phenotype of ‘YIL19’ (Figure 2c). Using 4,512 F2 individuals, the 225

location of GN2 was narrowed down to an interval of 47-kb between RM3535 and the 226

newly developed molecular marker R4 (Figure 2c; Figure S2; Table S6). There were six 227

predicted genes in this fine-mapping region based on the ‘Nipponbare’ genome (Figure 228

2d- e; Table S3). 229

230

The newly detected transcript in ‘YIL19’ is the major GN2 candidate 231

A sequence analysis of the six predicted genes revealed that a 1,094-bp DNA fragment 232

insertion was present at position +2275 on exon 3 of LOC_Os02g56630 in the ‘YIL19’ 233

genome, when compared with the sequence of ‘Teqing’ (Figure 2f), while no such large 234

variation occurred in the other five genes. Furthermore, this 1,094-bp DNA fragment 235

insertion cosegregated with the less grain number phenotype. The annotation analysis 236

also suggested that the 1,094-bp DNA fragment insertion in ‘YIL19’ resulted in the 237

premature termination of LOC_Os02g56630 mRNA translation. Therefore, 238

LOC_Os02g56630, with three exons and two introns, that encodes an OsWAK 239

(Wall-associated kinase) receptor-like protein kinase of 765-amino acids, appears to be a 240

GN2 candidate. However, the expression of LOC_Os02g56630 cannot be detected in all 241

of the tested tissues collected, which span the whole rice life cycle, in either ‘YIL19’ or 242

Page 12: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

‘Teqing’. Therefore, we speculated that other unpredicted transcripts in this region may 243

influence the rice grain number. To detect the possible transcripts, we designed 22 pairs 244

of primers covering a ~3-kb genomic region surrounding the insert site to conduct the 245

strand specific RT-PCR analysis. A ~200-bp transcript spanning the inserted sequence 246

and the third exon of LOC_Os02g56630 was detected in ‘YIL19’ (Figure S3), while no 247

transcript was detected from the corresponding locus of ‘Teqing’. To determine the full 248

length of the transcript, the 5′- and 3′-RACE were conducted. We identified two unique 249

transcripts, GN2.1 and GN2.2. A sequence analysis revealed that GN2.1 was 497-bp in 250

length, while GN2.2 was 494-bp, with only a 3-bp deletion difference from GN2.1 251

(Figure 2g; Figure S4), suggesting that GN2.1 and GN2.2 were transcript variants of the 252

same gene, which has two exons and one intron (Figure 2h). Further analyses 253

demonstrated that the 3-bp deletion in GN2.2 occurs at position +22 of the cDNA 254

sequence, and is caused by an alternative recognition of the RNA splice junction “AG” at 255

the 5′ end of the second exon (Figure 2i). In addition, GN2.1 and GN2.2 showed a 256

chimeric structure composed of 95-bp and 92-bp, respectively, from the 1,094-bp inserted 257

sequence and 402-bp from LOC_Os02g56630 (Figure 2h; Figure S5). 258

259

Complementation test of GN2 260

To verify whether the novel transcripts resulted in the less grain number phenotype of 261

‘YIL19’, we performed a transgenic analysis. A construct over-expressing the novel 262

transcript GN2.1, driven by a maize Ubiquitin promoter, was introduced into ‘Teqing’ 263

plants. Twelve over-expressing transgenic lines (designated as GN2.1-Teqing) were 264

obtained (Figure 3a-b). A RT-qPCR analysis showed that these transgenic lines 265

Page 13: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

over-expressed GN2.1 transcripts compared with the control lines (Figure 3c). All of the 266

GN2.1-Teqing transgenic lines displayed less grain number than control lines. Significant 267

differences (P < 0.01) were detected in grains per panicle, panicle length, average length 268

of primary branch and number of secondary branch compared with the control lines 269

(Figure 3a-b; 3d-g). Additionally, GN2.1-Teqing transgenic lines showed reduced plant 270

height and delayed heading date compared with their control lines (Figure 3h- i). 271

We also developed a construct over-expressing the GN2.2 transcript and introduced it into 272

‘Teqing’ (designated as GN2.2-Teqing). Ten independent transgenic lines were obtained. 273

Similar phenotypes were observed in the transgenic plants containing these two different 274

constructs (Figure S6). Thus, the novel gene, with two alternative transcripts, 275

corresponded to GN2, and influenced the grain number and other plant agronomic-related 276

traits in rice. 277

278

Expression pattern of GN2 279

To detect the expression pattern of GN2, we surveyed its expression profiles in ‘YIL19’ 280

using the RNA from tissues spanning the whole life cycle, including 7-day-old seedlings, 281

roots, leaves, leaf sheathes, tiller bases, shoot apical meristem, young panicles from 282

different stages, mature panicles and spikelet hulls. GN2 was constitutively expressed in 283

all of the surveyed tissues. However, the relative expression level of GN2 appeared to be 284

low in the tested tissues (Figure 4a). 285

To determine the occurrence ratio of the two independent transcripts, we randomly 286

sequenced 20 GN2 clones. Eleven clones expressed GN2.1, and the remaining nine clones 287

expressed GN2.2, suggesting that GN2.1 and GN2.2 have similar expression levels 288

Page 14: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

(Figure 4b). 289

290

GN2 may influence rice grain number using a distinct pathway from known grain 291

number regulators 292

Many genes associated with grain number have been identified in rice (Xing & Zhang, 293

2010). To understand whether these genes are regulated indirectly by GN2, we carried out 294

a RT-qPCR analysis to investigate their expression profiles. The expression levels of 295

most of the tested genes did not show significant differences between ‘YIL19’ and 296

‘Teqing’ in young panicles (Figure S7). Interestingly, the expression level of TAW1, and 297

its related gene OsMADS22 and OsMADS55, showed a higher expression level in young 298

panicles of ‘YIL19’ compared with in ‘Teqing’ (Figure S7). However, a previous study 299

revealed that the higher expression level of TAW1 led to an increase in the number of 300

secondary branch (Yoshida et al., 2013). Thus, GN2 may influence rice grain number 301

through a novel pathway distinct from those genes reported previously. 302

303

Emergence of the chimeric GN2 gene 304

To trace the emergence of GN2, we conducted a BLAST search against the O. sativa L. 305

ssp. japonica genome set and found that 1,074-bp of the 1,094-bp inserted DNA fragment 306

had a very high DNA sequence identity (greater than 99%) with the sequence of 307

LOC_Os02g45150 in ‘Nipponbare’. The remaining 20-bp is the direct repetitive sequence 308

“TGCATAAGCA” of unknown origin that flanks the 1,074-bp sequence. Further analysis 309

showed that the 1,074-bp sequence belongs to the partial 5′ untranslated region sequence 310

(or partial 5′ untranslated region and intron according to another annotation version) of 311

Page 15: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

LOC_Os02g45150 on chromosome 2 (Figure 5a-b). Interestingly, the 1,074-bp sequence 312

in LOC_Os02g45150 described above was absent in the genome of C4, the accession of 313

‘Yuanjiang’ common wild rice that is the parent of the introgression line used in this study, 314

suggesting that the inserted sequence may be “cut” from the locus of LOC_Os02g45150 315

and then “pasted” into the locus of LOC_Os02g56630 in ‘Yuanjiang’ common wild rice 316

(Figure 5c; Figure S9). 317

Based on the GN2 gene construct, we inferred that the 1,094-bp inserted sequence from 318

the locus of LOC_Os02g45150 contributed the transcriptional start site, the first exon, the 319

whole intron and the partial second exon of GN2, and LOC_Os02g56630 provided the 320

rest of the second exon and the polyadenylation sites (Figure 5d). Therefore, we 321

concluded that GN2 is a chimeric gene produced by the fusion of the two sequences from 322

different origins. 323

To analyze the possible transcriptional regulatory region of GN2, a transient expression 324

assay was conducted. The inserted sequence from LOC_Os02g45150 had a weak 325

regulatory ability, while the upstream sequence of GN2 from LOC_Os02g56630 did not, 326

indicating that GN2 acquired its regulatory sequences from the 1,094-bp inserted 327

sequence (Figure S8). 328

A BLAST similarity search showed that GN2 is unique in the rice genome. It also 329

revealed that GN2 has no homolog in any other organisms. We further analyzed the 330

presence of GN2 in 236 rice accessions representing diverse germplasms, including both 331

wild and cultivated species (Table S7). None of the tested accessions exhibited the 332

presence of GN2, suggesting that GN2 may be newly generated in the genome of 333

‘Yuanjiang’ common wild rice. 334

Page 16: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

To detect the occurrence of GN2 in ‘Yuanjiang’ common wild rice population, we 335

analyzed the presence of GN2 in 22 individuals from the local natural population of 336

‘Yuanjiang’ common wild rice in Yunnan Province, China. We found that 12 individuals 337

contained homozygous GN2, 4 individuals exhibited heterozygosity at the GN2 locus, 338

and 6 individuals did not contain GN2, suggesting that GN2 has a high frequency in the 339

local natural population of ‘Yuanjiang’ common wild rice, although it is not completely 340

fixed in this population. 341

342

DISCUSSION 343

In this study, we identified a novel gene, GN2, underlying rice grain number, using an 344

introgression line ‘YIL19’. GN2 exhibits pleiotropism, not only influencing grain number, 345

but also affecting the plant height and heading date. Interestingly, GN2 is a novel 346

chimeric gene that emerged through the shuffling of a 1,094-bp genomic fragment from 347

LOC_Os02g45150 to LOC_Os02g56630 on the same chromosome. We also found that 348

GN2 is unique to the rice genome and newly emerged in the natural population of 349

‘Yuanjiang’ common wild rice. 350

351

The insertion event produced new GN2 transcripts. A sequence analysis revealed that the 352

longest predicted open reading frame in the GN2 transcript was 213-bp, encoding a 353

70-amino acid polypeptide. Therefore, we speculated that GN2 may exert its function as a 354

novel polypeptide. 355

The panicle branch plays an important role in determining rice grain number (Wang & Li, 356

2008). Both the branch meristem’s activities and the elongation of branches regulate rice 357

Page 17: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

grain number. Previous studies showed that a couple of genes, such as LAX1, Gn1a and 358

TAW1, influence rice grain number by regulating the number of branches (Komatsu et al., 359

2003; Ashikari et al., 2005; Yoshida et al., 2013). Several genes, such as SP1, control rice 360

grain number by regulating the elongation of panicle, primary branch and secondary 361

branch (Li et al., 2010), while sped1-D influence the length of pedicel and secondary 362

branch (Jiang et al., 2014). Unlike the genes described above, GN2 has a specific effect 363

on the elongation of the primary branches. The shortened length of the primary branches 364

leads to a lower number of secondary branches in ‘YIL19’. This suggested that GN2 365

regulates rice grain number in a manner distinct from those of known genes. 366

Chimeric genes are able to generate new functions that differ from those of parent 367

sequences (Long et al., 2003). Although chimeric genes form a certain proportion of the 368

rice genome, and many of them have expression profiles and undergo selection, their 369

phenotypic effect are mostly unknown (Wang et al., 2006; Fan et al., 2008; Zhang et al., 370

2013). In this study, we verified that GN2 is a novel chimeric gene that exhibits a 371

distinctive gene structure and expression pattern from its parental genes, and we also 372

found that GN2 acquires novel functions and has a significant effect on morphological 373

traits. 374

GN2 arose through the insertion of a 1,094-bp sequence which may come from 375

LOC_Os02g45150. Its emergence is consistent with the “cut and paste” DNA fragment 376

shuffling model, having the direct repeat sequence “TGCATAAGCA” flanking the 377

inserted region. RNA-mediated retroposition and DNA-mediated recombination are 378

known to be two major mechanisms for chimeric genes emergence (Arguello et al., 2007; 379

Kaessmann, 2010). Most of the rice chimeric genes arose through one of the two 380

Page 18: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

mechanisms (Turcotte et al., 2001; Bennetzen, 2005; Wang et al., 2006). Although we 381

detected the flanking direct repeat, one of the hallmarks of retroposition, in GN2, we were 382

unable to find other structural hallmarks of retroposition, such as the loss of intron and 383

the presence of poly (A) tracts. Furthermore, the repeat sequence “TGCATAAGCA” does 384

not resemble any characterized transposon or retrotransposon, and this insertion event 385

induced by the “TGCATAAGCA” repeat sequence has rarely been detected in the rice 386

genome. Therefore, we speculated that the formation of the chimeric GN2 may not have 387

occurred through the mechanisms of RNA-mediated retroposition or DNA-mediated 388

recombination, but through a different approach for the chimeric genes origination. 389

Comparative genomic studies revealed that the emergence of novel genes in rice genome 390

occurred at a higher frequency and more rapid pace than other organisms, such as 391

primates, and the process of gene emergence appears to be ongoing (Ge et al., 1999; 392

Vaughan et al., 2003; Wang et al., 2006). Interestingly, GN2 is only detected in the 393

genome of ‘Yuanjiang’ common wild rice. Furthermore, it exhibited a high frequency (up 394

to 63%) in the natural population of ‘Yuanjiang’ common wild rice. Hence, we speculated 395

that GN2 had newly emerged in this natural population. The GN2 high frequency in the 396

local natural population may be subject to positive selection, and thus GN2 may be 397

functional. 398

Many newly-emerged genes, which have evolved novel functions, are involved in 399

development, reproduction and stress responses in various organisms (Tautz & 400

Domazet-Loso, 2011; Chen et al., 2013). These novel genes have the potential to be used 401

in important agronomic traits improvement and broaden germplasm resources for 402

breeding. Furthermore, understanding the mechanism of gene emergence will shed light 403

Page 19: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

on artificial synthesis of functional genes in the future. The identification of the newly 404

emerged GN2 which obviously influences agronomic traits provides more information to 405

elucidate the molecular mechanisms underlying rice grain number and new insight into 406

general phenotypic evolution. 407

408

ACKNOWLEDGEMENTS 409

We thank Prof. Qingwen Yang, the Institute of Crop Sciences of Chinese Academy of 410

Agricultural Sciences for providing the wild rice samples. This research was supported by 411

the National Natural Science Foundation of China (Grants 91535301 and 31171522), and 412

the National Key R&D Program for Crop Breeding (2016YFD0100301). 413

. 414

415

SUPPORTING INFORMATION 416

Figure S1. A comparison of the primary branch in a main panicle of ‘YIL19’ to one of 417

‘Teqing’ 418

Figure S2. Two-step substitution mapping of GN2 419

Figure S3. Detection of the unpredicted transcripts 420

Figure S4. The full length sequence of transcripts GN2.1 and GN2.2 421

Figure S5. The sequence at the locus of GN2 in ‘YIL19’ 422

Figure S6. Phenotypic evaluation of GN2.2 over-expressing lines 423

Figure S7. The expression pattern of known genes associated with rice grain number in 424

the young panicle of ‘YIL19’ and ‘Teqing’ 425

Figure S8. Protoplast Transfection Assays of transcriptional ability for the upstream 426

Page 20: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

sequence of GN2 427

Figure S9. The origination and fixation processes of GN2 from a genome view. 428

Table S1. Comparison in some important agronomic traits between ‘YIL19’ and ‘Teqing’ 429

Table S2. QTL analysis for grain number in the F2 population derived from the cross 430

between ‘YIL19’ and ‘Teqing’. 431

Table S3. Candidate genes for GN2 in the fine mapping region according to the annotation 432

system of Nipponbare 433

Table S4. Primers used for study. 434

Table S5. Genotype and phenotype of F2 populations for QTL analysis 435

Table S6. Genotype and phenotype of fine-mapping recombinant lines 436

Table S7. Rice germplasm used for detecting the presence of GN2.437

Page 21: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

REFERENCES

Ashikari, M., Sakakibara, H., Lin, S., Yamamoto, T., Takashi, T., Nishimura, A.,

Angeles, E.R., Qian, Q., Kitano, H. and Matsuoka, M. (2005) Cytokinin oxidase

regulates rice grain production. Science. 309 :741-745.

Arguello, J., Fan, C., Wang, W. and Long, M. (2007) Origination of chimeric genes

through DNA-level recombination. Genome Dyn. 3: 131-146.

Bennetzen, J. L. (2005) Transposable elements, gene creation and genome

rearrangement in flowering plants. Current Opinion in Genetics and Development. 15:

621-627.

Brown, G. G., Formanová, N., Jin, H., et al. (2003) The radish Rfo restorer gene of

Ogura cytoplasmic male sterility encodes a protein with multiple pentatricopeptide

repeats. Plant J. 35: 262-272.

Chen, J., Gao, H., Zheng, X.M. et al. (2015) An evolutionarily conserved gene, FUWA,

plays a role in determining panicle architecture, grain shape and grain weight in

rice. Plant J. 83: 427-438.

Chen, S., Krinsky, B.H. and Long, M. (2013) New genes as drivers of phenotypic

evolution. Nat. Rev. Genet. 14 : 645-660.

Elrouby, N. and Bureau, T.E. (2010) Bs1, a new chimeric gene formed by

retrotransposon-mediated exon shuffling in maize. Plant Physiol. 153: 1413-1424.

Fan, C., Zhang, Y., Yu, Y., Rounsley, S., Long, M. and Wing, R.A. (2008) The

subtelomere of Oryza sativa chromosome 3 short arm as a hot bed of new gene

origination in rice. Mol. Plant. 1: 839-850.

Ge, S., Sang, T., Lu, B.R. and Hong, D. Y. (1999) Phylogeny of rice genomes with

Page 22: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

emphasis on origins of allotetraploid species. Proc. Natl. Acad. Sci. USA. 96:

14400-14405.

Hellens, R.P., Allan, A.C., Friel, EN., Bolitho, K., Grafton, K., Templeton, M.D.,

Karunairetnam, S., Gleave, A.P. and Laing, W.A. (2005) Transient expression vectors

for functional genomics, quantification of promoter activity and RNA silencing in plants.

Plant Methods. 1: 13.

Huang, X., Qian, Q., Liu, Z., Sun, H., He, S., Luo, D., Xia, G., Chu, C., Li, J. and Fu,

X. (2009) Natural variation at the DEP1 locus enhances grain yield in rice. Nat. Genet. 41:

494-497.

Jiang, G., Xiang, Y., Zhao, J., Yin, D., Zhao, X., Zhu, L. and Zhai, W. (2014)

Regulation of inflorescence branch development in rice through a novel pathway

involving the pentatricopeptide repeat protein sped1-D. Genetics. 197: 1395-1407.

Jiao, Y., Wang, Y., Xue, D. et al. (2010) Regulation of OsSPL14 by OsmiR156 defines

ideal plant architecture in rice. Nat. Genet. 42: 541-544.

Jin, J., Huang, W., Gao, J.P., Yang, J., Shi, M., Zhu, M.Z., Luo, D. and Lin, H.X.

(2008) Genetic control of rice plant architecture under domestication. Nat. Genet. 40:

1365-1369.

Jin, Y. K. and Bennetzen, J. L. (1989) Structure and coding properties of Bs1, a maize

retrovirus-like transposon. Proc. Natl. Acad. Sci. USA. 86: 6235-6239.

Kadowaki, K.I., Suzuki, T. and Kazama, S. (1990) A chimeric gene containing the 5′

portion of atp6 is associated with cytoplasmic male-sterility of rice. Mol. Gen. Genet. 224:

10-16.

Kaessmann, H. (2010) Origins, evolution, and phenotypic impact of new genes. Genome

Page 23: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

Res. 20: 1313–1326.

Khush, G.S. (1999) Green revolution: preparing for the 21st century. Genome. 42:

646-655.

Kobayashi, K., Maekawa, M., Miyao, A., Hirochika, H. and Kyozuka, J. (2010)

PANICLE PHYTOMER2 (PAP2), encoding a SEPALLATA subfamily MADS-box protein,

positively controls spikelet meristem identity in rice. Plant Cell Physiol. 51: 47-57.

Komatsu, K., Maekawa, M., Ujiie, S., Satake, Y., Furutani, I., Okamoto, H.,

Shimamoto, K. and Kyozuka, J. (2003) LAX and SPA: major regulators of shoot

branching in rice. Proc. Natl. Acad. Sci. USA. 100: 11765-11770.

Komatsu, M., Chujo, A., Nagato,Y., Shimamoto, K. and Kyozuka, J. (2003) FRIZZY

PANICLE is required to prevent the formation of axillary meristems and to establish floral

meristem identity in rice spikelets. Development.130: 3841-3850.

Kovach, M.J., Sweeney, M.T. and McCouch, S.R. (2007) New insights into the history

of rice domestication. Trends Genet. 23: 578-587.

Li, F., Liu, W., Tang, J., Chen, J., Tong, H., Hu, B., Li, C., Fang, J., Chen, M. and

Chu, C. (2010) Rice DENSE AND ERECT PANICLE 2 is essential for determining

panicle outgrowth and elongation. Cell Res. 20: 838-849.

Li, M., Tang, D., Wang, K., Wu, X., Lu, L., Yu, H., Gu, M., Yan, C. and Cheng, Z.

(2011) Mutations in the F‐box gene LARGER PANICLE improve the panicle

architecture and enhance the grain yield in rice. Plant Biotechnol J. 9: 1002-1013.

Li, S., Qian, Q., Fu, Z. et al. (2009) Short panicle1 encodes a putative PTR family

transporter and determines rice panicle size. Plant J. 58 : 592-605.

Page 24: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

Li, S., Zhao, B., Yuan, D. et al. (2013) Rice zinc finger protein DST enhances grain

production through controlling Gn1a/OsCKX2 expression. Proc. Natl. Acad. Sci. USA.

110: 3167-3172.

Long, M., Betrán, E., Thornton, K. and Wang, W. (2003) The origin of new genes:

glimpses from the young and old. Nat. Rev. Genet. 4 : 865-875.

Luo, J., Liu, H., Zhou, T. et al. (2013) An-1 encodes a basic helix-loop-helix protein that

regulates awn development, grain size, and grain number in rice. Plant Cell. 25:

3360-3376.

Manly, KF., Cudmore, RH., Meer, JM. (2001) Map Manager QTX: cross-platform

software for genetic mapping. Mamm Genome. 12: 930-932

McCouch, S.R., Teytelman, L., Xu, Y. et al. (2002) Development of 2,240 new SSR

markers for rice (Oryza sativa L.). DNA. Res. 9:199-207.

Miura, K., Ikeda, M., Matsubara, A., Song, X.J., Ito, M., Asano, K., Matsuoka, M.,

Kitano, H. and Ashikari, M. (2010) OsSPL14 promotes panicle branching and higher

grain productivity in rice. Nat. Genet. 42: 545-549.

Sayah, D.M., Sokolskaja, E., Berthoux, L. and Luban, J. (2004) Cyclophilin A

retrotransposition into TRIM5 explains owl monkey resistance to HIV-1. Nature. 430:

569-573.

Sun, C.Q., Wang, X.K., Li, Z.C., Yoshimura, A. and Iwata, N. (2001) Comparison of

the genetic diversity of common wild rice (Oryza rufipogon Griff.) and cultivated rice (O.

sativa L.) using RFLP markers. Theor. Appl. Genet. 102: 157-162.

Tan, L., Liu, F., Xue, W., Wang, G., Ye, S., Zhu, Z., Fu, Y., Wang, X. and Sun C.

(2007) Development of Oryza rufipogon and O. sativa Introgression Lines and

Page 25: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

Assessment for Yield‐related Quantitative Trait Loci. J. Integr. Plant. Biol. 49 :

871-884.

Tan, L., Li, X., Liu, F. et al. (2008) Control of a key transition from prostrate to erect

growth in rice domestication. Nat. Genet. 40: 1360-1364.

Tautz, D and Domazet-Lošo, T. (2011) The evolutionary origin of orphan genes. Nat.

Rev. Genet. 12: 692-702.

Turcotte, K., Srinivasan, S. and Bureau, T. (2001) Survey of transposable elements

from rice genomic sequences. Plant J. 25: 169-179.

Vaughan, D. A., Morishima, H and Kadowaki, K. (2003) Diversity in the Oryza

genus. Curr. Opin. Plant.Biol. 6: 139-146.

Wang, W., Brunet, FG ., Nevo, E., et al. (2002) Origin of sphinx, a young chimeric

RNA gene in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA. 99: 4448-4453.

Wang, W., Zheng, H., Fan, C. et al. (2006) High rate of chimeric gene origination by

retroposition in plant genomes. Plant Cell. 18: 1791-1802.

Wang, Y.H. and Li, J.Y. (2011) Branching in rice. Curr. Opin. Plant.Biol. 14 : 94-99.

Wang, Z., Zou, Y., Li, X. et al. (2006) Cytoplasmic male sterility of rice with boro II

cytoplasm is caused by a cytotoxic peptide and is restored by two related PPR motif

genes via distinct modes of mRNA silencing. Plant Cell. 18: 676-687.

Xing, Y.Z. and Zhang, Q.F. (2010) Genetic and molecular bases of rice yield. Annu. Rev.

Plant Biol. 61: 421-442.

Xue, W., Xing, Y., Weng, X. et al. (2008) Natural variation in Ghd7 is an important

regulator of heading date and yield potential in rice. Nat. Genet. 40: 761-767.

Page 26: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

Yamagishi, M., Takeuchi, Y., Kono, I. and Yano, M. (2002) QTL analysis for panicle

characteristics in temperate japonica rice. Euphytica. 128: 219-224.

Yan, W., Wang, P., Chen, H. et al. (2011) A major QTL, Ghd8, plays pleiotropic roles in

regulating grain productivity, plant height, and heading date in rice. Mol. Plant. 4:

319-330.

Yoshida, A., Sasao, M., Yasuno, N. et al. (2013) TAWAWA1, a regulator of rice

inflorescence architecture, functions through the suppression of meristem phase transition.

Proc. Natl. Acad. Sci. USA. 110: 767-772.

Yoshioka, K., Moeder, W., Kang, H.G., Kachroo, P., Masmoudi, K., Berkowitz, G.

and Klessig, D. F. (2006) The chimeric Arabidopsis CYCLIC NUCLEOTIDE-GATED

ION CHANNEL11/12 activates multiple pathogen resistance responses. Plant Cell. 18:

747-763.

Zhang, C., Wang, J., Marowsky, N.C., Long, M., Wing, R.A. and Fan, C. (2013) High

Occurrence of Functional New Chimeric Genes in Survey of Rice Chromosome 3 Short

Arm Genome Sequences. Genome Biol Evol. 5: 1038-1048.

Zhang, Q.F. (2007) Strategies for developing green super rice. Proc. Natl. Acad. Sci.

USA. 104: 16402-16409.

Zhao, L., Tan, L., Zhu, Z., Xiao, L., Xie, D. and Sun, C. (2015) PAY1 improves plant

architecture and enhances grain yield in rice. Plant J. 83: 528-536.

Zhou, Y., Zhu, J., Li, Z., et al. (2009). Deletion in a quantitative trait gene qPE9-1

associated with panicle erectness improves plant architecture during rice

domestication. Genetics. 183: 315-324.

Page 27: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

FIGURE LEGENDS

Figure 1. Phenotypes of ‘YIL19’ and ‘Teqing’

(a) Phenotypes of ‘YIL19’ and ‘Teqing’ plants at the maturity stage.

(b) The panicle structures of ‘YIL19’ and ‘Teqing’. The white arrow represents the place

where a secondary branch emergences. White bars = 5 cm.

(c) Stem structures of ‘YIL19’ and ‘Teqing’. On the left is the length of each internode in

‘Teqing’, and on the right is ‘YIL19’.

(d–k) Comparison of grain number per panicle, panicle length, number of main panicle,

mean length of primary branch, secondary branch number, grain number of secondary

branch, plant height and heading date between ‘YIL19’ and ‘Teqing’. All of the data

above are given as mean ± SE (n = 20). The double asterisks in (d-k) represent

significance differences determined by the Student’s t-test at the level of P < 0.01.

Figure 2. Fine-mapping and gene candidates for GN2

(a) The introgression condition of ‘YIL19’. The boxes represent rice chromosomes: black

boxes represent the chromosome segment from ‘Yuanjiang’ common wild rice, and the

white boxes represent the segments from ‘Teqing’.

(b) Grain number of ‘YIL19’ and ‘Teqing’, and their heterosis.

(c) Primary mapping of GN2 between markers RM318 and RM6312. cM, centimorgan; R

represents the number of individuals in a population.

(d) Fine mapping of GN2 harboring a 47-kb region of AP005303

(e) The putative candidate genes for GN2 based on the annotation of ‘Nipponbare’. The

white boxes represent the exons of genes based on the annotated information.

Page 28: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

(f) The gene structures of LOC_Os02g56630 in ‘YIL19’ and ‘Teqing’. The black box

indicates the exon of LOC_Os02g56630; the black lines indicate introns and intergenic

regions, while the red box indicates the inserted DNA segment of LOC_Os02g56630 in

‘YIL19’.

(g) The mRNA splices of new transcripts. The boxes indicate the exons of the new

transcripts, while different types of stripes and dots in the boxes represent the sequences

having different origins.

(h) The gene structure of the new transcripts. The boxes indicate the exons of new

transcripts, while black lines among the boxes are the introns and intergenic sequences.

The direct repeat sequences are marked by TGCATAAGCA, while the polyadenylation

sites are marked by AATAAA.

(i) The alternative splicing style of the new transcripts. The sequence in yellow is the

exon of the new transcripts; the sequence in red represents the splicing site. The black

arrows are pointing to the splicing site. The 3-bp deletion sequence in GN2.2 is

underlined with a black line.

Figure 3. Complementation test of GN2

(a) Phenotype comparison between the GN2 over-expressing transgenic plants and

control plants transformed with an empty plasmid.

(b) Panicle comparison between the GN2 over-expressing transgenic plants and control

plants transformed with an empty plasmid. Oe-GN2 here represents the GN2.1

over-expressing transgenic line. White bars = 5 cm.

(c) The relative expression levels of GN2 in different GN2 over-expressing transgenic

Page 29: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

plants. CL1 represents the ‘Teqing’ control line with an empty plasmid used for analysis.

pOE4 and pOE6 represent GN2.1 over-expressing transgenic lines used for analysis.

(d-i) Comparisons of grain number per panicle, panicle length, average length of primary

branch, number of secondary branch, plant height and heading date among GN2

over-expressing transgenic lines and their control lines.

All of the data above are presented as mean ± SE (n = 20). The double asterisks represent

significance differences determined by the Student’s t-test at the level of P < 0.01.

Figure 4. The spatial expression pattern of GN2

(a) The spatial expression pattern of GN2. Se, seedling; R, root; S, stem; L, leaf; LS, leaf

sheath; TB, tiller base; SAM, shoot apical meristem; P1, young panicle shorter than 1 cm;

P2, young panicle longer than 1 cm and shorter than 5 cm; P3, young panicle longer than

5 cm; P4 mature panicle; SH, spikelet hull. Values are means ± SD.

(b) The occurrence frequencies of different GN2 transcripts. The dot indicates the 3-bp

deletion in GN2.2

Figure 5. Emergence process of GN2

(a) Initial gene structures of LOC_Os02g45150 and LOC_Os02g56630. The boxes

indicate the exons of the genes; the dark blue box indicates the CDS region based on the

annotation of ‘Nipponbare’, while the connecting line in dark blue represents the intron.

The black connecting line and delimiter represent chromosome regions.

(b) The movement of the DNA fragment from LOC_Os02g45150 to the putative exon of

LOC_Os02g56630. The box and connecting line in red represent the “sheared” DNA

Page 30: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

fragment in LOC_Os02g45150, and the triangles in purple represent the

‘TGCATAAGCA’ repeats.

(c) The gene structures of LOC_Os02g45150 and LOC_Os02g56630 after the insertion

event.

(d) The emergence of the GN2 gene. The box and triangle in green represent the exons of

GN2, and the connecting line in green represents the intron of GN2. The dotted line in

black represents the ancestral gene structure of LOC_Os02g56630, and the dotted line in

red represents the inserted genomic sequence. The red right angle arrow represents the

transcriptional starting site, and the AATAAA represents the polyadenylation site of GN2.

Page 31: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

Teqing YIL19

Head

ing

Dat

e

0

80

100

120

140

Teqing YIL19

Plan

t Hei

ght (

cm)

0

60

80

100

Teqing YIL19

Gra

in n

umbe

r of S

econ

dary

Bra

nch

0

40

60

80

100

120

140

160

Teqing YIL19

Pani

cle

Leng

th (c

m)

0

15

20

25

30

Teqing YIL19

Num

ber o

f Prim

ary

Bra

nch

0

10

12

14

Teqing YIL19

Num

ber o

f Sec

onda

ry B

ranc

h

0

10

20

30

40

50

60

Teqing YIL19

Gra

in N

umbe

r per

Pan

icle

0

20

40

60

80

100

120

140

160

180

200

Teqing YIL19Mea

n Le

ngth

of P

rimar

y B

ranc

h (c

m)

0

6

8

10

12

b

d e f g

h i j k

Figure 1. Phenotypes of ‘YIL19’ and ‘Teqing’.

a b c

Page 32: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice
Page 33: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

CL1 pOE4 pOE6

Plan

t Hei

ght (

cm)

020

60

80

100

CL1 YIL19 pOE4 pOE6

Rel

ativ

e E

xpre

ssio

n L

evel

0

2000

4000

6000

8000

10000

12000

CL1 pOE4 pOE6

Gra

in N

umbe

r per

Pan

icle

0

50

100

150

200

250

CL1 pOE4 pOE6

Head

ing

Dat

e

0

80

100

120

140

CL1 pOE4 pOE6

Num

ber o

f Sec

onda

ry B

ranc

h

0

20

30

40

50

60

CL1 pOE4 pOE6

Pani

cle

leng

th (c

m)

0

15

20

25

CL1 pOE4 pOE6Mea

n Le

ngth

of P

rimar

y B

ranc

h (c

m)

0

6

8

10

12

a b c

d e f

g h i

TQ-Control Oe-GN2 TQ-Control Oe-GN2

Figure 3. Complementation test of GN2.

Page 34: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice
Page 35: Emergence of a Novel Chimeric Gene Underlying Grain Number ... · 15/12/2016  · 106 discovery of GN2 could be useful in elucidating the molecular mechanisms underlying 107 . rice

The segment was cut off from genome by ‘TGCATAAGCA’ repeat and then inserted into the position of LOC_Os02g56630

GN2 got the TSS from insert sequence, and the fusion of two ancestral sequence formed the GN2 gene

‘TGCATAAGCA’ repeat

Chr2:LOC_Os02g45150 LOC_Os02g56630a

b

c

d

GN2

AATAAA

TSS

Figure 5. Emergence process of GN2