
ORIGINAL PAPER

New wheat microRNA using whole-genome sequence

Kuaybe Yucebilgili Kurtoglu & Melda Kantar & Hikmet Budak

Received: 16 September 2013 / Revised: 6 December 2013 / Accepted: 22 December 2013
© Springer-Verlag Berlin Heidelberg 2014

Abstract MicroRNAs are post-transcriptional regulators of gene expression, taking roles in a variety of fundamental biological processes. Hence, their identification, annotation and characterization are of great significance, especially in bread wheat, one of the main food sources for humans. The recent availability of the 5× coverage Triticum aestivum L. whole-genome sequence provided us with the opportunity to perform a systematic prediction of a complete catalogue of wheat microRNAs. Using an in silico homology-based approach, stem-loop coding regions were derived from two assemblies constructed from wheat 454 reads. To avoid the presence of pseudo-microRNAs in the final data set, transposable element-related stem-loops were eliminated by repeat analysis. Overall, 52 putative wheat microRNAs were predicted, including seven that have not been previously published. Moreover, with distinct analysis of the two different assemblies, both variety and representation of putative microRNA-coding stem-loops were found to be predominant in the intergenic regions. By searching available expressed sequences and small RNA library databases, expression evidence for 39 (out of 52) putative wheat microRNAs was provided. Expression of three of the predicted microRNAs (miR166, miR396 and miR528) was also comparatively quantified with real-time quantitative reverse transcription PCR. This is the first report on in silico prediction of a whole repertoire of bread wheat microRNAs, supported by wet-lab validation.

Keywords Triticum aestivum . MicroRNA . MicroRNA prediction . Next-generation sequencing . Real-time quantitative reverse transcription PCR

Abbreviations
miRNA  MicroRNA
pri-miRNA  Primary miRNA
pre-miRNA  Precursor miRNA
RISC  RNA-induced silencing complex
SVM  Support vector machine
EST  Expressed sequence tag
LCG  Low copy number assembly
TREP  Triticeae repeat database
OG  Orthologous group assembly
OGR  Orthologous group representatives
cDNA  Complementary DNA
MFE  Minimal folding free energy
MITE  Miniature inverted terminal repeat element
TE-MIR  Transposable element-related miRNA group
GEO  Gene expression omnibus
TIR  Terminal inverted repeat
LTR  Long terminal repeat
siRNA  Small interfering RNA

Kuaybe Yucebilgili Kurtoglu and Melda Kantar equally contributed to this research.

Electronic supplementary material The online version of this article (doi:10.1007/s10142-013-0357-9) contains supplementary material, which is available to authorized users.

K. Y. Kurtoglu : M. Kantar : H. Budak (*)
Biological Sciences and Bioengineering Program, Sabanci University, Orhanli 34956, Istanbul, Turkey
e-mail: [email protected]

Funct Integr Genomics
DOI 10.1007/s10142-013-0357-9

Introduction

Bread wheat (Triticum aestivum L.) is one of the most extensively grown crops, with a global annual production of over 650 million tonnes. It is a fundamental source for human food consumption, providing approximately 20 % of the dietary energy supply (http://www.fao.org, 2011). The hexaploid bread wheat genome is large (~17 Gbp), highly repetitive (80 %) and complex, having three homeologous but divergent subgenomes (AABBDD, 2n=42) (Hernandez et al. 2012; Kantar et al. 2012). Despite its complexity, great progress has been achieved in elucidating the genomic background of wheat with the recent advances in next-generation sequencing technology (Hernandez et al. 2012; Vitulo et al. 2011).

MicroRNAs (miRNAs) are small (19–24 nucleotides), endogenous, single-stranded, non-coding RNA molecules that are central to the regulation of gene expression at the post-transcriptional level (Jones-Rhoades et al. 2006). Since the discovery of the first plant miRNA over a decade ago (Llave et al. 2002), they have been extensively identified and studied in a variety of plants. Several lines of research have shown that a great majority of miRNAs are highly conserved across plant species (Jones-Rhoades et al. 2006). Additionally, the general networks underlying their complex biogenesis and modes of silencing have been widely characterized. Plant primary miRNAs (pri-miRNAs) are initially transcribed from miRNA genes and then undergo a series of cleavage steps, leading to the production of a miRNA/miRNA* duplex via intermediate hairpin structures called precursor miRNAs (pre-miRNAs). Mature miRNAs, retrieved from these duplexes, are loaded into an RNA-induced silencing complex (RISC), which directs them to their targets to induce silencing through either translational inhibition or mRNA cleavage (Bartel 2004; Park et al. 2005; Unver et al. 2009).

Up to the present, plant miRNA identification reports have adopted either experimental or computational approaches. Among wet-lab techniques, one of the most powerful is the construction and sequencing of small RNA libraries. Small RNA reads retrieved with this method were used by several researchers to discover miRNAs in pooled tissues of wheat grown under normal conditions (Kenan-Eichler et al. 2011; Wei et al. 2009; Yao et al. 2007; Li et al. 2013). The same strategy was used to identify miRNAs under conditions of stress, like infections of Fusarium (De Paoli et al., unpublished data), powdery mildew (Xin et al. 2010) and Puccinia graminis (Gupta et al. 2012) or exposure to extreme heat (Xin et al. 2010) or cold (Tang et al. 2012). Other experimental methods include hybridization-based techniques such as northern blot analysis (Yu et al. 2012) and expression profiling with microarray technology. The latter was used to verify the expression of predicted miRNAs in cold-stressed bread wheat (Tang et al. 2012) and in other Triticum species and relatives exposed to water limitation (Budak and Akpinar 2011; Kantar et al. 2011).

Although high-throughput experimental techniques are valuable for miRNA identification, they also have downsides. The currently applied methods are labour- and resource-intensive, and an experimental run is limited to a single or few samples. This constitutes only a small subset of the various samples that can be retrieved from a single species, considering the whole set of possible spatio-temporal representations, and the infinite number of growth conditions, of a single plant. Thus, results retrieved from small RNA libraries and microarrays do not represent the whole miRNA repertoire of a plant, since miRNA expression is known to be highly tissue specific and extensively regulated through development and by stress conditions (Bartel 2004; Unver et al. 2009).

The limitations of experimental techniques can be circumvented by computational methods used for miRNA identification. With these strategies, even miRNAs that are very rare or highly specific to tissue, condition or developmental stage can be predicted. In silico miRNA prediction utilizes the conserved secondary structure features of pre-miRNA stem-loops for derivation of miRNAs from an available sequence database (Lucas and Budak 2012; Unver et al. 2009). The conventional computational method additionally takes into account mature miRNA sequence conservation among species and predicts homologs of previously known miRNAs. In order to identify novel miRNAs, several new predictive tools have been recently developed. These use a support vector machine (SVM) algorithm to define a set of characteristics from empirically obtained data and utilize this set of criteria for elimination of pseudo stem-loops (Wu et al. 2011; Xuan et al. 2011; Teune and Steger 2010; Kadri et al. 2009). In silico methods have been previously used to retrieve putative T. aestivum miRNAs from expressed sequence tags (ESTs) (Jin et al. 2008; Zhang et al. 2005; Yin and Shen 2010; Pandey et al. 2013), partial genomic sequences (Dryanova et al. 2008; Schreiber et al. 2011; Yin and Shen 2010) and survey sequences of wheat chromosomes (Kantar et al. 2012; Lucas and Budak 2012; Vitulo et al. 2011; Kurtoglu et al. 2013). Furthermore, wild and close relatives of bread wheat were also subjected to in silico miRNA prediction (Dryanova et al. 2008; Unver and Budak 2009). Yet a complete catalogue of whole-genome wheat miRNAs and their in-depth analysis is still unavailable.

Over the last decade, plant miRNAs have been shown to play a variety of essential roles, encompassing several physiological processes from development to stress responses (Unver et al. 2009; Dugas and Bartel 2004; Khraiwesh et al. 2012). Hence, it is important to identify miRNA-coding sequences in bread wheat, which will provide a basis for miRNA-related functional studies, from which the retrieved data can be exploited for crop improvement. Thus, in this report, we performed a systematic prediction of conserved wheat miRNAs from contigs of two recently published assemblies (Brenchley et al. 2012), one representing the whole genomic sequence and the other encompassing gene-rich regions of wheat, both constructed from shotgun sequences of the bread wheat whole genome. We also provided an experimental comparative quantification for a selected subset of the predicted miRNAs.


Materials and methods

Reference miRNAs

For in silico prediction of putative conserved miRNAs in the wheat genome, a query containing previously identified plant mature miRNA sequences was used. From 67 different plant species, 5,940 sequences were downloaded from miRBase release 19 (August 2012) (Kozomara and Griffiths-Jones 2011). In cases where several miRNAs had identical mature miRNA sequences, only one was retained, leaving a total of 3,228 unique, mature miRNA sequences.
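The collapsing of identical mature sequences described above can be sketched as follows. This is a minimal illustration, not the authors' script: the record names and sequences are toy values, and the normalisation of T to U is an assumption made so that DNA and RNA spellings of the same sequence are treated as identical.

```python
def unique_mature_sequences(records):
    """Keep one representative name per distinct mature sequence.

    `records` is an iterable of (name, sequence) pairs, e.g. parsed from a
    FASTA file of mature miRNAs.
    """
    representatives = {}
    for name, seq in records:
        key = seq.upper().replace("T", "U")  # assumption: T and U are equivalent
        representatives.setdefault(key, name)  # first name seen represents the group
    return representatives


# Toy records (hypothetical names and sequences, for illustration only)
toy_records = [
    ("speciesA-miRx", "UGACAGAAGAGAGUGAGCAC"),
    ("speciesB-miRx", "TGACAGAAGAGAGTGAGCAC"),  # same sequence, DNA spelling
    ("speciesA-miRy", "UCGGACCAGGCUUCAUUCCCC"),
]
unique_records = unique_mature_sequences(toy_records)
print(len(toy_records), "records ->", len(unique_records), "unique sequences")
```

Applied to the miRBase download, the same reduction would take the 5,940 records to the 3,228 unique sequences reported in the text.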

Wheat assemblies

Homology-based computational miRNA identification, as will be explained in detail in ‘Identifying miRNAs by sequence similarity and secondary structure conservation’, was performed on low-copy-number genome (LCG) and orthologous group (OG) assemblies, which were retrieved from a recent publication (Brenchley et al. 2012). For expression analysis of computationally predicted wheat miRNAs, which is described in ‘Expression analysis of identified pre-miRNA sequences’, the transcriptome assembly from the same study (Brenchley et al. 2012) was used.

The LCG assembly contained 5,321,847 contigs assembled from clustered sets of 454 sequences, corresponding to a total sequence of 3,800,325,216 base pairs (bp). Roche 454 pyrosequencing (GS FLX Titanium and GS FLX+ platforms) was performed on purified nuclear DNA of wheat variety Chinese Spring (CS42). From the sequencing process, 85 Gbp of sequence (220 million reads) was retrieved. This corresponded to approximately fivefold coverage, based on the estimated wheat genome size (17 Gbp). Prior to the construction of the low-copy-number genome assembly, 60 % of the reads, which matched repeats (from the Triticeae Repeat Database (TREP)), wheat chloroplast and mitochondrial genomes or ribosomal sequences (from the SILVA database), were removed. After filtering, assembly of the remaining reads (40 %, approximately 87 million) was performed using gsAssembler from the Newbler package (development version 2.6 pre).
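The coverage figure quoted above follows directly from the reported totals. The short calculation below is only a worked check of that arithmetic; the mean read length is derived here from the reported totals and is not a value stated in the text.

```python
GENOME_SIZE_BP = 17e9   # estimated wheat genome size (17 Gbp)
SEQUENCED_BP = 85e9     # total sequence retrieved (85 Gbp)
TOTAL_READS = 220e6     # number of 454 reads
FRACTION_KEPT = 0.40    # reads remaining after repeat/organelle/rRNA filtering


def fold_coverage(sequenced_bp, genome_size_bp):
    """Average number of times each genome position is sequenced."""
    return sequenced_bp / genome_size_bp


coverage = fold_coverage(SEQUENCED_BP, GENOME_SIZE_BP)
mean_read_length = SEQUENCED_BP / TOTAL_READS  # derived, not reported in the text
reads_kept = TOTAL_READS * FRACTION_KEPT

print(f"coverage: {coverage:.1f}x")
print(f"mean read length: {mean_read_length:.0f} bp")
print(f"reads kept: {reads_kept / 1e6:.0f} million")
```

The product of the rounded 40 % and 220 million reads is 88 million, slightly above the approximately 87 million reported, which suggests that the percentage in the text is rounded.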

The OG assembly, representing the genic regions of the wheat genome, consisted of 949,279 sequences, including both contigs clustered from 454 sequences and reads that remained as singletons. This corresponded to a total sequence of 437,512,281 bp. The assembly was performed using the Roche GS de novo Newbler assembler software v2.5.3. Reads for the assembly were selected by a similarity search of repeat-masked sequences (masked with Vmatch against the MIPS-REdat Poaceae v8.6.2 library) against orthologous group representatives (OGRs). OGRs, which are the gene models with the highest similarity to wheat, were selected from each orthologous group by blasting against an LCG assembly. The orthologous groups (20,496) were derived from databases of genomic sequences of rice, sorghum, Brachypodium and barley full-length complementary DNAs (cDNAs) using OrthoMCL clustering.

The transcriptome assembly was prepared with the Newbler gsAssembler (version 2.3) and contained 97,481 contigs corresponding to 93,340,842 bp in total. The cDNA read data set (4,859,388 reads corresponding to 1.6 Gbp) used in this assembly was obtained by Roche 454 pyrosequencing of wheat variety Chinese Spring (CS42) cDNA on GS FLX Titanium and GS FLX+ platforms. Sources of RNA samples for cDNA synthesis included several CS42 tissues under various treatments: immature and semi-mature tissues (seed, root, flower, young shoot and leaf); 3- and 6-day salt-stressed tissues (shoot and root); 6-day drought-stressed leaves; and circadian (0, 6, 12, 18 h) sampled seedling leaves. After the assembly of reads, contigs shorter than 100 bp and those with hits to the TREP database were removed (Brenchley et al. 2012).

In this study, three separate BLAST databases of the above-mentioned LCG, OG and transcriptome assemblies were constructed using an NCBI BLAST+ stand-alone toolkit, version 2.2.25+ release (March 2011) (Camacho et al. 2009).

Identifying miRNAs by sequence similarity and secondary structure conservation

A two-step procedure was applied for miRNA prediction: (1) selection of T. aestivum assembly sequences with homology to previously known plant mature miRNAs and (2) further elimination of sequences derived from the assembly, based on the consistency of their secondary structures with general features of pre-miRNAs (Kantar et al. 2010, 2011; Lucas and Budak 2012; Unver and Budak 2009; Zhang et al. 2005). Filtering according to these rules was performed with two in-house Perl scripts, SUmirFind and SUmirFold (Kantar et al. 2011; Lucas and Budak 2012; Kurtoglu et al. 2013). SUmirFind uses BLAST version 2.2.25+ release (March 2011) to search a query of plant mature miRNA sequences against the created databases of wheat assemblies, filtering all hits to eliminate any wheat sequences that differ from the miRNA query by more than 3 bp (or by any other specified mismatch criterion) (Zhang et al. 2005). SUmirFold utilizes UNAFold version 3.8 (an implementation of the Zuker folding algorithm) for single-stranded RNA structure prediction and selects the lowest minimum free energy (MFE) structure for further evaluation in a two-step process (Markham and Zuker 2008). SUmirFold initially predicts the secondary structure and examines base pairing between the putative miRNA:miRNA* duplex for all assembly sequences remaining after SUmirFind implementation. For all hits passing this criterion, the sequence region surrounding the putative miRNA:miRNA* is excised, refolded and checked for several other characteristics of a pre-miRNA structure (Kantar et al. 2010, 2011; Lucas and Budak 2012; Unver and Budak 2009; Zhang et al. 2005; Kurtoglu et al. 2013). The stem-loop results obtained from SUmirFold were saved and also manually inspected. In this process, all hairpins not consistent with the features of the pre-miRNA structure, including those with multi-branched loops, were eliminated. Furthermore, in cases where identical miRNAs were predicted from two similar query mature miRNA sequences, only one was retained.
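The homology step can be illustrated with a small mismatch filter. This is a sketch under stated assumptions, not the SUmirFind code: it assumes the query and the matched assembly segment have already been aligned without gaps and have equal length, and the sequences below are toy values.

```python
def count_mismatches(query: str, hit: str) -> int:
    """Count positions at which two ungapped, equal-length sequences differ."""
    if len(query) != len(hit):
        raise ValueError("ungapped comparison needs equal-length sequences")
    return sum(1 for q, h in zip(query.upper(), hit.upper()) if q != h)


def passes_mismatch_filter(query: str, hit: str, max_mismatches: int = 3) -> bool:
    """Keep a hit only if it differs from the query by at most `max_mismatches`."""
    return count_mismatches(query, hit) <= max_mismatches


# Toy sequences (hypothetical, for illustration only)
toy_query = "UGACAGAAGAGAGUGAGCAC"
toy_hit_close = "UGACAGAAGAGAGUGAGCUU"  # differs at the last two positions
toy_hit_far = "AAAAAUAAGAGAGUGAGCAC"    # differs at four positions near the 5' end

print(passes_mismatch_filter(toy_query, toy_hit_close))
print(passes_mismatch_filter(toy_query, toy_hit_far))
```

The default threshold of three matches the criterion reported in the Results, where retained sequences carry three or fewer mismatches to a published mature miRNA.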

Identifying repetitive elements in predicted miRNA-coding sequences

Repetitive elements in the pre-miRNA sequences predicted to be present in the LCG and OG assemblies were identified and masked. These sequences were run against a custom library of 4,526 Triticum and Triticeae repeats, extracted from MIPS-REdat Poaceae v9.0 (http://mips.helmholtz-muenchen.de/plant/recat/). A semi-automated pipeline, RepeatMasker version 3.2.9 (www.repeatmasker.org), was used at default settings, with Cross-Match (www.phrap.org/phredphrapconsed.html) as the alignment algorithm. pre-miRNAs with more than 50 % of their length covered by repeats were saved separately and marked as ‘transposable element-related miRNA groups’ (TE-MIRs). miRNA types corresponding only to non-repetitive stem-loops were termed ‘high confidence’, and those corresponding to TE-MIRs as well as non-repetitive hairpins were termed ‘low confidence’.
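The 50 % rule and the confidence labels can be sketched as below. This is an illustration under assumptions: masked bases are taken to appear as `N` in the masked hairpin sequence, and the `excluded` label reflects the Table 4 caption, which states that miRNAs whose predicted pre-miRNA sequences were all masked by repeats were removed from the final set. Function names are hypothetical.

```python
def repeat_fraction(masked_hairpin: str) -> float:
    """Fraction of a masked pre-miRNA sequence covered by repeats (N = masked)."""
    return masked_hairpin.upper().count("N") / len(masked_hairpin)


def is_te_mir(masked_hairpin: str, threshold: float = 0.5) -> bool:
    """A hairpin is a TE-MIR when more than `threshold` of it is repeat-covered."""
    return repeat_fraction(masked_hairpin) > threshold


def classify_mirna(masked_hairpins) -> str:
    """Label one miRNA from the masked sequences of all its predicted hairpins."""
    flags = [is_te_mir(h) for h in masked_hairpins]
    if not any(flags):
        return "high confidence"   # only non-repetitive stem-loops
    if all(flags):
        return "excluded"          # every stem-loop is repeat-related
    return "low confidence"        # mixture of TE-MIRs and non-repetitive hairpins


print(classify_mirna(["ACGUACGUAC"]))
print(classify_mirna(["NNNNNNNNAC", "ACGUACGUAC"]))
print(classify_mirna(["NNNNNNNNAC"]))
```

A genuine ambiguous base written as `N` in the input would be counted as masked here, so a real pipeline would more safely read the repeat coordinates from the masking tool's annotation output.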

Representation analysis of whole-genome and genic pre-miRNAs

Representations in the genome and genic regions of wheat were separately evaluated. Representation for each miRNA was calculated by counting its corresponding stem-loop coding regions: only non-repetitive hairpins for high-confidence predictions and all hairpins (both non-repetitive and TE-MIRs) for low-confidence predictions. The representation data set included all identical pre-miRNA sequences predicted to be located in different assembly sequences or at different locations on the same assembly sequence, in addition to all unique putative pre-miRNA sequences. Identical pre-miRNA sequences located at the same genomic region were also included separately in the fold analysis in cases where the mature miRNA was located on different arms of the hairpin.
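The counting rule above can be sketched as a tally of distinct stem-loop locations per miRNA. This is a minimal illustration with hypothetical field names (`mirna`, `contig`, `start`, `arm`): identical sequences at different locations are counted separately, and the same region is counted twice only when the mature miRNA lies on different arms.

```python
from collections import Counter


def representation(hairpins):
    """Count distinct (contig, start, arm) stem-loop locations for each miRNA."""
    locations = {(h["mirna"], h["contig"], h["start"], h["arm"]) for h in hairpins}
    return Counter(mirna for mirna, _, _, _ in locations)


# Toy predictions (hypothetical identifiers and coordinates)
toy_hairpins = [
    {"mirna": "miRx", "contig": "contig1", "start": 10, "arm": "5p"},
    {"mirna": "miRx", "contig": "contig2", "start": 300, "arm": "5p"},  # other contig
    {"mirna": "miRx", "contig": "contig1", "start": 10, "arm": "3p"},   # other arm
    {"mirna": "miRx", "contig": "contig1", "start": 10, "arm": "5p"},   # duplicate record
    {"mirna": "miRy", "contig": "contig3", "start": 55, "arm": "3p"},
]
toy_counts = representation(toy_hairpins)
print(dict(toy_counts))
```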

Expression analysis of identified pre-miRNA sequences

Expression analysis was performed on repeat-masked pre-miRNA sequences retrieved from the LCG and OG assemblies. These sequences were searched against two databases: (1) the transcriptome assembly and (2) the NCBI T. aestivum (taxid:4565) EST database. The BLASTN search against the transcriptome assembly was performed with an NCBI BLAST+ stand-alone toolkit, version 2.2.25+ release (March 2011) (Camacho et al. 2009) using default parameters. ESTs were screened using the NCBI online BLAST tool, using BLASTN megablast for detection of highly similar sequences with default parameters. The combined results were saved and further filtered for 98 % similarity and 99 % query coverage.
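The final filtering step can be sketched as follows. This is an illustration under assumptions: each hit is represented as a dictionary keyed by BLAST tabular field names (`pident`, `qstart`, `qend`, `qlen`), the thresholds are read as inclusive minimums, and query coverage is computed from the aligned query span. The hit values are toy numbers.

```python
def query_coverage(qstart: int, qend: int, qlen: int) -> float:
    """Percentage of the query (pre-miRNA) covered by the alignment."""
    return 100.0 * (qend - qstart + 1) / qlen


def keep_hit(hit, min_identity: float = 98.0, min_coverage: float = 99.0) -> bool:
    """Retain hits meeting both the identity and the query-coverage threshold."""
    coverage = query_coverage(hit["qstart"], hit["qend"], hit["qlen"])
    return hit["pident"] >= min_identity and coverage >= min_coverage


# Toy hits (hypothetical values)
toy_hits = [
    {"pident": 100.0, "qstart": 1, "qend": 120, "qlen": 120},  # full-length, identical
    {"pident": 97.5, "qstart": 1, "qend": 120, "qlen": 120},   # identity too low
    {"pident": 99.2, "qstart": 1, "qend": 100, "qlen": 120},   # partial coverage
]
kept_hits = [h for h in toy_hits if keep_hit(h)]
print(len(kept_hits), "of", len(toy_hits), "hits retained")
```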

Screening of small RNA libraries for predicted miRNAs

All wheat small RNA libraries deposited in NCBI Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) were downloaded and prepared as the query. GEO accessions of small RNA libraries and detailed information on the samples used for their construction are given in Table 1. A database of repeat-masked pre-miRNA sequences predicted from the LCG and OG assemblies was constructed. Against this database, a BLASTN search of the query was performed using the following parameters: -ungapped -dust no -evalue 10 -strand plus -perc_identity 100 -word_size 18 -outfmt 6. Database construction and BLASTN search were performed using an NCBI BLAST+ stand-alone toolkit, version 2.2.25+ release (March 2011) (Camacho et al. 2009). Search rules were optimized to detect the segments (>18 bp) of predicted pre-miRNAs with perfect matches to regions in small RNA library sequences. Only alignments covering the full length of the putative mature miRNA or miRNA* sequences were selected. This was performed by comparing the length and stem-loop location of the matched sequence region with the length and stem-loop location of the putative mature miRNAs/miRNA*s.
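The full-length selection rule can be sketched as a coordinate containment check. This is an illustration, not the authors' procedure: it assumes 1-based inclusive coordinates on the pre-miRNA for both the aligned segment and the putative mature miRNA (or miRNA*), and the coordinates shown are toy values.

```python
def covers_mature(aln_start: int, aln_end: int,
                  mature_start: int, mature_end: int) -> bool:
    """True if the aligned segment spans the whole putative mature miRNA."""
    lo, hi = sorted((aln_start, aln_end))  # tolerate reversed coordinate order
    return lo <= mature_start and hi >= mature_end


# Toy example: mature miRNA predicted at positions 12-32 of a stem-loop
print(covers_mature(12, 32, 12, 32))  # read matches the mature miRNA exactly
print(covers_mature(10, 35, 12, 32))  # read extends beyond both ends
print(covers_mature(15, 32, 12, 32))  # read misses the first three bases
```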

Plant materials and growth conditions

T. aestivum L. cv. Chinese Spring (AABBDD) was grown in normal greenhouse conditions (16-h light at 22 °C and 8-h dark at 18 °C). Seeds were surface sterilized and vernalized in petri dishes for 3–4 days at 4 °C. Seedlings were transferred to pots containing soil supplemented with 200 ppm N, 100 ppm P and 20 ppm S. Leaf tissue was collected from adult plants and stored at −80 °C.

Total RNA isolation of T. aestivum leaf samples

RNA isolation from frozen leaf tissue was carried out using TRI Reagent (Sigma, MO, USA) according to the manufacturer’s instructions. Quality and quantity of the isolated RNA were measured using a Nanodrop ND-100 spectrophotometer (Nanodrop Technologies, Wilmington, DE, USA). Its integrity was confirmed by separating the major rRNA bands in agarose gels. DNase treatment of 1 μg of total RNA was performed in a 10-μL reaction mixture with 1 U of DNase I (deoxyribonuclease I) (Fermentas). RNA was stored at −20 °C.

Stem-loop reverse transcription

Stem-loop reverse transcription (RT) primers for miR166, miR396 and miR528 were designed according to Varkonyi-Gasic et al. (2007) (Table S1). miRNA-specific RT reactions were performed as outlined in Unver and Budak (2009), along with separate no-RT primer and no-RNA control reactions (Unver and Budak 2009; Kantar et al. 2010, 2011). Reactions involved a mixture of 1 μL of DNase-treated RNA (100 ng) and 1 μL of 1 μM stem-loop RT primer (50 nM), which was completed to 12.5 μL with DEPC-treated water, incubated at 65 °C for 5 min and subsequently chilled on ice for 2 min. RT reactions were performed with RevertAid H Minus M-MuLV RT (Fermentas) according to the manufacturer’s instructions. The pulsed RT reaction was set as follows: 30 min at 16 °C; 60 cycles at 30 °C for 30 s, 42 °C for 30 s and 50 °C for 1 s; and termination at 70 °C for 10 min. Synthesized cDNAs were stored at −20 °C.

miRNA quantitative real-time PCR

To experimentally validate and quantify mature miRNA expression in T. aestivum, real-time quantitative reverse transcription PCR (qRT-PCR) was performed using FastStart Universal SYBR Green Master (ROX) (Mannheim, Germany) on an iCycler Multicolor Real-time PCR Detection System (Bio-Rad Laboratories). Using 1 μL from a 1:100 dilution of miRNA stem-loop RT products, quantitative PCR was performed in triplicate. Reactions were set up in 20 μL, including 10 μL of 2× Master mix and 0.6 μL of each of the forward and reverse primers (300 nM each). The qRT-PCR thermal set-up was as follows: 95 °C for 10 min, followed by 40 cycles of 95 °C for 15 s, 56 °C for 30 s and 72 °C for 30 s, and a final hold at 72 °C for 7 min. The melting curves were generated by collecting fluorescence signals as the temperature was increased from 55 to 95 °C in 0.5 °C increments (80 cycles) with a dwell time of 10 s.

Table 1 Small RNA library construction studies

Small RNA library (GEO accession) | Total number of small RNA reads in the library | Stress condition | Tissue | Reference
GSM548032 | | Control | Flag-leaf | Cantu et al. (2010)
Total | 331,340 | | |
GSM903664 | | Control | Spike | Tang et al. (2012)
GSM903665 | | Cold | Spike | Tang et al. (2012)
GSM903666 | | Cold | Spike | Tang et al. (2012)
GSM903667 | | Cold | Spike | Tang et al. (2012)
GSM903668 | | Control | Spike | Tang et al. (2012)
GSM903669 | | Control | Spike | Tang et al. (2012)
GSM903670 | | Control | Spike | Tang et al. (2012)
Total | 21,992,419 | | |
GSM406301 | | Control | Root, shoot, leaf, young spike | Wei et al. (2009)
Total | 474,354 | | |
GSM675612 | | Control | Leaf | Xin et al. (2010)
GSM675613 | | Powdery mildew infection | Leaf | Xin et al. (2010)
GSM675614 | | Control | Leaf | Xin et al. (2010)
GSM675615 | | Powdery mildew infection | Leaf | Xin et al. (2010)
GSM675616 | | Control | Leaf | Xin et al. (2010)
GSM675617 | | Heat stress | Leaf | Xin et al. (2010)
Total | 4,264,244 | | |
GSM723049 | | Control | Whole plant | Kenan-Eichler et al. (2011)
GSM723050 | | Control | Whole plant | Kenan-Eichler et al. (2011)
Total | 8,872,172 | | |
GSM803792 | | Control | Leaf | De Paoli et al. (unpublished data)
GSM803793 | | Control | Spikelet | De Paoli et al. (unpublished data)
GSM803794 | | Fusarium infection | Spikelet | De Paoli et al. (unpublished data)
Total | 2,201,199 | | |

Specific forward primers were designed for each individual miRNA, but a common reverse primer was used for all qRT-PCR reactions (Table S1) (Varkonyi-Gasic et al. 2007). For the analysis of quantification, PCR efficiency calculations were performed using the LinRegPCR software (version 2013.0, March 2013 release) retrieved from the publication of Ruijter et al. (2009). qRT-PCR reactions of different miRNAs had comparable efficiencies, which enabled their comparative quantification (Peixoto et al. 2004). In order to distinguish mature miRNA amplicons from dimers, control qRT-PCR reactions were set using no-RT primer and no-RNA reverse transcription products. Analysis of melting curves enabled the observation of distinct miRNA and dimer melting peaks. qRT-PCR products (with 1:5 μl 6× loading dye), along with the controls, were separated using 3 % agarose gels.
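The reason comparable efficiencies permit comparative quantification can be made concrete with the standard efficiency-based relation, in which the starting quantity is the fluorescence threshold divided by the efficiency raised to the quantification cycle. The sketch below is a generic illustration of that relation, not a reproduction of the LinRegPCR implementation or of the authors' data; the efficiency and Cq values are hypothetical.

```python
def starting_quantity(threshold: float, efficiency: float, cq: float) -> float:
    """Estimate starting amount N0 = threshold / efficiency**Cq.

    `efficiency` is the fold increase per cycle (2.0 means perfect doubling).
    """
    return threshold / efficiency ** cq


def relative_expression(eff_a: float, cq_a: float,
                        eff_b: float, cq_b: float,
                        threshold: float = 1.0) -> float:
    """Ratio of the starting quantities of amplicon A to amplicon B."""
    return (starting_quantity(threshold, eff_a, cq_a)
            / starting_quantity(threshold, eff_b, cq_b))


# Hypothetical example: equal efficiencies, A crosses threshold 2 cycles earlier
toy_ratio = relative_expression(eff_a=2.0, cq_a=20.0, eff_b=2.0, cq_b=22.0)
print(toy_ratio)
```

With equal efficiencies the ratio depends only on the difference in Cq, which is why comparable efficiencies between different miRNA assays matter for a comparative reading.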

Results

Identification of conserved miRNAs in whole wheat genome and genic regions

For homology-based prediction of miRNA-coding sequences in wheat, previously known plant mature miRNA sequences were used as queries. Identification was separately performed on two data sets of wheat sequences: (1) LCG (5,321,847 sequences) and (2) OG (949,279 sequences) assemblies, which were used to assess the whole genomic (genic and intergenic) and genic regions of wheat, respectively.

Despite the apparent size difference of the two assemblies, similar numbers of LCG and OG sequences (20,273 and 19,335, respectively) were found to contain segments with three or fewer mismatches to a published mature miRNA. After further structural analysis using UNAFold, 5,659 and 4,662 stem-loop-coding regions were predicted from the LCG and OG assemblies, respectively. Subsequently, redundant hits to identical miRNAs (predicted from similar query mature miRNA sequences) and stem-loops inconsistent with pre-miRNA structure features were manually removed. The remaining data set represented all putative stem-loop-coding sequences, including identical hairpins located at different genomic locations, which amounted to 3,510 (3,217 unique stem-loops corresponding to 81 different miRNAs) and 3,567 (3,009 unique stem-loops corresponding to 45 different miRNAs) from the LCG and OG assemblies, respectively. Cumulatively, a total of 6,112 unique stem-loops corresponding to 85 miRNAs were identified (Tables 2, 3, 4 and 5).
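The per-assembly and cumulative counts above can be related by inclusion-exclusion. The calculation below is a derived reading, assuming that the cumulative totals are the unions of the LCG and OG sets; the shared counts are not stated in the text.

```python
def shared_between_sets(count_a: int, count_b: int, count_union: int) -> int:
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return count_a + count_b - count_union


# miRNAs: 81 from LCG, 45 from OG, 85 cumulatively
shared_mirnas = shared_between_sets(81, 45, 85)

# Unique stem-loops: 3,217 from LCG, 3,009 from OG, 6,112 cumulatively
shared_stem_loops = shared_between_sets(3217, 3009, 6112)

print("miRNAs predicted from both assemblies:", shared_mirnas)
print("unique stem-loops found in both assemblies:", shared_stem_loops)
```

Under that assumption, most OG-derived miRNAs (41 of 45) would also be represented in the LCG predictions, while only a small share of unique stem-loop sequences (114) would be common to both assemblies.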

Identification of repeats in putative miRNA-coding regions

A number of previously annotated plant miRNAs are known to be identical or homologous to transposons and were previously coined TE-MIRs (Li et al. 2011).

Table 2 Previously known miRNAs for which coding sequences were predicted in wheat in this study

Predicted from LCG assembly (whole genome) | Predicted from LCG assembly (whole genome), continued | Predicted from OG assembly (genic regions)
miR1122 (LC) | miR394 | miR1130 (LC)
miR1130 (LC) | miR395 | miR1439 (LC)
miR1318 | miR396 | miR2275
miR1432 | miR397 | miR399
miR1439 (LC) | miR398 | miR5049 (LC)
miR156 | miR399 | miR5070
miR159 | miR415 | miR5175 (LC)
miR160 | miR5049 (LC) | miR5180 (LC)
miR164 | miR5050 | miR5181 (LC)
miR165 | miR5068 | miR5205 (LC)
miR166 | miR5070 | miR5568 (LC)
miR167 | miR5084 (LC) |
miR169 | miR5085 |
miR170 | miR5175 (LC) |
miR171 | miR5180 (LC) |
miR172 | miR5181 (LC) |
miR1847 (LC) | miR5200 |
miR1867 (LC) | miR5203 (LC) |
miR2118 | miR528 |
miR2275 | miR530 |
miR319 | miR5568 (LC) |
miR393 | miR6197 (LC) |

miRNAs listed in this table were previously published. miRNA-coding sequences predicted from both assemblies are shown in italics.

LC low-confidence TE-related, LCG low-copy-number genome assembly, OG orthologous group assembly

Table 3 Newly identified miRNA-coding sequences predicted to be present in wheat

Predicted from LCG assembly (whole genome) | Predicted from OG assembly (genic regions)
miR1878 | miR162
miR5064 | miR6248 (LC)
miR5522 |
miR6201 |
miR6246 |

miRNAs listed in this table were not previously published

LC low-confidence TE-related, LCG low-copy-number genome assembly, OG orthologous group assembly


To determine TE-MIRs within our putative stem-loop sets, we separately screened 3,510 (3,217 unique) and 3,567 (3,009 unique) putative hairpin sequences derived from the LCG and OG assemblies against Triticeae/Triticum repeats (4,526 sequences) extracted from the MIPS-REdat Poaceae v9.0 library. It is important to note that this database has better repeat coverage in comparison to TREP and MIPS-REdat Poaceae v8.6.2, which were used for repeat masking prior to LCG and OG assembly construction, as mentioned in ‘Wheat assemblies’.

Four data sets were submitted to the RepeatMasker program individually: (1) all LCG assembly-derived stem-loops (3,510 sequences), (2) all OG assembly-derived stem-loops (3,567 sequences), (3) unique stem-loop sequences derived from the LCG assembly (3,217 sequences) and (4) unique stem-loop sequences derived from the OG assembly (3,009 sequences). For each run, the program calculated the ratio of the number of base pairs masked by repeats to the total number of base pairs (all sequences) submitted. Overall, OG assembly-derived stem-loop coding regions were observed to show more repetitive content (83.13–88.08 %) in comparison to the LCG assembly-derived hairpins (66.23–71.79 %). For OG-derived data, the repetitive content calculated by the program was higher (88.08 %) when all putative hairpins, including identical stem-loop coding regions from different genomic locations, were submitted as a query, in comparison to the results (83.13 %) received from an input of unique stem-loop sequences. On the contrary, for LCG-derived data, the repetitive content of the whole representation of stem-loop sequences was observed to be lower (66.23 %) compared to the repeat content (71.79 %) of unique stem-loop sequences. Based on these results, we can deduce that stem-loops harbouring repeats (TE-MIRs) are dominant in the genic regions, where their representation is also higher in comparison to non-repetitive stem-loops.
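The comparison between the "all copies" and "unique" runs rests on how the masked ratio responds to duplicated hairpins. The toy calculation below illustrates that reasoning only; it assumes masked bases appear as `N`, and the sequences are invented for the example.

```python
def masked_percent(sequences) -> float:
    """Percentage of all submitted bases that are repeat-masked (N = masked)."""
    total_bp = sum(len(s) for s in sequences)
    masked_bp = sum(s.upper().count("N") for s in sequences)
    return 100.0 * masked_bp / total_bp


# Toy data: one repeat-related hairpin and one non-repetitive hairpin
toy_unique = ["NNNNNNNNAA", "ACGUACGUAC"]

# The repeat-related hairpin occurs at two additional genomic locations
toy_all_copies = toy_unique + ["NNNNNNNNAA", "NNNNNNNNAA"]

print(masked_percent(toy_unique))      # unique sequences only
print(masked_percent(toy_all_copies))  # all copies, repeat hairpin over-represented
```

When the masked ratio rises after identical copies are added, as in the OG-derived data, the duplicated hairpins must be predominantly repeat-related; when it falls, as in the LCG-derived data, the duplicated hairpins are predominantly non-repetitive.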

Stem-loop sequences corresponding to 51 putative miRNAs (out of 85) matched one or more known repetitive sequences. Apart from miR396, miR845, miR5057 and miR5085, at least one hairpin structure corresponding to each of these miRNAs gave a hit to DNA transposons. The most predominant class II terminal inverted repeat (TIR) elements identified in the stem-loops were Enhancer/Suppressor-mutator (En/Spm) (42), TcMariner (TcMar) (39) and Mutator (MuDR) (2). A Harbinger subfamily repeat was detected in putative miR1139 hairpins derived from only two LCG assembly contigs (contig764019, contig772941).

Table 4 Highly represented repeat-related wheat miRNA groups excluded from the final miRNA set

Predicted from LCG assembly    Predicted from OG assembly
(whole genome)                 (genic regions)
miR1117      miR5021 (a)       miR1117      miR437
miR1118      miR5057           miR1118      miR5057
miR1120      miR5067           miR1120      miR5067
miR1121      miR5086 (a)       miR1121      miR5161
miR1123 (a)  miR5161           miR1122 (b)  miR5169
miR1125      miR5169           miR1125      miR5203 (b)
miR1127      miR5205 (b)       miR1127      miR5281
miR1128      miR5281           miR1128      miR5387
miR1131      miR5387           miR1131      miR6191
miR1133      miR6191           miR1133      miR6197 (b)
miR1135      miR6205 (a)       miR1135      miR6219
miR1136      miR6219           miR1136      miR6220
miR1137      miR6220           miR1137      miR6224
miR1139      miR6224           miR1139      miR6458 (a)
miR1436      miR818            miR1436      miR818
miR437       miR845 (a)        miR1867 (b)  miR859 (a)

miRNAs for which all predicted pre-miRNA sequences were masked by repeats. These were removed from the final data set of miRNAs in Tables 1 and 2

LCG low-copy-number genome assembly, OG orthologous group assembly
(a) Highly represented repeat-related wheat miRNAs which were only predicted from one of the assemblies used
(b) Highly represented repeat-related wheat miRNAs which were excluded from the final miRNA set of one assembly but included as a low-confidence miRNA in the other

Table 5 Highly represented repeat-related wheat miRNA groups included in the final miRNA set as low-confidence miRNAs

Predicted from LCG assembly    Predicted from OG assembly
(whole genome)                 (genic regions)
miR1122 (a)                    miR1130
miR1130                        miR1439
miR1439                        miR5049
miR1847 (b)                    miR5175
miR1867                        miR5180
miR5049                        miR5181
miR5084 (b)                    miR5205 (a)
miR5175                        miR5568
miR5180                        miR6248 (b)
miR5181
miR5203 (a)
miR5568
miR6197 (a)

miRNAs for which some predicted pre-miRNA sequences were masked by repeats. These are listed as low-confidence miRNAs in Tables 1 and 2

LCG low-copy-number genome assembly, OG orthologous group assembly
(a) Highly represented repeat-related wheat miRNAs which were excluded from the final miRNA set of one assembly but included as a low-confidence miRNA in the other
(b) Highly represented repeat-related wheat miRNAs which were only predicted from one of the assemblies used

In addition to DNA transposon family members, miniature inverted terminal repeat elements (MITEs) were also found in stem-loops. In fact, at least one stem-loop structure corresponding to 29 different putative miRNAs was found to harbour this type of repeat. Furthermore, stem-loops corresponding to 22 different putative miRNAs were observed to harbour long terminal repeat (LTR) retrotransposons, predominantly Gypsy and Copia. Only two putative miRNAs, miR845 and miR5085, were observed to have exclusively LTRs, but no DNA transposons. The only repetitive elements detected on stem-loops that could not be classified as transposons were a microsatellite sequence, (GGAGA)n, on a miR396 stem-loop, and an AT-rich region on a miR5085 stem-loop (Table S2, Fig. 1).

After the in-depth analysis of repeat types on hairpins, non-repetitive stem-loops were separated from repetitive stem-loops. Hairpins with at least 50 % sequence coverage of repeats (cases in which 50 % or more of a pre-miRNA sequence is covered with repeats) were accepted as TE-MIRs. With this rule applied, 873 and 38 non-repetitive putative miRNA-coding regions were retrieved by the masking of 2,637 (out of 3,510) and 3,529 (out of 3,567) TE-MIR stem-loops, in the cases of LCG- and OG-derived data, respectively. TE-MIRs cannot be safely accepted as pre-miRNA-coding stem-loops since they may either represent common TE copies or TE-derived hairpins in the process of evolving into miRNA genes. Thus, all miRNA groups with at least one corresponding TE-MIR hairpin were denoted as ‘repeat-related miRNA groups’ (49). Forty-one (3 specific to OG) and 45 (7 specific to LCG) of these groups were identified from the OG and LCG assemblies, respectively (Table S2). Repeat-related miRNA groups represented only by TE-MIR stem-loops were not retained in the final miRNA data set. However, the remaining groups, which were represented by non-repetitive stem-loops as well as TE-MIRs, were included in the final miRNA data set, but labelled as ‘low-confidence predictions’ owing to the possibility that their unmasked stem-loops may correspond to unknown repeats. Accordingly, 15 low-confidence miRNA groups were predicted: 9 (2 specific to OG) and 13 (6 specific to LCG) from the OG and LCG assemblies, respectively (Tables 2, 3, 4 and 5). Furthermore, miRNA groups which were not ‘repeat related’ and did not correspond to any TE-MIR hairpins were designated as ‘high-confidence miRNAs’. Satisfying this definition, 37 high-confidence miRNA groups were identified: 4 (1 specific to OG) and 36 (33 specific to LCG) from OG and LCG, respectively (Tables 2 and 3).
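The classification rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 50 % rule is read as "coverage of at least 50 %", the function name `classify_groups` is hypothetical, and the coverage values in the toy input are invented for the example (only the miRNA names and their final categories follow the tables of this study).

```python
from collections import defaultdict

TE_MIR_THRESHOLD = 0.5  # assumed reading of the 50 % repeat-coverage rule


def classify_groups(hairpins, threshold=TE_MIR_THRESHOLD):
    """Label each miRNA group from (group, repeat_coverage_fraction) pairs."""
    te_flags = defaultdict(list)
    for group, coverage in hairpins:
        te_flags[group].append(coverage >= threshold)

    labels = {}
    for group, flags in te_flags.items():
        if all(flags):
            labels[group] = "excluded"          # only TE-MIR hairpins
        elif any(flags):
            labels[group] = "low confidence"    # TE-MIRs plus clean hairpins
        else:
            labels[group] = "high confidence"   # no TE-MIR hairpin at all
    return labels


# Hypothetical coverage fractions, for illustration only.
toy_hairpins = [
    ("miR1117", 0.92), ("miR1117", 0.75),
    ("miR5049", 0.80), ("miR5049", 0.10),
    ("miR166", 0.00), ("miR166", 0.05),
]
labels = classify_groups(toy_hairpins)
print(labels)
```

A group is dropped only when every one of its hairpins is a TE-MIR, so a single unmasked hairpin is enough to keep the group in the final set, although only as a low-confidence prediction.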

Thus, a total of 52 miRNA types, including 37 high-confidence and 15 low-confidence predictions, were identified and included in the final data set. From the OG and LCG assemblies, 13 (3 specific to OG) and 49 (39 specific to LCG) miRNAs were predicted, respectively, of which 10 are common (Tables 2 and 3). These final data sets derived from the two assemblies and their corresponding non-repetitive stem-loops, 873 for LCG and 38 for OG, were used in further analysis.
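The totals reported above follow from counting the miRNAs shared by the two assemblies only once. The short Python check below works through that arithmetic; the shared counts for the high- and low-confidence subsets (3 and 7) are derived here from the assembly-specific counts given in the text, and the helper name `union_size` is hypothetical.

```python
def union_size(n_lcg, n_og, n_common):
    """Number of distinct miRNAs when shared ones are counted once."""
    return n_lcg + n_og - n_common


# (LCG count, OG count, shared count, total reported in the text)
checks = {
    "all miRNAs": (49, 13, 10, 52),
    "high confidence": (36, 4, 3, 37),  # shared = 36 - 33 = 4 - 1
    "low confidence": (13, 9, 7, 15),   # shared = 13 - 6 = 9 - 2
}

results = {}
for label, (lcg, og, common, reported) in checks.items():
    results[label] = union_size(lcg, og, common)
    print(f"{label}: {lcg} + {og} - {common} = {results[label]} "
          f"(reported {reported})")
```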

Overall, in this study, a total of 52 miRNAs were compared to previously published experimental or computational miRNAs, and 7 novel miRNAs (miR162, miR1878, miR5064, miR5522, miR6201, miR6246 and miR6248) were found to be present in wheat for the first time (Gupta et al. 2012; Kantar et al. 2012; Kenan-Eichler et al. 2011; Li et al. 2013; Lucas and Budak 2012; Pandey et al. 2013; Vitulo et al. 2011; Yao and Sun 2012; Yin and Shen 2010; Kurtoglu et al. 2013) (Tables 2 and 3).

Representation analysis of miRNAs predicted from whole-genome and genic regions of wheat

Representation analyses were performed separately for miRNA-coding regions predicted from the LCG and OG assemblies. A total of 595 unique, non-repetitive miRNA stem-loops (belonging to 49 miRNA families, including 36 high- and 13 low-confidence predictions) were found in the wheat genome (derived from the LCG assembly), corresponding to a whole representation of 873 when identical pre-miRNA sequences at different genomic locations are counted separately. In order to assess the corresponding folds of miRNAs in the genome, identical pre-miRNAs giving hits to different contigs or different positions on the same contig were also included. The representation of different miRNAs was found to be highly variable: as high as 187 copies of a single miRNA, namely miR395, while 7 (out of 49) miRNAs were represented only once in the genome (Fig. 2).
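The distinction between unique stem-loops and whole representation can be made concrete with a small Python sketch. The record layout (miRNA, contig, start position, sequence), the function name `representation` and all toy values are assumptions made for illustration; they do not reproduce the pipeline used in this study.

```python
from collections import defaultdict


def representation(hits):
    """Count unique stem-loop sequences and genomic locations per miRNA."""
    locations = defaultdict(set)
    sequences = defaultdict(set)
    for mirna, contig, start, seq in hits:
        locations[mirna].add((contig, start))  # one entry per genomic location
        sequences[mirna].add(seq)              # identical sequences collapse
    return {
        mirna: {"unique": len(sequences[mirna]), "total": len(locations[mirna])}
        for mirna in locations
    }


# Hypothetical hits, for illustration only.
toy_hits = [
    ("miR_A", "contig1", 10, "ACGTACGT"),
    ("miR_A", "contig1", 500, "ACGTACGT"),  # same sequence, same contig
    ("miR_A", "contig2", 42, "ACGTACGT"),   # same sequence, other contig
    ("miR_A", "contig3", 7, "ACGTTCGT"),    # different sequence
    ("miR_B", "contig4", 99, "GGGGCCCC"),
]
counts = representation(toy_hits)
print(counts)
```

Summing the `unique` and `total` values over all miRNAs corresponds to the two kinds of figures quoted in the text (for example, 595 unique stem-loops versus a whole representation of 873 for the LCG assembly).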

For 13 (out of 49) miRNAs, which were predicted with low confidence, the majority of their stem-loops were found to be TE-MIRs in the repeat analysis. The representation of these TE-MIRs was also separately calculated. Figure 2 shows the comparative folds of the TE-MIRs and non-repetitive stem-loops for the low-confidence predictions; for 10 out of these 13 miRNAs, non-repetitive stem-loops were observed to constitute a very small portion (0.65–9.09 %) of all predicted stem-loops corresponding to the miRNA type. It is important to note that the entire pool of all possible repetitive stem-loops is not covered in our analysis, since the initial steps of miRNA prediction were performed on previously repeat-masked libraries. Thus, it is possible that non-repetitive stem-loops constitute an even smaller portion relative to the overall repetitive stem-loop pool (Fig. 2).

A similar representation analysis was also performed for the gene-derived non-repetitive miRNA-coding regions (with the aid of the OG assembly). For 13 predicted miRNA types, including 9 low- and 4 high-confidence predictions, a total representation of 38 different genomic copies was observed. There was only one case where an identical pre-miRNA stem-loop was represented twice: miR162 had two overlapping identical pre-miRNA sequences located on the same contig, providing an overall fold of two copies. The corresponding stem-loops for some other miRNA types (miR5205 and miR5049) differed in sequence and were observed to be present at different locations, both on the same and on different contigs, contributing to the overall fold. The high number of genic copies of miR5049, providing almost half (16 out of 38) of the overall representation of genic stem-loops, is notable. The whole-genome representation of this miRNA was assessed as 42-fold (with the aid of the LCG assembly), and we can deduce that genic regions contribute a considerable portion of the whole-genome representation of this miRNA. In contrast to miR5049, the remaining miRNAs showed very little genic representation: five showed only a single fold, and seven others were represented by only two to three copies in the genic regions. These miRNAs were also observed to be poorly represented in the overall wheat genome (deduced from the LCG-derived stem-loops), with only four of them (miR399, miR5049, miR2275 and miR5070) showing more than fivefold representation. We can speculate that some of these miRNAs may even be specifically coded from gene regions, since miR1439 and miR5175 were predicted to be represented by one copy in both the LCG assembly- and OG assembly-derived data sets (Table 6).

Fig. 1 Percentage of repeat types in stem-loop coding regions. The pie chart shows the comparative presence of different types of repeats in stem-loop coding regions. The number of TEs or other repeats present in TE-MIRs is shown in parentheses

Fig. 2 Representation of putative wheat miRNAs predicted from the LCG assembly. a Representation of miRNA-coding stem-loops that do not harbour any repeats (high-confidence predictions). b Representation of both high- and low-confidence (repetitive stem-loops) predictions

Expression analysis of predicted miRNA-coding hairpins

All unique T. aestivum pre-miRNA sequences identified in this study, 595 and 37 derived from the LCG and OG assemblies, respectively, were subjected to expression analysis, which consisted of two stages.

At the first stage, the derived pre-miRNAs were searched with BLAST against NCBI EST databases. As a result of this analysis, at least one stem-loop sequence of each miRNA (52) was shown to match an expressed sequence. By applying more stringent criteria (filtering of BLAST results for 98 % similarity and 99 % query coverage), strong in silico expression evidence was provided for 17 miRNA groups (Table 7, Fig. 3).
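The stringent filtering step can be sketched as follows. This is a hedged illustration rather than the procedure actually used: it assumes each hit is a single alignment with a percent identity and 1-based query start and end coordinates, that query lengths are known, and that the thresholds are applied as "at least 98 %" and "at least 99 %". The `Hit` type, the function names and all identifiers in the toy data are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str     # pre-miRNA (stem-loop) identifier
    subject: str   # expressed sequence identifier
    pident: float  # percent identity of the alignment
    qstart: int    # 1-based, inclusive
    qend: int      # 1-based, inclusive


def query_coverage(hit, query_length):
    """Percentage of the query covered by this single alignment."""
    return 100.0 * (abs(hit.qend - hit.qstart) + 1) / query_length


def stringent_hits(hits, query_lengths, min_identity=98.0, min_coverage=99.0):
    """Keep hits that satisfy both the identity and the coverage threshold."""
    return [
        hit for hit in hits
        if hit.pident >= min_identity
        and query_coverage(hit, query_lengths[hit.query]) >= min_coverage
    ]


# Hypothetical hits and stem-loop lengths, for illustration only.
lengths = {"hairpin_1": 100, "hairpin_2": 120}
toy_hits = [
    Hit("hairpin_1", "est_A", 99.0, 1, 100),   # passes both filters
    Hit("hairpin_1", "est_B", 100.0, 1, 90),   # coverage too low (90 %)
    Hit("hairpin_2", "est_C", 97.5, 1, 120),   # identity too low
    Hit("hairpin_2", "est_D", 98.3, 2, 120),   # 119/120 covered, passes
]
kept = [hit.subject for hit in stringent_hits(toy_hits, lengths)]
print(kept)
```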

In the second step, we complemented computational miRNA prediction with small RNA library read data. The idea was taken from a recent publication by Baev and colleagues, but we used a different methodology for a similar purpose (Baev et al. 2011). To verify the expression of mature miRNA and miRNA* sequences of the predicted hairpins, we searched small RNA library reads, acquired from previous studies, against the miRNA stem-loop sequences predicted in this study (Table 1). Mature miRNA and/or miRNA* regions of the putative stem-loops gave identical matches to several reads extracted from small RNA libraries, which were prepared from different tissue pools of plants grown under control and/or stress conditions. The related data are listed in Table S3. With this method, mature miRNA sequences of 35 miRNAs were shown to be present in the small RNA libraries. For 21 miRNAs, in silico expression evidence for miRNA*, along with mature miRNA, was also established. The predominance of miRNA sequences over miRNA*s in the small RNA libraries is expected to stem from the rapid degradation of miRNA* sequences, since they are non-functional in most cases. However, for two miRNA types (miR2275, miR6197), only miRNA* sequences were detected, supporting the recently emerged view that in several cases, miRNA* may also have functions (Yang et al. 2011) (Fig. 3).
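One simple way to look for identical matches between small RNA reads and the mature miRNA or miRNA* region of a stem-loop is an exact string comparison, sketched below in Python. The sketch assumes that the coordinates of both regions on the hairpin are already known and that reads may be written with either U or T; the function names and the synthetic sequences are hypothetical and do not come from the study.

```python
def to_dna(seq):
    """Normalise a sequence to upper-case DNA letters."""
    return seq.upper().replace("U", "T")


def expression_evidence(hairpin, mature_span, star_span, reads):
    """Report whether any read is identical to the mature or miRNA* region.

    Spans are 0-based, end-exclusive (start, end) positions on the hairpin.
    """
    hairpin = to_dna(hairpin)
    read_set = {to_dna(read) for read in reads}
    mature = hairpin[mature_span[0]:mature_span[1]]
    star = hairpin[star_span[0]:star_span[1]]
    return {"mature": mature in read_set, "star": star in read_set}


# Synthetic hairpin: 21-nt arm, 9-nt loop, 21-nt arm (illustration only).
arm_5p = "ACGTACGTACGTACGTACGTA"
loop = "GGGAAACCC"
arm_3p = "TACGTACGTACGTACGTACGT"
hairpin = arm_5p + loop + arm_3p

mature_span = (0, len(arm_5p))
star_span = (len(arm_5p) + len(loop), len(hairpin))
reads = [arm_5p.replace("T", "U"), "G" * 21]  # first read equals the 5' arm

evidence = expression_evidence(hairpin, mature_span, star_span, reads)
print(evidence)
```

Cases in which only the `star` entry is true would correspond to the situation reported for miR2275 and miR6197.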

With at least one of the above-mentioned methods, 39 (out of 52) of the predicted miRNAs were shown in silico to be expressed. Moreover, expression of 13 miRNAs was verified with evidence from both of the implemented methods. Stem-loop sequences of these miRNAs were found to give an almost identical match to an expressed sequence, and their mature miRNA sequences were shown to be represented in a small RNA library. For seven of these miRNA groups, the complementary sequence, miRNA*, was also detected in the small RNA reads. Detailed data on the expression analysis of predicted miRNA types are presented in Fig. 3.

qRT-PCR validation and quantification of miRNAs

Three miRNAs predicted with high confidence in our in silico analysis were selected for experimental comparative quantification. miRNA measurements were performed in leaf tissue with qRT-PCR using the SYBR Green I assay (Unver and Budak 2009; Kantar et al. 2010, 2011). The three miRNAs (miR166, miR396 and miR528) were shown to be expressed in bread wheat, but in varying amounts.

Table 6 Representation of putative wheat miRNAs in genic regions

miRNA     Fold  Matched contig
miR1130   1     Traes_Bradi2g13110.1_000018_D
miR1439   1     Traes_Sb10g021680.1_000123_D
miR162    2     Traes_Bradi3g32020.1_000001_A (a)
miR2275   2     Traes_AK374586_000028_X, Traes_Sb06g002250.1_000051_B
miR399    1     Traes_Sb03g010570.1_000115_X
miR5049   16    Traes_AK250463.1_000006_A, Traes_AK252346.1_000014_X,
                Traes_AK252775.1_000002_A, Traes_AK360099_000001_B,
                Traes_Bradi5g01230.1_000006_D, Traes_Os02g0120900_000007_B,
                Traes_Bradi1g77547.2_000039_B (b), Traes_Bradi1g77547.2_000040_B (b),
                Traes_Bradi2g25310.1_000036_D (b), Traes_Bradi5g21497.1_000240_X (b),
                Traes_Sb09g001670.1_000014_D (b)
miR5070   2     Traes_AK356760_000012_D, Traes_Bradi3g27300.1_000163_B
miR5175   1     Traes_Os03g0204100_000030_X
miR5180   3     Traes_Os02g0712700_000071_B, Traes_AK368538_000040_B,
                Traes_Bradi2g39137.2_000099_X
miR5181   3     Traes_Os02g0712700_000071_B, Traes_Sb03g032940.1_000020_A,
                Traes_Bradi4g16260.3_000025_X
miR5205   3     Traes_Bradi4g32830.1_000002_B, Traes_Sb06g025820.1_000045_B (b)
miR5568   1     Traes_Sb06g025820.1_000045_B
miR6248   2     Traes_Bradi3g13110.1_000015_A, Traes_Bradi3g13110.1_000013_X

Wheat gene assembly contigs are named according to their respective orthologous group representative (OGR) and their predicted genomes

Orthologous groups were created from genome sequences of rice, sorghum and Brachypodium distachyon and full-length cDNA sequences of Hordeum vulgare

Traes Triticum, Bradi B. distachyon, Sb Sorghum bicolor, Os O. sativa, AK OGR derived from H. vulgare
(a) Two overlapping pre-miRNA stem-loops are located on the same contig
(b) Two pre-miRNA stem-loops are located on different regions of the same contig

miR396 was observed to be the most abundant, showing approximately 2.7- and 2.3-fold expression in comparison to miR528 and miR166, respectively (Fig. 4). miR166 was shown to have a slightly higher expression (approximately 1.2-fold) than miR528 (Fig. 4). Specificity of the qRT-PCR products was shown by their comparison with no-RT and no-RNA controls (Fig. 5).

Although mature miRNA expression is regulated at several levels and is highly environment dependent, it is important to note that relative miRNA expression based on qRT-PCR was observed to correlate to some extent with the miRNA gene copy number predicted by the representation analysis. Predicted gene copy numbers showed the same trend as the expression shown by qRT-PCR, being higher for miR396 (37) than for miR166 (21) and miR528 (2) (Figs. 2a and 4).
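The reported fold values and copy numbers can be checked for internal consistency with a few lines of Python. The sketch below uses only the approximate numbers quoted in the text; expressing all three miRNAs relative to miR528 is a choice made here for illustration, and with three data points the comparison can only show agreement in rank order, not a statistical correlation.

```python
# Approximate fold values quoted in the text.
fold_396_vs_528 = 2.7
fold_396_vs_166 = 2.3

# Fold of miR166 over miR528 implied by the two values above.
implied_166_vs_528 = fold_396_vs_528 / fold_396_vs_166
print(f"implied miR166/miR528 fold: {implied_166_vs_528:.2f}")

# Predicted gene copy numbers from the representation analysis.
copies = {"miR396": 37, "miR166": 21, "miR528": 2}

# Expression relative to miR528 (an illustrative normalisation).
relative_expression = {
    "miR396": fold_396_vs_528,
    "miR166": implied_166_vs_528,
    "miR528": 1.0,
}


def rank_order(values):
    """Return the keys sorted from highest to lowest value."""
    return sorted(values, key=values.get, reverse=True)


same_trend = rank_order(copies) == rank_order(relative_expression)
print(rank_order(copies), same_trend)
```

The implied ratio rounds to the approximately 1.2-fold difference reported between miR166 and miR528.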

Discussion

miRNAs have a central role in post-transcriptional regulation, and in order to fully understand regulation of gene expression, in-depth characterization of miRNA populations is of great significance (Jones-Rhoades et al. 2006). Yet, identification of miRNAs is especially complicated in plants due to their large and complex small RNA pools, small interfering RNAs (siRNAs) being the most dominant form (Axtell 2013; Bartel 2004; Pantaleo 2011). Furthermore, the complex and repetitive nature of grass genomes adds another bottleneck to determining the overall miRNA population in these species. Therefore, researchers have focused on establishing accurate criteria for the annotation of plant miRNAs (Meyers et al. 2008; Jones-Rhoades 2010). In accordance with these rules, in recent years, several grass species, including bread wheat, were subjected to miRNA identification studies, most of which were based on deep sequencing of small RNA libraries constructed from tissues of plants grown under normal conditions or exposed to several stresses (Cantu et al. 2010; Kenan-Eichler et al. 2011; Tang et al. 2012; Wei et al. 2009; Xin et al. 2010; Yao et al. 2007; Li et al. 2013) (Table 1). This method has been the best known experimental approach for novel miRNA discovery. Yet, differentiating miRNAs from other small RNAs is challenging and must be supported by further computational analysis, which involves the demonstration of pre-miRNA hairpin structures. Another important means of miRNA identification involves in silico methods, either homology based or predictive. In a couple of reports adopting these computational strategies, miRNAs were derived from either ESTs or partial genomic sequences (Dryanova et al. 2008; Zhang et al. 2005; Schreiber et al. 2011; Jin et al. 2008; Yin and Shen 2010; Pandey et al. 2013). In others, researchers have utilized a chromosome/chromosome arm-centered approach to perform miRNA prediction, utilizing the recently developed chromosome-sorting techniques (Doležel et al. 2007; Kantar et al. 2012; Kubalakova et al. 2002; Lucas and Budak 2012; Vitulo et al. 2011; Kurtoglu et al. 2013).

Table 7 Expressed sequence hit table of predicted miRNAs

miRNA     Hit names
LCG assembly
miR159    gi|383729060|dbj|HX145825.1|
miR167    gi|143464054|dbj|CJ846906.1|, gi|143399055|dbj|CJ833771.1|
miR169    contig25183, contig74010, contig79327, gi|383556880|dbj|HX098125.1|,
          gi|383556881|dbj|HX098126.1|
miR1878   contig00397
miR2275   gi|25149453|gb|CA597098.1|
miR319    gi|383729060|dbj|HX145825.1|
miR395    gi|39556435|gb|CK194045.1|, gi|39556094|gb|CK193704.1|
miR397    gi|282839589|gb|GH985139.1|
miR399    gi|93255560|dbj|CJ560621.1|, gi|93057089|dbj|CJ667854.1|
miR5049   gi|383682025|dbj|HX151156.1|, gi|92492322|dbj|CJ547855.1|,
          gi|93027534|dbj|CJ562894.1|, gi|93738453|dbj|CJ546995.1|
miR5050   gi|383666810|dbj|HX096644.1|
miR5064   gi|282836252|gb|FL488864.1|
miR5200   gi|22547791|gb|BU099992.1|
miR530    contig83392
miR6201   gi|20114431|dbj|BJ304328.1|
miR6246   gi|55679506|gb|CV774566.1|, gi|20081742|dbj|BJ259335.1|
OG assembly
miR162    contig07326, contig105394, contig108355, contig116680, contig51734,
          gi|93239444|dbj|CJ639158.1|
miR2275   gi|25149453|gb|CA597098.1|

Hit names starting with ‘gi’ were derived from the NCBI T. aestivum (taxid:4565) EST database, while ‘contigs’ were derived from the transcriptome assembly of a recent study (Brenchley et al. 2012)

LC low-confidence TE-related, LCG low-copy-number genome assembly, OG orthologous group assembly

The above-mentioned bioinformatic techniques require limited resources and are effective means of miRNA identification, even capable of finding miRNAs expressed at low levels or under specific conditions, which cannot be detected by experimental techniques. Derivation of miRNAs from whole-genome survey sequence, instead of ESTs or partial genomic sequences, adds to the sensitivity of miRNA prediction. Yet, until recently, the whole genome of bread wheat was unsequenced. With the availability of 5× coverage T. aestivum L. 454 sequence data, we performed the first systematic homology-based identification of the whole set of wheat miRNAs (Brenchley et al. 2012). To obtain a comparative view of wheat miRNA-coding regions in intergenic and gene-rich parts of the genome, miRNA prediction was performed on two data sets: the LCG assembly, representing the whole wheat genome, and the OG assembly, highlighting the genic regions. It is important to note that in silico miRNA prediction can result in false positives. Thus, in this study, repeat analysis was performed on the predicted stem-loop sequences to eliminate pseudo-miRNAs and retrieve a higher confidence miRNA population (Table S2). A total of 52 (37 high confidence) different putative miRNAs were predicted: 49 (36 high confidence) from the LCG assembly and 13 (4 high confidence) from the OG assembly, of which 10 (3 high confidence) were identified from both assemblies (Tables 4 and 5). Of these, seven miRNAs were found to be novel since they were not previously identified in wheat (Tables 2 and 3).

Fig. 3 miRNA families for which expression is confirmed by evidence from expressed sequence or small RNA library search. a Predicted miRNAs matching at least one expressed wheat sequence. b Predicted miRNAs matching any expressed wheat sequence with 98 % identity and 99 % query coverage. c Predicted miRNAs which have matches of their mature miRNA in small RNA library reads. d Predicted miRNAs which have matches of their miRNA* sequence in small RNA library reads

Fig. 4 Experimental evidence for miRNA expression. Real-time amplification curves of a tae-miR166, b tae-miR528 and c tae-miR396 (three technical replicates for each miRNA). d Bar graph showing relative quantification of these miRNAs

Fig. 5 miRNA melting curves. The representative melting curves of amplification reactions are displayed along with the corresponding agarose gel photos of the qRT-PCR products (no-RNA and no-RT primer controls are also shown). a tae-miR166, b tae-miR528 and c tae-miR396

Current literature points out that plant miRNAs are mostly generated from non-coding transcriptional units, unlike some of the animal miRNAs which are processed from introns or protein-coding sequences. Still, some plant miRNA genes are known to reside in intron sequences (Baskerville and Bartel 2005; Reinhart et al. 2002; Piriyapongsa and Jordan 2008). Results obtained from this study support this view, since 39 (33 high confidence) miRNAs were identified exclusively from the LCG assembly, representing a catalogue of wheat miRNA-coding sequences located in intergenic regions. The remaining 13 (4 high confidence) miRNAs were derived from the OG assembly, from which we can deduce that some or all of the stem-loops coding for these miRNAs are located in genic regions (Tables 2 and 3). However, unexpectedly, three (one high confidence) miRNAs, miR162, miR5205 and miR6248, were predicted only from the OG assembly, a case that may occur for a number of reasons (Tables 2 and 3). First of all, minor changes due to innate sequencing errors or loss of partial sequence during assembly construction may have resulted in a shortage in the initial input sequences and consequently in the initial pool of miRNA predictions. Second, the set of stringent criteria applied in order to acquire miRNAs at the best possible confidence may have resulted in the absence of these miRNAs in the final LCG assembly-derived results. miR162 was eliminated due to the rules applied by SUmiRFold (miRNA-miRNA* mismatch, %GC, MFEI), and miR6248 pre-miRNA-coding sequences were manually discarded since they showed inconsistency with pre-miRNA secondary structure features. Additionally, all of the initially detected putative pre-miR5205 stem-loops were eliminated from the LCG-derived data due to 50 % sequence coverage with repeats; however, two (Traes_Bradi4g32830.1_000002_B, Traes_Sb06g025820.1_000045_B) were retained in the final data set acquired from the OG assembly. These OG assembly contigs were also masked with repeats, but not to an extent of 50 %. All three cases can result from the extent of sequence coverage of the LCG assembly and minor alterations in the base composition of its sequences. These factors can affect the MFE structure or repeat content of stem-loops and result in their removal from the final results when coupled with the stringent set of rules applied during miRNA prediction.
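To illustrate why a small change in base composition can move a stem-loop across a filtering threshold, the Python sketch below computes %GC and a minimal folding free energy index (MFEI). It assumes the definition commonly used in the plant miRNA literature, in which the MFE is normalised by sequence length (adjusted MFE per 100 nt) and then divided by the GC percentage, and it takes the absolute value of the MFE so that the index is positive. The cut-offs applied by SUmiRFold are not given here, the MFE value is a hypothetical input that would normally come from an RNA folding program, and the function names are illustrative.

```python
def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def mfei(seq, mfe):
    """MFEI under the assumed definition: (|MFE| / length * 100) / %GC."""
    gc = gc_percent(seq)
    if gc == 0:
        raise ValueError("MFEI is undefined for a sequence without G or C")
    adjusted_mfe = abs(mfe) / len(seq) * 100.0  # |MFE| per 100 nt
    return adjusted_mfe / gc


# Synthetic 100-nt sequence with 50 % GC and a hypothetical MFE of -45 kcal/mol.
toy_hairpin = "GC" * 25 + "AT" * 25
toy_index = mfei(toy_hairpin, mfe=-45.0)
print(f"%GC = {gc_percent(toy_hairpin):.1f}, MFEI = {toy_index:.2f}")

# With the same MFE, replacing ten A/T bases by G/C lowers the index.
shifted_hairpin = "GC" * 30 + "AT" * 20
print(f"MFEI after GC shift = {mfei(shifted_hairpin, mfe=-45.0):.2f}")
```

Because %GC is the denominator, sequencing errors or assembly differences that alter base composition change the index even when the predicted MFE is unchanged.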

In recent years, several of the previously annotated miRNAs were shown to be homologous to transposons, and it was suggested that these TE-MIRs were evolutionary intermediates from transposable elements to miRNA genes (Piriyapongsa et al. 2007; Piriyapongsa and Jordan 2008; Li et al. 2011). The 1AL chromosome arm of the wheat genome was also shown to contain stem-loops that hit transposable elements (Lucas and Budak 2012). For this reason, as indicated above, repeat analysis was performed in this study to eliminate TE-MIRs from the initial stem-loop data set. This assessment revealed that OG assembly-derived hairpins harbour more repetitive content (83.13 %) compared to the LCG-derived stem-loops (71.79 %). Additionally, a higher representation of these repetitive stem-loops was observed in the OG-derived data set (83.13 % for a query of unique stem-loops, 88.08 % for a whole representation of stem-loops), in comparison with the LCG assembly-derived data (71.79 % for a query of unique stem-loops, 66.23 % for a whole representation of stem-loops) (Table S2, Fig. 1). This is in line with the previous findings of Li and colleagues, who observed that most of the rice TE-MIRs they identified were located in gene-rich regions (Li et al. 2011).

To obtain an overall understanding of wheat miRNA-coding sequences, representation analysis of each predicted miRNA was performed separately for the 49 and 13 miRNAs derived from the LCG and OG assemblies, respectively (Table 6, Fig. 2). Folds were calculated considering the cases when identical pre-miRNA sequences have different genomic locations. Cumulative miRNA representation from the whole genome (595 different potential miRNA stem-loops corresponding to 873 genomic locations) was observed to far exceed the representation of miRNAs located in genic regions (37 different potential miRNA stem-loops corresponding to 38 genomic locations). From this observation, we can infer that the intergenic regions of the genome harbour a greater representation as well as a higher variety of miRNAs in comparison to the genic regions. Furthermore, as another result of the representation analysis, the representation level of individual miRNA types was found to be greatly variable, as high as 184 and 16 predicted copies of a single miRNA in the whole genome and genic regions, respectively (Table 6, Fig. 2). This finding is important since the copy number of miRNA genes is significant in relation to miRNA dosage, which plays a central role in their target regulation (Li and Mao 2007). In addition, all previously unreported miRNAs corresponding to LCG assembly-derived stem-loops were observed to be scarcely represented (under a set threshold of 15-fold representation) (Fig. 2). This finding emphasizes the advantage of our in silico methodology using whole-genome sequence in enabling the detection of miRNAs that can easily be missed by other techniques, such as EST-based computational miRNA prediction and sequencing of small RNA libraries.

Yet, a disadvantage of our miRNA prediction approach, exploiting genomic sequence data, is that some of the predicted miRNAs may not have intact promoter sequences and, in fact, may be silent. Therefore, in order to complement our data with expression evidence, we performed searches of the predicted miRNA-coding stem-loops against expressed sequences and small RNA library databases (Table 7, Fig. 3, and Table S3). pre-miRNA sequences of all the putative miRNAs were found to give a hit to at least one expressed sequence (contig of the transcriptome assembly or EST) (Table 7, Fig. 3). In order to retain only nearly identical matches, stringent criteria allowing only one mismatch were applied to the data set. One nucleotide of mismatch was allowed to account for possible sequencing errors or mutations between different cultivars in ESTs. This method, coupled with the enforcement of stringent criteria, provided strong evidence for the expression of 17 (out of 52) miRNA-coding stem-loops (Table 7, Fig. 3). On the other hand, the full length of the mature miRNA sequences of 35 (out of 52) putative pre-miRNAs was shown to be represented in the small RNA library databases (Table S3, Fig. 3). Cumulatively, expression evidence for 39 (out of 52) miRNAs was provided by either of the two strategies used: screening of expressed sequences or searching small RNA libraries (Fig. 3). Furthermore, expression of 13 (out of 52) was strongly supported by both lines of evidence (Fig. 3). The remaining putative miRNA-coding sequences identified in this study may also be expressed, but their pre-miRNA or mature miRNA sequences may not yet be represented in the currently available transcriptome assemblies, ESTs or small RNA libraries. This can be explained by the fact that miRNA expression is highly tissue, developmental stage and environment specific, and the databases generated to date do not include small RNA reads from the whole range of possible tissues, developmental stages and environments. In fact, the three miRNAs experimentally quantified in our study (miR166, miR396 and miR528) were not detected in silico in ESTs or the transcriptome assembly. Although these three miRNAs were covered by the small RNA libraries generated to date, constructed from non-stressed tissues and, in the case of miR166 and miR396, additionally from several stressed tissues, for some other predicted miRNAs, in silico expression evidence provided from small RNA libraries was restricted to specific stress conditions. For example, the mature miRNA sequence of miR1122 was found to be represented only in heat-stressed leaf tissue, and the mature miRNA sequences of miR2118 and miR6201 only gave hits to small RNA library reads derived from cold-stressed spike tissue (Table S3). This observation was expected since expression of some miRNAs can be highly spatio-temporally specific and observed only under stress conditions. Another noteworthy observation of the in silico small RNA library search relates to miRNA*. For 23 (out of 52) putative pre-miRNAs, miRNA* sequences were detected in small RNA libraries. In fact, for 2 (out of 23) stem-loops, only miRNA*s, but not the mature miRNA sequences, were found to have counterparts in the databases (Table S3, Fig. 3). This is in accordance with the recent findings in animals suggesting functionality of miRNA* sequences, which were previously thought to be degraded immediately (Yang et al. 2011).

For comparative miRNA quantification, three miRNAs identified with high confidence were quantified with qRT-PCR. These three miRNAs (miR166, miR396 and miR528), shown in Fig. 4, were selected since they were repeatedly reported to play roles in stress regulation in several plant species. miR528, a monocot-specific miRNA, has been implicated in abiotic stress, including conditions such as drought, scarce nitrogen, low temperature and hypoxia due to flooding (An et al. 2011; Bertolini et al. 2012; Budak and Akpinar 2011; Ferreira et al. 2012; Kantar et al. 2011; Liu et al. 2012; Nischal et al. 2012; Xu et al. 2011; Zhang et al. 2008). miR396 was shown to be involved in temperature, salinity, cold, alkali and drought stress responses (Gao et al. 2010; Giacomelli et al. 2012; Kantar et al. 2011; Liu et al. 2009; Zhou et al. 2010, 2012), while miR166 was shown to play a part in drought (Kantar et al. 2010, 2011). Biotic stress relevance of miR166 and miR396 was also reported repeatedly (Bazin et al. 2013; Hewezi et al. 2008, 2012; Zhou et al. 2012). In this study, with qRT-PCR, these miRNAs, predicted with high confidence from the LCG assembly, were comparatively quantified in leaf tissue of wheat, miR396 being the most abundant (Figs. 4 and 5).

Acknowledgments The authors are grateful to Dr. Rachel Brenchley and her colleagues (Centre for Genome Research, University of Liverpool, UK) for use of the 5× coverage wheat genome assemblies and the transcriptome assembly.

Conflict of interest The authors declare that they have no conflict of interest.

Ethical standards Experiments comply with the current laws of the country in which they were performed.

References

An FM, Hsiao SR, Chan MT (2011) Sequencing-based approaches reveal low ambient temperature-responsive and tissue-specific microRNAs in phalaenopsis orchid. PLoS One 6(5):e18937. doi:10.1371/journal.pone.0018937

Axtell MJ (2013) Classification and comparison of small RNAs from plants. Annu Rev Plant Biol. doi:10.1146/annurev-arplant-050312-120043

Baev V, Milev I, Naydenov M, Apostolova E, Minkov G, Minkov I, Yahubyan G (2011) Implementation of a de novo genome-wide computational approach for updating Brachypodium miRNAs. Genomics 97(5):282–293. doi:10.1016/j.ygeno.2011.02.008

Bartel DP (2004) MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116(2):281–297

Baskerville S, Bartel DP (2005) Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes. RNA 11(3):241–247. doi:10.1261/rna.7240905

Bazin J, Khan GA, Combier JP, Bustos-Sanmamed P, Debernardi JM, Rodriguez R, Sorin C, Palatnik J, Hartmann C, Crespi M, Lelandais-Briere C (2013) miR396 affects mycorrhization and root meristem activity in the legume Medicago truncatula. Plant J. doi:10.1111/tpj.12178

Bertolini E, Verelst W, Horner DS, Gianfranceschi L, Piccolo V, Inzé D, Pè ME, Mica E (2012) Addressing the role of microRNAs in reprogramming leaf growth during drought stress in Brachypodium distachyon. Mol Plant. doi:10.1093/mp/sss160

Brenchley R, Spannagl M, Pfeifer M, Barker GL, D’Amore R, Allen AM, McKenzie N, Kramer M, Kerhornou A, Bolser D, Kay S, Waite D, Trick M, Bancroft I, Gu Y, Huo N, Luo MC, Sehgal S, Gill B, Kianian S, Anderson O, Kersey P, Dvorak J, McCombie WR, Hall A, Mayer KF, Edwards KJ, Bevan MW, Hall N (2012) Analysis of the bread wheat genome using whole-genome shotgun sequencing. Nature 491(7426):705–710. doi:10.1038/nature11650

Budak H, Akpinar A (2011) Dehydration stress-responsive miRNA in Brachypodium distachyon: evident by genome-wide screening of microRNAs expression. OMICS: J Integr Biol 15(11):791–799

Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST plus: architecture and applications. BMC Bioinformatics 10. doi:10.1186/1471-2105-10-421

Cantu D, Vanzetti LS, Sumner A, Dubcovsky M, Matvienko M, Distelfeld A, Michelmore RW, Dubcovsky J (2010) Small RNAs, DNA methylation and transposable elements in wheat. BMC Genomics 11:408. doi:10.1186/1471-2164-11-408

Doležel J, Kubaláková M, Paux E, Bartoš J, Feuillet C (2007) Chromosome-based genomics in the cereals. Chrom Res 15(1):51–66. doi:10.1007/s10577-006-1106-x

Dryanova A, Zakharov A, Gulick PJ (2008) Data mining for miRNAs and their targets in the Triticeae. Genome 51(6):433–443. doi:10.1139/G08-025

Dugas DV, Bartel B (2004) MicroRNA regulation of gene expression in plants. Curr Opin Plant Biol 7(5):512–520. doi:10.1016/j.pbi.2004.07.011

Ferreira TH, Gentile A, Vilela RD, Costa GG, Dias LI, Endres L, Menossi M (2012) microRNAs associated with drought response in the bioenergy crop sugarcane (Saccharum spp.). PLoS One 7(10):e46703

Gao P, Bai X, Yang L, Lv D, Li Y, Cai H, Ji W, Guo D, Zhu Y (2010) Over-expression of osa-MIR396c decreases salt and alkali stress tolerance. Planta 231(5):991–1001. doi:10.1007/s00425-010-1104-2

Giacomelli JI, Weigel D, Chan RL, Manavella PA (2012) Role of recently evolved miRNA regulation of sunflower HaWRKY6 in response to temperature damage. New Phytol 195(4):766–773. doi:10.1111/j.1469-8137.2012.04259.x

Gupta OP, Permar V, Koundal V, Singh UD, Praveen S (2012) MicroRNA regulated defense responses in Triticum aestivum L. during Puccinia graminis f.sp. tritici infection. Mol Biol Rep 39(2):817–824. doi:10.1007/s11033-011-0803-5

Hernandez P, Martis M, Dorado G, Pfeifer M, Galvez S, Schaaf S, Jouve N, Simkova H, Valarik M, Dolezel J, Mayer KF (2012) Next-generation sequencing and syntenic integration of flow-sorted arms of wheat chromosome 4A exposes the chromosome structure and gene content. Plant J 69(3):377–386. doi:10.1111/j.1365-313X.2011.04808.x

Hewezi T, Howe P, Maier TR, Baum TJ (2008) Arabidopsis small RNAs and their targets during cyst nematode parasitism. Mol Plant-Microbe Interact: MPMI 21(12):1622–1634. doi:10.1094/MPMI-21-12-1622

Hewezi T, Maier TR, Nettleton D, Baum TJ (2012) The Arabidopsis microRNA396-GRF1/GRF3 regulatory module acts as a developmental regulator in the reprogramming of root cells during cyst nematode infection. Plant Physiol 159(1):321–335. doi:10.1104/pp.112.193649

Jin W, Li N, Zhang B, Wu F, Li W, Guo A, Deng Z (2008) Identification and verification of microRNA in wheat (Triticum aestivum). J Plant Res 121(3):351–355. doi:10.1007/s10265-007-0139-3

Jones-Rhoades MW (2010) Prediction of plant miRNA genes. Methods Mol Biol 592:19–30. doi:10.1007/978-1-60327-005-2_2

Jones-Rhoades MW, Bartel DP, Bartel B (2006) MicroRNAs and their regulatory roles in plants. Ann Rev Plant Biol 57(1):19–53. doi:10.1146/annurev.arplant.57.032905.105218

Kadri S, Hinman V, Benos P (2009) HHMMiR: efficient de novo prediction of microRNAs using hierarchical hidden Markov models. BMC Bioinforma 10(Suppl 1):S35

Kantar M, Unver T, Budak H (2010) Regulation of barley miRNAs upon dehydration stress correlated with target gene expression. Funct Integr Genomics 10(4):493–507. doi:10.1007/s10142-010-0181-4

Kantar M, Lucas SJ, Budak H (2011) miRNA expression patterns of Triticum dicoccoides in response to shock drought stress. Planta 233(3):471–484. doi:10.1007/s00425-010-1309-4

Kantar M, Akpinar BA, Valarik M, Lucas SJ, Dolezel J, Hernandez P, Budak H, International Wheat Genome Sequencing C (2012) Subgenomic analysis of microRNAs in polyploid wheat. Funct Integr Genomics 12(3):465–479. doi:10.1007/s10142-012-0285-0

Kenan-Eichler M, Leshkowitz D, Tal L, Noor E, Melamed-Bessudo C, Feldman M, Levy AA (2011) Wheat hybridization and polyploidization results in deregulation of small RNAs. Genetics 188(2):263–272. doi:10.1534/genetics.111.128348

Khraiwesh B, Zhu J-K, Zhu J (2012) Role of miRNAs and siRNAs in biotic and abiotic stress responses of plants. Biochimica et Biophysica Acta (BBA) - Gene Reg Mech 1819(2):137–148. doi:10.1016/j.bbagrm.2011.05.001

Kozomara A, Griffiths-Jones S (2011) miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res 39(Database issue):D152–D157. doi:10.1093/nar/gkq1027

Kubalakova M, Vrana J, Cihalikova J, Simkova H, Dolezel J (2002) Flow karyotyping and chromosome sorting in bread wheat (Triticum aestivum L.). Theor Appl Genet 104(8):1362–1372. doi:10.1007/s00122-002-0888-2

Kurtoglu KY, Kantar M, Lucas SJ, Budak H (2013) Unique and conserved microRNAs in wheat chromosome 5D revealed by next-generation sequencing. PLoS ONE 8(7):e69801

Li A, Mao L (2007) Evolution of plant microRNA gene families. Cell Res 17(3):212–218. doi:10.1038/sj.cr.7310113

Li Y, Li C, Xia J, Jin Y (2011) Domestication of transposable elements into microRNA genes in plants. PLoS ONE 6(5):e19212. doi:10.1371/journal.pone.0019212

Li Y-F, Zheng Y, Jagadeeswaran G, Sunkar R (2013) Characterization of small RNAs and their target genes in wheat seedlings using sequencing-based approaches. Plant Sci 203–204:17–24. doi:10.1016/j.plantsci.2012.12.014

Liu D, Song Y, Chen Z, Yu D (2009) Ectopic expression of miR396 suppresses GRF target gene expression and alters leaf growth in Arabidopsis. Physiol Plant 136(2):223–236. doi:10.1111/j.1399-3054.2009.01229.x

Liu Z, Kumari S, Zhang L, Zheng Y, Ware D (2012) Characterization of miRNAs in response to short-term waterlogging in three inbred lines of Zea mays. PLoS One 7(6):e39786. doi:10.1371/journal.pone.0039786

Llave C, Xie Z, Kasschau KD, Carrington JC (2002) Cleavage of scarecrow-like mRNA targets directed by a class of Arabidopsis miRNA. Science 297(5589):2053–2056. doi:10.1126/science.1076311

Lucas SJ, Budak H (2012) Sorting the wheat from the chaff: identifying miRNAs in genomic survey sequences of Triticum aestivum chromosome 1AL. PLoS One 7(7):e40859. doi:10.1371/journal.pone.0040859

Markham NR, Zuker M (2008) UNAFold: software for nucleic acid folding and hybridization. Methods Mol Biol 453:3–31. doi:10.1007/978-1-60327-429-6_1

Meyers BC, Axtell MJ, Bartel B, Bartel DP, Baulcombe D, Bowman JL, Cao X, Carrington JC, Chen X, Green PJ, Griffiths-Jones S, Jacobsen SE, Mallory AC, Martienssen RA, Poethig RS, Qi Y, Vaucheret H, Voinnet O, Watanabe Y, Weigel D, Zhu JK (2008) Criteria for annotation of plant microRNAs. Plant Cell 20(12):3186–3190. doi:10.1105/tpc.108.064311

Nischal L, Mohsin M, Khan I, Kardam H, Wadhwa A, Abrol YP, Iqbal M, Ahmad A (2012) Identification and comparative analysis of microRNAs associated with low-N tolerance in rice genotypes. PLoS One 7(12):e50261. doi:10.1371/journal.pone.0050261

Pandey B, Gupta OP, Pandey DM, Sharma I, Sharma P (2013) Identification of new stress-induced microRNA and their targets in wheat using computational approach. Plant Signal Behav 8(5):e23932

Pantaleo V (2011) Plant RNA silencing in viral defence. Adv Exp Med Biol 722:39–58. doi:10.1007/978-1-4614-0332-6_3

Park MY, Wu G, Gonzalez-Sulser A, Vaucheret H, Poethig RS (2005) Nuclear processing and export of microRNAs in Arabidopsis. Proc Natl Acad Sci U S A 102(10):3691–3696. doi:10.1073/pnas.0405570102

Peixoto A, Monteiro M, Rocha B, Veiga-Fernandes H (2004) Quantification of multiple gene expression in individual cells. Genome Res 14(10A):1938–1947. doi:10.1101/gr.2890204


Piriyapongsa J, Jordan IK (2008) Dual coding of siRNAs and miRNAs by plant transposable elements. RNA 14(5):814–821. doi:10.1261/rna.916708

Piriyapongsa J, Mariño-Ramírez L, Jordan IK (2007) Origin and evolution of human microRNAs from transposable elements. Genetics 176(2):1323–1337. doi:10.1534/genetics.107.072553

Reinhart BJ, Weinstein EG, Rhoades MW, Bartel B, Bartel DP (2002) MicroRNAs in plants. Genes Dev 16(13):1616–1626. doi:10.1101/gad.1004402

Ruijter JM, Ramakers C, Hoogaars WMH, Karlen Y, Bakker O, van den Hoff MJB, Moorman AFM (2009) Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res 37(6):e45. doi:10.1093/nar/gkp045

Schreiber A, Shi B-J, Huang C-Y, Langridge P, Baumann U (2011) Discovery of barley miRNAs through deep sequencing of short reads. BMC Genomics 12(1):129

Tang Z, Zhang L, Xu C, Yuan S, Zhang F, Zheng Y, Zhao C (2012) Uncovering small RNA-mediated responses to cold stress in a wheat thermosensitive genic male-sterile line by deep sequencing. Plant Physiol 159(2):721–738. doi:10.1104/pp.112.196048

Teune J-H, Steger G (2010) NOVOMIR: de novo prediction of microRNA-coding regions in a single plant-genome. Journal of Nucleic Acids 2010. doi:10.4061/2010/495904

Unver T, Budak H (2009) Conserved microRNAs and their targets in model grass species Brachypodium distachyon. Planta 230(4):659–669. doi:10.1007/s00425-009-0974-7

Unver T, Namuth-Covert DM, Budak H (2009) Review of current methodological approaches for characterizing microRNAs in plants. Int J Plant Gen. doi:10.1155/2009/262463

Varkonyi-Gasic E, Wu R, Wood M, Walton E, Hellens R (2007) Protocol: a highly sensitive RT-PCR method for detection and quantification of microRNAs. Plant Methods 3(1):12

Vitulo N, Albiero A, Forcato C, Campagna D, Dal Pero F, Bagnaresi P, Colaiacovo M, Faccioli P, Lamontanara A, Simkova H, Kubalakova M, Perrotta G, Facella P, Lopez L, Pietrella M, Gianese G, Dolezel J, Giuliano G, Cattivelli L, Valle G, Stanca AM (2011) First survey of the wheat chromosome 5A composition through a next generation sequencing approach. PLoS One 6(10):e26421. doi:10.1371/journal.pone.0026421

Wei B, Cai T, Zhang R, Li A, Huo N, Li S, Gu Y, Vogel J, Jia J, Qi Y, Mao L (2009) Novel microRNAs uncovered by deep sequencing of small RNA transcriptomes in bread wheat (Triticum aestivum L.) and Brachypodium distachyon (L.) Beauv. Funct Integr Genomics 9(4):499–511. doi:10.1007/s10142-009-0128-9

Wu Y, Wei B, Liu H, Li T, Rayner S (2011) MiRPara: a SVM-based software tool for prediction of most probable microRNA coding regions in genome scale sequences. BMC Bioinform 12:107. doi:10.1186/1471-2105-12-107

Xin M, Wang Y, Yao Y, Xie C, Peng H, Ni Z, Sun Q (2010) Diverse set of microRNAs are responsive to powdery mildew infection and heat stress in wheat (Triticum aestivum L.). BMC Plant Biol 10(1):123

Xu Z, Zhong S, Li X, Li W, Rothstein SJ, Zhang S, Bi Y, Xie C (2011) Genome-wide identification of microRNAs in response to low nitrate availability in maize leaves and roots. PLoS One 6(11):e28009. doi:10.1371/journal.pone.0028009

Xuan P, Guo M, Liu X, Huang Y, Li W, Huang Y (2011) PlantMiRNAPred: efficient classification of real and pseudo plant pre-miRNAs. Bioinformatics 27(10):1368–1376. doi:10.1093/bioinformatics/btr153

Yang JS, Phillips MD, Betel D, Mu P, Ventura A, Siepel AC, Chen KC, Lai EC (2011) Widespread regulatory activity of vertebrate microRNA* species. RNA 17(2):312–326. doi:10.1261/rna.2537911

Yao Y, Sun Q (2012) Exploration of small non coding RNAs in wheat (Triticum aestivum L.). Plant Mol Biol 80(1):67–73. doi:10.1007/s11103-011-9835-4

Yao Y, Guo G, Ni Z, Sunkar R, Du J, Zhu JK, Sun Q (2007) Cloning and characterization of microRNAs from wheat (Triticum aestivum L.). Genome Biol 8(6):R96

Yin Z, Shen F (2010) Identification and characterization of conserved microRNAs and their target genes in wheat (Triticum aestivum). Genet Mol Res 9(2):1186–1196

Yu X, Wang H, Lu Y, de Ruiter M, Cariaso M, Prins M, van Tunen A, He Y (2012) Identification of conserved and novel microRNAs that are responsive to heat stress in Brassica rapa. J Exp Bot 63(2):1025–1038. doi:10.1093/jxb/err337

Zhang BH, Pan XP, Wang QL, Cobb GP, Anderson TA (2005) Identification and characterization of new plant microRNAs using EST analysis. Cell Res 15(5):336–360. doi:10.1038/sj.cr.7290302

Zhang Z, Wei L, Zou X, Tao Y, Liu Z, Zheng Y (2008) Submergence-responsive microRNAs are potentially involved in the regulation of morphological and metabolic adaptations in maize root cells. Ann Bot 102(4):509–519. doi:10.1093/aob/mcn129

Zhou L, Liu Y, Liu Z, Kong D, Duan M, Luo L (2010) Genome-wide identification and analysis of drought-responsive microRNAs in Oryza sativa. J Exp Bot 61(15):4157–4168. doi:10.1093/jxb/erq237

Zhou J, Liu M, Jiang J, Qiao G, Lin S, Li H, Xie L, Zhuo R (2012) Expression profile of miRNAs in Populus cathayana L. and Salix matsudana Koidz under salt stress. Mol Biol Rep 39(9):8645–8654. doi:10.1007/s11033-012-1719-4
