Plant Physiology and Biochemistry 48 (2010) 393-400

Contents lists available at ScienceDirect

Plant Physiology and Biochemistry

journal homepage: www.elsevier.com/locate/plaphy

Research article

Identification of RNA-dependent DNA-methylation regulated promoters in Arabidopsis

Vesselin Baev 1, Mladen Naydenov 1, Elena Apostolova, Desislava Ivanova, Slaveya Doncheva, Ivan Minkov, Galina Yahubyan*

Department of Plant Physiology and Molecular Biology, University of Plovdiv, 24 Tsar Assen St, 4000 Plovdiv, Bulgaria

Article info

Article history:
Received 15 July 2009
Accepted 18 March 2010
Available online 27 March 2010

Keywords:
RNA-dependent DNA methylation
Small interfering RNA
microRNA
Promoters
Transposable elements

* Corresponding author. E-mail address: [email protected] (G. Yahubyan).

1 These authors contributed equally to this work.

0981-9428/$ - see front matter © 2010 Elsevier Masson SAS. All rights reserved. doi:10.1016/j.plaphy.2010.03.013

Abstract

RNA-dependent DNA methylation (RdDM) is an important regulatory event involved in repressive epigenetic modifications that can trigger transcriptional gene silencing (TGS). The criteria we used to pick out promoter sequences targeted by RdDM in Arabidopsis thaliana were the main RdDM hallmark properties: 24nt siRNAs as inducers of DNA methylation and transposable elements (TE) as one of the major targets of RdDM. Those genes whose promoters comprised overlapping sites for 24nt siRNA hits, TE and DNA methylation (siRNA/TE/Methylation overlapping regions) were defined as candidates that might be silenced by RdDM. On this basis two gene sets were created which include abiotic and biotic stress-responsive genes whose promoters may be silenced by RdDM. The DNA methylation status of the At3g50770 (CML41) promoter, one of the selected candidates, was experimentally assayed, and it showed dependence on the RdDM-associated Polymerase IV and Polymerase V. A publicly available 24nt siRNA-centered database called starPRO was developed that allows users to discover easily whether a particular promoter sequence is related to RdDM-associated features such as 24nt siRNA-target sites, TE, tandem repeats and DNA methylation.

© 2010 Elsevier Masson SAS. All rights reserved.

1. Introduction

RNA-dependent DNA methylation (RdDM) is an important regulatory event involved in repressive epigenetic modifications that can trigger transcriptional gene silencing (TGS). RdDM was first discovered in tobacco plants infected with recombinant viroids [1]. In plants, de novo methylation of cytosines was proposed to be induced by small interfering RNAs (siRNAs), mainly 24nt in length [1-3]. The class of 24nt siRNAs is produced by nuclear-encoded components of the RNA interference machinery [4-6] and the plant-specific RNA polymerase IV (Pol IV) and RNA polymerase V (Pol V) [7-11]. RdDM affects symmetrical (CG and CHG, where H is A, C or T, i.e. any base but G) and asymmetrical (CHH) cytosines within the region of homology between the inducing RNA and the target DNA [12]. RdDM can provide a reversible mark for TGS, since it can be lost when the withdrawal of the inducing siRNA signal results in active or passive (in the case of the CHH context) demethylation [13-15].

Though the main targets for RdDM were found to be transposable elements (TE) and repeat elements, there is an increasing number of examples of siRNA-mediated methylation of protein-coding gene promoters. Exogenous promoter-directed siRNAs can induce promoter methylation and TGS of both transgene promoters [2,16,17] and endogenous promoters in different plant species such as Arabidopsis, maize and potato [2,18,19]. Recently, stress-induced demethylation of the H2O2-responsive Ep5C promoter region was demonstrated to activate the GUS transgene in Arabidopsis and tomato upon bacterial infection [20], and the changes in methylation status were mediated by AGO4, an essential effector in RdDM [21]. The genome-wide high-resolution mapping and functional analysis of DNA methylation in Arabidopsis revealed that only about 5% of genes contain methylation within promoter regions [22]. One of the best characterized DNA methylation-dependent promoters is that of the endogenous FWA gene [23], in which TE-derived siRNAs direct TGS and initiation of this silencing is dependent on rdr2, dcl3, and ago4 [4]. It is likely that some stress-responsive genes are regulated by RdDM, particularly when their promoters contain elements such as retrotransposon LTRs, which are able to provide promoter/enhancer activities for adjacent plant genes [24,25] and can be activated in response to stress [25-27].


The discovery of endogenous targets for RdDM is based on Arabidopsis genome-wide approaches including profiling of siRNA [28,29], DNA methylation under normal conditions [22,30] and under pathogen infection [14], and gene expression [31] in RNAi and DNA methylation mutant backgrounds. In this study we applied a computational approach to search for candidate genes that may be silenced by RdDM by initially scanning all Arabidopsis promoter sequences for 24nt siRNA target sites. We sought to establish a link between 24nt siRNAs as inducers of RdDM and promoter regions of protein-coding and microRNA genes as targets for RdDM. We found a large number of 24nt siRNA-target promoters, which were then screened for TE, and those marks were correlated with the methylation status of the promoter loci. We have integrated these data in a publicly available 24nt siRNA-centered database called starPRO that allows users to discover easily whether a particular promoter sequence is related to RdDM-associated features such as 24nt siRNA-target sites, TE, tandem repeats and DNA methylation. The database is available at http://bioinfo.uni-plovdiv.bg/starpro. Methylation status was experimentally determined for the promoters of the stress-responsive genes At3g50770, At5g43260 and At4g09460, and the DNA methylation of the At3g50770 (CML41) promoter region showed dependence on the RdDM-associated Pol IV and Pol V.

2. Materials and methods

2.1. Protein-coding promoter sequences

Promoter regions were obtained from the Arabidopsis Gene Regulatory Information Server (AGRIS) [32]. Sequences were formatted as multi-FASTA files and were intersected via accession number with the TAIR8 gene functional description file (www.arabidopsis.org).

2.2. miRNA promoter sequences

We used 63 known microRNA promoter sequences from our previous research [33]. For the remaining Arabidopsis miRNAs, a 1000-nt region upstream of the start of each precursor was downloaded from miRBase, release 12 (http://microrna.sanger.ac.uk) [34].

2.3. siRNA sequences and siRNA-target site creation

All small RNA sequences were downloaded from the ASRP database [35]. The target site of each siRNA was created by piping the sequences to a Perl script that produces the reverse-complement sequence while keeping the FASTA header.
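The target-site step can be illustrated with a minimal Python sketch. It is not the authors' Perl script, which is not reproduced in the article; the function names and the record name `ASRP_example` are hypothetical, and the sketch assumes plain FASTA input in which RNA letters (U) are read as T.

```python
import io

# Complement table for DNA and RNA letters (U is read as T); case is preserved.
_COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence as DNA."""
    return seq.translate(_COMPLEMENT)[::-1]


def target_sites(fasta_text: str) -> str:
    """Reverse-complement every record of a FASTA string, keeping each header."""
    records = []
    header, chunks = None, []
    for line in io.StringIO(fasta_text):
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append(f"{header}\n{reverse_complement(''.join(chunks))}")
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        records.append(f"{header}\n{reverse_complement(''.join(chunks))}")
    return "\n".join(records)


if __name__ == "__main__":
    # Hypothetical record name and sequence, for illustration only.
    print(target_sites(">ASRP_example\nAACG\n"))
```

Keeping the header unchanged means each target site can still be traced back to its siRNA accession after the search step.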

2.4. Gene Ontology enrichment

Gene Ontology enrichment was performed with the AmiGO Term Enrichment tool using the TAIR database (http://amigo.geneontology.org/).

2.5. BLAST analysis

To check for the presence of siRNA target sites in promoter sequences (protein-coding and miRNA gene promoters), BLAST analysis was performed with default parameters, and no mismatches were allowed. A local copy of the BLAST tool was obtained from the NCBI FTP server (ftp://ftp.ncbi.nih.gov/blast).
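Because no mismatches were allowed, the outcome of this search is equivalent to locating exact occurrences of each target-site sequence on the promoter forward strand. The sketch below illustrates that idea with plain string matching; it is a stand-in for, not a reimplementation of, the BLAST run used in the study. The function name and the example identifiers `P1` and `s1` are hypothetical, and coordinates are reported as 1-based inclusive start/stop positions.

```python
def find_exact_hits(promoters, targets):
    """Return (promoter_id, target_id, start, stop) for every exact match.

    promoters and targets map identifiers to sequences. Positions are
    1-based and inclusive, and overlapping matches are all reported.
    """
    hits = []
    for promoter_id, promoter_seq in promoters.items():
        promoter_seq = promoter_seq.upper()
        for target_id, target_seq in targets.items():
            target_seq = target_seq.upper()
            start = promoter_seq.find(target_seq)
            while start != -1:
                hits.append((promoter_id, target_id,
                             start + 1, start + len(target_seq)))
                # Advance by one base so overlapping matches are found too.
                start = promoter_seq.find(target_seq, start + 1)
    return hits


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    print(find_exact_hits({"P1": "TTTACGTACGTAA"}, {"s1": "ACGTA"}))
```

With a 24nt target site, a hit starting at position 544 ends at position 567, which is the start/stop convention a perfect full-length match produces.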

2.6. Transposon masking

We used a local copy of RepeatMasker (Version 3.1.5; A. F. A. Smit, R. Hubley, and P. Green, www.repeatmasker.org) to mask promoter sequences with the available Arabidopsis RepBase collection (http://www.girinst.org).

2.7. Tandem repeats

The dataset of promoter sequences was screened for tandem repeat loci using the software Tandem Repeats Finder (TRF, version 3.21), developed by Benson and collaborators [36].

2.8. Methylation data

DNA methylation patterns were extracted from the col BU/UB GFF files provided by the TAIR FTP server, which originate from the data of Zhang and collaborators [22].

2.9. starPRO database integration

The starPRO database is implemented as a Perl CGI script, designed in a modular form that enables the use of specialised packages such as BioPerl (modules for developers of Perl-based software for life science research). Perl 5.8.5 (www.perl.com) and BioPerl 1.5 (www.bioperl.org) were used.

2.10. Plant material

Plants used in this study were of the Arabidopsis thaliana Columbia (Col-0) accession. The mutant alleles were nrpd1a-2 and nrpd1b-2. Plants were grown for 17 days at 21 °C with 16 h light/8 h dark cycles on plates with MS medium supplemented with 1% sucrose. Three replicates were performed, with 30 whole seedlings pooled in each.

2.11. DNA and methylation-sensitive PCR

Total DNA was extracted using the DNeasy Plant Mini Kit (Qiagen), and 100 ng of DNA were digested with 1 U of McrBC (New England Biolabs) for 8 h. PCR amplification was subsequently done on 5 ng of digested and mock-digested DNA with primers flanking the promoter 24nt siRNA-target region. The following primer pairs were used: 5'-CAGGATACGATTGTCACAAG-3' and 5'-CATGAGAGATTCACCATGTTG-3' for At3g50770, 5'-GATAGGAGCTTTGAAACTTGC-3' and 5'-TCAACTTGATGTTCTTGGAAAG-3' for At5g43260, and 5'-TTGAATCTTACAACAATGGTC-3' and 5'-GAACGACGTCGTATGAAGGAG-3' for At4g09460. PCR protocol: 94 °C for 2 min, followed by 30 cycles of 94 °C for 45 s, 54 °C for 45 s and 72 °C for 1 min, and a final elongation at 72 °C for 7 min.

2.12. RNA and semiquantitative RT-PCR

Total RNA was extracted using the RNeasy Plant Mini Kit (Qiagen), and 1 µg was subjected to semiquantitative Long Range 2-Step RT-PCR (Qiagen). The following primer pairs were used: 5'-CAACATAAAGAGTCACCAGG-3' and 5'-TGGCTTCACACTCTCCGTACG-3' for At3g50770, 5'-ACCAACAATGAGTCCGATCG-3' and 5'-GCCACCAATAATGACAAATTCG-3' for At5g43260, and 5'-TCCGTAATCACGGTGAAGG-3' and 5'-CAACAGTGACAAACGCGCC-3' for At4g09460. The housekeeping EF-1α cDNA was amplified with the primer pair 5'-ATTGTGGTCATTGGYCAYGT-3' and 5'-CCAATCTTGTAVACATCCTG-3'. PCR protocol: 94 °C for 5 min, followed by 25 cycles of 94 °C for 30 s, 54 °C for 30 s and 72 °C for 1 min, and a final elongation at 72 °C for 7 min. A parallel set of reactions without addition of reverse transcriptase was run as a quality control.

3. Results

3.1. Local promoter database

The first step of the study was to identify A. thaliana genes encoding proteins and miRNAs whose promoter regions might be potential targets for siRNAs of the 24nt size class, a hallmark of RdDM. In total, 25 516 promoter sequences of annotated Arabidopsis genes were extracted from the AGRIS information resource to a local database. Since the sequences of the AGRIS database are only partially curated, our local dataset may contain intergenic regions comprising cis-regulatory elements with promoter functions. Three subsets, abiotic (1478 entries), biotic (557 entries) and miRNA gene promoters (178 entries) (Fig. 1), were constructed after extraction of gene products associated with the response to abiotic and biotic stimuli from the Gene Ontology DB (http://www.amigo.geneontology.org).

3.2. Distribution of 24nt siRNA-target sites, transposable elements (TE) and DNA methylation

3.2.1. All protein-coding promoters

Promoter loci containing target sites for 24nt siRNAs were determined by base pairing of the promoter sequence and the reverse-complementary sequence of the siRNA (downloaded from ASRP) through BLAST. All Arabidopsis promoter sequences were scanned by BLAST for siRNA reverse-complementary sequences, whose hits represent siRNA-target sites. This analysis identified 32 455 hits, which fell in 4775 unique siRNA-target promoters. Since the promoter sequences are provided as a single 5'-3' strand, we identified siRNA-target sites in the promoter forward strand. RepeatMasker detected 8079 TE sequences in 5011 promoter loci. The number of unique promoter loci comprising both 24nt siRNA-target sites and TE was estimated at 3061, and in 2546 promoter loci the siRNA-target sites were situated within TE (siRNA/TE overlapping regions) (Fig. 2A).
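The distinction between a promoter that merely contains both marks and one in which the siRNA-target site lies within a TE comes down to comparing coordinate intervals. The following sketch shows one way to do this under stated assumptions: the article does not say whether partial overlap or full containment was required, so both tests are given, and the function names and toy coordinates are hypothetical.

```python
def overlaps(a, b):
    """True if closed intervals a=(start, stop) and b=(start, stop) share a base."""
    return a[0] <= b[1] and b[0] <= a[1]


def contained(inner, outer):
    """True if interval inner lies entirely within interval outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def sirna_te_regions(sirna_hits, te_spans, require_containment=True):
    """Return the siRNA hits that fall in (or overlap) any TE span of a promoter."""
    test = contained if require_containment else overlaps
    return [hit for hit in sirna_hits if any(test(hit, te) for te in te_spans)]


if __name__ == "__main__":
    # Toy promoter: two 24nt hits and one TE annotated at positions 500-700.
    hits = [(544, 567), (10, 33)]
    print(sirna_te_regions(hits, [(500, 700)]))
```

A promoter would count towards the siRNA/TE overlapping class when this function returns at least one hit for it.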

To check the DNA methylation status of the siRNA/TE overlapping regions, we scanned the DNA methylation map of the entire Arabidopsis genome [22]. The co-localization of siRNA-binding sites, TE and DNA methylation in the wild-type identified the siRNA/TE regions at which DNA was methylated (siRNA/TE/Meth overlapping regions). They were found at 1619 promoter loci, which represent ~6.3% of the Arabidopsis promoters.

The promoter set with the siRNA/TE/Meth overlapping regions was analyzed for enrichment or depletion of GO categories with the AmiGO Term Enrichment tool using the TAIR database as the background set, but no significantly enriched GO categories were found. The siRNA/TE/Meth overlapping regions were analyzed for the abundance and relative distribution of major TE classes (Fig. 2A). These regions contained non-LTR and LTR retrotransposons, and DNA transposons, with DNA/MuDR and RC/Helitrons as the most abundant classes. Among the 1619 promoter loci with siRNA/TE/Meth overlapping regions, we found 35 bi-directional promoter pairs arranged head-to-head on opposite strands of DNA, where the promoter sequences of each pair comprised the same siRNA/TE/Meth overlapping region (Supplementary table A).

3.2.2. Abiotic stress gene promoters

The BLAST search revealed that among 1478 abiotic gene promoter loci, 267 loci had regions which were potential 24nt siRNA targets (Supplementary table B); RepeatMasker revealed that 265 promoter loci had TE, and 161 loci comprised both siRNA-hit regions and TE (Fig. 2C). The co-localization of siRNA-binding sites, TE and DNA methylation identified 81 promoter loci that contain siRNA/TE/Meth overlapping regions (~5.5% of the abiotic stress gene promoters). RC/Helitrons were the most abundant TE class.

3.2.3. Biotic stress gene promoters

Among 557 biotic gene promoter loci, 74 were found to be putative siRNA targets (Supplementary table C). RepeatMasker picked out 44 biotic gene promoters comprising 24nt siRNA cognate sites and TE (Fig. 2B). In those promoter sequences, siRNA/TE/Meth overlapping regions were found in 19 promoter loci (~3.4% of the biotic stress gene promoters). The overrepresented TE class in siRNA/TE/Meth overlapping regions was again that of helitrons.

miRNA gene promoters: Of the 178 miRNA gene promoters, we found siRNA-binding sites in 19 (Table 1). In these miRNA promoter sequences, significant TE or TR elements were not present, except for several low-complexity repeats. Several of the siRNA-target promoters regulate the expression of miRNA genes involved in stress response. miR396 was found to be upregulated under salt, osmotic and cold stress [37], and miR396 was downregulated in response to oxidative stress in Arabidopsis [38-40].

Fig. 1. Flowchart for identification of 24nt siRNA-target sites in the main gene promoter sets. The number of gene promoters passing each level is shown in parentheses.

Fig. 2. Distribution of the RdDM marks (24nt siRNA-target sites, TE and DNA methylation) within the promoter sequences of three different gene categories: the Arabidopsis protein-coding genes (A), biotic (B) and abiotic (C) stress-responsive genes. The number of promoter sequences with 24nt siRNA-target sites (light green), the number of promoter sequences with TE (green), the number of promoter sequences with siRNA/TE overlapping regions (light orange), and the number of promoter sequences with siRNA/TE/Meth overlapping regions (blue) are presented as a percentage of the respective gene category. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
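The percentages quoted in Sections 3.2.1 to 3.2.3 follow directly from the reported counts of promoter loci with siRNA/TE/Meth overlapping regions and the sizes of the respective promoter sets. The short check below recomputes them; the dictionary keys and the function name are illustrative labels only.

```python
# (loci with siRNA/TE/Meth overlapping regions, promoters in the set),
# taken from the counts reported in Sections 3.1 and 3.2.
COUNTS = {
    "all protein-coding": (1619, 25516),
    "abiotic stress": (81, 1478),
    "biotic stress": (19, 557),
}


def percentage(part: int, total: int) -> float:
    """Share of `part` in `total`, in percent, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


if __name__ == "__main__":
    for name, (part, total) in COUNTS.items():
        print(f"{name}: {percentage(part, total)}%")
```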

3.3. starPRO DB

starPRO (http://bioinfo.uni-plovdiv.bg/starpro) is an integrated database of promoter sequences with annotations for several features important for RdDM: 24nt siRNA-target sites, TE, tandem repeats (TR), CHH sites and the wild-type DNA methylation profile. The current version, starPRO v1.0, contains all promoter sequences from the Arabidopsis genome available from the AGRIS DB, i.e. promoter sequences for 25 516 protein-coding genes and 178 miRNAs. Users can search starPRO for a specific gene promoter through its TAIR accession number. The locations of the features mentioned above are pre-calculated for each promoter. The database includes 32 380 siRNA-target sites (named according to the siRNA accession number in the ASRP DB), 8054 TE and 7805 TR. The promoter sequence queried by the user is scanned de novo for CHH sites. The database provides two options for data presentation: visual graphics and table format (Fig. 3).

The I/O sequence operations, graphics visualization and methylation profile (histogram plot) are performed by BioPerl modules incorporated within the main Perl script.
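A de novo CHH scan of the kind mentioned above can be sketched in a few lines. This is only an illustration of the idea, not the starPRO code, which is not shown in the article: it assumes that only the forward strand is scanned and that positions are reported as 1-based coordinates of the cytosine; the function name is hypothetical.

```python
import re

# A cytosine followed by two bases that are not G (H = A, C or T).
# The lookahead keeps matches from consuming the next bases, so
# overlapping CHH sites such as those in "CCCA" are all reported.
_CHH = re.compile(r"C(?=[ACT]{2})")


def chh_sites(seq: str):
    """Return 1-based positions of cytosines in a CHH context (forward strand)."""
    return [match.start() + 1 for match in _CHH.finditer(seq.upper())]


if __name__ == "__main__":
    # Toy sequence: the C at position 4 is followed by GG and is skipped.
    print(chh_sites("CAACGGCTT"))
```

Cytosines on the reverse strand would need a second pass over the reverse-complemented sequence, with positions mapped back to forward-strand coordinates.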

3.4. Experimental validation

The genes At3g50770, At5g43260 and At4g09460, identified in the computational analysis as potential candidates regulated by RdDM, were subjected to validation through methylation-sensitive PCR of the siRNA-target promoter regions and semiquantitative RT-PCR of the related transcripts. For both assays we compared the wild-type with the nrpd1a-2 and nrpd1b-1 mutants, defective in the largest subunit of Pol IV (NRPD1) and Pol V (NRPE1), respectively. For each promoter locus, primer pairs were designed to flank the siRNA-target region. To analyze the methylation status at the siRNA-target promoter regions of At3g50770, At5g43260 and At4g09460, genomic DNA was digested with McrBC, which recognizes and cleaves DNA containing at least two 5-methylcytosines preceded by a purine and located in one of the CG, CHG and CHH sequence contexts. Successful amplification after digestion indicates lack of methylation.

Fig. 4A shows the methylation status at these regions in the promoters of the candidate genes for RdDM. In wild-type plants, the observed lack of PCR amplification after McrBC treatment revealed DNA methylation at the siRNA-target regions of the three protein-coding gene promoters. The DNA methylation was completely lost at the At3g50770 and At5g43260 promoters in the nrpd1a-2 and nrpd1b-1 mutant lines, while it remained unaffected by the lack of NRPD1 and NRPE1 at the At4g09460 promoter region. To see whether the observed promoter methylation affects the expression of the cognate genes, we extended the methylation analysis by semiquantitative RT-PCR (Fig. 4B). Only the At3g50770 transcript showed high dependency on NRPD1 and NRPE1, since it had a very low expression level in the wild-type, while in the nrpd1a-2 and nrpd1b-1 mutants its expression was enhanced. The At5g43260 and At4g09460 transcripts accumulated to equal, relatively high levels in the wild-type and in both mutants.

Table 1
Promoter regions of miRNA genes which have 24nt siRNA target sites.

miRNA promoter  ASRP siRNA   Start a  Stop  miRNA targets (TAIR)
ath-miR156a     ASRP4176     544      567   SBP
ath-miR156f     ASRP75724    213      236   SBP
ath-MIR159c     ASRP111211   272      295   MYB
ath-MIR165b     ASRP49408    3        26    HD-ZIPIII
ath-MIR396b     ASRP143314   19       42    GRF
                ASRP128550   21       44
ath-MIR398b     ASRP168395   147      171   CSD, CytC
                ASRP21189    170      193
                ASRP194951   148      171
ath-MIR401      ASRP60696    201      224   unknown
                ASRP53784    386      409
                ASRP38804    427      450
ath-miR403      ASRP195660   229      252   AGO2
                ASRP183243   27       50
                ASRP43915    176      199
                ASRP42542    254      277
ath-MIR405a     ASRP190248   806      829   transposon derived mi
                ASRP74550    90       113
ath-MIR405d     ASRP112929   447      470   transposon derived mi
ath-MIR406      ASRP203501   626      649   At2g42510(3) and At1g54380(3)
                ASRP132930   730      753
                ASRP130831   629      652
                ASRP73649    633      656
                ASRP27943    639      662
                ASRP103      648      671
ath-MIR426      ASRP209133   47       70    unknown
                ASRP159474   54       77
                ASRP156614   55       78
                ASRP138137   57       80
                ASRP72761    49       72
                ASRP69993    66       89
                ASRP48457    1        24
                ASRP40031    50       73
                ASRP21479    49       72
                ASRP18956    46       69
                ASRP18693    48       71
ath-MIR773      ASRP69661    74       97    MET2
ath-mir780      ASRP190433   564      587   CHX18
                ASRP175127   498      521
                ASRP150823   503      526
                ASRP122846   403      426
                ASRP121043   412      435
                ASRP120517   415      438
                ASRP119743   491      514
                ASRP108833   404      427
                ASRP104516   488      511
                ASRP104150   411      434
                ASRP80840    402      425
                ASRP67074    500      523
                ASRP43228    407      430
ath-mir832      ASRP151095   524      547   unknown
ath-mir841      ASRP17983    616      639   unknown
ath-mir849      ASRP160162   480      503   unknown
                ASRP131879   507      530
ath-mir868      ASRP193954   69       93    unknown
                ASRP215969   84       107
                ASRP215055   283      306
                ASRP179370   593      616
                ASRP171901   527      550
                ASRP166703   272      295
                ASRP164795   338      361
                ASRP150994   143      166
                ASRP134886   270      293
                ASRP130353   73       96
                ASRP127946   204      227
                ASRP118830   129      152
                ASRP118465   267      290
                ASRP76570    579      602
                ASRP59948    132      155
                ASRP57585    665      688
                ASRP55671    233      256
                ASRP41718    207      230
                ASRP38875    47       70

a Start/end position in promoter region sequence according to AGRIS.

4. Discussion

In Arabidopsis many endogenous genes are methylated either within their promoters or within their transcribed regions, and promoter methylation may correlate with TGS [41-44]. In this study we applied a computational approach to identify those endogenous gene promoters whose methylation can be induced by 24nt siRNAs. The criteria we used to pick out putative RdDM-regulated promoter sequences were the main RdDM hallmark properties: 24nt siRNAs (as DNA methylation inducers) and TE (as major targets of RdDM). Our analysis showed that besides TE, pseudogenes residing in some promoters might be targets for 24nt siRNAs as well (data not shown). Those promoters in which the siRNA target site(s) overlapped with the TE position were used for further analysis. To make this search more precise, the output target promoter sequences were associated with their methylation profile [22]. It should be noted that we integrated a genome-wide DNA methylation profile and an siRNA database that were produced in different experimental settings, with DNA extracted from aerial parts of plants (http://epigenomics.mcdb.ucla.edu/DNAmeth) and RNA extracted from different plant tissues at different developmental stages (http://asrp.cgrb.oregonstate.edu), respectively. Only promoter loci which comprised siRNA/TE/Meth overlapping regions were defined as candidate promoters that might be regulated by RdDM, and they accounted for ~6.3% of all protein-coding gene promoters in Arabidopsis. This number exceeds the 5% of methylated promoters reported by Zhang and coworkers (2006) because the length of the promoter sequences analyzed here might vary from 500 up to 3000nt [32]. The identification of siRNA/TE/Meth overlapping regions in the shared sequences of bi-directional promoters implies a role for RdDM in the co-regulated expression of adjacent genes.

On the presumption that stress-responsive genes might be among the most likely candidates for regulation by reversible RdDM, two promoter sets were created which comprise promoters of abiotic and biotic stress-responsive genes. Integrating the stress-responsive RdDM candidate promoters selected in our study with available microarray expression data from mutants defective in the RdDM pathway, or from plants grown under different environmental stresses, could result in more precise identification of stress genes that may be transcriptionally silenced by siRNA-directed DNA methylation.

Methylation-sensitive PCR and semiquantitative RT-PCR validated the dependence of the promoter methylation status and the transcript expression of At3g50770 on NRPD1 and NRPE1. Upregulated expression of At3g50770 was demonstrated as well in the dcl4/rdr2 mutant [29]. The promoter siRNA-target region comprises very closely located target sites for 24nt siRNAs, contains mainly asymmetric CHH sites, co-localizes with a TE, and overlaps with the transcription start site. Thus, although it has a DNA TE instead of a SINE retrotransposon, the CML41 promoter bears a close resemblance to the Arabidopsis FWA promoter and, like it, is under the control of RdDM. At3g50770 encodes a putative calmodulin-like protein (CML41) that was predicted, unlike the great majority of calmodulin-sensing proteins, to act in chloroplasts [45], and was proven to play a role in dampening the immune response [46].

Fig. 3. The user web interface of the starPRO database (http://bioinfo.uni-plovdiv.bg/starpro). The promoter sequence of At3g50770 is presented, since it was one of the selected RdDM-dependent stress-responsive gene promoters that was validated in our study.


4.1. starPRO DB features

The main advantage of starPRO is the easy and direct visualization of the gene promoter sequences together with a detailed picture of the main RdDM hallmarks, which are: target loci (hit positions) only for 24nt siRNAs (which are thought to be the main inducers of RdDM), TE with family classification, tandem repeats with copy number, and DNA methylation status. All these features can be retrieved directly by gene accession number, unlike in genome-oriented (GBrowser system) DBs, which require "walking around" to find a putative promoter region. We noticed that non-coding RNA and miRNA promoter annotations are quite poor in other DBs. For this reason we have included putative promoter regions for most of the Arabidopsis miRNAs in starPRO DB.

Fig. 4. Experimental validation of the identified gene candidates regulated by RdDM. A. Promoter analysis: McrPCR was carried out on untreated and McrBC-treated genomic DNA amplified by primers for the At3g50770, At5g43260 and At4g09460 promoters in wild-type and nrpd1a-2 and nrpd1b-1 mutant lines. B. Transcript analysis: Semiquantitative RT-PCR analysis of At3g50770, At5g43260 and At4g09460 cDNA in wild-type and nrpd1a-2 and nrpd1b-1 mutant lines. Mock RT-PCR was performed without reverse transcriptase (-RT) using EF-1α primers. The same primers were used for normalization of input RNA for each genotype.

With starPRO DB the user can search for a specific promoter sequence using only the TAIR accession number of a gene. In this way the user can easily discover whether RdDM-associated features overlap in a specific region. In addition to the graphical representation, starPRO DB delivers tabular results containing all feature elements and their location within the promoter. An image file is also provided in downloadable PNG format for further examination.

Acknowledgements

We thank Thierry Lagrange and lab colleagues from the LGDP, University of Perpignan, France, for providing mutant seed stocks. We thank Molly Megraw from the Center for Bioinformatics, University of Pennsylvania, Philadelphia, PA, USA, for critical reading of the manuscript. This work was supported by the Bulgarian National Science Fund (grant DOO2-279, VU-204).

Appendix. Supplementary material

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.plaphy.2010.03.013.

References

[1] M. Wassenegger, S. Heimes, L. Riedel, H.L. Sanger, RNA-directed de novomethylation of genomic sequences in plants. Cell 76 (1994) 567e576.

[2] B. Huettel, T. Kanno, L. Daxinger, E. Bucher, J. van der Winden, A.J. Matzke,M. Matzke, RNA-directed DNA methylation mediated by DRD1 and Pol IVb:a versatile pathway for transcriptional gene silencing in plants. Biochim.Biophys. Acta 1769 (2007) 358e374.

[3] M. Matzke, W. Aufsatz, T. Kanno, L. Daxinger, I. Papp, M.F. Mette, A.J. Matzke, Genetic analysis of RNA-mediated transcriptional gene silencing. Biochim. Biophys. Acta 1677 (2004) 129–141.

[4] S.W. Chan, D. Zilberman, Z. Xie, L.K. Johansen, J.C. Carrington, S.E. Jacobsen, RNA silencing genes control de novo DNA methylation. Science 303 (2004) 1336.

[5] Z. Xie, L.K. Johansen, A.M. Gustafson, K.D. Kasschau, A.D. Lellis, D. Zilberman, S.E. Jacobsen, J.C. Carrington, Genetic and functional diversification of small RNA pathways in plants. PLoS Biol. 2 (2004) E104.

[6] D. Zilberman, X. Cao, S.E. Jacobsen, ARGONAUTE4 control of locus-specific siRNA accumulation and DNA and histone methylation. Science 299 (2003) 716–719.

[7] A.J. Herr, M.B. Jensen, T. Dalmay, D.C. Baulcombe, RNA polymerase IV directs silencing of endogenous DNA. Science 308 (2005) 118–120.

[8] T. Kanno, B. Huettel, M.F. Mette, W. Aufsatz, E. Jaligot, L. Daxinger, D.P. Kreil, M. Matzke, A.J. Matzke, Atypical RNA polymerase subunits required for RNA-directed DNA methylation. Nat. Genet. 37 (2005) 761–765.

[9] Y. Onodera, J.R. Haag, T. Ream, P.C. Nunes, O. Pontes, C.S. Pikaard, Plant nuclear RNA polymerase IV mediates siRNA and DNA methylation-dependent heterochromatin formation. Cell 120 (2005) 613–622.

[10] D. Pontier, G. Yahubyan, D. Vega, A. Bulski, J. Saez-Vasquez, M.A. Hakimi, S. Lerbs-Mache, V. Colot, T. Lagrange, Reinforcement of silencing at transposons and highly repeated sequences requires the concerted action of two distinct RNA polymerases IV in Arabidopsis. Genes Dev. 19 (2005) 2030–2040.

[11] X. Zhang, I.R. Henderson, C. Lu, P.J. Green, S.E. Jacobsen, Role of RNA polymerase IV in plant small RNA metabolism. Proc. Natl. Acad. Sci. U. S. A. 104 (2007) 4536–4541.

[12] T. Pelissier, S. Thalmeir, D. Kempe, H.L. Sanger, M. Wassenegger, Heavy de novo methylation at symmetrical and non-symmetrical sites is a hallmark of RNA-directed DNA methylation. Nucleic Acids Res. 27 (1999) 1625–1634.

[13] J. Zhu, A. Kapoor, V.V. Sridhar, F. Agius, J.K. Zhu, The DNA glycosylase/lyase ROS1 functions in pruning DNA methylation patterns in Arabidopsis. Curr. Biol. 17 (2007) 54–59.

[14] V. Pavet, C. Quintero, N.M. Cecchini, A.L. Rosa, M.E. Alvarez, Arabidopsis displays centromeric DNA hypomethylation and cytological alterations of heterochromatin upon attack by Pseudomonas syringae. Mol. Plant-Microbe Interact. 19 (2006) 577–587.

[15] T. Morales-Ruiz, A.P. Ortega-Galisteo, M.I. Ponferrada-Marin, M.I. Martinez-Macias, R.R. Ariza, T. Roldan-Arjona, DEMETER and REPRESSOR OF SILENCING 1 encode 5-methylcytosine DNA glycosylases. Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 6853–6858.

[16] M.F. Mette, W. Aufsatz, J. van der Winden, M.A. Matzke, A.J. Matzke, Transcriptional silencing and promoter methylation triggered by double-stranded RNA. EMBO J. 19 (2000) 5194–5201.

[17] P. Mourrain, R. van Blokland, J.M. Kooter, H. Vaucheret, A single transgene locus triggers both transcriptional and post-transcriptional silencing through double-stranded RNA production. Planta 225 (2007) 365–379.

[18] A.M. Cigan, E. Unger-Wallace, K. Haug-Collet, Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J. 43 (2005) 929–940.

[19] B.H. Heilersig, A.E. Loonen, E.M. Janssen, A.M. Wolters, R.G. Visser, Efficiency of transcriptional gene silencing of GBSSI in potato depends on the promoter region that is used in an inverted repeat. Mol. Genet. Genomics 275 (2006) 437–449.

[20] A. Agorio, P. Vera, ARGONAUTE4 is required for resistance to Pseudomonas syringae in Arabidopsis. Plant Cell 19 (2007) 3778–3790.

[21] Y. Qi, X. He, X.J. Wang, O. Kohany, J. Jurka, G.J. Hannon, Distinct catalytic and non-catalytic roles of ARGONAUTE4 in RNA-directed DNA methylation. Nature 443 (2006) 1008–1012.

[22] X. Zhang, J. Yazaki, A. Sundaresan, S. Cokus, S.W. Chan, H. Chen, I.R. Henderson, P. Shinn, M. Pellegrini, S.E. Jacobsen, J.R. Ecker, Genome-wide high-resolution mapping and functional analysis of DNA methylation in Arabidopsis. Cell 126 (2006) 1189–1201.

[23] Z. Lippman, B. May, C. Yordan, T. Singer, R. Martienssen, Distinct mechanisms determine transposon inheritance and methylation via small interfering RNA and histone modification. PLoS Biol. 1 (2003) E67.


[24] B. Huettel, T. Kanno, L. Daxinger, W. Aufsatz, A.J. Matzke, M. Matzke, Endogenous targets of RNA-directed DNA methylation and Pol IV in Arabidopsis. EMBO J. 25 (2006) 2828–2836.

[25] K. Kashkush, M. Feldman, A.A. Levy, Transcriptional activation of retrotransposons alters the expression of adjacent genes in wheat. Nat. Genet. 33 (2003) 102–106.

[26] M.A. Grandbastien, Stress activation and genomic impact of plant retrotransposons. J. Soc. Biol. 198 (2004) 425–432.

[27] N. Steward, T. Kusano, H. Sano, Expression of ZmMET1, a gene encoding a DNA methyltransferase from maize, is associated not only with DNA replication in actively proliferating cells, but also with altered DNA methylation status in cold-stressed quiescent cells. Nucleic Acids Res. 28 (2000) 3250–3259.

[28] R. Rajagopalan, H. Vaucheret, J. Trejo, D.P. Bartel, A diverse and evolutionarily fluid set of microRNAs in Arabidopsis thaliana. Genes Dev. 20 (2006) 3407–3425.

[29] K.D. Kasschau, N. Fahlgren, E.J. Chapman, C.M. Sullivan, J.S. Cumbie, S.A. Givan, J.C. Carrington, Genome-wide profiling and analysis of Arabidopsis siRNAs. PLoS Biol. 5 (2007) e57.

[30] D. Zilberman, M. Gehring, R.K. Tran, T. Ballinger, S. Henikoff, Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat. Genet. 39 (2007) 61–69.

[31] Y. Kurihara, A. Matsui, M. Kawashima, E. Kaminuma, J. Ishida, T. Morosawa, Y. Mochizuki, N. Kobayashi, T. Toyoda, K. Shinozaki, M. Seki, Identification of the candidate genes regulated by RNA-directed DNA methylation in Arabidopsis. Biochem. Biophys. Res. Commun. 376 (2008) 553–557.

[32] R.V. Davuluri, H. Sun, S.K. Palaniswamy, N. Matthews, C. Molina, M. Kurtz, E. Grotewold, AGRIS: Arabidopsis gene regulatory information server, an information resource of Arabidopsis cis-regulatory elements and transcription factors. BMC Bioinf. 4 (2003) 25.

[33] M. Megraw, V. Baev, V. Rusinov, S.T. Jensen, K. Kalantidis, A.G. Hatzigeorgiou, MicroRNA promoter element discovery in Arabidopsis. RNA 12 (2006) 1612–1619.

[34] S. Griffiths-Jones, R.J. Grocock, S. van Dongen, A. Bateman, A.J. Enright, miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 34 (2006) D140–D144.

[35] A.M. Gustafson, E. Allen, S. Givan, D. Smith, J.C. Carrington, K.D. Kasschau, ASRP: the Arabidopsis small RNA project database. Nucleic Acids Res. 33 (2005) D637–D640.

[36] Y. Gelfand, A. Rodriguez, G. Benson, TRDB – the Tandem Repeats Database. Nucleic Acids Res. 35 (2007) D80–D87.

[37] H.H. Liu, X. Tian, Y.J. Li, C.A. Wu, C.C. Zheng, Microarray-based analysis of stress-regulated microRNAs in Arabidopsis thaliana. RNA 14 (2008) 836–843.

[38] R. Sunkar, J.K. Zhu, Novel and stress-regulated microRNAs and other small RNAs from Arabidopsis. Plant Cell 16 (2004) 2001–2019.

[39] R. Sunkar, A. Kapoor, J.K. Zhu, Posttranscriptional induction of two Cu/Zn superoxide dismutase genes in Arabidopsis is mediated by downregulation of miR398 and important for oxidative stress tolerance. Plant Cell 18 (2006) 2051–2065.

[40] G. Jagadeeswaran, A. Saini, R. Sunkar, Biotic and abiotic stress down-regulate miR398 expression in Arabidopsis. Planta 229 (2009) 1009–1014.

[41] M. Gehring, J.H. Huh, T.F. Hsieh, J. Penterman, Y. Choi, J.J. Harada, R.B. Goldberg, R.L. Fischer, DEMETER DNA glycosylase establishes MEDEA polycomb gene self-imprinting by allele-specific demethylation. Cell 124 (2006) 495–506.

[42] P.E. Jullien, T. Kinoshita, N. Ohad, F. Berger, Maintenance of DNA methylation during the Arabidopsis life cycle is essential for parental imprinting. Plant Cell 18 (2006) 1360–1372.

[43] T. Kinoshita, A. Miura, Y. Choi, Y. Kinoshita, X. Cao, S.E. Jacobsen, R.L. Fischer, T. Kakutani, One-way control of FWA imprinting in Arabidopsis endosperm by DNA methylation. Science 303 (2004) 521–523.

[44] W.J. Soppe, S.E. Jacobsen, C. Alonso-Blanco, J.P. Jackson, T. Kakutani, M. Koornneef, A.J. Peeters, The late flowering phenotype of fwa mutants is caused by gain-of-function epigenetic alleles of a homeodomain gene. Mol. Cell 6 (2000) 791–802.

[45] E. McCormack, J. Braam, Calmodulins and related potential calcium sensors of Arabidopsis. New Phytol. 159 (2003) 585–598.

[46] C. Denoux, R. Galletti, N. Mammarella, S. Gopalan, D. Werck, G. De Lorenzo, S. Ferrari, F. Ausubel, J. Dewdney, Activation of defense response pathways by OGs and Flg22 elicitors in Arabidopsis seedlings. Mol. Plant 1 (2008) 423–445.