

Four divergent Arabidopsis ethylene-responsive element-binding factor domains bind to a target DNA motif with a universal CG step core recognition and different flanking bases preference

Shuo Yang1, Shichen Wang1, Xiangguo Liu1, Ying Yu1, Lin Yue3, Xiaoping Wang1 and Dongyun Hao1,2

1 Key Laboratory for Molecular Enzymology and Engineering of the Ministry of Education, Jilin University, Changchun, China

2 Biotechnology Research Centre, Jilin Academy of Agricultural Sciences (JAAS), Changchun, China

3 School of Physical Education, Northeast Normal University, Changchun, China

Introduction

The ethylene-responsive element-binding factor (ERF)

gene family of transcription factors is one of the largest transcription factor gene families in the plant

kingdom [1,2]. The ERF domain was first identified as

a conserved motif of 58–59 amino acids in four DNA-

binding proteins from tobacco and was shown to bind

specifically to a GCC box [3]. After the completion of

the sequencing of the Arabidopsis genome [4], 124

genes were predicted to encode proteins belonging to

the AtERF family [2].

Keywords

CG step; DRE motif; ERF domain;

homology; universal binding characteristic

Correspondence

D. Hao, Biotechnology Research Centre,

Jilin Academy of Agricultural Sciences

(JAAS), Changchun 130033, China

Fax: +86 431 87063080

Tel: +86 431 87063195

E-mail: [email protected]

(Received 31 August 2009, revised

29 September 2009, accepted 8 October

2009)

doi:10.1111/j.1742-4658.2009.07428.x

The Arabidopsis ethylene-responsive element-binding factor (AtERF) family of transcription factors has ~120 members, all of which possess a

highly conserved ERF domain. AtERF1, AtERF4, AtEBP and CBF1

are members from different phylogenetic subgroups within the family.

Electrophoretic mobility shift assay analyses revealed that the ERF

domains of these four proteins were capable of binding specifically to

either GCC or dehydration-responsive element (DRE) motifs. In vitro

and in vivo binding assays of the four AtERFs with the DRE motif

showed that the recognition of the CG step was indispensable in all four

of the specific binding reactions, implying that there may be a universal

binding characteristic of various ERF domains binding to a given con-

sensus (e.g. the DRE motif). In addition, the core DNA-binding motifs

preferred by the four AtERFs were identified, and all of these motifs

contained a conserved CG step core. Thus, conserved recognition of the

CG step may be the foundation of the formation of the stable complex

by the ERF domain with the DRE motif, which is probably determined

by the highly conserved residues presented in the DNA contact surface

among all AtERF family members. The different preferences of individual ERF domains for flanking bases, which appear to be attributable to subfamily- or subgroup-specific residues, may be essential for discrimination of the target binding motif from various similar sequences by divergent AtERF domains.

Abbreviations

DBD, DNA binding domain; DRE, dehydration-responsive element; EMSA, electrophoretic mobility shift assay; ERE, ethylene-responsive

element; ERF, ethylene-responsive element-binding factor.

FEBS Journal 276 (2009) 7177–7186 © 2009 The Authors Journal compilation © 2009 FEBS

The AtERF family is further divided into various

subgroups according to the homology of ERF

domains [5,6].

An ERF domain consists of a three-stranded antiparallel β-sheet and an α-helix, packed approximately parallel to the β-sheet, with the seven thoroughly conserved amino acids (Arg6, Arg8, Trp10, Glu16, Arg18, Arg26 and Trp28) in the β-sheet uniquely contacting the bases of the target DNA in the major groove (see Fig. 1A) [7]. Phylogenetic analyses of the ERF

domains of all members within the AtERF family

show that the residues Arg6, Glu16 and Trp28 are

completely conserved among all 124 members, whereas

more than 95% of members contain the Arg8, Arg18,

Arg26 residues [6].

From the results of the few AtERFs studied,

however, the conserved ERF domains do not seem to

prefer identical DNA consensus sequences. For ins-

tance, some AtERFs have been shown to bind in vitro

to the ethylene-responsive element (ERE), a GCCGCC

motif designated the GCC motif [3,8–12], and to mediate GCC motif-dependent transcription (activation or repression) in leaves of Arabidopsis [12]. This ERE was first

reported to be a binding site (referred to as the GCC

box) of a number of tobacco ERF proteins [3] and it

was later presumed to be the target site of many other

ERF proteins [2].

The ERF protein, AtEBP, was also found to protect

the GCC box in a DNase I foot-printing analysis [10].

In contrast, the dehydration-responsive element

(DRE), with the TACCGACAT motif, in the drought-

responsive gene rd29A from Arabidopsis has been

proven to be the recognition site of DRE-binding

proteins, which are transcription factors that have

authentic ERF domains [13] and that are involved in

the induction of rd29A expression by low-temperature

stress. A similar element to DRE, the C-repeat

(TGGCCGAC) has been identified in the cold-induc-

ible gene cor15a and it is reported to function in

cold-response regulation through binding by another

ERF protein, CBF1 [14].

The similarity of these ERF-binding elements and the high similarity of ERF domains among the members of the entire ERF family have led to speculation that the ERF domains from various subgroups within the AtERF family recognize a given binding site through a universal binding characteristic directed at a conserved core, whereas preferences for the divergent short flanking bases govern differential recognition. We have previously demonstrated that various ERF domains diverge in their DNA recognition modes [9], but, to date, additional supporting evidence has been lacking. Indeed, little is known about how these differences contribute to the functions of members of the AtERF family, the majority of which have not yet been studied.

In the present study, we selected four representatives

from different functional subgroups of the AtERF

family and characterized the in vivo and in vitro bind-

ing specificities of the four ERF domains for a

sequence containing the DRE motif. In addition, we

used a random sequence selection method to identify

the core recognition motifs preferred by each of the

four domains. A universal binding characteristic was

revealed, in addition to the individual features of various ERF domains involved in recognition of the DRE motif. The results have important implications for understanding the foundations of recognition of a given binding site by divergent members of the AtERF family.

Fig. 1. (A) Solution structure of AtERF1–GCC box complex (PDB code: 1GCC) [7]. The DNA-binding domain is shown in the schematic; DNA is represented by tubes. The β-sheet of the ERF domain is light blue and the seven conserved amino acid residues reported to contact DNA bases directly are red; other conserved amino acid residues that do not directly contact DNA bases are blue. (B) The DNA base sequence with position numbering along the 16 bp fragment of DREwt. The bases in the core ACCGAC are in bold and boxed in gray. (C) Sequence alignment of four ERF domains of AtERF1, AtERF4, AtEBP and CBF1. The secondary structure scheme is indicated above the sequence. The conserved amino acid residues that directly contact DNA bases and the other conserved amino acid residues that do not directly contact DNA bases are in red and blue, respectively.

Results and Discussion

The members of the ERF family in Arabidopsis can be classified into a number of different phylogenetic subgroups according to the sequence similarity of the

ERF domains [6]. We selected four AtERFs –

AtERF1, AtERF4, AtEBP and CBF1 – as representa-

tives from divergent subgroups (for details, see Figs 1C

and 6), to investigate whether the highly homologous

ERF domains of different AtERFs have universal

binding characteristics for the recognition of a given

consensus sequence (e.g. DRE).

Binding specificity of AtERFs to the GCC and DRE

motifs

Having established that CBF1 can specifically recog-

nize both GCC and DRE motifs [9], the two most

popularly reported ERF-binding sites, we continued to

explore the DNA-binding specificity of the other three

AtERFs. Table 1 shows that all four ERF domains

were capable of binding specifically to the 16 bp frag-

ment containing either the GCC or the DRE motif.

The equilibrium dissociation constants (Kd) of

AtERF1, AtERF4 and AtEBP for binding to the DRE

motif were within the level of typical monomeric

interaction, although the binding activities were in gen-

eral lower than those for binding to the GCC motif.

CBF1 appeared to bind to the DRE motif more

strongly than to the GCC motif, implying CBF1 may

prefer the DRE motif over the GCC motif. To determine whether these variations in binding affinity were caused by binding instability resulting from nonspecific interference, rather than by alteration of the binding site, we carried out a competition binding assay using the nonspecific competitor poly[dA-dT]·poly[dA-dT] in an electrophoretic mobility shift assay (EMSA). Figure 2 shows that most of the AtERFs exhibited similar stability in binding to either the GCC or the DRE motif. The most remarkable feature arising from the competition binding assay was the consistency of the binding preferences of the AtERFs with those observed in the EMSA analysis. The three AtERFs with higher sequence similarity to each other than to CBF1 (AtERF1, AtERF4 and AtEBP) shared similar binding preferences, which differed from that of CBF1.
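The motif preference discussed above can be read directly from the mean Kd values in Table 1. The following is a minimal sketch rather than part of the original analysis: the dictionary layout and function names are illustrative assumptions, and the reported standard deviations are ignored.

```python
# Mean Kd values (nM) copied from Table 1; the data layout is an assumption.
KD_NM = {
    "AtERF1": {"GCC": 0.17, "DRE": 2.02},
    "AtERF4": {"GCC": 0.38, "DRE": 31.7},
    "AtEBP": {"GCC": 0.34, "DRE": 1.25},
    "CBF1": {"GCC": 5.63, "DRE": 1.46},
}


def dre_to_gcc_ratio(kd):
    """Return Kd(DRE)/Kd(GCC); values above 1 mean tighter binding to GCC."""
    return kd["DRE"] / kd["GCC"]


def preferred_motif(kd):
    """Name the motif with the lower mean Kd (standard deviations are ignored)."""
    return min(kd, key=kd.get)


for name, kd in KD_NM.items():
    ratio = dre_to_gcc_ratio(kd)
    print(f"{name}: Kd(DRE)/Kd(GCC) = {ratio:.2f}, prefers {preferred_motif(kd)}")
```

On these mean values only CBF1 has a ratio below 1, in line with the statement that CBF1 may prefer the DRE motif; given the size of some standard deviations, the ratios should be treated as rough indications.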

Verification of the binding characteristics of the

selected AtERFs with the DRE motif

To verify the detailed binding characteristics of the

four different AtERFs to a given consensus sequence

DRE, EMSAs were carried out with the DRE motif

and its mutants possessing single-base substitutions (see Fig. 3). Each base in the DRE motif from T5 to C11 was replaced with a T, except that T5 was replaced by A, and the binding free energy changes (ΔΔG) were obtained from quantitative titration analysis. Figure 3 shows that AtERF1 and AtERF4 exhibited the highest specific interactions at C7, C8, G9 or C11, because the

Table 1. Binding activities of the selected AtERFs to GCC and DRE motif-containing sequences. Four ERF domains were tested for binding to the 16 bp DRE or GCC motif-containing sequences using quantitative EMSA, as described in Materials and methods. Kd values are represented as the mean of three replicates ± standard deviation. The Kd value for nonspecific binding was estimated to be ~1 µM or higher.

ERF fragments    GCCwt (nM)     DREwt (nM)
AtERF1-f         0.17 ± 0.04    2.02 ± 1.44
AtERF4-f         0.38 ± 0.24    31.7 ± 14.3
AtEBP-f          0.34 ± 0.17    1.25 ± 0.62
CBF1-f           5.63 ± 0.61    1.46 ± 0.99

Fig. 2. Competition binding assay of the ERF–DNA complex. The binding complexes of the ERFs and their binding DNAs were incubated together with 0, 0.001, 0.01, 0.1, 1.0 and 10 µg poly[dA-dT]·poly[dA-dT] in a 10 µL volume and analysed by EMSA, as described in the Materials and methods.


base substitution at that position caused the greatest

decline in binding activity. AtEBP required C8, G9 and C11 most strongly, with a moderate requirement for C7. As for CBF1, the prerequisite bases appeared to be C8 and G9, whereas the other bases within the binding motif were only moderately required, to varying extents. In all four reactions, bases C8 and G9 in the DRE motif were absolutely required by all AtERFs for specific binding, indicating that recognition of the CG step was conserved among the various AtERFs and may be the universal binding characteristic of different AtERFs in recognition of the DRE motif. In addition, bases C7 and C11 within the motif were required to different extents by AtERFs from the divergent phylogenetic subgroups, implying that the recognition of these bases was the individual feature of distinct AtERFs binding to the DRE motif.
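The substitution scheme used for the mutant probes (each base from position 5 to 11 replaced with T, and T5 replaced by A) can be sketched as follows. This is an illustration only: the function name is hypothetical, only the top strand of DREwt given in Materials and methods is handled, and the names that the paper assigns to individual mutants are not reproduced.

```python
# 16 bp DREwt top strand (5'->3') as given in Materials and methods.
DRE_WT = "ATACTACCGACATGAG"


def single_base_mutants(seq, first=5, last=11):
    """Map each 1-based position in [first, last] to the singly substituted sequence.

    Each base is replaced with T; a base that is already T is replaced with A.
    """
    mutants = {}
    for pos in range(first, last + 1):
        base = seq[pos - 1]
        new_base = "A" if base == "T" else "T"
        mutants[pos] = seq[:pos - 1] + new_base + seq[pos:]
    return mutants


for pos, mutant in single_base_mutants(DRE_WT).items():
    print(f"{DRE_WT[pos - 1]}{pos}: {mutant}")
```

With this numbering the CG step corresponds to C8 and G9, the two positions at which substitution abolished specific binding by all four domains.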

In vivo DNA binding specificity of AtERFs by the

reporter–effector transient assay

To confirm whether the binding specificities of AtERFs observed in vitro also govern DRE-mediated transcription in plant tissue, reporter–effector cotransformation assays were carried out. An effector plasmid possessing the coding region of the full-length AtERF1, AtEBP or CBF1 gene driven by the CaMV 35S promoter, together with a luciferase reporter gene containing four tandem copies of either the DRE motif or one of its mutants in the upstream regulatory region, was introduced into Arabidopsis leaves by particle bombardment. Figure 4 shows that these three

AtERFs were able to transactivate the transcription of

a gene carrying the wild-type DRE motif (DREwt),

which was represented by an increase in luciferase

activity of about four- to seven-fold over the control.

No luciferase activity was detected when any of the

three ERF effectors was cotransformed with a reporter

carrying DREt1, in which the C8 was replaced by

T. Although AtERF1 did not activate transcription of

the reporter gene carrying either DREt2 or DREt3,

the coexpressions of AtEBP and CBF1 activated tran-

scription of DREt3 reporter genes to varying degrees.

As AtERF4 is a repressor, an additional effector, in which the activation domain of viral protein 16 was fused to the yeast GAL4 DNA binding domain (DBD), was coexpressed with the AtERF4 effector to test the in vivo binding specificity of AtERF4. Multiple copies of the GAL4 binding sequence were inserted into the existing luciferase reporter next to the four tandem DRE motifs, and transcriptional suppression by AtERF4 was assayed. Figure 5 shows that AtERF4 suppressed viral protein 16 activation by more than 50% when cotransformed with the reporter carrying DREwt, whereas no repression was detected with a reporter having mutant DRE motifs in which C8, G9 or C11 was replaced by T.

[Fig. 3 plot: four bar-chart panels (AtERF1, AtERF4, AtEBP and CBF1); x-axis, DRE motif positions 5–11; y-axis, ΔΔG (kcal·mol⁻¹), scale 0–5.]

Fig. 3. Effect of single base substitutions on the relative binding free energy change (ΔΔG) in the binding of the four ERF domains to the DRE motif. The DNA sequence shown at the bottom is the DRE motif in which each base was substituted individually one by one as illustrated. The solid bars indicate the increase in ΔΔG caused by the base substitution at the corresponding position. Positive ΔΔG represents a decreased binding activity; a 10-fold decrease in binding activity increased ΔΔG by ~1.3 kcal·mol⁻¹.


These observations were consistent with the findings in

the in vitro single base substitution binding assays: the

substitution at C8 or G9 abolished the specific recogni-

tion of the DRE motif by all four of the AtERFs.

Random binding site selection reveals the

binding characteristics of divergent AtERFs

to the DRE motif

To clarify whether the four AtERFs from divergent phylogenetic subgroups prefer moderately divergent binding motifs, randomized oligonucleotide selection was performed. The binding motifs selected by these four ERF domains are shown in Table 2. AtERF1 preferred the hexamer GCCGCC, which is consistent with the results of previous studies [7,8]. AtERF4 tolerated a relatively relaxed G or A at position 2 of the selected site (hexamer G/aCCGCC), whereas AtEBP selected the hexamer GCCGCC. The motif selected by CBF1, AA/cCGAC, appears to agree with a previous report [14]. Although each ERF domain showed different binding preferences, all of the binding sites selected by the AtERFs from the four subgroups possessed a common CG core in the centre and a conserved C at the last position (position 7). Moderately divergent bases occurred at the other positions within the binding motifs, discriminating the members of different subgroups.

The solution structure of the complex formed by the ERF domain of AtERF1 with the GCC box (1GCC) shows that two categories of residues within the domain are considered to be important for specific DNA binding: one consists of the residues in the β-sheet directly contacting the DNA bases; and the other is made up of the numerous Ala residues in the α-helix and the hydrophobic residues with larger side chains in the β-sheet (in particular Phe13, Phe32, Val27 and Ile17), which appear to determine the geometry of the α-helix relative to the β-sheet [3–5,7–9,17]. A multiple alignment of Arabidopsis ERF domains (Fig. 6) shows that a series of residues (e.g. Gly4, Arg6, Arg8, Gly11, Glu16, Ile17, Arg18, Arg26, Trp28, Leu29, Gly30, Ala38, Ala39, Asp43 and Asn57) were almost absolutely conserved among all members of the ERF family. Most of these residues are present in the β-sheet, especially Arg6, Arg8, Glu16, Arg18, Arg26 and Trp28 (Fig. 1A), which are reported to contact DNA directly, suggesting that the conformation of part of the DNA contact surface may be conserved among various ERF domains, which would result in the conserved recognition of the CG step in the DRE motif by all four of the different AtERFs.

On the other hand, some other residues reported to determine the geometry of the α-helix relative to the β-sheet were not as well conserved, but instead were subfamily or subgroup specific, e.g. Ile17 in almost all of the ERF family (Val17 in CBF1), Val27 in the ERF subfamily (Ile27 or Leu27 in the DRE-binding protein subfamily) and Tyr42 in most of the ERF family (His42 in the CBF1 and TINY subgroup) (Fig. 6). These subfamily- or group-specific residues seem not to be involved in direct base contact, but may affect the local conformation of the interface by determining the geometry of the α-helix relative to the β-sheet. It seems that the

[Fig. 4 graphic: schematic of the reporter (4 × DRE, TATA, LUC, Nos) and effector (CaMV-35S, Ω, ERF, Nos) constructs, and bar charts of relative luciferase activity (scale 0–10) for the Control, CBF1, AtERF1 and AtEBP effectors with reporters carrying the DREwt, DREt1, DREt2 and DREt3 binding motifs.]

Fig. 4. AtERF1, AtEBP and CBF1 activate the transcription of the luciferase reporter gene driven by the DRE motif and its mutants. The luciferase reporter gene contains four copies of the cis-acting binding motif, DREwt, DREt1, DREt2 or DREt3, which are highlighted and underlined. The effector was constructed with a full length of ERF cDNA that was controlled under the CaMV 35S promoter following a translation enhancer (Ω) from tobacco mosaic virus. These effectors induce transactivation of the reporter gene. The control in the transient assay was the same as the experiments without the addition of an effector. The results are shown as relative luciferase activity per control.


flanking positions, as well as the CG step core in the

DNA motif, were required to varying extents by diver-

gent ERF domains, and may be determined by these

subfamily- or group-specific residues.

The biological function in DNA binding of individ-

ual ERF domains is apparently determined by the

primary structures of the divergent DBD and a phylo-

genetic classification of the ERF family may partly

reflect the features in DNA binding of a certain popu-

lation of ERF domains. The observations acquired in

the present study imply that the divergent ERF

domains from various groups of the family bind to a

given consensus sequence by conserved recognition of

a CG step core as the universal binding characteristic.

This may be the foundation of the formation of a sta-

ble ERF–DNA complex and the different flanking

position preferences by individual ERF domains may

be crucial for the precise regulation of their own target

genes by various ERFs.

Materials and methods

Preparation of ERF domain-containing proteins

The coding region of the ERF domain of CBF1 (Uni-

ProtKB: P93835) (amino acids 47–142), which contains 10

and 38 amino acids in the N- and C-terminal regions,

respectively, was prepared as described previously [9]. The

Fig. 5. AtERF4 suppresses the transcription of the luciferase reporter gene driven by the DRE motif and its mutants. Multiple copies of the GAL4 binding sequence were inserted into the DRE:luciferase reporter next to the 4 × DRE motif. An extra effector was constructed carrying the coding sequences of the activation domain of viral protein 16 and the yeast GAL4 DBD. The reporter and two effectors in a ratio of 1 : 1 : 1 were cotransformed into plant tissue; the remainder was the same as in Fig. 4.

Table 2. Selection of binding sites from a random oligonucleotide pool by ERFs. Selections were performed using a 60 bp oligonucleotide containing a randomized site of 10 bp. The selected sequences were aligned computationally and the appearance of a base at each position in a motif was presented as a percentage frequency of all four kinds of base. The base with a frequency higher than 50% (bold) was defined as the selected site. If the second highest frequency base showed not less than half the highest frequency (marked with an asterisk), it was defined as the second possible site and is presented in lower case.

                        Frequency (%)
Protein   Position      A       C       G       T       Deduced consensus
AtERF1    1             38.5    15.4    30.8    15.4    N
          2             0.0     11.5    88.5    0.0     G
          3             0.0     80.8    15.4    3.8     C
          4             11.5    76.9    7.8     3.8     C
          5             0.0     0.0     100.0   0.0     G
          6             15.4    69.2    3.8     11.5    C
          7             11.5    80.8    3.8     3.8     C
AtERF4    1             29.6    33.3    22.2    14.8    N
          2             25.9    11.1    51.8    11.1    G/a*
          3             3.7     77.8    14.8    3.7     C
          4             22.2    66.7    3.7     7.4     C
          5             0.0     0.0     100.0   0.0     G
          6             7.4     81.5    11.1    0.0     C
          7             7.4     74.1    18.5    0.0     C
AtEBP     1             15.4    57.7    23.1    3.8     C
          2             0.0     3.8     96.2    0.0     G
          3             3.8     92.3    0.0     3.8     C
          4             0.0     92.3    0.0     7.7     C
          5             0.0     0.0     100.0   0.0     G
          6             11.5    84.6    3.8     0.0     C
          7             0.0     100.0   0.0     0.0     C
CBF1      1             20.0    40.0    16.0    24.0    N
          2             36.0    20.0    36.0    8.0     V
          3             8.0     64.0    16.0    12.0    C
          4             8.0     60.0    20.0    12.0    C
          5             0.0     4.0     88.0    8.0     G
          6             56.0    32.0    8.0     4.0     A/c
          7             4.0     68.0    24.0    4.0     C
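The two rules stated in the legend of Table 2 (a base above 50% is the selected site; a runner-up with at least half the top frequency is added in lower case) can be written as a short function. This is a sketch under stated assumptions: the function name is illustrative, and because the legend gives no rule for positions where no base exceeds 50%, the function returns "N" for all of them and therefore does not reproduce the "V" assigned to CBF1 position 2.

```python
BASES = "ACGT"


def deduced_site(freqs):
    """Apply the Table 2 legend rule to one position.

    freqs: percentage frequencies for A, C, G and T, in that order.
    Returns e.g. "C", "G/a", or "N" when no base exceeds 50%.
    """
    ranked = sorted(zip(freqs, BASES), reverse=True)
    (top, top_base), (second, second_base) = ranked[0], ranked[1]
    if top <= 50.0:
        return "N"  # placeholder: the legend states no rule for these positions
    if 2 * second >= top:  # runner-up is not less than half the top frequency
        return f"{top_base}/{second_base.lower()}"
    return top_base


# AtERF4 rows from Table 2 (positions 1-7; columns A, C, G, T).
ATERF4 = [
    (29.6, 33.3, 22.2, 14.8),
    (25.9, 11.1, 51.8, 11.1),
    (3.7, 77.8, 14.8, 3.7),
    (22.2, 66.7, 3.7, 7.4),
    (0.0, 0.0, 100.0, 0.0),
    (7.4, 81.5, 11.1, 0.0),
    (7.4, 74.1, 18.5, 0.0),
]

print([deduced_site(row) for row in ATERF4])
```

For the AtERF4 rows this yields N, G/a, C, C, G, C, C, matching the deduced consensus column of the table.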


Fig. 6. Sequence alignment of ERF domains of members of the Arabidopsis ERF family. All ERF domain sequences were aligned and classi-

fied according to the results from the phylogenetic tree. The names of the ERF domains are represented by their gene locus numbers

except that the names of the four domains used in this study are represented by the transcription factor names. The secondary structure is indicated above the sequence. The seven conserved amino acid residues reported to contact DNA bases directly [7] are in red; other

conserved amino acid residues that do not directly contact DNA bases are in blue.


ERF domains of AtERF1 (UniProtKB: O80337), AtERF4

(UniProtKB: O80340) and AtEBP (UniProtKB: P42736)

with 10 and 8 amino acids in the terminal regions, respec-

tively, were prepared according to the previous work of Hao

et al. [8]. The PCR products were then cloned into the

pET16b plasmid (Novagen, Merck, Darmstadt, Germany)


and the corresponding proteins were expressed in

BL21(DE3) pLysS (Merck) Escherichia coli cells and puri-

fied using a His-Trap his-tagged protein purification kit

(Amersham Pharmacia Biotech, Uppsala, Sweden). The pro-

tein concentrations were determined using the bicinchoninic

acid protein assay kit (Pierce, Chester, UK) and further

confirmed using the method of Gill and von Hippel [18].

EMSA

Two 16 bp fragments, EREwt (5′-CATAAGAGCCGCCACT-3′) and DREwt (5′-ATACTACCGACATGAG-3′) (for DNA base sequence and position numbering of DREwt, see Fig. 1B), from the promoter region of the tobacco Gln2 gene [3] and the Arabidopsis rd29A gene [19], respectively, were prepared, together with their mutants, by synthesizing both strands. The EMSA, binding titration analysis and the calculation of the binding free energy change (ΔΔG) were performed as described previously [8,9].
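The exact calculation is given in the cited references [8,9]. As a rough guide, the standard thermodynamic relation between two dissociation constants and the binding free energy change can be sketched as below; the function name and the default temperature of 25 °C are assumptions and are not taken from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal·mol^-1·K^-1


def delta_delta_g(kd_mutant, kd_wild_type, temperature_k=298.15):
    """Binding free energy change (kcal/mol) from two dissociation constants.

    Uses ddG = R * T * ln(Kd_mutant / Kd_wild_type); positive values mean
    weaker binding. The default temperature (25 °C) is an assumption.
    """
    return R_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


# A 10-fold loss of binding activity (Kd increased 10-fold).
print(round(delta_delta_g(10.0, 1.0), 2))
```

Under these assumptions a 10-fold decrease in binding activity corresponds to about 1.36 kcal·mol⁻¹, close to the value of ~1.3 kcal·mol⁻¹ quoted in the legend of Fig. 3; a lower assay temperature gives a slightly smaller number.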

Binding competition assay

The binding condition and buffers used in the competition

assay were the same as used in the quantitative DNA-bind-

ing assay described above. The radioisotope-labelled DNA

probe was first mixed with the binding protein at a concen-

tration corresponding to its Kd. After allowing it to com-

plex for 5 min at room temperature, the mixture was

distributed into aliquots, to which a poly[dA-dT]·poly[dA-dT] (Amersham Pharmacia Biotech) gradient of 0.001–10 µg was added to a final volume of 10 µL of each

aliquot. After incubation for a further 10 min, the contents

were loaded on to an 8% nondenaturing PAGE and visual-

ized as for EMSA.
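The choice of a protein concentration equal to Kd can be understood through a simple 1:1 binding isotherm, under which half of the probe is in complex before competitor is added. The sketch below assumes monomeric binding with protein in large excess over the labelled probe; this model and the function name are illustrative assumptions rather than the authors' stated procedure.

```python
def fraction_bound(protein_nm, kd_nm):
    """Fraction of probe in complex for simple 1:1 binding.

    Assumes the protein is in excess over the labelled probe, so that the free
    protein concentration is approximately the total protein concentration.
    """
    return protein_nm / (protein_nm + kd_nm)


# AtERF1 with the DRE probe, using the mean Kd from Table 1 (2.02 nM).
print(fraction_bound(2.02, 2.02))
```

Starting from a half-saturated complex leaves room for the retarded band to decrease visibly as competitor is titrated in.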

Construction of the reporter and effector genes

For the reporter gene constructs, see Fig. 4. The detailed

dual-luciferase reporter transient assay was performed as

described previously [9].

Selection of the DNA-binding site

A 60 bp single-stranded DNA, RDM10, with 10 randomized nucleotides in the centre (CTGTCAGTGATGCATATGAACGAATN10AATCAACGACATTAGGATCCTTAGC), was synthesized. A 100 ng sample of RDM10 was radiolabelled during synthesis of double-stranded DNA using [α-32P]dATP with the E. coli Klenow fragment (New England Biolabs, Ipswich, MA, USA). The selections were performed by incubation with the individual ERF domains followed by EMSA. Briefly, each binding reaction was carried out in 10 µL of binding buffer [25 mM Hepes-KOH (pH 7.5), 40 mM KCl, 0.1 mM EDTA, 0.1 mg·mL⁻¹ BSA, 10% glycerol and 1 µg double-stranded poly(dI–dC)] with 25–100 ng of an individual ERF domain. The bound oligonucleotides were gel purified, extracted with phenol/chloroform and precipitated with ethanol. The purified DNAs were radiolabelled during amplification by PCR using 5′ and 3′ primers in the presence of [α-32P]dATP. This product was used for the next round of selection following the same protocol. After seven cycles of selection, the retarded DNA band of the final selection was excised, purified and then cloned into the pUC119 plasmid (New England Biolabs). Plasmid DNAs from ~40–50 independent insert-containing colonies were prepared and the insert fragments were sequenced. At least 25 of the resulting quality sequences containing the randomized 10 bp region were aligned computationally using Clustal X [20]. The frequency of each nucleotide appearing at each aligned position of the selected sequences was calculated, leading to the establishment of the selected binding site.
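The last two steps (locating the randomized region in each sequenced insert and tabulating base frequencies) can be sketched as follows. This is not the authors' procedure, which used Clustal X for the alignment: here the 10-mer is located by exact matching of short flank anchors, and `position_frequencies` assumes sites that are already aligned and of equal length. All function names and the anchor length are illustrative assumptions.

```python
from collections import Counter

LEFT_FLANK = "CTGTCAGTGATGCATATGAACGAAT"   # 25 nt upstream of N10 in RDM10
RIGHT_FLANK = "AATCAACGACATTAGGATCCTTAGC"  # 25 nt downstream of N10 in RDM10
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def randomized_region(read, anchor=8, length=10):
    """Return the 10-mer between the flanks, or None if the flanks are not found.

    Only the `anchor` bases adjacent to N10 are matched exactly, and both
    orientations of the read are tried.
    """
    left, right = LEFT_FLANK[-anchor:], RIGHT_FLANK[:anchor]
    read = read.upper()
    for candidate in (read, reverse_complement(read)):
        start = candidate.find(left)
        if start == -1:
            continue
        begin = start + len(left)
        end = begin + length
        if candidate[end:end + anchor] == right:
            return candidate[begin:end]
    return None


def position_frequencies(sites):
    """Percentage of A, C, G and T at each position of equal-length sites."""
    table = []
    for column in zip(*sites):
        counts = Counter(column)
        table.append({b: 100.0 * counts[b] / len(column) for b in "ACGT"})
    return table
```

Because the selected motif may sit at different offsets within the 10 bp window, a real analysis would still need an alignment step between extraction and counting, which is the role Clustal X plays in the protocol above.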

Phylogenetic analysis

The amino acid sequences of all AtERFs were downloaded

from the Database of Arabidopsis Transcription Factors

(DATF) (http://datf.cbi.pku.edu.cn) [21]. The sequences of

all ERF domains were extracted in bulk using an in-house Perl script. The sequence alignment was generated using Clustal X with the following settings: Gap, 10; Gap Extension, 0.2; Delay Divergent Sequence, 10%; Negative Matrix, Off; and Protein Weight Matrix, BLOSUM Series [20].

Acknowledgements

The experiments were carried out at the National Institute of Advanced Industrial Science and Technology, Japan. DH was a recipient of a fellowship from the former Agency of Industrial Science and Technology, MITI, Japan, and of an STA fellowship from the Science and Technology Agency of Japan. This study was also partially supported by the National Natural Science Foundation of China (grant no. 30470159/C01020304) and the National High-Technology Research and Development Program (‘863’ Program) of China (grant no. 2007AA10Z110).

References

1 Okamuro JK, Caster B, Villarroel R, Van Montagu M & Jofuku KD (1997) The AP2 domain of APETALA2 defines a large new family of DNA binding proteins in Arabidopsis. Proc Natl Acad Sci USA 94, 7076–7081.

2 Riechmann JL & Meyerowitz EM (1998) The AP2/EREBP family of plant transcription factors. Biol Chem 379, 633–646.

3 Ohme-Takagi M & Shinshi H (1995) Ethylene-inducible DNA binding proteins that interact with an ethylene-responsive element. Plant Cell 7, 173–182.

4 The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815.

5 Sakuma Y, Liu Q, Dubouzet JG, Abe H, Shinozaki K & Yamaguchi-Shinozaki K (2002) DNA-binding specificity of the ERF/AP2 domain of Arabidopsis DREBs, transcription factors involved in dehydration and cold-inducible gene expression. Biochem Biophys Res Commun 290, 998–1009.

6 Nakano T, Suzuki K, Fujimura T & Shinshi H (2006) Genome-wide analysis of the ERF gene family in Arabidopsis and rice. Plant Physiol 140, 411–432.

7 Allen MD, Yamasaki K, Ohme-Takagi M, Tateno M & Suzuki M (1998) A novel mode of DNA recognition by a beta-sheet revealed by the solution structure of the GCC-box binding domain in complex with DNA. EMBO J 17, 5484–5496.

8 Hao D, Ohme-Takagi M & Sarai A (1998) Unique mode of GCC box recognition by the DNA-binding domain of ethylene-responsive element-binding factor (ERF domain) in plant. J Biol Chem 273, 26857–26861.

9 Hao D, Yamasaki K, Sarai A & Ohme-Takagi M (2002) Determinants in the sequence specific binding of two plant transcription factors, CBF1 and NtERF2, to the DRE and GCC motifs. Biochemistry 41, 4202–4208.

10 Buttner M & Singh KB (1997) Arabidopsis thaliana ethylene-responsive element binding protein (AtEBP), an ethylene-inducible, GCC box DNA-binding protein interacts with an ocs element binding protein. Proc Natl Acad Sci USA 94, 5961–5966.

11 Zhou J, Tang X & Martin GB (1997) The Pto kinase conferring resistance to tomato bacterial speck disease interacts with proteins that bind a cis-element of pathogenesis-related genes. EMBO J 16, 3207–3218.

12 Fujimoto SY, Ohta M, Usui A, Shinshi H & Ohme-Takagi M (2000) Arabidopsis ethylene-responsive element binding factors act as transcriptional activators or repressors of GCC box-mediated gene expression. Plant Cell 12, 393–404.

13 Yamaguchi-Shinozaki K & Shinozaki K (1994) A novel cis-acting element in an Arabidopsis gene is involved in responsiveness to drought, low-temperature, or high-salt stress. Plant Cell 6, 251–264.

14 Baker SS, Wilhelm KS & Thomashow MF (1994) The 5′-region of Arabidopsis thaliana cor15a has cis-acting elements that confer cold-, drought- and ABA-regulated gene expression. Plant Mol Biol 24, 701–713.

15 Prabakaran P, An J, Gromiha M, Selvaraj S, Uedaira H, Kono H & Sarai A (2001) Thermodynamic database for protein-nucleic acid interactions (ProNIT). Bioinformatics 17, 1027–1034.

16 Triezenberg SJ, Kingsbury RC & McKnight SL (1988) Functional dissection of VP16, the trans-activator of herpes simplex virus immediate early gene expression. Genes Dev 2, 718–729.

17 Liu Y, Zhao TJ, Liu JM, Liu WQ, Liu Q, Yan YB & Zhou HM (2006) The conserved Ala37 in the ERF/AP2 domain is essential for binding with the DRE element and the GCC box. FEBS Lett 580, 1303–1308.

18 Gill SC & von Hippel PH (1989) Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem 182, 319–326.

19 Liu Q, Kasuga M, Sakuma Y, Abe H, Setsuko M, Yamaguchi-Shinozaki K & Shinozaki K (1998) Two transcription factors, DREB1 and DREB2, with an EREBP/AP2 DNA binding domain separate two cellular signal transduction pathways in drought- and low-temperature-responsive gene expression, respectively, in Arabidopsis. Plant Cell 10, 1391–1406.

20 Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F & Higgins DG (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.

21 Guo A, He K, Liu D, Bai S, Gu X, Wei L & Luo J (2005) DATF: a database of Arabidopsis transcription factors. Bioinformatics 21, 2568–2569.
