
DNA-binding motifs from eukaryotic transcription factors

Stephen K Burley

The Rockefeller University and Howard Hughes Medical Institute, New York, USA

Considerable progress has been made during the past year on structure determinations of eukaryotic transcription factors that function as DNA-binding proteins in concert with RNA polymerase II. New structures include two TATA box-binding proteins bound to distinct TATA elements; two b/HLH/Z factors, Max and USF, recognizing CACGTG; and four helix-turn-helix variants: the third repeat of c-Myb, the POU-specific domain of Oct-1, an atypical homeodomain from LFB1/HNF1, and the fork head domain of HNF-3γ complexed with DNA. Other novel structures include the DNA-binding domain of GATA-1 complexed with DNA, the nucleic acid-binding domain of transcription factor IIS, and the five zinc-finger GLI-DNA complex.

Current Opinion in Structural Biology 1994, 4:3-11

Introduction

In eukaryotes, RNA polymerase II (pol II) transcribes nuclear genes encoding the messenger RNAs and several small nuclear RNAs [1]. Typical class II nuclear gene promoters contain three distinct DNA targets, which are recognized by DNA-binding proteins that fix the start of transcription and regulate RNA production. The core promoter, located immediately upstream of the transcription start site and composed of the TATA and initiator elements, fixes the initiation site. Promoter proximal elements occur between 50 and 200 base pairs (bp) upstream of the cap site; proteins binding to these sequences modulate the level of transcription. Distal enhancer elements, which can be found far from the gene in either direction, constitute another group of DNA targets for transcription factors modulating pol II activity [2,3].

Distinct proteins interact with each segment of the class II nuclear gene promoter during transcription. The general initiation factors TFII-A, -B, -D, -E, -F, -G/J, -H and -I assemble on the core promoter with pol II before transcription begins. First, TFIID recognizes and tightly binds the TATA box. Thereafter, this TFIID-DNA complex directs accretion of the remaining general factors and pol II, forming a large multiprotein-DNA assembly (preinitiation complex or PIC) that initiates transcription correctly. However, pol II and the class II initiation factors cannot adjust the rate of RNA synthesis. Instead, two additional groups of transcription factors are required. These proteins recognize DNA sequences within the promoter proximal and distal enhancer segments of class II nuclear gene promoters, participating in highly specific interactions with the PIC and modulating the rate of RNA synthesis. It is remarkable that the DNA-binding subunit of TFIID, the TATA box-binding protein (TBP), is also required for transcription by RNA polymerases I and III, where it may not act as a sequence-specific DNA-binding protein [4].

Work published within the past year has significantly extended our understanding of the structure and function of the DNA-binding motifs found in eukaryotic transcription factors regulating pol II activity. New structures of protein-DNA complexes include two TBPs recognizing distinct TATA elements [5,6], and the DNA-binding domains of five transcription factors complexed with promoter proximal or distal enhancer elements (Max [7], USF [8], HNF-3γ [9], GATA-1 [10], GLI [11]). New structures of DNA-binding proteins obtained without nucleic acid include the third repeat of c-Myb [12], the POU-specific domain of Oct-1 [13,14], an atypical homeodomain from LFB1/HNF1 [15,16], and the nucleic acid-binding domain of the elongation factor TFIIS [17].

TBP: minor groove recognition and bending of the core promoter

Last year, the structure of TBP isoform 2 (TBP2) from Arabidopsis thaliana was reported at 2.6 Å resolution [18]. More recent progress on crystallographic studies of uncomplexed TBPs includes further refinement of TBP2 at 2.1 Å resolution [19], and molecular replacement structures of yeast TBP [20,21]. TBP consists of a phylogenetically conserved, 180 residue carboxy-terminal portion (>80% identity between yeast and human), containing two structural repeats. The carboxy-terminal, or core, region of TBP binds to the TATA consensus sequence (TATAa/tAa/t) with high affinity and slow off rate (Kd = 2-4 × 10⁻⁹ M, t1/2 = 2 hours) [22], recognizing minor groove determinants and promoting DNA bending. The amino-terminal portion of TBP varies in length, shows little or no conservation among different organisms, and is largely unnecessary for transcription in certain yeast strains. The three-dimensional structure of TBP resembles a saddle [18], the α-helical 'seat' of which serves as a binding surface for components of the transcription machinery [23]. The available biochemical and genetic data confirm that the concave anti-parallel β-sheet of TBP sits astride the DNA [18]. However, it was not until simultaneous publication of two different TBP-TATA element complexes that minor groove recognition and bending of the core promoter by TBP were directly visualized.

Abbreviations: b, basic region; bp, base pairs; HLH, helix-loop-helix; HMG, high mobility group; HNF1, hepatocyte nuclear factor 1; PIC, preinitiation complex; pol II, RNA polymerase II; POUhd, POU homeodomain; POUs, POU specific; TBP, TATA box-binding protein; TBP2, TBP isoform 2; USF, upstream stimulatory factor; Z, zipper.

© Current Biology Ltd ISSN 0959-440X

Structures of TBP2 complexed with the Adenovirus major late promoter TATA element (TATAAAAG) [6], and the carboxyl terminus of yeast TBP complexed with the yeast CYC1 -52 TATA element (TATATAAA) [5] were obtained at 2.25 Å and 2.5 Å, respectively (Fig. 1). Although the two co-crystal structures differ slightly in detail, they demonstrate similar modes of minor groove recognition and DNA distortion. As predicted, DNA binding is mediated by the protein's curved, eight-stranded, antiparallel β-sheet, which provides a large concave surface for minor groove and phosphate-ribose contacts with the 8 bp TATA element. The 5' end of standard B-form DNA enters the underside of the molecular saddle, where an abrupt transition to an unprecedented, partially unwound form of the right-handed double helix is induced by insertion of two phenylalanine residues into the first T:A base step. Thereafter, the widened minor groove face of the unwound, smoothly bent DNA is approximated to the underside of the molecular saddle, burying a total surface area of about 3000 Å², permitting direct interactions between protein side chains and the minor groove edges of the central 6 bp. A second large kink is induced by insertion of two phenylalanine residues in the base step between the last 2 bp of the TATA element, and there is a corresponding abrupt return to B-form DNA.
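As a small illustration, the consensus TATAa/tAa/t can be written as a regular expression and checked against the two TATA elements named above. This is only a sketch: the pattern `TATA[AT]A[AT]` is one reading of the consensus notation, and the helper name `matches_consensus` is hypothetical.

```python
import re

# One reading of the consensus TATAa/tAa/t: positions written "a/t"
# accept either A or T. This mapping is an assumption of the sketch.
TATA_CONSENSUS = re.compile(r"TATA[AT]A[AT]")

# The two TATA elements quoted in the text.
ELEMENTS = {
    "Adenovirus major late promoter": "TATAAAAG",
    "yeast CYC1 -52": "TATATAAA",
}


def matches_consensus(seq: str) -> bool:
    """Return True if seq contains the 7 bp consensus anywhere."""
    return TATA_CONSENSUS.search(seq.upper()) is not None


for name, seq in ELEMENTS.items():
    print(f"{name}: {seq} matches consensus: {matches_consensus(seq)}")
```

Both elements satisfy the pattern even though they differ at the fifth position, which is the degeneracy the a/t notation encodes.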

Despite this massive distortion, Watson-Crick base pairing is preserved throughout, and there appears to be no strain induced in the DNA, because partial unwinding has been compensated for by right-handed supercoiling of the double helix, inducing an overall bend of about 100°. Side-chain-base contacts are restricted to the minor groove, including the four phenylalanines described above, plus four hydrogen bonds and a large number of van der Waals contacts (Fig. 2). There are no water molecules mediating side-chain-base interactions and the majority of the hydrogen bond donors and acceptors on the minor groove edges of the bases remain unsatisfied. The co-crystal structures are entirely consistent with the results of previous chemical studies of specific TBP-DNA complexes [24], and readily explain the mechanisms by which a double point mutant of yeast TBP can recognize both TATA- and TGTA-containing TATA elements [25]. TBP's mode of DNA recognition also provides an explanation for exclusion of nucleosomes from TBP-bound promoters, and exclusion of TBP from preformed chromatin [26]. Finally, bending of the core promoter by TBP may play a role in transcription by bringing factors bound to promoter proximal elements closer to the PIC.

Fig. 1. MOLSCRIPT [51] cartoon of the three-dimensional structure of TBP2 from Arabidopsis thaliana complexed with the Adenovirus major late promoter TATA element. The molecular saddle (amino- and carboxy-termini labelled) is depicted as a ribbon drawing and the DNA is shown as a stick figure with the transcription start site labelled +1. When TBP recognizes the minor groove of the TATA element, the DNA is kinked and unwound to present the minor groove edges of the bases to the underside of the molecular saddle. The coding strand is denoted with solid bonds. (Figure provided by JL Kim.)

The high mobility group (HMG) box is another DNA-binding motif found in eukaryotic transcription factors that recognizes DNA via minor groove contacts. Two NMR structures of the α-helical DNA-binding domain from HMG1 were published during the review period [27,28]. An additional NMR study of another HMG protein, SRY, demonstrates that this sex-determining transcription factor interacts with a widened minor groove by inserting an isoleucine into a base step [29]. Thus, TBP and SRY may represent two structurally distinct protein scaffolds mediating similar modes of minor groove DNA recognition.


Fig. 2. Schematic drawing of contacts between yeast TBP and the minor groove of the CYC1 -52 TATA element (TATATAAA), viewed from the DNA. TBP's anti-parallel β-sheet is depicted with arrows. Residues making contact with the DNA are shown enclosed in ovals and rectangles. The DNA is drawn with open rectangles for bases, pentagons for ribose groups and circled P for phosphate groups. Reprinted with permission from [5].

Max and USF: two b/HLH/Z proteins recognizing CACGTG

Recent publication of the co-crystal structures of Max [7] and upstream stimulatory factor (USF) [8] complexed with DNA has significantly improved our understanding of the helix-loop-helix (HLH) transcription factors [30]. These structures confirm that the conserved HLH and leucine zipper (Z) segments mediate protein dimerization, and that each conserved basic region (b) recognizes a half-site within the consensus sequence CANNTG (where N represents any base). Max is a b/HLH/Z protein that heterooligomerizes with the Myc oncoproteins, enabling them to bind DNA under physiologic conditions. Recent characterizations of other b/HLH/Z proteins that interact preferentially with Max suggest that Max plays a central role in orchestrating the biological activities of some other family members. USF is a b/HLH/Z transcription factor originally purified from HeLa nuclei, where it stimulates transcription as a homodimer by binding to target sequences within the promoter proximal region.
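The half-site logic can be sketched in a few lines of code: scan a sequence for CANNTG sites, then test whether a site reads the same on both strands, which is the situation in which a homodimer sees two equivalent half-sites. The test sequence and the helper names below are hypothetical and serve only to illustrate the idea.

```python
import re

# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# CANNTG with N = any base; the lookahead lets overlapping sites be found.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_eboxes(seq: str) -> list:
    """Return (position, site) for every CANNTG match in seq."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq.upper())]


def is_palindromic(site: str) -> bool:
    """True if the site equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


# Hypothetical test sequence containing two E-box-like sites.
example = "GGCACGTGACCAGCTGTT"
for position, site in find_eboxes(example):
    print(position, site, "palindromic" if is_palindromic(site) else "asymmetric")
```

CACGTG, the site recognized by Max and USF, is its own reverse complement, so each basic region of a homodimer meets the same half-site (CAC on one strand, CAC on the other). A site such as CACCTG fits CANNTG but is not palindromic.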

The three-dimensional structures of truncated forms of Max and USF complexed with the E-box (CACGTG) were reported at 2.9 Å resolution [7,8] (Fig. 3). The b/HLH/Z form of Max, cocrystallized with a 22 bp oligonucleotide, consists of two lengthy α-helices separated by a loop. The amino-terminal α-helix (b/H1) is continuous, and includes residues from the basic and H1 regions. The second α-helix (H2/Z), also continuous, is composed of the H2 and leucine zipper regions. Max binds DNA as a homodimer, with the two monomers folded into a parallel, left-handed, four-helix bundle. Two α-helices, the two basic regions, project from the four-helix bundle towards the DNA, and enter the major groove in opposite directions. Finally, two Z portions of the second α-helical segment form a parallel, left-handed coiled-coil of right-handed α-helices. The b/HLH form of USF, cocrystallized with a 21 bp oligonucleotide, lacks the leucine zipper but is otherwise nearly identical in structure to the Max-DNA complex. In both cases, the nucleic acid is straight and in the usual B-form.

Unlike the purely coiled-coil b/Z proteins [31-34], the homodimers of Max and USF, and by analogy homodimers and heterodimers of the HLH and other b/HLH/Z proteins, have a globular domain, the four-helix bundle. Stabilization of this novel structure must be due in large part to van der Waals interactions within its conserved, hydrophobic core. HLH and b/HLH/Z proteins vary in the sequence, amino acid composition, and length of the loop connecting H1 and H2. Extant protein sequences include loops ranging from 5 (CBF1) to 20 residues (Drosophila AS), reviewed in [7]. The structures of Max and USF fix the distance between the last α-carbon of H1 and the first α-carbon of H2 at about 15 Å. Given the 3.8 Å distance between successive α-carbon atoms in an extended polypeptide chain, the structures imply that the loop must be at least four or five residues in length. A loop-deletion study of MyoD, which normally has an eight residue loop, documented loss of DNA-binding activity on shortening the loop to four amino acids [35]. Not surprisingly, therefore, the minimum possible loop in these related families of proteins is probably that observed in nature (5 residues).
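The loop-length estimate is simple arithmetic, sketched here with the two distances quoted above. The function name `min_steps` is hypothetical, and the sketch counts only fully extended α-carbon to α-carbon steps; how those steps map onto loop residues depends on how the flanking helix residues are counted, which the text does not spell out.

```python
import math

CA_CA_EXTENDED = 3.8  # Å between successive α-carbons in an extended chain (from the text)
H1_H2_GAP = 15.0      # Å between the last Cα of H1 and the first Cα of H2 (approximate, from the text)


def min_steps(gap: float, step: float = CA_CA_EXTENDED) -> int:
    """Smallest number of fully extended Cα-Cα steps able to bridge the gap."""
    return math.ceil(gap / step)


ratio = H1_H2_GAP / CA_CA_EXTENDED
print(f"gap / step = {ratio:.2f}")
print(f"minimum number of extended steps = {min_steps(H1_H2_GAP)}")
```

A real loop is unlikely to be perfectly extended, which is consistent with the text allowing four or five residues and with the shortest loop observed in nature being five.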

Max and USF bind DNA in a very similar manner, with all the conserved amino acids in the DNA-binding segment making either base or phosphate contacts [36]. Each α-helical basic region makes four side-chain-base contacts with the half-site of the recognition sequence CACGTG (Fig. 4). Both proteins also make a large number of basic region-phosphate contacts, which span the entire backbone of the recognition element. In addition, the loop regions of Max and USF interact with the phosphodiester backbone, and family members with longer loops could make base-specific protein-DNA contacts outside the E-box. Finally, there is a conserved basic residue located at the start of H2 (Arg60 in Max) that interacts with the DNA backbone.

Helix-turn-helix variants: c-Myb, POUs, LFB1/HNF1, HNF-3γ

Until this year, the homeodomain proteins were the only eukaryotic transcription factors known to contain the helix-turn-helix (HTH) motif, typical of prokaryotic repressors [37]. β-Turns found in canonical HTH proteins are four residues in length with glycine at the second position [38]. Determination of the structures of the third repeat of c-Myb [12], the POU-specific (POUs) domain of Oct-1 [13,14], an atypical homeodomain from LFB1/HNF1 [15,16], and the fork head domain of HNF-3γ complexed with DNA [9], reveals a new subfamily of eukaryotic DNA-binding proteins, known as HTH variants, for which the definition is loosened to permit loops instead of turns.

Fig. 3. MOLSCRIPT cartoons of the three-dimensional structures of the homodimers of (a) Max and (b) USF complexed with DNA. The oligonucleotides are drawn as atomic stick figures. (Figure provided by AR Ferré-D'Amaré.)

The c-myb protooncogene product (c-Myb) is a mammalian transcriptional regulator consisting of three imperfect direct repeats, the third of which is thought to be responsible for sequence-specific DNA binding. As predicted from amino acid sequence comparisons, the NMR solution structure of the third repeat of c-Myb is composed of three α-helices and resembles the HTH homeodomain [12]. However, the 'turn' in this protein is one amino acid longer than the usual four and has the sequence Leu-Pro-Gly-Arg-Thr. Another important structural difference is the organization of the hydrophobic core, which is dictated by the relative orientation of the amino-terminal helix with the two HTH helices.

Two NMR solution structures of the POUs domain of the mammalian transcription factor Oct-1 demonstrate a striking similarity to the bacteriophage λ repressor DNA-binding domain [13,14] (Fig. 5). Overlays of the POUs NMR structure on those of HTH proteins revealed a longer 'turn' (Tyr-Gly-Asn-Asp-Phe-Ser) and an extension of the carboxyl terminus of the first HTH helix. These structures also provide insight into the mechanisms by which the bipartite POU domain, consisting of both POUs and the POU homeodomain (POUhd), interacts with DNA. The secondary structure of the Oct-3 POUhd was also reported during the review period, and was found to be indistinguishable from previously characterized homeodomains [39]. Finally, NMR studies of hydration of the Drosophila Antennapedia homeodomain demonstrate the presence of rapidly-exchanging water molecules in the protein-DNA interface [40].

The most extreme HTH variant is the atypical homeodomain of hepatocyte nuclear factor 1 (HNF1), the structure of which was independently determined by both NMR and X-ray methods [15,16]. HNF1 is a liver-specific transcription factor that binds as a dimer to the palindromic consensus sequence GTTAATNATTAAC, and is implicated in regulation of liver-specific gene expression. Both structures were obtained in the absence of DNA, and resemble those of the canonical homeodomains, except for an extension of the carboxy-terminus of helix 2 and consequent lengthening of the 'turn' connecting helices 2 and 3. Unlike c-Myb, the hydrophobic cores of the HNF1 and Antennapedia homeodomains are quite similar because packing of the three α-helices is not significantly altered by insertion of 21 residues into the HTH motif.


Fig. 4. Schematic summary of DNA contacts made by Max. Amino acid side-chain to nucleotide base contacts are represented by continuous arrows, and contacts with the phosphate backbone by broken arrows. Amino acids making phosphate contacts are shaded. Contacts outside the boundaries of the figure are indicated in parentheses. The position of the dyad axis is indicated by the shaded oval. This view is drawn from the vantage point of the DNA major groove, looking out towards the surface of the basic region α-helix. Reprinted with permission from [7].

Further support for this classification of eukaryotic HTH variants comes from structural comparisons of homeodomain-DNA complexes with the newly-determined hepatocyte nuclear factor (HNF)-3γ-DNA complex [9]. HNF-3γ is a mammalian tissue-specific transcription factor, containing a highly-conserved 110 residue DNA-binding domain, shared by a large gene family of transcription factors found in various organisms [41]. In addition to tissue-specific gene expression, members of the HNF-3/fork head family have been implicated in cell differentiation and development. The DNA-binding domain of HNF-3γ is a 'winged helix' α/β HTH variant that is topologically identical to the globular portion of histone H5 [42] and the DNA-binding domain of BirA, the endogenous repressor of the biotin biosynthetic operon in Escherichia coli [43] (Fig. 6). Like the homeodomains, HNF-3γ binds as a monomer to a slightly bent, B-form DNA with the so-called recognition helix lying in the major groove, and other regions of the protein making a variety of base and DNA backbone contacts, some of which are water mediated (Fig. 7). It should be emphasized that earlier predictions of the DNA-binding modes of the eukaryotic HTH variants have been largely inferred by analogy with canonical HTH proteins. The structure of the HNF-3γ-DNA complex is the only direct proof that a eukaryotic HTH variant can bind DNA in a manner similar to the monomeric, eukaryotic HTH proteins.

Fig. 5. MOLSCRIPT cartoon of the Cα backbone trace of the POUs domain of Oct-1. The HTH segment is shown in profile on the right. (Figure provided by N Assa-Munt.)

New zinc-containing DNA-binding domains

Three new structures of zinc-containing DNA-binding domains were obtained during the review period, including uncomplexed TFIIS [17], the GATA-1-DNA complex [10], and the five zinc-finger GLI-DNA complex [11]. TFIIS is a eukaryotic transcriptional elongation factor that contains a novel Cys4 Zn²⁺-binding site and is capable of binding single-stranded DNA in a sequence-independent fashion. The NMR structure of the uncomplexed polypeptide reveals a compact 3-stranded, anti-parallel β-sheet coordinating a zinc ion [17] (Fig. 8). GATA-1 is an erythroid-specific transcription factor that regulates pol II activity within the erythroid cell lineage by binding as a monomer to the asymmetric consensus sequence t/aGATAa/g [44]. Unlike TFIIS, the GATA-1 DNA-binding domain is a small Cys4 Zn²⁺-containing α/β motif. The NMR structure of the GATA-1-DNA complex demonstrates both major and minor groove side-chain-base contacts, which explain DNA-binding specificity [10]. The newly-determined 2.6 Å resolution X-ray structure of the five-finger GLI-DNA complex [11] is reviewed in this issue by Schmiedeskamp and Klevit (pp 28-35).


Fig. 6. MOLSCRIPT cartoons showing similarities in the following DNA-binding domains: HNF-3γ, histone H5, engrailed homeodomain (ENGR), Escherichia coli catabolite gene activator protein (CAP). HNF-3γ is viewed looking towards the DNA-binding surface as if the target DNA was running vertically. The other DNA-binding domains were drawn using the same orientation. Reprinted with permission from [9].


Fig. 7. (a) MOLSCRIPT cartoon of the three-dimensional structure of the monomeric HNF-3γ-DNA complex. The 13 bp oligonucleotide is drawn as a stick figure. α-Helices and β-strands are labelled with H and S, respectively. The short loop between α-helices H2 and H3 is designated T', and the two long loops are labelled with W. (b) A similar cartoon using a view rotated approximately 90° about the vertical axis to show helix H3 binding in the major groove of DNA. Reprinted with permission from [9].

Continuing structural studies of the family of zinc-containing nuclear receptor DNA-binding motifs published during the review period include further work on the glucocorticoid receptor [45-47], NMR structures of retinoic acid receptor-β [48] and retinoid X receptor-α [49], and the co-crystal structure of the estrogen receptor-DNA complex [50].

Fig. 8. MOLSCRIPT cartoon of the NMR structure of TFIIS. The location of the zinc ion is indicated with an arrow. (Figure provided by MA Weiss.)

Conclusions and perspectives

Recent X-ray crystallographic and NMR studies of eukaryotic transcription factors traverse the entire length of a typical class II nuclear gene promoter, including the TATA box-binding protein and a large number of promoter proximal and distal enhancer binding factors. These high-resolution studies of novel DNA-binding motifs and specific protein-DNA complexes reveal a wealth of new protein folds and new modes of DNA recognition. If, however, we are to understand the precise mechanisms underlying sequence-specific recognition of DNA, these studies must be extended to include homologous proteins complexed with similar oligonucleotides, along with a systematic examination of the roles of individual amino acids and bases implicated in specific DNA binding. Moreover, other biophysical methods will be needed to understand sequence-dependent DNA deformation, the role of water, and the thermodynamics and kinetics of DNA binding. In the longer term, we will need to turn our attention to the myriad of protein-protein interactions controlling transcription initiation and elongation. Protein-protein molecular recognition events occur within TFIID and the PIC, and between the PIC and factors binding to promoter proximal and distal enhancer elements. Finally, the important roles of chromatin structure and DNA looping must also be examined in order to understand transcription as it occurs within the nucleus.

Acknowledgements

I am grateful to N Assa-Munt, KL Clark, AR Ferré-D'Amaré, JL Kim, Y Kim, DB Nikolov, and MA Weiss for providing illustrations for this review, and to RG Roeder for many useful discussions.

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

• of special interest

•• of outstanding interest

1. SENTENAC A: Eukaryotic RNA Polymerases. CRC Crit Rev Biochem 1985, 18:31-90.

2. ROEDER RG: The Complexities of Eukaryotic Transcription Initiation: Regulation of Preinitiation Complex Assembly. Trends Biochem Sci 1991, 16:402-408.

3. ZAWEL L, REINBERG D: Advances in RNA Polymerase II Transcription. Curr Opin Cell Biol 1992, 4:488-495.

4. RIGBY PWJ: Three in One and One in Three: It All Depends on TBP. Cell 1993, 72:7-10.

5. •• KIM Y, GEIGER JH, HAHN S, SIGLER PB: Crystal Structure of a Yeast TBP/TATA-box Complex. Nature 1993, 365:512-520.

The authors report the 2.5 Å structure of yeast core TBP complexed with the yeast CYC1 -52 TATA element. An unprecedented mode of minor groove recognition is demonstrated, involving DNA kinking, bending and unwinding.

6. •• KIM JL, NIKOLOV DB, BURLEY SK: Co-Crystal Structure of TBP Recognizing the Minor Groove of a TATA Element. Nature 1993, 365:520-527.

The authors report the 2.25 Å structure of Arabidopsis thaliana TBP isoform 2 complexed with the Adenovirus major late promoter TATA element. This paper was published simultaneously with reference [5], and the two structures demonstrate precisely the same mode of DNA distortion and minor groove recognition.

7. •• FERRE-D'AMARE AR, PRENDERGAST GC, ZIFF EB, BURLEY SK: Recognition by Max of its Cognate DNA Through a Dimeric b/HLH/Z Domain. Nature 1993, 363:38-45.

This study reveals the 2.9 Å structure of the homodimeric b/HLH/Z protein Max recognizing the sequence CACGTG. The HLH portion of the structure forms a left-handed, parallel, four-helix bundle, which is continuous with the left-handed coiled-coil of the leucine zipper region. The basic regions are α-helical and make side-chain contacts with each half-site of the cognate sequence.

8. • FERRE-D'AMARE AR, POGNONEC P, ROEDER RG, BURLEY SK: Structure and Function of the b/HLH/Z Domain of USF. EMBO J 1994, 13:180-189.

The authors report the 2.9 Å structure of the homodimeric b/HLH/Z protein upstream stimulatory factor bound to the sequence CACGTG. This structure was solved by molecular replacement using the Max model and is extremely similar. The paper also describes the results of a detailed physico-chemical characterization of the DNA binding and oligomerization properties of USF, which can function as a bivalent homotetramer and may mediate DNA looping.


9. •• CLARK KL, HALAY ED, LAI E, BURLEY SK: Co-Crystal Structure of the HNF-3/fork head DNA-Recognition Motif Resembles Histone H5. Nature 1993, 364:412-420.

This paper reports the 2.5 Å structure of hepatocyte nuclear factor-3γ complexed with a 13 bp oligonucleotide. The protein is a HTH variant that binds DNA in a manner similar to the homeodomains, with an α-helix lying in the major groove and side-chain-minor groove contacts. In addition, it is topologically identical to two unrelated DNA-binding proteins, avian histone H5 and E coli BirA.

10. •• OMICHINSKI JG, CLORE GM, SCHAAD O, FELSENFELD G, TRAINOR C, APPELLA E, STAHL SJ, GRONENBORN AM: NMR Structure of a Specific DNA Complex of Zn-containing DNA Binding Domain of GATA-1. Science 1993, 261:438-446.

The authors describe the results of a highly-informative NMR structure determination of the erythroid-specific transcriptional regulator GATA-1 complexed with its cognate DNA. A full discussion of this work appears in the review of Schmiedeskamp and Klevit (this issue pp 28-35).

11. •• PAVLETICH NP, PABO CO: Crystal Structure of a Five-finger GLI-DNA Complex: New Perspectives on Zinc Fingers. Science 1993, 261:1701-1707.

This paper reports the determination of the 2.6 Å structure of an extremely unusual five zinc-finger protein complexed with DNA. This work is reviewed by Schmiedeskamp and Klevit in this volume (pp 28-35).

12. • OGATA K, HOJO H, AIMOTO S, NAKAI T, NAKAMURA H, SARAI A, ISHII S, NISHIMURA Y: Solution Structure of a DNA-binding Unit of Myb: A Helix-turn-helix-related Motif with Conserved Tryptophans Forming a Hydrophobic Core. Proc Natl Acad Sci USA 1992, 89:6428-6432.

The authors describe the NMR structure of the uncomplexed c-Myb DNA-binding domain, which is a HTH variant that differs only slightly from a canonical homeodomain.

13. • ASSA-MUNT N, MORTISHIRE-SMITH RJ, AURORA R, HERR W, WRIGHT PE: The Solution Structure of the Oct-1 POU-Specific Domain Reveals a Striking Similarity to the Bacteriophage λ Repressor DNA-Binding Domain. Cell 1993, 73:193-205.

This paper reports the NMR structure of the uncomplexed POU-specific domain from Oct-1, which is a HTH variant. The structure is topologically identical to the DNA-binding domain of the bacteriophage λ repressor and a detailed evaluation of the structural similarities and differences is presented.

14. • DEKKER N, COX M, BOELENS R, VERRIJZER CP, VAN DER VLIET PC, KAPTEIN R: Solution Structure of the POU-Specific DNA-Binding Domain of OCT-1. Nature 1993, 362:852-855.

Description of the NMR structure of the uncomplexed POU-specific domain from Oct-1, which is identical to the structure described in [13]. NMR spectra of the specific protein-DNA complex are presented and a preliminary model of the POUs-DNA complex is derived. In addition, the authors present a model for the DNA-binding behavior of the bipartite POU domain, which consists of both the POUs and the POU homeodomains.

15. • LEITING B, DEFRANCESCO R, TOMEI L, CORTESE R, OTTING G, WÜTHRICH K: The Three-Dimensional NMR-Solution Structure of the Polypeptide Fragment 195-286 of the LFB1/HNF1 Transcription Factor from Rat Liver Comprises a Non-Classical Homeodomain. EMBO J 1993, 12:1797-1803.

This NMR study reveals the structure of an atypical homeodomain from the DNA-binding domain of hepatocyte nuclear factor-1, a tissue-specific transcription factor. This HTH variant resembles the homeodomain with a 21 residue insertion between the second and third helices, which leaves the hydrophobic core of the domain largely unaffected.

16. • CESKA TA, LAMERS M, MONACI P, NICOSIA A, CORTESE R, SUCK D: The X-Ray Structure of an Atypical Homeodomain Present in the Rat Liver Transcription Factor LFB1/HNF1 and Implications for DNA Binding. EMBO J 1993, 12:1805-1810.

This paper describes the 2.8 Å X-ray structure of an atypical homeodomain from the DNA-binding domain of hepatocyte nuclear factor-1. This structure is essentially identical to that reported simultaneously in reference [15]. A detailed model for DNA binding is derived from structural comparison of HNF1 with the homeodomain.

17. •• QIAN X, JEON CJ, YOON HS, AGARWAL K, WEISS MA: Structure of a New Nucleic-Acid-Binding Motif in Eukaryotic Transcriptional Elongation Factor TFIIS. Nature 1993, 365:277-279.

This paper describes a highly unusual NMR structure of a zinc-containing nucleic acid binding domain from the transcriptional elongation factor TFIIS, which is a 3-stranded anti-parallel β-sheet. Nonspecific binding of the domain to single-stranded DNA is also demonstrated.

18. NIKOLOV DB, HU S-H, LIN J, GASCH A, HOFFMANN A, HORIKOSHI M, CHUA N-H, ROEDER RG, BURLEY SK: Crystal Structure of TFIID TATA-Box Binding Protein. Nature 1992, 360:40-46.

19. NIKOLOV DB, BURLEY SK: Structure Determination and Refinement of TFIID TATA-box Binding Protein at 2.1 Å Resolution. Nature Struct Biol 1994, in press.

20. CHASMAN DI, FLAHERTY KM, SHARP PA, KORNBERG RD: Crystal Structure of Yeast TATA-Binding Protein and Model for Interaction with DNA. Proc Natl Acad Sci USA 1993, 90:8174-8178.

21. GEIGER JH, KIM Y, HAHN S, SIGLER PB: Crystal Structure of Yeast TBP at 2.1 Å Resolution. Biochemistry 1994, in press.

22. HOOPES BC, LEBLANC JF, HAWLEY DK: Kinetic Analysis of Yeast TFIID-TATA Box Complex Formation Suggests a Multi-Step Pathway. J Biol Chem 1992, 267:11539-11547.

23. CORMACK BP, STRUHL K: Regional Codon Randomization: Defining a TATA-Binding Protein Surface Required for RNA Polymerase III Transcription. Science 1993, 262:244-248.

24. LEE DK, HORIKOSHI M, ROEDER RG: Interaction of TFIID in the Minor Groove of the TATA Element. Cell 1991, 67:1241-1250.

25. STRUBIN M, STRUHL K: Yeast and Human TFIID with Altered DNA-Binding Specificity for TATA Elements. Cell 1992, 68:721-730.

26. ADAMS CC, WORKMAN JL: Nucleosome Displacement in Transcription. Cell 1993, 72:305-308.

27. WEIR HM, KRAULIS PJ, HILL CS, RAINE ARC, LAUE ED, THOMAS JO: Structure of the HMG Box Motif in the B-Domain of HMG1. EMBO J 1993, 12:1311-1319.

28. READ CM, CARY PD, CRANE-ROBINSON C, DRISCOLL PC, NORMAN DG: Solution Structure of a DNA-Binding Domain from HMG1. Nucl Acids Res 1993, 21:3427-3436.

29. KING C-Y, WEISS MA: The SRY High-Mobility-Group Box Recognizes DNA by Partial Intercalation in the Minor Groove: A Topological Mechanism of Sequence Specificity. Proc Natl Acad Sci USA 1993, 90:11990-11994.

30. BAXEVANIS AD, VINSON CR: Interactions of Coiled Coils in Transcription Factors: Where is the Specificity? Curr Opin Genet Dev 1993, 3:278-285.

31. BLATTER EE, EBRIGHT YW, EBRIGHT RH: Identification of an Amino Acid-Base Contact in the GCN4-DNA Complex by Bromouracil-Mediated Photocrosslinking. Nature 1992, 359:650-652.

32. ELLENBERGER TE, BRANDL CJ, STRUHL K, HARRISON SC: The GCN4 Basic Region Leucine Zipper Binds DNA as a Dimer of Uninterrupted α-Helices: Crystal Structure of the Protein-DNA Complex. Cell 1992, 71:1223-1237.

33. KIM J, TZAMARIAS D, ELLENBERGER T, HARRISON SC, STRUHL K: Adaptability at the Protein-DNA Interface is an Important Aspect of Sequence Recognition by bZIP Proteins. Proc Natl Acad Sci USA 1993, 90:4513-4517.

34. KONIG P, RICHMOND TJ: The X-Ray Structure of the GCN4-bZIP Bound to ATF/CREB Site DNA Shows the Complex Depends on DNA Flexibility. J Mol Biol 1993, 233:139-154.

35. • STAROVASNIK MA, BLACKWELL TK, LAUE TM, WEINTRAUB H, KLEVIT RE: Folding Topology of the Disulfide-Bonded Dimeric DNA-Binding Domain of the Myogenic Determination Factor MyoD. Biochemistry 1992, 31:9891-9903.

The authors report the results of an NMR study of the structure of an uncomplexed, disulfide-linked homodimer of the bHLH factor MyoD. The proposed structural model is very different from the structures of Max and USF and may represent another possible mode of α-helix packing for this protein.

36. FISHER DE, PARENT LA, SHARP PA: High-Affinity DNA-Binding myc Analogs: Recognition by an α-Helix. Cell 1993, 72:467-476.

37. PABO CO, SAUER RT: Transcription Factors: Structural Families and Principles of DNA Recognition. Ann Rev Biochem 1992, 61:1053-1095.

38. BRENNAN RG, MATTHEWS BW: The Helix-Turn-Helix DNA Binding Motif. J Biol Chem 1989, 264:1903-1906.

39. MORITA EH, SHIRAKAWA M, HAYASHI F, IMAGAWA M, KYOGOKU Y: Secondary Structure of the Oct-3 POU Homeodomain as Determined by 1H-15N NMR Spectroscopy. FEBS Lett 1993, 321:107-110.

40. • QIAN YQ, OTTING G, WUTHRICH K: NMR Detection of Hydration Water in the Intermolecular Interface of a Protein-DNA Complex. J Amer Chem Soc 1993, 115:1189-1190.

The authors present the results of a detailed NMR study of the hydration properties of a homeodomain-DNA complex. They provide definitive evidence for the presence of rapidly exchanging water molecules in the intermolecular interface.

41. LAI E, CLARK KL, BURLEY SK, DARNELL JE, JR: HNF-3/fork head or 'Winged Helix' Proteins: A New Family of Transcription Factors of Diverse Biologic Function. Proc Natl Acad Sci USA 1993, 90:10421-10423.

42. RAMAKRISHNAN V, FINCH JT, GRAZIANO V, LEE PL, SWEET RM: Crystal Structure of Globular Domain of Histone H5 and its Implications for Nucleosome Binding. Nature 1993, 362:219-223.

43. WILSON KP, SHEWCHUK LM, BRENNAN RG, OTSUKA AJ, MATTHEWS BW: E. coli Biotin Holoenzyme Synthetase/bio Repressor Crystal Structure Delineates the Biotin and DNA-Binding Domains. Proc Natl Acad Sci USA 1992, 89:9257-9261.

44. OMICHINSKI JG, TRAINOR C, EVANS T, GRONENBORN AM, CLORE GM, FELSENFELD G: A Small Single-'Finger' Peptide from the Erythroid Transcription Factor GATA-1 Binds Specifically to DNA as a Zinc or Iron Complex. Proc Natl Acad Sci USA 1993, 90:1676-1680.

45. BERGLUND H, KOVACS H, DAHLMAN-WRIGHT K, GUSTAFSSON J-A, HARD T: Backbone Dynamics of the Glucocorticoid Receptor DNA-Binding Domain. Biochemistry 1992, 31:12001-12011.

46. LUNDBACK T, CAIRNS C, GUSTAFSSON J-A, CARLSTEDT-DUKE J, HARD T: Thermodynamics of the Glucocorticoid Receptor-DNA Interaction: Binding of Wild-Type GR DBD to Different Response Elements. Biochemistry 1993, 32:5074-5082.

47. BAUMANN H, PAULSEN K, KOVACS H, BERGLUND H, WRIGHT APH, GUSTAFSSON J-A, HARD T: Refined Structure of the Glucocorticoid Receptor DNA-Binding Domain. Biochemistry 1993, 32:13463-13471.

48. KNEGTEL RMA, KATAHIRA M, SCHILTHUIS JG, BONVIN AMJJ, BOELENS R, EIB D, SAAG PT, KAPTEIN R: The Solution Structure of the Human Retinoic Acid Receptor-β DNA-Binding Domain. J Biomolec NMR 1993, 3:1-17.

49. LEE MS, KLIEWER SA, PROVENCAL J, WRIGHT PE, EVANS RM: Structure of the Retinoid X Receptor-α DNA-Binding Domain: A Helix Required for Homodimeric DNA Binding. Science 1993, 260:1117-1121.

50. SCHWABE JWR, CHAPMAN L, FINCH JT, RHODES D: The Crystal Structure of the Estrogen Receptor DNA-Binding Domain Bound to DNA: How Receptors Discriminate Between their Response Elements. Cell 1993, 75:567-578.

51. KRAULIS PJ: MOLSCRIPT: A Program to Produce Both Detailed and Schematic Plots of Protein Structures. J Appl Crystallogr 1991, 24:946-950.

Stephen K Burley, The Rockefeller University and Howard Hughes Medical Institute, 1230 York Avenue, New York, NY 10021, USA.