


MOLECULAR AND CELLULAR BIOLOGY, July 1993, p. 4301-4310 Vol. 13, No. 7
0270-7306/93/074301-10$02.00/0
Copyright © 1993, American Society for Microbiology

Nuclear Proteins That Bind the Pre-mRNA 3' Splice Site Sequence r(UUAG/G) and the Human Telomeric DNA Sequence d(TTAGGG)n

FUYUKI ISHIKAWA,¹† MICHAEL J. MATUNIS,² GIDEON DREYFUSS,² AND THOMAS R. CECH¹*

Howard Hughes Medical Institute, Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309-0215,¹ and Howard Hughes Medical Institute, Department of Biochemistry and Biophysics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104-6148²

Received 22 February 1993/Returned for modification 8 April 1993/Accepted 19 April 1993

HeLa cell nuclear proteins that bind to single-stranded d(TTAGGG)n, the human telomeric DNA repeat, were identified and purified by a gel retardation assay. Immunological data and peptide sequencing experiments indicated that the purified proteins were identical or closely related to the heterogeneous nuclear ribonucleoproteins (hnRNPs) A1, A2-B1, D, and E and to nucleolin. These proteins bound to RNA oligonucleotides having r(UUAGGG) repeats more tightly than to DNA of the same sequence. The binding was sequence specific, as point mutation of any of the first 4 bases [r(UUAG)] abolished it. The fraction containing D and E hnRNPs was shown to bind specifically to a synthetic oligoribonucleotide having the 3' splice site sequence of the human β-globin intervening sequence 1, which includes the sequence UUAGG. Proteins in this fraction were further identified by two-dimensional gel electrophoresis as D01, D02, D1*, and E0; intriguingly, these members of the hnRNP D and E groups are nuclear proteins that are not stably associated with hnRNP complexes. These studies establish the binding specificities of these D and E hnRNPs. Furthermore, they suggest the possibility that these hnRNPs could perhaps bind to chromosome telomeres, in addition to having a role in pre-mRNA metabolism.

A growing number of nuclear proteins that bind single-stranded RNA and DNA are being identified, purified, and cloned. Many of them contain a conserved domain, the RNA-binding domain (RBD), that is composed of about 90 amino acids and contains two highly conserved elements, RNP-1 and RNP-2 (1, 10, 18, 19). The RBD is present in many of the heterogeneous nuclear ribonucleoproteins (hnRNPs), which bind to nascent RNA polymerase II transcripts (9). Direct evidence that this domain is essential to RNA binding has been provided by deletion and site-directed mutagenesis of the U1 small nuclear RNP (snRNP) 70-kDa protein, U1 snRNP A, and U2 snRNP B" (35, 38). Recent structure determinations of the RBDs of U1 snRNP A and hnRNP C have led to three-dimensional models of RNA-RBD protein interactions (12, 30).

Single-stranded nucleic acid-protein interactions also occur at chromosome termini, or telomeres. Telomeric DNA is composed of multiple repeats of a short sequence, one strand of which is usually guanosine rich (e.g., T2AG3 in humans, T2G4 in Tetrahymena thermophila, and T4G4 in Oxytricha nova [2, 29, 48]). In several organisms in which the structure of this DNA has been examined, the G-rich strand protrudes at the 3' end as a 12- to 16-nucleotide single-stranded extension (15, 21). There is evidence for protein components of telomeres in many organisms, but in only a few cases have the proteins and their DNA recognition elements been defined (for a review, see reference 47). In Oxytricha nova, a protein heterodimer binds specifically to the single-stranded (T4G4)2 of each macronuclear DNA molecule, thereby capping the ends of the chromosomes (13, 14, 17, 34).

* Corresponding author.
† Present address: Tokyo Institute of Technology, Faculty of Bioscience and Biotechnology, Nagatsuta 4259, Yokohama 227, Japan.

In a search for protein components of human telomeres, we have purified HeLa cell nuclear proteins that bind to single-stranded (TTAGGG) repeats. Unexpectedly, the purified proteins turned out to bind more tightly to RNA than to DNA. Furthermore, these proteins bind specifically to an oligoribonucleotide having the pre-mRNA 3' splice site sequence. Sequence analysis, immunological studies, and two-dimensional gel electrophoresis indicate that the purified proteins are members or relatives of the hnRNP family. Possible roles of these proteins in binding to telomeres and in mRNA splicing are discussed.

MATERIALS AND METHODS

Oligonucleotides. All deoxyribo- and ribooligonucleotides were synthesized chemically on a 380 B synthesizer (Applied Biosystems) according to the manufacturer's instructions. Phosphoramidites were from Applied Biosystems (DNA) and American Bionetics/BioGenex (RNA). Oligonucleotides dHum-4 and rHum-4 are d(TTAGGG)4 and r(UUAGGG)4, respectively [dHum-n equals d(TTAGGG)n]. Oligoribonucleotide rN24 is random-sequence RNA synthesized with a 3'-terminal rC; each of the other 23 residues consists of an equimolar mixture of rA, rG, rC, and U. All oligonucleotides were purified by electrophoresis in 15% polyacrylamide-7 M urea gels.

Preparation of nuclear extract. About 10^10 HeLa cells were cultured in spinner flasks with Dulbecco's modified Eagle's medium supplemented with 10% calf serum. Cells were collected by centrifugation and washed twice with phosphate-buffered saline. All procedures were done at 4°C except where otherwise indicated. Cells were suspended in 160 ml of nucleus isolation buffer (20 mM KCl, 5 mM Tris-HCl [pH 7.4], 0.05 mM spermine, 0.2 mM spermidine, 0.5 mM K-EDTA, 1% thioglycerol, 0.5 mM phenylmethylsulfonyl fluoride, 0.5 mM TPCK [tosylphenylalanine chloromethyl ketone], 0.1% digitonin). Cells were homogenized by 15 strokes of a Dounce homogenizer (pestle B). Completeness of disruption was monitored by microscopy. Nuclei were pelleted by centrifugation at 900 × g for 10 min. After the supernatant was discarded, nuclei were resuspended in 200 ml of digestion buffer (80 mM NaCl, 0.2 M sucrose, 5 mM Tris-HCl [pH 7.4], 1 mM CaCl2, 0.1 mM phenylmethylsulfonyl fluoride, 0.1 mM TPCK). Nuclei were again centrifuged as described above, and pelleted nuclei were resuspended in 50 ml of digestion buffer. Nuclei were incubated in a water bath at 37°C for 1 min, and micrococcal nuclease (MNase) (50 U) was added and mixed well. Nuclei were incubated for another 5 min with occasional shaking. The reaction was stopped by adding 300 µl of 0.5 M EGTA [ethylene glycol-bis(β-aminoethyl ether)-N,N,N',N'-tetraacetic acid]. The tube containing nuclei was chilled on ice for 5 min and gently shaken at 4°C for 4 h. These extracted nuclei were centrifuged at 10,000 rpm in a Beckman JR 13 rotor for 15 min, and the supernatant was collected as nuclear extract.

Protein purification. To the nuclear extract, 3 M ammonium sulfate and 0.5 M sodium phosphate buffer (pH 7.1) were added to give final concentrations of 1.2 M and 66 mM, respectively. The nuclear extract was incubated on ice for 30 min and centrifuged in a JR 13 rotor as described above. Most of the activity binding to the dHum-4 oligonucleotide was contained in the supernatant. The collected supernatant was loaded onto a 250-ml column of phenyl-Sepharose (Pharmacia/LKB) equilibrated in 1.2 M ammonium sulfate in column buffer (50 mM sodium phosphate [pH 7.1], 1 mM 2-mercaptoethanol, 2 mM EDTA, 0.2 mM EGTA, 10% glycerol). The column was washed successively with 500 ml of 1.2 M ammonium sulfate and 300 ml of 0.9 M ammonium sulfate in column buffer. After washing, a 1,500-ml linear gradient from 0.9 to 0 M ammonium sulfate in column buffer (10 mM sodium phosphate [pH 7.1], 1 mM 2-mercaptoethanol, 2 mM EDTA, 0.2 mM EGTA, 10% glycerol) was applied. Fractions were assayed for DNA binding, and the active fractions were dialyzed against affinity column buffer (0.1 M NaCl, 10 mM sodium phosphate [pH 7.1], 2 mM EGTA, 0.2 mM EDTA, 10% glycerol, 1 mM 2-mercaptoethanol, 0.01% Nonidet P-40) at 4°C for 5 h.

The single-stranded DNA affinity column was made by coupling 140 mg of herring sperm DNA (Sigma) with 50 ml (bed volume) of CNBr-activated Sepharose 4B (Pharmacia/LKB) as the manufacturer recommended. Herring sperm DNA was extensively extracted with phenol and chloroform and precipitated with ethanol before being used. The dialysate was loaded onto the column equilibrated with affinity column buffer. The column was washed with 100 ml of the same buffer, followed by a 500-ml linear gradient of 0.1 to 1 M NaCl in affinity column buffer. Two different activities were noted: fraction A was eluted in the flowthrough volume, and fraction B was eluted at the middle of the gradient. Fraction A was dialyzed against affinity column buffer with 0.05 M NaCl and rechromatographed on single-stranded DNA; it then bound to the column and eluted at ~0.3 M NaCl. (The absence of binding to single-stranded DNA during the initial chromatography was ascribed to competition between fractions A and B binding to a column of limited capacity.)

Fractions A and B were dialyzed against affinity column buffer and loaded onto the dHum-6 oligonucleotide affinity column (~10 mg of 5'-end-biotinylated dHum-6 conjugated with avidin-high-performance liquid chromatography column [Showa-Denko, Tokyo, Japan] according to the manufacturer's recommendation; the column was then washed and installed in a fast-protein liquid chromatograph [Pharmacia/LKB]). The two activities, A and B, were separately loaded onto this column equilibrated with affinity column buffer. The column was washed with affinity column buffer, and a 30-ml linear gradient of 0.1 to 2 M NaCl in affinity column buffer was applied. Both activities eluted at ~1 M NaCl.
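The first purification step adds two stock solutions to reach two final concentrations at once, so the volumes depend on each other. The following is a minimal sketch of that arithmetic, not part of the published protocol. The 50-ml extract volume, the function name `stock_volumes`, and the assumption that the extract itself contributes neither ammonium sulfate nor phosphate are all illustrative assumptions.

```python
def stock_volumes(sample_ml, stocks):
    """Volumes (ml) of each stock needed to reach the final concentrations.

    stocks maps a solute name to (stock_molar, final_molar).
    Assumption: the starting sample contains none of these solutes.
    """
    # Each stock occupies the fraction final/stock of the final mixture.
    fractions = {name: final / stock for name, (stock, final) in stocks.items()}
    sample_fraction = 1.0 - sum(fractions.values())
    if sample_fraction <= 0:
        raise ValueError("requested final concentrations cannot be reached")
    total_ml = sample_ml / sample_fraction
    volumes = {name: frac * total_ml for name, frac in fractions.items()}
    return volumes, total_ml


# Hypothetical 50 ml of extract; stock and final molarities are from the text.
volumes, total_ml = stock_volumes(
    50.0,
    {"ammonium sulfate": (3.0, 1.2), "sodium phosphate": (0.5, 0.066)},
)
for name, ml in volumes.items():
    print(f"{name}: add {ml:.1f} ml")
print(f"final volume: {total_ml:.1f} ml")
```

Under these assumptions the extract makes up a little under half of the final mixture. Scaling the sample volume scales both additions in proportion.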

Gel retardation assay. Typically, crude nuclear extract or purified proteins and 0.05 pmol of 5'-end-labeled oligonucleotide were incubated in 10 µl of 0.05 mM NaCl-10 mM Tris-HCl (pH 7.4)-1 mM EDTA-50 µg of bovine serum albumin per ml except where otherwise noted. To reduce nonspecific binding, 1 µg of either denatured herring sperm DNA (Sigma) or Escherichia coli tRNA (Sigma) was included in reaction mixtures involving deoxyribo- or ribooligonucleotide probes, respectively. (Herring sperm DNA and E. coli tRNA were extensively extracted with phenol-chloroform and precipitated by ethanol before being used.) After the reaction mixture was incubated at 37°C for 30 min, 1 µl of 80% glycerol-0.1% bromophenol blue was added and mixed by gentle pipetting. The sample was loaded onto a 10% polyacrylamide gel (acrylamide/bisacrylamide ratio = 55:1) in 0.5× Tris-borate-EDTA, which had been precooled in the cold room and prerun at 250 V/25 cm. Gels were dried and exposed to X-ray film.

Two-dimensional gel electrophoresis and immunoblotting. Two-dimensional nonequilibrium pH gradient gel electrophoresis was performed essentially as described by O'Farrell et al. (31) with an ampholine gradient of pH 3 to 10 separated for 4 h at 400 V in the first dimension. Proteins were separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) in the second dimension. For immunoblotting, proteins were transferred to a nitrocellulose membrane (Schleicher & Schuell, Keene, N.H.) and probed sequentially with the indicated monoclonal antibodies. Bound antibodies were detected with 125I-labeled goat anti-mouse F(ab')2.

RESULTS

Identification of single-stranded d(TTAGGG)n-binding activity. Initially, HeLa cell nuclear extracts were prepared by the method of Dignam et al. (8), which involves extraction of isolated nuclei with a buffer containing 0.42 M NaCl. No activity that bound to single-stranded telomeric repeats was found. Therefore, HeLa cell nuclei were treated with MNase to allow the fragmented chromatin to diffuse from the nuclei into the extraction buffer.

Single-stranded telomeric deoxyoligonucleotides d(TTAGGG)n (abbreviated dHum-n, where n is the number of human telomeric repeats) were 5' end labeled and tested in a gel retardation assay for the formation of complexes with components of the HeLa cell nuclear extract. Two major shifted bands were formed after incubation of dHum-4 with the crude extract (Fig. 1A, lane 2; one band is indicated by brackets, and the other band is indicated by a pair of arrows to indicate that it was resolved as a doublet in other experiments). This signal disappeared in the presence of a 100-fold excess of unlabeled dHum-4 oligonucleotide (Fig. 1A, lane 3), indicating that the amount of binding was limited by the amount of activity rather than a low binding constant. The signal was not diminished by the addition of the same molar excess of an unrelated oligonucleotide of the same size (data not shown). The same experiment with dHum-2 gave rise to retarded bands of weaker intensities and faster mobilities, while dHum-6 produced bands of somewhat stronger intensities and slower mobilities than those produced with dHum-4 (data not shown).

FIG. 1. Specific binding of HeLa cell proteins to dHum-4 and rHum-4 oligonucleotides. 5'-end-labeled dHum-4 (A) or rHum-4 (B) was incubated with MNase-treated nuclear extract (lanes 2 and 3), purified fraction A (Fr. A) (lanes 5 and 6), or fraction B (Fr. B) (lanes 8 and 9). Complex formation was monitored by nondenaturing gel electrophoresis. Lanes 1, 4, and 7 are without extract or proteins. Lanes 3, 6, and 9 are with extract or proteins and 100-fold excess of unlabeled dHum-4 (A) or rHum-4 (B). The positions of the unbound free oligonucleotide and the bound complexes are marked by arrows or a bracket.

To see whether this activity bound to double-stranded DNA as efficiently as to single-stranded DNA, a palindromic oligonucleotide, 5'(CCCTAA)3GAATTCGATCAGTTCCGAATTC(TTAGGG)3 3', was made. This sequence was designed to self-anneal to form double-stranded (CGAATTC)(TTAGGG)3 · (CCCTAA)3(GAATTCG) hinged on the nontelomeric end by an 8-nucleotide spacer. The unique nontelomeric sequence was included to prevent slippage of the repeated double-stranded region, thereby ensuring that the end was double stranded. Most of this oligonucleotide was shown to be properly self-annealed under the conditions of the binding reaction by digesting the EcoRI site (GAATTC). When this oligonucleotide was tested for binding to the activity, the retarded bands were much weaker in intensity than those formed with dHum-4 or dHum-2 under the same conditions (data not shown). At present it is not clear whether this inefficient complex formation is produced by the double-stranded DNA or a denatured, single-stranded region of the oligonucleotide. However, the result indicates that the activity binds better to single-stranded telomeric repeats.

The single-stranded telomere sequence binding activity was retained after treatment with RNase A but was destroyed by treatment with proteinase K. It was stable after treatment for 5 min at 50°C but was inactivated after 5 min at 70°C. Also, it was stable after treatment at pH 5 to 10 but was inactivated by treatment at a pH below 4 (data not shown). Collectively, these results indicate proteinaceous components in the HeLa cell nuclear extract that bind specifically to oligonucleotides having single-stranded telomeric repeats.

Purification of the binding proteins. The activity that binds to dHum-4 was purified as described in Materials and Methods by using gel retardation with dHum-4 as the assay. Ammonium sulfate precipitation, phenyl-Sepharose chromatography, single-stranded DNA affinity chromatography, and dHum-6 affinity chromatography gave the final samples. Chromatography on single-stranded DNA separated the activity into two fractions, fractions A and B. Fraction A did not bind to single-stranded DNA at 0.1 M NaCl during the initial pass (see Materials and Methods), whereas fraction B did and was eluted at about 0.5 M NaCl. Each fraction was further purified by dHum-6 affinity chromatography. Both fractions were similarly eluted at about 1 M NaCl in a salt gradient. It will be shown in a subsequent section that each fraction contains a limited group of polypeptides.

Fractions A and B and MNase-treated crude nuclear extract were compared in a gel retardation assay using labeled dHum-4 (Fig. 1A). The faster-migrating doublet of retarded bands seen with crude nuclear extract (indicated by double arrows in lane 2) was observed with fraction A (lane 5). The slower, relatively broad band seen with crude nuclear extract (bracket in lane 2) was produced by fraction B (lane 8). The retarded bands seen with fractions A and B disappeared upon the addition of a 100-fold excess of unlabeled dHum-4 (lanes 6 and 9). These findings indicate that we purified the major activities that bind to dHum-4, which are present in the MNase-treated HeLa cell nuclear extract.

d(TTAGGG)n-binding proteins are abundant in the nucleus. To estimate the abundance of the proteins that bind to dHum-4, a dilution experiment was performed. One milliliter of nuclear extract was prepared from about 2.5 × 10^8 cells. One microliter of this extract and successive twofold dilutions were incubated with 0.5 pmol of labeled dHum-4. The second twofold dilution gave about half of the oligonucleotide bound (data not shown). Assuming that all of the protein was recovered in an active form during extraction, that this protein was saturated with oligonucleotide, and that binding involves one molecule of protein per molecule of oligonucleotide, there would be 0.25 pmol of the activity in 0.25 µl of nuclear extract. This leads to a rough estimate of 2 × 10^6 molecules of the activity per cell. Because of the assumptions leading to this estimate, it should be regarded as a lower limit. Thus, the activity is present at a much larger amount per cell than expected if 1 or even 100 molecules were bound to each telomere.

d(TTAGGG)n-binding proteins bind RNA more tightly than DNA. The unexpected abundance of the activity raised the possibility that it might not be an authentic telomere-binding protein but that it might instead bind to a more abundant single-stranded nucleic acid. RNA was a good candidate. Therefore, we performed a gel retardation experiment using a synthetic oligoribonucleotide, r(UUAGGG)4 (rHum-4). As with dHum-4 (Fig. 1A, lane 2), incubation of rHum-4 with crude nuclear extract produced two classes of retarded bands (Fig. 1B, lane 2). The faster band (possibly a doublet) was also produced by fraction A (lane 5), whereas the slower band was due to fraction B (lane 8). All retarded bands could be depleted by competition with unlabeled rHum-4 (lanes 3, 6, and 9). These results indicate that fractions A and B bind to rHum-4 and that they represent the major activity that binds to rHum-4 in MNase-treated crude nuclear extract.

To determine the relative affinity of binding to dHum-4 and rHum-4, a competition experiment was performed. Fraction A (Fig. 2A) or B (Fig. 2B) was incubated with end-labeled dHum-4 (lanes 1 to 8) or rHum-4 (lanes 9 to 16). Unlabeled dHum-4 or rHum-4 was added at 5-, 25-, or 125-fold molar excess (final concentration of 25, 125, or 600 nM, respectively). The retarded band produced by labeled dHum-4 and fraction A or B (Fig. 2, lanes 2) almost disappeared in the presence of a fivefold excess of unlabeled rHum-4 (lanes 4). In contrast, even in the presence of a 125-fold excess of dHum-4, noticeable retarded bands remained (lanes 7). Similar results were obtained with labeled rHum-4 (Fig. 2, lanes 9 to 16), except that competition required a higher concentration of unlabeled oligonucleotide. Even a 125-fold excess of dHum-4 depleted only a small fraction of the retarded band (lanes 15), whereas no retarded band was apparent in the lanes with a 125-fold excess of rHum-4 (lanes 16). This difference between end-labeled dHum-4 and rHum-4 with respect to the efficiency of competition indicates that both fractions A and B bind to rHum-4 more tightly than to dHum-4.

FIG. 2. Proteins bind to rHum-4 more tightly than to dHum-4. Fraction A (A) or fraction B (B) was incubated with end-labeled dHum-4 (lanes 1 to 8) or rHum-4 (lanes 9 to 16). Lanes 1 and 9 contain end-labeled oligonucleotide alone without proteins, and lanes 2 and 10 contain proteins but not any unlabeled oligonucleotide competitor. Competitors d (unlabeled dHum-4) and r (unlabeled rHum-4) were included in reaction mixtures at 5-, 25-, or 125-fold molar excess as indicated.
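The abundance estimate given earlier in this section follows from a short unit conversion, which can be checked with the sketch below. It uses only the figures quoted in the text (0.25 pmol of bound oligonucleotide in 0.25 µl of extract, with 1 ml of extract from about 2.5 × 10^8 cells) and the same one-protein-per-oligonucleotide assumption. The function name `molecules_per_cell` is illustrative.

```python
AVOGADRO = 6.022e23  # molecules per mole


def molecules_per_cell(bound_pmol, extract_ul, cells_per_ml_extract):
    """Binding-activity molecules per cell, assuming 1:1 protein:oligonucleotide."""
    molecules = bound_pmol * 1e-12 * AVOGADRO
    # Number of cells represented by the assayed volume of extract.
    cells = cells_per_ml_extract * extract_ul * 1e-3
    return molecules / cells


estimate = molecules_per_cell(
    bound_pmol=0.25, extract_ul=0.25, cells_per_ml_extract=2.5e8
)
print(f"{estimate:.1e} molecules per cell")
```

The result is on the order of 10^6 molecules per cell, in line with the rough figure of 2 × 10^6 stated above.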

Single-base changes disrupt binding. To evaluate the sequence specificity of RNA binding by fractions A and B and to gain some insight into the possible function of these proteins, a battery of mutant oligoribonucleotides was constructed (Fig. 3A). Each contained a single-nucleotide transition at the same position in all four repeats of r(UUAGGG). With fractions A (Fig. 3B) and B (Fig. 3C), only rHum-4A5 and rHum-4A6 gave rise to significant retarded bands which were blocked by excess unlabeled rHum-4 (lanes 16 to 21). The results indicate that the binding between fraction A or B and rHum-4 is very specific for the first 4 bases of r(UUAGGG), with fraction A having some additional preference for a G over an A in position 5 and fraction B having some additional preference for a G over an A in position 6 of the repeat.
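The mutant panel follows a simple rule: one transition (purine to purine, or pyrimidine to pyrimidine) at a fixed position, copied into all four repeats. The sketch below enumerates that panel as an illustration. The function name and the naming pattern (new base followed by position) are inferred from names such as rHum-4A5 in the text and should be read as assumptions.

```python
# Transitions swap purines (A <-> G) or pyrimidines (C <-> U).
TRANSITION = {"A": "G", "G": "A", "C": "U", "U": "C"}
REPEAT = "UUAGGG"


def transition_mutants(repeat=REPEAT, copies=4):
    """Map mutant name -> sequence, one transition per repeat position."""
    mutants = {}
    for pos, base in enumerate(repeat, start=1):
        new_base = TRANSITION[base]
        mutated = repeat[: pos - 1] + new_base + repeat[pos:]
        # Name: substituted base plus its 1-based position in the repeat.
        mutants[f"rHum-{copies}{new_base}{pos}"] = mutated * copies
    return mutants


mutants = transition_mutants()
for name, seq in mutants.items():
    print(f"{name:9s} r({seq[:6]}){len(seq) // 6}")
```

Only the last two entries, which leave the first four bases r(UUAG) intact, correspond to the oligoribonucleotides that still gave significant retarded bands.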

FIG. 3. Effect of point mutations on binding. (A) Mutants of rHum-4 used in this study. A single transition was introduced at the same position of every repeat. The name of each mutant gives the substituted base and its position in the repeat.

    Oligonucleotide   Sequence
    rHum-4            r(UUAGGG)4
    rHum-4C1          r(CUAGGG)4
    rHum-4C2          r(UCAGGG)4
    rHum-4G3          r(UUGGGG)4
    rHum-4A4          r(UUAAGG)4
    rHum-4A5          r(UUAGAG)4
    rHum-4A6          r(UUAGGA)4

(B and C) Gel retardation experiments using fractions A and B and mutant ribooligonucleotides. The labeled oligonucleotide probes are indicated. rECGF is a ribooligonucleotide having the same length but an unrelated sequence, r(GCAGCCUUGAUGACCUCGUGAACC). Oligonucleotides were incubated without protein (lanes 1, 4, 7, 10, 13, 16, 19, and 22), with protein (lanes 2, 5, 8, 11, 14, 17, 20, and 23), or with protein and 100-fold excess of unlabeled rHum-4 (lanes 3, 6, 9, 12, 15, 18, 21, and 24).

Fraction B binds to a synthetic 3' splice site. The sequence requirement for binding, which was revealed by the mutant oligonucleotide study, along with the abundance of these proteins suggested their participation in pre-mRNA splicing. The consensus sequence of pre-mRNA 3' splice sites is (Py)nNPyAG/G, where the slash indicates the splice site (28, 32). The sequence used in this study, r(UUAGGG), matches this consensus sequence very well. This prompted us to study the ability of fractions A and B to bind to the oligoribonucleotide rβG11 [r(UUUUCCCACCCUUAG/GCUGCUGGU)], which contains sequences surrounding the 3' splice site of the first intervening sequence of human β-globin pre-mRNA. The sequence has the same length (24 bases) as rHum-4. The binding to rHum-4 and rβG11 was compared with the binding to the unrelated rECGF by fractions A and B (Fig. 4). Both fractions showed strong retarded bands with rHum-4 (lanes 2). Fraction B also produced retarded bands with rβG11 and rECGF (Fig. 4B, lanes 10 and 18), whereas fraction A did not (Fig. 4A, lanes 10 and 18).

To begin to assess the specificity of the complexes formed between fraction A or B and rHum-4, rβG11, or rECGF, the binding was tested for competition with oligoribonucleotides of random sequence (rN24; see Materials and Methods). The retarded bands produced by rHum-4 and fraction A or B were quite resistant to competition, almost disappearing only when the probe was premixed with 3.6 µM rN24 (3^6-fold [729-fold] excess of rN24; Fig. 4, lanes 8). The retarded bands produced by fraction B and rβG11 were three- or ninefold less resistant to competitor than those involving rHum-4 (compare lanes 7 and 8 with 13 and 14 in Fig. 4B). This small difference in apparent affinity to fraction B and the difference in signal intensity might be explained by rHum-4 having four copies of the 3' splice site consensus sequence compared with the one present in rβG11. These competition studies reveal that fraction B binds to rHum-4 and to rβG11 with much higher affinity than to random RNA. In contrast, the very weak retarded band with fraction B and rECGF completely disappeared in the presence of 45 nM (ninefold excess) of rN24, indicating that this binding is nonspecific (Fig. 4B, lane 20). Another retarded band was observed to increase as a larger amount of rN24 was added in all cases (indicated by asterisks in Fig. 4). Although its structure was not investigated, it is likely to represent double-stranded RNAs formed between the labeled probes and a selected subpopulation of rN24. Because the formation of these products depletes the amount of free probe available for complex formation, this experiment provides information about relative rather than absolute binding specificities.
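The comparison of copy number across the three probes can be illustrated by counting matches to the core of the 3' splice site consensus. The sketch below is a simplification and an assumption of this example: it scans only for the core N-Py-A-G-G (the NPyAG/G part of the consensus) and ignores the upstream pyrimidine tract (Py)n. The dictionary keys are descriptive labels, not the paper's oligonucleotide names.

```python
import re

# Core of the 3' splice site consensus: any base, a pyrimidine, then AG/G.
# Assumption: the upstream (Py)n tract is deliberately not required here.
CORE_3SS = re.compile(r"[ACGU][CU]AGG")

PROBES = {
    "rHum-4": "UUAGGG" * 4,
    "beta-globin IVS1 3' splice site": "UUUUCCCACCCUUAGGCUGCUGGU",
    "rECGF": "GCAGCCUUGAUGACCUCGUGAACC",
}


def count_core_sites(seq):
    """Count non-overlapping matches to the core motif in an RNA sequence."""
    return len(CORE_3SS.findall(seq))


counts = {name: count_core_sites(seq) for name, seq in PROBES.items()}
for name, n in counts.items():
    print(f"{name}: {n} core site(s) in {len(PROBES[name])} nt")
```

With this simplified motif, the telomeric repeat probe carries four core sites, the β-globin probe one, and the unrelated rECGF none, which matches the copy-number argument made in the text.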

Electrophoretic and immunological analyses of binding proteins. Fractions A and B were subjected to electrophoresis in polyacrylamide gels and stained with Coomassie blue (Fig. 5). Fraction A gave two major bands and one faint band having apparent molecular masses of 26, 28, and 55 kDa, respectively (lane 1). They will be referred to as A26, A28, and A55. A28 was possibly a doublet. Fraction B showed three major bands and one faint band (lane 3). They had apparent molecular masses of 37, 39, 41, and 50 kDa and will be referred to as B37, B39, B41, and B50. B41 was possibly a doublet. Fraction B was further analyzed by two-dimensional gel electrophoresis (Fig. 6A). Proteins were separated by nonequilibrium pH gradient gel electrophoresis in the first dimension and by SDS-PAGE in the second dimension, and they were visualized by silver staining. hnRNPs isolated from HeLa cells by single-stranded DNA chromatography were run in parallel for comparison. The proteins in fraction B comigrate with proteins which are related to the hnRNPs D and E but which have not previously been named (33a). B37 resolves into two isoforms that comigrate with proteins that we refer to as E0, and B39 comigrates with D01. B41


[Fig. 4A and B, gel images. Probes: rHum-4, rβGI1, and rECGF (eight lanes per probe, lanes 1 to 24). The positions of the complex and the free probe are marked.]

FIG. 4. Fraction B binds to rβGI1 specifically. Fractions A (A) and B (B) were incubated with end-labeled rHum-4, rβGI1, or an unrelated ribooligonucleotide, rECGF, in a gel retardation experiment. Lanes 1, 9, and 17 have no proteins, and lanes 2, 10, and 18 have fraction A or B in the absence of competitor. The other lanes have proteins and increasing ratios of rN24 competitor to labeled oligonucleotide as follows: 3-fold (lanes 3, 11, and 19), 9-fold (lanes 4, 12, and 20), 27-fold (lanes 5, 13, and 21), 81-fold (lanes 6, 14, and 22), 243-fold (lanes 7, 15, and 23), and 729-fold (lanes 8, 16, and 24). rN24 is a mixture of random 24-base RNAs. The asterisks indicate another retarded band that appeared to increase as larger amounts of rN24 were added.

resolves into two spots, D02 and D1*. The presence of D hnRNPs in fraction B but not fraction A was confirmed by immunoblotting with monoclonal antibody 5B9, which is specific for the D hnRNPs (D01, D02, D1, D1*, and D2 [33a]) (Fig. 5, lanes 2 and 4).

A two-dimensional immunoblot of fraction B was probed sequentially with the monoclonal antibodies 5B9 and 5H3 (Fig. 6B). 5B9 reacted with proteins in fraction B corresponding to D01, D02, and D1*. 5H3 is specific for the largest of the E hnRNPs and for E0 (33a) and recognizes a protein in fraction B corresponding to E0.

To confirm that binding to labeled oligonucleotides is performed by these identified proteins and not by some unexpected, contaminating protein, a gel retardation assay of fraction B and labeled dHum-4 was blotted onto a membrane and probed with the anti-hnRNP D monoclonal antibody 5B9 (Fig. 7). The protein-oligonucleotide complex, which was detected by the radioactivity of the oligonucleotide (lane 2), had a mobility identical to that of the band containing hnRNP D, which was detected by the antibody (lane 4). Because the D hnRNPs are positively charged (pI = 7.8 to 8.4 [33]), they are unlikely to migrate to a similar position in the nondenaturing gel in the absence of dHum-4. These data therefore strongly suggest that hnRNP D actually binds to dHum-4.

Sequence analysis of binding proteins. Each protein band was purified from preparative gels and digested with trypsin, and the resulting peptides were fractionated by reverse-phase chromatography. Amino acid sequence analysis revealed that many of the proteins are identical or highly similar to known hnRNPs (Table 1).

Two peptides of 9 and 11 amino acids obtained from A26 are identical to the human hnRNP A1 RNP-1 domain sequence deduced from cDNA (4). Three peptides derived from A28, although they have some unidentified amino acids in one sequence, are identical to the human hnRNP A2-B1 amino acid sequence deduced from cDNA (3) except for one position (the position identified as valine or phenylalanine in peptide A28 no. 1 was reported as tryptophan). hnRNP B1 has an amino acid sequence identical to that of hnRNP A2 except for a 12-amino-acid insert in B1 (3). Although the reported molecular masses of hnRNP A1 and A2-B1 are 34 kDa and 36 and 38 kDa, respectively (33), both of these proteins are known to undergo proteolytic degradation during purification (46). UP1, the 22-kDa proteolytic fragment of A1, and UP1B, the 24-kDa fragment of A2-B1, have molecular masses close to those estimated for A26 and A28. We conclude that our A26 and A28 are proteolytic fragments of hnRNP A1 and A2-B1, respectively.

Two peptides derived from B37 have high homology with a previously characterized human type A-B hnRNP (20). The 14-amino-acid peptide B37 no. 1 is identical to a region in the protein reported by Khan et al. (20), except for one position (the alanine in B37 no. 1 was predicted to be serine-proline in the previously characterized protein). B37 no. 2 also shows high homology with the human type A-B hnRNP, although the aspartic acid-threonine-proline in B37 no. 2 was reported as isoleucine-lysine-methionine by Khan et al. (20). Because the B37 peptides are positioned at less-conserved regions of the RBD (1, 35), the high degree of identity (87%) is consistent with the identification of B37 as an A-B-like hnRNP. On the basis of its mobility on two-dimensional gels and on its immunological reactivity (Fig. 6B), we shall refer to B37 as E0.

Three peptide sequences were obtained from B39. The 20-amino-acid peptide B39 no. 2 is identical to a region in

[Fig. 5, gel images. Lanes 1 to 4, with molecular mass markers (kDa) at 97, 69, 46, 30, 21.5, and 14.3.]

FIG. 5. Protein composition of fractions A and B. Fractions A (lanes 1 and 2) and B (lanes 3 and 4) were analyzed by electrophoresis in 15 and 10% polyacrylamide gels, respectively. Lanes: 1 and 3, lanes stained with Coomassie blue; 2 and 4, identical set of lanes blotted onto a membrane and incubated with anti-hnRNP D monoclonal antibody. The antibody was detected by a second antibody conjugated with alkaline phosphatase. Molecular mass markers (in kilodaltons) are indicated.


[Fig. 6A and B, two-dimensional gel images. Panels: TTAGGG (fraction B) and ssDNA. Spots labeled D01, D02, D1*, and E0; molecular mass markers at 116, 96, 68, 45, and 30 kDa.]

three previously identified hnRNP-type proteins (23, 39, 44). B39 no. 1 is also identical to a region in the proteins characterized by Sharp et al. (39) and by Tay et al. (44) and is highly homologous to a region in the protein characterized by Lahiri and Thomas (23). The cDNA characterized by Lahiri and Thomas is partial, and the last two amino acids of B39 no. 1 (valine-lysine) are not contained in their sequence. Moreover, the positions identified as lysine and glutamic acid in B39 no. 1 were deduced as asparagine and glutamine by Lahiri and Thomas. B39 no. 3 is nearly identical to a region in the protein identified by Tay et al. (the arginine of B39 no. 3 is predicted to be a serine in the protein characterized by Tay et al.) and partially overlaps with the amino terminus of the protein characterized by Sharp et al. Two peptides were sequenced from B41. B41 no. 1 is identical to B39 no. 1, and B41 no. 2 is essentially the same as but shorter than B39 no. 2. In summary, B39 and B41 are highly related to each other and share sequence homology with several previously characterized hnRNP-type proteins. On the basis of their mobilities on two-dimensional gels and on their immunological reactivities (Fig. 6B), we shall refer to B39 as D01 and to B41 as D02-D1*.

Finally, one peptide derived from B50 completely matches the sequence of human nucleolin (40). This region is near the RBD, but it is not significantly homologous with other proteins. Nucleolin has an apparent molecular mass of 100 to


FIG. 6. Analysis of fraction B by two-dimensional gel electrophoresis and determining immunological reactivity. (A) Fraction B (TTAGGG) and hnRNPs purified from HeLa nucleoplasm by single-stranded DNA chromatography (33) (ssDNA) were resolved by nonequilibrium pH gradient gel electrophoresis in the first dimension and by SDS-PAGE in the second dimension. Proteins were visualized by silver staining and are labeled according to Piñol-Roma et al. (33). D01, D02, D1*, and E0 have been named according to their immunological relatedness to the D and E hnRNPs. Molecular mass markers (in kilodaltons) are indicated. (B) Fraction B was resolved by two-dimensional gel electrophoresis, transferred to nitrocellulose membrane, and sequentially probed with monoclonal antibodies specific for the D and E hnRNPs. D01, D02, and D1* are recognized by the monoclonal antibody 5B9, and E0 is recognized by the monoclonal antibody 5H3.

110 kDa on SDS-PAGE. However, nucleolin is highly susceptible to proteolysis, and a 48-kDa proteolytic fragment with nucleic acid-binding activity has been characterized previously (37). It is likely that B50 corresponds to this fragment.

DISCUSSION

In an effort to identify nuclear components that interact with the telomeres of human chromosomes, we analyzed a HeLa cell nuclear extract for binding to single-stranded oligonucleotides having d(TTAGGG) repeats. By a gel retardation assay, we purified the major activities that bound to these oligonucleotides. These proteinaceous activities bound to single-stranded but not to double-stranded repeats, and they unexpectedly bound more tightly to an RNA oligonucleotide having the same repeat sequence. Two-dimensional gel electrophoresis and amino acid sequencing revealed that the major components of these activities are identical or closely related to several hnRNPs. Because treating nuclei with RNase releases hnRNPs efficiently (36), it is expected that our nuclear extracts prepared with MNase contain hnRNPs. The observed binding specificities of these proteins for UUAGGG repeats raise the possibility that they are involved both in pre-mRNA processing and in functions associated with telomeres, although direct evidence for either function has not been obtained.

Our fraction A is composed of hnRNP A1 and A2-B1, and our fraction B is composed of D and E hnRNPs and nucleolin. Peptides sequenced from the D and E hnRNPs


[Fig. 7, gel images. Lanes 1 to 4; the positions of the complex and the free probe are marked.]

FIG. 7. dHum-4-fraction B complex reacts with anti-hnRNP D antibody. End-labeled dHum-4 was incubated without (lanes 1 and 3) or with (lanes 2 and 4) fraction B, separated by nondenaturing gel electrophoresis, and blotted onto a membrane. The membrane was exposed to X-ray film to detect radiolabeled dHum-4 (lanes 1 and 2) and then treated with anti-hnRNP D monoclonal antibody. The reaction of antibody was detected by a second antibody conjugated with alkaline phosphatase (lanes 3 and 4).

isolated in this study have significant homology to several previously characterized proteins. Peptides from B37 (E0) are related to an hnRNP type A-B protein characterized by Khan et al. (20). Although this type A-B protein was originally referred to as an hnRNP type C protein because of its comigration with hnRNP C2 on one-dimensional gels (22), it is more likely to be the largest of the E hnRNPs on the basis of both its primary structure (which is related to that of the A-B hnRNPs) and its apparent molecular mass. Peptides from B39 (D01) and B41 (D02-D1*) have significant amino acid sequence correspondence to proteins previously described as hnRNP type C proteins from humans (23) and rats (39) and to a related (if not identical) protein shown to bind in vitro to an enhancer of hepatitis B virus (44). These proteins are, however, not highly related to authentic hnRNP C1 or C2 (3, 42). On the basis of our findings, the proteins characterized by Lahiri and Thomas, by Sharp et al., and by Tay et al. are likely to be D hnRNPs or proteins highly related to the D hnRNPs. The sequence contains two RBDs, both represented in the peptides we sequenced.

The abundance of these proteins and their preference for binding RNA over DNA have led us to focus on their possible functions in pre-mRNA splicing. Because UUAGGG matches the pre-mRNA 3' splice site consensus sequence (YnNYAG/G, where the slash indicates the splice site), we tested the proteins for binding to an oligoribonucleotide that includes the 3' splice site of human β-globin intervening sequence 1. Fraction B showed specificity for this oligonucleotide, as indicated by a competition experiment, whereas fraction A did not. Immunological reactivities with anti-hnRNP D and E monoclonal antibodies confirmed the presence of these proteins in fraction B. With D hnRNPs, their presence in the oligonucleotide-protein complex was demonstrated by blotting from a gel retardation assay. (The other proteins in fraction B have not been proven to bind specifically to the oligonucleotides; it remains possible that they cochromatograph with D hnRNPs.)
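The match between the telomeric repeat and the 3' splice site consensus can be checked mechanically. The sketch below encodes YnNYAG/G as a regular expression and applies it to r(UUAGGG)4 and to transition mutants at each repeat position. It is an illustration under stated assumptions, not a reconstruction of any analysis in this study: the pyrimidine tract length n is allowed to be zero, rHum-4 is taken to be r(UUAGGG)4, and the pattern and function names are hypothetical.

```python
import re

# Y = pyrimidine (C or U), N = any base. Intron side only: (Y)n N Y A G.
# Assumption: the pyrimidine tract length n may be zero.
CORE = re.compile(r"[CU]*[ACGU][CU]AG")
# Strict form that also requires G at position +1 (first base after the splice site).
STRICT = re.compile(r"[CU]*[ACGU][CU]AG(?=G)")

TRANSITION = {"A": "G", "G": "A", "U": "C", "C": "U"}
REPEAT = "UUAGGG"


def transition_mutant(position, repeat=REPEAT, copies=4):
    """Oligo with a transition at the 1-based position of every repeat."""
    i = position - 1
    return (repeat[:i] + TRANSITION[repeat[i]] + repeat[i + 1:]) * copies


def classify(oligo):
    """Return (core match, strict match) for an RNA oligo."""
    return bool(CORE.search(oligo)), bool(STRICT.search(oligo))


results = {0: classify(REPEAT * 4)}  # key 0 is the wild-type repeat
results.update({p: classify(transition_mutant(p)) for p in range(1, 7)})

if __name__ == "__main__":
    for position, (core, strict) in results.items():
        label = "wild type" if position == 0 else f"position {position}"
        print(f"{label:<11} core={core} strict={strict}")
```

Under these assumptions, transitions at positions 3 and 4 (the AG) remove every match, whereas U-to-C transitions at positions 1 and 2 still satisfy the consensus because a pyrimidine is replaced by a pyrimidine. The consensus alone therefore does not account for the loss of binding seen with those two mutants.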

TABLE 1. Amino acid sequences of peptides derived from proteins that bind d(TTAGGG)n and r(UUAGGG)n

Protein and     Obtained sequence (a)        Match to reported           Identity (c)
peptide no.                                  sequence (reference) (b)

A26
  1             (ASGVX)FAFVTFDD              A1 (4)                      A1
  2             (SA_K)FGFVTYATVE             A1 (4)

A28
  1             (IiASG)YYEQ(VF)GK            A2-B1 (3)                   A2-B1
  2             (8S)FGFVTFDD-D--D-I          A2-B1 (3)
  3             (L7KASGV)FVGGIKED            A2-B1 (3)

B37
  1             IFVGGLNPEATEEK               Type A-B (20)               E0
  2             (EM)GEVVD-T TPDPNTGR         Type A-B (20)

B39
  1             GECFITFKEEEPVK               E2BP (44)                   D01
  2             IFVGGLSPDTPEEKIREYFG         E2BP (44)
  3             GFGFVLFKE(GI)EBVDKVMDQ       E2BP (44)

B41
  1             GECFITFKEEEPVK               E2BP (44)                   D02-D1*
  2             (IFSDG)FVGGLSPDTPEE          E2BP (44)

B50
  1             (§H)ISLYYTGEKGQNQDYR         Nucleolin (40)              Nucleolin

(a) When several amino acids were detected at one position, they are enclosed in parentheses; the double-underlined letter then represents the amino acid found in the reported sequence. Each position at which no amino acid could be determined is indicated by a dash. Positions at which obtained sequences are different from reported human sequences are underlined.

(b) The published sequence to which the sequence obtained in this study was compared. Names of proteins are according to the references cited. Comparisons to sequences from other references are given in the text.

(c) Nomenclature for E0, D01, and D02-D1* is based on two-dimensional gel electrophoresis and immunological criteria.


Mutation studies define the RNA-binding specificity of the identified proteins and support the idea that they can interact with pre-mRNA 3' splice sites. Introns in pre-mRNA invariably end in AG, and mutation of either of these bases in rHum-4 prevented binding by both fractions A and B. The prevention of binding by a transition mutation in either of the first two U's preceding the AG is not predicted by the model but can be accommodated. For example, the proteins might recognize the AG dinucleotide and a short U tract upstream at the same time, or the proteins might bind to the subset of introns that end in UUAG. Alteration of the last two G's of each repeat had more subtle effects on binding by fractions A and B. The base at position +1 (immediately following the 3' splice site) was found to be 50% G and 28% A in a recent compilation of splice site sequences (32), while position +2 showed no convincing sequence bias. Thus, r(UUAGAG)4 and r(UUAGGA)4 would be expected to be recognized by a 3' splice site binding protein, in agreement with our results.

A protein or proteins associated with snRNP particles have previously been shown to bind to the 3' splice site region of pre-mRNA (6, 11, 45). They were suggested to recognize both the polypyrimidine stretch and the AG. On the basis of their molecular masses (70 and 100 kDa [11, 45]), they appear unrelated to the hnRNP-class proteins that we have identified. The U2 snRNP auxiliary factor (U2AF) also binds to the 3' splice site region of pre-mRNA; the sequence of the RNA-binding subunit of U2AF has been determined previously (48) and is distinct from those reported here.

The participation of hnRNPs in pre-mRNA splicing has been postulated previously. Choi et al. (7) found that a monoclonal antibody against hnRNP C inhibited pre-mRNA splicing in vitro, and hnRNP A1 was recently identified as a factor that influences the selection of alternative 5' splice sites (24). Swanson and Dreyfuss (41, 42) also showed, by immunoprecipitation and an RNase protection assay, that hnRNPs A1, C, D, L, and U interact with human β-globin pre-mRNA transcribed in vitro. Binding interactions of hnRNPs A1, C, and D were mapped to the 3' end of intervening sequence 1. By cross-linking, Mayrand and Pederson (25) showed that hnRNP A1 and C bind to β-globin and adenovirus type 2 pre-mRNA, respectively, transcribed in vitro. Buvoli et al. (5) found that recombinant hnRNP A1 binds to deoxyoligonucleotides bearing 3' splice site sequences of human β-globin and adenovirus type 2 major late transcripts. The N-terminal 195-amino-acid section of hnRNP A1, which is known as UP1 (16, 46), was reported to bind to these oligonucleotides as efficiently as A1 does. These reports suggest that hnRNPs A1, C, and D might function in pre-mRNA splicing by binding to 3' splice sites, and this suggestion is compatible with our results. One difference is that we observed no binding between fraction A (containing proteolytic fragments of the hnRNPs A1 and A2-B1) and our 3' splice site ribooligonucleotide. Possible explanations for this difference are that the binding reaction mixtures of Buvoli et al. (5) used a deoxyoligonucleotide concentration about 10-fold higher than our ribooligonucleotide concentration and that they used UV cross-linking after the binding reaction.

Although our focus has been on the possible participation of these proteins in pre-mRNA splicing, the possibility that they may associate with telomeric DNA in vivo needs to be considered. In this context, the proteins purified in this study include the activity reported by McKay and Cooke (26) to bind to human telomeric DNA oligonucleotides, proteins which they subsequently identified as hnRNP A2-B1 (27). It is also interesting that the forms of the D and E proteins purified by the binding criterion are not stably associated with immunopurified hnRNP complexes (33). These proteins may therefore be only transiently associated with pre-mRNAs during the splicing process, and/or they may be components of other nuclear assemblies, such as chromosome telomeres.

ACKNOWLEDGMENTS

We are grateful to Anne Gooding and Cheryl Grosshans for synthesis of oligonucleotides and to Clive Slaughter (Howard Hughes Medical Institute, Dallas, Tex.) for peptide sequencing. We are also grateful to Serafin Piñol-Roma for making available the monoclonal antibodies 5B9 and 5H3. We thank Susumu Nishimura (Banyu Tsukuba Research Institute), Joan Steitz, and Kevin Weeks for critical comments on the manuscript.

G.D. and T.R.C. are Investigators of the Howard Hughes Medical Institute.

REFERENCES

1. Bandziulis, R. J., M. S. Swanson, and G. Dreyfuss. 1989. RNA-binding proteins as developmental regulators. Genes Dev. 3:431-437.

2. Blackburn, E. H. 1991. Structure and function of telomeres. Nature (London) 350:569-573.

3. Burd, C. G., M. S. Swanson, M. Gorlach, and G. Dreyfuss. 1989. Primary structures of the heterogenous nuclear ribonucleoprotein A2, B1, and C2 proteins: a diversity of RNA binding proteins is generated by small peptide inserts. Proc. Natl. Acad. Sci. USA 86:9788-9792.

4. Buvoli, M., G. Biamonti, P. Tsoulfas, M. T. Bassi, A. Ghetti, S. Riva, and C. Morandi. 1988. cDNA cloning of human hnRNP protein A1 reveals the existence of multiple mRNA isoforms. Nucleic Acids Res. 16:3751-3770.

5. Buvoli, M., F. Cobianchi, G. Biamonti, and S. Riva. 1990. Recombinant hnRNP protein A1 and its N-terminal domain show preferential affinity for oligonucleotides homologous to intron/exon acceptor sites. Nucleic Acids Res. 18:6595-6600.

6. Chabot, B., D. L. Black, D. M. LeMaster, and J. A. Steitz. 1985. The 3' splice site of pre-messenger RNA is recognized by a small nuclear ribonucleoprotein. Science 230:1344-1349.

7. Choi, Y. D., P. J. Grabowski, P. A. Sharp, and G. Dreyfuss. 1986. Heterogeneous nuclear ribonucleoproteins: role in RNA splicing. Science 231:1534-1539.

8. Dignam, J. D., R. M. Lebovitz, and R. G. Roeder. 1983. Accurate transcription initiated by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res. 11:1475-1489.

9. Dreyfuss, G. 1986. Structure and function of nuclear and cytoplasmic ribonucleoprotein particles. Annu. Rev. Cell Biol. 2:459-498.

10. Dreyfuss, G., M. S. Swanson, and S. Piñol-Roma. 1988. Heterogeneous nuclear ribonucleoprotein particles and the pathway of mRNA formation. Trends Biochem. Sci. 13:86-91.

11. Gerke, V., and J. A. Steitz. 1986. A protein associated with small nuclear ribonucleoprotein particles recognizes the 3' splice site of pre-messenger RNA. Cell 47:973-984.

12. Gorlach, M., M. Wittekind, R. A. Beckman, L. Mueller, and G. Dreyfuss. 1992. Interaction of the RNA-binding domain of the hnRNP C proteins with RNA. EMBO J. 11:3289-3295.

13. Gottschling, D. E., and V. A. Zakian. 1986. Telomere proteins: specific recognition and protection of the natural termini of Oxytricha macronuclear DNA. Cell 47:195-205.

14. Gray, J. T., D. W. Celander, C. M. Price, and T. R. Cech. 1991. Cloning and expression of genes for the Oxytricha telomere-binding protein: specific subunit interactions in the telomeric complex. Cell 67:807-814.

15. Henderson, E. R., and E. H. Blackburn. 1989. An overhanging 3' terminus is a conserved feature of telomeres. Mol. Cell. Biol. 9:345-348.

16. Herrick, G., and B. Alberts. 1976. Purification and physical characterization of nucleic acid helix-unwinding proteins from calf thymus. J. Biol. Chem. 251:2124-2132.

17. Hicke, B. J., D. W. Celander, G. H. MacDonald, C. M. Price, and T. R. Cech. 1990. Two versions of the gene encoding the 41 kilodalton subunit of the telomere binding protein of Oxytricha nova. Proc. Natl. Acad. Sci. USA 87:1481-1485.

18. Keene, J. D., and C. C. Query. 1991. Nuclear RNA-binding proteins. Prog. Nucleic Acid Res. Mol. Biol. 41:179-202.

19. Kenan, D. J., C. C. Query, and J. D. Keene. 1991. RNA recognition: towards identifying determinants of specificity. Trends Biochem. Sci. 16:214-220.

20. Khan, F. A., A. K. Jaiswal, and W. Szer. 1991. Cloning and sequence analysis of a human type A/B hnRNP protein. FEBS Lett. 290:159-161.

21. Klobutcher, L. A., M. T. Swanton, P. Donini, and D. M. Prescott. 1981. All gene-sized DNA molecules in four species of hypotrichs have the same terminal sequence and an unusual 3' terminus. Proc. Natl. Acad. Sci. USA 78:3015-3019.

22. Kumar, A., H. Sierakowska, and W. Szer. 1987. Purification and RNA binding properties of a C-type hnRNP protein from HeLa cells. J. Biol. Chem. 262:17126-17137.

23. Lahiri, D. K., and J. O. Thomas. 1986. A cDNA clone of the hnRNP C protein and its homology with the single-stranded DNA binding protein UP2. Nucleic Acids Res. 14:4077-4094.

24. Mayeda, A., and A. R. Krainer. 1992. Regulation of alternative pre-mRNA splicing by hnRNP A1 and splicing factor SF2. Cell 68:365-375.

25. Mayrand, S. H., and T. Pederson. 1990. Crosslinking of hnRNP proteins to pre-mRNA requires U1 and U2 snRNPs. Nucleic Acids Res. 18:3307-3318.

26. McKay, S. J., and H. Cooke. 1992. A protein which specifically binds to single stranded TTAGGGn repeats. Nucleic Acids Res. 20:1387-1391.

27. McKay, S. J., and H. Cooke. 1992. hnRNP A2/B1 binds specifically to single stranded vertebrate telomeric repeat TTAGGGn. Nucleic Acids Res. 20:6461-6464.

28. Mount, S. M. 1982. A catalogue of splice junction sequence. Nucleic Acids Res. 10:459-472.

29. Moyzis, R. K., J. M. Buckingham, L. S. Cram, M. Dani, L. L. Deaven, M. D. Jones, J. Meyne, R. L. Ratliff, and J.-R. Wu. 1988. A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proc. Natl. Acad. Sci. USA 85:6622-6626.

30. Nagai, K., C. Oubridge, T. H. Jessen, J. Li, and P. R. Evans. 1990. Crystal structure of the RNA-binding domain of the U1 small nuclear ribonucleoprotein A. Nature (London) 348:515-520.

31. O'Farrell, P. Z., H. M. Goodman, and P. H. O'Farrell. 1977. High resolution two-dimensional electrophoresis of basic as well as acidic proteins. Cell 12:1133-1142.

32. Oshima, Y., and Y. Gotoh. 1987. Signals for the selection of a splice site in pre-mRNA. J. Mol. Biol. 195:247-259.

33. Piñol-Roma, S., Y. D. Choi, M. J. Matunis, and G. Dreyfuss. 1988. Immunopurification of heterogeneous nuclear ribonucleoprotein particles reveals an assortment of RNA-binding proteins. Genes Dev. 2:215-227.

33a. Piñol-Roma, S., and G. Dreyfuss. Unpublished data.

34. Price, C. M., and T. R. Cech. 1987. Telomeric DNA-protein interactions of Oxytricha macronuclear DNA. Genes Dev. 1:783-793.

35. Query, C. C., R. C. Bentley, and J. D. Keene. 1989. A common RNA recognition motif identified within a defined U1 RNA binding domain of the 70K U1 snRNP protein. Cell 57:89-101.

36. Samarina, O. P., E. M. Lukanidin, J. Molnar, and G. P. Georgiev. 1968. Structural organization of nuclear complexes containing DNA-like RNA. J. Mol. Biol. 33:251-263.

37. Sapp, M., A. Richter, K. Weisshart, M. Caizergues-Ferrer, F. Amalric, M. O. Wallace, M. N. Kirstein, and M. O. J. Olson. 1989. Characterization of a 48 kDa nucleic acid binding fragment of nucleolin. Eur. J. Biochem. 179:541.

38. Scherly, D., W. Boelens, N. A. Dathan, W. J. van Venrooij, and I. W. Mattaj. 1990. Major determinants of the specificity of interaction between small nuclear ribonucleoproteins U1A and U2B" and their cognate RNAs. Nature (London) 345:502-506.

39. Sharp, Z. D., K. P. Smith, Z. Cao, and S. Helsel. 1990. Cloning of the nucleic acid-binding domain of the rat hnRNP C-type protein. Biochim. Biophys. Acta 1048:306-309.

40. Srivastava, M., P. J. Fleming, H. B. Pollard, and A. L. Burns. 1989. Cloning and sequencing of the human nucleolin cDNA. FEBS Lett. 250:99-105.

41. Swanson, M. S., and G. Dreyfuss. 1988. Classification and purification of proteins of heterogeneous nuclear ribonucleoprotein particles by RNA-binding specificities. Mol. Cell. Biol. 8:2237-2241.

42. Swanson, M. S., and G. Dreyfuss. 1988. RNA binding specificity of hnRNP proteins: a subset bind to the 3' end of introns. EMBO J. 7:3519-3529.

43. Swanson, M. S., T. Y. Nakagawa, K. LeVan, and G. Dreyfuss. 1987. Primary structure of human nuclear ribonucleoprotein particle C proteins: conservation of sequence and domain structures in heterogeneous nuclear RNA, mRNA, and pre-rRNA-binding proteins. Mol. Cell. Biol. 7:1731-1739.

44. Tay, N., S. Chan, and E. Ren. 1992. Identification and cloning of a novel heterogeneous nuclear ribonucleoprotein C-like protein that functions as a transcriptional activator of the hepatitis B virus enhancer II. J. Virol. 66:6841-6848.

45. Tazi, J., C. Alibert, J. Temsamani, I. Reveillaud, G. Cathala, et al. 1986. A protein that specifically recognizes the 3' splice site of mammalian pre-mRNA introns is associated with a small nuclear ribonucleoprotein. Cell 47:755-766.

46. Williams, K. R., K. L. Stone, M. B. LoPresti, B. M. Merrill, and S. R. Planck. 1985. Amino acid sequence of the UP1 calf thymus helix-destabilizing protein and its homology to an analogous protein from mouse myeloma. Proc. Natl. Acad. Sci. USA 82:5666-5670.

47. Zakian, V. A. 1989. Structure and function of telomeres. Annu. Rev. Genet. 23:579-604.

48. Zamore, P. D., J. G. Patton, and M. R. Green. 1992. Cloning and domain structure of the mammalian splicing factor U2AF. Nature (London) 355:609-614.
