
THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1989 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 264, No. 32, Issue of November 15, pp. 19066-19075, 1989 Printed in U.S.A.

The Amino Acid Sequence of the Sex Steroid-binding Protein of Rabbit Serum*

(Received for publication, June 9, 1989)

Patrick R. Griffin‡§, Santosh Kumar¶, Jeffrey Shabanowitz‡, Harry Charbonneau¶, Pearl C. Namkung¶, Kenneth A. Walsh¶, Donald F. Hunt‡, and Philip H. Petra¶‖**

From the ‡Department of Chemistry, University of Virginia, Charlottesville, Virginia 22901 and the Departments of ¶Biochemistry and ‖Obstetrics and Gynecology, University of Washington, Seattle, Washington 98195

The amino acid sequence of the sex steroid-binding protein (SBP or SHBG) of rabbit serum, specific for binding testosterone and 5α-dihydrotestosterone, was determined using a complementary combination of mass spectrometric and Edman degradation techniques. The monomeric unit of the homodimeric protein is a single chain glycopeptide of 367 amino acid residues, with N-linked oligosaccharide side chains at Asn-345 and Asn-361 and disulfide bonds connecting Cys-158 to Cys-182 and Cys-327 to Cys-355. The polypeptide molecular weight of the monomer calculated from the sequence is 39,769. The molecular weight of the homodimer including 9% carbohydrate is 87,404. The sequence contains a relatively hydrophobic segment between Trp-241 and Leu-282, which includes many leucine residues in an alternating pattern. An amino acid sequence repeat is also located within that segment. Both of these patterns are present in human SBP and in the androgen-binding protein of rat epididymis.

The sequence data indicate that the previously reported microheterogeneity of rabbit SBP in sodium dodecyl sulfate-polyacrylamide gel electrophoresis reflects variants generated by differential glycosylation of the monomer rather than different gene products. Seventy-nine percent of the amino acids of rabbit SBP are identical to those of human SBP; rabbit SBP thus joins human SBP and rat androgen-binding protein in one gene family that is distinct from the steroid hormone receptor superfamily. It appears that the problem of binding sex steroid hormones has been solved independently in two different gene families that contain completely different steroid-binding domains. Since the nonhomologous steroid-binding domains of both families of proteins recognize essentially the same steroid structure, it will be interesting to determine the structural basis of the two different protein designs that lead to similar steroid-binding specificity.
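The two molecular weights quoted in the abstract are mutually consistent if the 9% carbohydrate is read as a weight fraction of the whole glycoprotein. The following is a minimal arithmetic sketch under that assumption; the function name is illustrative and does not come from the paper.

```python
def glycoprotein_mass(monomer_polypeptide_mass, carbohydrate_fraction, subunits=2):
    """Mass of an oligomeric glycoprotein from its polypeptide mass.

    Assumes carbohydrate_fraction is the weight fraction of the whole
    glycoprotein, so the polypeptide accounts for (1 - fraction) of the total.
    """
    polypeptide_mass = subunits * monomer_polypeptide_mass
    return polypeptide_mass / (1.0 - carbohydrate_fraction)


# Values from the abstract: monomer polypeptide 39,769; 9% carbohydrate.
print(round(glycoprotein_mass(39_769, 0.09)))
```

With the abstract's values this reproduces the stated homodimer molecular weight of 87,404.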

* This research was supported by National Institutes of Health Grants HD13956 (to P. H. P.), GM37537 (to D. F. H.), GM15731 (to K. A. W.), and by instrument development funds awarded (to D. F. H.) from the Monsanto Co., CIT (BIO-87006), and the National Science Foundation (CHE-8618780). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§ Present address: Genentech, Inc., San Francisco, CA.

** To whom correspondence and reprint requests should be addressed: Dept. of Obstetrics and Gynecology, RH-20, University of Washington, Seattle, WA 98195. Tel.: 206-543-5714.

Blood plasma of most species tested, including humans, contains a sex steroid-binding protein (SBP,¹ sometimes also called sex steroid hormone-binding protein and abbreviated SHBG) specific for 5α-dihydrotestosterone, testosterone, and 17β-estradiol (1, 2). In some species, including primates, SBP binds all three steroids, whereas in other species, including rabbits, the protein binds only the two androgens with high affinity. One biochemical role of SBP is to regulate the plasma metabolic clearance rate of these hormones by controlling their plasma concentrations (3, 4). Although mechanisms involving steroid receptors in the specific transport of steroid hormones into tissues have been suggested (5), a current hypothesis is that the fraction of hormone unbound to SBP in plasma diffuses nonspecifically into tissues and represents the physiologically active form of the hormone. In this context, SBP is thought to control diffusion of unbound hormone into tissues indirectly. More recently, however, it has been proposed that the SBP-steroid complex may play a direct role either as a specific carrier of these hormones into target cells (6, 7) or by specifically binding to the plasma membrane (8, 9), thereby facilitating steroid diffusion into target tissues. Cells that respond to sex steroid hormones would then be selected on the basis of their ability to interact with the SBP complex. Involvement of an SBP membrane receptor has been proposed (6-16). Nevertheless, the view that steroid hormones diffuse nonspecifically into cells remains generally accepted by most endocrinologists (see Ref. 1 for discussion). This mechanism, however, does not explain how unassisted diffusion can deliver steroid hormones specifically and rapidly to nuclei of target tissue cells, particularly when their plasma concentrations are of the same magnitude as their Kd values for steroid binding to nuclear receptors (~10⁻⁹ M). The dramatic drop in hormonal concentration (10⁻¹⁰ to 10⁻¹¹ M) resulting from nonspecific diffusion of sex steroids from plasma into all tissues would theoretically prevent significant saturation of steroid receptors within target cells.

In order to explore the biological role of SBP, the human and rabbit proteins have been purified by affinity chromatography and characterized (17-29). Results from our studies, recently summarized (30, 31), indicate that both native proteins are homodimeric glycoproteins of subunit molecular weight 46,700 for rabbit SBP (SDS-PAGE) and 43,000 for human SBP (sequence data). In the case of the human protein, the subunits were shown to be identical (26). Each native dimer binds one molecule of steroid (25, 26, 32). The amino acid sequence of human SBP has been determined (33), and cDNAs have been characterized (34-36).

In contrast to human SBP, rabbit SBP does not bind 17β-estradiol significantly. The Kd for 17β-estradiol binding to rabbit SBP is 18 times larger than for human SBP; human SBP binds 5α-dihydrotestosterone 10 times more tightly than 17β-estradiol, whereas rabbit SBP binds 5α-dihydrotestosterone 100 times more tightly than 17β-estradiol (18). The data therefore indicate that rabbit SBP is essentially an androgen-binding protein. Thus, comparison of the molecular characteristics of human and rabbit SBPs should lead to identification of the specific structural elements in the steroid-binding site which differentiate estrogen binding from androgen binding. The present report deals with the first step, which is the determination of the amino acid sequence of rabbit SBP and the comparison with that of the human protein.

¹ The abbreviations used are: SBP, sex steroid-binding protein; rSBP, rabbit SBP; hSBP, human SBP; ABP, androgen-binding protein; SDS, sodium dodecyl sulfate; PAGE, polyacrylamide gel electrophoresis; HPLC, high performance liquid chromatography.

EXPERIMENTAL PROCEDURES

Materials-Dithiothreitol (98%), 4-vinylpyridine, iodoacetic acid (99%), cyanogen bromide, trifluoroacetic acid, monothioglycerol (98%), trypsin (L-1-tosylamido-2-phenylethyl chloromethyl ketone-treated), and α-chymotrypsin (1-chloro-3-tosylamido-7-amino-2-heptanone-treated) were purchased from Sigma. HPLC grade acetonitrile was purchased from Burdick and Jackson (Muskegon, MI). Sequanal grade phenyl isothiocyanate for manual Edman degradations and urea were obtained from Pierce Chemical Co. Sequanal grade trypsin, endoprotease Glu-C (Staphylococcus aureus), Lys-C (Achromobacter lyticus), and Asp-N (Pseudomonas fragi) were from Boehringer Mannheim. Peptide N-glycosidase F (N-Glycanase™) was obtained from Genzyme Co. (Boston, MA). RP-8 (7-μm particle size, 2.1 mm × 10 cm) and RP-18 (5-μm particle size, 2.1 mm × 3 cm) columns were purchased from Rainin Instrument Co. (Woburn, MA).

Purification of Rabbit SBP-Rabbit SBP was purified using the general procedures described previously (21) but with added modifications. Rabbit serum (4 liters, Pel-Freez Biologicals) was precipitated with ammonium sulfate at 50% saturation at 4 °C. The pellet was dissolved in 600 ml of 0.02 M sodium phosphate, pH 6.8, and dialyzed exhaustively at 4 °C for 3 days against three changes of 30 liters of the same buffer. The sample was centrifuged at 10,000 × g, and the supernatant was pumped onto a column (5 × 55 cm) of DE52 (Whatman) equilibrated previously overnight with 4 liters of 0.02 M sodium phosphate, pH 6.8, at a flow rate of 240 ml/h. After sample application, the column was washed with 4 liters of the same buffer followed by a linear gradient consisting of 2 liters of that buffer and 2 liters of 0.09 M sodium phosphate, pH 6.8, pumped at the same flow rate. Sixteen fractions of 240 ml each were collected. Rabbit SBP eluted in fractions 7-12 as indicated by 5α-dihydrotestosterone-binding activity measurements using the filter assay (37). The fractions were pooled and concentrated to about 350 ml using Amicon membrane YM-10. The SBP solution was made 10% glycerol, 0.5 M NaCl, pH 7.4, centrifuged to remove a slight precipitate, and added to 100 ml of packed 5α-dihydrotestosterone-17α-hexanyldiaminoethyl-(1,4-butanediol diglycidyl ether)-agarose diluted 1:1 with Sepharose CL-4B. The affinity adsorbent was synthesized according to published procedures (21, 25). The suspension was stirred gently overnight at 4 °C and washed to remove the impurities, and the SBP was eluted batchwise as described previously (21). The SBP solution was concentrated to about 5 ml and electrophoresed preparatively as described previously (19) using a 6 × 2.5-cm-diameter separating gel (5% acrylamide) and a 2 × 2.5-cm-diameter stacking gel (3% acrylamide), the ratio of methylene-bisacrylamide to acrylamide being set at 3% throughout. The pure SBP solution was concentrated to about 5 ml, dialyzed against 10 mM Tris-Cl, 10% glycerol, 5 mM CaCl₂, 0.1 M NaCl, 20 μM 5α-dihydrotestosterone, pH 7.4, and stored at -20 °C. Depending upon the original serum, the procedure yields about 15 mg (60% yield) of rabbit SBP as determined spectrophotometrically using ε280 = 1.27 × 10⁵ cm⁻¹ M⁻¹ and a molecular weight of 85,800 (25). Protein purity was shown by SDS-PAGE as described previously (25). We have used this modified procedure successfully to purify primate SBPs.
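The spectrophotometric yield determination rests on the Beer-Lambert relation, A = ε·c·l. The sketch below converts an absorbance reading into a protein amount using the extinction coefficient and molecular weight given in the text. The absorbance, dilution factor, and 1-cm path length in the example are hypothetical illustration values, not measurements from the paper.

```python
EPSILON_280 = 1.27e5        # M^-1 cm^-1, from the text
MOLECULAR_WEIGHT = 85_800   # g/mol, from the text


def sbp_mg_per_ml(a280, path_cm=1.0):
    """Concentration from absorbance via Beer-Lambert (A = epsilon * c * l)."""
    molar = a280 / (EPSILON_280 * path_cm)
    return molar * MOLECULAR_WEIGHT  # g/liter is numerically equal to mg/ml


def sbp_total_mg(a280, volume_ml, dilution=1.0, path_cm=1.0):
    """Total protein in a sample that was diluted before being read."""
    return sbp_mg_per_ml(a280, path_cm) * dilution * volume_ml


# Hypothetical reading: A280 = 0.444 on a 10-fold dilution of a 5-ml sample.
print(f"{sbp_total_mg(0.444, 5.0, dilution=10):.1f} mg")
```

Under these assumed readings the result is close to the roughly 15 mg yield quoted above.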

Reduction and S-Carboxymethylation-Rabbit SBP was dissolved in 100 mM Tris-HCl buffer, pH 8.5, to a final protein concentration of 10 mg/ml. To prevent oxidation side reactions, dissolved oxygen was removed from the Tris buffer by vacuum filtration through a 0.45-μm aqueous filter and by stirring the water vigorously under vacuum for at least 10 min. Disulfide bonds were cleaved by treating the protein with dithiothreitol (100 nmol/μl, 10-fold molar excess over disulfide bonds) at 37 °C under N₂ for 1 h. The reduced protein was S-alkylated using either iodoacetic acid (500 nmol/μl) or 4-vinylpyridine in 50-fold molar excess over disulfide bonds under N₂ for 1 h at 37 °C in the dark and then lyophilized.

Methyl Ester Formation-A standard solution of 2 N HCl in methanol was prepared by adding 800 μl of acetyl chloride dropwise with stirring to 5 ml of methanol. After 5-min incubation at room temperature, 100-μl aliquots of the reagent were added to lyophilized HPLC fractions. Esterification was allowed to proceed for 2 h at room temperature, and the solvent was then removed by lyophilization.
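The nominal 2 N strength of the esterification reagent can be checked with a short calculation. Acetyl chloride reacts with methanol to give methyl acetate and one equivalent of HCl. The sketch below assumes complete reaction, additive volumes, and handbook values for acetyl chloride (density about 1.104 g/ml, molecular weight 78.50); none of these assumptions is stated in the paper.

```python
ACETYL_CHLORIDE_DENSITY = 1.104  # g/ml, assumed handbook value
ACETYL_CHLORIDE_MW = 78.50       # g/mol


def hcl_molarity(acetyl_chloride_ul, methanol_ml):
    """Approximate HCl molarity after adding acetyl chloride to methanol.

    Assumes one mole of HCl per mole of acetyl chloride and that the
    final volume is the simple sum of the two liquid volumes.
    """
    acetyl_chloride_ml = acetyl_chloride_ul / 1000.0
    moles_hcl = acetyl_chloride_ml * ACETYL_CHLORIDE_DENSITY / ACETYL_CHLORIDE_MW
    total_liters = (methanol_ml + acetyl_chloride_ml) / 1000.0
    return moles_hcl / total_liters


# Volumes from the text: 800 microliters of acetyl chloride into 5 ml of methanol.
print(f"{hcl_molarity(800, 5.0):.2f} M")
```

Under these assumptions the result falls just below 2 M, in line with the stated 2 N.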

N-Acetylation of Peptides-Peptides were dissolved in 50 μl of 50 mM ammonium bicarbonate (pH 8.0), and to this solution, 50 μl of freshly prepared acetylation reagent was added. Acetylation reagent was prepared by adding 100 μl of acetic anhydride to 300 μl of dry methanol. After standing for 15 min, the reaction mixture was lyophilized. Acetylated peptides were analyzed directly without further purification.

Deglycosylation-N-Linked oligosaccharide was removed from glycopeptides by digesting them at 37 °C with 1 μl of a 0.25 unit/ml solution of peptide N-glycosidase in 150 μl of 0.1 M NH₄HCO₃, 0.005 M EDTA, pH 8.5. Release of oligosaccharide was monitored by HPLC at 2-h intervals. After 10 h, the reaction was stopped by lyophilization.

Proteolytic Digestions-Proteolytic cleavages of 10-nmol aliquots of alkylated rSBP were accomplished by one of the following procedures. Proteolysis with trypsin and endoprotease Lys-C was carried out in 100 μl of 100 mM Tris-HCl, pH 8.5, for 4-8 h at 37 °C. Cleavage with endoprotease Asp-N was done with 1 μg of enzyme in 100 μl of sodium phosphate buffer, 25 mM, pH 8.0, for 6-8 h at 37 °C. Treatment of rSBP with α-chymotrypsin and endoprotease Glu-C was carried out with 1-4 μg of enzyme in an ammonium bicarbonate buffer, 50 mM, pH 8.6, for 4-6 h. Subdigestion of oligopeptide fragments was performed at the 5-nmol level in 100 μl of 50 mM ammonium bicarbonate by treatment of sample with 1 μg of α-chymotrypsin or pancreatic elastase. Proteolytic cleavage was allowed to proceed for 2-4 h at 37 °C. The products were lyophilized before separation by reverse phase HPLC.

Chemical Cleavage-Chemical cleavage with cyanogen bromide was performed by the addition of 300 μg of CNBr to 150 μl of 70% trifluoroacetic acid containing 100 mM L-tryptophan and 10 nmol of protein. Reaction was allowed to proceed at room temperature for 12 h in the dark under N₂. Cleavage COOH-terminal to aspartic acid residues was accomplished by treating the protein with 2% formic acid at 110 °C for 4.5 h.

HPLC-HPLC was performed on an Applied Biosystems model 130A separation system. Sample (50 μg in 45 μl of 0.1% aqueous trifluoroacetic acid) was injected onto a microbore RP-18 column (2.1 mm × 3 cm) or an RP-8 column (2.1 mm × 10 cm) and eluted with a 40-min linear gradient of 0-70% acetonitrile (0.085% trifluoroacetic acid) in 0.1% trifluoroacetic acid. Column effluent was monitored at 214 nm, and fractions were collected by hand. Solvent was then removed by lyophilization. Alternatively, the peptide mixture in the endoprotease Lys-C digest was first separated by size on tandem TSK 3000 PW columns in 45% acetonitrile, 0.1% trifluoroacetic acid, water, and then pooled fractions were purified on an RP 300/102 7-μm C8 column.
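For a linear gradient such as the one described, the mobile-phase composition at any elution time follows by interpolation. The sketch below is illustrative only: it ignores system dwell volume and treats the gradient as ideal, and the function name is not from the paper.

```python
def percent_acetonitrile(t_min, start=0.0, end=70.0, duration_min=40.0):
    """Nominal % acetonitrile at time t for an ideal linear gradient."""
    if t_min <= 0:
        return start
    if t_min >= duration_min:
        return end
    return start + (end - start) * t_min / duration_min


# Composition at a few points along the 40-min, 0-70% gradient.
for t in (0, 10, 20, 40):
    print(t, percent_acetonitrile(t))
```

A peptide eluting at 20 min would thus leave the column at a nominal 35% acetonitrile.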

Mass Spectrometry-Mass spectra were recorded on both a triple quadrupole mass spectrometer (38) and a Fourier transform mass spectrometer (39). Operation of these instruments has been described previously (38, 39). Methodology for sequence analysis of peptides in the mass range up to 1800 Da by collision-activated dissociation on the triple quadrupole instrument (38) and by laser photodissociation on the Fourier transform instrument has also been reported (40, 41). Molecular mass determinations on peptides in the mass range above 1800 Da were performed on the Fourier transform mass spectrometer (39). These peptides were then subdigested and reanalyzed on one of the above instruments to obtain amino acid sequence information.

Samples for mass analysis on the triple quadrupole instrument were prepared by dissolving lyophilized HPLC fractions in 5-20 μl of either 5% acetic acid or 0.1% trifluoroacetic acid. A 0.5-1.0-μl aliquot of these solutions (0.1-0.4 nmol of peptide) was added to 1 μl of monothioglycerol on a gold-plated stainless steel probe tip, 2 mm in diameter. Peptides were sputtered from this liquid matrix into the gas phase for mass analysis, largely in the form of (M + H)+ ions, by bombarding the sample matrix with 6-10 keV Cs+ ion projectiles. The latter ions were generated from a cesium ion gun (Antek, Palo Alto, CA) mounted directly on the ion source of the spectrometer.

Sample preparation for mass analysis on the Fourier transform instrument involved similar steps except that only 10-50 pmol of peptide sample was required to obtain data in the mass range up to m/z 6000. A matrix of 1:1 thioglycerol/glycerol or 1:1:1 thioglycerol/glycerol/dimethyl sulfoxide, 6 M HCl was employed for all sample runs on the Fourier transform instrument.

Disulfide Bond Assignment-Protein was digested with trypsin at 37 °C for 24 h in 200 μl of 100 mM Tris-HCl at pH 7.5 in the presence of 5 mM iodoacetic acid. The latter reagent is used as a scavenger for free sulfhydryl groups. Digestion was terminated by heating the reaction mixture at 100 °C for 5 min to denature the enzyme. To remove asparagine-linked oligosaccharide, sample was then treated with 0.25 units of peptide N-glycosidase F at 37 °C for 24 h in the presence of 5 mM EDTA. Digestion was terminated by lyophilization of the mixture, and the resulting peptides were then fractionated by HPLC. Lyophilized fractions were redissolved in 0.1% trifluoroacetic acid, and 0.5-μl aliquots of each sample (5-20 pmol of peptide) were added to a 0.5-μl matrix of 1:1 glycerol/thioglycerol. Mass spectra were recorded on the quadrupole Fourier transform instrument. Sample was withdrawn from the instrument and treated with 0.5 μl of dilute ammonium hydroxide to facilitate reduction of disulfide bonds by the thioglycerol matrix. After the sample had been reacidified a few seconds later, a second mass spectrum was recorded. Newly formed (M + H)+ ions for tryptic peptides, whose masses sum to a value 3 mass units higher than that of an (M + H)+ ion in the first spectrum, are identified as being part of a disulfide bond in the native protein.
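The 3-mass-unit rule follows from simple bookkeeping: reducing a disulfide adds two hydrogens in total, and each of the two released peptides then carries its own ionizing proton where the linked pair carried one. The matching step can be sketched as below. The masses in the example are hypothetical, and the tolerance is an assumed parameter, not an instrument specification from the paper.

```python
from itertools import combinations


def disulfide_pairs(before_reduction, after_reduction, tolerance=0.5):
    """Find (parent, peptide_a, peptide_b) triples consistent with one S-S bond.

    before_reduction: (M + H)+ values from the first spectrum.
    after_reduction:  newly formed (M + H)+ values from the second spectrum.
    A pair matches when its masses sum to the parent mass plus 3
    (two added hydrogens plus one extra ionizing proton).
    """
    matches = []
    for parent in before_reduction:
        for a, b in combinations(after_reduction, 2):
            if abs((a + b) - (parent + 3)) <= tolerance:
                matches.append((parent, a, b))
    return matches


# Hypothetical example: one linked pair at m/z 2501 and three new ions.
print(disulfide_pairs([2501.0], [1200.0, 1304.0, 900.0]))
```

Only the two ions whose masses sum to 2504 are reported as partners of the 2501 parent.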

Manual Edman Degradation-Manual Edman degradations were carried out according to the methods described previously (42), modified for use with tandem mass spectrometry (38).

Automated Edman Degradation-This was carried out as published (43) using an Applied Biosystems model 470 sequenator with on-line identification of phenylthiohydantoins.

Amino Acid Analysis-This was carried out according to published procedures (44).

RESULTS

General Strategy of Amino Acid Sequence Determination

The amino acid sequence of rabbit SBP was determined primarily by tandem mass spectrometry. Peptides were generated from the protein by either site-specific proteolytic enzymes or reagents such as cyanogen bromide. Products of these reactions were fractionated by reverse phase HPLC, and peptides in the resulting mixtures were then sequenced directly by either collision-activated dissociation on a triple quadrupole mass spectrometer or laser photodissociation on the recently constructed Fourier transform instrument as discussed below. In many cases, two different enzymes were employed to digest the protein in order to produce peptides of appropriate size for sequence analysis. Mass measurements on the (M + H)+ ions of large oligopeptides were employed to confirm overlaps assigned from sequence data. In most cases, these measurements were also performed on the corresponding peptide methyl esters. Conversion of a peptide to its methyl ester shifts the observed (M + H)+ ion to higher mass by 14 daltons/COOH group. Therefore, division of an observed mass shift by 14 defines the total number of carboxylic acid moieties (COOH terminus plus aspartic acid, glutamic acid, and carboxymethylcysteine residues) present in each peptide. Either automated Edman degradation or amino acid analysis was used to resolve areas of ambiguity and to assign residues as leucine or isoleucine. The latter amino acids have identical mass. Automated Edman degradation was also used to provide extensive sequence information at the NH2 terminus of the protein and in a region of the molecule near the COOH terminus which contained two asparagine-linked oligosaccharide chains. Glutamine and lysine, two additional residues of identical nominal mass, were distinguished by acetylation of peptides on the solids probe of the mass spectrometer. This procedure increases the mass of lysine residues by 42 daltons. The location of two disulfide bonds in the molecule was determined by Fourier transform mass spectrometry.
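The derivatization arithmetic described above can be sketched as follows. The function names are illustrative. The free-acid (M + H)+ value of 1633 used in the example is a nominal mass calculated here for the peptide containing residues 90-103, not a figure quoted in the text; the value of 1675 for its methyl ester is taken from Fig. 2.

```python
METHYL_ESTER_SHIFT = 14  # Da per COOH group, from the text
ACETYL_SHIFT = 42        # Da per acetylated lysine, from the text


def groups_from_shift(mass_before, mass_after, shift_per_group):
    """Number of derivatized groups implied by an observed mass shift."""
    delta = round(mass_after - mass_before)
    count, remainder = divmod(delta, shift_per_group)
    if delta < 0 or remainder:
        raise ValueError(f"shift of {delta} is not a multiple of {shift_per_group}")
    return count


def expected_carboxyl_groups(sequence, carboxymethyl_cys=0):
    """COOH terminus plus Asp, Glu, and any carboxymethylcysteine residues."""
    return 1 + sequence.count("D") + sequence.count("E") + carboxymethyl_cys


peptide = "DDGSWHQVHVKIRG"  # residues 90-103 of rSBP
print(groups_from_shift(1633, 1675, METHYL_ESTER_SHIFT))  # observed count
print(expected_carboxyl_groups(peptide))                  # count from sequence
```

Both lines give three carboxyl groups (two aspartic acid side chains and the COOH terminus). The same division with 42 in place of 14 gives the number of acetylated lysines.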

Shown in Fig. 1 is the primary structure determined for rabbit SBP. Oligopeptide sequences are shown on five separate lines below the deduced primary structure and are labeled according to the method used to generate them and the technique used to sequence them. Residues determined by automated Edman degradation are shown on line 1. Sequences obtained on the triple quadrupole instrument are found on lines 2-4 and are labeled with the mass of the (M + H)+ ion subjected to collision-activated dissociation. More than 125 peptides were characterized by this approach. Peptides sequenced on the Fourier transform instrument are designated by the label FTPD plus the mass of the (M + H)+ ion subjected to laser photodissociation. Sequences confirmed by enzyme specificity and molecular mass measurements on the Fourier transform instrument appear on lines 4 and 5 and are designated by dashed lines plus the label FT followed by the mass of the sample molecule plus a proton, the (M + H)+ ion.

Mass Spectrometry

The initial step in the sequence analysis of peptides in mixtures by triple quadrupole mass spectrometry (38) involves bombarding an acidic solution of the sample with high energy cesium ions. This process sputters the protonated peptide molecules, (M + H)+ ions, into the gas phase. In the next step, the mixture of (M + H)+ ions is injected into the first quadrupole mass analyzer. This latter device functions as a mass filter and allows all (M + H)+ ions of a particular mass, those from a single peptide, to pass through the analyzer. All other peptide (M + H)+ ions of different mass are rejected. In the next step, (M + H)+ ions from the selected peptide are injected into a second quadrupole analyzer that is operated as a collision cell. Here, the ions undergo multiple collisions with argon atoms. In this process, kinetic energy is converted to vibrational energy, and the peptide (M + H)+ ions fragment more or less randomly at the various amide bonds in the molecule. The resulting charged fragments, many of which differ in length by a single amino acid residue, are then transmitted to the third quadrupole analyzer, separated according to mass, and counted to produce the resulting collision-activated dissociation mass spectrum. Only seconds are required to generate this type of spectrum for each peptide in a particular HPLC fraction.

Shown in Fig. 2 is the collision-activated dissociation spectrum recorded on (M + H)+ ions generated from the methyl ester of the oligopeptide containing residues 90-103 in rSBP. Predicted masses, m/z values, for fragment ions of type b (38, 40), all of which contain the amino-terminal residue plus 1, 2, 3, 4, etc. additional residues, are shown above the sequence of this peptide. Those observed in the spectrum are underlined. Subtraction of m/z values for any two fragments that differ by a single amino acid, NHCH(R)CO, generates a value that specifies the mass and thus the identity of the extra residue in the larger fragment. Fragment ions of type b allow residues 7-14 to be specified in the present example. Since leucine and isoleucine have identical masses, residue 12 was assigned as isoleucine from amino acid analysis data.

Shown below the oligopeptide sequence are predicted masses, m/z values, for fragment ions of type y (38, 40), all of which contain the COOH terminus of the peptide plus 1, 2, 3, 4, etc. additional residues. Those observed in the spectrum are underlined. The identity of residues 1-11 is determined by subtracting m/z values for fragments of type y which differ by a single amino acid. Ions labeled in the spectrum with an asterisk or open circle are formed by loss of ammonia or water, respectively, from fragment ions of either type b or y.
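The predicted b and y series for this peptide can be regenerated from nominal residue masses. The sketch below assumes integer (nominal) masses, methyl esterification of the aspartic acid and glutamic acid side chains and of the COOH terminus, and singly protonated fragments. The function names are illustrative and not taken from the cited methodology papers.

```python
# Nominal (integer) residue masses in daltons; Leu/Ile and Gln/Lys coincide.
NOMINAL_RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}
METHYL_ESTER_SHIFT = 14  # Da per esterified COOH group
PROTON = 1
WATER = 18


def _residue_masses(sequence, esterified):
    """Residue masses, adding 14 Da to Asp/Glu side chains if esterified."""
    return [
        NOMINAL_RESIDUE_MASS[aa]
        + (METHYL_ESTER_SHIFT if esterified and aa in "DE" else 0)
        for aa in sequence
    ]


def b_ions(sequence, esterified=True):
    """b1..b(n-1): N-terminal fragments, residue masses plus one proton."""
    ions, total = [], PROTON
    for mass in _residue_masses(sequence, esterified)[:-1]:
        total += mass
        ions.append(total)
    return ions


def y_ions(sequence, esterified=True):
    """y1..yn: C-terminal fragments; the last value is the (M + H)+ ion."""
    total = WATER + PROTON + (METHYL_ESTER_SHIFT if esterified else 0)
    ions = []
    for mass in reversed(_residue_masses(sequence, esterified)):
        total += mass
        ions.append(total)
    return ions


peptide = "DDGSWHQVHVKIRG"  # residues 90-103 of rSBP, as listed in Fig. 2
print(b_ions(peptide))
print(y_ions(peptide))
```

For this peptide the computed values agree with the m/z values listed with Fig. 2 (b series 130 through 1586, y series 90 through 1675).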


FIG. 1. Amino acid sequence data on rabbit SBP obtained by a combination of tandem mass spectrometry and Edman degradation. Peptides are labeled according to the method used to generate them and the technique used to sequence them. Proteolytic and chemical cleavage methods are designated as follows: K, endoprotease Lys-C; Tr, trypsin; Ch, α-chymotrypsin; E, endoprotease Glu-C; El, elastase; D, endoprotease Asp-N; M, cyanogen bromide; A, acid cleavage at asparagine. Peptides sequenced by Edman degradation are labeled Ed and are listed in the first line below the primary structure of rabbit SBP. Unidentified residues are designated by blank spaces. Peptides sequenced by collision-activated dissociation on the triple quadrupole mass spectrometer are listed on lines 2-4 below the primary structure and are labeled by the nominal monoisotopic mass of the corresponding (M + H)+ ion. Primary structures deduced from laser photodissociation spectra recorded on the Fourier transform instrument are also found in lines 2-4 and are designated by the label FTPD followed by the average mass value observed for the isotopic cluster corresponding to the (M + H)+ ion. Sequences confirmed by enzyme specificity and molecular mass measurement on the Fourier transform instrument appear on lines 4 and 5. These sequences are designated by dashed lines plus the label FT followed by the average mass value recorded for the isotopic cluster corresponding to the (M + H)+ ion. Mass measurements on (M + H)+ ions for the corresponding peptide methyl esters were used to support assignments made by this latter method. Predicted incremental mass shifts of 14 Da/carboxylic acid group were observed in all cases. Amino acids located at the slashes demarking peptide lengths represent peptide-terminal residues.

Residues 101-150: IRGDSVLLEV DGKEVLRLSQ VSGTLHDKPQ PVMKLAVGGL LFPPSSLRLP
Residues 151-200: LVPALDGCLR RGSWLDPQAQ ISASAHASRR SCDVELQPGI FFPPGTHAEF
Residues 251-300: VVLSSGHEPG LDLPLAWGLP LQLKLGVSTA VLSQGSKKQA LGLPSPGLGP
Residues 351-367: WTHSCPSSPG NGTDTSH

Position  Residue  Type b (m/z)  Type y (m/z)
 1        Asp       130          1675
 2        Asp       259          1546
 3        Gly       316          1417
 4        Ser       403          1360
 5        Trp       589          1273
 6        His       726          1087
 7        Gln       854           950
 8        Val       953           822
 9        His      1090           723
10        Val      1189           586
11        Lys      1317           487
12        Ile      1430           359
13        Arg      1586           246
14        Gly      1675            90
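The ladder of type b and type y masses listed for the Fig. 2 peptide can be reproduced from nominal residue masses. The following is a minimal sketch, not the authors' software: the function names are hypothetical, integer (nominal) residue masses are assumed, and each methyl-esterified carboxyl group (the two Asp side chains and the COOH terminus) is assumed to add 14 Da.

```python
# Nominal (integer) residue masses in Da for the residues of this peptide.
RESIDUE_MASS = {"Asp": 115, "Gly": 57, "Ser": 87, "Trp": 186, "His": 137,
                "Gln": 128, "Val": 99, "Lys": 128, "Ile": 113, "Arg": 156}
METHYL = 14              # CH2 added per esterified carboxyl group (assumption)
ACIDIC = {"Asp", "Glu"}  # residues whose side chain carries a carboxyl group

PEPTIDE = "Asp Asp Gly Ser Trp His Gln Val His Val Lys Ile Arg Gly".split()


def esterified_masses(peptide):
    """Residue masses with +14 Da on every acidic side chain."""
    return [RESIDUE_MASS[r] + (METHYL if r in ACIDIC else 0) for r in peptide]


def b_ions(peptide):
    """b1..b(n-1): NH2-terminal fragments, residue sum plus one proton."""
    ions, total = [], 1
    for mass in esterified_masses(peptide)[:-1]:
        total += mass
        ions.append(total)
    return ions


def y_ions(peptide):
    """y1..yn: COOH-terminal fragments, residue sum plus H2O, one proton,
    and the methyl group on the esterified COOH terminus."""
    ions, total = [], 18 + 1 + METHYL
    for mass in reversed(esterified_masses(peptide)):
        total += mass
        ions.append(total)
    return ions


print("b:", b_ions(PEPTIDE))
print("y:", y_ions(PEPTIDE))
```

Successive differences in either list give the residue masses, which is how the sequence is read from the spectrum. Lys and Gln both contribute 128, so this ladder alone cannot tell them apart.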

Fig. 3 displays a laser photodissociation mass spectrum recorded on (M + H)+ ions from a 50-pmol sample of the 29-residue oligopeptide containing amino acids 127-155 of rSBP. To generate this spectrum, peptide (M + H)+ ions were sputtered into the gas phase, as described above, trapped in the 7-Tesla magnetic field of the quadrupole Fourier transform mass spectrometer, and then irradiated with a single 10-ns pulse of light at 193 nm from an argon fluoride excimer laser (40, 45). Fragmentation induced by the laser occurs more or less randomly at the various amide bonds in the backbone of the oligopeptide chain and produces a collection of charged fragments, many of which differ by a single amino acid residue. Predicted mass values for fragments of type y derived from the 29-residue peptide are shown below the structure in Fig. 4. Those observed are underlined. Subtraction of the mass values for fragments of type y which differ by a single amino acid residue facilitates assignment of residues 1-21 in the present sample.

Shown on top of the structure in Fig. 4 are mass values for three sets of fragment ions which result from sequential cleavage of two bonds internal to the peptide chain (38). All of these fragments contain proline at the amino terminus plus 1 or more residues extending toward the COOH terminus of the oligopeptide. Subtraction of mass values for fragments of this type which differ by a single amino acid facilitates assignment of residues 19-23 and 27-29 in the sample. Appearance of ions at m/z 211.1 and 310.1 suggests strongly that residues 24-26 should be assigned as Pro-Leu-Val. Collision-


FIG. 2. Collision-activated dissociation mass spectrum recorded on (M + H)+ ions at m/z 1676 from the methyl ester of the oligopeptide Asp-Asp-Gly-Ser-Trp-His-Gln-Val-His-Val-Lys-Ile-Arg-Gly, residues 90-103 of rSBP. Ions labeled in the spectrum with an asterisk or open circle are formed by loss of ammonia or water, respectively, from other fragment ions of type b or y.


FIG. 3. Laser photodissociation mass spectrum recorded on (M + H)+ ions from a 50-pmol sample of the 29-residue oligopeptide Asp-Lys-Pro-Gln-Pro-Val-Met-Lys-Leu-Ala-Val-Gly-Gly-Leu-Leu-Phe-Pro-Pro-Ser-Ser-Leu-Arg-Leu-Pro-Leu-Val-Pro-Ala-Leu, amino acids 127-155 of rSBP.


activated dissociation spectra recorded on peptides generated in an elastase subdigestion of the 29-residue oligopeptide confirmed this assignment. Differentiation of glutamine and lysine, two amino acids of identical mass, was accomplished by recording spectra both before and after acetylation of the sample on the probe of the mass spectrometer. Fragments containing the amino terminus and/or lysine shift to higher mass by 42 daltons after acetylation. No reaction occurs with glutamine residues. Amino acid composition data were used to assign residues as leucine rather than isoleucine in the above example.
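The acetylation test described above can be expressed as a small calculation. This is an illustrative sketch only: the helper name is hypothetical, a nominal shift of 42 Da per acetyl group is taken from the text, and it is assumed that only the free NH2 terminus and lysine side chains react.

```python
ACETYL = 42  # nominal mass added per acetyl group, as stated in the text


def acetylation_shift(fragment, has_amino_terminus):
    """Expected mass shift (Da) of a fragment after acetylation.

    fragment: list of three-letter residue codes.
    has_amino_terminus: True if the fragment retains the peptide's NH2 terminus.
    """
    sites = fragment.count("Lys") + (1 if has_amino_terminus else 0)
    return ACETYL * sites


# Lys and Gln both have nominal residue mass 128, so two candidate
# assignments for a COOH-terminal fragment are isobaric before acetylation
# but differ by 42 Da afterwards.
print(acetylation_shift(["Gln", "Pro", "Val"], has_amino_terminus=False))
print(acetylation_shift(["Lys", "Pro", "Val"], has_amino_terminus=False))

# The intact 29-residue peptide (residues 127-155) has a free NH2 terminus
# and two lysines, so three acetyl groups would be expected.
PEPTIDE_127_155 = ("Asp Lys Pro Gln Pro Val Met Lys Leu Ala Val Gly Gly Leu Leu "
                   "Phe Pro Pro Ser Ser Leu Arg Leu Pro Leu Val Pro Ala Leu").split()
print(acetylation_shift(PEPTIDE_127_155, has_amino_terminus=True))
```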

Summary of Experiments Performed to Assemble the Primary Structure of rSBP

To facilitate presentation, the assembly of amino acid sequence data on rSBP is discussed in unidirectional manner, proceeding from the NH2 terminus to the COOH terminus. In the laboratory, the actual structural assembly occurred from several directions simultaneously.

Segment 1. Residues 1-155-Automated Edman degradation on intact rSBP revealed the presence of a ragged NH2 terminus. Two sequences, Thr-Gln-Arg-Ala-Gln-Asp-Ser-Pro-Ala-Val-His and Ala-Gln-Asp-Ser-Pro-Ala-Val-His, were


H2N-Asp-Lys-Pro-Gln-Pro-Val-Met-Lys-Leu-Ala-Val-Gly-Gly-Leu-Leu-Phe-Pro-Pro-Ser-Ser-Leu-Arg-Leu-Pro-Leu-Val-Pro-Ala-Leu-OH

FIG. 4. Amino acid sequence of residues 127-155 of rSBP as determined by laser photodissociation on the quadrupole Fourier transform mass spectrometer.

found in approximately equal abundance. Mass spectrometry detected the same two NH2 termini. Digestion of rSBP with endoprotease Lys-C generated 10 oligopeptides, two of which afforded (M + H)+ ions separated by 385 Da (m/z 3660 and 3275), the mass corresponding to the three residues, Thr-Gln-Arg. Subdigestion of these two peptides with α-chymotrypsin afforded a mixture of peptides, two of which were sequenced by collision-activated dissociation of the corresponding (M + H)+ ions as Thr-Gln-Arg-Ala-Gln-Asp-Ser-Pro-Ala-Val-His and Ala-Gln-Asp-Ser-Pro-Ala-Val-His, respectively. A third peptide in the above mixture provided the sequence information to extend the Edman data through to residue 33.
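The 385-Da spacing between the two Lys-C peptides can be checked directly. The snippet below is a sketch using nominal residue masses, with a hypothetical helper name; it only confirms that the observed difference equals the summed masses of Thr, Gln, and Arg.

```python
# Nominal residue masses (Da) for the three residues missing from the
# shorter NH2-terminal variant.
NOMINAL = {"Thr": 101, "Gln": 128, "Arg": 156}


def mass_of(residues):
    """Sum of nominal residue masses for a list of three-letter codes."""
    return sum(NOMINAL[r] for r in residues)


observed_difference = 3660 - 3275           # m/z values quoted in the text
expected_difference = mass_of(["Thr", "Gln", "Arg"])
print(observed_difference, expected_difference)
```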

Structural information in the region containing residues 34-94 was obtained by aligning nine oligopeptides sequenced by tandem mass spectrometry and one additional peptide sequenced by laser photodissociation on the quadrupole Fourier transform instrument. Data from automated Edman degradation and molecular mass measurements on several large oligopeptides confirmed the proposed alignment.

Four peptides isolated from an endoprotease Asp-N digest of rSBP provided the information to establish the amino acid sequence for residues 94-155. Three of these were sequenced by collision-activated dissociation of the corresponding (M + H)+ ions. Laser photodissociation of the (M + H)+ ion for the fourth peptide at m/z 3055 (Fig. 3) provided the initial sequence information on residues 128-155. This was later confirmed by collision-activated dissociation experiments on (M + H)+ ions of two smaller peptides derived from this region of the protein.

Segment 2. Residues 156-217-Segment 2 contains two sets of adjacent arginine residues at positions 160-161 and 179-180. The first of these appeared in the sequence of an endoprotease Asp-N peptide (residues 156-165) sequenced by tandem mass spectrometry. The second set of adjacent arginine residues, along with the preceding 2 amino acids in the sequence, escaped detection until a tryptic map of the protein was generated at the very end of the structural analysis. In this experiment, oligopeptides produced from rSBP in a tryptic digest were fractionated by HPLC and subjected to analysis by mass spectrometry. All but two of the observed (M + H)+ ions, m/z 2368 and 1883, could be assigned to tryptic peptides predicted from the known sequence information. When the first peptide was treated with phenylisocyanate under conditions for the Edman degradation, the (M + H)+ ion shifted to higher mass by 135 daltons. This observation suggested that the unassigned signal at m/z 2368 corresponded to the (M + H)+ ion of the tryptic peptide containing residues 289-311 plus pyroglutamate at the NH2 terminus.

Sequence analysis of the second unassigned tryptic peptide, (M + H)+ = 1883, was conducted simultaneously by tandem mass spectrometry and automated Edman degradation. Mass spectra recorded directly on the product mixture obtained by subdigesting the peptide with α-chymotrypsin showed a single (M + H)+ ion at m/z 1237. Analysis of the collision-activated dissociation spectrum recorded on this ion afforded the sequence for residues 165-176. Automated Edman degradation confirmed this assignment and also identified the 3 residues in the small polar chymotryptic fragment (residues 177-179) not detected by mass spectrometry. Edman degradation of a fragment isolated after acid cleavage of a peptide from a Lys-C digest provided the data to sequence through the second set of adjacent arginine residues. The remaining residues in segment 2 were assigned from collision-activated dissociation spectra recorded on (M + H)+ ions of peptides generated in an endoprotease Asp-N digest of the intact protein.

Segment 3. Residues 218-342-Most of the sequence information on segment 3 of rSBP was obtained by subdigesting large oligopeptide fragments generated with endoprotease Lys-C, cyanogen bromide, and trypsin. Particularly noteworthy was the use of laser photodissociation on the quadrupole Fourier transform instrument to assign the complete sequence of the 24-residue peptide containing amino acids 251-274.

Extension of the sequence beyond residue 274 proved to be a difficult task for tandem mass spectrometry. This region of the protein is particularly rich in cleavage sites for relatively nonspecific proteases such as elastase and chymotrypsin and low in cleavage sites preferred by the highly selective proteases, Asp-N and Glu-C. As a result, it proved impossible to generate a set of peptides that contained the necessary overlap information to sequence through this region. Digestions conducted on this segment produced either a large number of small peptides that proved difficult to fractionate and analyze or a small number of very large fragments that exceeded the 1800-Da mass range available on the triple quadrupole instrument built in our laboratory. Automated Edman degradation of a 59-residue Asp-N fragment solved this particular problem. Data from the first 18 cycles of this process provided the information to overlap sequences within two endoprotease Lys-C fragments and thus to establish the primary structure of rSBP through residue 287.

Collision-activated dissociation spectra recorded on the (M + H)+ ions of 10 additional peptides furnished the data to extend the sequence through residue 342. Overlap assignments in this region of the protein were confirmed by mass measurements on 32- and 20-residue peptides produced by endoprotease Glu-C subdigestion of large endoprotease Lys-C and cyanogen bromide fragments, respectively.

Segment 4. Residues 343-367-Sequence information at the COOH terminus of rSBP was obtained by automated Edman degradation of a 25-residue glycosylated peptide isolated from an endoprotease Lys-C digest of the native protein. Cycles 3 and 19 from this process liberated unidentified phenylthiohydantoin-amino acids. Both of these gaps correspond to the first positions in the 3-residue sequences required for glycosylation on asparagine, Asn-X-Thr or Asn-X-Ser. Accordingly, both sites were assumed to be occupied by N-linked oligosaccharide. Confirmation of these assignments was obtained by digesting intact rSBP with trypsin, before and after treatment with the enzyme peptide N-glycosidase F, and then separating the resulting peptides by HPLC. Peptide N-glycosidase F hydrolyzes asparagine-linked oligosaccharide to oligosaccharide and a peptide containing aspartic acid at the glycosylation site. Comparison of the two HPLC traces showed that the retention time of two peptides changed as a result of the deglycosylation step. Analysis of one of these peptides by tandem mass spectrometry afforded the sequence Ala-Leu-Asp-Arg and thus confirmed residue 345 as one glycosylation site. Mass spectra recorded on the mixture produced by subdigestion of the second peptide with α-chymotrypsin showed an abundant (M + H)+ ion at m/z 1587. Since this is the result expected for a peptide containing aspartic


FIG. 5. Comparison and alignment of the amino acid sequence of rabbit and human SBPs. The solid lines represent identical sequences; (*) indicates N-glycosylation site; (+) indicates O-glycosylation site; and the boxed areas in the human sequence designate the amino acid sequence repeat.


rSBP  TQRAQDSPAVHLINGLGQEPIQVLTFDLTRLVKASSSFELRTWDSEGVIMGDTSPKDDW
hSBP  LRPVLPTQSAHDPPAVHLSNGPGQEPIAVMTFDLTKITKTSSSFEVRTWDPEGVIMGDTNPKDDW

rSBP  FMLGLRDGRPEIQMHNPWAQLTVGAGPRLDDGSWHQVHVKIRGDSVLLEVDGKEVLRLSQVSGTLH
hSBP  FMLGLRDGRPEIQLHNHWAQLTVGAGPRLDDGRWHQVEVK

rSBP  DKPQPVMKLAVGGLLFPPSSLRLPLVPALDGC~GSW~PQAQISAS~S~SCDVE~PGIFF
hSBP  SKRHPIMRIALGGLLFPASNLRLPLVPALDGC~S~KQAEISASAPTSLRSCDVESNPGI~

hSBP  KMKALALPPLGLAPLLNLWAKPQGRLFLGALPGEDSST

rSBP  SFCLDGLWAQGQKLDMDKALNRSQDIWTHSCPSSPGNGTDTSH
hSBP  SFCLNGLWAQGQRLDVDQALNRSHEIWTHSCPQSPGNGTDASH

acid at position 361 plus the other 14 residues assigned from the Edman degradation discussed above, residue 361 was confirmed as the second glycosylation site.
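The correspondence between Edman cycles 3 and 19 and the glycosylation consensus sequence can be illustrated with a short search. This is a sketch under stated assumptions: the 25-residue COOH-terminal sequence is transcribed from the rSBP row of the Fig. 5 alignment, the function name is hypothetical, and the exclusion of proline at position X is a commonly cited refinement of the Asn-X-Thr/Ser rule that is not stated in the text.

```python
import re

# Residues 343-367 of rSBP in one-letter code, transcribed from Fig. 5.
C_TERMINAL_PEPTIDE = "ALNRSQDIWTHSCPSSPGNGTDTSH"
FIRST_RESIDUE = 343

# Asn-X-Ser/Thr, with X assumed to be any residue except Pro. The lookahead
# keeps matches zero-width so overlapping sites would also be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(sequence, first_residue=1):
    """Residue numbers of Asn in every Asn-X-Ser/Thr motif."""
    return [m.start() + first_residue for m in SEQUON.finditer(sequence)]


sites = sequon_positions(C_TERMINAL_PEPTIDE, FIRST_RESIDUE)
cycles = [pos - FIRST_RESIDUE + 1 for pos in sites]
print("Asn positions:", sites)
print("Edman cycles:", cycles)
```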

That His-367 is the real COOH terminus of the protein is suggested by the following pieces of evidence: (a) all other tryptic peptides have been placed within the sequence; (b) neither endoprotease Lys-C nor trypsin is expected to cleave the protein on the COOH-terminal side of histidine residues; and (c) alignments of the rabbit and human SBP sequences are highly similar and contain COOH-terminal histidine. The weakest point in the sequence proposed for rSBP is the failure to find a peptide that overlaps residues 342 and 343. Linkage of these 2 residues is based entirely on the similarity observed between the sequences of human and rabbit SBP.

Assignment of Disulfide Bonds

To locate the disulfide bonds in rSBP, protein was digested with both trypsin and peptide N-glycosidase F, and the resulting peptides were then fractionated by HPLC and analyzed by mass spectrometry on the quadrupole Fourier transform instrument. Mass spectra recorded on one of the fractions showed an (M + H)+ ion at m/z 4928, the expected mass for two of the four half-cystine-containing tryptic peptides (residues 312-337 and 347-367) linked by a disulfide bond between Cys-327 and Cys-355. Reduction of the sample on the probe of the mass spectrometer caused the signal at m/z 4928 to disappear and generated the expected (M + H)+ ion at m/z 2713 for residues 312-337.

Mass spectrometry failed to detect an (M + H)+ ion for the 61-residue tryptic fragments containing the remaining 2 half-cystines, residues 149-160 and 181-229. Accordingly, several of the late eluting HPLC fractions were subdigested with α-chymotrypsin and reexamined by mass spectrometry. Existence of a disulfide bond between Cys-158 and Cys-182 was confirmed by observation of an (M + H)+ ion at m/z 2473 (residues 149-160 linked to 181-191) which shifted to m/z 1268 and 1208 following reduction of the sample on the probe of the mass spectrometer. These latter two signals correspond to (M + H)+ ions for peptides containing residues 149-160 and 181-191, respectively.
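The mass relationship behind the reduction experiment can be written out explicitly. The sketch below uses nominal masses and a hypothetical function name; it assumes that forming one disulfide bond removes two hydrogens and that the linked pair is observed with a single added proton.

```python
HYDROGEN = 1  # nominal mass of H (Da)


def disulfide_linked_mh(mh_a, mh_b):
    """(M + H)+ expected for two peptides joined by one disulfide bond.

    Starting from the (M + H)+ values of the two reduced peptides:
    subtract two hydrogens lost on forming the S-S bond, and one proton
    because the linked species carries one charge rather than two.
    """
    return mh_a + mh_b - 2 * HYDROGEN - HYDROGEN


# Reduced peptides 149-160 and 181-191, m/z values from the text.
linked = disulfide_linked_mh(1268, 1208)
print(linked)
```

The result matches the m/z 2473 signal reported for the linked pair before reduction.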

Amino Acid Composition of the Rabbit SBP Monomer

The final sequence of 367 residues is composed of 22 alanines, 16 arginines, 6 asparagines, 26 aspartic acids, 4 half-cystines, 24 glutamines, 14 glutamic acids, 36 glycines, 10 histidines, 9 isoleucines, 57 leucines, 14 lysines, 5 methionines,


TABLE I
Comparison of the amino acid sequences of rabbit SBP, human SBP (hSBP, Ref. 33), rat ABP (rtABP, Ref. 63), and the human androgen receptor (hAR, Refs. 46-48)

         rSBP         hSBP         rtABP        hAR(a)
rSBP                  63.76(b)     56.46(b)     -1.94(b)
hSBP     79%(c)                    57.97(b)     -0.57(b)
         (0)(d)
rtABP    68%(c)       68%(c)                    -1.67(b)
         (0)(d)       (?)(d)
hAR      14%(c)       15%(c)       13%(c)
         (7)(d)       (10)(d)      (?)(d)

(a) Portion of the androgen receptor sequence used for the comparative analysis was from residues 466-918 (46-48), which contain the steroid-binding domain as well as the DNA-binding domain.

(b) Alignment scores, expressed in units of standard deviation from the mean of random scores for scrambled sequences of the same composition (49).

(c) Percent sequence identities. The number of identical residues between two sequences is compared with the total possible matches between residues using the ALIGN program with mutation data matrix (49) and a penalty for a gap (break) = 16.

(d) ( ), no. of breaks.

11 phenylalanines, 32 prolines, 36 serines, 14 threonines, 11 tryptophans, 1 tyrosine, and 19 valines. This in turn corresponds to Mr 39,769 plus the two oligosaccharide chains (~4,000). These data are in excellent agreement with the amino acid composition of rabbit SBP published previously (25).
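The residue count and polypeptide molecular weight quoted above can be recomputed from the composition. This is a sketch, not the authors' calculation: it assumes standard average residue masses, adds one water for the chain termini, and makes no correction for the four hydrogens lost in the two disulfide bonds. All names are hypothetical.

```python
# Composition of the rSBP monomer as listed in the text.
COMPOSITION = {
    "Ala": 22, "Arg": 16, "Asn": 6, "Asp": 26, "Cys": 4, "Gln": 24,
    "Glu": 14, "Gly": 36, "His": 10, "Ile": 9, "Leu": 57, "Lys": 14,
    "Met": 5, "Phe": 11, "Pro": 32, "Ser": 36, "Thr": 14, "Trp": 11,
    "Tyr": 1, "Val": 19,
}

# Standard average residue masses in Da (assumed values).
AVERAGE_RESIDUE_MASS = {
    "Ala": 71.0788, "Arg": 156.1875, "Asn": 114.1038, "Asp": 115.0886,
    "Cys": 103.1388, "Gln": 128.1307, "Glu": 129.1155, "Gly": 57.0519,
    "His": 137.1411, "Ile": 113.1594, "Leu": 113.1594, "Lys": 128.1741,
    "Met": 131.1926, "Phe": 147.1766, "Pro": 97.1167, "Ser": 87.0782,
    "Thr": 101.1051, "Trp": 186.2132, "Tyr": 163.1760, "Val": 99.1326,
}
WATER = 18.0153


def residue_count(composition):
    """Total number of residues in the chain."""
    return sum(composition.values())


def polypeptide_mass(composition):
    """Average mass of the unmodified, reduced polypeptide chain."""
    residues = sum(AVERAGE_RESIDUE_MASS[aa] * n for aa, n in composition.items())
    return residues + WATER


print(residue_count(COMPOSITION))
print(round(polypeptide_mass(COMPOSITION)))
```

Under these assumptions the composition sums to 367 residues and gives a mass within about 1 Da of the reported 39,769.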

Corrections of the Preliminary Rabbit SBP Sequence Reported Earlier

We reported preliminary mass spectrometric data on the amino acid sequence of rabbit SBP at the Second International Symposium on Binding Proteins (31). Reexamination of selected peptides by Edman degradation provided 16 corrections (largely leucine/isoleucine changes) that are reported here. The preliminary leucine/isoleucine assignments had been tentatively based only on homology with human SBP (33). In the present study, all leucine/isoleucine assignments were made either by Edman degradation (Fig. 1) or by amino acid composition of selected small peptides.

Sequence Comparisons

Comparison of the amino acid sequences of human and rabbit SBPs is shown in Fig. 5. Seventy-nine percent of the amino acid residues are identical, indicating that the two proteins are homologous and have arisen from a common ancestor. Calculated parameters for comparisons of various sex steroid-binding proteins are shown in Table I. The amino acid sequence of the human androgen receptor was recently deduced from cDNA clones isolated from libraries prepared from mRNA of human breast cancer T47D cells (46) and from human prostate (47, 48). It was of interest to compare its amino acid sequence with that of rabbit SBP since the two proteins bind 5α-dihydrotestosterone with similarly high affinity (Kd ~10⁻⁹ M). Using the ALIGN program (49), the alignment score between rabbit SBP and the carboxyl-terminal region of the human androgen receptor comprising Gly-466 to Gln-918 (46-48) was determined by comparing with those derived from random sequences of the same composition. That region of the androgen receptor contains both the DNA-binding domain and the steroid-binding domain. The alignment score of -1.94, shown in Table I, indicates that rSBP is not homologous to the androgen receptor. The same holds true for hSBP and rat ABP. In contrast, alignment scores of 63.76 and 56.46 for hSBP and rat ABP, respectively, show that rSBP is homologous to those two proteins.

DISCUSSION

Although most amino acid sequence determinations are inferred today from DNA sequence data, there continues to be value in examining the proteins directly in cases in which they are readily available, as in SBP. In particular, structural information on sites of post-translational modification and on location of disulfide bonds can only be directly determined by protein sequence methods. There were two main objectives in the present study: (a) to determine the primary structure of rabbit SBP and compare it with that of the human isoform, a necessary first step in understanding the structural basis for the difference in steroid-binding specificity between the two proteins; and (b) to assess the relative tactical advantages to the experimentalist of using automated Edman degradation and mass spectrometric methods for protein sequence analysis.

Edman degradation proved to be the method of choice for generating sequence information at the NH2 terminus of the molecule, largely because the procedure could be carried out directly on the intact protein. Presence of a ragged NH2 terminus on rSBP was first detected by Edman degradation and then confirmed subsequently by mass spectrometry. In situations in which the protein contains a blocked NH2 terminus, the choice of mass spectrometry for sequence analysis becomes obvious.

Advantages of the tandem mass spectrometry method include the ability to sequence peptides present in mixtures and the speed with which one can obtain sequence information over the entire length of a protein. Only 3-5 h of instrument time are required to obtain both molecular mass and sequence data on all peptides generated in any particular digest. As a result, about 75% of the rSBP primary structure was assembled into large nonoverlapping segments with only several weeks of effort. Location of disulfide bonds is also greatly facilitated by mass spectrometry. Mass measurements performed on peptides from a particular enzyme digest, before and after reduction of the sample on the probe of the mass spectrometer, identify those sequences attached through a disulfide linkage. This experiment requires only minutes of instrument time.

Limitations of the above sequencing approach as practiced on the triple quadrupole mass spectrometer built in Dr. D. F. Hunt's laboratory include (a) an inability to produce fragment ions under low energy collision conditions that differentiate leucine and isoleucine; and (b) a mass range limit of 1800 Da imposed by instrument electronics. The first problem dictated that all 57 leucine and 9 isoleucine residues in the rSBP monomer be assigned from amino acid composition or Edman degradation data. Recent results from collision experiments conducted in the keV energy range on magnetic sector instruments suggest that tandem mass spectrometry can be employed to make these assignments in the future (40). The second restriction is more severe and requires that peptides sequenced on the triple quadrupole instrument contain fewer than 17 residues. Development of the quadrupole Fourier transform instrument allows us to make routine mass measurements on peptides at the 10-50-pmol level in the mass range below 6000 Da, but the laser photodissociation technique employed on this instrument to generate sequence information is not yet routinely applicable to all peptide samples, even those below m/z 2000.
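The link between the 1800-Da limit and the figure of fewer than 17 residues can be estimated roughly. The sketch below is an approximation under stated assumptions: it uses the mean residue mass of rSBP itself (polypeptide Mr 39,769 over 367 residues), treats every peptide as having that mean composition, and uses hypothetical function names.

```python
import math

WATER = 18.015   # Da, added once per peptide chain
PROTON = 1.008   # Da, for the singly protonated (M + H)+ ion


def mean_residue_mass(polypeptide_mr, n_residues):
    """Average mass per residue after removing the terminal water."""
    return (polypeptide_mr - WATER) / n_residues


def max_residues(mass_limit, residue_mass):
    """Largest whole number of average residues whose (M + H)+ fits the limit."""
    return math.floor((mass_limit - WATER - PROTON) / residue_mass)


RSBP_MEAN = mean_residue_mass(39769, 367)
print(round(RSBP_MEAN, 1))
print(max_residues(1800, RSBP_MEAN))   # triple quadrupole limit
print(max_residues(6000, RSBP_MEAN))   # Fourier transform instrument limit
```

Actual limits depend on composition: a peptide rich in Gly and Ala can be longer, and one rich in Trp and Arg shorter, than this average suggests.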

The two regions within the rSBP structure which proved most difficult to sequence by tandem mass spectrometry included residues 260-280 and 342-367. The first mentioned region contains a paucity of acidic and basic amino acids and a large number of small hydrophobic residues. As a result, proteolytic enzymes produced either one or two peptides whose (M + H)+ ions fell outside the mass range of the triple quadrupole instrument or a large number of small peptides, 2-4 residues in length, the sequences of which could not be overlapped to generate a unique primary structure for this part of the molecule. Automated Edman degradation of a 59-residue Asp-N peptide provided the necessary data without difficulty and will continue as the method of choice for sequence analysis of such regions unless the working mass range of tandem mass spectrometers can be extended beyond 3000 Da. Use of lasers to fragment large oligopeptides (41, 45), microchannel array detectors to improve the sensitivity of four sector instruments (40), multiply charged ions to expand the mass range accessible to quadrupole instruments (50), and both laser desorption (51) and electrospray ionization techniques (52, 53) to analyze intact proteins by mass spectrometry all suggest that this will soon be possible.

Problems at the COOH terminus of the protein stemmed from the presence of two asparagine-linked oligosaccharide side chains. As a result, digestion of rSBP with the usual proteolytic enzymes always generated hydrophilic peptides from this region which were outside the working mass range of both the triple quadrupole (1800 Da) and the Fourier transform instruments (6000 Da). Automated Edman degradation of a large endoprotease Lys-C fragment provided the first sequence information from this part of the protein. Later it was found that oligosaccharide could be removed from asparagine residues by first digesting the native protein with trypsin and then treating the resulting peptides with peptide N-glycosidase F. Once this had been achieved, mass spectrometry easily confirmed the location of the two glycosylation sites. In general, our experience suggests that the best strategy for protein sequence analysis should employ both automated Edman and tandem mass spectrometric methods of analysis.

The results presented indicate that the rabbit SBP monomer is a single chain polypeptide composed of 367 amino acid residues with two N-linked oligosaccharide side chains and two disulfide bonds. The polypeptide molecular weight of the monomer calculated from the sequence is 39,769. Since the native glycoprotein has a molecular weight of 85,800 as determined by sedimentation equilibrium (25), we conclude from the sequence data that rabbit SBP is a dimer composed of identical subunits. The molecular weight of the homodimer, calculated from the sequence including 9% carbohydrate (25), is 87,404, which agrees well with the experimentally determined value of 85,800 (25).
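The homodimer value can be reproduced from the monomer mass if the 9% carbohydrate is read as a fraction of the total glycoprotein mass. That reading is an assumption made for this sketch (the text does not spell out the calculation), and the function name is hypothetical.

```python
def glycoprotein_dimer_mass(monomer_polypeptide_mr, carbohydrate_fraction):
    """Dimer mass when carbohydrate makes up the given fraction of total mass.

    The polypeptide of two chains then accounts for (1 - fraction) of the total.
    """
    polypeptide = 2 * monomer_polypeptide_mr
    return polypeptide / (1.0 - carbohydrate_fraction)


calculated = glycoprotein_dimer_mass(39769, 0.09)
measured = 85800  # sedimentation equilibrium value quoted in the text
print(round(calculated))
print(round(100 * (calculated - measured) / measured, 1), "% above the measured value")
```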

Although the sequence data reported here indicate that rabbit SBP is the product of a single gene since it is homodimeric, there are published data that suggest otherwise. For instance, SDS-PAGE analyses display two stained bands, one major and one minor, migrating between 40,000 and 43,000 (22, 25). Two bands are also present in human SBP (21, 24, 27, 29, 54, 55), baboon SBP (24), monkey SBP (32), and rat ABP (56, 57). These two components, designated "heavy" and "light" protomers by some authors, have been interpreted as two different subunits within SBP and ABP (22, 55, 58, 59), representing products of two different genes (60). In contrast, we have maintained that the two SDS-PAGE bands could not represent the subunits of SBP since they were not present in stoichiometric amount and since additional bands with similar molecular weights are detected in some preparations (24, 25, 30). In fact, isoelectric focusing patterns of native rabbit, human, baboon, and monkey SBPs show at least 12 different stained bands, all of which are active dimeric SBP molecules (24). Treatment of the native protein with neuraminidase and other deglycosidases dramatically changed the isoelectric focusing patterns of all SBPs tested. On the basis of these and other data, we concluded that the microheterogeneity observed in SDS-PAGE could not reflect the presence of two different gene products (24, 25). The sequence data presented here as well as that obtained earlier for human SBP² (33) are in accord with this conclusion. Instead, the bands observed on SDS-PAGE must result in part from differential glycosylation of the SBP monomer, giving rise to monomeric variants with slightly different molecular weights. In the dimeric proteins, various combinations give rise to the variety of dimeric molecules having different isoelectric points. Some microheterogeneity could also result from minor genetic variations (29) undetected in our analysis and from proteolytic cleavages at the amino-terminal end. Sequence analysis of the rabbit protein revealed that about 50% of the chains began with Thr-Gln-Arg-Ala-Gln-Asp-Ser-Pro-Ala-Val-His and the rest with Ala-Gln-Asp-Ser-Pro-Ala-Val-His. A similar "ragged end" was reported for human SBP in which 2 residues were missing from about 50% of the chains (33). Since both SBP preparations are active, such cleavages at their amino termini appear to have no effect on steroid binding.

Comparison of the human and rabbit SBP sequences (Fig. 5 and Table I) indicates that 79% of their amino acids occupy identical loci. Rabbit SBP is shorter than the human protein by 6 residues at the amino-terminal end and lacks the O-linked oligosaccharide side chain at Thr-7 of human SBP. This is in accord with the earlier findings that rSBP has a lower sugar content than hSBP and lacks N-acetylgalactosamine ( X I ), a characteristic of O-linked carbohydrate side chains. Two N-linked oligosaccharide side chains are present in corresponding positions of both proteins. Also, both proteins contain a single tyrosine residue and two disulfide bonds. Those are located in areas that have identical sequences. As in the case of human SBP (33), rabbit SBP contains a relatively hydrophobic segment beginning at Trp-241 and ending at Leu-282. An unusual feature of this segment is the presence of alternating leucine residues (10 in all). Within the same segment, the polypeptide sequence from Leu-242 to Gly-256 is partially repeated from Leu-271 to Gly-285 (Fig. 5). Similar repeats were reported in human SBP with minor variations corresponding to Thr-279 and Ala-280 of rabbit SBP. If this region of the polypeptide chain includes features of the steroid-binding site as hypothesized previously (30, 31), the change in these 2 amino acid residues might be related to the difference in steroid-binding specificity observed between rabbit and human SBP (18). Affinity labeling and oligo-directed mutagenesis will provide experimental tests of that proposal.

It has been determined that human SBP and the androgen-binding protein (ABP) of rat epididymis are homologous proteins (61) and that they belong to a gene family that is distinct from the supergene family of steroid receptors (33, 62). As expected, the data shown in Fig. 5 and Table I indicate that rabbit SBP joins human SBP and rat ABP in a single gene family. The fact that the androgen receptor is not homologous to rabbit SBP is interesting in the context that the two proteins bind 5α-dihydrotestosterone with similar Kd values (~10⁻⁹ M) but do not bind 17β-estradiol. One might

² The amino acid sequences of hSBP and rSBP were determined on preparations containing the entire mixture of SBP isoforms. In the case of hSBP, no difference in sequence data was obtained between preparations containing either two or three SDS-PAGE bands. This is in answer to a question raised by Gershagen et al. (29).


have expected some common structural features. Since the two nonhomologous proteins recognize the same steroid structure, it would be interesting to determine the structural basis of the two different protein designs that lead to the same steroid-binding specificity.

in Binding Proteins of Steroid Hormones (Forest, M. G. & Pugeat, M., eds) Vol. 149, pp. 15-30, John Libbey, London

31. Petra, P. H., Que, B., Namkung, P. C., Ross, J. B. A., Charbonneau, H., Walsh, K. A., Griffin, P. R., Shabanowitz, J. & Hunt, D. F. (1988) Ann. N. Y. Acad. Sci. 538, 10-24

32. Turner, E. E., Ross, J. B. A., Namkung, P. C. & Petra, P. H.

REFERENCES

1. Westphal, U. (1986) Monograph on Endocrinology, Vol. 27, Springer-Verlag, New York

2. Moore, J. W. & Bulbrook, R. D. (1988) Oxford Rev. Reprod. Biol. 10, 181-236

3. Vermeulen, A. L., Verdonck, L., Van der Straeten, M. & Orie, M. (1969) J. Clin. Endocrinol. Metab. 29, 1470-1480

4. Petra, P. H., Stanczyk, F. Z., Namkung, P. C., Fritz, M. A. & Novy, M. J. (1985) J. Steroid Biochem. 22, 739-746

5. Szego, C. M. & Pietras, R. J. (1981) Biochemical Actions of Hormones (Litwack, G., ed) Vol. 8, pp. 307-463, Academic Press, New York

6. Bordin, S. & Petra, P. H. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 5678-5682

7. Sakiyama, R., Pardridge, W. M. & Musto, N. A. (1988) J. Clin. Endocrinol. Metab. 67, 98-103

8. Avvakumov, G. V., Zhuk, N. I. & Strel'chyonok, O. A. (1986) Biochim. Biophys. Acta 881, 489-498

9. Pardridge, W. M. (1987) Am. J. Physiol. 252, E157-E164

10. Siiteri, P. K., Murai, J. T., Hammond, G. L., Nisker, J. A., Raymoure, W. J. & Kuhn, R. W. (1982) Recent Prog. Horm. Res. 38, 457-510

11. Robel, P., Eychenne, B., Blondeau, J. P., Jung-Testas, I., Groyer, M. T., Mercier-Bodard, C., Hechter, O., Roux, C. & Dadoune, J. P. (1983) Horm. Res. 18, 28-36

12. Tardivel-Lacombe, J., Egloff, M., Mazabraud, A. & Degrelle, H. (1984) Biochem. Biophys. Res. Commun. 118, 488-494

13. Strel'chyonok, O. A., Avvakumov, G. V. & Survilo, L. I. (1984) Biochim. Biophys. Acta 802, 459-466

14. Hryb, D. J., Khan, M. S. & Rosner, W. (1985) Biochem. Biophys. Res. Commun. 128, 432-440

15. Stanczyk, F. Z., Namkung, P. C., Fritz, M. A., Novy, M. J. & Petra, P. H. (1986) in Binding Proteins of Steroid Hormones (Forest, M. G. & Pugeat, M., eds) Vol. 149, pp. 555-563, John Libbey, London

16. Hryb, D. J., Khan, M. S., Romas, N. A. & Rosner, W. (1989) J. Biol. Chem. 264, 5378-5383

17. Mickelson, K. E. & Petra, P. H. (1975) Biochemistry 14, 957-963

18. Mickelson, K. E. & Petra, P. H. (1978) J. Biol. Chem. 253, 5293-5298

19. Mickelson, K. E., Teller, D. C. & Petra, P. H. (1978) Biochemistry

20. Mercier-Bodard, C., Renoir, J. M. & Baulieu, E.-E. (1979) J.

21. Petra, P. H. & Lewis, J. (1980) Anal. Biochem. 105, 165-169

22. Kotite, N. J. & Musto, N. A. (1982) J. Biol. Chem. 257, 5118-

5124 23. Strel'chyonok, 0. A., Survilo, L. I., Tzapelik, G. Z. & Sviridov, 0.

V. (1983) Biokhimiya 48,756-762 24. Petra, P. H., Stanczyk, F. Z., Senear, D. F., Namkung, P. C.,

Novy, M. J., Ross, J. B. A., Turner, E. & Brown, J. A. (1983) J. Steroid Biochem. 19,699-706

25. Petra, P. H., Namkung, P. C., Senear, D. F., McCrae, D. A., Rousslang, K. W., Teller, D. C. & Ross, J. B. A. (1986) J. Steroid Biochem. 25 , 191-200

26. Petra, P. H., Kumar, S., Hayes, R., Ericsson, L. H. & Titani, K. (1986) J. Steroid Biochem. 24,45-49

27. Khan, M. S., Ehrlich, P., Birken, S. & Rosner, W. (1985) Steroids

28. Hammond, G. L., Robinson, P. A., Sugino, H., Ward, D. N. &

29. Gershagen, S., Henningsson, K. & Fernlund, P. (1987) J. Biol.

30. Petra, P. H., Namkung, P. C., Titani, K. & Walsh, K. A. (1986)

17 , 1409-1415

Steroid Biochem. 11, 253-259

45,463-472

Finne, J . (1986) J. Steroid Biochem. 24, 815-824

Chem. 262,8430-8437

33.

34. 35.

36.

37. 38.

39.

40.

41.

42. 43.

44.

45.

46.

47.

48.

49.

50.

51. 52.

53.

54.

55.

56.

57.

58.

59.

60.

61.

62.

63.

(1984) Biochemistry 23,492-497

(1986) Biochemistry 25,7584-7590 Walsh, K. A., Titani, K., Kumar, S., Hayes, R. & Petra, P. H.

Que, B. G. & Petra, P. H. (1987) FEBS Lett. 219,405-409 Hammond, G. L., Underhill, D. A., Smith, C. L., Goping, I. S.,

Harley, M. J., Musto, N. A,, Cheng, C. Y. & Bardin, C. W.

Gershagen, S., Fernlund, P. & Lundwall, A. (1987) FEBS Lett.

Mickelson, K. E. & Petra, P. H. (1974) FEBS Lett. 44 , 34-38 Hunt, D. F., Yates, J . R., 111, Shabanowitz, J., Winston, S. &

Hauer, C. R. (1986) Proc. Natl. Acad. Sci. U. S. A. 8 3 , 6233- 6237

Hunt, D. F., Shabanowitz, J., Yates, J. R., 111, Zhu, N.-Z., Rusell, D. H. & Castro, M. E. (1987) Proc. Natl. Acad. Sci. U. S. A. 8 4 ,

Biemann. K. (1988) Biomed. Enuiron. Mass Spectrom. 16, 99-

(1987) FEBS Lett. 215,100-104

220,129-135

620-623

111

SOC. Chem. Commun., 548-550 Hunt, D. F., Shabanowitz, J. & Yates, J. R., 111 (1987) J. Chem.

Tarr, G. A. (1977) Methods Enzymol. 47,335-357 Hunkapiller, M. W., Hewick, R. M., Dreyer, W. J. & Hood, L. E.

(1983) Methods Enzymol. 9 1 , 399-413 Bidlingmeyer, B. A., Cohen, S. A., & Tarvin, T. L. (1984) J.

Chromatogr. 336,93-104 Michel, H., Hunt, D. F., Shabanowitz, J. & Bennett, J. (1988) J.

Biol. Chem. 263,1123-1130 Trapman, J., Klaassen, P., Kuiper, G. G., van der Korput, J. A.,

Faber, P. W., Van Rooij, H. C., Geurts van Kessel, A., Voor- horst, M. M., Mulder, E. & Brinkmann, A. 0. (1988) Biochem. Biophys. Res. Commun. 153,241-248

Chang, C., Kokontis, J. & Liao, S. (1988) Proc. Natl. Acad. Sci.

Tilley, W. D., Marcelli, M., Wilson, J. D. & McPhaul, M. J.

Dayhoff, M. O., Barker, W. C. & Hunt, L. T. (1983) Methods

Hunt, D. F., Zhu, N. Z. & Shabanowitz, J. (1989) Rapid Commun.

Karas, M. & Hillenkamp, F. (1988) Anal. Chem. 88, 2299-2301 Yamashita, M. & Fenn, J. B. (1984) J. Phys. Chem. 88, 4451-

Covey, T. R., Bonner, R. F. & Shushan, B. I. (1988) Rapid

Fernlund, P. & Laurell, C. B. (1981) J. Steroid Biochem. 14,545-

Cheng, C. Y., Musto, N. A., Gunsalus, G. L. & Bardin, C. W.

Taylor, C. A., Jr., Smith, H. E. & Danzo, B. J. (1980) J. Biol.

Musto, N. A., Gunsalus, G. L. & Bardin, C. W. (1980) Biochem-

Musto, N. A., Gunsalus, G. L. & Bardin, C. W. (1978) Znt. J.

Schmidt. W. N.. Tavlor. C.A. &Danzo. B. J. (1981) Endocrimlay

U. S. A. 85,7211-7215

(1989) Proc. Natl. Acad. Sci. U. S. A. 86, 327-331

Enzymol. 91,524-545

Mass Spectrom. 3 , in press

4459

Commun. Mass Spectrom. 2,249-256

552

(1983) J. Steroid Biochem. 19 , 1379-1389

Chem. 255,7769-7773

i s t ~ 19,2853-2860

Androl. 2 , (suppl.) 424-427 ~I

108,786-794

(1985) J. Steroid Biochem. 22, 127-134

. , "

Cheng, C. Y., Musto, N. A., Gunsalus, G. L. & Bardin, C. W.

Petra, P. H., Titani, K., Walsh, K. A., Joseph, D. R., Hall, S. H. & French, F. S. (1986) in Binding Proteins of Steroid Hormones (Forest, M. G. & Pugeat, M., eds) Vol. 149, pp. 137-142, John Libbey, London

Bardin, C. W., Gunsalus, G. L., Musto, N. A., Cheng, C. Y., Reventos, J., Smith, C., Underhill, D. A. & Hammond, G. (1988) J. Steroid Biochem. 30,131-139

Joseph, D. R., Hall, S. H. & French, F. S. (1986) In Binding Proteins of Steroid Hormones (Forest, M. G. & Pugeat, M., eds) Vol. 149, pp. 123-135, John Libbey, London