The EMBO Journal vol.4 no.4 pp.999-1006, 1985

A developmentally regulated cysteine proteinase in Dictyostelium discoideum

J.G.Williams, M.J.North¹ and H.Mahbubani

Imperial Cancer Research Fund, Mill Hill Laboratories, London NW7 1AD, and ¹Department of Biological Sciences, University of Stirling, Stirling FK9 4LA, UK

Communicated by G.Gerisch

IRL Press Limited, Oxford, England.

We have determined the sequence of a Dictyostelium mRNA encoding a protein with a high degree of homology to plant and animal cysteine proteinases. The degree of homology is highest in the region of the cysteine residue which is transiently acylated during peptide hydrolysis, but all other residues known to be important in catalysis are also conserved. We have named this protein cysteine proteinase 1. There is a hydrophobic signal peptide of 18 amino acids and an additional 99 amino acids at the N terminus, which are not present in other cysteine proteases and which may be cleaved off during processing of the enzyme. There is a single copy of the gene in the Dictyostelium genome. The cysteine proteinase 1 mRNA is absent from growing cells and from cells isolated during the first 6 h of development but it constitutes ~1% of cellular mRNA by 10-12 h of development. During the development of Dictyostelium a major fraction of cellular protein is degraded to provide amino acids and a source of energy. Cysteine proteinase 1 may play a role in this auto-digestion.

Key words: cysteine proteinase/Dictyostelium

Introduction

During the development and differentiation of Dictyostelium discoideum amoebae, from individual amoebae to mature stalk and spore cells, there are major changes in cell structure and metabolism. These are accompanied by changes in the level of activity of a large number of enzymes and, for at least two of these enzymes, UDP glucose pyrophosphorylase (Franke and Sussman, 1973) and glycogen phosphorylase (Thomas and Wright, 1976), the increase in activity during development correlates with an increased rate of synthesis of the polypeptide. There are developmental changes in the relative abundance of actin mRNA (Alton and Lodish, 1977; Kindle and Firtel, 1978), discoidin 1 mRNA (Ma and Firtel, 1978; Williams and Lloyd, 1979) and of the mRNA encoding the sp 96 spore coat protein (Dowds and Loomis, 1984) which correlate with the time of synthesis of the protein. In addition to these mRNAs encoding known proteins, a number of cDNA clones have been isolated which hybridize to developmentally regulated mRNA sequences encoding proteins of unknown function (Williams and Lloyd, 1979; Rowekamp and Firtel, 1980; Mangiarotti et al., 1981). We have taken advantage of the extensive protein databases now in existence to identify the enzymatic function of a polypeptide encoded by a previously well characterized cDNA clone which hybridizes to an abundant developmentally regulated mRNA sequence (Williams and Lloyd, 1979).

A large number of genes have been shown to be first expressed at a late stage of aggregation (Blumberg and Lodish, 1980) and we have previously described the isolation of cDNA clones first expressed at this stage using a clone bank prepared from cells at the 8th hour of development in suspension (with cells developing on a solid substratum this would correspond to a stage slightly after aggregation; Williams and Lloyd, 1979). The cDNA clone pDd11.7.10 was one of four recombinants hybridizing to mRNA sequences which were absent, or present at very low concentrations, during the first 6 h of development but which constituted ~0.2-0.3% of the mRNA population by 9 h of development in suspension.

Using pDd11.7.10 as a probe in quantitative filter hybridization, Bogdanovsky-Sequeval et al. (1984) have analyzed the complete developmental time course of mRNA accumulation. They find the peak of accumulation to be between 8 and 12 h of development, when they estimate that this mRNA constitutes ~1% of the population. When cells in the first few hours of development are exposed to either artificial pulses (Gerisch et al., 1975) or continuous high levels of cAMP (Sampson, 1978), the intracellular cAMP concentration rises and several proteins involved in aggregation appear prematurely (Gerisch et al., 1975; Darmon et al., 1975; Klein, 1975; Town and Gross, 1978). We have shown previously that the mRNA complementary to pDd11.7.10 is induced to accumulate prematurely in the presence of exogenous cAMP (Williams et al., 1980a). The mRNA complementary to pDd11.7.10 is ~1200 nucleotides in length (Williams et al., 1980a; Bogdanovsky-Sequeval et al., 1984) and it encodes a polypeptide of 36 000 daltons (Mahbubani and Williams, unpublished). We have now determined the entire nucleotide sequence of the insert in pDd11.7.10 and shown that the mRNA encodes a cysteine proteinase.

Results

Nucleotide sequence analysis of the cDNA insert in pDd11.7.10

The complete nucleotide sequence of the cDNA insert in pDd11.7.10 was determined as described in the legend to Figure 1. There is a single open reading frame of 907 nucleotides spanning the entire insert. This suggests that sequences from both the 5' and 3' ends of the mRNA are absent from this recombinant. The cDNA clone bank was re-screened with internal restriction fragments from pDd11.7.10 and three additional recombinants were isolated. Nucleotide sequence analysis of these clones showed that clone pDd2 contained additional sequences from the 3' end of the mRNA and that there was an in-frame termination codon. The sequence downstream of the termination codon was typical of non-coding regions in the Dictyostelium genome, with a very high proportion of A and T residues (Kimmel and Firtel, 1982). The insert in pDd2 appears to contain sequences derived from the poly(A) tract of the mRNA. Although it is difficult to be certain of this because of the frequent occurrence of homopolymer tracts in the 3' non-coding regions of Dictyostelium genes, there is an AATAAA element seventeen nucleotides upstream of the poly(A) tract which might act as the signal for polyadenylation (Proudfoot and Brownlee, 1976; Fitzgerald and Schenk, 1981).
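As a rough illustration of the two observations above (the A+T richness of the 3' non-coding region and the presence of an AATAAA element), the sketch below scans the final stretch of 3' sequence as printed in Figure 3 and compares it with one row of coding sequence from the same figure. The helper names are hypothetical, and the strings are copied from the figure as transcribed, so the exact positions should be treated with caution.

```python
# Final row of 3' non-coding sequence from Figure 3 (assumed to be transcribed correctly).
UTR_3 = "AGAATTTAAA" "AATAAATAAA" "ATTTTTTATT" "AAATCTAAAA" "AAAAAAA"

# One row of coding sequence from Figure 3 (the row starting at nucleotide 376).
CODING = "ACTAGAGGTGCTGTTACACCTGTAAAAAATCAAGGTCAATGTGGTAGTTGTTGG"


def at_fraction(seq: str) -> float:
    """Fraction of residues in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def motif_positions(seq: str, motif: str) -> list[int]:
    """0-based start positions of every (possibly overlapping) motif occurrence."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    print("A+T fraction, 3' non-coding row:", round(at_fraction(UTR_3), 2))
    print("A+T fraction, coding row:", round(at_fraction(CODING), 2))
    print("AATAAA start positions:", motif_positions(UTR_3, "AATAAA"))
```

The comparison is only indicative: a single row of coding sequence is a small sample, and Dictyostelium coding regions are themselves fairly A+T rich.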




[Figure 1: restriction maps of the cDNA clones pDd11.7.10, pDd2, pDd7 and pDd3, with HindIII, BclI, EcoRI (RI) and BglII sites marked against a scale of 0-1100 nucleotides.]

Fig. 1. Restriction enzyme cleavage maps of the cDNA clones with details of the nucleotide sequence analysis. The cDNA clone pDd11.7.10 was isolated from a cDNA clone bank prepared by the method of Maniatis et al. (1976); the double-stranded cDNA was inserted into the EcoRI site of the tetracycline resistance vector via poly dG homopolymer tracts (Williams and Lloyd, 1979). The orientation of the insert is shown relative to the HindIII site which lies 370 nucleotides from the EcoRI site in the promoter of the tetracycline resistance gene. The nucleotide sequence of the cDNA insert was determined from the restriction sites indicated using the chemical procedure of Maxam and Gilbert (1980). The two central EcoRI fragments were sub-cloned into the vector pEMBL8 (Dente et al., 1983) and their sequence was determined using the dideoxy procedure (Sanger et al., 1977). In order to isolate additional cDNA clones containing sequences not present in pDd11.7.10, a new cDNA clone bank was constructed (C.J. Pears and J.G. Williams, in preparation) in the KpnI site of the vector p2732B (this is a 2.7-kb ampicillin resistance vector which contains a small polylinker: J. Monahan, personal communication). Approximately 3000 cDNA clones were screened with a mixture of the two internal EcoRI fragments from pDd11.7.10. Three clones hybridized strongly and these were characterized by restriction enzyme analysis and, in the case of pDd2, by partial nucleotide sequence analysis as indicated on the figure. The polylinker in p2732B contains two EcoRI sites which flank a KpnI site (the site used for cloning) and a BglII site. The orientation of the inserts in the clones is shown relative to the EcoRI and BglII sites. Arrows above a clone represent sequence determined from the coding strand and arrows below represent sequence determined from the non-coding strand (0-> indicates that the sequence was determined from pDd11.7.10, *-> from pDd2 and A-> from pDd7). The symbols C--C and G--G indicate the positions of the homopolymer tracts. Vector sequences are indicated by a dashed line.

None of the three clones contained additional sequences from the 5'-proximal region of the mRNA. Repeated attempts to isolate cDNA clones containing such sequences were unsuccessful and we obtained the sequence of the 5' end of the mRNA by primer extension (Figure 2). We estimate that the insert in pDd11.7.10 terminates 115 nucleotides from the 5' end of the mRNA. [The overall length of the mRNA is therefore estimated to be 1140 nucleotides (exclusive of the poly(A) tail), which is in good agreement with estimates obtained by gel electrophoresis of the RNA (Williams et al., 1980a; Bogdanovsky-Sequeval et al., 1984).] The mRNA sequence determined by primer extension extended the open reading frame by 101 nucleotides to an initiation codon. The nucleotide sequence upstream of the initiation codon contained a very high proportion of A and T residues (data not shown), as is typical for non-coding regions of Dictyostelium genes (Kimmel and Firtel, 1982). The nucleotide sequence of the portion of the mRNA present in cDNA clones pDd11.7.10 and pDd2, and of the 5'-proximal coding region determined by primer extension, is presented in Figure 3.
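The arithmetic behind the 115-nucleotide estimate can be written out explicitly. The sketch below uses the values quoted in the text and in the legend to Figure 2 (a full-length extension product of 266 nucleotides, and a primer whose 5' terminus lies 151 nucleotides from the 5' end of the insert). The function name is illustrative, and the final leader-length figure is an inference from these numbers rather than a value stated in the paper.

```python
def nucleotides_upstream_of_insert(product_length: int, primer_offset: int) -> int:
    """Length of mRNA lying 5' of the cDNA insert.

    product_length: length of the full-length primer extension product.
    primer_offset:  distance from the 5' terminus of the primer to the
                    5' end of the cDNA insert.
    """
    if product_length < primer_offset:
        raise ValueError("extension product is shorter than the primer offset")
    return product_length - primer_offset


# Values quoted in the legend to Figure 2.
missing_5prime = nucleotides_upstream_of_insert(266, 151)

# The text reports that primer extension lengthened the open reading frame
# by 101 nucleotides; the remainder would be 5' untranslated leader.
# This subtraction is an inference, not a figure given in the paper.
ORF_EXTENSION = 101
implied_leader = missing_5prime - ORF_EXTENSION

if __name__ == "__main__":
    print("mRNA upstream of the insert:", missing_5prime, "nucleotides")
    print("implied 5' leader:", implied_leader, "nucleotides")
```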

Comparison of the predicted amino acid sequence of the protein encoded by the pDd11.7.10 mRNA with other known proteins

The predicted amino acid sequence of the protein was compared with all proteins in the National Biomedical Research Foundation database using a method similar to that of Kyte and Doolittle (1982). A high degree of homology was apparent with papain and actinidin, the two cysteine proteinase sequences in the database. No other proteins showed a significant degree of homology. The optimal alignment of the protein sequence with the two plant cysteine proteinases, papain and actinidin, and with the mammalian proteinases, cathepsin H and cathepsin B, is shown in Figure 4.

Overall the homology is greatest in those portions of the protein molecule which correspond to the N-terminal and C-terminal regions of the other proteinases. For example, the protein has 54% homology with residues 1-70 of papain, 21% homology with residues 71-140 and 39% homology with residues 141-212. The most striking homology is around the cysteine residue which corresponds to the active site cysteine (Cys-25) of papain. All other residues known to be important in the active site of papain are also conserved. These residues are indicated in the legend to Figure 5, which is a diagrammatic representation of the tertiary structure of papain where residues also present in the Dictyostelium protein are indicated. It shows that regions flanking the central cleft, including those of the active site, are highly conserved between the two proteins while other parts of the two molecules display a lower degree of homology.

The fundamental structure of the proteins is likely to be the same, especially as all six cysteine residues which form the disulphide bridges of papain are present. Similar observations may be made by comparing the protein with actinidin, whose tertiary structure is also known (Baker, 1977, 1980). This degree of homology, between organisms as widely diverged in evolution as Dictyostelium and higher plants and animals, strongly suggests that these proteins serve a similar function. We conclude that the mRNA complementary to pDd11.7.10 encodes a cysteine proteinase, which we have chosen to term cysteine proteinase 1.

[Figure 2: autoradiogram of the primer extension products, lanes 1 and 2, with size markers running from 75 to 1631 nucleotides.]

Fig. 2. Analysis of the 5' terminus of the mRNA by primer extension. Poly(A)+ RNA isolated from cells at the 13th hour of development was analysed by primer extension as described in Materials and methods. The full-length primer extension product is estimated to be 266 nucleotides in length. Since the 5' terminus of the primer is 151 nucleotides from the 5' end of the insert (see Figure 3), the 5' end of the RNA must lie 115 nucleotides upstream of the end of the insert. Lane 1, primer extension with 50 ng of poly(A)+ RNA. Lane 2, a radioactively labelled HinfI digest of pAT153. This was a large-scale primer extension reaction in which 1 µg of poly(A)+ RNA was annealed with an excess of primer. After autoradiography the extended primer (20 000 c.p.m.) was excised from the gel and subjected to nucleotide sequence analysis by the procedure of Maxam and Gilbert (1980).

Analysis of the organization and expression of the cysteine proteinase 1 gene

We have shown that the pDd11.7.10 mRNA is absent during growth and during the first 6 h of development but that it constitutes ~0.3% of the mRNA population by 9 h of development (Williams and Lloyd, 1979). This study was performed using cells developing in suspension and we did not determine the mRNA abundance at times after 9 h of development. Such an analysis is shown in Figure 6 using RNA isolated from bacterially grown cells developing on agar.

The peak of accumulation is at 10-12 h of development, with a subsequent decline in the relative abundance of the mRNA. Using pDd11.7.10 as a probe, Bogdanovsky-Sequeval et al. (1984) have determined the concentration of the mRNA by quantitative filter hybridization. They obtained a developmental profile similar to that shown in Figure 6 and they estimate that the mRNA comprises 1% of the population at the time of maximal expression (10-12 h of development). The same authors have also used the pDd11.7.10 clone as a probe in Southern transfer analysis of genomic DNA. They showed that the gene was present in single copy but did not derive a restriction map of the gene. We have confirmed and extended their results by establishing the restriction map shown in Figure 7.

In all the digests analyzed, only a single restriction fragment hybridized to the probe. The hybridization and washing conditions used were of relatively low stringency and we would have expected to detect DNA sequences as much as 10% diverged from the cysteine proteinase 1 gene. We conclude therefore that there are no other genes with a high degree of homology with cysteine proteinase 1. There is, however, at least one other cysteine proteinase 1 related gene, but this has a relatively low degree of homology with cysteine proteinase 1 (C.J. Pears and J.G. Williams, unpublished results).

Discussion

Cysteine proteinase 1 protein displays a very high degree of homology with four plant and animal cysteine proteinases for which complete amino acid sequences are available. The degree of homology is higher with papain, actinidin and cathepsin H than with cathepsin B. It has been suggested (Takio et al., 1983) that cathepsin B diverged from a common ancestral gene before cathepsin H; our data suggest, therefore, that cysteine proteinase 1 diverged after cathepsin B. In the alignment shown in Figure 4, there are 40 residues common to all four of the animal and plant enzymes. Cysteine proteinase 1 has all but four of these, and in three cases there are only minor differences (Trp-150 in place of Phe; Gly-170 in place of Ala; Leu-208 in place of Ile). The only major change is at position 180, where asparagine replaces tyrosine. In contrast, cathepsin B lacks 18 of the 54 residues common to cysteine proteinase 1, papain, actinidin and cathepsin H, and many of the differences involve major changes of residue. As Takio et al. (1983) found with their comparisons, the greatest difficulty in aligning cysteine proteinase sequences occurred with the central regions of the molecules, and a liberal inclusion of gaps was necessary.
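The percentage homology figures quoted in the Results, and the counts of shared residues above, both come from comparing aligned sequences position by position. A minimal sketch of such a calculation follows. It assumes that gaps are written as dots (as in Figure 4) and that gapped columns are left out of the denominator; the paper does not state which denominator was used, so this is one plausible convention rather than the authors' method. The example strings are made-up placeholders, not residues from the alignment.

```python
GAP = "."  # Figure 4 marks gaps with dots


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical residues over the ungapped columns of an alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == GAP or res_b == GAP:
            continue  # skip columns where either sequence has a gap
        compared += 1
        identical += (res_a == res_b)
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned segments, for illustration only.
    print(percent_identity("ACDEFGHIKL", "ACDQFGHIRL"))
    print(percent_identity("AC.DE", "ACFDK"))
```

Applied to windows of a real alignment (for example the columns matching papain residues 1-70, 71-140 and 141-212), the same function would give region-by-region figures of the kind reported in the Results.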


   1  ATG AAA GTT ATA TTA TTA TTT GTT TTA GCT GTT TTT ACT GTT TTT GTT TCA
          Lys Val Ile Leu Leu Phe Val Leu Ala Val Phe Thr Val Phe Val Ser  16

  52  AGT AGA GGA ATT CCA CCA GAA GAA CAA AGT CAA TTC CTT GAA TTT CAA GAT AAA
      Ser Arg Gly Ile Pro Pro Glu Glu Gln Ser Gln Phe Leu Glu Phe Gln Asp Lys  34

 106  TTC AAT AAA AAA TAT TCA CAT GAA GAA TAT TTG GAA AGA TTT GAA ATT TTT AAA
      Phe Asn Lys Lys Tyr Ser His Glu Glu Tyr Leu Glu Arg Phe Glu Ile Phe Lys  52

 160  AGC AAT TTA GGA AAA ATT GAA GAA TTA AAT CTA ATA GCC ATT AAT CAC AAA GCT
      Ser Asn Leu Gly Lys Ile Glu Glu Leu Asn Leu Ile Ala Ile Asn His Lys Ala  70

 214  GAT ACT AAA TTT GGT GTA AAC AAG TTT GCA GAT CTT TCC AGT GAC GAA TTT AAA
      Asp Thr Lys Phe Gly Val Asn Lys Phe Ala Asp Leu Ser Ser Asp Glu Phe Lys  88

 268  AAT TAT TAT TTA AAT AAT AAG GAA GCA ATA TTC ACT GAT GAC CTT CCA GTT GCT
      Asn Tyr Tyr Leu Asn Asn Lys Glu Ala Ile Phe Thr Asp Asp Leu Pro Val Ala  106

 322  GAT TAT CTT GAT GAT GAA TTC ATT AAT TCA ATT CCA ACT GCA TTT GAT TGG AGA
      Asp Tyr Leu Asp Asp Glu Phe Ile Asn Ser Ile Pro Thr Ala Phe Asp Trp Arg  124

 376  ACT AGA GGT GCT GTT ACA CCT GTA AAA AAT CAA GGT CAA TGT GGT AGT TGT TGG
      Thr Arg Gly Ala Val Thr Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Cys Trp  142

 430  TCA TTT TCA ACT ACT GGT AAT GTT GAG GGA CAA CAT TTC ATT AGT CAG AAT AAA
      Ser Phe Ser Thr Thr Gly Asn Val Glu Gly Gln His Phe Ile Ser Gln Asn Lys  160

 484  TTA GTT TCA TTA TCA GAG CAA AAC TTG GTA GAT TGT GAT CAT GAG TGT ATG GAA
      Leu Val Ser Leu Ser Glu Gln Asn Leu Val Asp Cys Asp His Glu Cys MET Glu  178

 538  TAT GAA GGT GAA GAA GCT TGT GAT GAG GGT TGT AAT GGT GGT CTT CAA CCA AAT
      Tyr Glu Gly Glu Glu Ala Cys Asp Glu Gly Cys Asn Gly Gly Leu Gln Pro Asn  196

 592  GCA TAT AAT TAT ATC ATT AAA AAT GGT GGA ATT CAA ACA GAA TCT TCA TAT CCT
      Ala Tyr Asn Tyr Ile Ile Lys Asn Gly Gly Ile Gln Thr Glu Ser Ser Tyr Pro  214

 646  TAC ACT GCT GAA ACA GGT ACA CAA TGT AAC TTT AAC TCT GCC AAT ATT GGT GCA
      Tyr Thr Ala Glu Thr Gly Thr Gln Cys Asn Phe Asn Ser Ala Asn Ile Gly Ala  232

 700  AAG ATT TCC AAT TTT ACA ATG ATC CCA AAG AAT GAA ACT GTA ATG GCT GGG TAC
      Lys Ile Ser Asn Phe Thr MET Ile Pro Lys Asn Glu Thr Val MET Ala Gly Tyr  250

 754  ATC GTT AGT ACT GGA CCA CTC GCA ATT GCT GCT GAT GCT GTT GAG TGG CAA TTT
      Ile Val Ser Thr Gly Pro Leu Ala Ile Ala Ala Asp Ala Val Glu Trp Gln Phe  268

 808  TAT ATT GGT GGT GTA TTT GAT ATT CCA TGT AAT CCA AAT TCA CTT GAT CAT GGT
      Tyr Ile Gly Gly Val Phe Asp Ile Pro Cys Asn Pro Asn Ser Leu Asp His Gly  286

 862  ATT TTA ATT GTT GGT TAC TCT GCT AAA AAT ACA ATT TTC CGT AAA AAT ATG CCA
      Ile Leu Ile Val Gly Tyr Ser Ala Lys Asn Thr Ile Phe Arg Lys Asn MET Pro  304

 916  TAT TGG ATT GTA AAG AAT TCT TGG GGT GCA GAT TGG GGA GAA CAA GGA TAC ATT
      Tyr Trp Ile Val Lys Asn Ser Trp Gly Ala Asp Trp Gly Glu Gln Gly Tyr Ile  322

 970  TAT TTA AGA AGA GGA AAG AAT ACA TGT GGT GTA TCA AAC TTT GTT TCA ACT AGT
      Tyr Leu Arg Arg Gly Lys Asn Thr Cys Gly Val Ser Asn Phe Val Ser Thr Ser  340

1024  ATA ATT
      Ile Ile

      TAAATTTATA CCAAATATTT AGTTTAGAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA

1078  AGAATTTAAA AATAAATAAA ATTTTTTATT AAATCTAAAA AAAAAAA

Fig. 3. Nucleotide sequence of the cysteine proteinase 1 mRNA. The nucleotide sequence was determined from the cDNA clones pDd11.7.10 and pDd2, and by primer extension on the mRNA, as described in Figures 1 and 2. The location of the BglII-Alu restriction fragment used for primer extension is indicated by a solid line over the sequence. The nucleotide sequence upstream of the initiation codon is not presented because it comprised a long poly(A) homopolymer tract in which it was impossible to determine the precise number of residues.
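The amino acid line of Figure 3 can be checked against the nucleotide line by translating the codons. The sketch below is one way to do this, assuming the standard genetic code; the table construction and function name are illustrative and are not part of the original analysis. The two test strings are the first row of the figure and the last two codons plus the termination codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a termination codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acid codes.

    Any trailing one or two bases that do not form a full codon are ignored.
    """
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First row of Figure 3 (nucleotides 1-51).
FIRST_ROW = "ATG AAA GTT ATA TTA TTA TTT GTT TTA GCT GTT TTT ACT GTT TTT GTT TCA"
# Last two codons of the open reading frame followed by the termination codon.
LAST_CODONS = "ATA ATT TAA"

if __name__ == "__main__":
    print(translate(FIRST_ROW))
    print(translate(LAST_CODONS))
```

Translating the first row gives a methionine followed by the sixteen residues printed beneath it in the figure, which is consistent with the residue numbering in the figure starting at the lysine.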

The mechanism of action of papain has been studied in great detail and the tertiary structure determined by X-ray crystallography (see Lowe, 1976). The molecule is formed from two domains with a deep cleft between them (Figure 5). The binding site for the substrate straddles the cleft, with Cys-25 and His-159 in close contact although on opposite sides of the groove. Peptide bonds are hydrolyzed through an acyl enzyme pathway with Cys-25 as the acylated residue. The active form of the enzyme is believed to be a tautomer of the form (RSH·Im) or (RS⁻·HIm⁺), where RSH is Cys-25 and Im is His-159 (Figure 5). A number of other residues are known to be important in catalysis, and all of the papain active site residues are present in the cysteine proteinase 1 polypeptide (legend to Figure 5).

[Figure 4: five-row amino acid alignment of proteinase 1, papain, actinidin, cathepsin H and cathepsin B, numbered from 10 to 220; the aligned residues could not be recovered from this transcript.]

Fig. 4. Alignment of the amino acid sequence of cysteine proteinase 1 with other cysteine proteases. Comparison of the amino acid sequences of cysteine proteinase 1 (this paper), papain (Mitchel et al., 1970), actinidin (Carne and Moore, 1978), rat liver cathepsin H (Takio et al., 1983) and rat liver cathepsin B (Takio et al., 1983). Gaps are indicated by dots. Those residues which cysteine proteinase 1 has in common with any of the other proteinases are enclosed in boxes. The N-terminal portion of cysteine proteinase 1 is omitted and therefore the numbering of amino acid residues does not correspond to that used in Figure 3. The residue numbers are those of cysteine proteinase 1, numbering from the first conserved amino acids. Thus residue 1 corresponds to the N-terminal residue of the other enzymes and to residue 118 in the cysteine proteinase 1 polypeptide.

In addition to the portion corresponding to known cysteine proteinases, the cysteine proteinase 1 protein contains a sequence of 117 amino acid residues at the N terminus which is not present in any of the other enzymes. Mammalian cysteine proteinases, and cysteine proteinases already described in Dictyostelium (North, 1982), are lysosomal enzymes and would therefore be expected to be synthesized with a cleavable N-terminal leader sequence. The extreme N terminus of the sequence that we have established contains a putative signal peptide with a run of 17 non-polar or uncharged amino acids preceded by a charged residue (lysine) and followed by a charged residue (glutamic acid). The remaining N-terminal amino acids, which we have shown to be present in cysteine proteinase 1 but absent from other cysteine proteinases, may be cleaved off the protein during further processing of a precursor form of the mature enzyme. This is the first cysteine proteinase to be analysed at the mRNA level and it may be that all cysteine proteinases are synthesized as a pro-enzyme. Takahashi et al. (1984) have recently reported the presence of a larger, 35 000 dalton, protein in their preparations of porcine spleen cathepsin B which they suggest might be a precursor form of the enzyme. Alternatively, the additional sequence may be retained in the mature cysteine proteinase 1 protein, where it might play a functional role. It seems very unlikely, however, that this would be related to the enzymatic cleavage of the substrate as all essential residues are present within the


Fig. 5. Amino acid homologies between papain and cysteine proteinase 1. The tertiary structure of papain is represented, with only the α-carbon of each amino acid residue being shown and with disulphide bridges represented by hatched bars. Based on the alignment presented in Figure 4, residues common to both papain and cysteine proteinase 1 are represented by closed circles. The cysteine and histidine residues indicated are Cys-25 and His-159. Other residues known to be important in the active site are Gln19, Gly23, Ser24, Asp158, His159, Asn15, Asn175, Ser176 and Trp177 and, with the alignment presented in Figure 4, these residues are conserved between the two sequences.


Fig. 6. The accumulation of cysteine proteinase 1 RNA during development. Total cellular poly(A)+ RNA, isolated from cells at the indicated stages of development, was resolved on an agarose gel, transferred to a filter and hybridized with a cysteine proteinase 1 probe as described in Materials and methods. The concentration of poly(A)+ RNA in each sample was determined by hybridization with [3H]poly(U) and, based on this assay, 2.5 µg of poly(A)+ RNA was loaded. The specific activity of the nick translated probe was 10⁸ d.p.m./µg and the concentration of probe in the hybridization was 10 ng/ml. The autoradiogram was exposed for 16 h with an intensification screen. The cysteine proteinase 1 RNA is 1200 nucleotides in length (Williams et al., 1980a; Bogdanovsky-Sequeval et al., 1984).


Fig. 7. Restriction enzyme cleavage map of the cysteine proteinase 1 gene. The cysteine proteinase 1 gene contains a single HindIII site located at the approximate centre of the gene. The HindIII site is so close to the central EcoRI site (Figure 1) that the region of overlap between the 3'-proximal EcoRI restriction fragment and any 5'-proximal HindIII fragment generated from genomic DNA is too small to allow the formation of a stable hybrid. This restriction map was generated by double digestion of genomic DNA with HindIII and the other enzymes indicated. The DNA was transferred to a filter and hybridized firstly with the 3'-proximal EcoRI fragment from pDdl 1.7.10 and then with the 5'-proximal EcoRI fragment. In short exposures of the autoradiograms the 3' probe was found to hybridize to only one fragment in each digest and the 5' probe to another single fragment and this information was used to generate the restriction map.

region common to other cysteine proteinases.

The presence of cysteine proteinases in Dictyostelium has been shown previously (Fong and Rutherford, 1978; North and Harwood, 1979) and two enzymes, termed proteinase I (Gustafson and Thon, 1979) and proteinase B (North and Whyte, 1984), have been purified and characterized. Paradoxically, the developmental changes in total intracellular cysteine proteinase activity are the opposite of those now anticipated for cysteine proteinase 1 from the time course of accumulation of its mRNA. Cysteine proteinase activity decreases during the developmental phase (Fong and Rutherford, 1978; Gustafson and Thon, 1979; North, 1982) and increases again only during spore germination (North and Cotter, 1984; Jackson and Cotter, 1984). There are a number of possible reasons why activity changes that reflect the appearance of cysteine proteinase 1 have not been observed. Firstly, the substrates used previously (various proteins, peptide and amino acid derivatives) may not be suitable for cysteine proteinase 1, either because it has comparatively low activity towards them and would therefore be masked by other cysteine proteinases present, or because the substrate is not hydrolyzed at all. Some mammalian cysteine proteinases, for example cathepsins N and T, appear to have a more restricted substrate specificity than others (Katunuma and Kominami, 1983) and cysteine proteinase 1 might resemble them. It is not possible to predict the substrate specificity of cysteine proteinase 1, but there are some features of its sequence which could be responsible for a novel substrate specificity. One such is the presence of an octapeptide sequence (Met-Pro-Tyr-Trp-Ile-Val-Lys-Asn) between Asn-186 and Ser-195, which correspond respectively to the papain active site residues Asn-175 and Ser-176. No such sequence is found in any of the other proteinases. Alternative explanations for the absence of changes in activity which could be related to cysteine proteinase 1 include the possible presence of an endogenous inhibitor in cell extracts, synthesis of cysteine proteinase 1 in an inactive form or secretion of cysteine proteinase 1 immediately after synthesis.

The appearance of cysteine proteinase 1 mRNA at a time when the net cysteine proteinase activity of the cells is decreasing raises questions concerning its function. The proteinases of vegetative cells are involved in the breakdown of nutrient protein within digestive vacuoles and activity due to cysteine proteinases is present in a subcellular fraction containing lysosomes (North, 1982). During development there is extensive breakdown of intracellular protein, a process in which cysteine proteinases have been implicated (Fong and Bonner, 1979). This commences at the start of the developmental phase, at least 6 h before cysteine proteinase 1 mRNA begins to accumulate. It is possible that cysteine proteinase 1 is synthesized to replace vegetative cysteine proteinases so that intracellular protein breakdown can continue during later stages of development. An alternative role for cysteine proteinase 1 might be in the developmental control of specific proteins through activation or inactivation by limited proteolysis. Cysteine proteinase activity and protein breakdown are both greater in pre-stalk cells than in pre-spore cells (Fong and Rutherford, 1978), and so it will be of interest to determine whether the cysteine proteinase 1 protein is found at higher levels in the former than in the latter cells. Answers to these questions will, however, only be possible once the cysteine proteinase 1 protein has been purified and characterized and its localization within cells and in the cell population established.
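The octapeptide insertion mentioned in the Discussion can be checked with simple residue arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that the positions quoted in the text (Asn-186 and Ser-195 in cysteine proteinase 1, Asn-175 and Ser-176 in papain) use the Figure 4 alignment numbering.

```python
# Three-letter to one-letter codes for the residues that occur in the octapeptide.
THREE_TO_ONE = {
    "Met": "M", "Pro": "P", "Tyr": "Y", "Trp": "W",
    "Ile": "I", "Val": "V", "Lys": "K", "Asn": "N",
}

OCTAPEPTIDE = "Met-Pro-Tyr-Trp-Ile-Val-Lys-Asn"


def residues_between(first: int, second: int) -> int:
    """Number of residues lying strictly between two positions."""
    if second <= first:
        raise ValueError("second position must follow the first")
    return second - first - 1


def one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))


# Residues separating the two flanking positions in each enzyme.
cp1_gap = residues_between(186, 195)      # cysteine proteinase 1
papain_gap = residues_between(175, 176)   # adjacent residues in papain

# Extra residues present in cysteine proteinase 1 relative to papain.
insertion_length = cp1_gap - papain_gap

if __name__ == "__main__":
    print(insertion_length, one_letter(OCTAPEPTIDE))
```

The positions quoted in the text are therefore consistent with an insertion of exactly eight residues relative to papain, matching the length of the octapeptide.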

Materials and methods

Cell culture and isolation
D. discoideum strain V12M2 (obtained from G. Gerisch) was grown at 22°C on SM agar (Sussman, 1966) in two-membered culture with Enterobacter aerogenes. Conditions for development of cells on agar and for the isolation of poly(A)+ RNA have been described previously (Tsang et al., 1982).

Nucleotide sequence analysis of cysteine proteinase 1 cDNA clones
Restriction fragments to be sequenced by the chemical procedure (Maxam and Gilbert, 1980) were labelled at the 5' termini with T4 polynucleotide kinase and at the 3' termini with AMV reverse transcriptase (Williams et al., 1980b). Restriction fragments to be sequenced by the chain termination procedure (Sanger et al., 1977) were sub-cloned into the vector pEMBL8 (Dente et al., 1983) which contains the replication origin from phage f1. Single-stranded DNA was obtained by phage super-infection.

Analysis of RNA
Primer extension was performed using a single-stranded primer of 36 nucleotides prepared from pDdl 1.7.10 by sequential cleavage with BglII and AluI (the location of this primer is shown in Figure 3). The primer was labelled with T4 polynucleotide kinase at the 5' terminus, after cleavage with BglII, but before cleavage with AluI. The two strands of the fragment were separated by electrophoresis on a 10% urea-acrylamide gel. A 5-fold molar excess of the primer was hybridized with 10 µg of poly(A)+ RNA. Hybridization and primer extension were performed exactly as described by Devine et al. (1982) except that the temperature of hybridization was 55°C. The products of the primer extension reaction were analyzed on a 10% urea-acrylamide gel in which the Xylene Cyanol marker dye was run 20 cm through the 40 cm gel.

Analysis by Northern transfer (Alwine et al., 1977) was performed on RNA which was denatured with glyoxal (McMaster and Carmichael, 1974), resolved by electrophoresis in a 1.5% agarose gel and transferred to a Gene Screen Plus membrane (New England Nuclear). Cysteine proteinase 1 RNA was detected by hybridization with the two internal EcoRI fragments of pDdl 1.7.10 which were labelled by nick translation to a specific activity of 1-2 × 10⁸ d.p.m./µg. Hybridization was performed at 34°C in 50% formamide, 5 × SET containing 0.1% BSA, 0.1% Ficoll, 0.1% polyvinylpyrrolidone, 50 µg/ml salmon sperm DNA, 0.3% SDS and 10 µg/ml poly(A). The filters were washed in 1 × SET, 0.1% SDS at 65°C. 1 × SET contains 0.15 M NaCl, 0.03 M Tris-HCl pH 8.0 and 0.2 mM EDTA.

DNA analysis
Restriction enzyme analysis of cDNA clones and of genomic DNA was performed under digestion conditions recommended by the manufacturer (New England Biolabs) with a 2- to 5-fold excess of enzyme. Horizontal gel electrophoresis and Southern transfer analysis of DNA were performed using standard procedures (Maniatis et al., 1982). Hybridization was performed as described by Jeffreys et al. (1980) in 1 × SSC containing 0.2% BSA, 0.2% Ficoll, 0.2% polyvinylpyrrolidone, 50 µg/ml salmon sperm DNA, 0.1% SDS and 9% dextran sulphate (Sigma 500 000 daltons). The filters were washed at 65°C in 1 × SSC containing 0.1% SDS.
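The SET concentrations quoted under Analysis of RNA lend themselves to a small worked calculation. The following sketch is illustrative only: it scales the stated 1 × SET composition to the 5 × SET used for the Northern hybridization, and the dictionary and function names are hypothetical.

```python
# 1 x SET composition as stated in the text, expressed in millimolar (mM).
SET_1X_MM = {
    "NaCl": 150,             # 0.15 M
    "Tris-HCl pH 8.0": 30,   # 0.03 M
    "EDTA": 0.2,             # 0.2 mM
}


def scale_buffer(stock_mm: dict, fold: float) -> dict:
    """Return component concentrations (mM) for a buffer at the given fold strength."""
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    return {component: conc * fold for component, conc in stock_mm.items()}


# 5 x SET for hybridization; the wash uses 1 x SET unchanged.
hybridization_set = scale_buffer(SET_1X_MM, 5)

if __name__ == "__main__":
    for component, conc in hybridization_set.items():
        print(f"{component}: {conc:g} mM")
```

On these figures the hybridization buffer would contain 750 mM NaCl, 150 mM Tris-HCl and 1 mM EDTA before the formamide and other additives are taken into account.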

Acknowledgements
We would like to thank D. Banville, P. Mason and C. Pears for helpful comments on the manuscript and R. White for a gift of tailed p2732B vector.

References
Alton,H.T. and Lodish,H. (1977) Dev. Biol., 60, 180-206.
Alwine,J.C., Kemp,J.P. and Stark,G.R. (1977) Proc. Natl. Acad. Sci. USA, 74, 5350-5354.
Baker,E.N. (1977) J. Mol. Biol., 115, 263-277.
Baker,E.N. (1980) J. Mol. Biol., 141, 441-484.
Blumberg,D.D. and Lodish,H.F. (1980) Dev. Biol., 78, 285-300.
Bogdanovsky-Sequeval,D., Ayala,J., Mathieu,M., Presse,F. and Felenbok,B. (1984) Biol. Cell., 50, 217-222.
Carne,A. and Moore,C.H. (1978) Biochem. J., 173, 73-83.
Darmon,M., Brachet,P. and Pereira da Silva,L. (1975) Proc. Natl. Acad. Sci. USA, 72, 3163-3166.
Dente,L., Cesarini,G. and Cortese,R. (1983) Nucleic Acids Res., 11, 1645-1655.
Devine,J.M., Tsang,A.S. and Williams,J.G. (1982) Cell, 28, 793-800.
Dowds,B.C. and Loomis,W.F. (1984) Mol. Cell. Biol., 4, 2273-2278.
Fitzgerald,M. and Schenk,T. (1981) Cell, 24, 251-260.
Fong,D. and Rutherford,C.L. (1978) J. Bacteriol., 134, 521-527.
Fong,D. and Bonner,J.T. (1979) Proc. Natl. Acad. Sci. USA, 76, 6481-6485.
Franke,J. and Sussman,M. (1973) J. Mol. Biol., 81, 173-185.
Gerisch,G., From,H., Huesgen,A. and Wick,U. (1975) Nature, 255, 547-549.
Gustafson,G.L. and Thon,L.A. (1979) J. Biol. Chem., 254, 12471-12478.
Jackson,D.P. and Cotter,D.A. (1984) Arch. Microbiol., 137, 205-208.
Jeffreys,A.J., Wilson,V., Wood,D., Simons,J.P., Kay,R.M. and Williams,J.G. (1980) Cell, 21, 555-564.
Katunuma,N. and Kominami,E. (1983) Curr. Top. Cell. Regul., 22, 71-101.
Kimmel,A.R. and Firtel,R.A. (1982) in Loomis,W.F. (ed.), The Development of Dictyostelium discoideum, Academic Press Inc., NY, pp. 234-316.
Kindle,K.L. and Firtel,R.A. (1978) Cell, 15, 763-778.
Klein,C. (1975) J. Biol. Chem., 250, 7134-7138.
Kyte,J. and Doolittle,R.F. (1982) J. Mol. Biol., 157, 105-132.
Lowe,G. (1976) Tetrahedron, 32, 291-302.
Ma,G.C.L. and Firtel,R.A. (1978) J. Biol. Chem., 253, 3924-3932.
Mangiarotti,G., Chung,S., Zuker,C. and Lodish,H.F. (1981) Nucleic Acids Res., 9, 947-963.
Maniatis,T., Fritsch,E. and Sambrook,J. (1982) in Molecular Cloning. A Laboratory Manual, Published by Cold Spring Harbor Laboratory Press, NY.
Maxam,A. and Gilbert,W. (1980) Methods Enzymol., 65, 499-560.
McMaster,G.K. and Carmichael,G.G. (1974) Proc. Natl. Acad. Sci. USA, 74, 5350-5354.
Mitchel,R.E.J., Chaiken,I.M. and Smith,E.L. (1970) J. Biol. Chem., 245, 3485-3492.


North,M.J. (1982) Exp. Mycol., 6, 345-352.
North,M.J. and Harwood,J.M. (1979) Biochim. Biophys. Acta, 566, 222-233.
North,M.J. and Cotter,D.A. (1984) Exp. Mycol., 8, 47-54.
North,M.J. and Whyte,A. (1984) J. Gen. Microbiol., 130, 123-134.
Proudfoot,N.J. and Brownlee,G.G. (1976) Nature, 263, 211-214.
Rowekamp,W. and Firtel,R.A. (1980) Dev. Biol., 79, 409-418.
Sampson,J. (1978) Dev. Biol., 67, 54-64.
Sanger,F., Nicklen,S. and Coulson,A.R. (1977) Proc. Natl. Acad. Sci. USA, 74, 5463-5467.
Sussman,M. (1966) Methods Cell Physiol., 2, 397-410.
Takahashi,T., Dehdarani,A.H., Schmidt,P.G. and Tang,J. (1984) J. Biol. Chem., 259, 9874-9882.
Takio,K., Towatari,T., Katunuma,N., Teller,D.C. and Titani,K. (1983) Proc. Natl. Acad. Sci. USA, 80, 3666-3670.
Thomas,D.A. and Wright,B.E. (1976) J. Biol. Chem., 251, 1253-1257.
Town,C. and Gross,J.D. (1978) Dev. Biol., 63, 412-420.
Tsang,A.S., Mahbubani,H. and Williams,J.G. (1982) Cell, 31, 375-382.
Williams,J.G. and Lloyd,M.M. (1979) J. Mol. Biol., 129, 19.
Williams,J.G., Lloyd,M.M. and Devine,J.M. (1979) Cell, 17, 903-913.
Williams,J.G., Tsang,A.S. and Mahbubani,H. (1980a) Proc. Natl. Acad. Sci. USA, 77, 7171-7175.
Williams,J.G., Kay,R.M. and Patient,R.K. (1980b) Nucleic Acids Res., 8, 4247-4258.

Received on 31 December 1984
