
JOURNAL OF BACTERIOLOGY, Feb. 1988, p. 810-816, Vol. 170, No. 2
0021-9193/88/020810-07$02.00/0
Copyright © 1988, American Society for Microbiology

Sequence Analysis of the Streptococcus mutans Fructosyltransferase Gene and Flanking Regions

TERUAKI SHIROZA AND HOWARD K. KURAMITSU*

Department of Microbiology-Immunology, Northwestern University Medical-Dental Schools, Chicago, Illinois 60611

Received 11 September 1987/Accepted 19 November 1987

The nucleotide sequence of the ftf gene from Streptococcus mutans GS-5 was determined. The deduced amino acid sequence indicates that the unprocessed fructosyltransferase gene product has a molecular weight of 87,600. A typical streptococcal signal sequence is present at the amino terminus of the protein. The processed enzyme is relatively hydrophilic and has a pI of 5.66. An inverted repeat structure was detected upstream from the ftf gene and may function in the regulation of fructosyltransferase expression. Sequencing of the regions flanking the gene revealed the presence of four other putative open reading frames (ORFs). Two of these, ORFs 2 and 3, appear to code for low-molecular-weight proteins containing amino acid sequences sharing homology with several gram-positive bacterial DNA-binding proteins. In addition, ORF 3 is transcribed from the ftf DNA coding strand. Partial sequencing of ORF 4 suggests that its gene product may be an extracellular protein.

Streptococcus mutans has been implicated as the principal causative agent in the development of human dental caries (13). Extensive investigations have focused on the synthesis of water-insoluble glucan by these organisms since these polymers play a role in the colonization of tooth surfaces. Many strains of S. mutans are also capable of synthesizing extracellular fructan polymers (3). However, the role of these polysaccharides in the cariogenicity of S. mutans has not been elucidated. Recent evidence from this laboratory (S. Sato and H. K. Kuramitsu, unpublished results) suggested that mutants of S. mutans GS-5 defective in fructosyltransferase (FTF; EC 2.4.1.10) activity still display normal sucrose-dependent colonization of smooth surfaces in vitro. However, since fructans can be degraded by several organisms present in human dental plaque (25), these polymers could function as a reserve source of carbohydrate for plaque microorganisms.

Although the S. mutans ftf gene coding for FTF activity has been recently isolated and characterized (1, 8), little information is currently available regarding the regulation of expression of this gene. Growth of S. mutans Ingbritt in continuous culture indicated that the enzyme is constitutively expressed (27). In contrast, another enzyme catalyzing the synthesis of fructan from sucrose, levansucrase from Bacillus subtilis, was shown to be inducible by sucrose (12). An analysis of various levansucrase mutants has indicated that multiple genes (sacR, sacS, sacQ, and sacH) may be involved in the regulation of levansucrase expression (12). In addition, more recent sequence analysis of the levansucrase gene (sacB) suggested the presence of a potential termination structure upstream from this gene (23). This site could be involved in the regulation of gene expression (20).

To examine the regulation of the S. mutans ftf gene, nucleotide sequencing of the gene was carried out. The present report describes the sequence of the intact gene along with both upstream and downstream flanking sequences. Several potential open reading frames (ORFs) were detected in these regions, and this suggests that the product of one of these genes could be involved in gene regulation.

* Corresponding author.

MATERIALS AND METHODS

Plasmids. The construction of plasmid pSS22 has been recently described (18). The plasmid encoding the intact ftf gene, pTS102, was identified in a HindIII clone bank of S. mutans GS-5 chromosomal DNA. Since Southern blot analysis indicated that the ftf gene resides on a 4.5-kilobase (kb) HindIII fragment (data not shown), DNA fragments of 4 to 6 kb were isolated from an agarose gel following HindIII digestion of chromosomal DNA. The purified fragments were ligated to HindIII-cleaved vector pUC8. The ligation mixture was transformed into Escherichia coli JM83, and transformants harboring chimeric plasmids were selected on Luria-Bertani (LB) agar plates containing ampicillin (50 µg/ml) and 5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-Gal). Transformants expressing FTF activity were initially identified following replica plating of the clone bank onto LB agar-ampicillin plates supplemented with 1% sucrose. Colonies growing poorly in the presence of sucrose (synthesizing large amounts of intracellular fructan [2]) were isolated, grown in small samples (2 ml), and assayed for FTF activity. In this manner, one colony expressing FTF activity was identified. The plasmid contained in this transformant, pTS102, was used for sequencing the intact ftf gene.

DNA manipulations. DNA isolation, endonuclease restrictions, ligations, transformation of competent E. coli cells, and Southern blot analysis were carried out as recently described (2).
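The size-selection step behind the HindIII clone bank (digest, then keep fragments within a size window) can be pictured in silico. The following is a minimal sketch under stated assumptions, not the authors' procedure: the toy sequence, the function names `digest` and `size_select`, and the size window are all hypothetical, and the sequence is treated as linear. Only the recognition sequences and cut positions of the three enzymes are standard.

```python
# Illustrative sketch only; names and the toy sequence are hypothetical.
# Each entry: recognition sequence and the cut offset within it on the top strand.
SITES = {
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "XhoI": ("CTCGAG", 1),     # C^TCGAG
}


def digest(seq, enzyme):
    """Return fragment lengths after cutting a linear sequence with one enzyme."""
    site, offset = SITES[enzyme]
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, low, high):
    """Keep fragments whose length lies in the inclusive window [low, high]."""
    return [f for f in fragments if low <= f <= high]


# Toy 47-base sequence with two HindIII sites (a real digest works on kilobases).
toy = "G" * 10 + "AAGCTT" + "C" * 20 + "AAGCTT" + "T" * 5
fragments = digest(toy, "HindIII")
selected = size_select(fragments, 20, 30)
print(fragments, selected)
```

With real data the window would be 4,000 to 6,000 bp, matching the 4- to 6-kb gel slice described above.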

Nucleotide sequencing. The two large DNA fragments containing the intact ftf gene, i.e., the 1.8-kb HindIII-EcoRI fragment from pSS22 and the 2.0-kb XhoI-HindIII fragment from pTS102 (Fig. 1), were subcloned into M13mp18 or M13mp19. A series of deleted bacteriophage clones were constructed as described by Henikoff (9) following treatment with exonuclease III and nuclease S1. The desired subfragments were cloned into the M13 phage vectors by using the appropriate restriction sites as described below. Nucleotide sequences were determined by the dideoxy chain termination procedure (17) with lambda phage single-stranded DNA, the 17-mer universal primer (Bethesda Research Laboratories, Gaithersburg, Md.), and [α-35S]dATP (600 Ci/mmol; Amersham Corp., Arlington Heights, Ill.) as recently described (22). However, putative ORF 4 (Fig. 1) was sequenced by using double-stranded DNA and the 16-mer reverse primer (New England BioLabs, Inc., Beverly, Mass.). Both DNA strands of the 4.7-kb EcoRI-HindIII fragment (Fig. 1) were completely sequenced.

[Figure 1: restriction maps of pSS22 (7.8 kb) and pTS102 (7.0 kb), drawn against a scale running from -0.4 to 4.3 kb, with the ORFs and the ftf gene marked.]

FIG. 1. Restriction maps of pSS22 and pTS102 and the deduced total structure of the 4.7-kb EcoRI-HindIII fragment. The GS-5 chromosomal DNA insert of plasmid pSS22 consists of three EcoRI fragments, while pTS102 contains a 4.3-kb HindIII chromosomal insert. The transcriptional directions of the genes are indicated by the arrows (ORF 3 is transcribed in the opposite direction relative to the other genes). A small segment of ORF 2 overlaps the 3' end of ORF 1. O, Putative signal sequence region at the 5' end of the ftf gene. Abbreviations: Bgl, BglII; Eco, EcoRI; Hind, HindIII; Xho, XhoI.

Sequence analysis. The nucleotide and amino acid sequences were analyzed with the Pustell sequence analysis program (International Biotechnologies, Inc., New Haven, Conn.).

Enzyme assays. FTF activity was measured as previously described (18) by using either the sucrase assay or [3H]fructan synthesis from [3H-fructose]sucrose.

RESULTS

Identification of ORFs on the 4.7-kb EcoRI-HindIII fragment of S. mutans chromosomal DNA. Our previous studies of the ftf gene have shown that E. coli transformants harboring plasmid pSS22 (Fig. 1), containing a 3.2-kb S. mutans chromosomal DNA insert, exhibit strong FTF activity (18). Nucleotide sequencing of the insert (Fig. 2 and data not shown) revealed that the insert contains three potential ORFs (Fig. 1). Two of these, putative ORFs 1 and 2, encode polypeptides of at least 159 and 122 amino acids, respectively. The other large ORF specifies a protein of 728 amino acids. Therefore, the nucleotide sequence data revealed that only the latter ORF is compatible with the predicted molecular weight of the ftf gene (18). However, no termination codon could be found within the longer ORF corresponding to the ftf gene in pSS22.

Southern blot analysis of S. mutans chromosomal DNA cleaved with HindIII by using plasmid pSS22 as a probe indicates two positive signals of approximately 4.5 and 1.0 kb (data not shown). Since the predicted molecular size of the ftf gene is less than 3.0 kb (18) and one HindIII site is located 680 base pairs (bp) upstream from the initiator Met codon of the gene (Fig. 2), these results suggested that the entire ftf gene should be contained within the 4.5-kb HindIII fragment. Accordingly, an S. mutans GS-5 HindIII clone bank was constructed in E. coli JM83 by using plasmid vector pUC8. One of the clones was demonstrated to express FTF activity following screening of the clone bank for sucrase activity (15). The plasmid from this clone, designated pTS102, was purified from the transformant, and a restriction map was constructed (Fig. 1). Restriction analysis indicates that the plasmid contains a 1.4-kb EcoRI-HindIII fragment which should contain the termination codon for the ftf gene. Nucleotide sequencing of this region was carried out by using the 2.1-kb XhoI-HindIII fragment of pTS102. In addition to the expected termination codon for the ftf gene (position 3072, Fig. 2), two additional potential ORFs (ORFs 3 and 4) were identified within this fragment (Fig. 1). The coding strand for ORF 3 is opposite to that for the ftf gene and ORFs 1 and 2. Even though only a small portion of putative ORF 4 was sequenced (Fig. 2), the data presented below suggest that this sequence codes for an extracellular protein. Therefore, the sequence data revealed the presence of four potential ORFs in addition to the ftf gene on the 4.7-kb EcoRI-HindIII fragment.
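To make the idea of locating ORFs on both strands concrete, here is a naive six-frame scan. It is a sketch under stated assumptions, not a reconstruction of the Pustell program used in the paper: the function names, the `min_codons` threshold, and the 36-base toy sequence are hypothetical, and only ATG is accepted as an initiation codon.

```python
# Illustrative sketch only; not the analysis software used in the paper.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """List (strand, start, end, n_codons) for ATG...stop ORFs in all six frames.

    start and end are 0-based, end-exclusive indices on the scanned strand
    (for "-" they refer to the reverse-complemented string); n_codons
    excludes the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None
    return orfs


# Toy sequence: one 4-codon ORF on each strand, transcribed in opposite directions.
toy = "CCATGAAACCCGGGTAATTTCAAGATGCAGCCATGG"
orfs = find_orfs(toy)
for orf in orfs:
    print(orf)
```

The toy input mirrors, in miniature, the arrangement reported here, where one ORF is read from the strand opposite to its neighbors.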

Characterization of the ftf gene. The ftf structural gene begins with the ATG initiation codon (position 681, Fig. 2) and ends with the TAA termination codon (position 3072). The intact gene codes for a 797-amino-acid protein with a predicted molecular weight of 87,600. The ftf gene is preceded by an inverted repeat region (positions 565 to 593) which may function as a regulatory region, as has been postulated for a similar region upstream of the levansucrase gene of B. subtilis (23). However, this same sequence could also act as a termination sequence for ORF 2. In addition, a promoterlike sequence ATGATA-N17-TAGGAT, which resembles the E. coli consensus sequence TTGACA-N17-TATAAT (8), exists between positions 617 and 645. The nucleotide sequence downstream from the ftf gene contains an inverted repeat structure (positions 3150 to 3179) which might act as a transcription terminator (16) for the gene. It is of interest that this potential stem-loop structure is also positioned to play a similar role in regulating the expression of ORF 3, with transcription of this gene occurring in the opposite direction to that of the ftf gene.
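The coordinates and the promoterlike sequence quoted above lend themselves to a quick arithmetic check. The sketch below assumes that each quoted position is the first base of the named codon (an assumption, not something the text states); the helper names are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical.
def codons_between(start_pos, stop_pos):
    """Count sense codons between an initiation and a termination codon.

    Assumes both positions give the first base of their codon (1-based).
    abs() lets the same helper serve genes read from the opposite strand.
    """
    span = abs(stop_pos - start_pos)
    if span % 3:
        raise ValueError("start and stop are not in the same reading frame")
    return span // 3


def hexamer_matches(candidate, consensus):
    """Count positions at which two equal-length hexamers agree."""
    if len(candidate) != len(consensus):
        raise ValueError("hexamers must have equal length")
    return sum(a == b for a, b in zip(candidate, consensus))


ftf_codons = codons_between(681, 3072)            # ATG at 681, TAA at 3072
minus35 = hexamer_matches("ATGATA", "TTGACA")     # -35 region vs. consensus
minus10 = hexamer_matches("TAGGAT", "TATAAT")     # -10 region vs. consensus
promoter_length = 6 + 17 + 6                      # hexamer, N17 spacer, hexamer
quoted_length = 645 - 617 + 1                     # positions 617 to 645 inclusive

print(ftf_codons, minus35, minus10, promoter_length, quoted_length)
```

Under that assumption the codon count agrees with the 797-residue product, each hexamer matches the consensus at four of six positions, and the hexamer-spacer-hexamer length equals the quoted 617-to-645 interval.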

Characterization of the FTF protein. The amino-terminal region of the FTF protein (first 34 amino acid residues) resembles a typical signal sequence for extracellular proteins observed in gram-positive bacteria (1, 6, 11). The first 14 residues are strongly basic (7 of 14), and they are followed by a highly hydrophobic region (13 of 17). On the basis of the application of the general rules for determining signal peptidase cleavage sites (22), a putative cleavage site following Val-34 was assigned (Fig. 2).

Examination of a hydropathy plot of the deduced FTF amino acid sequence revealed that the enzyme is highly hydrophilic with only two significant hydrophobic regions (data not shown). One of these corresponds to a portion of the signal sequence, while the other was found at the carboxy terminus of the protein (nine consecutive hydrophobic amino acids from base positions 3015 to 3040; Fig. 2). On the basis of the amino acid composition of the putative processed protein (Table 1), a pI of 5.66 was determined.

Codon utilization of the ftf gene. Examination of the codon usage for the ftf gene (Table 1) revealed a high frequency (78%) of A+T in the third positions of the codons. This frequency is higher than that observed for the same position in the codons for the strain GS-5 gtfB gene (63%) and reflects the relatively low G+C content (36 to 38%) of the S. mutans chromosomal DNA (7). Only one residue of cysteine was detected in the entire protein (Table 1). This is consistent with recent observations that extracellular proteins from streptococci appear to contain little or no cysteine (22).

TABLE 1. Codon usage of the ftf gene

Codon  Amino  No. of     Codon  Amino  No. of     Codon  Amino  No. of     Codon  Amino  No. of
       acid   residues          acid   residues          acid   residues          acid   residues

TTT    Phe    19         TCT    Ser    17         TAT    Tyr    25         TGT    Cys     1
TTC    Phe     4         TCC    Ser     1         TAC    Tyr     8         TGC    Cys     0
TTA    Leu    17         TCA    Ser    19         TAA    -       1         TGA    -       0
TTG    Leu     7         TCG    Ser     2         TAG    -       0         TGG    Trp    14

CTT    Leu    13         CCT    Pro     7         CAT    His    10         CGT    Arg     8
CTC    Leu     3         CCC    Pro     4         CAC    His     1         CGC    Arg     3
CTA    Leu     4         CCA    Pro    14         CAA    Gln    30         CGA    Arg     3
CTG    Leu     4         CCG    Pro     2         CAG    Gln    11         CGG    Arg     0

ATT    Ile    25         ACT    Thr    26         AAT    Asn    50         AGT    Ser    17
ATC    Ile     4         ACC    Thr     9         AAC    Asn    10         AGC    Ser     4
ATA    Ile     5         ACA    Thr    30         AAA    Lys    35         AGA    Arg     4
ATG    Met    15         ACG    Thr    13         AAG    Lys    21         AGG    Arg     1

GTT    Val    35         GCT    Ala    40         GAT    Asp    46         GGT    Gly    27
GTC    Val     9         GCC    Ala     7         GAC    Asp     8         GGC    Gly    11
GTA    Val     9         GCA    Ala    21         GAA    Glu    35         GGA    Gly    10
GTG    Val     5         GCG    Ala     4         GAG    Glu     6         GGG    Gly     4

Comparison of FTF with the levansucrase of B. subtilis. The conversion of sucrose to a fructan polymer is catalyzed not only by the FTF of S. mutans but also by the levansucrase of B. subtilis (4). The former enzyme produces primarily an inulinlike polymer (3), while the latter enzyme synthesizes a levan product (23). Since Steinmetz et al. (23) have recently determined the nucleotide sequence of the latter gene, it was of interest to compare the sequence with that of the S. mutans FTF. A comparison of the two genes by a homology matrix indicates significant homologies at both the amino acid and nucleotide sequence levels (data not shown). When the two amino acid sequences are aligned (including gaps to account for the differences in molecular weights), regions of extensive homology can be detected (Fig. 3). High degrees of amino acid sequence homologies are especially apparent at the amino termini of both proteins (signal sequence regions) and in the central region of the FTF protein. Contained in these latter regions are sequences with as many as nine consecutive amino acids in common between the two proteins.

Characterization of ORFs 1 and 2. Putative ORF 1, which ends at the termination codon TGA, 42 bp downstream from a HindIII site, appears to encode a 159-amino-acid-residue carboxy terminus of an unknown protein, since no other termination codon is present in the 0.4-kb EcoRI-HindIII fragment (data not shown). ORF 2, encoding a 122-residue polypeptide, begins with the ATG initiation codon (position 26, Fig. 2), is preceded by a potential Shine-Dalgarno (SD) sequence (21), and ends at the termination codon TAA (position 392). Therefore, the coding sequences for the amino-terminal region of ORF 2 appear to overlap with the carboxy-terminal region of ORF 1.

Characterization of ORF 3. The putative ORF 3, transcribed from the opposite DNA strand relative to the ftf gene, begins with the ATG initiation codon (position 3951, Fig. 2) and terminates at the TAA codon (position 3267). This putative gene would code for a 228-amino-acid polypeptide with a molecular weight of 26,500. The putative gene is preceded by potential -35 and -10 promoter regions (TTGCAA and TATAAA), although the distance between the two sequences is 18 bp rather than the 17 bp difference found in most E. coli promoter sequences (8). Furthermore, a potential SD sequence was identified at position 3956 (Fig. 2). Interestingly, an inverted repeat sequence (positions 3958 to 3978) partially overlaps the SD sequence of the gene,

[Figure 2: nucleotide sequence listing with deduced amino acid sequences; the listing is not legible in this transcript.]

FIG. 2. Nucleotide sequence of the 4.3-kb HindIII fragment. The sequence shown is that for the noncoding strand of the ftf gene. The coding strand of ORF 3 is transcribed in the direction opposite to that of the other ORFs. To indicate this, the deduced amino acid sequence of ORF 3 is presented by using the single-letter amino acid code. The termination codon (TGA at position 42) for ORF 1 is indicated by the broken overline. Underlined are: position 12, SD sequence for ORF 2; positions 617, 640, and 668, respectively, the -35, -10, and SD sequences for the ftf gene; and positions 4107, 4130, and 4159, respectively, the -35, -10, and SD sequences for ORF 4. Overlined are positions 4112, 4088, and 3959, respectively, the -35, -10, and SD sequences for ORF 3. The three opposing arrows (positions 565 to 593, 3150 to 3179, and 3958 to 3978) denote inverted repeat structures. The putative signal sequence cleavage site for the FTF protein is indicated by the vertical arrow (position 782). The base positions are numbered beginning at the HindIII site.
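The third-position A+T frequency discussed under codon utilization can be recomputed from Table 1. The sketch below uses the counts as they appear in this transcript; the variable names are hypothetical, and stop codons are excluded from the tally by assumption. Because the table may carry transcription errors, the computed percentage need not equal the 78% quoted in the text.

```python
# Illustrative sketch only; counts transcribed from Table 1 as it appears here.
TABLE_1 = """
TTT 19 TCT 17 TAT 25 TGT 1
TTC 4  TCC 1  TAC 8  TGC 0
TTA 17 TCA 19 TAA 1  TGA 0
TTG 7  TCG 2  TAG 0  TGG 14
CTT 13 CCT 7  CAT 10 CGT 8
CTC 3  CCC 4  CAC 1  CGC 3
CTA 4  CCA 14 CAA 30 CGA 3
CTG 4  CCG 2  CAG 11 CGG 0
ATT 25 ACT 26 AAT 50 AGT 17
ATC 4  ACC 9  AAC 10 AGC 4
ATA 5  ACA 30 AAA 35 AGA 4
ATG 15 ACG 13 AAG 21 AGG 1
GTT 35 GCT 40 GAT 46 GGT 27
GTC 9  GCC 7  GAC 8  GGC 11
GTA 9  GCA 21 GAA 35 GGA 10
GTG 5  GCG 4  GAG 6  GGG 4
"""
STOPS = {"TAA", "TAG", "TGA"}

tokens = TABLE_1.split()
counts = {tokens[i]: int(tokens[i + 1]) for i in range(0, len(tokens), 2)}

# Sense codons only (assumption: the stop codon is left out of the tally).
sense = {codon: n for codon, n in counts.items() if codon not in STOPS}
sense_total = sum(sense.values())
at_third = sum(n for codon, n in sense.items() if codon[2] in "AT")
at_fraction = at_third / sense_total
cysteine = counts["TGT"] + counts["TGC"]

print(sense_total, at_third, f"{100 * at_fraction:.1f}%", cysteine)
```

The sense-codon total equals the 797 residues reported for the gene product, which suggests the transcribed counts are at least internally consistent.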


[FIG. 3 panel: aligned amino acid sequences of the S. mutans GS-5 FTF (upper sequence) and the B. subtilis levansucrase (lower sequence), with gaps and residue-position markers; see caption.]

FIG. 3. Alignment and comparison of the amino acid sequences of the S. mutans GS-5 FTF (upper sequence) and B. subtilis levansucrase (lower sequence). The two amino acid sequences are aligned based on a homology matrix (data not shown), and the conserved residues between the two proteins are underlined. To maximize the homology, gaps in the sequence alignment have been introduced. The downward and upward arrows indicate the predicted and actual signal sequence cleavage sites for the FTF and levansucrase proteins, respectively.

suggesting that the former sequence may be involved in the regulation of expression of ORF 3. The possibility that ORFs 1, 2, and 3 are actually translated in the S. mutans cells is suggested by similar codon utilization of the ORFs relative to the gtfB and ftf genes of S. mutans (data not shown).

Comparison of ORFs 2 and 3 with gram-positive regulatory proteins. Recent genetic investigations of the levansucrase from B. subtilis (20) have suggested that the expression of the enzyme is positively regulated by the sacU gene product. Since such regulatory proteins bind to DNA and exhibit relatively low molecular weights, it is possible that the products of ORFs 2 or 3 also participate in the regulation of gene expression. Therefore, the predicted amino acid sequences of the two ORF products were compared with the sequences of several DNA-binding proteins: sigma 37 subunit (5), spo0F (24), phoP (19), and penI (10) from gram-positive bacteria. No extensive homologies were observed between the two sets of proteins (data not shown). However, both ORFs 2 and 3 displayed several regions of amino acid homology (involving up to eight consecutive amino acids) with these selected DNA-binding proteins. In addition, one sequence (Lys-Lys-Val-Tyr-Arg) was observed in both ORF 2 (positions 54 to 67, Fig. 2) and ORF 3 (positions 3401 to 3388).
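The search for short peptides shared between two deduced proteins, such as the Lys-Lys-Val-Tyr-Arg sequence noted above, can be sketched as a k-mer intersection. This is only an illustration and not the comparison procedure used in this study; the function name `shared_kmers` and both toy peptides are hypothetical (the peptides simply embed the KKVYR motif and are not the ORF 2 or ORF 3 sequences).

```python
def shared_kmers(seq_a: str, seq_b: str, k: int) -> dict:
    """Map each k-residue peptide found in both sequences to its
    1-based start positions in seq_a and in seq_b."""
    def index(seq: str) -> dict:
        table = {}
        for i in range(len(seq) - k + 1):
            table.setdefault(seq[i:i + k], []).append(i + 1)
        return table

    a, b = index(seq_a), index(seq_b)
    return {kmer: (a[kmer], b[kmer]) for kmer in a.keys() & b.keys()}


# Hypothetical toy peptides (single-letter code), for illustration only.
toy_a = "MSTKKVYRLAE"
toy_b = "MNDLKKVYRGQ"

print(shared_kmers(toy_a, toy_b, 5))  # {'KKVYR': ([4], [5])}
```

Lowering `k` reports more, shorter matches, so very short shared peptides are expected by chance and are weak evidence of a functional relationship on their own.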

[FIG. 4 diagram: restriction map of the 4.3-kb region showing ORF 3, ORF 4, the 1.4-kb EcoRI-HindIII fragment, and deletion clones 6 and 12. Cloning of both strands: 1.4-kb EcoRI-HindIII fragment, NO; clone 6, YES; clone 12, YES.]

FIG. 4. Isolation of DNA fragments containing ORF 4 and its regulatory regions. Phage clones 6 and 12 are two deletion derivatives of the 1.4-kb EcoRI-HindIII fragment in which approximately 250- and 700-bp sequences from the right end of the HindIII site are removed, respectively. In the parental 1.4-kb EcoRI-HindIII fragment, only one of the two strands that hybridized with the RNA polymerase sense strand of the ftf gene was cloned, while both DNA strands from clones 6 and 12 were isolated in phage M13 vectors. The restriction endonuclease abbreviations and the shaded box are the same as for Fig. 1.

Characterization of putative ORF 4. In the course of sequencing the 1.4-kb EcoRI-HindIII fragment of pTS102 (Fig. 4), it was observed that only one of the two DNA strands (ftf antisense strand) of this fragment could be isolated in M13mp18 or M13mp19. However, both strands of two deletion derivatives of this fragment (clones 6 and 12) could be readily isolated in the M13 phages. These results are similar to our recent results in sequencing the gtfB gene from strain GS-5 (22). In the latter case, it was also not possible to isolate both DNA strands containing the promoter region of this gene in the M13 phages. By analogy, it is likely that a strong promoter sequence exists on the terminal 0.3-kb region of the EcoRI-HindIII fragment. However, as with the promoter region of the gtfB gene (22), it was possible to determine the nucleotide sequence of both DNA strands in the terminal region by using a double-stranded DNA template. Nucleotide sequencing also revealed the presence of the beginning of another potential ORF, termed ORF 4, whose deduced amino acid sequence is very similar to those of the highly basic terminal regions of the signal sequences from both the ftf (Fig. 2) and gtfB genes (22). Although only a small part of ORF 4 has been sequenced, these results suggest that ORF 4 may code for an extracellular protein. It is also of interest that the putative -35 region of ORF 4 could also serve a similar function for ORF 3 (Fig. 2). However, in the latter case, the opposite DNA strand read in the reverse direction would serve as part of the recognition site for RNA polymerase. Thus, the terminal 0.3-kb fragment appears to contain two promoters (one for ORF 3 and the other for ORF 4) (Fig. 2).
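The promoter spacings implied by the positions in the Fig. 2 legend can be checked with a short calculation. The sketch below assumes, as an interpretation not stated in the text, that each listed position marks the first base of a 6-bp element as read in the direction of transcription; the helper names are introduced here for illustration. Under that assumption the spacer is 17 bp for ftf and ORF 4 and 18 bp for ORF 3, matching the 18-bp distance reported for ORF 3. The sketch also shows that TTGCAA is its own reverse complement, which is consistent with one 6-bp stretch (positions 4107 to 4112) being readable as a -35 element on either strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def spacer_length(minus35_start: int, minus10_start: int, hexamer: int = 6) -> int:
    """Bases lying between the -35 and -10 elements.

    Each position is taken as the first base of the element as read in the
    direction of transcription (an assumption), so the same formula applies
    to genes on either strand.
    """
    return abs(minus10_start - minus35_start) - hexamer


# (-35 position, -10 position) as listed in the Fig. 2 legend.
promoters = {"ftf": (617, 640), "ORF 4": (4107, 4130), "ORF 3": (4112, 4088)}

for name, (m35, m10) in promoters.items():
    print(name, spacer_length(m35, m10))  # ftf 17, ORF 4 17, ORF 3 18

print(reverse_complement("TTGCAA"))  # TTGCAA
```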

DISCUSSION

Nucleotide sequencing of the ftf gene from plasmid pSS22 revealed that the cloned gene lacks a termination codon. Therefore, the FTF activity expressed by the plasmid corresponds to a fusion protein (ftf gene fused to a pACYC184 vector sequence). Nevertheless, since FTF activity is readily expressed from pSS22 (18), the missing carboxy-terminal portion of the ftf gene is not required for activity. Following the isolation of the intact ftf gene on plasmid pTS102, the termination codon for the gene was identified 207 bp downstream from the EcoRI site in the 1.4-kb EcoRI-HindIII fragment of pTS102. The deduced amino acid sequence of the FTF protein revealed the presence of only a single cysteine residue (Table 1). Likewise, the glucosyltransferase I (GTF-I) protein from strain GS-5 lacks cysteine residues (22). Therefore, streptococcal extracellular proteins appear to contain few cysteine residues (6). However, the significance of these observations has not yet been determined.

Like other streptococcal extracellular proteins (6), FTF contains a relatively long (34 amino acids) signal sequence. This sequence is also very similar (10 of 34 identical amino acids) to that of the GTF-I protein from strain GS-5 (22). A putative signal sequence cleavage site (position 782, Fig. 2) could be predicted based on the general rules of von Heijne (26). However, amino acid sequencing of the secreted FTF will be necessary to confirm this prediction. The predicted molecular size of the mature FTF, 83.8 kilodaltons, is compatible with that of the FTF activity observed in culture fluids of strain GS-5, 83 kilodaltons (data not shown). The previously observed 91-kilodalton FTF from lambda clones (18) apparently results from fusion of the ftf gene with a lambda DNA sequence.

A comparison of the FTF protein with the levansucrase from B. subtilis indicates that the two proteins contain regions of homology (Fig. 3). Since both enzymes catalyze the same basic reaction, it is likely that these homologous regions are required for enzymatic activity (one of these regions may be involved in binding the substrates). It will be of interest to determine the structural specificity which has resulted in the FTF protein synthesizing a β-1,2-linked fructan, while the levansucrase produces a β-2,6-linked polymer.
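Similarity figures such as "10 of 34 identical amino acids" are simple ratios over aligned positions. The sketch below shows one common convention (identical residues divided by the number of aligned columns in which neither sequence has a gap); other conventions divide by the full alignment length instead. The function name and the two aligned toy strings are hypothetical and are not taken from Fig. 3.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over aligned columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        identical += x == y
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical aligned toy peptides: 5 identical residues in 7 gap-free columns.
print(round(percent_identity("ACD-FGHIK", "ACE-FG-LK"), 1))  # 71.4

# The signal sequence comparison quoted in the text: 10 identical of 34.
print(round(100.0 * 10 / 34, 1))  # 29.4
```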

It is also of interest that the putative signal sequence of the FTF protein is somewhat longer than that of the levansucrase (Fig. 3). The results from several laboratories (6, 11) suggest that the signal sequences of streptococcal extracellular proteins are generally longer than those of other bacteria. However, the molecular basis for such a difference has not yet been determined. Previous localization studies (2) have indicated that most of the GTF-I activity expressed in E. coli clones is associated with the cytoplasmic membrane. In contrast, the majority of the cloned FTF activity is secreted into the periplasmic space (18). Since the signal sequences of the two enzymes appear to be quite similar, it is possible that structural differences in other regions of the two proteins are responsible for the difference in localization. However, the results of the sequence analysis suggest an alternative hypothesis. Translation of the ftf gene could actually begin at an internal ATG codon (position 705, Fig. 2), since this codon is preceded by a potential SD sequence (AGAAAAAAG). In this case, the resultant signal sequence would still retain its required characteristics (initial basic region followed by a hydrophobic region). The resultant signal sequence would be similar in size to that of the levansucrase of B. subtilis, which is secreted into the periplasmic space of E. coli clones (23). Thus, it is possible that signal sequences with relatively short basic regions are able to pass through the E. coli cytoplasmic membrane, while streptococcal proteins with longer signal sequences (GTF-I protein) may not be able to penetrate this structure. Examination of more cloned streptococcal extracellular proteins will be required to test this hypothesis.
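Identifying candidate start codons of the kind proposed above amounts to looking for ATG triplets that have an SD-like sequence a short distance upstream. The following is a minimal sketch of such a scan; the spacing window (4 to 12 bases between the end of the SD sequence and the ATG) is an illustrative assumption, the function name is hypothetical, and the toy sequence is invented apart from the AGAAAAAAG motif quoted in the text.

```python
def atg_with_upstream_sd(dna: str, sd_motif: str,
                         min_gap: int = 4, max_gap: int = 12) -> list:
    """Return (atg_position, sd_position) pairs, 1-based, for every ATG that
    has sd_motif ending between min_gap and max_gap bases upstream of it."""
    dna, sd_motif = dna.upper(), sd_motif.upper()
    hits = []
    start = dna.find("ATG")
    while start != -1:
        lo = max(0, start - max_gap - len(sd_motif))  # earliest allowed SD start
        hi = max(0, start - min_gap)                  # SD must end at or before here
        sd = dna.find(sd_motif, lo, hi)
        if sd != -1:
            hits.append((start + 1, sd + 1))
        start = dna.find("ATG", start + 1)
    return hits


# Hypothetical toy sequence: SD motif, a 6-base spacer, an ATG, then a second
# ATG that lies too far downstream of the motif to qualify.
toy = "CCTT" + "AGAAAAAAG" + "TTCATC" + "ATG" + "AAACC" + "ATG" + "GC"

print(atg_with_upstream_sd(toy, "AGAAAAAAG"))  # [(20, 5)]
```

An exact-match search is used for brevity; a practical scan would usually score partial complementarity to the 3' end of 16S rRNA rather than require one fixed motif.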

Nucleotide sequencing of regions flanking the ftf structural gene indicates the presence of potential regulatory sequences. The ftf gene is preceded by an inverted repeat sequence (positions 565 to 593, Fig. 2), which by analogy to a similar structure upstream from the B. subtilis levansucrase gene (23) may be involved in the regulation of FTF expression. It has been proposed that the repeat sequence upstream from the levansucrase gene may act as a recognition site for a regulatory protein (20). However, this latter sequence is positioned between the promoter region and the levansucrase structural gene. In contrast, the present results indicate that the inverted repeat sequence preceding the ftf gene lies upstream from the promoter sequence for this gene (Fig. 2). Additional mutagenesis experiments designed to alter this sequence will be required to demonstrate its possible role in regulating FTF expression.
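Inverted repeats such as the one described above can be located computationally by looking for a short stretch whose reverse complement occurs a few bases downstream. The sketch below is a simplified illustration that requires perfectly matching arms of a fixed length; the function names, the default arm length and loop limit, and the toy sequence are all assumptions introduced here and do not reproduce the analysis behind Fig. 2.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(dna: str, stem: int = 6, max_loop: int = 10) -> list:
    """Return (start, end, loop_length) tuples, 1-based inclusive, for every
    pair of perfectly complementary arms of length `stem` separated by a
    loop of at most `max_loop` bases."""
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 2 * stem + 1):
        target = reverse_complement(dna[i:i + stem])
        for loop in range(max_loop + 1):
            j = i + stem + loop          # start of the candidate right arm
            if j + stem > len(dna):
                break
            if dna[j:j + stem] == target:
                hits.append((i + 1, j + stem, loop))
    return hits


# Hypothetical toy sequence: arm GCGGAT, loop TTTT, arm ATCCGC.
toy = "AAT" + "GCGGAT" + "TTTT" + "ATCCGC" + "TTA"

print(find_inverted_repeats(toy))  # [(4, 19, 4)]
```

Natural inverted repeats often contain mismatches, so a practical search would tolerate imperfect pairing and rank candidates by predicted stability.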

Since the regulatory proteins involved in controlling bacterial gene expression are of relatively low molecular weights (14), it was of interest that two ORFs (ORFs 2 and 3) coding for low-molecular-weight proteins could be identified flanking the ftf gene. A comparison of the deduced amino acid sequences of these two genes with those of selected gram-positive bacterial DNA-binding proteins revealed short sequences of homology (data not shown). ORF 3 contains a 20-amino-acid sequence (Gln-18 to Ile-27, Fig. 2) which shares partial homology with amino acids 223 to 238 of the sigma 37 subunit (5). The latter sequence has been postulated to play a direct role in binding to DNA (5). The DNA-binding sites of several regulatory proteins share the common structure Ala-N3-Gly-N5-Val/Ile/Leu (14). The homologous region from ORF 3 displays the sequence Ala-N3-Gly-N5-Tyr beginning with amino acid 22 (Fig. 2), which might also function in the same capacity. However, no comparable sequence was detected for ORF 2, although this protein also shares several short regions of homology with the regulatory proteins (data not shown). In addition, an extensive amino acid sequence comparison of ORFs 2 and 3 with a wide variety of different DNA-binding proteins was not carried out in the present investigation. Since ORF 3 is preceded by an inverted repeat structure (positions 3958 to 3978, Fig. 2), it is possible that this gene is also subject to regulation by another regulatory protein (cascade regulation). To establish the possible roles of ORFs 2 or 3 in the regulation of FTF expression, it will be necessary to isolate and characterize both genes and their protein products. In addition, the cloned genes will be insertionally inactivated and transformed into strain GS-5 to produce mutants which are defective in the expression of ORFs 2 and 3 (2). It is clear that these additional approaches will be required to confirm the regulatory roles for these ORFs suggested by the sequence data and their possible relationships, if any, to FTF synthesis.

The inability to isolate one of the two DNA strands from the 1.4-kb EcoRI-HindIII fragment in the M13 phages (Fig. 4) is similar to our experience in attempting to isolate both single-stranded DNA fragments containing the promoter region of the gtfB gene (22). In the latter case, only the RNA polymerase antisense strand containing this region could be isolated in the M13 phages. This result suggested that insertion of a strong promoter (sense strand) into the multicloning site of M13 phages such that the direction of transcription from the heterologous promoter is in the opposite direction relative to the phage genes results in the inability to recover phage particles. The results of the present investigation also indicate that only the RNA polymerase antisense strand (relative to the ftf gene) of the 1.4-kb EcoRI-HindIII fragment could be isolated in M13mp18 or M13mp19. This is indicated by the observation that the single-stranded DNA isolated in the M13 phages from the intact fragment hybridized with the sense strand of the ftf gene (data not shown). The inability to isolate phage particles containing the sense strand of the fragment suggests the presence of a strong promoter on this strand.

Although only a small portion of ORF 4 has been sequenced in the present investigation, the deduced amino acid sequence suggests the presence of a signal sequence typical of GS-5 extracellular proteins. In addition, both potential SD and promoter sequences were identified upstream from this putative structural gene. The ORF 4 promoter is apparently responsible for the strong promoter activity preventing the isolation of the sense strand of the 1.4-kb EcoRI-HindIII fragment in the M13 phages. It will be of interest to isolate a DNA fragment containing intact ORF 4 in order to characterize this potential extracellular protein. It also may be possible that ORF 3 is involved in regulating the expression of this protein.
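The Ala-N3-Gly-N5-Val/Ile/Leu structure discussed above for DNA-binding sites, and the Ala-N3-Gly-N5-Tyr variant noted for ORF 3, translate directly into a regular expression over single-letter protein sequences. The sketch below is an illustration only; the function name and the toy peptides are hypothetical, and a match to so short a pattern is at most a hint of DNA-binding function.

```python
import re


def find_motif(protein: str, last_residues: str = "VIL") -> list:
    """Return (position, matched_peptide) for every occurrence of
    Ala-N3-Gly-N5-X, where X is any residue in `last_residues`.
    Positions are 1-based; a lookahead lets overlapping matches be reported."""
    pattern = re.compile(rf"(?=(A.{{3}}G.{{5}}[{last_residues}]))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


# Hypothetical toy peptides, for illustration only.
toy_val = "MKTAQLEGRTPEDVAK"  # ends the pattern with Val
toy_tyr = "MKTAQLEGRTPEDYAK"  # ends the pattern with Tyr

print(find_motif(toy_val))                        # [(4, 'AQLEGRTPEDV')]
print(find_motif(toy_tyr))                        # []
print(find_motif(toy_tyr, last_residues="VILY"))  # [(4, 'AQLEGRTPEDY')]
```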

ACKNOWLEDGMENT

This investigation was supported in part by Public Health Service grant DE-03258 from the National Institutes of Health.

LITERATURE CITED

1. Abrahmsen, L., T. Moks, B. Nilsson, U. Hellman, and M. Uhlen. 1985. Analysis of signals for secretion in staphylococcal protein A gene. EMBO J. 4:3901-3906.

2. Aoki, H., T. Shiroza, M. Hayakawa, S. Sato, and H. K. Kuramitsu. 1986. Cloning of a Streptococcus mutans glucosyltransferase gene coding for insoluble glucan synthesis. Infect. Immun. 53:587-594.

3. Carlsson, J. 1970. A levansucrase from Streptococcus mutans. Caries Res. 4:97-113.

4. Dedonder, R. 1966. Levansucrase from Bacillus subtilis. Methods Enzymol. 8:500-505.

5. Duncan, M. L., S. S. Kalman, S. M. Thomas, and C. W. Price. 1987. Gene encoding the 37,000-dalton minor sigma factor of Bacillus subtilis RNA polymerase: isolation, nucleotide sequence, chromosomal locus, and cryptic function. J. Bacteriol. 169:771-778.

6. Fahnestock, S. R., P. Alexander, J. Nagle, and D. Filpula. 1986. Gene for an immunoglobulin-binding protein from a group G streptococcus. J. Bacteriol. 167:870-880.

7. Hamada, S., and H. D. Slade. 1980. Biology, immunology, and cariogenicity of Streptococcus mutans. Microbiol. Rev. 44:331-384.

8. Hawley, D. K., and W. R. McClure. 1983. Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res. 11:2237-2255.

9. Henikoff, S. 1984. Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28:351-359.

10. Himeno, T., T. Imanaka, and S. Aiba. 1986. Nucleotide sequence of the penicillinase repressor gene penI of Bacillus licheniformis and regulation of penP and penI by the repressor. J. Bacteriol. 168:1128-1132.

11. Hollingshead, S. K., V. A. Fischetti, and J. R. Scott. 1986. Complete nucleotide sequence of type 6 M protein of the group A Streptococcus. J. Biol. Chem. 261:1677-1686.

12. Lepesant, J.-A., F. Kunst, M. Pascal, J. Lepesant-Kejzlarova, M. Steinmetz, and R. Dedonder. 1976. Specific and pleiotropic regulatory mechanisms in the sucrose system of Bacillus subtilis 168, p. 58-69. In D. Schlessinger (ed.), Microbiology-1976. American Society for Microbiology, Washington, D.C.

13. Loesche, W. J. 1986. Role of Streptococcus mutans in human dental decay. Microbiol. Rev. 50:353-380.

14. Pabo, C. O., and R. T. Sauer. 1984. Protein-DNA recognition. Annu. Rev. Biochem. 53:293-321.

15. Reeves, R. E., and A. Sols. 1973. Regulation of Escherichia coli phosphofructokinase in situ. Biochem. Biophys. Res. Commun. 50:459-466.

16. Rosenberg, M., and D. Court. 1979. Regulatory sequences involved in the promotion and termination of RNA transcription. Annu. Rev. Genet. 13:319-353.

17. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

18. Sato, S., and H. K. Kuramitsu. 1986. Isolation and characterization of a fructosyltransferase gene from Streptococcus mutans GS-5. Infect. Immun. 52:166-170.

19. Seki, T., H. Yoshikawa, H. Takahashi, and H. Saito. 1987. Cloning and nucleotide sequence of phoP, the regulatory gene for alkaline phosphatase and phosphodiesterase in Bacillus subtilis. J. Bacteriol. 169:2913-2916.

20. Shimotsu, H., and D. J. Henner. 1986. Modulation of Bacillus subtilis levansucrase gene expression by sucrose and regulation of the steady-state mRNA level by sacU and sacQ genes. J. Bacteriol. 168:380-388.

21. Shine, J., and L. Dalgarno. 1974. The 3' terminal sequence of Escherichia coli 16S ribosomal RNA; complementarity to nonsense triplets and ribosome binding sites. Proc. Natl. Acad. Sci. USA 71:1342-1346.

22. Shiroza, T., S. Ueda, and H. K. Kuramitsu. 1987. Sequence analysis of the gtfB gene from Streptococcus mutans. J. Bacteriol. 169:4263-4270.

23. Steinmetz, M., D. Le Coq, S. Aymerich, G. Gonzy-Treboul, and P. Gay. 1985. The DNA sequence of the gene for the secreted Bacillus subtilis enzyme levansucrase and its genetic control sites. Mol. Gen. Genet. 200:220-228.

24. Trach, K. A., J. W. Chapman, P. J. Piggot, and J. A. Hoch. 1985. Deduced product of the stage 0 sporulation gene spo0F shares homology with the spo0A, ompR, and sfrA proteins. Proc. Natl. Acad. Sci. USA 82:7260-7264.

25. Van Houte, J., and H. M. Jansen. 1968. Levan degradation by streptococci isolated from human dental plaque. Arch. Oral Biol. 13:827-830.

26. von Heijne, G. 1983. Patterns of amino acids near signal sequence cleavage sites. Eur. J. Biochem. 133:17-21.

27. Wenham, D. G., T. D. Hennessey, and J. A. Cole. 1979. Regulation of glucosyl- and fructosyltransferase synthesis by continuous culture of Streptococcus mutans. J. Gen. Microbiol. 114:117-124.
