

JOURNAL OF BACTERIOLOGY, Oct. 1993, p. 6441-6450, Vol. 175, No. 20
0021-9193/93/206441-10$02.00/0
Copyright © 1993, American Society for Microbiology

Cloning and Sequencing of a Cellobiose Phosphotransferase System Operon from Bacillus stearothermophilus XL-65-6 and Functional Expression in Escherichia coli†

XIAOKUANG LAI AND L. O. INGRAM*

Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida 32611

Received 21 June 1993/Accepted 11 August 1993

Cellulolytic strains of Bacillus stearothermophilus were isolated from nature and screened for the presence of activities associated with the degradation of plant cell walls. One isolate (strain XL-65-6) which exhibited strong activities with 4-methylumbelliferyl-β-D-glucopyranoside (MUG) and 4-methylumbelliferyl-β-D-cellobiopyranoside (MUC) was used to construct a gene library in Escherichia coli. Clones degrading these model substrates were found to encode the cellobiose-specific genes of the phosphoenolpyruvate-dependent phosphotransferase system (PTS). Both MUG and MUC activities were present together, and both activities were lost concurrently during subcloning experiments. A functional E. coli ptsI gene was required for MUC and MUG activities (presumably a ptsH gene also). The DNA fragment from B. stearothermophilus contained four open reading frames which appear to form a cel operon. Intergenic stop codons for celA, celB, and celC overlapped the ribosomal binding sites of the respective downstream genes. Frameshift mutations or deletions in celA, celB, and celD were individually shown to result in a loss of MUC and MUG activities. On the basis of amino acid sequence homology and hydropathy plots of translated sequences, celA and celB were identified as encoding PTS enzyme II and celD was identified as encoding PTS enzyme III. These translated sequences were remarkably similar to their respective E. coli homologs for cellobiose transport. No reported sequences exhibited a high level of homology with the celC gene product. The predicted carboxy-terminal region for celC was similar to the corresponding region of E. coli celF, a phospho-β-glucosidase. An incomplete regulatory gene (celR) and proposed promoter sequence were located 5' to the proposed cel operon. A stem-loop resembling a rho-independent terminator was present immediately downstream from celD. These results indicate that B. stearothermophilus XL-65-6 contains a cellobiose-specific PTS for cellobiose uptake. Similar systems may be present in other gram-positive bacteria.

Cellulose is the most abundant carbohydrate in nature. To balance the carbon cycle, an amount equivalent to over 10^11 metric tons of biomass from photosynthesis must be metabolized primarily by fungi, yeasts, and bacteria (7). The insoluble polymers of cellulose are degraded by a cadre of enzymes which are generally classified as exocellobiohydrolases, endo-1,4-β-glucanases, and β-glucosidases (cellobiases) (35). Cellobiohydrolases and endoglucanases act to solubilize cellulose and produce cellobiose, cellotriose, and glucose as products. Although cellobiose uptake in cellulolytic bacteria has not been studied extensively, two cryptic cel operons for cellobiose uptake in Escherichia coli have been described in some detail (19, 27). Both operons encode the cellobiose-specific components of the phosphoenolpyruvate-dependent phosphotransferase system (PTS) and include a phospho-β-glucosidase which completes the hydrolysis process after uptake. In many bacteria, other disaccharides such as lactose and sucrose are also transported by a PTS with concomitant phosphorylation and intracellular cleavage (4, 14, 16). Often, genes encoding the sugar-specific enzymes of the PTS are grouped into an operon (4, 14, 15, 19, 27).

Despite the potential abundance of cellobiose in the environment, comparatively little is known about the uptake systems which initiate metabolism in bacteria. Although the best-characterized cel operon is cryptic and must be activated in laboratory strains of E. coli (27), functional genes for cellobiose utilization have been reported in some recent isolates of E. coli from human and animal manures (18). Cellobiose is also metabolized by other enteric bacteria such as Klebsiella sp. (3) and Erwinia sp. (15). Considering the similarities between these enteric genera, it is likely that all use a cellobiose PTS to initiate metabolism. Many gram-positive and gram-negative bacteria have been reported to contain cell-associated β-glucosidases (9). Gram-positive bacteria such as Cellulomonas uda and Cellulomonas flavigena contain cellobiose phosphorylases which are presumed to be involved in intracellular cellobiose metabolism (9).

* Corresponding author.
† Florida Agricultural Experiment Station publication R-03236.

In this paper, we report the cloning of Bacillus stearothermophilus genes which confer the ability to hydrolyze the model substrates 4-methylumbelliferyl-β-D-cellobiopyranoside (MUC) and 4-methylumbelliferyl-β-D-glucopyranoside (MUG). The cloned DNA fragment contained the cellobiose-specific genes of the PTS and a putative cleavage enzyme for cellobiose-phosphate.

MATERIALS AND METHODS

Bacterial strains, plasmids, and growth conditions. The bacterial strains and plasmids used in this study are listed in Table 1. Strains of B. stearothermophilus were grown in Difco nutrient broth or on Difco nutrient agar at 65°C. Strains of E. coli were grown at 37°C in Luria broth or on Luria agar (33).


TABLE 1. Strains and plasmids used in this study

Strain or plasmid                Genetic characteristic(s)                              Source
B. stearothermophilus XL-65-6    Prototroph                                             This study
E. coli
  DH5α                           ΔlacZM15 recA                                          BRL (a)
  XL1-Blue                       F'::Tn10 (Tet^r) lacI^q Δ(lacZ)M15 recA1               Stratagene (b)
  MM6                            lacI22 dctB3 ptsI2 relA1 thi-1 spoT1                   CGSC (c)
Plasmids
  pUC18                          bla amp lacI'Z (d)                                     BRL
  pBluescript II KS-             bla amp lacZ (e)                                       Stratagene
  pBluescript SK-                bla amp lacZ (e)                                       Stratagene
  pLOI902                        pUC18 containing cel operon                            This study
  pLOI905                        pUC18 containing cel operon in opposite orientation    This study

(a) Bethesda Research Laboratories, Gaithersburg, Md.
(b) La Jolla, Calif.
(c) E. coli Genetic Stock Center, Yale University, New Haven, Conn.
(d) Incomplete lacI and incomplete lacZ.
(e) Incomplete lacZ.

Ampicillin (50 µg/ml) and tetracycline (10 µg/ml) were added to media after sterilization as appropriate for selection of recombinant E. coli.

Isolation of thermophilic strains with putative cellulolytic activity. Environmental samples (5 g) were combined with 5 ml of nutrient broth and mixed vigorously for approximately 5 min. After a brief period to allow for the settling of large particles, 1 ml of this suspension was combined with 1 ml of 100% ethanol and shaken gently for 1 h at 22°C to enrich for endospores (24). Dilutions of treated samples were spread on nutrient agar plates and incubated overnight at 65°C. Single colonies were transferred to master plates and to plates containing 0.25% powdered cellulose (Solka-Floc SW40; James River Corp., Berlin, N.H.). Putative cellulolytic isolates were identified by the presence of a clear zone or a partially cleared halo after 4 to 7 days of incubation at 65°C.

Testing for hydrolases and taxonomic traits. The presence of a complete cellulase activity was tested by using nutrient agar plates containing 0.25% cellulose. Endoglucanase activity was tested by the Congo red method with plates containing 0.2% carboxymethyl cellulose (CMC) (23). Putative cellobiohydrolase and β-glucosidase activities were tested on nutrient agar plates containing 10 mg of MUC or MUG per liter (35). Mannosidase, arabinosidase, and xylosidase activities were screened in a similar manner with the respective umbelliferyl derivatives. The presence of pectin-degrading enzymes was tested by a modification of the method described by Brown et al. (5) in which sodium polypectate was used instead of alginate. Taxonomic tests were performed as described in Bergey's manual (6).

DNA manipulation. Procedures used for the preparation of E. coli plasmids, the assembly of recombinant DNA, and the transformation of E. coli have been described previously (33). Digestions with restriction enzymes were carried out as recommended by the manufacturers. Genomic DNA was isolated from B. stearothermophilus XL-65-6 by the standard method (10).

Cloning of B. stearothermophilus cel operon. Initial testing indicated that DNA isolated from XL-65-6 was not efficiently digested by Sau3AI but was digested into small fragments by MboI. Poor digestion by Sau3AI is presumed to result from the presence of 5-methylcytosine, which is tolerated by MboI (29). Fragments of 4 to 6 kbp were isolated from partial MboI digestions of chromosomal DNA by agarose gel electrophoresis. These were ligated into the BamHI site of pUC18 and transformed into E. coli DH5α. Plasmid DNA was isolated from the pooled colonies (approximately 3,000 clones) and served as an amplified library. Secondary transformants of DH5α were screened for MUC and MUG activities. Clones containing both activities were subsequently found to contain the cel operon.
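The size of the library described above (approximately 3,000 clones carrying 4- to 6-kbp inserts) can be put in perspective with the standard Clarke-Carbon coverage estimate, P = 1 - (1 - f)^N, where f is the ratio of insert size to genome size. The Python sketch below is illustrative only. The genome size is a hypothetical placeholder (the paper does not report one for XL-65-6), the 5-kbp mean insert is an assumed midpoint of the stated range, and the function names are invented for this example. The estimate also ignores the requirement that the whole operon sit on a single insert, so it is an upper bound at best.

```python
import math


def coverage_probability(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon estimate: chance that a given locus is in at least one clone,
    assuming clones sample the genome independently and uniformly."""
    f = insert_bp / genome_bp          # fraction of the genome carried by one clone
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(probability: float, insert_bp: float, genome_bp: float) -> int:
    """Number of clones required to reach the requested coverage probability."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


# ASSUMPTIONS, not values from the paper:
ASSUMED_GENOME_BP = 3_500_000   # hypothetical genome size for illustration
ASSUMED_INSERT_BP = 5_000       # midpoint of the 4- to 6-kbp size fraction

if __name__ == "__main__":
    p = coverage_probability(3_000, ASSUMED_INSERT_BP, ASSUMED_GENOME_BP)
    n = clones_needed(0.99, ASSUMED_INSERT_BP, ASSUMED_GENOME_BP)
    print(f"coverage with 3,000 clones: {p:.3f}")
    print(f"clones needed for 99% coverage: {n}")
```

Changing the assumed genome size changes the result proportionally, so the numbers printed here should be read as an order-of-magnitude check rather than a property of the actual library.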

Southern hybridization. Samples of genomic DNA from XL-65-6 were digested with five restriction enzymes (BssHII, EcoRI, HindIII, PstI, and StyI). After agarose gel electrophoresis, fragments were transferred to Zeta-Probe GT membranes (Bio-Rad Laboratories, Richmond, Calif.) and probed with a 1.1-kbp StyI fragment from within the B. stearothermophilus cel operon. This probe was labeled with digoxigenin (Genius System; Boehringer Mannheim Biochemicals, Indianapolis, Ind.) and used as recommended by the manufacturer. Genomic DNA from E. coli DH5α was digested with PstI and served as a control.

DNA sequencing and sequence analysis. The 5-kbp fragment in pLOI902 was subcloned into pBluescript II KS- and pBluescript SK- (Stratagene Cloning Systems, La Jolla, Calif.) prior to sequencing. Two series of nested deletions were generated with the Erase-a-Base System (Promega Corp., Madison, Wis.) to allow sequencing of the entire fragment in both directions. The Magic Miniprep DNA Purification System (Promega Corp.) was used to prepare double-stranded plasmid DNA. Sequencing was performed by the dideoxynucleotide chain termination method of Sanger et al. (34) with fluorescent-labeled primers (forward, 5'-CACGACGTTGTAAAACGAC-3'; forward, 5'-GGTTTTCCCAGTCACGACG-3'; reverse, 5'-ATAACAATTTCACACAGGA-3') (LI-COR, Inc., Lincoln, Neb.) and the Sequenase 7-deaza-dGTP DNA Sequencing kit (United States Biochemical Corp., Cleveland, Ohio). Regions of compression were resolved with either dITP (United States Biochemical Corp.) or the TaqTrack Deaza Sequencing System (Promega Corp.). Sequencing was carried out on a LI-COR model 4000 DNA sequencer and image analyzer with 7% acrylamide gels (Long Ranger Gel Concentrate; AT Biochem, Malvern, Pa.).

Nucleotide and deduced amino acid sequences were manipulated and analyzed with the Genetics Computer Group Sequence Analysis Software Package (University of Wisconsin, Genetics Computer Group, Madison, Wis.) and the BLAST network service of the National Center for Biotechnology Information. Amino acid sequences were aligned by the Clustal V program (21).
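Deducing amino acid sequences from a nucleotide sequence begins with locating candidate open reading frames. The authors used the GCG package for this step; the Python sketch below is not that software, only a minimal illustration of the idea. It scans the three forward reading frames for an ATG followed in frame by a stop codon. The function name, the `min_codons` threshold, and the toy sequence are all invented for this example, and the reverse strand and alternative start codons are deliberately ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) pairs, 0-based and half-open, for ATG...stop ORFs
    in the three forward frames. The stop codon is included in the span."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None                       # position of the first in-frame ATG seen
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None               # resume the search after this stop
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CCATGAAATAGGGATGCCCTGA"         # made-up sequence, not from Fig. 2
    for begin, end in find_orfs(toy):
        print(begin, end, toy[begin:end])
```

On a real fragment, a much larger `min_codons` value would be needed to suppress short spurious frames, and candidate ORFs would still need support from ribosomal binding sites and homology, as in the analysis reported here.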


FIG. 1. Restriction map of the B. stearothermophilus DNA fragment contained in pLOI902. Subclones pLOI902, pLOI905, pLOI911, pLOI913, pLOI915, and pLOI969 are derivatives of pUC18. Subclones pLOI922 and pLOI923 are derivatives of pBluescript II KS-. Subclones pLOI954 and pLOI955 are derivatives of pBluescript SK-. Genes of the cel operon are labeled A, B, C, and D. The incomplete regulatory gene is labeled R'. The proposed terminator is marked as a solid circle. Approximate positions of frameshift mutations are indicated by arrowheads in pLOI981 and pLOI982. These mutations were introduced by Klenow treatment and religation after digestion with BstXI and ApaI, respectively. Derivatives of pUC18 are aligned beneath restriction sites used for construction; others are aligned beneath their respective sequence positions. The relative activities of MUC and MUG are indicated to the right.

RESULTS

Isolation and characterization of organisms. Environmental samples (102 total) were screened from a variety of sources, including soil, dried food, animal manure, compost, pond water, seawater, and decaying plant material. Thermophiles were abundant in all samples (2,000 to 1,500,000 CFU/g). All strains which were highly cellulolytic (clear zones on cellulose agar) were provisionally identified as actinomycetes on the basis of cell morphology and growth habits. However, many other isolates were surrounded by a partially cleared halo and were presumed to contain incomplete cellulase digestion systems.

A total of 56 representative clones which produced partial clearing on cellulose plates were selected for further study. All strains were aerobic, endospore-forming, gram-positive rods and were able to grow at 75°C. More detailed studies were conducted with five representative strains, and all had characteristics similar to those of B. stearothermophilus (6). These organisms represent a rich source of enzymes for the depolymerization of carbohydrates. All but 6 of the 56 isolates exhibited activity on one or more cellulolytic substrates (MUC, MUG, and CMC), further validating the selection procedure. No endoglucanase activity was found in these isolates when tested at 65°C. One-third of these isolates, however, did display weak endoglucanase activity when incubated at 55°C. All isolates contained multiple activities associated with hemicellulose and cellulose hydrolysis, as illustrated by strain XL-65-6 (isolated from rotting wood near the edge of a small pond).

Cloning of genes encoding MUC and MUG activities (cel operon). The plasmid library was screened for activities hydrolyzing MUC and MUG after transformation into DH5α. Approximately 50 positive clones were identified from the 50,000 colonies tested. Those with weak activities were unstable. Nineteen clones which exhibited strong activities for both substrates were found. These activities were present together in all clones. Digestions with restriction enzymes indicated that all active clones contained the same 5-kbp DNA fragment and were siblings. One of these, denoted pLOI902, was selected for further study.

A restriction enzyme map of pLOI902 is shown in Fig. 1. The minimal coding region for the activities hydrolyzing MUC and MUG was localized by testing subclones. Subclones pLOI911, pLOI913, and pLOI915 were constructed by deleting HindIII, SmaI, and NsiI fragments, respectively. All but pLOI915 were inactive. This subclone retained both activities. Subclones pLOI922, pLOI923, pLOI954, and pLOI955 were constructed by deleting terminal regions with exonuclease III. MUC and MUG activities remained after the deletion of ca. 400 bp adjacent to the lac promoter (pLOI922) or 1,400 bp from the opposite end (pLOI954). However, further deletion of 375 bp from pLOI922 or 264 bp from pLOI954 resulted in the concomitant loss of both activities (pLOI923 and pLOI955,


1 TCCCGACGTCCTTGAGTTGACCGAAGCGATGGTCGTCTATGCGGAGMAAAMACTTGGGCGCCGCCTCGCCGAGAAGGTGATGTATGCGCTCGCCATGCACAT TCAAACGGCGATCAACCGP D V L E L T E A M V V Y A E K K L G R R L A E K V M Y A L A M H I Q T A I N R

121 CCTGCGGGCCGGGACGATCGTT TCCCATCCGAAMCTGAATGAAGTTCGCGCCGCCTATAAGCAGGAATTCGCCGTCGCGCTCGACTGCCTTCAGCT GATGGAAGAAAGGACGAATATCGAL R A G T I V S H P K L N E V R A A Y K Q E F A V A L D C L Q L M E E R T N I D

241 CTTCCCGATCGATGMGCGGGATTTTTGACGATGTTTTTTGCGTTCCACGAGGAGCMGCGGMGAGCGGGAGGMCGGGTGGCGATCGTGGTTGTGATGCATGGCMCGGTGTGGCTTCF P I D E A G F L T M F F A F H E E Q A E E R E E R V A I V V V M H G N G V A S

361 GGCGATGGCGGACGTCGTGMTCAGCTGTTGGCCGCCGCGTGTGTGCATGCGGTCGATATGCCGCTTGATGCCGACCCGAAACGAATTTACGAGCAGGTGAMGCGGTGCTACAGCCCGTA M A D V V N Q L L A A A C V H A V D M P L D A D P K R I Y E Q V K A V L Q P V

481 CGCATCGAAMGGGGCGCTGTTGCTTGTTGACATGGGATCGCTCGTGTCGTTTTCGMCTTTTTGGAMGAGCTCGCTGTTCCGGTGCGGGTGATTTCCGCTGCCAGCACGCCGCAA S K K G A L L L V D M G S L V S F S N F L E K E L A V P V R V I S A A S T P H

601 CGTGCTCGMGCGGCCCGCAMGCGATGCTCGGCTATGCGCTTCAGGAAATTTACGMGMGTGMMGCCGCAGCGCCGTTTTACATTCGCGGGCCGCTTTGGGAGGAGGAGGCTGCGGAV L E A A R K A M L G Y A L Q E I Y E E V K A A A P F Y I R G P L W E E E A A E

721 GCAGGACATGCTGGCGATCGTCACCGCCTGTTTTACTGGMMGGMGCGCGCTGGCGCTMMGCATATATTGGMACGTATTTGCAGCTCGATGAGCGCATGTGGCGCATTATTCCCATQ D M L A I V T A C F T G K G S A L A L K H I L E T Y L Q L D E R M W R I I P 1

841 TCAMTGGCAGATGCGGAAGMGCGCGGCAAACCTTGTCCMCGTCGCGMMCATTTCCGCATCGTCTGTATCGTCAGCCATCTTTGCCTTGATGAGCGMTCCCGCACTTTTCTCTTGAQ M A D A E E A R Q T L S N V A K H F R I V C I V S H L C L D E R I P H F S L E

961 GGACGTGCTTAGCTTGAGGGCGATGMAGMATCCMGCGTTAGCCGACGTTGAGGMMTTCATATGCATATGGCTAGAGAGCTGACCMTCATCTTCGCCACCTCGCACCGGcGCGGGCD V L S L R A M K E I Q A L A D V E E I H M H M A R E L T N H L R H L A P A R A

1081 CATTCCCGCCATTCGCGCTGCGCTGGCGGCGATTGGTCGGGMCTCGGCCTTGMGCGGATGGCCGCGATCTGGTGGGGCTTGTCTTTCATCTTTGCTGCTTGCTCGATCGCCTGCTGTCI P A I R A A L A A I G R E L G L E A D G R D L V G L V F H L C C L L D R L L S

1201 CGGAGAGAGMCGAGCGGTGACCGAGGAGCAGCGMMGCTTGCGGGCATGMGATGMGATGGCGCACTGTACAGGGCGGTCMGGAGGGGTTGTTTCCGCTTGMCAGCAGTACGGTGTG E R T S G D R G A A K A C G H E D E D G A L Y R A V K E G L F P L E Q Q Y G V

1321 CTGCATCGATGMGATGAACTGTGTCATATCGTTCACTTTTTTCGCTCCCTGCMGGMAACGAGATGGATMGGTGCGATMTTGGAMCGCTTCCATTACACAGTTTCCCTCCGTTMC I D E D E L C H I V H F F R S L Q G K R D G *

uuuCcuCc1441 CACACMCCGTTCCATCCGCTTCAGCGCTGTCTTTTCGGCGTTGGCATGGMTTTGCTTGMCMTAATGMTCATTGCMMMMGGAGGAGGMGCGGATGAACATTTTGCTCATTTG

-35 -10 RB M N I L L I CcelA

1561 CGCTGCCGGCATGTCGACGAGTTTGCTTGTGACGMGATGMGGAGGCGGCMMGCMMAGGGATCGAGGCGMCATTTGGGCTGTGTCAGCTGATGMGCGMMGTCATCTCGACCAA A G M S T S L L V T K M K E A A K Q K G I E A N I W A V S A D E A K S H L D Q

1681 GGCGGATGTTGTGCTGATTGGTCCGCMATCCGTTATMGCTCGCCGCCTTCMMAAGAGGGAGAGGCGCGCGGCATTCCGGTTGACGTCATCMTCCAGCTGACTATGGGCGGGTCMA D V V L I G PQ I R Y K L A A F K K E G E A R G I P V D V I N P A D Y G R V N

UUUccUCCaC1801 CGGCGCCGGCGTGTTGGACTTTGCGCTGCGGTTAMTAAACAGGGGGTAGMGCGGATGGACCGGTTTATTCGGATGTTGGAAGACCGCGTGATGCCTGTCGCCGGCMGATTGCC

G A G V L D F A L R L K K * RB M D R F I R M L E D R V M P V A G K I AcelB

1921 GMCAGCGCCATTTGCAGGCGATTCGTGACGGGATTATTTTGTCGATGCCATTGTTGATTATCGGGTCTTTATTTTTMTCGTTGGCTTTTTGCCGATTCCCGGTTACMCGMTGGATGE Q R H L Q A I R D G I I L S M P L L I I G S L F L I V G F L P I P G Y N E W M

2041 GCGMATGGTTCGGCGAGCATTGGCTTGATMGCTTCTCTATCCGGTTGGAGCGACGTTCGATATTATGGCGCTTGTTGTCAGCTTCGGAGTCGCCTATCGGCTGGCGGMMGTATAAAA K W F G E H W L D K L L Y P V G A T F D I M A L V V S F G V A Y R L A E K Y K

2161 GTGGATGCGCTGTCCGCCGGCGCGATTTCACTGGCTGCTTTTTTGCTCGCMCCCCGTATCMGTGCCGTTCACGCCGGMGGAGCGMMGMMCCATTATGGTCAGCGGTGGCATCCCGV D A L S A G A I S L A A F L L A T P Y Q V P F T P E G A K E T I M V S G G I P

2281 GTGCMTGGGTCGGCAGCMGGGGTTGTTTGTTGCCATGATTTTGGCGATTGTGTCMCCGMATTTACCGGAAMTCATTCAAAAAAATATTGTCATTMGCTGCCGGACGGGGTGCCGV Q W V G S K G L F V A M I L A I V S T E I Y R K I I Q K N I V I K L P D G V P

2401 CCTGCTGTGGCCCGCTCCTTTGTTGCTTTGATCCCGGGAGCCGCCGTTCTCGTCGTTGTCTGGGTAGCCCGGCTGATTTTGGAMTGACACCGTTTGAAAGTTTCCATMCATTGTATCTP A V A R S F V A L I P G A A V L V V V W V A R L I L E M T P F E S F H N I V S

2521 GTCCTCCTGMCMMCCGCTCAGTGTGCTCGGCGGCAGTGTATTTGGCGCCATTGTCGCCGTGCTGCTTGTCCAGCTGCTATGGTCGACCGGACTGCACGGAGCGGCCATCGTCGGCGGGV L L N K P L S V L G G S V F G A I V A V L L V Q L L W S T G L H G A A I V G G

2641 GTCATGGGGCCGATCTGGCTGTCGCTGATGGATGMMTCGMTGGTGTTCCAGCMMTCCGMTGCCGMCTGCCCMCGTCATTACGCMCAGTTTTTTGATCTTTGGATTTACATCV M G P I W L S L M D E N R M V F Q Q N P N A E L P N V I T Q Q FF D L W I Y I

FIG. 2. Nucleotide sequence of the 5-kbp DNA fragment from B. stearothermophilus contained in pLOI902. The deduced amino acid sequences for the cel operon and the incomplete regulatory gene are listed below the first nucleotide of the corresponding codon. A putative promoter region for the cel operon is underlined, with the -35 and -10 regions labeled. The sequence for the 3' terminus of 16S rRNA from B. subtilis (26) is shown above the potential ribosomal binding region (underlined and labeled RB) from the cel genes. Unmatched bases are in lowercase. Genes within the cel operon are marked at their start codons. Stop codons are indicated by asterisks. The proposed terminator is marked with inverted dashed arrows which correspond to a proposed stem-loop structure.

respectively). Thus, a rather large region of DNA (approximately 3.2 kbp) appeared to be required for the expression of MUC and MUG activities in DH5α. Both activities were expressed or lost concurrently in all subclones. Active clones consistently hydrolyzed MUG more rapidly than MUC.

Experiments were conducted to determine the direction of transcription of the genes encoding MUC and MUG activities in pLOI902. The presence or absence of inducer (isopropyl-thio-β-D-galactopyranoside [IPTG]) did not alter the level of expression. Attempts to subclone this fragment into pUC18 in the opposite orientation were unsuccessful when DH5α was used as a host, but clones were readily obtained with XL1-


2761 GGCGGTTCAGGAGCGACATTGGCGTTGGCGTTGACGATGATGTTTCGGGCCCGCAGCCGGCAGCTGAAMGCTTAGGTCGGCTGGCGATCGCGCCCGGCATTTTCMTATTMTGAGCCGG G S G A T L A L A L T M M F R A R S R Q L K S L G R L A I A P G I F N I N E P

2881 ATCACGTTCGGCATGCCGATCGTCATGMCCCATTGCTGATCATTCCGTTCATTCTCGTGCCTGTCGTGCTTGTGGTTGTTTCCTACGCCGCGATGGCGACCGGGCTTGTCGCCMMCCAI T F G M P I V M N P L L I I P F I L V P V V L V V V S Y A A M A T G L V A K P

3001 AGCGGGGTGGCCGTGCCATGGACGACACCGATCGTGATCAGCGGCTATTTAGCGACGGGAGGCMMTTTCCGGGAGTATTTTGCMMTCGTTMCTTCTTCATCGCGTTTGCCATCTACS G V A V P W T T P I V I S G Y L A T G G K I S G S I L Q I V N F F I AFA I Y

uuuccuc3121 TATCCATTTTTCTCGATTTGGGACMMCAMMGCGGCCGAAGAGCAGGCCGATCCMCGATTTCMGCGGAGCGGGGGCMCGCACTCGCTGTAMGGAGAGGMCGAMTGCCGCGCT

Y P F F S I W D K Q K A A E E Q A D P T I S S G A G A T H S L * RB M P R YcelC

324 1 ATTGCATCGTCAACGCCGATGATTTCGGTTACTCGAAAGGGGTCAACTACGGGATTTTGGAAGCGTTTCAGAACGGTGTCGTCACGTCGGCGACGCTGATGGCGAATATGCCAGCGGCCGC I V N A D D F G Y S K G V N Y G I L E A F Q N G V V T S A T L M A N M P A A E

3361 AACACGCCGCCCGGCTGGCGAAGGACCATCCGGAACTCGGCGTTGGCATTCATTTTGTGCTGACGTGCGGCCGGCCGCTGGCCGATGTTCCATCGCTGGTGAACGAGAATGGGGAGTTTCH A A R L A K D H P E L G V G I H F V L T C G R P L A D V P S L V N E N G E F P

3481 CGCGGCGCGGGGAGGCGCTTGTCGGCGCTAGGCGCGGCGATATCGAGCGGGAGCTTTGCGCCCAATTGGAGCGTTTTTT CTCGTTCGGGCTCACTCCGACGCATATTGACAGCCATCATCR R G E A L V G A R R G D I E R E L C A Q L E R F F S F G L T P T H I D S H H H

360 1 ACGTTCATGAGCACCCGAATGTGTTTCCGGTTGTGGAACAATTGGCCGAACGCTATCGGCTGCCGATCCGCCCGGTGCGGACGGCACGGCCGCATCGGCTGCCGACCGTCGACGTCTTTTV H E H P N V F P V V E Q L A E R Y R L P I R P V R T A R P H R L P T V D V F F

372 1 TTCCGGATTTTTACGGCGATGGACTGACGAAMGACCGTTTTATCTCGCTGATCGACCGAATTGGCGACGGGCAGACGGCGGAAGTGATGTGCCACCCGGCGTATATCGATGTTCCGCTCGP D F Y G D G L T K D R F I S L I D R I G D G Q T A E V M C H P A Y I D V P L A

384 1 CGTCAGGAAGTTCCTATTGTCAACAACGAGTCGAAGAGCTGGCGGTGCTGACCGACCCAACGCTCGTTGCCGAGATGGCCGAGCGCGGTGTTCAGCTGATCACGTACCGCGAATTCTATAS G S S Y C Q Q R V E E L A V L T D P T L V A E M A E R G V Q L I T Y R E F Y K

UCCUCcaCU3961 AACTATAGGAGAGGGCGTTATGCAAACGTATGAACAAACTGTATTCCAACTGATTCTTCATGGCGGAAMCGGCCGCAGTTATGCAATGGAGGCGATTACGGCGGCGAAAAAGGGGAATT

L RB M Q T Y E Q T V F Q L I L H G G N G R S Y A M E A I T A A K K G E FcelD

4081 TGCCGAGGCGCGCAGGCTGCTTGAACAGGCGGGAGCAGAACTGCAGGCGGCCCATGGGCTGCAGACCGCGTTGCTGCCAGMAGCGAGCGGCGGGCAGCCGGTGGTGACGCTTCTGATA E A R R L L E Q A G A E L Q A A H G L Q T A L L Q Q E A S G G Q P V V T L L M

4201 GGTGCATGCCCAGATCATTTATGACGGCGATCACGGTCAGGATTTAGCCGCTGATTCGTGGAGCTGTATGAAGCGCTCAGCGGCAACACCGATCATATTGTCGTCGGCCGGV H A Q D H L M T A I T V K D L A A E F V E L Y E A L K R Q T T E S *

4321 CTTGTTGACCGGGTGAGGATGATTCGCTTCMCGGGMTGMCCCMGGCGGGCGGCMCGGGCTGTTCTAGCGACCGCCCGCTTTGTTGATTTTACMTMTTTTATTATGTGMCATA> <

4441 AGGGCGCCAGCCGTTCAGCCCTCCCGGCTGCCGCCGGTCTCGGCTGMCGGAGMGGAAAACTCCGCTTGTTTGCGTCACATATCGTGGCTTGTGTTGTGCTTTTGCCGCGTGTGATTGC4561 GGTTCTTCACTTTTTTCGAACCGTCAGCGGCTCCGGCTGCCCAGGATTGTCGCCTTTTGGGGCGTTTTTGCGCATATCTTTGCCGTCGTTTTTGTTCGTCATTTGTTGTCCCTCCTTTG4681 TCCGTTAGCATGCGGCGTTGTGCTTTTGCTTATTCGGGGCATMTTTGCGCCGGTTTGAMCGATMGTAGTGGATGACCGCGATGACTGCGMGGAGGMCGAGCCATGGTTTATCATA4801 AACGAGCAAACGCGTTTCAGGCGGCGGCAAGGCGACGATGGAGGCAAGAGTGGCATGACCATCTCGTCCGCGATCAGCCGATTACGGCCATCAGTTAGCCCATTTGCGGCAG4921 GAGTGAACGAAGCGTTTGCACAAATTGAAACGCGCTTGAGTCGCCTCGGAACGCAGCGCATGCAGCTTGAGAATTCCGCAGCGATTTGCAGGCGATCCCCGG

FIG. 2-Continued.

Blue. The resulting plasmid (pLOI905) was lethal when transformed back into DH5α. The growth of XL1-Blue harboring pLOI905 was strongly inhibited by the presence of inducer, consistent with partial control by the resident F'lacIq episome. Thus, in the original clone, pLOI902, the direction of transcription for the cel operon is opposite to that of the lac promoter.

Analysis of DNA sequence. The entire 5,026-bp DNA fragment in pLOI902 was sequenced in both directions (Fig. 1 and 2). Four complete open reading frames were found in the region required for the two activities, bordered by incomplete genes at both ends. All appear to be transcribed in the same direction. The four central genes appear to be part of a cel operon (bases 1541 through 4306) with a potential promoter at the 5' end and a possible rho-independent terminator at the 3' end (30). The coding regions for all four genes utilize ATG codons to initiate translation. The potential ribosomal binding sequences for celA and celC provide a better match with the 3' terminus of 16S rRNA from Bacillus subtilis (26). It is interesting to note that the potential ribosomal binding sequences in celB, celC, and celD are overlapped by stop codons from the preceding genes and may provide translational coupling. A similar phenomenon is known in the B. subtilis sucrose-specific PTS (16), in which the stop codon for enzyme II overlaps the start codon for sucrose-6-phosphate hydrolase.

Identification of cloned genes by homology. An initial search of the DNA data base revealed strong homology between the translated sequences of the cloned genes and previously described genes encoding sugar-specific proteins in PTSs (Table 2). The highest degree of similarity was observed with the E. coli cel operon (Fig. 3). Because MUC and MUG can be considered analogs of cellobiose, the sequenced genes from B. stearothermophilus XL-65-6 were designated a cel operon, the first PTS cellobiose system to be characterized at a molecular level in a gram-positive organism. Table 3 summarizes some of the properties of these four genes. The gene products from celA and celB have high pI values and positive charges, while those from celC and celD have low pI values and negative charges.

The translated amino acid sequence for B. stearothermophilus celA shares 43% identity and 59% similarity with that of celA from E. coli K-12 (27), which encodes a part of enzyme II (28) (Fig. 3A). In many organisms, the domains encoded by celA and celB are fused into a single polypeptide (31). The translated B. stearothermophilus celA sequence is homologous to the carboxy-terminal portion of the lactose-specific enzyme II proteins from Lactococcus lactis, Staphylococcus aureus, and Lactobacillus casei (Table 2). Cys-7 is surrounded by a highly conserved region in all homologs examined. This Cys residue has been proposed as the site of phosphorylation in L. lactis (2). The celA gene is predicted to encode a protein with an Mr of 10,737 (Table 3).
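The percent identity and percent similarity values quoted for pairwise comparisons can be illustrated with a short calculation on a gapped alignment. This is a minimal sketch under stated assumptions: the conservative-substitution groups below are hypothetical (the scoring scheme used for Table 2 is not specified here), gapped columns are excluded from the denominator, and the two aligned strings are toy data rather than sequences from Fig. 3.

```python
# Hypothetical groups of residues treated as "similar"; an assumption for illustration.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]
GROUP_OF = {aa: i for i, group in enumerate(SIMILAR_GROUPS) for aa in group}


def identity_similarity(a, b):
    """Return (% identity, % similarity) for two aligned, equal-length strings.

    Columns containing a gap ('-') in either sequence are ignored.
    Identical residues also count as similar.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        aligned += 1
        if x == y:
            identical += 1
            similar += 1
        elif x in GROUP_OF and GROUP_OF.get(y) == GROUP_OF[x]:
            similar += 1
    if aligned == 0:
        raise ValueError("no aligned columns")
    return 100.0 * identical / aligned, 100.0 * similar / aligned


# Toy alignment, not taken from the paper.
ident, simil = identity_similarity("MKV-LDE", "MRVALEQ")
print(f"{ident:.1f}% identity, {simil:.1f}% similarity")
```

Different choices of denominator (alignment length, shorter sequence length) or of similarity groups would change the reported percentages, so values computed this way are only comparable when the same conventions are used throughout.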


TABLE 2. Comparison of predicted amino acid sequences for celA, celB, and celD and celR' from B. stearothermophilus to homologous PTS polypeptides from other organisms^a

                        % Identity (% similarity)
Protein                 Bs-CelA           Bs-CelB           Bs-CelD           Bs-CelR'

Enzyme II
  Ec-CelA               42.9 (59.2)
  Ll-LacE               23.2 (50.5)^b     30.6 (58.2)^c
  Sa-LacE               21.9 (51.0)^b     29.4 (57.7)^c
  Lc-LacE               16.0 (48.0)^b     29.1 (55.9)^c
  Ec-CelB                                 32.9 (60.1)

Enzyme III
  Ec-CelC                                                   41.6 (67.3)
  Ll-LacF                                                   41.8 (66.0)
  Sa-LacF                                                   38.4 (62.6)
  Lc-FIII                                                   36.5 (58.9)

Regulatory proteins
  Bsu-LevR                                                                    29.5 (50.1)
  Bsu-SacY                                                                    27.2 (57.0)
  Bsu-SacT                                                                    30.0 (58.0)
  Ech-ArbG                                                                    30.2 (54.2)

^a Bs-CelA, Bs-CelB, and Bs-CelD refer to polypeptides for genes celA, celB, and celD of the cel operon from B. stearothermophilus. Bs-CelR' refers to the incomplete regulatory protein. Ec-CelA and Ec-CelB refer to the carboxy terminus and amino terminus of glucoside-specific PTS enzyme II of the cel operon from E. coli (27, 28). Ec-CelC refers to enzyme III of the cel operon. Ll-LacE and Ll-LacF refer to lactose PTS operon enzymes II and III from L. lactis, respectively (14). Sa-LacE and Sa-LacF refer to lactose-specific PTS enzyme II and enzyme III of the lac operon from S. aureus, respectively (4). Lc-LacE and Lc-FIII refer to lactose-specific PTS enzyme II and enzyme III from L. casei, respectively (1, 2). Bsu-LevR refers to the transcriptional regulator of the PTS levanase operon of B. subtilis (13). The results of comparison with Bs-CelR' were from the comparison of amino acids 2 to 459 with amino acids 475 to 935 of levR. Bsu-SacY refers to the antiterminator of the PTS levansucrase gene of B. subtilis (36). The results of comparison with Bs-CelR' were from the comparison of amino acids 2 to 115 with amino acids 68 to 181 of sacY. Bsu-SacT refers to the transcriptional antiterminator of the PTS sucrase operon of B. subtilis (12). The results of comparison with Bs-CelR' were from the comparison of amino acids 2 to 101 with amino acids 66 to 165 of sacT. Ech-ArbG refers to the antiterminator of the PTS phospho-β-glucosidase operon of Erwinia chrysanthemi (15). The results of comparison with Bs-CelR' were from the comparison of amino acids 2 to 97 with amino acids 69 to 164 of arbG.

^b These results are comparisons of Bs-CelA with the carboxy-terminal portions of Ll-LacE, Sa-LacE, and Lc-LacE.

^c These results are comparisons of Bs-CelB with the amino-terminal portions of Ll-LacE, Sa-LacE, and Lc-LacE.

The celB gene begins 17 bases downstream from celA and is preceded by a potential ribosomal binding site which overlaps the last 2 bases of the celA stop codon. The celB gene is predicted to encode a protein with an Mr of 48,805. The deduced amino acid sequence for celB is most similar to that of the membrane-spanning polypeptide in the enzyme II complex for cellobiose from E. coli (celB) (27), exhibiting 33% identity and 60% similarity (Fig. 3B). The translated sequence for B. stearothermophilus celB is also similar to the amino-terminal portion of PTS enzyme II proteins from other organisms (Table 2).

The celC gene begins 14 bases downstream from the celB stop codon and also exhibits an overlap between the ribosomal binding region and the preceding stop codon. In contrast to the other genes in the cel operon, no genes with a high degree of similarity to the translated sequence for celC were found in the current data base. However, the carboxy-terminal portion of this protein exhibited partial homology (22% identity, 53% similarity) with the carboxyl portion of the product of E. coli celF, which encodes the phospho-β-glucosidase. Thus, the celC gene is presumed to encode a cleavage enzyme for cellobiose phosphate.

The celD gene encodes a small protein containing 108 amino acids. Again, the proposed ribosomal binding site overlaps the stop codon of the preceding gene. The translated amino acid sequence for celD exhibits 41.6% identity and 67.3% similarity to that of E. coli celC, which encodes enzyme III (Fig. 3C). Table 2 summarizes the homology between celD and related genes from other organisms. A histidine residue (His-76) is located in a region which is highly conserved among different organisms. This residue has been proposed as the site of phosphorylation for the lactose-specific enzyme III in L. casei (1, 20).

An incomplete open reading frame encoding 463 amino acids (Fig. 1 and 2) was found upstream from the cel operon and is transcribed in the same direction as the operon. This gene was provisionally identified as encoding a regulatory protein (celR) on the basis of homology to regulatory proteins in other PTSs (Table 2). The translated sequence for the incomplete celR gene is most similar to that of B. subtilis levR (13), the transcriptional regulator for the PTS levanase operon. Although a putative promoter for celABCD was readily evident in the intercistronic sequence, it is unclear whether celR is part of the cel operon or is transcribed separately. No stem-loop structure resembling a rho-independent terminator was found in the intergenic region, and it is possible that all five genes may be transcribed at some level from a common promoter.

A portion of the amino terminus of a possible sixth gene was found downstream from the cel operon but was not identified by homology searches of the current DNA data base.

Codon usage. Codon usage in the cel genes was compared with that for the genes encoding phosphoglycerate kinase (11) and alcohol dehydrogenase T (32) from B. stearothermophilus (data not shown). Many codons (i.e., AGA, TGA, TAG, ATA, and TGT) were rarely used. Either a G or C was usually present in the wobble positions, resulting in G+C contents of 52.8 and 54.9% for celAB and celD, respectively. The G+C content of celC is unusually high (59.1%) and differs significantly from those of the other genes sequenced from this operon.
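The two quantities discussed above, overall G+C content and G+C content at the wobble (third codon) position, can be computed as sketched below. The function names and the short coding sequence are hypothetical and serve only to show the arithmetic; they are not taken from the cel genes.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc3_percent(cds):
    """Percentage of G and C at the third (wobble) position of each codon."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a nonzero multiple of 3")
    return gc_percent(cds[2::3])  # every third base, starting at codon position 3


# Toy coding sequence (ATG GCC AAG CTG TAA), not from the paper.
toy_cds = "ATGGCCAAGCTGTAA"
print(f"G+C: {gc_percent(toy_cds):.1f}%  wobble G+C: {gc3_percent(toy_cds):.1f}%")
```

A gene whose wobble positions are mostly G or C will show a wobble G+C value well above its overall G+C content, which is the pattern described for the cel genes.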

Southern hybridization analysis. Southern analysis confirmed that the cel genes originated from B. stearothermophilus XL-65-6. A StyI fragment from the internal region of the cel operon was used as a probe. This probe did not hybridize to DNA fragments from E. coli DH5α and bound single regions in digestions of XL-65-6 genomic DNA, consistent with the cel operon being present as a single chromosomal copy (data not shown).

TABLE 3. Characteristics of the cel operon from B. stearothermophilus XL-65-6

Gene   PTS component^a             No. of nucleotides   % GC   Total no. of amino acids   Mr       Charge   Predicted pI
celA   Enzyme II'                  303                  53     100                        10,737   +3       9.8
celB   Enzyme II''                 1,356                53     451                        48,805   +4       10.0
celC   Proposed cleavage enzyme    738                  59     245                        27,423   -8       6.1
celD   PTS enzyme III              327                  55     108                        11,785   -5       5.2

^a Enzyme II' refers to the carboxyl-terminal domain of enzyme II. Enzyme II'' refers to the amino-terminal membrane-spanning domain of enzyme II.

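The gene lengths and protein sizes in Table 3 are related by simple arithmetic: each coding region should contain three nucleotides per amino acid plus one stop codon. The sketch below checks that relation and derives the mean residue mass from the tabulated Mr values. The dictionary layout and function names are illustrative choices, not part of the original analysis.

```python
# (nucleotides, amino acids, Mr) as listed in Table 3.
TABLE3 = {
    "celA": (303, 100, 10737),
    "celB": (1356, 451, 48805),
    "celC": (738, 245, 27423),
    "celD": (327, 108, 11785),
}


def length_consistent(nucleotides, amino_acids):
    """True if the gene length equals 3 x (amino acids + 1 stop codon)."""
    return nucleotides == 3 * (amino_acids + 1)


def mean_residue_mass(mr, amino_acids):
    """Average mass per residue implied by the predicted Mr."""
    return mr / amino_acids


for gene, (nt, aa, mr) in TABLE3.items():
    print(gene, length_consistent(nt, aa), round(mean_residue_mass(mr, aa), 1))
```

All four genes satisfy the length relation when the stop codon is counted as part of the gene.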

A. Enzyme II': Bs-CelA aligned with Ec-CelA.
B. Enzyme II'': Bs-CelB aligned with Ec-CelB.
C. Enzyme III: Bs-CelD aligned with Ec-CelC.

FIG. 3. Comparison of deduced amino acid sequences of cel operon genes from B. stearothermophilus and E. coli. Identical residues are indicated by asterisks. Conserved residues with similar properties are indicated by dots. The proposed phosphorylation sites are overlined. Gaps (dashes) have been introduced to optimize alignment.

Hydropathy analysis. Analyses of the deduced amino acid sequences by the method of Kyte and Doolittle (25) revealed a hydrophilic or slightly hydrophobic character for the products of celA, celC, and celD (Fig. 4). Further analysis of membrane-spanning domains by the method of Klein et al. (22) predicted a single domain in the celA product which may serve as a membrane anchor. No membrane-spanning domains were predicted in the celC and celD products. Hydropathy profiles for the celA- and celD-encoded polypeptides exhibit a remarkable degree of conservation compared with those of homologous PTS enzymes from other organisms despite differences in the transported sugars (Fig. 4). The celB gene product is very hydrophobic, with 9 to 11 predicted membrane-spanning domains. The hydropathy plot for this protein is also very similar to those of the celB homologs.
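A hydropathy profile of the Kyte and Doolittle type is a sliding-window average of per-residue hydropathy values. The sketch below uses the published Kyte-Doolittle scale; the window length of 19 residues is an assumption (the window used for Fig. 4 is not stated here), and the example sequence is a toy peptide rather than a cel gene product. It does not implement the discriminant analysis of Klein et al.

```python
# Kyte-Doolittle hydropathy scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Returns len(seq) - window + 1 values; window=19 is an assumed default.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


# Toy peptide: a charged N terminus, a leucine-rich stretch, a charged C terminus.
toy = "MKT" + "L" * 20 + "DEK"
profile = hydropathy_profile(toy)
print(max(profile), min(profile))
```

Windows with a strongly positive mean are candidates for membrane-spanning segments; the threshold applied to such a profile is a separate choice and is not fixed by this sketch.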

Reizer et al. (28) previously noted that the N-terminal region of the E. coli celB gene product and the C-terminal region of the E. coli celC translation product can be folded into helical wheel plots to reveal distinctly hydrophobic and hydrophilic faces. A similar analysis of the corresponding regions from the translated sequences of B. stearothermophilus celB and celD confirmed that these patterns were also maintained in this gram-positive organism (Fig. 5). When folded, the first 16 amino acids at the amino terminus of the B. stearothermophilus celB-encoded protein produce a highly hydrophobic face which may be involved in the formation of the transport complex but is not predicted to span the membrane (22). Similarly, the carboxy-terminal region of the celD gene product (close to the histidinyl residue of the proposed phosphorylation site) can also be folded into an α helix in which most hydrophobic residues are segregated on one side and may function in the assembly of the active enzyme II complex.
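A helical wheel places successive residues of an α helix 100° apart around a circle, so that residues sharing a face cluster at similar angles. One way to quantify how strongly hydrophobic residues are segregated to one side is a hydrophobic moment, the length of the vector sum of per-residue hydropathy values at their wheel angles. The sketch below is illustrative only: using the Kyte-Doolittle scale for the moment is an assumption, and the 16-residue peptide is the amino terminus of Bs-CelB as transcribed from Fig. 3B, so it may carry transcription errors.

```python
import math

# Kyte-Doolittle hydropathy scale (used here as an assumed weighting).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def wheel_angles(n, step=100):
    """Angular position in degrees of each of n residues on a helical wheel."""
    return [(i * step) % 360 for i in range(n)]


def hydrophobic_moment(seq, step=100):
    """Magnitude of the vector sum of hydropathy values around the wheel."""
    x = y = 0.0
    for i, aa in enumerate(seq.upper()):
        theta = math.radians(i * step)
        x += KD[aa] * math.cos(theta)
        y += KD[aa] * math.sin(theta)
    return math.hypot(x, y)


# First 16 residues of Bs-CelB as transcribed from Fig. 3B (possible OCR errors).
n_term = "MDRFIRMLEDRVMPVA"
for aa, angle in zip(n_term, wheel_angles(len(n_term))):
    print(aa, angle)
print("hydrophobic moment:", round(hydrophobic_moment(n_term), 2))
```

A large moment relative to the mean hydropathy indicates an amphipathic helix, which corresponds to the distinct hydrophobic and hydrophilic faces described in the text.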

Functional analysis of genes in the B. stearothermophilus cel operon in E. coli. The B. stearothermophilus cel genes were constitutively expressed in DH5α(pLOI902). Recombinant cells hydrolyzed MUC, MUG, and p-nitrophenyl-glucopyranoside. All activities were concurrently lost as a result of deletions in the carboxy-terminal region of celD and frameshift mutations in celB (Fig. 1). The full insert was subcloned in the reverse orientation to provide transcription from the lac promoter. Deletions in the amino-terminal region of celA also resulted in a loss of activity (Fig. 1). A variety of approaches to isolate mutations in the celC gene were tried without success. Thus, a requirement for three of the four genes within the cel operon has been demonstrated, while the fourth gene, celC, is presumed to encode a cleavage enzyme for cellobiose-phosphate.

Cells of DH5α(pLOI902) exhibited strong MUC, MUG, and p-nitrophenyl-β-D-glucopyranoside activities when incubated at 37°C. However, little activity was detected when cells were incubated at 55°C, despite the thermal tolerance of B. stearothermophilus. Previous studies had demonstrated that the general components (enzyme I and Hpr) of the E. coli PTS complement sugar-specific components from other gram-positive bacteria (14, 16). The loss of activity at 55°C could result from thermal inactivation of essential host-supplied components. To test this hypothesis, a ptsI mutant of E. coli, MM6 (17), was transformed with pLOI902. The resulting recombinant was inactive at both temperatures. Since 55°C is sufficient to disrupt the integrity of the cell membrane, the requirement for host enzyme I (and presumably also Hpr) for activity at both temperatures indicates that the phosphorylation of chromogenic substrates may be essential prior to cleavage.

FIG. 4. Hydropathy plots of PTS enzymes from different organisms. Bs-CelB, Ec-CelB, Bs-CelA, and Ec-CelA refer to the amino-terminal and carboxy-terminal ends of enzyme II from B. stearothermophilus and E. coli, respectively. Sa-LacE refers to enzyme II from S. aureus. Bs-CelD, Ec-CelC, and Sa-LacF denote enzyme III from B. stearothermophilus, E. coli, and S. aureus, respectively.

FIG. 5. Comparisons of helical wheel plots of terminal regions from PTS enzymes. Hydrophobic residues are indicated by capital letters, and hydrophilic residues are indicated by lowercase letters. The hydrophobic and hydrophilic faces are divided by the line. Bs-CelB, Ec-CelB, and Sa-LacE are folded from the first 16 amino acid residues at amino-terminal ends of enzyme II (A). Bs-CelD, Ec-CelC, and Sa-LacF are folded from the carboxyl-terminal ends of enzyme III, from amino acids 86 to 104, 97 to 115, and 86 to 103, respectively (B). Bs, B. stearothermophilus; Ec, E. coli; Sa, S. aureus.

DISCUSSION

The chromogenic substrate MUC has often been used to measure exocellobiohydrolase activity, although potential confounding activities from β-glucosidases are well known (8). However, the discovery that MUC and MUG activities could be used as a marker for the cloning of a cellobiose PTS operon from a gram-positive bacterium, B. stearothermophilus, was rather unexpected.

The organization of the PTS has been described by many researchers (31). This system consists of two general energy-coupling proteins, enzyme I and Hpr, as well as a sugar-specific permease, commonly referred to as the enzyme II complex. Although the enzyme II complex may consist of one, two, three, or four distinct polypeptide chains, each complex contains at least three functional domains: a hydrophobic transmembrane domain which binds and transports the sugar, a closely associated hydrophilic domain which possesses the first phosphorylation site, and a second hydrophilic domain containing an additional phosphorylation site. In B. stearothermophilus, celB encodes the membrane-spanning polypeptide which forms the transmembrane channel (Fig. 6). The celA product is less hydrophobic and contains a hydrophobic tail which may serve as a membrane anchor. The celA product is predicted to have a high pI, similar to that of the celB product. Since celA and celB encode domains which form a single polypeptide in many organisms (31), these two gene products can be assumed to interact closely in the enzyme II complex. The celD product is hydrophilic, although a hydrophobic surface appears to be present near the carboxy terminus. This gene product is predicted to have a low pI, which may allow the formation of ionic interactions with both the celA- and celB-encoded polypeptides. The celC product is also very hydrophilic and is proposed to be an enzyme for the cleavage of cellobiose-phosphate, on the basis of homology with E. coli celF. All attempts to mutate this gene were unsuccessful; such mutations may be lethal. In recombinant E. coli expressing MUC and MUG activities, enzyme I (and presumably Hpr) must be supplied by the host and complement the PTS genes from B. stearothermophilus. Other examples of functional complementation with heterologous PTSs from gram-positive organisms in recombinant E. coli have been reported previously (14, 16).

FIG. 6. Model for the cellobiose PTS from B. stearothermophilus. The PTS enzyme II complex is encoded by celA, celB, and celD. The celC gene is proposed to encode a cleavage enzyme for cellobiose-phosphate. Although the enzyme encoded by celC is shown as a hydrolase as described for E. coli (27), the possibility that this enzyme is a phosphorylase (9) has not been eliminated.

MUC, MUG, and p-nitrophenyl-glucopyranoside are analogs of cellobiose and require hydrolysis to release the respective chromogens. Since a functional E. coli enzyme I (ptsI) was required for hydrolysis, phosphorylation can be presumed to be essential. In the simplest case, the phosphorylated intracellular product must be hydrolyzed once to release the chromogen from MUG or twice to release the chromogen from MUC. However, it is possible that the transported substrate is cleaved by a phosphorylase as proposed for C. uda and C. flavigena (9). A phosphorylase would produce 4-methylumbelliferyl-phosphate or p-nitrophenyl-phosphate as products which must be cleaved by a host phosphatase to release the chromogen. The precise nature of the reaction catalyzed by the proposed cleavage enzyme (celC) will be investigated in future studies.

The PTS cellobiose-specific proteins from the gram-positive thermophile B. stearothermophilus (celA, celB, and celD) appear remarkably similar to the corresponding proteins from the gram-negative mesophile E. coli (celA, celB, and celC) (27) and may be derived from a common ancestor. Indeed, the sugar-specific proteins from most organisms which have been examined exhibit a high degree of homology despite differences in the transported substrates (31). However, the proposed phospho-β-glucosidases from these two organisms (celC and celF, respectively) exhibit only modest homology to each other and little homology to other glycohydrolases, consistent with an independent origin. The independent origin of celC is further supported by the high G+C composition of this gene, considerably higher than that of other sequenced genes from B. stearothermophilus. The discovery of a PTS cel operon in B. stearothermophilus which is similar to that in E. coli (27) is consistent with the PTS being the prevalent route for cellobiose uptake in bacteria.

ACKNOWLEDGMENTS

This research was supported by grants 92-37308-7471 and 583620-2-112 from the Department of Agriculture and FG05-86ER3574 from the Division of Energy Biosciences in the Department of Energy.

We thank Lorraine P. Yomano for assistance during these investigations.

REFERENCES

1. Alpert, C.-A., and B. Chassy. 1988. Molecular cloning and nucleotide sequence of the factor IIIlac gene of Lactobacillus casei. Gene 62:277-288.

2. Alpert, C.-A., and B. Chassy. 1990. Molecular cloning and DNA sequence of lacE, the gene encoding the lactose-specific enzyme II of the phosphotransferase system of Lactobacillus casei. J. Biol. Chem. 265:22561-22568.

3. Al-Zaag, A. 1989. Molecular cloning of cellobiose and other β-glucosidase determinants from Klebsiella oxytoca. J. Biotechnol. 12:79-86.

4. Breidt, F., Jr., W. Hengstenberg, U. Finkeldei, and G. C. Stewart. 1987. Identification of the gene for the lactose-specific components of the phosphotransferase system in the lac operon of Staphylococcus aureus. J. Biol. Chem. 262:16444-16449.

5. Brown, B. J., J. F. Preston, and L. O. Ingram. 1991. Cloning of alginate lyase gene (alxM) and expression in Escherichia coli. Appl. Environ. Microbiol. 57:1870-1872.

6. Claus, D., and R. C. W. Berkeley. 1986. Genus Bacillus Cohn 1872, 174AL, p. 1105-1139. In P. H. A. Sneath, N. S. Mair, M. E. Sharpe, and J. G. Holt (ed.), Bergey's manual of systematic bacteriology, vol. 2. Williams & Wilkins, Baltimore.

7. Coughlan, M. P. 1985. The properties of fungal and bacterial cellulases with comment on their production and application. Biotechnol. Genet. Eng. Rev. 3:39-109.

8. Coughlan, M. P. 1988. Staining techniques for the detection of the individual components of cellulolytic enzyme system. Methods Enzymol. 160:135-144.

9. Coughlan, M. P., and F. Mayer. 1991. The cellulose-decomposing bacteria and their enzyme system, p. 460-516. In A. Balows, H. G. Trüper, M. Dworkin, W. Harder, and K.-H. Schleifer (ed.), The procaryotes, 2nd ed. A handbook on the biology of bacteria: ecophysiology, isolation, identification, applications, vol. 1. Springer-Verlag, New York.

10. Cutting, S. M., and P. B. Vander Horn. 1990. Genetic analysis, p. 27-74. In C. R. Harwood and S. M. Cutting (ed.), Molecular biological methods for Bacillus. John Wiley & Sons Ltd., Chichester, England.

11. Davies, G. J., J. A. Littlechild, H. C. Watson, and L. Hall. 1991. Sequence and expression of the gene encoding 3-phosphoglycerate kinase from Bacillus stearothermophilus. Gene 109:39-45.

12. Debarbouille, M., M. Arnaud, A. Fouet, A. Klier, and G. Rapoport. 1990. The sacT gene regulating the sacPA operon in Bacillus subtilis shares strong homology with transcriptional antiterminators. J. Bacteriol. 172:3966-3973.

13. Debarbouille, M., I. Martin-Verstraete, A. Klier, and G. Rapoport. 1991. The transcriptional regulator levR of Bacillus subtilis has domains homologous to both σ54- and phosphotransferase system-dependent regulators. Proc. Natl. Acad. Sci. USA 88:2212-2216.

14. de Vos, W. M., I. Boerrigter, R. J. van Rooyen, B. Reiche, and W. Hengstenberg. 1990. Characterization of the lactose-specific enzymes of the phosphotransferase system in Lactococcus lactis. J. Biol. Chem. 265:22554-22560.

15. El Hassouni, M., B. Henrissat, M. Chippaux, and F. Barras. 1992. Nucleotide sequences of the arb genes, which control β-glucoside utilization in Erwinia chrysanthemi: comparison with the Escherichia coli bgl operon and evidence for a new β-glycohydrolase family including enzymes from eubacteria, archeabacteria, and humans. J. Bacteriol. 174:765-777.

16. Fouet, A., M. Arnaud, A. Klier, and G. Rapoport. 1987. Bacillus subtilis sucrose-specific enzyme II of the phosphotransferase system: expression in Escherichia coli and homology to enzyme II from enteric bacteria. Proc. Natl. Acad. Sci. USA 84:8773-8777.

17. Fraenkel, D. G., F. Falcoz-Kelly, and B. L. Horecker. 1964. The utilization of glucose 6-phosphate by glucokinaseless and wild-type strains of Escherichia coli. Proc. Natl. Acad. Sci. USA 52:1207-1213.

18. Hall, B. G., and W. Faunce III. 1987. Functional genes for cellobiose utilization in natural isolates of Escherichia coli. J. Bacteriol. 169:2713-2717.

19. Hall, B. G., and L. Xu. 1992. Nucleotide sequence, function, activation, and evolution of the cryptic asc operon of Escherichia coli K-12. Mol. Biol. Evol. 9:688-706.

20. Hengstenberg, W., B. Reiche, R. Eisermann, R. Fischer, U. Keßler, A. Tarrach, W. M. De Vos, H.-R. Kalbitzer, and S. Glaser. 1989. Structure and function of proteins involved in sugar transport by the PTS of Gram-positive bacteria. FEMS Microbiol. Rev. 63:35-42.

21. Higgins, D. G., A. J. Bleasby, and R. Fuchs. 1992. CLUSTAL V: improved software for multiple sequence alignment. Comput. Appl. Biosci. 8:189-191.

22. Klein, P., M. Kanehisa, and C. Delisi. 1985. The detection and classification of membrane spanning proteins. Biochim. Biophys. Acta 815:468-476.

23. Kluepfel, D. 1988. Screening of prokaryotes for cellulose- and hemi-cellulose-degrading enzymes. Methods Enzymol. 160:180-186.

24. Koransky, J. R., S. D. Allen, and V. R. Dowell, Jr. 1978. Use of ethanol for selective isolation of sporeforming microorganisms. Appl. Environ. Microbiol. 35:762-765.

25. Kyte, J., and R. F. Doolittle. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-132.


26. Mountain, A. 1989. Gene expression system for B. subtilis, p. 73-114. In C. R. Harwood (ed.), Bacillus. Plenum Press, New York.

27. Parker, L. L., and B. G. Hall. 1990. Characterization and nucleotide sequence of the cryptic cel operon of Escherichia coli K-12. Genetics 124:455-471.

28. Reizer, J., A. Reizer, and M. H. Saier, Jr. 1990. The cellobiose permease of Escherichia coli consists of three proteins and is homologous to the lactose permease of Staphylococcus aureus. Res. Microbiol. 141:1061-1067.

29. Roberts, R. J., and D. Macelis. 1992. Restriction enzymes and their isoschizomers. Nucleic Acids Res. 20(Suppl.):2167-2180.

30. Rosenberg, M., and D. Court. 1979. Regulatory sequences involved in the promotion and termination of RNA transcription. Annu. Rev. Genet. 13:319-353.

31. Saier, M. H., Jr., and J. Reizer. 1992. Proposed uniform nomenclature for the proteins and protein domains of the bacterial phosphoenolpyruvate:sugar phosphotransferase system. J. Bacteriol. 174:1433-1438.

32. Sakoda, H., and T. Imanaka. 1992. Cloning and sequencing of the gene coding for alcohol dehydrogenase of Bacillus stearothermophilus and rational shift of the optimum pH. J. Bacteriol. 174:1397-1402.

33. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

34. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

35. Wood, T. M., and K. M. Bhat. 1988. Methods for measuring cellulase activities. Methods Enzymol. 160:87-112.

36. Zukowski, M. M., L. Miller, P. Cosgwell, K. Chen, S. Aymerich, and M. Steinmetz. 1990. Nucleotide sequence of the sacS locus of Bacillus subtilis reveals the presence of two regulatory genes. Gene 90:153-155.
