JOURNAL OF BACTERIOLOGY, Apr. 2008, p. 2892–2902, Vol. 190, No. 8
0021-9193/08/$08.00+0 doi:10.1128/JB.01652-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.

Complete Genome Sequence of the Mosquitocidal Bacterium Bacillus sphaericus C3-41 and Comparison with Those of Closely Related Bacillus Species†

Xiaomin Hu,1 Wei Fan,2 Bei Han,1 Haizhou Liu,1 Dasheng Zheng,1 Qibin Li,2 Wei Dong,2 Jianping Yan,1 Meiying Gao,1 Colin Berry,3 and Zhiming Yuan1*

State Key Laboratory of Virology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, China1; Beijing Genomics Institute, Chinese Academy of Sciences, Beijing, China2; and Cardiff School of Biosciences, Cardiff University, Museum Avenue, Cardiff, CF10 3US, United Kingdom3

Received 12 October 2007/Accepted 6 February 2008

Bacillus sphaericus strain C3-41 is an aerobic, mesophilic, spore-forming bacterium that has been used with great success in mosquito control programs worldwide. Genome sequencing revealed that the complete genome of this entomopathogenic bacterium is composed of a chromosomal replicon of 4,639,821 bp and a plasmid replicon of 177,642 bp, containing 4,786 and 186 potential protein-coding sequences, respectively. Comparison of the genome with other published sequences indicated that the B. sphaericus C3-41 chromosome is most similar to that of Bacillus sp. strain NRRL B-14905, a marine species that, like B. sphaericus, is unable to metabolize polysaccharides. The lack of key enzymes and sugar transport systems in the two bacteria appears to be the main reason for this inability, and the abundance of proteolytic enzymes and transport systems may endow these bacteria with exclusive metabolic pathways for a wide variety of organic compounds and amino acids. The genes shared between B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905, including mobile genetic elements, membrane-associated proteins, and transport systems, demonstrated that these two species are a biologically and phylogenetically divergent group. Knowledge of the genome sequence of B. sphaericus C3-41 thus increases our understanding of the bacilli and may also offer prospects for future genetic improvement of this important biological control agent.

Bacillus sphaericus is a naturally occurring, aerobic, mesophilic, spore-forming bacterium, commonly isolated from the soil. Isolates of B. sphaericus can be grouped into 49 serotypes on the basis of flagellar agglutination, but relatively few biochemical and morphological tests are available to distinguish B. sphaericus as a species. Analysis of DNA homology between strains indicated five major groups (I to V), each probably corresponding to a separate species (34). Group II was further subdivided into groups IIA and IIB. Mosquitocidal B. sphaericus strains are all found within DNA subgroup IIA and in association with nine serotypes (H1, H2, H3, H5, H6, H9, H25, H26, and H48). High-toxicity strains exhibit toxicity against mosquito larvae and thus are utilized in insect control programs to reduce the populations of vector species that transmit tropical diseases, such as malaria, filariasis, and arboviral diseases (e.g., yellow fever, dengue fever, and West Nile virus). The mosquitocidal properties are due to the action of binary toxin (Bin proteins), which forms crystal inclusions during sporulation, and mosquitocidal toxins (Mtx proteins), produced during vegetative growth (51). Some strains also produce a further two-component toxin on sporulation (Cry48 and Cry49 proteins) (31). Compared to Bacillus thuringiensis subsp. israelensis, which is the other major bacterium used in the biological control of mosquitoes, B. sphaericus offers a distinct advantage, having higher levels of efficacy and environmental persistence (50, 66). Besides being an important bioinsecticide for mosquito control, B. sphaericus has several important phenotypic properties, including those of being incapable of polysaccharide utilization and having exclusive metabolic pathways for a wide variety of organic compounds and amino acids (3, 28, 54, 59, 72).

To date, no extensive chromosomal DNA sequencing of B. sphaericus isolates has been reported. Moreover, no other genome sequencing of a bacterium incapable of polysaccharide utilization has been completed. B. sphaericus C3-41, a highly active strain isolated from a mosquito breeding site in China in 1987, shows toxicity against Culex sp., Anopheles sp., and Aedes sp. and has significantly higher activity against Culex sp. than the commercialized B. sphaericus strain 2362 (75). The C3-41 strain belongs to the flagellar serotype H5a5b, like B. sphaericus strains 2362 and 1593 (74), and it has been developed as a commercial larvicide (JianBao) and successfully used for the control of mosquito larvae for more than 10 years in China. In some cities, such as Shenzhen, Foshan, and Dongguan in southern China, the C3-41 mosquitocidal formulation has been chosen as the sole larvicidal agent for breeding-site management in the integrated mosquito control program. Here, we report the sequencing of the B. sphaericus C3-41 genome and a comparative analysis with genomes of other species. These data provide a global view of the genes possessed by the organism and an insight into evolutionary relationships among the bacilli.

* Corresponding author. Mailing address: Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, China. Phone: 86-27-87197242. Fax: 86-27-87198137. E-mail: [email protected].

† Supplemental material for this article may be found at http://jb.asm.org/.

Published ahead of print on 22 February 2008.

Downloaded from http://jb.asm.org/ on January 18, 2020 by guest



MATERIALS AND METHODS

Sequencing of the B. sphaericus C3-41 genome. High-molecular-mass genomic DNA isolated from B. sphaericus C3-41 was used to construct small (1.5 to 3 kb) and large (6 to 8 kb) random sequencing libraries. Whole-genome shotgun sequencing was performed by using Applied Biosystems 3700 DNA sequencers (Perkin-Elmer). The genome was first assembled into 418 contigs by using the Phred-Phrap-Consed package (17, 18, 24). Gaps were closed by primer walking over clone inserts, genome PCR, and pooling. Finally, the genome was assembled into two contigs representing the circular chromosome (with an average of 8.9-times coverage) and the circular plasmid (21.4-times coverage). An estimate of the copy number of the plasmid was obtained by dividing the coverage depth of the plasmid by that of the chromosome. The repeats of the chromosome were categorized by means of a suffix tree algorithm (37).
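The copy-number estimate described above is a simple ratio of coverage depths. A minimal sketch, using the two coverage values reported in the text (the function name is illustrative and not part of the authors' pipeline):

```python
def estimate_copy_number(plasmid_coverage: float, chromosome_coverage: float) -> float:
    """Estimate plasmid copies per chromosome from sequencing coverage depths."""
    if chromosome_coverage <= 0:
        raise ValueError("chromosome coverage must be positive")
    return plasmid_coverage / chromosome_coverage


# Coverage depths reported for the C3-41 assembly.
ratio = estimate_copy_number(plasmid_coverage=21.4, chromosome_coverage=8.9)
print(f"coverage ratio: {ratio:.2f}")
```

The ratio comes out at roughly 2.4, which is consistent with the plasmid being described as a two-copy replicon in the Results.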

Sequence annotation. Glimmer 3 gene finder (56) was utilized to identify potential coding regions. The annotation was accomplished by BlastP analysis of sequences in the Nr, Nt, and Swissprot databases, respectively, and by manual curation of the outputs of a variety of similarity searches and was completed as described previously (63). The possible orthologs of the genome were identified based on the COG (clusters of orthologous groups of proteins) database and classified accordingly (62). The potential coding sequences (CDSs) involved in different pathways were determined by KEGG analysis (47, 30). The protein motifs and domains of all CDSs were documented based on intensive searches against publicly available databases and by using their application tools, including Pfam, PRINTS, PROSITE, ProDom, and SMART. The results were summarized with InterPro (6). tRNA genes were identified by using tRNAscan-SE (41). GC skew analysis and the circular-genome-map drawing were performed by using CGView software (60).
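GC skew is conventionally computed as (G − C)/(G + C) over a sliding window. The sketch below illustrates the calculation on a circular sequence; it is not the CGView implementation, and the function names are hypothetical. The default window and step follow the 20-kb window and 5-kb step given in the Fig. 1 legend.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one window; 0.0 if the window has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_gc_skew(genome: str, window: int = 20_000, step: int = 5_000):
    """Yield (start, skew) pairs along a circular genome."""
    n = len(genome)
    if window > n:
        raise ValueError("window must not exceed genome length")
    # Append the first `window` bases so windows near the end wrap around the origin.
    doubled = genome + genome[:window]
    return [(start, gc_skew(doubled[start:start + window])) for start in range(0, n, step)]


# Toy circular sequence: G-rich first half, C-rich second half.
toy = "G" * 50 + "C" * 50
for start, skew in sliding_gc_skew(toy, window=10, step=10):
    print(start, skew)
```

On a real chromosome, the sign change of the skew is what is used to suggest the replication origin and terminus.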

Construction of orthologous/paralogous families and comparison analysis. Orthologous/paralogous families for B. sphaericus C3-41, five other completely sequenced organisms (i.e., Bacillus anthracis strain Ames, Bacillus subtilis strain 168, Thermoanaerobacter tengcongensis MB4, Clostridium perfringens strain 13, and Escherichia coli K-12), and two gapped genomes, those of Bacillus sp. strain NRRL B-14911 and Bacillus sp. strain NRRL B-14905, were built by using Treefam’s method of comparing the gene tree with the species tree (http://www.treefam.org/) in stages (38). The method of Treefam defines a gene family as a group of genes that evolved after the speciation, and the orthologs and paralogs in TreeFam are inferred from the phylogenetic tree of a gene family and are different from those inferred by BLAST matches (i.e., Inparanoid, KOGs, and OrthoMCL) or BLAST matches and synteny (i.e., Ensembl-Compara and HomoloGene). Thus, it also tries to include outgroup genes to reveal the distant members (38). In this study, besides the reference Bacillus genomes, we have taken C. perfringens and E. coli as the outgroup species. In addition, B. sphaericus is an archaic organism and its spores have apparently been found in 25- to 40-million-year-old amber (9). Previous studies have also suggested that this mesophilic bacterium is closely related to some thermophilic archaea (29, 45, 70). T. tengcongensis was, therefore, included in the construction of orthologous/paralogous families.

There are four main steps: (i) an all-versus-all BLAST with proteins and conjoined fragmental alignments by Solar (http://treesoft.svn.sourceforge.net/viewrc/treesoft/branches/dev/solar); (ii) clustering of gene families by using Hcluster_sg (http://treesoft.svn.sourceforge.net/viewrc/treesoft/trunk/hcluster); (iii) performing multiple alignments by Muscle (http://www.ebi.ac.uk/muscle) and converting protein alignments to CDS alignments by using a Perl script; and (iv) building phylogenetic trees and inferring orthologs and paralogs by the neighbor-joining method.
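Step (iii) converts each aligned protein back into a codon-level alignment. The authors used a Perl script that is not reproduced here; the following Python sketch only illustrates the general idea, under the assumption that each CDS is in frame and starts at the first aligned residue. The function name is hypothetical.

```python
def protein_to_codon_alignment(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread an unaligned CDS onto a gapped protein alignment row.

    Each residue consumes one codon (3 nt) and each gap becomes three gap characters.
    A trailing stop codon in the CDS, if present, is simply left unused.
    """
    n_residues = len(aligned_protein) - aligned_protein.count(gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("CDS is too short for the number of aligned residues")
    codons = []
    pos = 0
    for symbol in aligned_protein:
        if symbol == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


# A two-residue protein aligned with one internal gap.
print(protein_to_codon_alignment("M-K", "ATGAAA"))
```

Keeping gaps in multiples of three preserves the reading frame, which matters when trees are built from the nucleotide alignment.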

Nucleotide sequence accession numbers. The B. sphaericus C3-41 genome is available in GenBank under accession numbers CP000817 and CP000818.

RESULTS

General features of the genome sequence. The genome of B. sphaericus C3-41 consists of a circular chromosome of 4,639,821 bp with an average G+C content of 37.29% and a two-copy plasmid (named pBsph) of 177,642 bp with an average G+C content of 33.10% (Table 1; Fig. 1A and B). A total of 4,786 and 186 CDSs were identified in the chromosome and plasmid, respectively. There are 85 tRNA genes representing all 20 amino acids and 10 rRNA operons in the chromosome. The likely origin of replication of the chromosome of B. sphaericus C3-41 was identified by similarities to several features of the corresponding regions in B. subtilis and other bacteria, including colocalization of four genes (rpmH, dnaA, dnaN, and recF) near the origin, GC nucleotide skew [(G − C)/(G + C)] analysis, and the presence of multiple dnaA boxes and AT-rich sequences immediately upstream of the dnaA gene (13, 36, 40, 42). The replication termination site of the chromosome is believed to be localized near 2.9 megabases (Mb), according to GC skew analysis, and the coding bias for the two strands of the genome is for the majority of CDSs to be on the outer strand from 0 to ~2.9 Mb and on the inner strand from ~2.9 Mb to the origin (Fig. 1, circles 1, 4, and 5). This is also reflected by the presence of several genes near the 2.9-Mb site, including parC and parE, which encode the subunits of topoisomerase IV, involved in chromosome partitioning (1, 12). However, we did not find the homolog of rtp (replication terminator protein-encoding gene) in the chromosome of B. sphaericus C3-41, and it can be seen from Fig. 1 that the putative replication termination site is significantly offset from the point diametrically opposite to the origin. This is not unique to B. sphaericus C3-41, but it is unusual. The potential origin of replication of the large, 178-kb plasmid was also analyzed by GC skew (Fig. 1B). The possible replication proteins were identified through database comparisons (Bsph_014-017). Close to this putative origin, a predicted CDS shows high similarities to FtsZ/tubulin family proteins, which are known to be involved in plasmid replication (7, 67).

Ortholog analysis of a total of 34,008 CDSs from the genome of B. sphaericus C3-41; five other genomes, those of B. anthracis strain Ames, B. subtilis strain 168, T. tengcongensis MB4, C. perfringens strain 13, and E. coli K-12; and two gapped genomes, those of Bacillus sp. strain NRRL B-14911 and Bacillus sp. strain NRRL B-14905 (GenBank accession numbers NC_003997, NC_000964, NC_003869, NC_003366, NC_000913, NZ_AAOX00000000, and NZ_AAXV00000000) was performed. From this analysis, 29,321 CDSs with obvious homologs were classified in relation to 4,601 protein families. The largest two families of B. sphaericus C3-41 are ATP-binding iron transport system proteins (66 CDSs) and sensory transduction proteins (57 CDSs). About 339 CDSs from B. sphaericus C3-41 have no matches in the other seven species and were unclassified, and another 102 CDSs from B. sphaericus C3-41 that have only low similarities to other CDSs were classified singly.

TABLE 1. General features of whole-chromosome sequences of B. sphaericus C3-41

Genomic location  Size (bp)  No. of CDSs  Replicon coding (%)  G+C content (%)  No. of rRNA operons  No. of tRNAs  No. of putative prophages  No. of transposase genes  No. of insertion elements
Chromosome        4,639,821  4,786        82.7                 37               10                   85            3                          45                        7
Plasmid           177,642    186          80.2                 33.1             0                    0             0                          1                         1

Genome comparisons between B. sphaericus C3-41 and other bacteria. Direct comparisons between the predicted CDSs of the B. sphaericus C3-41 chromosome and those of seven other bacterial species (B. anthracis strain Ames, B. subtilis strain 168, T. tengcongensis MB4, C. perfringens strain 13, E. coli K-12, Bacillus sp. strain NRRL B-14911, and Bacillus sp. strain NRRL B-14905) were performed by BLAST analyses (5). The results revealed that the putative CDSs of B. sphaericus C3-41 share greater similarities with those of Bacillus sp. NRRL B-14905 than with those of other Bacillus species (Table 2). About 3,716 CDSs (77.64% of the total genes predicted in B. sphaericus C3-41) have matches in the gapped genome sequences of Bacillus sp. strain NRRL B-14905, and the average identity of these genes is 91.62%. Bacillus sp. strain NRRL B-14905 is a marine Bacillus species, reported previously to be closely related to B. sphaericus C3-41 according to a phylogenetic tree based on 16S rRNA sequencing, and both of them share similar physiological characteristics, such as strict aerobiosis, lack of sugar metabolism, and the production of round spores (58). Additionally, T. tengcongensis MB4 and C. perfringens strain 13 are more closely related than E. coli K-12 to B. sphaericus C3-41, these bacteria having 42.77%, 41.73%, and 27.48% CDSs, respectively, that match those of B. sphaericus C3-41 (Table 2).

FIG. 1. Circular representations of the genome of B. sphaericus C3-41. (A) Chromosome. (B) Plasmid pBsph. From the inside: circles 1 and 2, GC skew and G+C content (20-kb window with 5-kb step); circle 3, blue and green bars show positions of tRNA and rRNA, respectively, and black bars show positions of repeats; circles 4 and 5, CDSs on the + and − strands. Colors reflect functional categories of CDSs. Teal, chromatin structure and dynamics; blue, energy production and conversion; orange, cell cycle control, cell division, and chromosome partitioning; maroon, amino acid transport and metabolism; dark blue, nucleotide transport and metabolism; silver, carbohydrate transport and metabolism; dark green, coenzyme transport and metabolism; dark purple, lipid transport and metabolism; navy, translation, ribosomal structure, and biogenesis; light brown, transcription; aqua, replication, recombination, and repair; green, cell wall/membrane/envelope biogenesis; fuchsia, cell motility; gray, posttranslational modification, protein turnover, and chaperones; dark yellow, inorganic ion transport and metabolism; dark blue, secondary metabolite biosynthesis, transport, and catabolism; dark red, general function prediction only; dark gray, function unknown; lime, signal transduction mechanisms; yellow, intracellular trafficking, secretion, and vesicular transport; olive, defense mechanisms; black, not classified by COG. The “0” coordinates marked on the outmost circles correspond to the putative replication origins, and the putative replication termination site is located near 2.9 Mb.

TABLE 2. Genome comparisons between B. sphaericus C3-41 and seven other species

Strain                            Size (bp)   Total no. of genes  No. of genes aligned  % of genes in B. sphaericus C3-41  % of genes in query genomes  Mean DNA identityb
Bacillus sp. strain NRRL B-14905  4,497,271a  4,624               3,716                 77.64                              80.36                        91.62
Bacillus sp. strain NRRL B-14911  5,085,825a  5,691               2,315                 48.37                              40.68                        50.24
B. anthracis strain Ames          5,227,293   5,311               2,221                 46.41                              41.82                        48.89
B. subtilis strain 168            4,214,630   4,105               2,025                 42.31                              49.33                        48.53
T. tengcongensis MB4              2,689,445   2,588               1,107                 23.13                              42.77                        38.64
C. perfringens strain 13          3,031,430   2,660               1,110                 23.19                              41.73                        37.91
E. coli K-12                      4,639,675   4,243               1,166                 24.36                              27.48                        36.28

a Gapped genome sequences from NCBI database.
b Quantitated according to the identity value of the alignment of each gene.
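The two percentage columns of Table 2 can be reproduced from the gene counts. The sketch below assumes, as the reported values suggest, that the reference denominator is the 4,786 chromosomal CDSs of B. sphaericus C3-41; the dictionary layout and function name are illustrative only.

```python
C341_CDS = 4786  # chromosomal CDSs predicted in B. sphaericus C3-41

# strain -> (total genes in query genome, genes aligned), copied from Table 2
TABLE2 = {
    "Bacillus sp. strain NRRL B-14905": (4624, 3716),
    "Bacillus sp. strain NRRL B-14911": (5691, 2315),
    "B. anthracis strain Ames": (5311, 2221),
    "B. subtilis strain 168": (4105, 2025),
    "T. tengcongensis MB4": (2588, 1107),
    "C. perfringens strain 13": (2660, 1110),
    "E. coli K-12": (4243, 1166),
}


def shared_percentages(total_genes: int, aligned: int, reference_total: int = C341_CDS):
    """Return (% of reference CDSs matched, % of query genes matched), rounded to 2 places."""
    return (
        round(100 * aligned / reference_total, 2),
        round(100 * aligned / total_genes, 2),
    )


for strain, (total, aligned) in TABLE2.items():
    pct_ref, pct_query = shared_percentages(total, aligned)
    print(f"{strain}: {pct_ref}% of C3-41 CDSs, {pct_query}% of query genes")
```

The mean DNA identity column depends on the individual alignments and cannot be recomputed from the counts alone.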


The synteny between genomes of B. sphaericus C3-41 and B. anthracis strain Ames (GenBank accession number NC_003997), B. subtilis strain 168 (GenBank accession number NC_000964), T. tengcongensis MB4 (GenBank accession number NC_003869), C. perfringens strain 13 (GenBank accession number NC_003366), E. coli K-12 (GenBank accession number NC_000913), Bacillus cereus 10987 (GenBank accession number NC_003909), B. cereus ATCC 14579 (GenBank accession number NC_004722), and B. thuringiensis serovar Konkukian strain 97-27 (GenBank accession number NC_005957), respectively, is low or absent (Fig. 2 illustrates this for B. subtilis, B. anthracis, and T. tengcongensis). Two main long-fragment syntenies (about 460 kb and 760 kb, respectively) are observed between B. sphaericus C3-41, B. subtilis strain 168, and the three B. cereus group strains. B. sphaericus C3-41 and B. subtilis strain 168 display a better synteny than other Bacillus species. These species may have shared a common “chromosome backbone” in a very ancient stage.

Further comparisons among B. sphaericus C3-41, Bacillus sp. strain NRRL B-14905, and B. subtilis strain 168 were performed. There are a significant number of proteins that are homologous (BLAST E value, ≤10⁻⁵) between B. sphaericus C3-41 (1,227) and Bacillus sp. strain NRRL B-14905 (1,152) but are not present in B. subtilis strain 168 (Fig. 3).

Mobile genetic elements and prophages. B. sphaericus C3-41 exhibits 45 and 1 CDSs in the chromosome and the plasmid, respectively, that are predicted to encode transposases. The 45 transposases of the chromosome are assigned to 11 families according to the ortholog analysis among eight species (see Table S1 in the supplemental material). The largest family, containing 18 transposases in B. sphaericus C3-41, has 14 matches in Bacillus sp. strain NRRL B-14905. The CDSs of this family member appear homologous to the transposase sequences of the IS5 and ISNCY families. The second- and third-largest families contain 11 and 5 transposases in B. sphaericus C3-41, having 22 and 19 matches in Bacillus sp. strain NRRL B-14905, respectively. The CDSs of both families are homologous to the IS3 family (The definitions and the sequences of insertion sequence [IS] families are from the IS Finder database [http://www-is.biotoul.fr/is.html]). Obviously, B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905 share transposase sequences with higher similarities than the six other species.

The transposase sequences of B. sphaericus C3-41 were submitted to the IS Finder database to search for possible insertion elements. At least seven IS elements were found in the chromosome (named ISBsph3 to ISBsph9), and one copy of ISBsph9 exists in plasmid pBsph (Table 3). ISBsph3, ISBsph4, ISBsph5, ISBsph6, ISBsph7, and ISBsph9 share a number of features with other IS3 family elements, including inverted repeats, similarity between transposases, and the DD-(35)-E-(7)-K or DD-(35)-E-(7)-R motifs, which are highly conserved patterns among ISs (19, 33, 35).

According to the AT-rich signatures and similarities to other known phage and prophage sequences, three putative prophages or prophage-related regions (i.e., Bsph_1772-1792, Bsph_1936-1952, and Bsph_3729-3788, respectively) were predicted to be in the chromosome. The putative amino acid sequences of Bsph_1945 and Bsph_3765 showed high similarities to the large subunit of terminase, a signature protein that is highly conserved among prophages (10). The first putative prophage was found to be similar to the B. subtilis phage-like element PBSX but lacked a terminase gene. This prophage region is also present in Bacillus sp. strain NRRL B-14905. The third putative prophage is similar to the possible prophage of Salmonella enterica serovar Typhimurium. Only three CDSs in the second putative prophage-related region were similar to known phage-associated proteins, including Bsph_1945 (similar to the putative bacteriophage terminase large subunit), Bsph_1947 (similar to the portal vertex protein), and Bsph_1949 (similar to the phage-related holin protein). Thus, the second putative prophage-related region might be the remnant of phage invasion and genome deterioration during evolution.

FIG. 2. Plots of gene pairs based on genomic location. Each dot indicates a single protein plotted on the 5′ end of the coding region in the reference genome and the best match in the query genome. From the top down: comparison of B. sphaericus C3-41 and B. subtilis subsp. strain 168; comparison of B. sphaericus C3-41 and B. anthracis strain Ames; comparison of B. sphaericus C3-41 and T. tengcongensis MB4.

FIG. 3. Venn diagram illustrating the number of putative proteins associated with each organism and the number shared between these organisms. Bsu, B. subtilis strain 168; Bac, Bacillus sp. strain NRRL B-14905; Bsp, B. sphaericus C3-41; n, total number of putative proteins encoded by the organism.

Carbohydrate metabolism and transport systems. The inability to metabolize carbohydrates is a well-known feature of both B. sphaericus and Bacillus sp. strain NRRL B-14905, but the details of their energy metabolism remain to be investigated. Three CDSs, encoding CcpA (catabolite control protein A), HPr, and Crh (the regulatory paralogue of HPr), were present in the genomes of B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905. CcpA is a pleiotropic transcriptional regulator that acts as the key factor in the regulation of carbon and nitrogen metabolism, interacting with regulatory sites in the control regions of the regulated operon to either repress or activate transcription (8, 44, 69, 71). An alignment of CcpA sequences among B. sphaericus C3-41 (Bsph_4200) and 50 other bacterial isolates was performed (data not shown). Despite the variability among different species, the phylogenetic relationships among the 51 samples revealed sequence conservation of this pleiotropic transcriptional regulator in closely related species. B. sphaericus was grouped with the other Bacillus species and classified into a clade with Bacillus sp. strain NRRL B-14905 (see Fig. S1 in the supplemental material). In B. sphaericus C3-41, 15 examples of the TGWAANCGNTNWCA consensus, a cis-acting palindromic sequence called cre (catabolite-responsive element), which might be the regulatory binding sites of CcpA (69), were found throughout the genome where they may influence genes encoding proteins involved in amino acid metabolism, transport, and transcriptional regulators. This number is far less than the numbers in B. subtilis and other low-GC, gram-positive bacterial strains (8, 30, 44, 61).
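A consensus such as TGWAANCGNTNWCA is written in IUPAC nucleotide codes (W = A or T, N = any base), so candidate cre sites can be located by translating it into a regular expression. The sketch below is an illustration only, not the search the authors ran: the function names and the test sequence are hypothetical, only the given strand is scanned, and a real search would also cover the reverse complement.

```python
import re

# Subset of IUPAC nucleotide codes needed for the cre consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}

CRE_CONSENSUS = "TGWAANCGNTNWCA"


def iupac_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus string into a regular-expression pattern."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_cre_sites(seq: str, consensus: str = CRE_CONSENSUS):
    """Return (0-based start, matched site) for every match, including overlapping ones."""
    # A lookahead lets the scanner report overlapping matches.
    pattern = re.compile("(?=(" + iupac_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical fragment with one site embedded after three flanking bases.
fragment = "CCCTGAAAGCGCTTTCAGG"
print(find_cre_sites(fragment))
```

Exact consensus matching is a deliberately strict choice here; allowing mismatches would report more candidate sites and would need a different scoring approach.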

The function of HPr, which acts as a cofactor for CcpA, is to participate in the phosphotransferase system (PTS)-catalyzed transport and phosphorylation of carbohydrates (15, 21, 30, 61). The presence in B. sphaericus C3-41 of conserved enzyme I (EI)- and HPr-encoding genes, such as Bsph_2351, Bsph_2352, and Bsph_0434, which might be involved in the phosphoenolpyruvate-dependent protein phosphorylation chain, was predicted. The direct and specific interaction of CcpA and HPr–Ser46-P or HPr–His15-P has been demonstrated previously and is assumed to provide a direct link between glycolytic activity of the cell and Cre binding by CcpA (15). The predicted HPr of B. sphaericus C3-41 (i.e., Bsph_2351) might be phosphorylated by the EI (PtsI) encoded by Bsph_2352 at His-15 or phosphorylated at a regulatory serine (Ser-46) by ATP and the HPr kinase encoded by Bsph_0434, as occurs in the phosphoenolpyruvate-dependent PTS system of B. subtilis (16, 22, 23, 53), and the phosphoryl group might be transferred to the sugar-specific EIIA binding proteins. This ATP-dependent phosphorylation might regulate the induction and carbon catabolite repression of some catabolic genes, as described previously (55).

However, with the exception of the sugar-binding protein EII that is specific for N-acetylglucosamine, no other sugar-binding protein was identified in B. sphaericus C3-41. Bacterial genome comparison analysis revealed that B. sphaericus C3-41 lacks many PTS systems, such as glucose- or fructose-specific PTS enzyme components. Compared with B. subtilis strain 168 (25 PTS systems) and B. anthracis strain Ames (19 PTS systems), B. sphaericus C3-41 has fewer PTS systems (only 9), and these are probably specific for the transport of N-acetylglucosamine, cellobiose, and an unknown pentitol, according to their KEGG orthologs (http://www.genome.jp/kegg/). Furthermore, ortholog analysis indicated that some ABC sugar transporters functioning in sugar binding and transport are present in the other six species but absent from B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905, both of which may, therefore, be defective in the transport of sugars.
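The presence/absence reasoning above (functions found in every comparison genome but in neither B. sphaericus C3-41 nor Bacillus sp. strain NRRL B-14905) amounts to a set difference over ortholog tables. The following Python sketch illustrates the idea under stated assumptions: the function name is hypothetical, and the tiny presence table is toy data chosen only to mirror the qualitative statement in the text, not results from this study.

```python
def absent_from_targets(presence, targets):
    """Functions present in every non-target genome and in none of the targets.

    presence maps a genome label to the set of ortholog labels found in it.
    """
    references = [g for g in presence if g not in targets]
    shared_by_references = set.intersection(*(presence[g] for g in references))
    found_in_any_target = set().union(*(presence[g] for g in targets))
    return shared_by_references - found_in_any_target


# Toy presence table (illustrative labels, not real annotation output).
toy_presence = {
    "B. sphaericus C3-41": {"EI", "HPr", "GlcNAc_EII"},
    "Bacillus sp. NRRL B-14905": {"EI", "HPr", "GlcNAc_EII"},
    "B. subtilis 168": {"EI", "HPr", "GlcNAc_EII", "glucose_EII", "fructose_EII"},
    "B. anthracis Ames": {"EI", "HPr", "GlcNAc_EII", "glucose_EII", "fructose_EII"},
}

missing = absent_from_targets(
    toy_presence, targets={"B. sphaericus C3-41", "Bacillus sp. NRRL B-14905"}
)
```

With the toy table, `missing` contains only the glucose- and fructose-specific EII labels. Requiring presence in every reference genome is a deliberately strict choice; a looser majority rule would flag more candidates.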

The KEGG pathways were compared between B. sphaericus C3-41, B. subtilis strain 168, and the B. cereus group strains. The results indicated that at least glucose-6-phosphate isomerase (Pgi), phosphomannose isomerase (ManB), PTS system glucose-specific EII, and fructose-specific EII components were absent from B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905. Furthermore, both B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905 lack an additional 17 enzymes involved in carbohydrate metabolism in B. subtilis (data not shown). The absence of a pgi gene from B. sphaericus was reported previously by other authors (28), and the introduction of a heterologous pgi gene into this bacterium was able to

TABLE 3. IS elements predicted in the genome of B. sphaericus C3-41

| Element | Family^a | CDS(s) and identity to known ISs | IR(s)^b (bp/bp) | Length(s) (bp) |
|---|---|---|---|---|
| ISBsph3 | IS3 | Bsph_1044-1045, 58% and 34% identity to ISPsy8_PEP2 and ISBcen9_PEP1 | 19/26 | 1,444 |
| ISBsph4 | IS3 | Bsph_1068-1070, 35%, 52%, and 51% identity to PEP1, PEP2, and PEP3 of ISCpe6 | 24/35 | 1,491 |
| ISBsph5 | IS3 | Bsph_1073-1075, the first two CDSs having 39% and 60% identity to ISEnfa3_PEP1 and ISPsy8_PEP3 | 19/28 | 2,140 |
| ISBsph6 | IS3 | Bsph_1796-1798, the first two CDSs having 33% and 68% identity to PEP1 and PEP3 of ISBam1 | 14/19 | 2,005 |
| ISBsph7 | IS3 | Bsph_2816-2818, the first two CDSs having 44% and 55% identity to PEP2 and PEP3 of ISSau2 | 25/42 | 1,112 |
| ISBsph8 | IS607 | Bsph_2862-2863, 44% and 53% identity to PEP1 and PEP2 of ISCbt4 | 17/24 | 1,961 |
| ISBsph9 | IS3 | Bsph_3190-3191, Bsph_152-153 | 19/28, 19/28 | 1,554, 1,554 |

a The IS classification into families is according to IS Finder (http://www-is.biotoul.fr/is.html).
b IR, terminal inverted-repeat sequences.


restore polysaccharide utilization in vitro but not in vivo (B. Han, unpublished data). Thus, the absence of key metabolizing enzymes and a sugar transport system may explain the metabolic inactivity toward glucose, fructose, and most polysaccharides.

In contrast, a fragment of the N-acetylglucosamine utilization operon present in B. sphaericus C3-41 (Bsph_2343-Bsph_2352), including genes encoding a GntR family transcriptional regulator, three ABC transporters, N-acetylglucosamine-6-phosphate deacetylase (NagA), glucosamine-6-phosphate deaminase (NagB), HPr, and PtsI, might be involved in the whole pathway for N-acetylglucosamine metabolism and transport, which supports previous indications that B. sphaericus can degrade N-acetylglucosamine (4). It is interesting to note the presence of another N-acetylglucosamine utilization operon (Bsph_3892-Bsph_3899). The nagA gene in this region contains several frame shifts, causing a premature stop, and might be a pseudogene (Fig. 4).

Other metabolic enzymes and transport systems. Although B. sphaericus C3-41 is incapable of polysaccharide utilization, it has exclusive metabolic pathways for a wide variety of organic compounds and amino acids (3, 54). A 12-gene urease gene cluster (Bsph_2699-Bsph_2710) encoding the urease structural (UreABC) and accessory (UreDEFG) proteins, as well as five ABC transporters, which may be a transcriptional unit, is located in the chromosome of B. sphaericus C3-41. The urease gene cluster is not present in B. anthracis strain Ames, B. thuringiensis serovar Konkukian strain 97-27, B. cereus ATCC 14579, E. coli K-12, or T. tengcongensis MB4 according to our comparative analysis, but is present in B. cereus ATCC 10987 as described before (52). Also, the 12 genes of the urease cluster are located in the gapped chromosome of Bacillus sp. strain NRRL B-14905, but only 4 genes of the urease cluster are located in B. subtilis strain 168. Upstream of the urease gene cluster, a putative transcriptional regulator and an aspartase were identified. The presence of the urease enzymes may increase the fitness of B. sphaericus C3-41 under different nutrient conditions in the environment.

A fragment of a putative ethanolamine utilization operon consisting of 13 CDSs, including eutA, eutB, eutE, eutH, eutL, eutM, eutP, and eutS, was identified in B. sphaericus C3-41 (Bsph_2095-Bsph_2108). Similar operons are also found in Bacillus sp. strain NRRL B-14905, C. perfringens strain 13, and E. coli K-12, but not in the other five genomes, those of B. subtilis strain 168, B. anthracis strain Ames, B. thuringiensis serovar Konkukian strain 97-27, B. cereus ATCC 14579, and T. tengcongensis MB4, in our comparison. It has been proposed that a metabolic pathway for ethanolamine utilization may be an important survival strategy against the constant famine that microorganisms face in nature (20, 43). In addition, ethanolamine found in the mammalian gastrointestinal tract may present an important alternative source of nitrogen and carbon for bacteria living in the gut. Therefore, like many enterobacteria, such as E. coli K-12, B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905 may use ethanolamine as a source of both carbon and nitrogen (57).

In contrast to the reduced number of sugar-specific phosphoenolpyruvate-dependent PTS system genes, the genome of B. sphaericus C3-41 harbors abundant CDSs for ABC transporters (253 CDSs in total). Among them, 99 transporters appear to be involved in amino acid transport and metabolism in B. sphaericus C3-41, compared to only 32 in B. subtilis strain 168. Among the ABC peptide transporters in B. sphaericus C3-41, there are 32 putative ATP-binding proteins and 46 permeases. Only seven ABC transporters appear to be used in polysaccharide or sugar transport and metabolism, including Bsph_1250, Bsph_0763, Bsph_1252, Bsph_0765, Bsph_1251, Bsph_1260, and Bsph_0764. Furthermore, B. sphaericus C3-41 has more peptidases and proteases than B. subtilis strain 168 (93 versus 30 and 75 versus 24, respectively). Among these is Bsph_0341, which encodes sphaericase (also known as sfericase), the enzyme responsible for the degradation of the Mtx1 mosquitocidal toxin in B. sphaericus (68). In addition, the prediction of four amino acid efflux systems and 17 other transport systems involved in amino acid transport and metabolism implies that B. sphaericus C3-41 should be able to carry out active scavenging and secretion processes. The inability to metabolize carbohydrates and the abundance of protein-metabolizing systems suggest that the ancestor of B. sphaericus may have had an animal- or insect-associated origin.
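The counts quoted above can be put on a common scale as fold differences relative to B. subtilis strain 168. The short Python sketch below does only this arithmetic on the numbers given in the text; the dictionary keys and function name are illustrative labels.

```python
# (B. sphaericus C3-41, B. subtilis 168) counts as quoted in the text.
counts = {
    "amino acid transporters": (99, 32),
    "peptidases": (93, 30),
    "proteases": (75, 24),
}


def fold_difference(c3_41, b_subtilis):
    """Ratio of the C3-41 count to the B. subtilis strain 168 count."""
    return c3_41 / b_subtilis


fold = {name: round(fold_difference(*pair), 1) for name, pair in counts.items()}
```

Each of the three categories comes out at roughly a threefold excess in B. sphaericus C3-41, which is the quantitative content behind "abundant" in this paragraph.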

Structural component, S layer, and membrane proteins. Another interesting similarity between B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905 that may affect the phenotype is the presence of 22 CDSs probably functioning as surface (S) layer proteins or S layer homologs and of 17 membrane proteins. The counterparts of the S layer and membrane proteins in B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905 were not found in B. subtilis strain 168. These membrane- or S layer-associated proteins may be related to the marine environment of Bacillus sp. strain NRRL B-14905 or the multiple environmental sources of bacteria designated as B. sphaericus, including soil, aquatic habitats, and even waste piles from uranium mines (49). In the latter case, the complex of S layer proteins appears to be important in heavy metal sequestration by B. sphaericus strains, although it is not clear

FIG. 4. CDSs probably involved in N-acetylglucosamine metabolism. From left to right: (a) Bsph_2343, transcriptional regulator (GntR family); Bsph_2344-2346, ABC transporters; Bsph_2347, PTS system N-acetylglucosamine-specific EIIC component (NagE); Bsph_2348, N-acetylglucosamine-6-phosphate deacetylase (NagA); Bsph_2349, hypothetical protein; Bsph_2350, glucosamine-6-phosphate deaminase (NagB); Bsph_2351, HPr (PtsH); and Bsph_2352 (PtsI); and (b) Bsph_3892-Bsph_3894, PTS system EIIC/A/B components; Bsph_3895, hypothetical protein; Bsph_3896, HPr (PtsH); Bsph_3897, hypothetical protein; Bsph_3898, transcriptional regulator (GntR family); and Bsph_3899, similar to NagA but containing several frame shifts. The possible transcriptional regulators are indicated by gray arrows, ABC transporters by arrows with dots, CDSs involved in sugar PTS systems by hatched arrows, hypothetical proteins by open arrows, and key enzymes in N-acetylglucosamine metabolism by dark arrows.


whether such strains belong to the same group (IIA) of B. sphaericus strains as C3-41.

Recently it was proposed that B. fusiformis and B. sphaericus be reclassified into the genus Lysinibacillus (2). We suggest that Bacillus sp. strain NRRL B-14905 should also be placed in the same genus as B. fusiformis and B. sphaericus. This is also in accordance with the phylogenetic relationship among the three Bacillus species based on 16S rRNA genes (58).

Sporulation and germination. As with other spore-forming bacteria, the spore formation of B. sphaericus occurs in response to nutrient limitation in the environment. A list of protein families related to sporulation and germination according to ortholog/paralog analysis of B. sphaericus C3-41 and seven other species was compiled (see Table S2 in the supplemental material). The sporulation of B. sphaericus C3-41 initiated by a deficiency in energy might be linked to the expression of several genes, probably functioning as sporulation initiation proteins or regulators of genetic competence, such as Bsph_1217. Four CDSs, including two AbrB genes (Bsph_0057 and Bsph_113, present in the chromosome and plasmid, respectively), Bsph_0076, and Bsph_2825, might regulate the developmental pathways of spores and biofilms in B. sphaericus C3-41, according to their predicted function. The predicted CDSs encoding spore coat proteins of B. sphaericus C3-41 are classified into 13 protein families according to their homologies. The 13 protein families are present in all five Bacillus species included in this study. For example, many of the B. subtilis strain 168 spore coat proteins, including CotA, CotE, CotY, CotJ, YhcQ, YabP, YabQ, YkuD, and SspF, are included in the 13 protein families. Fifty-three CDSs predicted in the chromosome of B. sphaericus C3-41, whose products are classified into 13 protein families according to ortholog analysis, are related to spore germination. Most of these CDSs can be classified into the GerC, GerK, GerH, GerQ, and GerX families, according to homology to these operons. Seventeen CDSs probably associated with sporulation and germination, such as GerP, can be found in other Bacillus species but are absent from B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905. Moreover, gene ortholog and COG analyses show that at least 64 possible CDSs in B. sphaericus C3-41 might be involved in the regulation and posttranslational modification of the sporulation process and assembly of the spore coat and exosporium (COG data are not shown).

However, sporulation and germination are complex, multistage events and a number of genes involved in transcription, signal transduction, posttranslational modification, and other functions may also be involved in these processes, including the assembly of the spore coat and exosporium. Thus, many CDSs not included in Table S2 in the supplemental material and some hypothetical proteins with as-yet-unknown functions might also be linked to the sporulation and germination processes, and the resolution of this issue will require further proteomic analysis.

Evolution of mosquitocidal toxin genes. Unlike other bacterial pathogens, whose virulence genes are assembled within putative pathogenicity islands, the insecticidal toxins of B. sphaericus C3-41 are widely distributed around the chromosome. Comparison of the region encoding the mtx1 genes (64) in strain 2297 (GenBank accession number AB126007) and strain SSII-1 (GenBank accession number M60446) with the equivalent region from B. sphaericus C3-41 (nucleotides 1,141,180 to 1,138,569) reveals that the mtx1 gene in C3-41 might be a pseudogene with a frame shift (Bsph_1076). The mtx1 gene is poorly expressed in B. sphaericus and has a repressor binding site between its putative promoter and its ribosome binding site (64). The mtx2 toxin gene (Bsph_1071) (previously described by Thanabalu and Porter) is located close to the mtx1 toxin gene (65). The Mtx2 protein is known to be related to the Mtx3 toxin of B. sphaericus (39), which is present in the C3-41 genome as Bsph_2822. It is interesting to note a further open reading frame (ORF), Bsph_2821, located upstream of mtx3 that has homology to both mtx2 and mtx3 but appears to be a pseudogene.
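Candidate pseudogenes such as those described above are typically recognized by a reading frame that is interrupted, either by a frame shift or by a premature stop codon. The following Python sketch shows a simple screen of this kind, assuming the standard stop codons TAA, TAG, and TGA. The function names and the toy sequences in the usage note are hypothetical and are not taken from the C3-41 genome.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cds):
    """0-based nucleotide index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


def looks_disrupted(cds):
    """True if the sequence is not a single uninterrupted reading frame.

    An intact CDS has a length divisible by 3 and its only in-frame stop
    codon is the final codon.
    """
    stop = first_in_frame_stop(cds)
    return len(cds) % 3 != 0 or stop is None or stop != len(cds) - 3
```

For instance, the toy sequence `ATGAAACCCGGGTAA` passes the screen, whereas deleting one base (`ATGAACCCGGGTAA`) or introducing an early stop (`ATGTAACCCGGGTAA`) causes it to be flagged. A real annotation pipeline would also compare the translated product with intact homologs, which this sketch does not attempt.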

Our results indicated that the mtx1, mtx2, and mtx3 genes of B. sphaericus C3-41 might have some direct relationship to mobile genetic elements. Two insertion elements that were named ISBsph4 and ISBsph5, located within the mtx1-mtx2 cluster region, were found (Fig. 5A). orf2 and orf3 of the mtx2 distal insertion element appear to be a transposase disrupted into two nonfunctional genes (i.e., Bsph_1069 and Bsph_1070) by a premature stop, which suggests that this element may have been integrated for some time. This notion may be supported by the fact that, of the known B. sphaericus toxins, the Mtx2 proteins are the least conserved, with interstrain variations having significant effects on host range (11). There is also an insertion element (ISBsph7) upstream of Bsph_2821 and mtx3 (Bsph_2822) (Fig. 5B). This implies that these mobile genetic elements might play an important role in Mtx toxin evolution. With the exception of mtx1 (Bsph_1076), the other predicted mtx toxin genes, mtx2 (Bsph_1071) and mtx3 (Bsph_2822), have close orthologs in Bacillus sp. strain NRRL B-14905.

The binary toxin genes binA and binB of B. sphaericus C3-41, which are the main source of its activity against mosquito larvae, were found to be located in an ~35-kb duplicate fragment present both in the chromosome and in the large plasmid. In total, 21 CDSs were predicted within this duplicated fragment (Fig. 6). Apart from binA and binB (Bsph_3193/Bsph_155 and Bsph_3192/Bsph_154, respectively), the nearby Bsph_3195/Bsph_157 appears to encode a further protein with homology to the Mtx2/3 toxins. Genes encoding a putative peptide synthase and a chitin-binding protein are located in

FIG. 5. Mosquitocidal toxin gene clusters and relationship with mobile genetic elements. (a) Gene cluster from Bsph_1068 to Bsph_1080. (b) Gene cluster from Bsph_2814 to Bsph_2825. Gray arrows, transposases; arrows with lattice, mosquitocidal toxins; dark arrows, possible transcriptional regulators; open arrows, hypothetical proteins. Insertion elements and terminal inverted-repeat sequences (IR) are marked in the figures. The region encoding the possible Mtx1 contains a frame shift. pseudo, pseudogene.


this region (Bsph_3196/Bsph_158 and Bsph_3198/Bsph_160). Two CDSs for phage integrase family proteins were observed upstream of the 35-kb fragment. As in the mtx clusters, a putative transposase gene (Bsph_3188) and an insertion element (ISBsph9) are also present within the ~35-kb duplicate fragment, suggesting that the pathogenicity locus of B. sphaericus C3-41 may have phage or transposon origins. The GerXB-XA-XC gene cluster is found upstream of the putative transposase gene. This arrangement matches that of B. anthracis plasmid pXO1, which also has a GerXB-XA-XC gene cluster upstream of a transposase (48), where it appears to influence the germination rate of B. anthracis spores (26, 27). Comparison of the region following the binary toxin genes in B. sphaericus strains C3-41 and 1593 (GenBank accession number AJ224477) with the equivalent region from strain 2297 (GenBank accession number AJ224478) reveals that the strain 2297 sequence contains a probable transposase pseudogene. The transposase homology begins at a small CDS capable of encoding a peptide of only 14 amino acids, but the homology continues in various reading frames, with at least five frame shifts, for 1,110 nucleotides. In strains C3-41 and 1593, however, a 1,554-bp insertion is seen at the end of the small CDS of strain 2297. Several repeat sequences characteristic of transposition/rearrangement events are found in the above transposase regions. Downstream of the initial CDS in strain 2297, there is a direct repeat (TAAAGAATATAA) separated by 25 bases within the disrupted transposase. This feature is maintained in strain C3-41, downstream of the inserted sequence. In the latter strain, there is a long direct-repeat sequence (AAATAAAGTCgtGAtGTTTATAAAAAAGaTGCGAtgTTTaAAATAAAGTCacGAcGTTTATAAAAAAGcTGCGAagcTTT; centered around the a nucleotide) beginning just 11 nucleotides from the 5′ end of the inserted sequence and a smaller inverted-repeat sequence 62 nucleotides upstream of the 3′ end of the insertion (ATAGAAAAAGCGTGTTTTGAtcAAtctTTttTCAAAACACGCTTTTTCTAT; centered around the tct nucleotides) (Fig. 6).
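The repeat structures quoted above can be checked directly from the sequences as printed. The Python sketch below, offered only as an illustration, measures how far the two ends of a sequence are exact reverse complements of each other (the arms of an inverted repeat) and reports the spacing between successive copies of a direct-repeat unit. The function names are hypothetical; the inverted-repeat string is copied from the text, and the direct-repeat test uses an artificial 25-base spacer rather than the actual intervening sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string (case-insensitive)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def inverted_repeat_arm_length(seq):
    """Length of the longest prefix whose reverse complement is the suffix."""
    seq = seq.upper()
    arm = 0
    while 2 * (arm + 1) <= len(seq) and seq[arm] == reverse_complement(seq[-1 - arm]):
        arm += 1
    return arm


def direct_repeat_gaps(seq, unit):
    """Number of bases separating successive occurrences of unit in seq."""
    seq, unit = seq.upper(), unit.upper()
    starts, pos = [], seq.find(unit)
    while pos != -1:
        starts.append(pos)
        pos = seq.find(unit, pos + 1)
    return [b - (a + len(unit)) for a, b in zip(starts, starts[1:])]


# Inverted repeat as printed in the text (lowercase marks unpaired bases).
IR = "ATAGAAAAAGCGTGTTTTGAtcAAtctTTttTCAAAACACGCTTTTTCTAT"
arm = inverted_repeat_arm_length(IR)
```

Applied to the printed 51-nucleotide inverted repeat, the outer arms are exact reverse complements over 20 nucleotides, and pairing stops at the first lowercase base. The check is strict, so the inner AA/TT pair, which lies beyond the unpaired bases, is not counted.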

Comparative analysis indicated that BinA, BinB, and the two phage integrase family proteins are unique to B. sphaericus and are not present in the other seven species analyzed. We hypothesize that the binary toxins and even the ~35-kb fragment may be the remnant of a phage infection that may have happened only in B. sphaericus, and not in the other seven species listed in this study. The plasmid location, putative phage infection, and transposition may all provide explanations of the fact that bin genes are not present in all B. sphaericus strains, while the bin genes of different serotypes and isolates show extremely high levels of similarity (73, 74).

A further insecticidal toxin from B. sphaericus was recently reported to be active against the German cockroach Blattella germanica (46). The gene for this sphaericolysin is identified in the B. sphaericus C3-41 genome as Bsph_4094 and has high similarity with the gene encoding cereolysin O, which is present in many Bacillus species. In addition, the chromosome of B. sphaericus C3-41 contains several homologs of genes known to be involved in the pathogenicity of other gram-positive bacteria, such as B. cereus and Listeria monocytogenes. These include haemolysin III (25) or putative haemolysin CDSs (Bsph_2727, Bsph_3508, Bsph_3583, and Bsph_3651), internalin-like genes (Bsph_4728), sigma factor B (Bsph_4295), and P60 extracellular protease (Bsph_1123) (14). These shared virulence genes might be part of the common arsenal associated with the pathogenicity of some gram-positive bacteria.

Duplication in genome sequences. The predicted proteins of B. sphaericus C3-41 were used to search for segmental and tandem duplications. The results of BLAST analysis revealed a large number of predicted CDSs with similar matches within the chromosome of B. sphaericus C3-41 (BLAST E value, <10^-5), indicating gene duplication (Fig. 2A). This includes single-gene duplication with the two genes widely separated, duplication of neighboring genes, fragment duplication with many CDSs, and even the duplication of almost one-third to one-half of the circular chromosome. The duplication of the chromosome itself suggested that the chromosome of B. sphaericus C3-41 might have evolved by a drastic doubling in genome size. Similar phenomena were observed in other Bacillus species, such as B. anthracis strain Ames (GenBank accession number NC_003997) (data not shown). On the other hand, by genome comparison two main long-fragment syntenies (about 460 kb and 760 kb, respectively) were observed between B. sphaericus C3-41, B. subtilis strain 168, and the three B. cereus group strains, as we described before. Thus, we made an assumption that the Bacillus species shared a common “chromosome backbone” in a very ancient

FIG. 6. CDSs of the 35-kb duplicates in B. sphaericus C3-41 and comparisons of the transposase homologs downstream from the binary toxins in B. sphaericus strains C3-41, 1593, and 2297. The 35-kb duplicates present both in the chromosome and in the large plasmid contain 21 CDSs. Binary toxin BinA and BinB, CDSs for phage integrase family protein, insertion elements, the GerXB-XA-XC operon, a putative peptide synthase, and a chitin-binding protein are indicated. Some other, hypothetical proteins are indicated by open arrows. The transposase homolog of B. sphaericus 2297 begins at a small CDS capable of encoding a peptide of only 14 amino acids, but the homolog continues in various reading frames for 1,110 nucleotides. In strains C3-41 and 1593, a 1,554-bp insertion is located at the end of the small CDS of strain 2297. Downstream of the initial CDS in strain 2297, there is a direct repeat (TAAAGAATATAA) separated by 25 bases within the disrupted transposase. This feature is maintained downstream of the inserted sequence in strains 1593 and C3-41. The two opposite-facing triangles represent the long, direct-repeat sequence at the 3′ end of the inserted sequence and the shorter, inverted-repeat sequence upstream of the 3′ end of the IS, respectively.


stage, after which a duplication process happened, the chromosome size increased, and then variation developed as a consequence of complex and dynamic evolutionary mechanisms, finally resulting in the divergence of species. In addition, the presence of the ~35-kb duplication between the chromosome and the large plasmid pBsph, containing the CDSs of binary toxin, transposase, and phage integrase family proteins, was confirmed by several analyses: the average sequencing coverage of the chromosome (8.9 times), plasmid (21.4 times), and 35-kb duplicate fragment (25 times); multiple PCRs; single-nucleotide polymorphism analysis; and coding bias (data not shown). We suggest that this large fragment may be the remnant of a phage infection and, thus, that bin genes are only present in a subset of B. sphaericus strains and show extremely high levels of similarity among different serotypes and isolates.
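One way to read the coverage figures quoted above is to express each as a depth relative to the chromosome, under the simplifying assumption that average read depth scales with copy number per cell. The Python sketch below performs this arithmetic on the three reported values; the variable and function names are illustrative, and the interpretation is ours rather than a calculation reported in this study.

```python
# Average sequencing coverage (fold) as reported in the text.
coverage = {"chromosome": 8.9, "plasmid": 21.4, "fragment_35kb": 25.0}


def relative_depth(cov, baseline="chromosome"):
    """Depth of each region relative to the baseline region."""
    return {name: depth / cov[baseline] for name, depth in cov.items()}


rel = relative_depth(coverage)

# If the fragment were present once on each replicon, its expected depth
# would be roughly the sum of the two replicon depths (assumption).
expected_if_on_both = coverage["chromosome"] + coverage["plasmid"]
```

On these numbers the plasmid is covered about 2.4 times as deeply as the chromosome, and the 35-kb fragment about 2.8 times. The fragment depth exceeds that of either replicon alone, as expected for a region present on both, although it falls short of the simple sum of about 30-fold.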

DISCUSSION

There is considerable debate surrounding the evolutionary process and systematic classification of B. sphaericus. The completion of the genomic sequence of B. sphaericus C3-41 and comparative analysis with other bacterial species now lays the foundation for understanding the special metabolic abilities and some common physiological and biochemical characteristics on a molecular level and, accordingly, offers new information regarding the evolution and ecology of this species.

Based on overall nucleotide and protein similarities, B. sphaericus C3-41 is most similar to a marine bacterial species, Bacillus sp. strain NRRL B-14905. The similarities of these two species, including mobile genetic elements, membrane-associated proteins, and metabolic and transport systems, provide solid data to explain the common features of B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905, such as their inability to utilize polysaccharides, and suggest that these two species are a biologically and phylogenetically divergent group whose members have adapted to particular environmental conditions over evolutionary time. This is in accordance with previous suggestions that B. sphaericus has some special features similar to those of some archaic organisms and bacteria that can grow in extreme environments (9, 49) and is, therefore, distinct from most Bacillus species (45). It was recently proposed that B. fusiformis and B. sphaericus be reclassified into the genus Lysinibacillus, with the renaming of B. sphaericus as Lysinibacillus sphaericus (2). We would propose from the comparative analysis in our study that Bacillus sp. strain NRRL B-14905 should be placed in the same genus as B. fusiformis and B. sphaericus.

On the other hand, despite the approximately 80% identity of the B. sphaericus C3-41 and Bacillus sp. strain NRRL B-14905 genomes, there are many CDSs that are individually unique in B. sphaericus C3-41. These include prophage and IS elements, sporulation- and germination-related proteins, and virulence factors. Some unique genes, such as bin genes and mtx1, might have been obtained by insertion through mobile genetic elements. This means of gene acquisition may explain why these genes are only present in some B. sphaericus strains and show extremely high levels of similarity among different serotypes and isolates.

Together, the similarities and differences may hint at overlapping but nonidentical environmental and ecological niches for the taxa of these species. In performing a comparative analysis of the synteny and duplication among eight species, we postulated that, although B. sphaericus C3-41 is quite different from other Bacillus species, it still shares a common “chromosome backbone” with them. The variation of the chromosomes might be due to duplication and a complex and dynamic evolutionary process producing the current bacterial species.

ACKNOWLEDGMENTS

We thank Quanxin Cai and Dongyuan Liu for technical assistance.

This project was supported by grant number KSCX2-SW-315 from the Chinese Academy of Sciences, 973 project number 2003CB114201, and grant number 30470037 from NSFC, China, and with financial assistance from Valent Biosciences Corp.

REFERENCES

1. Adams, D. E., E. M. Shekhtman, E. L. Zechiedrich, M. B. Schmid, and N. R. Cozzarelli. 1992. The role of topoisomerase IV in partitioning bacterial replicons and the structure of catenated intermediates in DNA replication. Cell 71:277–288.

2. Ahmed, I., A. Yokota, A. Yamazoe, and T. Fujiwara. 2007. Proposal of Lysinibacillus boronitolerans gen. nov. sp. nov., and transfer of Bacillus fusiformis to Lysinibacillus fusiformis comb. nov. and Bacillus sphaericus to Lysinibacillus sphaericus comb. nov. Int. J. Syst. Evol. Microbiol. 57:1117–1125.

3. Alexander, B., and F. G. Priest. 1990. Numerical classification and identification of Bacillus sphaericus including some strains pathogenic for mosquito larvae. J. Gen. Microbiol. 136:367–376.

4. Alice, A. F., G. Perez-Martinez, and C. Sanchez-Rivas. 2003. Phosphoenolpyruvate phosphotransferase system and N-acetylglucosamine metabolism in Bacillus sphaericus. Microbiology 149:1687–1698.

5. Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.

6. Apweiler, R., T. K. Attwood, A. Bairoch, A. Bateman, E. Birney, M. Biswas, P. Bucher, L. Cerutti, F. Corpet, and M. D. Croning. 2001. The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Res. 29:37–40.

7. Berry, C., S. O’Neil, E. Ben-Dov, A. F. Jones, L. Murphy, M. A. Quail, M. T. G. Holden, D. Harris, A. Zaritsky, and J. Parkhill. 2002. Complete sequence and organization of pBtoxis, the toxin-coding plasmid of Bacillus thuringiensis subsp. israelensis. Appl. Environ. Microbiol. 68:5082–5095.

8. Blencke, H. M., G. Homuth, H. Ludwig, U. Mader, M. Hecker, and J. Stulke. 2003. Transcriptional profiling of gene expression in response to glucose in Bacillus subtilis: regulation of the central metabolic pathways. Metab. Eng. 5:133–149.

9. Cano, R. J., and M. K. Borucki. 1995. Revival and identification of bacterial spores in 20- to 40-million-year-old Dominican amber. Science 268:1060–1064.

10. Casjens, S. 2003. Prophages and bacterial genomics: what have we learned so far? Mol. Microbiol. 49:277–300.

11. Chan, S. W., T. Thanabalu, B. Y. Wee, and A. G. Porter. 1996. Unusual amino acid determinants of host range in the Mtx2 family of mosquitocidal toxins. J. Biol. Chem. 271:14183–14187.

12. Chen, Y., and H. P. Erickson. In vitro assembly studies of FtsZ/tubulin-like proteins (TubZ) from Bacillus plasmids: evidence for a capping mechanism. J. Biol. Chem., in press. doi:10.1074/jbc.M709163200.

13. Christensen, B. B., T. Atlung, and F. G. Hansen. 1999. DnaA boxes are important elements in setting the initiation mass of Escherichia coli. J. Bacteriol. 181:2683–2688.

14. Cossart, P. 2002. Molecular and cellular basis of the infection by Listeria monocytogenes: an overview. Int. J. Med. Microbiol. 291:401–409.

15. Deutscher, J., E. Kuster, U. Bergstedt, V. Charrier, and W. Hillen. 1995. Protein kinase-dependent HPr/CcpA interaction links glycolytic activity to carbon catabolite repression in gram-positive bacteria. Mol. Microbiol. 15:1049–1053.

16. Deutscher, J., and M. H. Saier, Jr. 1983. ATP-dependent protein kinase catalyzed phosphorylation of a seryl residue in HPr, a phosphate carrier protein of the phosphotransferase system in Streptococcus pyogenes. Proc. Natl. Acad. Sci. USA 80:6790–6794.

17. Ewing, B., and P. Green. 1998. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8:186–194.

18. Ewing, B., L. Hillier, M. C. Wendl, and P. Green. 1998. Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res. 8:175–185.

19. Fayet, O., P. Ramond, P. Polard, M. F. Prere, and M. Chandler. 1990. Functional similarities between retroviruses and the IS3 family of bacterial insertion sequences? Mol. Microbiol. 4:1771–1777.


20. Freter, R., H. Brickner, M. Botney, D. Cleven, and A. Aranki. 1983. Mech-anisms that control bacterial populations in continuous-flow culture modelsof mouse large intestinal flora. Infect. Immun. 39:676–685.

21. Galinier, A., J. Haiech, M. C. Kilhoffer, M. Jaquinod, J. Stulke, J. Deutscher,and I. Martin-Verstraete. 1997. The Bacillus subtilis crh gene encodes aHPr-like protein involved in catabolite repression. Proc. Natl. Acad. Sci.USA 94:8439–8444.

22. Galinier, A., M. Kravanja, R. Engelmann, W. Hengstenberg, M. C. Kilhoffer,J. Deutscher, and J. Haiech. 1998. New protein kinase and protein phos-phatase families mediate signal transduction in bacterial catabolite repres-sion. Proc. Natl. Acad. Sci. USA 95:1823–1828.

23. Gassner, M., D. Stehlik, O. Schrecker, W. Hengstenberg, W. Maurer, and H.Ruterjans. 1977. The phosphoenolpyruvate-dependent phosphotransferasesystem of Staphylococcus aureus. 2.1H and 31P nuclear-magnetic-resonancestudies on the phosphocarrier protein HPr, phosphohistidines and phos-phorylated HPr. Eur. J. Biochem. 75:287–296.

24. Gordon, D., C. Abajian, and P. Green. 1998. Consed: a graphical tool for sequence finishing. Genome Res. 8:195–202.

25. Granum, P. E., and T. Lund. 1997. Bacillus cereus and its food poisoning toxins. FEMS Microbiol. Lett. 157:223–228.

26. Guidi-Rontani, C., M. Weber-Levy, E. Labruyere, and M. Mock. 1999. Germination of Bacillus anthracis spores with alveolar macrophages. Mol. Microbiol. 31:9–17.

27. Guidi-Rontani, C., Y. Pereira, S. Ruffie, J.-C. Sirard, M. Weber-Levy, and M. Mock. 1999. Identification and characterisation of a germination operon on the virulence plasmid pXO1 of Bacillus anthracis. Mol. Microbiol. 33:407–414.

28. Han, B., H. Liu, X. Hu, Y. Cai, D. Zheng, and Z. Yuan. 2007. Molecular characterization of a glucokinase of broad hexose specificity from Bacillus sphaericus strain C3-41. Appl. Environ. Microbiol. 73:3581–3586.

29. Han, B., H. Liu, X. Hu, and Z. Yuan. 2006. Preliminary characterization of a thermostable DNA polymerase I from a mesophilic Bacillus sphaericus strain C3-41. Arch. Microbiol. 186:203–209.

30. Henkin, T. M. 1996. The role of the CcpA transcriptional regulator in carbon metabolism in Bacillus subtilis. FEMS Microbiol. Lett. 135:9–15.

31. Jones, G. W., C. Nielsen-Leroux, Y. Yang, Z. Yuan, V. Fiuza Dumas, R. Gomes Monnerat, and C. Berry. 2007. A new Cry toxin with a unique two-component dependency from Bacillus sphaericus. FASEB J. 21:4112–4120.

32. Reference deleted.

33. Katzman, M., J. P. Mack, A. M. Skalka, and J. Leis. 1991. A covalent complex between retroviral integrase and nicked substrate DNA. Proc. Natl. Acad. Sci. USA 88:4695–4699.

34. Krych, V. K., J. L. Johnson, and A. A. Yousten. 1980. Deoxyribonucleic acid homologies among strains of Bacillus sphaericus. Int. J. Syst. Bacteriol. 30:476–484.

35. Kulkosky, J., K. S. Jones, R. A. Katz, J. P. Mack, and A. M. Skalka. 1992. Residues critical for retroviral integrative recombination in a region that is highly conserved among retroviral/retrotransposon integrases and bacterial insertion sequence transposases. Mol. Cell. Biol. 12:2331–2338.

36. Kunst, F., N. Ogasawara, I. Mozser, A. M. Albertini, G. Alloni, V. Azebedo, M. G. Bertero, P. Bessieres, A. Bolotin, S. Borchert, R. Borriss, L. Boursier, A. Brans, M. Braun, S. C. Brignell, S. Bron, S. Brouillet, C. V. Bruschi, B. Caldwell, V. Capuano, N. M. Carter, S.-K. Choi, J.-J. Codani, I. F. Connerton, N. J. Cummings, R. A. Daniel, F. Denizot, K. M. Devine, A. Du Sterhoft, S. D. Ehrlich, P. T. Emmerson, K. D. Entian, J. Errington, C. Fabret, E. Ferrari, D. Foulger, C. Fritz, M. Fujita, Y. Fujita, S. Fuma, A. Galizzi, N. Galleron, S.-Y. Ghim, P. Glaser, A. Goffeau, E. J. Golightly, G. Grandi, G. Guiseppi, B. J. Guy, K. Haga, J. Haiech, C. R. Harwood, A. Henaut, H. Hilbert, S. Holsappel, S. Hosono, M.-F. Hullo, M. Itaya, L. Jones, B. Joris, D. Karamata, Y. Kasahara, M. Klaerr-Blanchard, C. Klein, Y. Kobayashi, P. Koetter, G. Koningstein, S. Krogh, M. Kumano, K. Kurita, A. Lapidus, S. Lardinois, J. Lauber, V. Lazarevic, S.-M. Lee, A. Levine, H. Liu, S. Masuda, C. Maue, C. Medigue, N. Medina, R. P. Mellado, M. Mizuno, D. Moestl, S. Nakai, M. Noback, D. Noone, M. O’Reilly, K. Ogawa, A. Ogiwara, B. Oudega, S.-H. Park, V. Parro, T. M. Pohl, D. Portetelle, S. Porwollik, A. M. Prescott, E. Presecan, P. Pujic, B. Purnelle, G. Rapoport, M. Rey, S. Reynolds, M. Rieger, C. Rivolta, E. Rocha, B. Roche, M. Rose, Y. Sadaie, T. Sato, E. Scanlan, S. Schleich, R. Schroeter, F. Scoffone, J. Sekiguchi, A. Sekowska, S. J. Seror, P. Serror, B.-S. Shin, B. Soldo, A. Sorokin, E. Tacconi, T. Takagi, H. Takahashi, K. Takemaru, M. Takeuchi, A. Tamakoshi, T. Tanaka, P. Terpstra, A. Tognoni, V. Tosato, S. Uchiyama, M. Vandenbol, F. Vannier, A. Vassarotti, A. Viari, R. Wambutt, E. Wedler, H. Wedler, T. Weitzenegger, P. Winters, A. Wipat, H. Yamamoto, K. Yamane, K. Yasumoto, K. Yata, K. Yoshida, H.-F. Yoshikawa, E. Zumstein, H. Yoshikawa, and A. Danchin. 1997. The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature 390:249–256.

37. Kurtz, S., J. V. Choudhuri, E. Ohlebusch, C. Schleiermacher, J. Stoye, and R. Giegerich. 2001. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 29:4633–4642.

38. Li, H., A. Coghlan, J. Ruan, L. J. Coin, J. K. Heriche, L. Osmotherly, R. Li, T. Liu, Z. Zhang, L. Bolund, G. K. Wong, W. Zheng, P. Dehal, J. Wang, and R. Durbin. 2006. TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res. 34:572–580.

39. Liu, J. W., A. G. Porter, B. Y. Wei, and T. Thanabalu. 1996. New gene from nine Bacillus sphaericus strains encoding highly conserved 35.8-kilodalton mosquitocidal toxins. Appl. Environ. Microbiol. 62:2174–2176.

40. Lobry, J. R. 1996. Asymmetric substitution patterns in the two DNA strands of bacteria. Mol. Biol. Evol. 13:660–665.

41. Lowe, T. M., and S. R. Eddy. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25:955–964.

42. Majka, J., D. Jakimowicz, W. Messer, H. Schrempf, M. Lisowski, and J. Zakrzewska-Czerwinska. 1999. Interactions of the Streptomyces lividans initiator protein DnaA with its target. Eur. J. Biochem. 260:325–335.

43. Mason, T. G., and G. Richardson. 1981. Escherichia coli and the human gut: some ecological considerations. J. Appl. Bacteriol. 1:1–16.

44. Moreno, M. S., B. L. Schneider, R. R. Maile, W. Weyler, and M. H. Saier, Jr. 2001. Catabolite repression mediated by CcpA protein in Bacillus subtilis: novel modes of regulation revealed by whole genome analyses. Mol. Microbiol. 39:1366–1381.

45. Nakamura, L. K. 2000. Phylogeny of Bacillus sphaericus-like organisms. Int. J. Syst. Evol. Microbiol. 50:1715–1722.

46. Nishiwaki, H., K. Nakashima, C. Ishida, T. Kawamura, and K. Matsuda. 2007. Cloning, functional characterization, and mode of action of a novel insecticidal pore-forming toxin, sphaericolysin, produced by Bacillus sphaericus. Appl. Environ. Microbiol. 73:3404–3411.

47. Ogata, H., S. Goto, K. Sato, W. Fujibuchi, H. Bono, and M. Kanehisa. 1999. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 27:29–34.

48. Okinaka, R. T., K. Cloud, O. Hampton, A. R. Hoffaster, K. K. Hill, P. Keim, T. M. Koehler, G. Lamke, S. Kumano, J. Mahillon, D. Manter, Y. Martinez, D. Ricke, R. Svensson, and P. J. Jackson. 1999. Sequence and organization of pXO1, the large Bacillus anthracis plasmid harboring the anthrax toxin genes. J. Bacteriol. 181:6509–6515.

49. Pollmann, K., J. Raff, M. Merroun, K. Fahmy, and S. Selenska-Pobell. 2006. Metal binding by bacteria from uranium mining waste piles and its technological applications. Biotechnol. Adv. 24:58–68.

50. Priest, F. G. 1992. Biological control of mosquitoes and other biting flies by Bacillus sphaericus and Bacillus thuringiensis. J. Appl. Bacteriol. 72:357–369.

51. Priest, F. G., L. Ebdrup, V. Zahner, and P. Carter. 1997. Distribution and characterization of mosquitocidal toxin genes in some strains of Bacillus sphaericus. Appl. Environ. Microbiol. 63:1195–1198.

52. Rasko, D. A., J. Ravel, O. A. Økstad, E. Helgason, R. Z. Cer, L. Jiang, K. A. Shores, D. E. Fouts, N. J. Tourasse, S. V. Angiuoli, J. Kolonay, W. C. Nelson, A. B. Kolstø, C. M. Fraser, and T. D. Read. 2004. The genome sequence of Bacillus cereus ATCC 10987 reveals metabolic adaptations and a large plasmid related to Bacillus anthracis pXO1. Nucleic Acids Res. 32:977–988.

53. Reizer, J., C. Hoischen, F. Titgemeyer, C. Rivolta, R. Rabus, J. Stulke, D. Karamata, M. H. Saier, Jr., and W. Hillen. 1998. A novel protein kinase that controls carbon catabolite repression in bacteria. Mol. Microbiol. 27:1157–1169.

54. Russell, B. L., S. A. Jelley, and A. A. Yousten. 1989. Carbohydrate metabolism in the mosquito pathogen Bacillus sphaericus 2362. Appl. Environ. Microbiol. 55:294–297.

55. Saier, M. H., Jr. 1996. Cyclic AMP-independent catabolite repression in bacteria. FEMS Microbiol. Lett. 138:97–103.

56. Salzberg, S. L., A. L. Delcher, S. Kasif, and O. White. 1998. Microbial gene identification using interpolated Markov models. Nucleic Acids Res. 26:544–548.

57. Scarlett, F. A., and J. M. Turner. 1976. Microbial metabolism of amino alcohols. Ethanolamine catabolism mediated by coenzyme B12-dependent ethanolamine ammonia lyase in Escherichia coli and Klebsiella aerogenes. J. Gen. Microbiol. 95:173–176.

58. Siefert, J. L., M. Larios-Sanz, L. K. Nakamura, R. A. Slepecky, J. H. Paul, E. R. Moore, G. E. Fox, and P. Jurtshuk, Jr. 2000. Phylogeny of marine Bacillus isolates from the Gulf of Mexico. Curr. Microbiol. 41:84–88.

59. Stanley, C. H., J. J. Gauthier, and D. J. Tipper. 1975. Ultrastructural studies of sporulation in Bacillus sphaericus. J. Bacteriol. 122:1322–1338.

60. Stothard, P., and D. S. Wishart. 2005. Circular genome visualization and exploration using CGView. Bioinformatics 21:537–539.

61. Stulke, J., and W. Hillen. 2000. Regulation of carbon catabolism in Bacillus species. Annu. Rev. Microbiol. 54:849–880.

62. Tatusov, R. L., D. A. Natale, I. V. Garkavtsev, T. A. Tatusova, U. T. Shankavaram, B. S. Rao, B. Kiryutin, M. Y. Galperin, N. D. Fedorova, and E. V. Koonin. 2001. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 29:22–28.

63. Tettelin, H., K. E. Nelson, I. T. Paulsen, J. A. Eisen, T. D. Read, S. Peterson, J. Heidelberg, R. T. DeBoy, D. H. Haft, R. J. Dodson, A. S. Durkin, M. Gwinn, J. F. Kolonay, W. C. Nelson, J. D. Peterson, L. A. Umayam, O. White, S. L. Salzberg, M. R. Lewis, D. Radune, E. Holtzapple, H. Khouri, A. M. Wolf, T. R. Utterback, C. L. Hansen, L. A. McDonald, T. V. Feldblyum, S. Angiuoli, T. Dickinson, E. K. Hickey, I. E. Holt, B. J. Loftus, F. Yang, H. O. Smith, J. C. Venter, B. A. Dougherty, D. A. Morrison, S. K. Hollingshead, and C. M. Fraser. 2001. Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293:498–506.

64. Thanabalu, T., J. Hindley, J. Jackson-Yap, and C. Berry. 1991. Cloning, sequencing, and expression of a gene encoding a 100-kilodalton mosquitocidal toxin from Bacillus sphaericus SSII-1. J. Bacteriol. 173:2776–2785.

65. Thanabalu, T., and A. G. Porter. 1996. A Bacillus sphaericus gene encoding a novel type of mosquitocidal toxin of 31.8 kDa. Gene 170:85–89.

66. Thiery, I., C. Back, P. Barbazan, and G. Sinegre. 1996. Application de Bacillus thuringiensis et de B. sphaericus dans la demoustication et la lutte contre les vecteurs de maladies tropicales. Ann. Inst. Pasteur Actual. 7:247–260.

67. Tinsley, E., and S. A. Khan. 2006. A novel FtsZ-like protein is involved in replication of the anthrax toxin-encoding pXO1 plasmid in Bacillus anthracis. J. Bacteriol. 188:2829–2835.

68. Wati, M., T. Thanabalu, and A. G. Porter. 1997. Gene from tropical Bacillus sphaericus encoding a protease closely related to subtilisins from Antarctic bacilli. Biochim. Biophys. Acta 1352:56–62.

69. Weickert, M. J., and G. H. Chambliss. 1990. Site-directed mutagenesis of a catabolite repression operator sequence in Bacillus subtilis. Proc. Natl. Acad. Sci. USA 87:6238–6242.

70. Xu, D., and J. C. Cote. 2003. Phylogenetic relationships between Bacillus species and related genera inferred from comparison of 3′ end 16S rDNA and 5′ end 16S–23S ITS nucleotide sequences. Int. J. Syst. Evol. Microbiol. 53:695–704.

71. Yoshida, K. I., K. Kobayashi, Y. Miwa, C. M. Kang, M. Matsunaga, H. Yamaguchi, S. Tojo, M. Yamamoto, R. Nishi, N. Ogasawara, T. Nakayama, and Y. Fujita. 2001. Combined transcriptome and proteome analysis as a powerful approach to study genes under glucose repression in Bacillus subtilis. Nucleic Acids Res. 29:6683–6692.

72. Yousten, A. A., and W. D. Elizabeth. 1982. Ultrastructural analysis of spores and parasporal crystals formed by Bacillus sphaericus 2297. Appl. Environ. Microbiol. 44:1449–1455.

73. Yuan, Z. M., C. Neilsen-LeRoux, N. Pasteur, A. Delecluse, J. F. Charles, and R. Frutos. 1999. Cloning and expression of the bin genes of Bacillus sphaericus C3-41 in a crystal minus B. thuringiensis subsp. israelensis. Acta Biol. Sin. 39:29–35.

74. Yuan, Z. M., C. Rang, R. C. Maroun, J. P. Victor, R. Frutos, N. Pasteur, C. Vendrely, C. Jean-Francois, and C. Nielsen-LeRoux. 2001. Identification and molecular structural prediction analysis of a toxicity determinant in the Bacillus sphaericus crystal larvicidal toxin. Eur. J. Biochem. 268:2751–2760.

75. Zhang, Y., E. Liu, C. Cai, and Z. Chen. 1987. Isolation of two highly toxic Bacillus sphaericus strains. Insecticidal Microorg. 1:98–99.
