overlapping genes in bacterial and phage genomes



Molecular Biology, Vol. 34, No. 4, 2000, pp. 485-495. Translated from Molekulyarnaya Biologiya, Vol. 34, No. 4, 2000, pp. 572-583. Original Russian Text Copyright © 2000 by Scherbakov, Garber.

UDC 577.21

Overlapping Genes in Bacterial and Phage Genomes

D. V. Scherbakov and M. B. Garber

Institute of Protein Research, Russian Academy of Sciences, Pushchino, 142292 Russia; E-mail: dimash@vega.protres.ru

Received February 1, 2000

Abstract: Overlapping of genes was first reported for small viruses and was initially considered a phenomenon related to natural selection favoring genome size reduction. Later, overlapping genes were found in the genomes of bacteria, mitochondria, and chloroplasts, and abundant evidence of their transcriptional and posttranscriptional interaction was obtained. We suggest that overlapping of genes is important for complex regulation of gene expression and/or for better interaction at the transcriptional or translational level. This review describes various types of overlapping genes and operons and the molecular mechanisms of overlapping gene interaction.

Key words: genome, bacteria, phages, operon overlapping, gene overlapping, regulation of transcription, regulation of translation

INTRODUCTION

The term "gene" is at present taken to include not only the protein-encoding region but also the numerous regulatory elements, including those rather distant from the coding part. The term "overlapping genes" is rather imprecise, because it is applied to DNA regions coding for more than one polypeptide (or more than one RNA molecule). It would therefore be more appropriate to speak of "overlapping reading frames." However, overlapping of only the regulatory regions of distinct genes is also of interest, because in these cases regulation of expression of these genes is more or less interrelated. In this review we use the term "gene" for the coding region (reading frame) plus its regulatory regions. That is, "overlapping genes" means that either coding or regulatory gene regions overlap.

Gene overlapping was first reported for phages with a single-stranded DNA (ssDNA) genome [1, 2]; it was initially considered an evolutionary advantage gained by viruses (including phages and eukaryotic viruses) under natural selection favoring a general decrease of genome size in this group. It was supposed that gene overlapping allows viruses to increase the density of valuable information within their genomic DNA. However, this point of view was weakened by discoveries of gene overlapping in other groups of organisms, for which genome size does not seem to be of critical importance. At present, gene overlapping is considered essential for coordinated regulation of gene expression and/or for subsequent protein-protein interaction.

OVERLAPPING OF GENES WITHIN AN OPERON

Overlapping of certain cistrons within an operon is quite a widespread phenomenon. In the early 1980s, cistron overlapping was already known for the trp [3], his [4], gal [5], and frd [6] operons of E. coli, as well as for the phages φX174 [7], G4 [2], T7 [8], and λ [9]. More recently, wide-scale genome sequencing and computer analysis have revealed hundreds of examples of this type.

The cases of cistron overlapping fall into three quite distinct groups: terminal overlapping, involving only very small N- and C-terminal fragments of a coding sequence (most often the stop and start codons only); "out-of-phase" overlapping, when rather large regions of the two genes overlap in distinct reading frames; and "in-phase" overlapping, when two polypeptides are translated from a certain mRNA region within one reading frame.

It should be noted that in the first two cases, that is, terminal or "out-of-phase" overlapping, normal translation of the overlapping cistrons is provided by site-specific programmed shift of the reading frame (PSRF). This phenomenon was recently reviewed in detail [10]; therefore we do not describe it here. However, it should be noted that, in contrast to the random (spontaneous) shift of the reading frame (caused by various factors, for example by mutations in some tRNA genes [11]), site-specific PSRF depends only on the mRNA structure. Several regulatory signals probably act together to provide this frame shifting. At the site of the shift a "flippy" sequence should be located, usually composed of several identical nucleotides or containing a stop codon. Upstream of this sequence there should be an rRNA-binding site similar to the Shine-Dalgarno sequence. It is suggested that the distance between the rRNA-binding site and the "flippy" sequence determines the direction of the frame shifting: at a short distance the ribosome is pushed forward to the +1 frame, and at a long distance the ribosome is pulled back to the -1 frame [10]. Immediately downstream of the "flippy" sequence a region of slow (for any reason) translation is usually located; it induces a short pause in ribosome action. This region may contain codons recognized by rare tRNAs, or may form a rather stable secondary structure, as for example in IS911 [12].

0026-8933/00/3404-0485$25.00 © 2000 MAIK "Nauka/Interperiodica"
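The three groups of cistron overlapping distinguished above can, to a first approximation, be told apart from gene coordinates alone. The following Python sketch is only an illustration, not a method from the reviewed literature: the function name `classify_overlap`, the coordinate convention (0-based, half-open, same strand, stop codon included in the cistron), and the 4-nt cutoff separating "terminal" from "out-of-phase" overlaps are all assumptions.

```python
def classify_overlap(upstream, downstream, terminal_max_nt=4):
    """Classify the overlap of two cistrons located on the same strand.

    Each cistron is a (start, end) pair: 0-based, half-open, with the stop
    codon included. terminal_max_nt is an illustrative cutoff (assumption).
    """
    up_start, up_end = upstream
    down_start, down_end = downstream
    overlap = min(up_end, down_end) - max(up_start, down_start)
    if overlap <= 0:
        return "none"
    # Same reading frame: the two start positions differ by a multiple of 3.
    if (down_start - up_start) % 3 == 0:
        return "in-phase"
    if overlap <= terminal_max_nt:
        return "terminal"
    return "out-of-phase"


# Hypothetical coordinates, chosen only to mimic overlap sizes of the kind
# discussed in this review.
print(classify_overlap((0, 300), (299, 600)))    # one-base stop/start overlap
print(classify_overlap((0, 300), (296, 600)))    # 4-nt overlap such as ATGA
print(classify_overlap((0, 219), (196, 1267)))   # 23-nt overlap, other frame
print(classify_overlap((0, 1701), (189, 1701)))  # internal start, same frame
```

The cutoff is deliberately a parameter: the review itself treats a 20-bp overlap within an operon alongside the terminal cases, so any fixed threshold is a simplification.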

Terminal Overlapping of Genes

The most common type of cistron overlapping is the first, in which the terminator site of one cistron overlaps the initiator site of the next (for example, ATGA). As a rule, these are cases in which similar or functionally dependent genes overlap, and regulation of these genes is interrelated at the level of translation.

This type of interaction was first reported for the trp operon of E. coli. This operon contains five genes, whose products are synthesized in equimolar ratios with the total mRNA. A nonsense mutation in the proximal gene trpE has a strong polar effect on expression of the adjacent gene trpD, with absolutely no effect on the more distal genes trpC, trpB, and trpA. At the same time, a rho-mutation in trpE has a strong polar effect on all distal genes.

Sequencing of the trpE-trpD intercistron region showed a one-base overlap of the terminator and initiator codons. Two models were developed to describe the translational interaction of the cistrons trpE and trpD. According to the first model, the ribosome translating the 3'-terminal region of the trpE mRNA alters the secondary and/or tertiary structure of this region so that the ribosome-binding site of the cistron trpD becomes available for translation. The second model, accepted by most researchers, suggests a PSRF mechanism combined with reinitiation, in which one and the same ribosome, after finishing translation of a cistron, shifts back along the mRNA to translate the next cistron [3].
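Junctions of this kind are easy to look for in a sequence. The sketch below is a minimal illustration under stated assumptions: start codons are restricted to ATG, so that a stop and a start codon sharing one nucleotide can only read TAATG or TGATG, while ATGA places the start codon ATG over the stop codon TGA. The function name `find_terminal_overlap_motifs` and the toy sequence are hypothetical, and a motif hit says nothing about whether the two codons actually lie in annotated reading frames.

```python
import re

# Junction motifs in which a stop codon and an ATG start codon share
# nucleotides (DNA alphabet). A lookahead is used so that overlapping
# occurrences are all reported.
JUNCTION = re.compile(r"(?=(ATGA|TAATG|TGATG))")


def find_terminal_overlap_motifs(dna):
    """Return (position, motif) for every candidate stop/start junction."""
    return [(m.start(), m.group(1)) for m in JUNCTION.finditer(dna.upper())]


toy = "CCCTGATGCCCATGACC"  # hypothetical sequence, not taken from any genome
print(find_terminal_overlap_motifs(toy))
```

In practice such hits would have to be checked against the annotated ends of neighboring cistrons before being called terminal overlaps.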

A similar interrelation of expression at the translational level is found in many overlapping genes. In the same trp operon the stop codon of the cistron trpB overlaps the start codon of the downstream cistron trpA, so that the Shine-Dalgarno sequence is located within the coding region. Using various constructs in which the reading frame of the cistron trpB was altered to induce premature termination of translation, it was shown that translation termination at the normal stop codon is essential for expression of the cistron trpA [13]. In this case the same mechanism appears to operate, i.e., a combination of PSRF and reinitiation, in which the ribosome, without leaving the mRNA, shifts several nucleotides back (the Shine-Dalgarno sequence takes an active part in this process) to begin synthesis of a new polypeptide at the initiator codon.

A close interrelation at the translational level, probably controlled by the same mechanism, has also been shown for the genes ompR and envZ, which form the ompB operon in Salmonella typhimurium. The genes ompR and envZ encode, respectively, the positive transcription regulator OmpR and the membrane protein EnvZ; these proteins are essential for synthesis and normal functioning of the proteins OmpC and OmpF, located at the external side of the membrane. The coding regions of the cistrons ompR and envZ show a 4-bp overlap. When translation of ompR was terminated away from the normal stop codon, the translation efficiency of envZ decreased 10-fold or more [14].

Translational interrelation of the products of overlapping genes may be of practical importance. For example, an artificial hybrid operon with partly overlapping genes allowed a considerable increase in the yield of human leukocyte interferon (IFN-αF) in E. coli cells [15]. In this operon the gene trpD is introduced upstream of the gene encoding IFN-αF: the stop codon of trpD partly overlaps the start codon of the gene encoding IFN-αF. Since the 5' end of the mRNA encoding IFN-αF is able to form stable hairpins with the bacterial Shine-Dalgarno sequence, its translation is rather slow; introduction of the partly overlapping genes sharply increases the rate of IFN-αF synthesis. It may be concluded that in this case reinitiation coupled with PSRF is more efficient than initiation.

Overlapping may involve not only two but many more cistrons. In the rbs operon, encoding the proteins of the ribose transport system, three cistrons overlap: rbsD, rbsA, and rbsC. The effect of translational interrelation in the rbs operon is much weaker than in the trp and ompB operons. Since in the rbs operon the ribosome-binding site of the distal cistron overlaps the coding sequence of the proximal cistron, interaction at translation is probably ensured by reinitiation [16].

Five genes encoding structural proteins of the baseplate overlap in the genome of phage T4: genes 9, 10, 11, 12, and wac are transcribed from one late promoter and have one terminator, which allows us to consider them, in general, as one operon in which the stop codon of an upstream gene overlaps the start codon of the next gene [17].

The nik operon of E. coli includes five genes encoding the proteins of the nickel transport system. The coding sequences of these genes, nikA, nikB, nikC, nikD, and nikE, overlap at their terminator and initiator ends. Translation of the mRNA is coordinated, and all polypeptides are synthesized in an equimolar ratio [18].


The spc operon of Thermus thermophilus, carrying the genes of 11 ribosomal proteins and one nonribosomal protein, shows overlapping of the coding regions of the genes rps17 and rpl14, rpl14 and rpl24, rps8 and rpl6, rps5 and rpl30, and rpl30 and rpl15. In most cases only the stop and start codons are involved in the overlap, but the pair rps5 and rpl30 shows a 20-bp overlap of the reading frames [19].

Comparative analysis of terminal overlapping of cistrons allows one to draw some conclusions. First, expression of overlapping cistrons is almost always interrelated at the translational level. This interrelation may be explained by a combination of PSRF and reinitiation and should exist when the products of the overlapping genes form a complex in the cell or are tightly related functionally. In these cases gene overlapping and its polar effect on translation regulation may serve as an additional way to provide coordinated polypeptide synthesis. The opposite is also true: if two cistrons overlap, they most probably encode structurally or functionally related polypeptides. Such a relation may be suspected already if the cistrons are located within the same operon: their expression in this case should be more or less coordinated at the translational level (this concept is presented in the review [20]). From this point of view, terminal overlapping is the next step in coordination of protein synthesis at the translational level. In the great majority of cases the proteins encoded by overlapping cistrons are synthesized in an equimolar or close to equimolar ratio, and this ratio is essential for their normal functioning in the cell. Second, expression of overlapping cistrons is controlled only jointly, at the transcriptional, posttranscriptional, or translational level, which confirms the above statements. No individual regulation of such cistrons has been reported.

A good example of the last statement is regulation of the pyr operon in Bacillus subtilis. This operon contains genes encoding products involved in pyrimidine biosynthesis; it includes an untranslated leader sequence of 151 bp, the cistron pyrR, an intercistron spacer (173 bp), the cistron pyrP, one more intercistron spacer (145 bp), and then eight overlapping cistrons encoding six enzymes of de novo pyrimidine biosynthesis. The operon is regulated by a well-developed attenuation system that includes three intraoperon regions of transcription termination, located in the leader and in the two spacer sequences. Each terminator is preceded by an antiterminator region, which in the free state forms a secondary structure preventing termination. If the cell contains enough pyrimidines, the protein PyrR binds to all three antitermination sites, destroys their secondary structure, and allows formation of the terminator hairpin, resulting in inhibition of transcription [21]. Complete repression of transcription of the pyr operon is probably very important for the cell. It is achieved by termination at three independent sites located outside the overlapping region, showing that transcriptional regulation of these genes is coordinated.

In certain cases terminal overlapping of cistrons combined with PSRF may lead not to coordinated translation of two proteins but to synthesis of one extended polypeptide. This was described, for example, for the genes 10A and 10B of phages T7 and T3 and for the genes G and T of phage λ.

The capsid protein p10B is synthesized via a -1 frameshift that occurs at 10% frequency near the 3' end of the gene encoding the protein 10A, a few nucleotides from the stop codon. As a result, the C-terminal region of p10A is extended by 53 amino acid residues. The shifting site includes the phenylalanine codons UUU and UUC, as well as a 3' region with well-developed secondary structure [22, 23].

The genes G and T, which show a four-triplet overlap in the phage λ genome, are located close to the genes encoding the phage tail proteins. The gene G encodes the 16-kDa protein gpG; the product of the gene T is unknown. Together these genes encode the 31-kDa protein gpG-T, which is synthesized via a -1 PSRF occurring in about 4% of cases during translation of the sequence GGGAAAG, which encodes the dipeptide Gly-Lys in both reading frames. Interestingly, the regulatory signals common for PSRF are absent in this case: there is no Shine-Dalgarno sequence on the 5' side (this is probably the reason why the cistron T is not translated) and no distinct mRNA secondary structure on the 3' side [24]. Although the protein gpG-T is not found in mature virus particles, it is essential for the integrative development of the phage tail [25].
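That the heptanucleotide GGGAAAG quoted above specifies Gly-Lys in two reading frames differing by one nucleotide can be checked directly. The Python sketch below is only a worked check: it builds the standard genetic code table in the DNA alphabet, and the helper name `translate` is an assumption of this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate the complete codons of dna, starting at the given offset."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


site = "GGGAAAG"           # shift site quoted in the text (DNA alphabet)
print(translate(site, 0))  # reads codons GGG AAA
print(translate(site, 1))  # reads codons GGA AAG
```

Both calls yield the same two-residue peptide (G, K in one-letter code), so the tRNAs paired at this site can re-pair after a one-nucleotide slip without any change in the amino acids incorporated.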

"Out-of-Phase" Gene Overlapping

Gene overlapping may be more extensive than described above. In these cases the terminator codon of one cistron is located deep within the coding sequence of the subsequent cistron, and the overlapping region contains two distinct reading frames. In some rather rare cases all three reading frames of a certain DNA fragment may be coding, as for example for the genes A, C, and K in the genome of phage G4 [2].

This "out-of-phase" type of gene overlapping, mainly common to phages (or rather to viruses in general), usually implies a close relation of expression regulation for the overlapping genes at the level of translation. A classical case is overlapping of the genes encoding the wall (coat) protein and the lysis protein in small ssRNA phages, for example f2, MS2, and R17. We describe this case in more detail to show the logic of the related studies.

In earlier genomic studies of the small RNA-containing phages only three genes, A, C, and S, were detected, encoding a maturation protein, a wall protein, and the RNA-dependent RNA synthetase (replicase). The fourth gene, encoding the lysis protein (L protein), was found in 1979 in phage f2 carrying an opal (UGA) mutation. This phage, Op3, was able to reproduce within E. coli cells but unable to provide cell lysis and liberation of the phage particles [26]. It was found that Op3 lacks the L protein of 75 amino acid residues. However, synthesis of the L protein and the ability to lyse the host cell were recovered when Op3 was transfected into an E. coli strain carrying an opal-suppressor mutation. These data contributed to localization of the gene encoding the L protein in the completely sequenced genome of the phage MS2, a relative of the phage f2 [27].

The start codon of the gene encoding the L protein is located inside the gene encoding the wall protein, not far from its terminator codon. The L gene overlaps this gene "out of phase," with a +1 frameshift, and covers the 36-bp intercistron spacer and 142 bp of the replicase gene, also with a +1 frameshift; the cistrons C and S are read in the same frame. A C → U substitution in this region was detected in the Op3 mutant: this substitution, in the codon for the second amino acid residue of the replicase, stops synthesis of the L protein at the 30th amino acid. The lowered replicase synthesis shown earlier for Op3 is supposed to result from an alteration of the mRNA secondary structure near the initiator codon caused by this substitution, which reduces the availability of this site for ribosomes [28].

The L protein of phage f2 is synthesized in very small amounts both in an in vitro cell-free translation system containing native phage RNA and in vivo. The time of its appearance in the infected cell corresponds rather exactly to the start of wall protein synthesis. These data, as well as the observation that amber C-gene mutants are unable to lyse the host cell, suggested that expression of the L protein is regulated by the wall protein [29].

This process was studied using a great number of phage MS2 derivatives. It was shown that transfer of a fragment of the wall-protein gene from the region upstream of the overlapping region to the region downstream of the gene encoding the L protein abolishes the ability of the phage to lyse the host cell, although the truncated protein synthesized is functionally active and the 5'-terminal fragment of the cistron L is not affected. This experiment showed that overlapping of the C and L cistrons is essential for synthesis of the L protein.

Two possible mechanisms of polar control were considered for translation of the L protein. The first possibility is that the ribosome can start translation of the L cistron only by using a putative ribosome-binding site located in the wall-protein mRNA upstream of the initiator codon of the L protein. Such a site was in fact found in the C cistron; however, it was shown that lysis of the host cell is blocked when a stop codon (UAA) is introduced between this site and the initiator codon of the L cistron. This suggests that the translational state of the ribosome is important for initiation of L-protein translation in the wild-type phage.

Deletion of one nucleotide upstream of the stop codon in the described construct (inducing a +1 frameshift) favors intense lysis of the host cells. Therefore, a second possible mechanism was suggested, according to which initiation of translation of the L protein is induced by PSRF during normal reading of the C cistron. Since this type of frame shift is a rather rare event, the amount of L protein synthesized should be very small [30].

The importance of PSRF for initiation of L cistron translation was confirmed by experiments using E. coli cells with enhanced translation fidelity. These cells were not lysed by the wild-type f2 phage [31]. Since E. coli cells produce enough L protein for its synthesis to be considered independent of random frame shifts, attempts were made to use deletion mutants to detect the sequences stimulating the frame shift. The most essential were alterations of the region upstream of the start codon [30], in accordance with later data showing the importance for PSRF of a structure similar to the Shine-Dalgarno sequence.

One more argument in favor of the PSRF model is the inaccessibility of the isolated L cistron to the ribosome. In this case synthesis of the L protein may proceed only if a small part of the coding sequence at the 5' end is deleted. However, this requires the presence of an initiator codon and of a Shine-Dalgarno sequence; the inaccessibility is probably related to formation of a hairpin structure at the 5' end that blocks ribosome binding to the mRNA and initiation of translation [27].

Translational interaction of cistrons overlapping "out of phase" may be not only positive but also negative. Overlapping of the genes xis and int in phage λ is an example of the latter [32]. The 3' end of the gene xis and the 5' end of the gene int show a 23-bp overlap. The products of these genes, together with the proteins IHF and Fis, provide excision recombination of phage DNA. The protein Int (integrase) is essential for both integration and excision, while the protein Xis (excisionase) is involved only in excision. Regulation of integration and excision at various stages of phage development is described in detail in [33]; therefore, here we present only the data on lysogen induction, when both proteins, Xis and Int, should be synthesized, the concentration of the first being much higher than that of the second. According to the model of negative translation regulation, more active expression of the cistron xis is provided by the ribosomes that translate this cistron in the region of its overlap with the cistron int, blocking ribosome binding and translation initiation in the latter [34].


Cases in which one cistron lies entirely within another, with a frame shift, may be called "unusual cases" of "out-of-phase" gene overlapping. This was shown, for example, for the cistrons RZ and RZ1 of phage λ, the cistrons 30.3 and 30.3' of phage T4 [35], and the cistrons D and E of phage φX174 [1]. The first example is described below in more detail.

The cistron RZ encodes one of the late phage λ proteins essential for lysis of the host cell. It is known that lysis of the host cell by phage λ requires three proteins, the products of the genes S, R, and RZ. A fourth protein involved in cell lysis was found when the gene RZ was expressed in vitro: two proteins, of 17.2 and 6.5 kDa, were shown to be synthesized in the system [36]. The 6.5-kDa protein was identified as the product of the gene RZ1, located within the gene RZ with a frame shift [37]. The exact mechanism regulating expression of the gene RZ1 remains unknown; most probably, PSRF occurs with a certain frequency during translation of the gene RZ [38].

In contrast to terminally overlapping genes, genes with "out-of-phase" overlapping by no means always show interrelated translation, even if one gene lies entirely within another. For example, the genes D and E of phage φX174 are translated independently of each other [39].

In certain cases overlapping genes, for example the genes A and B of the small ssDNA-containing phages φX174 and G4, may have different promoters and consequently give rise to different mRNAs [1, 2]. This situation is rather distinct from operon overlapping; in these cases one may speak of alternative transcription.

"In-Phase" Gene Overlapping

This type of gene overlapping is rather common for viruses, though it is also known for bacteria. In many cases genes overlapping "in phase," that is, in coincident reading frames, may be considered as one gene changing with time.

The cases of "in-phase" gene overlapping fall into two main categories, affecting either initiation or termination of translation.

The first may be considered the final step of alternative transcription, when the promoter of one gene is located within the reading frame of another gene and termination of both is controlled by the same terminator. The resulting transcripts are translated from two distinct initiation codons, and the synthesized polypeptides have dissimilar N ends and identical C ends. This situation was described, for example, for the genes C and Nu3 of phage λ [9] and for the genes providing DNA replication in the ssDNA phages φX174, G4, f1, and M13 [2, 7, 40]. The synthesized enzymes are usually able to bind one and the same substrate (because of their identical C ends) but catalyze different reactions by means of their unique N-terminal domains. Interaction of the identical C-terminal domains probably results in formation of an oligomeric complex that functions through its distinct N-terminal regions [34].

The genes encoding the proteins Tnp and Inh of the transposon Tn5 may serve as an example [41, 42]. This transposon contains two terminal inverted imperfect repeats (IS50 regions) and a central region with the genes controlling resistance to antibiotics. Both IS50 regions contain signals for the start of transcription and translation. However, full-sized polypeptides are synthesized only from the transcripts starting in the right repeat (IS50R), because the left repeat contains a stop codon causing premature termination. The proteins Tnp (transposase) and Inh (transposition inhibitor) are translated from one and the same mRNA and differ only in their N-terminal sequences: Tnp is 55 amino acid residues longer. These proteins regulate transposition.

One more example is "in-phase" overlapping of the genes 4A and 4B of phage T7. Gene 4A contains 567 triplets and encodes a polypeptide showing both primase and helicase activity. Codon 64 of gene 4A is the initiator codon of gene 4B; therefore the polypeptide encoded by gene 4B is 63 amino acids shorter and shows only helicase activity. The biological role of this phenomenon remains unclear, because it was shown that the product of gene 4A, possessing both activities, provides normal phage growth and development in the cell [43].

Aspartokinase from the extreme thermophile Thermus flavus is an example of complex formation by the products of "in-phase" overlapping genes. The gene askA encodes the α subunit of aspartokinase (405 amino acids), and the gene askB, which corresponds exactly to the 3' part of the gene askA, encodes within the same reading frame the β subunit, identical to the C-terminal part of the AskA protein (161 amino acid residues) [44]. In this case formation of the oligomeric complex may probably be explained by interaction of the identical C-terminal domains.

One more example of combined activity of the products of "in-phase" overlapping genes may be found in thread-like phages, for example f1. The replication complex of this phage includes proteins encoded by the genes II and X: gene X corresponds to the 3' part (about one third) of gene II [45]. The product of gene II is essential for the phage life cycle, participating in all steps of phage DNA replication. The presence of the protein pX stimulates replication [46]; however, at an excess concentration of pX the synthesis of phage-specific DNA is strongly inhibited [47]. The proteins pII and pX are translated from different mRNAs formed by processing with RNase E [48].
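The relation between an internal in-frame initiator codon and the length of the shorter product can be illustrated on a toy reading frame. This is a minimal Python sketch under stated assumptions: the function name `in_frame_products` and the sequence `toy_orf` are hypothetical, only ATG is treated as an initiator, and the reading frame is assumed to begin at the first start codon and end with the stop codon.

```python
def in_frame_products(orf):
    """Map each in-frame ATG (1-based codon number) to the codons read from it.

    orf is assumed to start at the first start codon and to end with the
    stop codon, so every product shares the same 3' end.
    """
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    return {n + 1: codons[n:] for n, codon in enumerate(codons) if codon == "ATG"}


toy_orf = "ATGGCTAAAATGCCGGATTAA"  # hypothetical ORF with a second in-frame ATG
products = in_frame_products(toy_orf)
long_form, short_form = products[1], products[4]
print(len(long_form) - len(short_form))            # codons missing from the short form
print(long_form[-len(short_form):] == short_form)  # shared C-terminal part
```

Initiation at codon n removes n - 1 codons from the N terminus and leaves the C-terminal part unchanged, which is the arithmetic behind the 63-residue difference quoted for initiation at codon 64.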

Fig. 1. Reprogramming process providing biosynthesis of the CS3 pili assembly system in E. coli. The UAG stop codon is read by the ribosome as a glutamine codon.

The second type of "in-phase" gene overlapping involves synthesis of several polypeptides from one and the same initiator codon with termination at distinct codons. A good example is provided by the genes involved in the synthesis and assembly of CS3 pili in E. coli. Five polypeptides of 104, 63, 48, 33, and 20 kDa are encoded in one reading frame starting from the same initiator codon. This frame is interrupted four times by UAG stop codons, which may be read by the ribosome as glutamine codons, so that five proteins of different size with identical N-terminal sequences are synthesized (Fig. 1).

The mechanism providing the high frequency of stop-codon redefinition and readthrough remains unknown. An essential role has been suggested for a well-developed mRNA secondary structure, including a pseudoknot, immediately downstream of the UAG codon [49].
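How one reading frame with leaky UAG codons yields a nested family of products can be sketched as follows. The toy mRNA and the reduced codon table are hypothetical; the sketch only assumes, as stated in the text, that each UAG either terminates translation or is decoded as glutamine. It says nothing about the real readthrough frequency, which the text does not give.

```python
# Reduced codon table for the made-up mRNA; UAG maps to Q (Gln) on readthrough.
TOY_TABLE = {"AUG": "M", "GCU": "A", "AAA": "K", "UUC": "F", "UAG": "Q"}
FINAL_STOP = "UAA"  # assumed non-leaky terminator at the end of the frame

def readthrough_products(mrna):
    """Return every polypeptide made from one start codon when each UAG
    may either terminate translation or be read through as glutamine."""
    products, peptide = [], ""
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == FINAL_STOP:
            products.append(peptide)   # obligatory termination
            break
        if codon == "UAG":
            products.append(peptide)   # product released at this UAG
        peptide += TOY_TABLE[codon]    # otherwise translation continues
    return products

toy_mrna = "AUGGCUUAGAAAUAGUUCUAA"     # two internal UAG codons
products = readthrough_products(toy_mrna)
print(products)
```

With two internal UAG codons the toy frame gives three products that share the same N-terminus; by the same counting, the four UAG codons described above give five.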

The reverse situation is also possible, in which the synthesized protein is not elongated by stop-codon readthrough but shortened by the PSRF. Synthesis of the τ and γ subunits of E. coli DNA polymerase III may serve as an example [50]. The τ subunit is the more essential for DNA polymerase function: mutants lacking this subunit do not survive, whereas the absence of the γ subunit only alters the kinetic parameters of the holoenzyme [51].

The product of dnaX gene expression is a 71-kDa polypeptide, the τ subunit. The PSRF about two thirds of the gene length downstream of the initiator codon results in premature termination and formation of a truncated 47-kDa product, the γ subunit. The PSRF frequency in this case is about 50% [52], and the functional role of the PSRF is probably to provide equal amounts of the two subunits. No flexible regulation of this process has been found so far; most probably, equimolar synthesis of the products results from equal probability of the two events: either continued translation in the zero reading frame, or a frame shift inducing premature termination.
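The link between a frameshift frequency of about 50% and equimolar synthesis is simple arithmetic, shown here as a minimal sketch. The function name and the number of ribosomes are illustrative assumptions; the model treats every ribosome as making an independent choice at the shift site and ignores any difference in product stability.

```python
def dnax_yields(ribosomes, frameshift_prob):
    """Expected numbers of full-length and truncated dnaX products.

    Each ribosome reaching the shift site either stays in the zero frame
    (full-length 71-kDa product) or shifts frame and terminates early
    (truncated 47-kDa product).
    """
    truncated = ribosomes * frameshift_prob
    full_length = ribosomes - truncated
    return full_length, truncated

full_length, truncated = dnax_yields(ribosomes=1000, frameshift_prob=0.5)
print(full_length, truncated, full_length / truncated)
```

At a shift probability of 0.5 the two products are expected in a 1:1 ratio; any other value would skew the ratio to (1 - p) : p.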

The described peculiarities of the dnaX gene expression were found not only in E. coli, but also in other bacteria, for example in T. thermophilus [53].

Antiparallel Overlapping Genes

When discussing this phenomenon, one should clearly distinguish antiparallel overlapping genes from antiparallel overlapping open reading frames. Long antiparallel overlapping open reading frames are quite frequent: they are found in many IS elements, for example in IS1 [54], IS2 [55], IS4 [56], IS5 [57], etc., in genes E, L, and J of phage λ [9], and in several genes of E. coli [58]. A computer search of various genomes for antiparallel overlapping reading frames [59] detected 40% overlap and 95-100% overlap of antiparallel reading frames in about 30% and 4-5% of E. coli genes, respectively. The results also confirmed the earlier suggestion [60] that the number and length of antiparallel reading frames depend on codon frequencies and on the G/C composition of the gene.
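A search of this kind can be sketched in a few lines. This is not the procedure of [59], whose details are not given here; it is a minimal illustration that assumes an "open reading frame" is simply a run of codons without a stop and that measures the longest such run on the strand opposite to a gene. All names and the toy sequence are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_open_frame(seq):
    """Length (nt) of the longest stop-free run of codons in any of the three frames."""
    best = 0
    for frame in range(3):
        run = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOPS:
                run = 0          # a stop codon closes the current open stretch
            else:
                run += 3
                best = max(best, run)
    return best

def antiparallel_overlap_fraction(gene):
    """Fraction of the gene length covered by the longest stop-free frame
    on the opposite strand."""
    return longest_open_frame(reverse_complement(gene)) / len(gene)

toy_gene = "ATGGCCGCAGCGGCTTAA"   # made-up GC-rich gene
print(antiparallel_overlap_fraction(toy_gene))
```

The GC-rich toy gene has no stop codon in one frame of its opposite strand, which illustrates why the frequency of long antiparallel frames is expected to depend on codon usage and G/C composition.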

In the vast majority of cases, antiparallel reading frames lack start signals for transcription and translation and therefore cannot be considered genes. Antiparallel overlapping genes encoding proteins were found only in some IS elements, for example in IS4 and IS5: the latter allows synthesis, besides the main protein (transposase), of an additional 13.3-kDa protein encoded by the antiparallel reading frame [61]. The gene encoding this additional polypeptide has its own promoter and terminator sites and its own ribosome-binding site. Transcription is therefore initiated from promoters of opposite orientation, which may be important for regulation of expression. As shown earlier [62], if a weaker and a stronger promoter are oriented toward each other, the efficiency of the weaker one is lowered still further.

Since the discovery of antiparallel open reading frames, their origin and role in the formation of new genes and in molecular evolution have been actively discussed [63, 64]. Computer analysis of homologies between antiparallel and open reading frames [65] has shown some cases of high (up to 70%) homology of the encoded amino acid sequences. Antiparallel reading frames may serve as a source of new genes or as an inducer of sharp changes in existing genes if the transfer of a DNA fragment places the antiparallel reading frame under control of a promoter or into the reading frame of an existing gene.

Fig. 2. Overlapping of the operons frd and ampC in E. coli. P, promoter; t, terminator.

OVERLAPPING OF OPERONS AND ITS POSSIBLE RELATION TO EVOLUTION OF BACTERIAL GENOMES

Recent data favor the hypothesis that the positioning of operons on chromosomal DNA is not random and that neighboring operons are able to interact with each other. This certainly holds for overlapping operons.

As a rule, the term operon is attributed to bacterial genomes, so here we describe operon overlapping in bacteria.

Operon overlapping may be of two types: tandem, when the distal part of the first operon overlaps with the proximal part of the second, and antiparallel, or face-to-face, when the proximal part of one operon overlaps with the proximal part of a second operon facing the first (most often the overlap is restricted to the promoter regions). Antiparallel promoter overlapping was first described in bacteria but is more common in phages.
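The two types can be told apart from map coordinates and strand alone, as in the minimal sketch below. The Operon record, the classify_overlap function, and all coordinates are hypothetical illustrations. The classifier is deliberately simplified: any overlap between operons on the same strand is called tandem and any overlap between operons on opposite strands is called antiparallel, without checking which parts (promoters or terminators) overlap.

```python
from dataclasses import dataclass

@dataclass
class Operon:
    name: str
    start: int   # leftmost coordinate, bp (regulatory regions included)
    end: int     # rightmost coordinate, bp
    strand: str  # "+" or "-"

def classify_overlap(a, b):
    """Return 'none', 'tandem', or 'antiparallel' for two operons."""
    if a.end < b.start or b.end < a.start:
        return "none"                      # the intervals do not intersect
    return "tandem" if a.strand == b.strand else "antiparallel"

# Made-up coordinates, chosen only to reproduce the two arrangements.
upstream = Operon("operon A", 1, 3300, "+")
downstream = Operon("operon B", 3270, 4500, "+")
divergent_left = Operon("operon C", 1, 900, "-")
divergent_right = Operon("operon D", 860, 2400, "+")

print(classify_overlap(upstream, downstream))
print(classify_overlap(divergent_left, divergent_right))
```

A fuller classifier would also record which element of each operon lies in the shared interval, since overlapping promoters and overlapping terminators have different regulatory consequences.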

Operon overlapping was first reported for the operons frd and ampC of E. coli. These operons are located at minute 96 of the chromosomal genetic map. The operon frd encodes the proteins of the fumarate reductase complex, FrdA, FrdB, FrdC, and FrdD, which catalyzes the last step of electron transfer under anaerobic conditions. The first two proteins, FrdA and FrdB, are equimolar components of fumarate reductase, and the other two, FrdC and FrdD, anchor the enzyme to the cytoplasmic membrane. The operon ampC contains a single gene, which encodes β-lactamase. The promoter and terminator regions of these operons were determined using sequencing, in vitro transcription, and primer extension [6, 66].

The promoter of the operon ampC is located within the gene frdD, approximately 30 bp upstream of its termination codon, so that the termination codon of frdD lies between the -10 region and the transcription start point of the operon ampC (Fig. 2). The terminator of the operon frd is located within the 5' untranslated leader region of the operon ampC. This terminator serves at the same time as an attenuator of the operon ampC, which considerably decreases the synthesis of β-lactamase in E. coli cells [67]. Up to 95% of the transcripts initiated at the ampC promoter in vitro are terminated at 41 bp [66].

β-Lactamase is a constitutive enzyme in E. coli cells; however, its synthesis depends on the cell growth rate [68]. Two mechanisms are thought to be involved. On the one hand, ampC may be partly cotranscribed with the operon frd, whose transcription is regulated depending on growth conditions (aerobic or anaerobic) and on the presence of alternative electron acceptors such as nitrate [69, 70]. On the other hand, according to the model in [71], the efficiency of attenuation at the frd terminator may decrease considerably under conditions of active protein biosynthesis, the rate of which depends directly on the cell growth rate. The model suggests that ribosomes are able to use the first three nucleotides of ampC mRNA (ATC) as a binding site and to initiate translation at the first 5'-terminal AUG codon, disrupting the secondary structure of the attenuator. Translation initiated in this way stops immediately, because the AUG is directly followed by TAA, but this time is sufficient for RNA polymerase to pass through the attenuator site.

In many Gram-negative bacteria, the synthesis of β-lactamase is induced by substrates containing the β-lactam ring and, as a rule, does not depend on the growth rate [72]. Regulation of ampC expression usually involves a special activator protein, the product of the ampR gene, and at least two repressor proteins, encoded by the genes ampD and ampE, which interact with the activator AmpR [73]. This regulatory mechanism was shown, for example, for Citrobacter freundii [74] and for the eubacterium Enterobacter cloacae, which is similar to E. coli [75]. In these two bacteria, the frd terminator does not act at the same time as an amp attenuator. In contrast, no considerable trans-regulation of ampC has been shown in E. coli, which suggests the absence of the ampR gene [67]. E. coli has the genes ampD and ampE, but they are not expressed in the absence of ampR. In microorganisms rather distant from E. coli, for example Proteus vulgaris, the operons frd and amp are not adjacent but located in distinct parts of the genome [76]. It may be suggested that β-lactamase originally functioned as an enzyme involved in cell wall metabolism [77, 78]. Its function then changed to passive protection against numerous microorganisms (mainly lower fungi producing penicillin and its derivatives). Later, when E. coli changed its ecological niche and became an inhabitant of the digestive tract of higher organisms, the pressure of natural selection favoring this function decreased, so that a mutant lacking the ampR gene was able to survive. The resulting sharp increase in β-lactamase synthesis would have had negative consequences for the cell, but the terminator of the operon frd began to function as an attenuator of the operon ampC, providing at least a 20-fold decrease in constitutive β-lactamase synthesis [34].

To summarize, the operons frd and ampC present an example of tandem operon overlapping, in which the proximal operon affects transcription of the distal one. With this type of overlapping the operons largely preserve their independence: they have their own promoter and terminator sites, and alterations of frd outside the overlapping region affect ampC only weakly.

Another example of this type of overlapping is presented by the operons L11 and S6 of T. thermophilus. These operons are composed of genes encoding ribosomal proteins. The promoter of the distal operon S6 is located upstream of the terminator of the proximal operon L11. In this case the terminator probably serves at the same time as an attenuator of the downstream operon [79].

An example of stronger interaction at the level of transcription regulation is also provided by overlapping operons encoding ribosomal proteins. Three such operons of E. coli, str, S10, and spc, form a cluster located at minute 73 of the genetic map. In the E. coli genome these operons are separated by spacers (about 15 kbp between str and S10 and 163 bp between S10 and spc). The operons have their own promoters and terminators; however, readthrough transcription is rather frequent, up to 25% for the str-S10 pair and up to 30% for the S10-spc pair [80]. Since expression of the genes encoding ribosomal proteins is regulated at the level of translation [81], excess mRNA synthesis does not affect the amount of protein synthesized.

Therefore, the operons str, S10, and spc of E. coli may be considered isolated and independent structures. However, the situation is quite different in the genomes of certain other microorganisms. In T. thermophilus, for example, the operons str, S10, and spc are not only overlapping but almost fused into a single unit of transcription regulation. T. thermophilus lacks not only the terminator of the operon S10 and the promoter of the operon spc, but also the interoperon spacer: the distal gene of the operon S10, rps17, overlaps with the proximal gene of the downstream operon, rpl14 [19]. The operons str and S10 are separated by an 8-bp spacer and also lack promoter and terminator regions [82].

This example is not unique. Each of the operons crtEF, bchCA, and puf, which contain genes encoding proteins involved in photosynthesis in the facultatively phototrophic bacterium Rhodobacter capsulatus, has a promoter and a terminator site, and the promoters of the distal operons overlap with the terminators of the proximal ones [83]. The main part of the mRNA is synthesized from the promoter of the operon crtEF; these transcripts extend into the operons bchCA and puf. As shown by analysis of mutants with long noncoding inserts between these operons, this readthrough transcription, which also provides common transcriptional regulation of these operons, is essential for normal photosynthesis [84]. One may speak of fusion of the operons crtEF, bchCA, and puf into one operon in Rh. capsulatus.

Comparative analysis of operon overlapping in E. coli, T. thermophilus, and Rh. capsulatus allows one to follow the possible steps of operon evolution. Closer spatial positioning and/or overlapping of operons was perhaps an essential step of genome evolution. Originally, genes were isolated, transcriptionally regulated units. Under the influence of the mutation process, they migrated within the genome, moving closer together or farther apart. Two ways of operon origin and evolution may be considered: either abrupt fusion of genes or operons after deletion or migration, or their gradual approach followed by overlapping of the regulatory regions, joint regulation, and complete fusion. Most probably, both processes took place in evolution. In any case, the pressure of natural selection at all stages favored interaction and fusion of genes encoding functionally related proteins, so that they became jointly regulated within one operon.

As mentioned above, besides tandem operon overlapping there is also antiparallel, or face-to-face, overlapping. This type of operon overlapping was first shown for the ilvY and ilvC operons of E. coli. These operons contain one gene each, ilvY and ilvC (Fig. 3). The gene ilvC encodes acetohydroxy acid isomeroreductase, an enzyme involved in the biosynthesis of valine and isoleucine. The product of the ilvY gene is an activator protein inducing expression of the ilvC gene in the presence of acetolactate or acetohydroxybutyrate [85]. The operons ilvY and ilvC are oriented antiparallel, with a distance of 45 bp between the transcription start points, i.e., the promoter regions overlap. This arrangement of the promoters allows transcription of both ilvC and ilvY to be regulated by one and the same activator protein, IlvY [86]. The overlapping region contains two antiparallel operators, O1 and O2, which are imperfect inverted repeats. The centers of symmetry of O1 and O2 are located, respectively, at +17 relative to the ilvY start point and at -35 relative to the ilvC start point. The activator protein IlvY binds cooperatively to the DNA of both operator regions and represses transcription of its own operon. This effect is independent of the inducer; however, in the presence of inducer (acetolactate or acetohydroxybutyrate) the DNA-protein complex gains high affinity for RNA polymerase, favoring its binding to the promoter and stimulating transcription of ilvC. At the same time, association with the inducer has no effect on the ability of IlvY to bind the specific (operator) DNA sites [87].

Fig. 3. Organization of the overlapping operons ilvGMEDA, ilvY, and ilvC in E. coli. P, promoter; att, attenuator; t and t', ρ-independent and ρ-dependent terminator, respectively. Arrows show the direction of transcription.

Not only promoters but also various other parts of different operons may overlap. For example, the promoter of the E. coli operon ilvY overlaps in antiparallel orientation with the promoter of the operon ilvC, while its terminator overlaps in antiparallel orientation with the terminator of the operon ilvGMEDA. The latter operon contains genes encoding four enzymes involved in one of the pathways of valine and isoleucine biosynthesis; it is regulated at the transcription level by a rather complex mechanism including attenuation [88] and regulation by the integration host factor (IHF) and by the leucine-responsive regulatory protein (Lrp) [89, 90]. The stop codons of the gene ilvY and the gene ilvA (the last gene of the operon ilvGMEDA) are separated by a 52-bp spacer. Their terminator regions, containing ρ-independent and ρ-dependent terminators, are located within the coding sequences. The biological significance of this phenomenon remains unknown; this positioning of the terminator regions is supposed to affect the efficiency of their functioning [91].

REFERENCES

1. Barrell, B.G., Air, G.M., and Hutchison, C.A., Nature, 1976, vol. 264, pp. 34-40.

2. Godson, G.N., Barrell, B.G., Staden, R., and Fiddes, J.C., Nature, 1978, vol. 276, pp. 236-247.

3. Oppenheim, D.S. and Yanofsky, C., Genetics, 1980, vol. 95, pp. 785-795.

4. Barnes, W.M. and Tuley, E., J. Mol. Biol., 1983, vol. 165, pp. 443-459.

5. McKenney, K., Shimatake, H., Court, D., Schmeissner, U., Brady, C., and Rosenberg, M., Gene Amplif. Anal., 1981, vol. 2, pp. 383-415.

6. Cole, S.T., Eur. J. Biochem., 1982, vol. 122, pp. 479-484.

7. Linney, E. and Hayashi, M., Nature New Biol., 1973, vol. 245, pp. 6-8.

8. Dunn, J.J. and Studier, F.W., J. Mol. Biol., 1981, vol. 148, pp. 303-330.

9. Shaw, J.E. and Murialdo, H., Nature, 1980, vol. 283, pp. 30-35.

10. Gesteland, R.F. and Atkins, J.F., Annu. Rev. Biochem., 1996, vol. 65, pp. 741-768.

11. O'Connor, M., J. Mol. Biol., 1998, vol. 279, pp. 727-736.

12. Rettberg, C.C., Prere, M.F., Gesteland, R.F., Atkins, J.F., and Fayet, O., J. Mol. Biol., 1999, vol. 286, pp. 1365-1378.

13. Das, A. and Yanofsky, C., Nucleic Acids Res., 1989, vol. 17, pp. 9333-9340.

14. Liljestrom, P., Laamanen, I., and Palva, E.T., J. Mol. Biol., 1988, vol. 201, pp. 663-673.

15. Mashko, S.V., Lapidus, A.L., Trukhan, M.E., Lebedeva, M.I., Podkovyrov, S.M., Kashlev, M.V., Mochul'skii, A.V., Eremashvili, M.R., Isotova, L.S., Strongin, A.Ya., Skvortsova, M.A., Sterkin, V.E., Lebedev, A.N., Rebentish, B.A., Kozlov, Yu.I., Monastyrskaya, G.S., Tsarev, S.A., Sverdlov, E.D., and Debabov, V.G., Mol. Biol., 1987, vol. 21, pp. 1297-1309.

16. Bell, A.W., Buckel, S.D., Groarke, J.M., Hope, J.N., Kingsley, D.H., and Hermodson, M.A., J. Biol. Chem., 1986, vol. 261, pp. 7652-7658.

17. Selivanov, N.A., Prilipov, A.G., Efimov, V.P., Marusich, E.I., and Mesyanzhinov, V.V., Biomed. Sci., 1990, vol. 1, pp. 55-62.

18. Navarro, C., Wu, L.F., and Mandrand-Berthelot, M.A., Mol. Microbiol., 1993, vol. 9, pp. 1181-1191.

19. Vysotskaya, V.S., Shcherbakov, D.V., and Garber, M.B., Gene, 1997, vol. 193, pp. 23-30.

20. Dandekar, T., Snel, B., Huynen, M., and Bork, P., Trends Biochem. Sci., 1998, vol. 23, pp. 324-328.

21. Turner, R.J., Lu, Y., and Switzer, R.L., J. Bacteriol., 1994, vol. 176, pp. 3708-3722.

22. Sipley, J., Stassi, D., Dunn, J., and Goldman, E., Gene Expr., 1991, vol. 1, pp. 127-136.

23. Condron, B.G., Atkins, J.F., and Gesteland, R.F., J. Bacteriol., 1991, vol. 173, pp. 6998-7003.

24. Levin, M.E., Hendrix, R.W., and Casjens, S.R., J. Mol. Biol., 1993, vol. 234, pp. 124-139.

25. Katsura, I. and Kuhl, P.W., J. Mol. Biol., 1975, vol. 91, pp. 257-273.

26. Model, P., Webster, R.E., and Zinder, N.D., Cell, 1979, vol. 18, pp. 235-246.

27. Fiers, W., Contreras, R., Duerinck, F., Haegeman, G., Iserentant, D., Merregaert, J., Min Jou, W., Molemans, F., Raeymaekers, A., van den Berghe, A., Volckaert, G., and Ysebaert, M., Nature, 1976, vol. 260, pp. 500-507.

28. Atkins, J.F., Steitz, J.A., Anderson, C.W., and Model, P., Cell, 1979, vol. 18, pp. 247-256.

29. Beremand, M.N. and Blumenthal, T., Cell, 1979, vol. 18, pp. 257-266.

30. Kastelein, R.A., Remaut, E., Fiers, W., and van Duin, J., Nature, 1982, vol. 295, pp. 35-41.

31. De Mars Cody, J. and Conway, T.W., J. Virol., 1981, vol. 37, pp. 813-820.

32. Davies, R.W., Nucleic Acids Res., 1980, vol. 8, pp. 1765-1782.

33. Ptashne, M., Perekluchenie genov (Gene Switching), Russian translation, Moscow: Mir, 1988.

34. Normark, S., Bergstrom, S., Edlund, T., Grundstrom, T., Jaurin, B., Lindberg, F.P., and Olsson, O., Annu. Rev. Genet., 1983, vol. 17, pp. 499-525.

35. Zajanckauskaite, A., Malys, N., and Nivinskas, R., Gene, 1997, vol. 194, pp. 157-162.

36. Garrett, J., Fusselman, R., Hise, J., Chiou, L., Smith-Grillo, D., Schulz, J., and Young, R., Mol. Gen. Genet., 1981, vol. 182, pp. 326-331.

37. Hanych, B., Kedzierska, S., Walderich, B., Uznanski, B., and Taylor, A., Gene, 1993, vol. 129, pp. 1-8.

38. Kedzierska, S., Wawrzynow, A., and Taylor, A., Gene, 1996, vol. 168, pp. 1-8.

39. Blasi, U., Nam, K., Lubitz, W., and Young, R., J. Bacteriol., 1990, vol. 172, pp. 5617-5623.

40. Linney, E. and Hayashi, M., Nature New Biol., 1973, vol. 245, pp. 6-8.

41. Johnson, R.C., Yin, J.C., and Reznikoff, W.S., Cell, 1982, vol. 30, pp. 873-882.

42. De la Cruz, N.B., Weinreich, M.D., Wiegand, T.W., Krebs, M.P., and Reznikoff, W.S., J. Bacteriol., 1993, vol. 175, pp. 6932-6938.

43. Rosenberg, A.H., Patel, S.S., Johnson, K.A., and Studier, F.W., J. Biol. Chem., 1992, vol. 267, pp. 15005-15012.

44. Nishiyama, M., Kukimoto, M., Beppu, T., and Horinouchi, S., Microbiology, 1995, vol. 141, pp. 1211-1219.

45. Yen, T.S. and Webster, R.E., J. Biol. Chem., 1981, vol. 256, pp. 11259-11265.

46. Fulford, W. and Model, P., J. Mol. Biol., 1984, vol. 178, pp. 137-153.

47. Fulford, W. and Model, P., J. Mol. Biol., 1988, vol. 203, pp. 49-62.

48. Kokoska, R.J. and Steege, D.A., J. Bacteriol., 1998, vol. 180, pp. 3245-3249.

49. Jalajakumari, M.B., Thomas, C.J., Halter, R., and Manning, P.A., Mol. Microbiol., 1989, vol. 3, pp. 1685-1695.

50. Tsuchihashi, Z. and Kornberg, A., Proc. Natl. Acad. Sci. USA, 1990, vol. 87, pp. 2516-2520.

51. Kim, S. and Marians, K.J., Nucleic Acids Res., 1995, vol. 23, pp. 1374-1379.

52. Larsen, B., Wills, N.M., Gesteland, R.F., and Atkins, J.F., J. Bacteriol., 1994, vol. 176, pp. 6842-6851.

53. Yurieva, O., Skangalis, M., Kuriyan, J., and O'Donnell, M., J. Biol. Chem., 1997, vol. 272, pp. 27131-27139.

54. Johnsrud, L., Mol. Gen. Genet., 1979, vol. 169, pp. 213-218.

55. Ghosal, D., Sommer, H., and Saedler, H., Nucleic Acids Res., 1979, vol. 6, pp. 1111-1122.

56. Klaer, R., Kuhn, S., Tillmann, E., Fritz, H.J., and Starlinger, P., Mol. Gen. Genet., 1981, vol. 181, pp. 169-175.

57. Kroger, M. and Hobom, G., Nature, 1982, vol. 297, pp. 159-162.

58. Kumamoto, C.A. and Nault, A.K., Gene, 1989, vol. 75, pp. 167-175.

59. Merino, E., Balbas, P., Puente, J.L., and Bolivar, F., Nucleic Acids Res., 1994, vol. 22, pp. 1903-1908.

60. Alff-Steinberger, C., Nucleic Acids Res., 1984, vol. 12, pp. 2235-2241.

61. Rak, B., Lusky, M., and Hable, M., Nature, 1982, vol. 297, pp. 124-128.

62. Horowitz, H. and Platt, T., Nucleic Acids Res., 1982, vol. 10, pp. 5447-5465.

63. Wellington, C.L., Taggart, A.K., and Beatty, J.T., J. Bacteriol., 1991, vol. 173, pp. 2954-2961.

64. Silke, J., Gene, 1997, vol. 194, pp. 143-155.

65. Cebrat, S., Mackiewicz, P., and Dudek, M.R., Biosystems, 1998, vol. 45, pp. 165-176.

66. Jaurin, B., Grundstrom, T., Edlund, T., and Normark, S., Nature, 1981, vol. 290, pp. 221-225.

67. Normark, S. and Burman, L.G., J. Bacteriol., 1977, vol. 132, pp. 1-7.

68. Jaurin, B. and Normark, S., J. Bacteriol., 1979, vol. 138, pp. 896-902.

69. Jones, H.M. and Gunsalus, R.P., J. Bacteriol., 1987, vol. 169, pp. 3340-3349.

70. Engel, P., Trageser, M., and Unden, G., Arch. Microbiol., 1991, vol. 156, pp. 463-470.

71. Grundstrom, T. and Normark, S., Mol. Gen. Genet., 1985, vol. 198, pp. 411-415.

72. Bergstrom, S., Lindberg, F.P., Olsson, O., and Normark, S., J. Bacteriol., 1983, vol. 155, pp. 1297-1305.

73. Lindquist, S., Lindberg, F., and Normark, S., J. Bacteriol., 1989, vol. 171, pp. 3746-3753.

74. Lindberg, F., Westman, L., and Normark, S., Proc. Natl. Acad. Sci. USA, 1985, vol. 82, pp. 4620-4624.

75. Honore, N., Nicolas, M.H., and Cole, S.T., EMBO J., 1986, vol. 5, pp. 3709-3714.

76. Cole, S.T., Eur. J. Biochem., 1987, vol. 167, pp. 481-488.

77. Tuomanen, E., Lindquist, S., Sande, S., Galleni, M., Light, K., Gage, D., and Normark, S., Science, 1991, vol. 251, pp. 201-204.

78. Bishop, R.E. and Weiner, J.H., FEMS Microbiol. Lett., 1993, vol. 114, pp. 349-354.

79. Sherbakov, D.V., Cand. Sci. (Biol.) Dissertation, Moscow: Moscow State Univ., 1999.

80. Lindahl, L., Sor, F., Archer, R.H., Nomura, M., and Zengel, J.M., Biochim. Biophys. Acta, 1990, vol. 1050, pp. 337-342.

81. Nomura, M., Yates, J.L., Dean, D., and Post, L.E., Proc. Natl. Acad. Sci. USA, 1980, vol. 77, pp. 7084-7088.

82. Pfeiffer, T., Jorcke, D., Feltens, R., and Hartmann, R.K., Gene, 1995, vol. 167, pp. 141-145.

83. Wellington, C.L. and Beatty, J.T., J. Bacteriol., 1991, vol. 173, pp. 1432-1433.

84. Wellington, C.L., Taggart, A.K., and Beatty, J.T., J. Bacteriol., 1991, vol. 173, pp. 2954-2961.

85. Umbarger, H.E., Annu. Rev. Biochem., 1978, vol. 47, pp. 532-606.

86. Wek, R.C. and Hatfield, G.W., J. Biol. Chem., 1986, vol. 261, pp. 2441-2450.

87. Rhee, K.Y., Senear, D.F., and Hatfield, G.W., J. Biol. Chem., 1998, vol. 273, pp. 11257-11266.

88. Chen, J.W., Bennett, D.C., and Umbarger, H.E., J. Bacteriol., 1991, vol. 173, pp. 2328-2340.

89. Pagel, J.M. and Hatfield, G.W., J. Biol. Chem., 1991, vol. 266, pp. 1985-1996.

90. Rhee, K.Y., Parekh, B.S., and Hatfield, G.W., J. Biol. Chem., 1996, vol. 271, pp. 26499-26507.

91. Sameshima, J.H., Wek, R.C., and Hatfield, G.W., J. Biol. Chem., 1989, vol. 264, pp. 1224-1231.
