Transcript
Page 1: Many nonuniversal archaeal ribosomal proteins are found in ...downloads.hindawi.com/journals/archaea/2009/971494.pdf · served L13 cluster, whereas the genes for S4e, L32e and L19e

Summary The genomic associations of the archaeal ribosomal proteins (r-proteins) were examined in detail. The archaeal versions of the universal r-protein genes are typically in clusters similar or identical to those found in bacteria. Of the 35 nonuniversal archaeal r-protein genes examined, the gene encoding L18e was found to be associated with the conserved L13 cluster, whereas the genes for S4e, L32e and L19e were found in the archaeal version of the spc operon. Eleven nonuniversal protein genes were not associated with any common genomic context. Of the remaining 19 protein genes, 17 were convincingly assigned to one of 10 previously unrecognized gene clusters. Examination of the gene content of these clusters revealed multiple associations with genes involved in the initiation of protein synthesis, transcription or other cellular processes. The lack of such associations in the universal clusters suggests that initially the ribosome evolved largely independently of other processes. More recently it likely has evolved in concert with other cellular systems. It was also verified that a second copy of the gene encoding L7ae found in some bacteria is actually a homolog of the gene encoding L30e and should be annotated as such.

Keywords: operons, ribosome evolution, transcription.

Introduction

The archaeal translation machinery has a long evolutionary history and it is of interest to know how it has evolved and how its interactions with other cellular processes have changed over time (Fox and Naik 2004). In general, genes that are translated together share similar origins, physical interactions, functions or regulatory mechanisms (Dandekar et al. 1998). In the work reported here, we identified previously unrecognized genomic associations of ribosomal proteins (r-proteins) in archaea and discuss the implications of these associations for the early history of the translation machinery.

Among the three domains of life, there are approximately 102 recognized r-protein families. Of these, 17 large subunit and 19 small subunit r-proteins are universal. These 36 r-proteins likely appeared before the last common ancestor (LUCA) and thus are considered to be ancient in origin (Kyrpides et al. 1999, Lecompte et al. 2002). The Archaea and the Eucaryota share 11 large subunit (LSU) r-proteins and 20 small subunit (SSU) r-proteins but share no r-protein with the Bacteria other than the universal proteins (Lecompte et al. 2002). The large number of r-protein families shared by the Archaea and the Eucaryota suggests that the eukaryotic translation system originated from an archaeal version (Hartman et al. 2006).

In early studies on the regulation of the synthesis of ribosomal components in E. coli, it was found that many r-proteins are encoded together: 32 r-proteins from both subunits and two translation-related proteins are grouped into seven well-studied clusters. These are the alpha, L10 (rif), L11, S10, S20, str and spc operons (Nomura et al. 1984). In each case, gene expression was found to be regulated by one of the r-protein components within the cluster, although the detailed mechanisms differ (Nomura et al. 1984, Zengel and Lindahl 1994) between clusters and even between species in the same cluster (Allen et al. 1999). As more complete genomes of other bacterial species became available, it was found that these seven r-protein clusters were widely conserved among bacterial species (Mushegian and Koonin 1996, Siefert et al. 1997). Among these, the S10 and the spc operons are the largest, each encoding about twelve r-proteins.

Multiple complete genomes from both archaeal and bacterial species are now available (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi). It has been confirmed that the S10, str, spc, and S13 clusters are also found in the Archaea (Tatusov et al. 2001). These clusters have been characterized experimentally in several archaeal species including Haloarcula marismortui, Sulfolobus acidocaldarius, and Desulfurococcus mobilis and their analogy to the bacterial operons confirmed (Scholzen and Arndt 1992, Ramirez et al. 1994, Ceccarelli et al. 1995, Yang et al. 1999). The universality of these gene clusters suggests that the proteins for which they code not only have a long history of association, but may represent extremely early regulatory relationships (Kyrpides et al. 1999, Lecompte et al. 2002, Mushegian 2005). In particular, it has been suggested that the progenotic entities of this era used groups of gene clusters (i.e., mini-chromosomes) to carry genetic information instead of what would be considered a complete genome (Olsen and Woese 1997, Siefert et al. 1997). If so, these extant extremely conserved r-protein gene clusters

Archaea 2, 241–251 © 2009 Heron Publishing, Victoria, Canada

Many nonuniversal archaeal ribosomal proteins are found in conserved gene clusters

JIACHEN WANG,1 INDRANI DASGUPTA1 and GEORGE E. FOX1,2

1 Department of Biochemistry and Cell Biology, Rice University, Houston, TX 77251, USA
2 Corresponding Author ([email protected])

Received December 28, 2008; accepted March 31, 2009; published online April 29, 2009


with their RNA level regulation may be a remnant of this earlier time.

Although these clusters of universal r-proteins are of great interest, there are other less conserved r-proteins that may also be informative. In particular, the Archaea and Eucaryota share 11 large subunit r-proteins and 20 small subunit r-proteins which are not found in bacteria (Lecompte et al. 2002). In the work presented here, we investigated the genomic associations of each of these proteins and identified 10 previously unrecognized, but significantly conserved gene clusters.

Materials and methods

Genomes included in the study

Representative archaeal species whose complete genomes were available from the National Center for Biotechnology Information (NCBI) GenBank database (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi) were considered for analysis. When multiple species of a single genus had been sequenced, only one was selected for inclusion here. The 27 genomes chosen included seven from the Crenarchaeota (Aeropyrum pernix, Hyperthermus butylicus, Pyrobaculum aerophilum, Pyrobacterium islandicum, Staphylothermus marinus, Sulfolobus solfataricus, and Thermofilum pendens) and 20 from the Euryarchaeota. The latter were Archaeoglobus fulgidus, Haloarcula marismortui, Halobacterium sp. NRC-1, Haloquadratum walsbyi, Methanocaldococcus jannaschii, Methanococcoides burtonii, Methanococcus maripaludis, Methanoculleus marisnigri, Methanopyrus kandleri, Methanosaeta thermophila, Methanosarcina acetivorans, Methanospirillum hungatei, Methanosphaera stadtmanae, Methanothermobacter thermautotrophicus, Natronomonas pharaonis, Picrophilus torridus, Pyrococcus abyssi, Thermococcus kodakarensis and Thermoplasma volcanium.

The small genome of the parasitic Nanoarchaeum equitans, which is known to have been subject to extreme rearrangement (Waters et al. 2003, Makarova and Koonin 2005), likely related to its life style, was excluded from the analysis. Many of the r-proteins are absent, and even highly conserved gene clusters, such as the S10 and spc operons, are disrupted such that their components are dispersed throughout the genome.

Determination of gene neighborhood for the ribosomal proteins

All 27 genomes were examined to identify the genomic neighborhood of each r-protein gene. In addition, sequence alignments were developed for each r-protein using the ClustalW algorithm (Thompson et al. 1994) as implemented in the multiple sequence alignment editor BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Proteins were initially assigned according to their Cluster of Orthologous Groups (COG) database number (Tatusov et al. 2003). This facilitated proper annotation of the proteins.
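As a minimal sketch of what examining a gene neighborhood can look like in practice, the snippet below pulls a fixed number of upstream and downstream neighbors from an ordered gene list. The function name `gene_neighborhood`, the `flank` parameter and the flat-list genome representation are assumptions made for illustration and are not taken from the study; the sketch ignores strand and genome circularity. The example gene order is the alpha operon core followed by the L18e grouping as described later in the text.

```python
from typing import List, Tuple


def gene_neighborhood(genes: List[str], target: str, flank: int = 3) -> Tuple[List[str], List[str]]:
    """Return (upstream, downstream) neighbors of `target` in an ordered gene list.

    Hypothetical helper: strand and circular chromosomes are not handled.
    """
    i = genes.index(target)  # raises ValueError if the gene is absent
    upstream = genes[max(0, i - flank):i]
    downstream = genes[i + 1:i + 1 + flank]
    return upstream, downstream


# Gene order as described in the text for species where the alpha operon
# precedes the L18e cluster (rps13-rps4-rps11-rpoD, rpl18e-rpl13-rps9-rpoN).
example_order = ["rps13", "rps4", "rps11", "rpoD", "rpl18e", "rpl13", "rps9", "rpoN"]
up, down = gene_neighborhood(example_order, "rpl18e", flank=3)
print(up)    # genes immediately upstream of rpl18e
print(down)  # genes immediately downstream of rpl18e
```

Comparing such neighbor lists across genomes is one simple way to spot recurring gene arrangements before applying any conservation criterion.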

Identification of conserved r-protein gene clusters

Extended lists of genes upstream and downstream of each r-protein gene in each genome under consideration were prepared. Both decoding direction and spacer sequences were used to establish likely start and end positions of possible gene clusters. The positions of each r-protein gene in the 27 genomes were compared and clusters with similar gene organization that occur in multiple organisms were identified. In this manner, complete genetic maps of both r-protein genes and their downstream and upstream neighbors were generated. A cluster was considered to be significant if it occurred in at least half of the species considered, including at least three representatives of both the Crenarchaeota and Euryarchaeota.
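The significance criterion stated above can be written as a short predicate. This is a sketch under stated assumptions: the function name `is_significant` and its parameters are hypothetical, and the total of 26 species used in the examples is taken from the maxima given in the Table 1 caption (7 crenarchaeota plus 19 euryarchaeota) rather than from the 27 genomes mentioned in the text.

```python
def is_significant(n_cren: int, n_eury: int, total_species: int, min_per_phylum: int = 3) -> bool:
    """Apply the stated rule: present in at least half of the species,
    with at least `min_per_phylum` representatives from each phylum."""
    in_at_least_half = 2 * (n_cren + n_eury) >= total_species
    both_phyla = n_cren >= min_per_phylum and n_eury >= min_per_phylum
    return in_at_least_half and both_phyla


TOTAL = 26  # assumption: 7 crenarchaeota + 19 euryarchaeota, per the Table 1 caption

# Counts below are the (crenarchaeota, euryarchaeota) values listed in Table 1.
print(is_significant(7, 19, TOTAL))  # L18e cluster
print(is_significant(3, 6, TOTAL))   # L14e, L34e (listed as uncertain)
print(is_significant(4, 8, TOTAL))   # S6e (listed as uncertain)
```

Under these assumptions the L18e cluster passes while the two clusters the table marks as uncertain do not, which is consistent with how the table labels them.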

Identification of possible lateral transfer events

The intention here was to examine the extent of similarity in gene order in distantly related taxa. However, similar patterns of gene organization may be found in distantly related organisms as a result of horizontal gene transfer. To ascertain the extent to which lateral transfer events involving the r-proteins have occurred, the phylogenetic tree of each protein was compared with a standard tree (Daubin et al. 2003). If an organism was clearly misplaced in the protein tree, for example, a halophile among the Crenarchaeota, this was considered to be suggestive of a lateral transfer event.

The r-proteins are typically small and as a result individual protein trees differ considerably from one another in many details, but typically they are consistent in major long-range groupings, e.g., crenarchaeota and euryarchaeota are separated, whereas there are strong local relationships, e.g., all halophiles will be clustered. To obtain a tree of relationships, 16S rRNA sequences of the 27 archaeal species were retrieved from GenBank and aligned manually in accordance with known secondary structural features using the BioEdit Sequence Alignment Editor (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Phylogenetic and molecular evolutionary analyses were conducted using the Neighbor Joining algorithm in MEGA version 3.1 (Kumar et al. 2004). Individual trees were generated from each protein alignment in a similar manner and visually compared with the 16S rRNA tree to identify possible lateral transfer events involving the r-proteins.
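The tree comparison described above was done visually. As a rough illustration of the underlying idea only, the sketch below checks whether a group of organisms forms its own clade in a rooted tree written as nested tuples. The tree encoding, the function names and the leaf labels (`cren_A`, `halo_B`, and so on) are all hypothetical placeholders, not data from the study, and a rooted representation is assumed for simplicity.

```python
def leaves(tree):
    """Return the set of leaf labels under a nested-tuple tree node."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some clade in `tree` contains exactly the labels in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Toy reference tree: crenarchaeota and euryarchaeota are cleanly separated.
reference_tree = ((("halo_A", "halo_B"), "meth_A"), ("cren_A", "cren_B"))
# Toy protein tree: one halophile is placed among the crenarchaeota.
protein_tree = (("halo_A", "meth_A"), (("cren_A", "halo_B"), "cren_B"))

crenarchaeota = {"cren_A", "cren_B"}
print(is_monophyletic(reference_tree, crenarchaeota))
print(is_monophyletic(protein_tree, crenarchaeota))
```

A group that is a clade in the reference tree but not in a protein tree would only flag a candidate for closer inspection; given how much small-protein trees vary, it would not by itself establish a transfer event.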

Results

Nomenclatural clarifications

In assembling the data sets used here, the earlier comparative studies (Lecompte et al. 2002) were, in effect, repeated. Although the majority of the earlier conclusions were confirmed, two important discrepancies were encountered. The more straightforward case involved bacterial L16 and archaeal L10e, which were originally regarded as distinct (Lecompte et al. 2002). In the COG database (Tatusov et al. 2001), L10e is

242 WANG, DASGUPTA AND FOX

ARCHAEA VOLUME 2, 2009


placed in the same family as bacterial L16. With crystal structures available from both archaeal and bacterial ribosomes, it is apparent that these proteins share almost identical tertiary structures (Klein et al. 2004, Schuwirth et al. 2005). Our alignment (not shown) of sequences of L10e from 17 archaeal species and L16 from various bacterial species confirmed that L10e is orthologous with bacterial L16. Thus, the COG database placement of L10e in the same family as bacterial protein L16 is correct. In bacteria, L16 is usually encoded within the S10 operon, which typically includes 11 other universal r-protein genes. In most archaeal genomes, however, although the S10 operon is largely intact, the gene encoding L10e/L16 is typically found elsewhere.

The second discrepancy involved L7ae, which occurs in all archaeal species and some bacteria (Lecompte et al. 2002). The gene encoding L7ae is found in most of the Firmicutes, as well as some species that belong to the Actinobacteria and Thermotogae families. Moreover, in complete genomes of some bacteria, two genes that share sequence similarity are both annotated as rpl7ae. When this occurs, the second copy is found upstream of the genes for S12 and S7, which mark the beginning of the str operon. As we will show, this second copy of rpl7ae has more similarity with rpl30e and should be annotated as such. Thus, rpl30e, although not universal, also occurs in all three domains of life.

Gene clusters

The extent to which each archaeal r-protein gene was found in a characteristic gene cluster is summarized in Table 1. Twelve proteins, L10e (=L1), L13e (COG4352), L35ae (COG2451), L38e, L40e (COG1552), L41e, S3ae (COG1890), S8e (COG2007), S17e (COG1383), S25e (COG4901), S26e (COG4830) and S30e (COG4919), were not routinely associated with the same genes and thus were not assigned to a cluster. For the clusters that were observed, Table 1 indicates how many members of the Crenarchaeota (Max = 7) and the Euryarchaeota (Max = 19) contain the indicated cluster. As observed previously (Nomura et al. 1984), genes for four of the non-universal archaeal r-proteins were located in one of the classic r-protein clusters that encode universal r-proteins in bacteria. In particular, rpl18e was found to be associated with the conserved L13 cluster and rps4e, rpl32e and rpl19e were found in the archaeal version of the spc operon. Of the genes encoding the remaining 19 archaeal-specific proteins, 17 were convincingly assigned to one of 10 previously unrecognized clusters that are conserved over varying phylogenetic distances. The remaining two genes, rps6e and rpl15e, showed some conservation in genomic neighborhood, but it was uncertain if they should be regarded as conserved clusters.

For each of these 12 clusters (10 plus 2), the distribution of each variant was mapped onto a representative phylogenetic tree derived from 16S rRNA. Because of the small size of many of the r-proteins, considerable variation in the trees obtained from individual proteins was expected. Nevertheless, the major groupings typically remain together, thereby allowing the identification of likely horizontal gene transfer events between organisms in different major clusters. Overall, the number of such likely horizontal gene transfer events seen was modest and did not significantly affect the results. The most obvious examples of likely horizontal gene transfer involved rpl18e in Methanobacter thermoautotrophicum, rpl31e in Archaeoglobus fulgidus and rpl37e in Methanosphaera stadtmanae. Finally, questions about the nomenclature of rpl7ae and rpl30e were clarified.

ARCHAEA ONLINE at http://archaea.ws

ARCHAEAL RIBOSOMAL PROTEIN GENE CLUSTERS 243

Table 1. List of common clusters containing genes for archaea-unique ribosomal proteins. The nonuniversal archaeal ribosome proteins (in bold) are shown in their most common genomic contexts. The numbers of crenarchaeota (Max = 7) and euryarchaeota (Max = 19) containing the particular arrangement are shown along with the total % of organisms containing that version. If the cluster did not completely meet our criteria (L14e, L7ae, and S6e), it is listed as uncertain. In two cases (L14e and L34e; S24e and S27ae), alternative but similar arrangements are shown. Gene/Protein names inside parentheses are not always present in the cluster and hence do not form the core of the cluster.

Gene cluster       Number of       Number of       % that   Typical cluster arrangement
                   crenarchaeota   euryarchaeota   agree

L14e, L34e         3               6               34.6     Uncertain: L34e-cytidylate kinase-L14e
L14e               4               0               15.3     Alternative: L14e-truB in some Crenarchaeota
L15e               5               10              57.6     L15e-COG1325-COG1603
L18e               7               19              100      L18e-L13-S9-rpoN
L19e, L32e, S4e    6               18              92.3     L14-L24-S4e-L5-S14-S8-L6-L32e-L19e-L18-S5-L30-L15-secY
L21e               6               17              88.4     (COG1258)-L21e-rpoF-COG1491
L24e, L7ae, S28e   3               18              80.7     (L7ae)-S28e-L24e-(ndK)-(eIF2)
L30e               7               13              76.9     rpoH-rpoB2-rpoB1-rpoA1-rpoA2-L30e-NusA
L31e, L39e, LXa    4               11              57.6     L39e-L31e-LXa-eIF6
L37ae              5               8               50       L37ae-(rpoP)-COG2136
L37e               4               15              73       snRNP-L37e
L44e, S27e         6               15              80.7     L44e-S27e-eIF2_α
S6e                4               8               46.1     Uncertain: S6e-eIF2_γ
S19e               7               16              88.4     S19e-COG2118, usually before L39e-L31e-LXa-eIF6
S24e, S27ae        2               18              76.9     rpoE1-rpoE2-HP-S24e-S27ae-(COG5330)
S24e, S27ae        3               1               15.3     Alternative: S24e-S27ae-COG5330
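The "% that agree" column can be recomputed from the two count columns. The sketch below assumes a denominator of 26 organisms (the caption's maxima of 7 and 19) and truncates to one decimal place, since truncation rather than rounding appears to reproduce the printed values; both points are inferences from the table itself, and the function name `percent_agree` is hypothetical.

```python
def percent_agree(n_cren: int, n_eury: int, total: int = 26) -> float:
    """Percentage of organisms with the arrangement, truncated to one decimal.

    Integer arithmetic is used so that the truncation is exact.
    """
    tenths_of_percent = (1000 * (n_cren + n_eury)) // total
    return tenths_of_percent / 10


# (crenarchaeota, euryarchaeota) counts copied from Table 1.
rows = {
    "L14e, L34e": (3, 6),
    "L18e": (7, 19),
    "L21e": (6, 17),
    "S6e": (4, 8),
}
for cluster, (n_cren, n_eury) in rows.items():
    print(cluster, percent_agree(n_cren, n_eury))
```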


Archaeal r-protein genes associated with universal r-protein clusters

L18e-L13-S9-rpoN The archaeal r-protein L18e (COG1727) gene is located immediately upstream of the previously known bacterial L13 operon, which consists of universal proteins L13 and S9 as a single transcription unit (Isono et al. 1985). In most archaeal species, this cluster is followed by rpoN, which codes for the omega subunit of DNA-directed RNA polymerase. In some species, the universal alpha operon is upstream of the L18e cluster (Figure 1). The alpha operon itself is structured as in bacterial genomes except for the absence of an ortholog for r-protein L17. Thus, in archaea, the core gene cluster for the alpha operon is rps13-rps4-rps11-rpoD, and is frequently followed by the grouping rpl18e-rpl13-rps9-rpoN. This huge cluster has been shown to form a single transcription unit in H. marismortui and includes the gene for phosphopyruvate hydratase (eno), thus revealing coupled transcription of a glycolytic enzyme and an r-protein (Kromer and Arndt 1991). The position of L18e in this extended cluster suggests this protein may be a homolog of bacterial r-protein L17, which is missing in the Archaea. However, sequence comparisons did not support this hypothesis. Instead, L18e is homologous to universal r-protein L15 (Sano et al. 1999) and is a partial duplication of it. From a structural perspective it is L31e, not L18e, that incorporates into the ribosome in the same vicinity as L17. Thus, the gene for L18e is likely a late addition to the L13 operon.

L19e, L32e and S4e This group of proteins is highly conserved among the Archaea and their genes are inserted into the spc operon, which otherwise maintains its bacterial gene order (Coenye and Vandamme 2005). The gene for S4e (COG1471) is usually located between the genes for L24 and L5, whereas the genes for L32e and L19e are usually located consecutively between the genes for L6 and L18 (Figure 2). Methanopyrus kandleri is an exception in that the cluster is broken after the gene encoding L32e. The second portion of the cluster, beginning with the gene for L19e, is found on the opposite strand approximately 1.2 Mb upstream. This appears to be a special case of a recently disrupted spc operon. Pyrobaculum aerophilum is also an exception in that the genes for L24 and S4e remain together, but are separated from the rest of the operon components. In bacteria, the r-proteins transcribed from the spc operon are regulated by S8 (Zengel and Lindahl 1994). S8 typically binds to the rpl5 gene and represses translation of L5 and the other ribosomal proteins either by retroregulation or translational coupling (Cerretti et al. 1983, Mattheakis and Nomura 1988, Mattheakis et al. 1989). Given the conservation of the position of these proteins in the archaeal version of the cluster, it would not be surprising if a similar regulatory pathway were in use despite the addition of the genes for L19e, L32e and S4e, but this remains to be investigated.

L30e and the str operon Although the gene encoding L30e is found only in some archaeal species (Figure 3), it is associated with a set of upstream genes encoding DNA-directed RNA polymerase components including rpoH, rpoB and rpoA. The latter two are split into two genes in many species. rpl30e is followed by the transcription elongation factor gene nusA and the genes for S12 and S7. In bacteria, rpoB and rpoA are at the downstream end of the L10 operon (Klenk et al. 1999). rps12, rps7, fusA (translation elongation factor G), and tufB (translation elongation factor Tu) then form the conserved str operon whose expression is regulated by S7. Previous comparisons of the RNA polymerase subunits among archaeal species showed that rpoB and rpoA are sometimes split genes that together encode the DNA-directed RNA polymerase beta (COG0085K) and beta prime subunits (COG0086K). Together these genes form the core of the multisubunit DNA-dependent RNA polymerase (Archambault and Friesen 1993, Cramer et al. 2001, Schuwirth et al. 2005). In bacteria, the split rpoB genes and rpoA genes are


Figure 1. Aligned versions of the L18e gene cluster found in various archaeal genomes. Organisms containing a particular arrangement are indicated. In this figure and subsequent figures, genes are shown as boxes with a pointed end that indicates relative transcription direction but not gene length. Each box is labeled with its gene name except the r-protein genes, which are labeled to indicate their gene product. The non-universal archaeal r-protein genes are colored in pink and all other genes are shown in cyan. The thin arrows connect immediately adjoining genes and serve as alignment gaps. Blank spaces indicate the absence of a gene, e.g., rpoK in this figure, from the cluster but not necessarily the genome. These conventions are followed in all the subsequent figures.


replaced by the single genes rpoB and rpoC (the equivalent of archaeal rpoA), thereby maintaining the equivalent gene order upstream of rps12 and rps7. The archaeal version of the cluster frequently includes translational elongation factor 2 and the alpha subunit of translational elongation factor 1 after the rps7 gene. These are functional equivalents of the bacterial genes. In some species (P. aerophilum, T. pendens), these three genes are scattered elsewhere in the genome. Methanospirillum hungatei and M. marisnigri have a distinct cluster that includes rps12-rps7-EF2. This might be a case of late separation of the str operon from the older and larger archaeal cluster. The major distinction between the two systems is the presence in archaea of the gene for L30e and the transcription elongation factor nusA gene upstream of the r-protein S12 gene. Based on this comparison, it appears that a gene cluster analogous to the str operon is found in the Archaea too.

In most bacteria, there is no gene between rpoA and nusA. However, an examination of bacterial genomes revealed that in a few cases (Firmicutes, actinobacterial and thermotogal genomes) there is a gene, typically annotated as a second copy


Figure 3. Alternative versions of the gene cluster containing the gene encoding L30e in the Archaea. Labels are as in Figures 1 and 2. Group A shows the archaeal arrangements and Group B is the arrangement typically found in the Firmicutes. The gene which is homologous to the gene for L30e in the Firmicutes is typically incorrectly annotated as a second copy of the gene encoding L7ae, as discussed in the text.

Figure 2. Alternative versions of the spc gene cluster in the Archaea. Labeling conventions are as in Figure 1. In addition, a double broken line indicates a major gap such that the two groups of genes separated by the gap are not in the same vicinity of the genome. It should be noted that the relative transcription directions indicated pertain only to members of each group, not to the relative orientations between the groups. Thus, the two large subclusters indicated in M. kandleri are in fact separated in the genome, where they occur on opposite strands with transcription in opposite directions. This cluster always contains the archaeal-specific r-protein genes for L19e, L32e and S4e.


of rpl7ae, in the position occupied by the gene coding for L30e in the Archaea. To explore this further, multiple sequence alignments were constructed (not shown). The second copy of rpl7ae in various bacteria was aligned with archaeal versions of rpl30e and rpl7ae. In addition, the archaeal rpl7ae and rpl30e sequences were aligned. Clear evidence of similarity was found in all three cases, suggesting that rpl7ae and rpl30e are related. However, the version of the protein found in the bacterial str operon is far more similar to archaeal L30e than to archaeal L7ae and should therefore be annotated as L30e.

Archaea-unique clusters

S24e-S27ae S24e (COG2004) and S27ae (COG1998) are encoded consecutively in all 27 archaeal species (Figure 4). The surrounding genes are such that three alternative arrangements are seen. In the Euryarchaeota, two DNA-directed RNA polymerase subunits E1 and E2 and an archaeal-conserved hypothetical protein of unknown function are always located upstream. The unidentified protein is usually 150 to 195 amino acids long. In the case of Halobacterium sp. NRC-1, rpoE1 and rps24e are separated by 594 bases, which is large enough to encode a protein of 198 amino acids. However, the largest open reading frame would only encode a 32-residue hypothetical protein.
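The size estimate for the Halobacterium sp. NRC-1 gap follows from the three-bases-per-codon rule. A minimal sketch of that arithmetic is shown below; the helper name `max_residues` is hypothetical, and like the figure quoted in the text it does not set aside a codon for the stop signal.

```python
def max_residues(gap_bases: int) -> int:
    """Upper bound on the number of amino acids a gap could encode,
    at three bases per codon (stop codon not subtracted)."""
    return gap_bases // 3


# 594 bases separate rpoE1 and rps24e in Halobacterium sp. NRC-1.
print(max_residues(594))
```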

A multiple sequence alignment of thirteen examples of this hypothetical protein's sequence revealed several conserved segments as well as a highly variable region. The three crenarchaeal species and H. marismortui lack this upstream group of proteins but instead have a shared gene encoding O-sialoglycoprotein endopeptidase immediately downstream. Finally, several archaeal species have both the upstream and downstream neighbors discussed above.

L39e, L31e, S19e and LXa The genes for these four unique archaeal proteins (Figure 5) are always found in close proximity in a conserved cluster in conjunction with genes encoding several other proteins. There are, however, disruptions in several cases. The core of the cluster is rps19e-COG2118-COG2117-rpl39e. The COG2118 gene encodes a hypothetical DNA-binding protein and the COG2117 gene encodes a subunit of a tRNA methyltransferase. rpl31e (COG2097) typically follows rpl39e (COG2167). However, in P. aerophilum, M. jannaschii and M. maripaludis, these genes are separated from each other, with the cluster in effect being broken into two pieces. In several other cases, the cluster is again broken into two fragments with the rpl39e-rpl31e proximity kept intact. In all species, the gene for translation initiation factor 6 (eIF6) is immediately downstream of the gene for L31e and is frequently followed by rpl20A, which encodes the r-protein LXa (COG2157). Translation initiation factor 6 occurs in both the Archaea and Eucaryota and prevents premature association between the 60S and 40S ribosomal subunits. The gene for LXa is missing in the genomic sequences of some species belonging to the Halobacteria or Thermoplasmata families (Lecompte et al. 2002). The genes for COG1730 and COG0552 are frequently associated with the cluster too. They encode a molecular chaperone and a signal recognition protein, respectively.

L7ae-S28e-L24e-ndK-infB The genes for S28e (COG2053) and L24e appear next to each other in all the archaeal genomes with the exception of H. walsbyi, where the gene for S28e was not found, and T. pendens, where the two genes are separated. Additionally, in three crenarchaeal species, A. pernix, S. solfataricus, and S. marinus, the genes for S28e and L24e, although adjacent, are coded by different strands. In twenty of the twenty-six archaeal genomes, the gene for r-protein L7ae (COG1358) is found immediately upstream of the gene for S28e, as shown in Figure 6. The two crenarchaeal species A. pernix and S. solfataricus, along with a euryarchaeon, M. kandleri, deviate from this pattern, with the rpl7ae gene found elsewhere in these genomes. Two more genes, ndk (nucleoside-diphosphate kinase) and infB (translation initiation factor 2), are frequently found immediately downstream of rps28e and rpl24e. Thus, a clear picture emerges of a core conserved cluster involving the genes rpl7ae, rps28e, rpl24e, ndk and infB.

Characterization of other smaller archaeal r-protein gene clusters

The genes encoding the remaining nonuniversal proteins found in the Archaea are in small clusters containing only two or three genes. These clusters are summarized in Figure 7 and are briefly discussed below.

L21e (COG2139) is almost always encoded between a gene for a pseudouridylate-synthase-like protein (COG1258J) and rpoF, a DNA-directed RNA polymerase component. Typically a gene for an RNA-binding protein, COG1491J, is immediately downstream of rpoF. However, in M. stadtmanae and P. abyssi, genes encoding different proteins are found.

Figure 4. Alternative versions of the S24e-S27ae gene cluster in Archaea. Notation is as in Figure 1. The hypothetical protein is indicated as HP.

Figure 5. Structure of the L31e-L39e-LXa-S19e gene cluster in the Archaea. Notation is as in Figures 1 and 2.

Figure 6. Alternative arrangement of ribosomal protein genes in the S28e-L24e gene cluster. Notation is as in Figure 1.

L34e and L14e are the gene products of an uncertain cluster. It is only found in some archaeal species, but when both genes are present they are always in proximity to one another. When present, the gene for L34e (COG2174) is always upstream of cmk, which encodes a cytidylate kinase. The gene for L14e (COG2163) either immediately follows cmk or is downstream from it. Neither protein was found in species from the families Archaeoglobi, Halobacteria, Methanomicrobia or Thermoplasmata.

L44e and S27e are always encoded together. rpsUI2, which codes for the alpha subunit of initiation factor 2 (COG1093J), is typically found downstream of the genes for L44e (COG1631) and S27e (COG1998). S27e has a novel C4 zinc finger motif (Herve du Penhoat et al. 2004). L44e has a zinc finger motif with a similar structure, except that the first C2 motif is replaced by a CH motif with several amino acids in between (Ban et al. 2000, Laity et al. 2001). Although they lack obvious primary sequence similarity, the structural similarity between these proteins and their co-occurrence in the genome suggest they may have arisen in the same time frame of the archaeal lineage.

L37e (COG2126) is found in many archaea, but it is absent in P. aerophilum, P. torridus and H. marismortui. Since several closely related species have L37e, these absences may reflect poor annotation or sequencing problems. Excluding these species, the gene for L37e is typically located downstream of snRNP, which encodes the protein component of a small nuclear ribonucleoprotein (Collins et al. 2001). The only exception is M. jannaschii.

The gene for L37ae (COG1997) is found in conjunction with either or both of two other genes, one for a DNA-directed RNA polymerase subunit P (COG1996) and the other for an RNA processing protein (COG2136). These genes are encoded after rpl37ae, and when both are present, rpoP occurs first.

L15e (COG1632) is found in all archaea and although it meets the cluster criteria, the position of its gene is not strongly conserved. Most frequently the gene is located upstream of genes encoding an exosome subunit and the protein part of a ribonuclease P subunit.

S6e (COG2125) does not completely meet our cluster criteria and it is uncertain whether it represents a conserved cluster. The coding region of rps6e is frequently located immediately upstream of eif2G, which is the gene for the gamma subunit of translation initiation factor 2. However, in the genomes of all members of the families Halobacteria and Methanomicrobia as well as several other archaeal species these genes are separated. Moreover, in species from the families Methanobacteria, Methanomicrobia and Methanopyri, rps6e-eif2G are typically located downstream of the rpl7ae-rps28e-rpl24e-ndK-infB cluster. This latter arrangement is consistent with the observation that L30e, L7ae/S6e and the eukaryotic unique S12e may share a conserved RNA-binding motif (Mushegian and Koonin 1996).

Figure 7. Additional clusters of archaeal unique r-protein genes. Notation is as in Figures 1 and 2.

Discussion and conclusions

It must be considered whether the specific clusters reported here represent chance occurrences or transcriptional relationships between the clustered genes. In the absence of experimental data showing that clusters are operons, one must rely largely on statistical arguments. Although the significance of clustering has been considered in the literature (Durand and Sankoff 2003), no single recognized test has been established for clusters of orthologous genes. The main considerations influencing cluster conservation are the extent of proximity and order of the genes, genome size, and evolutionary distance between the genomes being considered. The more rigorous the proximity requirements and the more distantly related the genomes, the less likely it is that gene order will be conserved by chance. For example, when just five diverse bacterial genomes were compared (Siefert et al. 1997), only 16 universal clusters in total were found, and essentially all of them reflect genes sharing a transcriptional relationship in those organisms (mainly E. coli) where experimental data are available. In our study, the manner in which the clusters were identified required that they have nearly perfect proximity as well as the same gene order within the cluster. The extent to which the clusters are conserved in the two archaeal orders is shown in Table 1. Twelve of the clusters occurred in more than half of the genomes considered, with multiple representatives in both the Crenarchaeota and Euryarchaeota. Six of the clusters occurred in more than 80% of the species.
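The proximity and gene-order requirements, and the fraction of genomes in which a cluster is conserved, can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the method used to build Table 1: the function names, the `max_gap` parameter and the toy gene orders are hypothetical, each gene name is assumed to occur once per genome, and only one strand orientation is considered.

```python
def cluster_is_conserved(genome, cluster, max_gap=0):
    """Check one genome for a cluster with conserved order and proximity.

    genome  -- ordered list of gene names (assumed unique within the genome)
    cluster -- genes of the cluster in their expected order
    max_gap -- number of intervening genes tolerated between neighbors
    """
    positions = []
    for gene in cluster:
        if gene not in genome:
            return False
        positions.append(genome.index(gene))
    # Same order (strictly increasing) and neighbors within the allowed gap.
    return all(0 < b - a <= max_gap + 1
               for a, b in zip(positions, positions[1:]))


def conservation_fraction(genomes, cluster, max_gap=0):
    """Fraction of genomes in which the cluster is conserved."""
    hits = sum(cluster_is_conserved(g, cluster, max_gap)
               for g in genomes.values())
    return hits / len(genomes)


# Toy gene orders, invented for illustration only.
toy_genomes = {
    "A": ["x", "rpl44e", "rps27e", "eif2a"],
    "B": ["rpl44e", "y", "rps27e"],
    "C": ["rps27e", "rpl44e"],
    "D": ["rpl44e", "rps27e"],
}
print(conservation_fraction(toy_genomes, ["rpl44e", "rps27e"]))             # 0.5
print(conservation_fraction(toy_genomes, ["rpl44e", "rps27e"], max_gap=1))  # 0.75
```

Tightening `max_gap` lowers the fraction, which mirrors the point made above: the stricter the proximity requirement, the less likely it is that a match arises by chance.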

In bacteria and archaea, genes that have the same origins or share similar functions tend to form conserved gene clusters that likely coordinate expression of the gene products. Such conserved gene order may be an indicator of historical association. In the case of the universal r-proteins, these associations have remained largely unchanged since LUCA and almost exclusively involve universal r-proteins. The major exceptions are the core RNA polymerase genes, which are found immediately upstream of the str operon in many or most bacteria.

To explore such associations further, a comparative analysis of the gene context of the known archaeal r-proteins in seventeen complete genomes was conducted. As expected from prior studies, the universal r-protein genes were found in the same major clusters that occur in bacteria (Siefert et al. 1997, Coenye and Vandamme 2005). Of the nonuniversal archaeal r-protein genes, four were found in these universal clusters. The gene for L18e was found in the L13 operon, whereas the genes for L19e, L32e and S4e were in the spc operon. In addition to the conserved operons, at least ten previously unrecognized conserved clusters, each containing at least one non-universal archaeal r-protein gene, were identified. These clusters were typically smaller than the universal r-protein operons and they frequently included more than one nonuniversal r-protein gene. The remaining 11 nonuniversal archaeal r-proteins were not consistently associated with any particular gene cluster. Two proteins, L7ae and L16e, are universal in the Archaea and also found in a modest number of mostly high G-C content Gram-positive bacteria.

The nonuniversal r-proteins would likely be later additions, if one excludes the possibility of universal loss in the Bacteria. In support of this, the nonuniversal r-proteins are primarily associated with regions of the rRNA thought to be newer (Hury et al. 2006, Bokov and Steinberg 2009). Among the nonuniversal protein genes, those that belong to conserved clusters are likely older than those not associated with any cluster. In contrast to the universal r-protein clusters, the clusters that contain nonuniversal r-protein genes frequently contain non-r-protein genes. Since the nonuniversal r-proteins are likely later additions to the ribosomal machinery, we speculate that the proteins that they are associated with are also later additions, likely coming into existence in the same time frame. Thus, it may be possible to gain insight into the relative time of emergence of various cellular processes.

Among the genes associated with the nonuniversal r-proteins are a number coding for proteins involved in the initiation of translation. These are infB, eIF2α, eIF2-GTPase, eIF2γ and eIF6. eIF6 is an anti-association protein that binds to the large subunit and prevents the formation of the functional ribosome complex in both the Archaea and Eucaryota (Ceci et al. 2003). eIF2α and eIF2γ are components of eIF2, a translation initiation factor that is composed of several heterogeneous subunits in the Archaea. It is responsible for initiator tRNA binding and translation start site recognition. The sequence of eIF2γ shows homology to a protein belonging to the universal elongation factor family that includes EF-Tu in the Bacteria and eEF1A in the Archaea (Marintchev and Wagner 2004). Because the two archaeal initiation factors can function only as a complex, it is likely that they appeared at approximately the same time during ribosome evolution. It is also possible that the proteins S6e, S27e and L44e, which belong to the same clusters as the two initiation factors, appeared in the same time frame. L44e is known to be a protein component of the E site of the eukaryotic and archaeal ribosome (Schroeder et al. 2007) and might be one of the more ancient non-universal r-proteins. S6e has a unique RNA-binding domain that is shared among translation termination suppressors, ribosomal protein S12e and the rRNA modifying enzyme rimK in E. coli (Mushegian and Koonin 1996).

In bacteria, the initiation factors IF2 and IF3 play the same role in protein synthesis as their archaeal counterparts, but they do not share any sequence similarity with the archaeal proteins in question. Initiation of bacterial protein synthesis utilizes a Shine-Dalgarno sequence that is typically found upstream of the start codon, whereas in eukaryotes the Kozak scanning mechanism is typically employed. Thus, it is clear that the mechanisms of translation initiation underwent further development after the Bacteria and Archaea diverged.

As with the translation initiation factors discussed above, several genes associated with transcription, including various DNA-dependent RNA polymerase (RNAP) subunits, are found in archaeal r-protein clusters. The L30e cluster in archaea is in effect a homolog of the Beta cluster and str operon, which are found together in many bacteria. The genes rpoH, rpoA and rpoB are typically upstream of rpl30e, which is then followed, in analogy with the str operon, by the genes for S12 and S7. This similarity is extended by our observation that what has been reported as a second copy of rpl7ae in some bacteria is not only rpl30e, but is also in the same cluster position as archaeal rpl30e. In the archaea, rpoA and rpoB are split genes encoding the β′ and β subunits of the DNA-dependent RNA polymerase. These core components of the RNAP are conserved in all three domains (Cramer 2002, Cramer et al. 2001, Darst 2001). In addition, in the archaeal cluster, rpoH encodes RNA polymerase subunit H, which is a counterpart of rpb5, a subunit shared by the RNA polymerases I, II and III in the Eucaryota. No corresponding gene was found in bacteria (Cramer 2002).

When the nonuniversal r-protein gene clusters are considered, the linkage between transcription and translation is expanded further. Thus, the L21e and S24e/S27ae clusters also contain components of the archaeal RNAP. The gene rpoF is usually found downstream of the gene for L21e, whereas genes rpoE1 and rpoE2 are usually found upstream of the genes for S24e and S27ae. Their products, the archaeal RNAP E1 and E2 subunits, are homologous to rpb4 and rpb7, subunits of eukaryotic RNA polymerase II. They are reported to form a heterodimer in their own polymerase complex (Todone et al. 2001, Armache et al. 2003, Bushnell and Kornberg 2003).

In addition to the genes involved in transcription and the initiation of translation, the nonuniversal archaeal r-protein gene clusters contain a variety of other genes, many of which code for products of uncertain function. Genes coding for proteins of known function include those for cytidylate kinase and nucleoside diphosphate kinase, both of which are involved in nucleotide synthesis. Likewise, truB, snRNP and COG1258 are involved in post-transcriptional RNA maturation.
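The functional categories of the non-r-protein genes named in this discussion can be tallied with a few lines of Python. The sketch below is only a bookkeeping aid: the dictionary name `PARTNER_GENE_FUNCTION` and the function `tally_functions` are hypothetical, the gene names are written in ASCII, and the list covers only genes mentioned by name in the text, not the complete content of the clusters.

```python
from collections import Counter

# Non-r-protein genes named in the text, with the process assigned to
# each in the discussion. This is not an exhaustive cluster inventory.
PARTNER_GENE_FUNCTION = {
    "infB": "translation initiation",
    "eIF2alpha": "translation initiation",
    "eIF2-GTPase": "translation initiation",
    "eIF2gamma": "translation initiation",
    "eIF6": "translation initiation",
    "rpoP": "transcription",
    "rpoF": "transcription",
    "rpoE1": "transcription",
    "rpoE2": "transcription",
    "cmk": "nucleotide synthesis",
    "ndK": "nucleotide synthesis",
    "truB": "RNA maturation",
    "snRNP": "RNA maturation",
    "COG1258": "RNA maturation",
}


def tally_functions(gene_to_function):
    """Count how many named partner genes fall into each process."""
    return Counter(gene_to_function.values())


for category, count in tally_functions(PARTNER_GENE_FUNCTION).most_common():
    print(f"{category}: {count}")
```

Counts of this kind are descriptive only; they do not by themselves establish that any category is statistically over-represented.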

In summary, unlike the universal clusters, the nonuniversal r-protein gene clusters contain many non-r-protein genes. If these gene clusters are regulatory groupings, then the expression of the newer r-protein genes is likely to be more strongly coordinated with other cellular processes. In particular, there is a significant increase in the amount of coordination with transcription. In addition, there is likely to be significant coordination with the initiation of protein synthesis, and to a lesser extent RNA processing and nucleotide synthesis. The overall picture suggests an evolutionary history in which the core ribosome and transcription machinery initially arose before LUCA. Subsequently, after the archaeal and bacterial domains separated, the transcription process was refined and new mechanisms for the initiation of protein synthesis were introduced in the same time frame. These were likely coordinated with enhancements to the protein synthesis machinery represented by the various nonuniversal r-proteins rather than with the older core machinery.

Acknowledgments

This research was supported in part by grants to GEF from the NASA Exobiology program (NNG05GN75G), the Robert A. Welch Foundation (E-1451), the Texas Advanced Research Program, and the Institute of Space Systems Operations.

References

Allen, T., P. Shen, L. Samsel, R. Liu, L. Lindahl and J.M. Zengel. 1999. Phylogenetic analysis of L4-mediated autogenous control of the S10 ribosomal protein operon. J. Bacteriol. 181:6124–6132.

Archambault, J. and J.D. Friesen. 1993. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol. Rev. 57:703–724.

Armache, K.J., H. Kettenberger and P. Cramer. 2003. Architecture of initiation-competent 12-subunit RNA polymerase II. Proc. Natl. Acad. Sci. USA 100:6964–6968.

Ban, N., P. Nissen, J. Hansen, P.B. Moore and T.A. Steitz. 2000. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289:905–920.

Bokov, K. and S.V. Steinberg. 2009. A hierarchical model for evolution of 23S ribosomal RNA. Nature 457:977–980.

Bushnell, D.A. and R.D. Kornberg. 2003. Complete, 12-subunit RNA polymerase II at 4.1-Å resolution: implications for the initiation of transcription. Proc. Natl. Acad. Sci. USA 100:6969–6973.

Ceccarelli, E., M. Bocchetta, R. Creti, A.M. Sanangelantoni, O. Tiboni and P. Cammarano. 1995. Chromosomal organization and nucleotide sequence of the genes for elongation factors EF-1alpha and EF-2 and ribosomal proteins S7 and S10 of the hyperthermophilic archaeum Desulfurococcus mobilis. Mol. Gen. Genet. 246:687–696.

Ceci, M., C. Gaviraghi, C. Gorrini, L.A. Sala, N. Offenhauser, P.C. Marchisio and S. Biffo. 2003. Release of eIF6 (p27BBP) from the 60S subunit allows 80S ribosome assembly. Nature 426:579–584.

Cerretti, D.P., D. Dean, G.R. Davis, D.M. Bedwell and M. Nomura. 1983. The spc ribosomal protein operon of Escherichia coli: sequence and cotranscription of the ribosomal protein genes and a protein export gene. Nucleic Acids Res. 11:2599–2616.

Coenye, T. and P. Vandamme. 2005. Organisation of the S10, spc and alpha ribosomal protein gene clusters in prokaryotic genomes. FEMS Microbiol. Lett. 242:117–126.

Collins, B.M., S.J. Harrop, G.D. Kornfeld, I.W. Dawes, P.M. Curmi and B.C. Mabbutt. 2001. Crystal structure of a heptameric Sm-like protein complex from archaea: implications for the structure and evolution of snRNPs. J. Mol. Biol. 309:915–923.

Cramer, P. 2002. Multisubunit RNA polymerases. Curr. Opin. Struct. Biol. 12:89–97.

Cramer, P., D.A. Bushnell and R.D. Kornberg. 2001. Structural basis of transcription: RNA polymerase II at 2.8 angstrom resolution. Science 292:1863–1876.

Dandekar, T., B. Snel, M. Huynen and P. Bork. 1998. Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem. Sci. 23:324–328.

Darst, S.A. 2001. Bacterial RNA polymerase. Curr. Opin. Struct. Biol. 11:155–162.

Daubin, V., N.A. Moran and H. Ochman. 2003. Phylogenetics and the cohesion of bacterial genomes. Science 301:829–832.

Durand, D. and D. Sankoff. 2003. Tests for gene clustering. J. Comput. Biol. 10:453–482.

Fox, G.E. and A.K. Naik. 2004. The evolutionary history of the translation machinery. Kluwer Academic/Plenum, New York, pp 92–105.

Hartman, H., P. Favaretto and T.F. Smith. 2006. The archaeal origins of the eukaryotic translational system. Archaea 2:1–9.

Herve du Penhoat, C., H.S. Atreya, Y. Shen, G. Liu, T.B. Acton, R. Xiao, Z. Li, D. Murray, G.T. Montelione and T. Szyperski. 2004. The NMR solution structure of the 30S ribosomal protein S27e encoded in gene RS27_ARCFU of Archaeoglobus fulgidis reveals a novel protein fold. Protein Sci. 13:1407–1416.

Hury, J., U. Nagaswamy, M. Larios-Sanz and G.E. Fox. 2006. Ribosome origins: the relative age of 23S rRNA domains. Orig. Life Evol. Biosph. 36:421–429.

Isono, S., S. Thamm, M. Kitakawa and K. Isono. 1985. Cloning and nucleotide sequencing of the genes for ribosomal proteins S9 (rpsI) and L13 (rplM) of Escherichia coli. Mol. Gen. Genet. 198:279–282.

Klein, D.J., P.B. Moore and T.A. Steitz. 2004. The roles of ribosomal proteins in the structure, assembly, and evolution of the large ribosomal subunit. J. Mol. Biol. 340:141–177.

Klenk, H.P., T.D. Meier, P. Durovic, V. Schwass, F. Lottspeich, P.P. Dennis and W. Zillig. 1999. RNA polymerase of Aquifex pyrophilus: implications for the evolution of the bacterial rpoBC operon and extremely thermophilic bacteria. J. Mol. Evol. 48:528–541.

Kromer, W.J. and E. Arndt. 1991. Halobacterial S9 operon. Three ribosomal protein genes are cotranscribed with genes encoding a tRNA(Leu), the enolase, and a putative membrane protein in the archaebacterium Haloarcula (Halobacterium) marismortui. J. Biol. Chem. 266:24573–24579.

Kumar, S., K. Tamura and M. Nei. 2004. MEGA3: Integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5:150–163.

Kyrpides, N., R. Overbeek and C. Ouzounis. 1999. Universal protein families and the functional content of the last universal common ancestor. J. Mol. Evol. 49:413–423.

Laity, J.H., B.M. Lee and P.E. Wright. 2001. Zinc finger proteins: new insights into structural and functional diversity. Curr. Opin. Struct. Biol. 11:39–46.

Lecompte, O., R. Ripp, J.C. Thierry, D. Moras and O. Poch. 2002. Comparative analysis of ribosomal proteins in complete genomes: an example of reductive evolution at the domain scale. Nucleic Acids Res. 30:5382–5390.

Makarova, K.S. and E.V. Koonin. 2005. Evolutionary and functional genomics of the Archaea. Curr. Opin. Microbiol. 8:586–594.

Marintchev, A. and G. Wagner. 2004. Translation initiation: structures, mechanisms and evolution. Q. Rev. Biophys. 37:197–284.

Mattheakis, L., L. Vu, F. Sor and M. Nomura. 1989. Retroregulation of the synthesis of ribosomal proteins L14 and L24 by feedback repressor S8 in Escherichia coli. Proc. Natl. Acad. Sci. USA 86:448–452.

Mattheakis, L.C. and M. Nomura. 1988. Feedback regulation of the spc operon in Escherichia coli: translational coupling and mRNA processing. J. Bacteriol. 170:4484–4492.

Mushegian, A. 2005. Protein content of minimal and ancestral ribosome. RNA 11:1400–1406.

Mushegian, A.R. and E.V. Koonin. 1996. A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. USA 93:10268–10273.

Nomura, M., R. Gourse and G. Baughman. 1984. Regulation of the synthesis of ribosomes and ribosomal components. Annu. Rev. Biochem. 53:75–117.

Olsen, G.J. and C.R. Woese. 1997. Archaeal genomics: an overview. Cell 89:991–994.

Ramirez, C., L.C. Shimmin, P. Leggatt and A.T. Matheson. 1994. Structure and transcription of the L11-L1-L10-L12 ribosomal protein gene operon from the extreme thermophilic archaeon Sulfolobus acidocaldarius. J. Mol. Biol. 244:242–249.

Sano, K., A. Taguchi, H. Furumoto, T. Uda and T. Itoh. 1999. Cloning, sequencing, and characterization of ribosomal protein and RNA polymerase genes from the region analogous to the alpha-operon of Escherichia coli in halophilic archaea, Halobacterium halobium. Biochem. Biophys. Res. Commun. 264:24–28.

Scholzen, T. and E. Arndt. 1992. The alpha-operon equivalent genome region in the extreme halophilic archaebacterium Haloarcula (Halobacterium) marismortui. J. Biol. Chem. 267:12123–12130.

Schroeder, S.J., G. Blaha, J. Tirado-Rives, T.A. Steitz and P.B. Moore. 2007. The structures of antibiotics bound to the E site region of the 50 S ribosomal subunit of Haloarcula marismortui: 13-deoxytedanolide and girodazole. J. Mol. Biol. 367:1471–1479.

Schuwirth, B.S., M.A. Borovinskaya, C.W. Hau, W. Zhang, A. Vila-Sanjurjo, J.M. Holton and J.H. Cate. 2005. Structures of the bacterial ribosome at 3.5 Å resolution. Science 310:827–834.

Siefert, J.L., K.A. Martin, F. Abdi, W.R. Widger and G.E. Fox. 1997. Conserved gene clusters in bacterial genomes provide further support for the primacy of RNA. J. Mol. Evol. 45:467–472.

Tatusov, R.L., D.A. Natale, I.V. Garkavtsev, T.A. Tatusova, U.T. Shankavaram, B.S. Rao, B. Kiryutin, M.Y. Galperin, N.D. Fedorova and E.V. Koonin. 2001. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 29:22–28.

Tatusov, R.L., N.D. Fedorova, J.D. Jackson et al. 2003. The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4:41.

Thompson, J.D., D.G. Higgins and T.J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.

Todone, F., P. Brick, F. Werner, R.O. Weinzierl and S. Onesti. 2001. Structure of an archaeal homolog of the eukaryotic RNA polymerase II RPB4/RPB7 complex. Mol. Cell. 8:1137–1143.

Waters, E., M.J. Hohn, I. Ahel et al. 2003. The genome of Nanoarchaeum equitans: insights into early archaeal evolution and derived parasitism. Proc. Natl. Acad. Sci. USA 100:12984–12988.

Yang, D., I. Kusser, A.K. Kopke, B.F. Koop and A.T. Matheson. 1999. The structure and evolution of the ribosomal proteins encoded in the spc operon of the archaeon (Crenarchaeota) Sulfolobus acidocaldarius. Mol. Phylogenet. Evol. 12:177–185.

Zengel, J.M. and L. Lindahl. 1994. Diverse mechanisms for regulating ribosomal protein synthesis in Escherichia coli. Prog. Nucleic Acid Res. Mol. Biol. 47:331–370.
