o irl press limited, oxford, england. 8209

10
12 Numb* 21 1984 Nucleic Acids Research Preferential use of A- and U-rich codons for Mycoplasma capricolum ribosomal proteins S8 and L6 Akira Muto, Yasushi Kawauchi*, Fumiaki Yamao and Syozo Osawa Laboratory of Molecular Genetics, Department of Biology, Faculty of Science, Nagoya University, Furo-Cho, Chikusa-Ku, Nagoya 464, Japan Received 17 September 1984; Accepted 15 October 1984 ABSTRACT The nucleotide sequence of the 1.3 kilobase-pair DHA segment, which contains the genes for riboaomal proteins S8 and L6, and a part of L18 of Mycoplasma capricolum, has been determined and compared with the corresponding sequence in Escherichia coli (Cerretti et_ al., Nucl. Acids Res. 11, 2599, 1983). Identities of the predicted amino acid sequences of S8 and L6 between the two organisms are 54Z and 42Z, respectively. The A + T content of the M. capricolum genes is 71Z, which is much higher than that of Z. coli (49Z). Comparisons of codon usage between the two organisms have revealed that M. capricolum preferentially uses A- and U-rich codons. More than 90Z of the codon third positions and 57Z of the first positions in M. capricolum is either A or U, whereas J!. coll uses A or U for the third and the first positions at a frequency of 51Z and 36Z, respectively. The biased choice of the A- and U-rich codons In this organism has been also observed in the codon replacements for conservative amino acid substitutions between M. capricolum and 12. coli. These facts suggest that the codon usage of M. capricolum is strongly influenced by the high A + T content of the genome. INTRODUCTION The DNA ofraycoplasmasis extremely rich in A and T (1,2). It is thus interesting to see how this characteristic feature effects on the structure of the mycoplasma genes. Recently, we have cloned a DNA segment containing a cluster of ribosomal protein genes of Mycoplasma capricolum (3). This paper reports the nucleotide sequence of the DNA segment coding for ribosomal proteins S8 and L6, demonstrating that the codon usage in M. capricolum genes is greatly deviated from that in J2. coli. MATERIALS AND METHODS Plasmid DNA was prepared according to Oka et_ jil. (4). DNA sequence determinations were performed by the chain-termination method of Sanger et^ al. (5) as described by Messing and Vieira (6). Restriction-endonucleases, T4 DNA-ligase and DNA polymerase I (large fragment) were purchased from Takara-Shuzo Co. Ltd.(Kyoto). The enzyme reaction conditions were those recommended by the comercial supplier. O IRL Press Limited, Oxford, England. 8209 Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142 by guest on 02 April 2018

Upload: vutu

Post on 01-Feb-2017

219 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: O IRL Press Limited, Oxford, England. 8209

12 Numb* 21 1984 Nucleic Acids Research

Preferential use of A- and U-rich codons for Mycoplasma capricolum ribosomal proteins S8 and L6

Akira Muto, Yasushi Kawauchi*, Fumiaki Yamao and Syozo Osawa

Laboratory of Molecular Genetics, Department of Biology, Faculty of Science, Nagoya University,Furo-Cho, Chikusa-Ku, Nagoya 464, Japan

Received 17 September 1984; Accepted 15 October 1984

ABSTRACTThe nucleotide sequence of the 1.3 kilobase-pair DHA segment, which

contains the genes for riboaomal proteins S8 and L6, and a part of L18 ofMycoplasma capricolum, has been determined and compared with thecorresponding sequence in Escherichia coli (Cerretti et_ al., Nucl. Acids Res.11, 2599, 1983). Identities of the predicted amino acid sequences of S8 andL6 between the two organisms are 54Z and 42Z, respectively. The A + Tcontent of the M. capricolum genes is 71Z, which is much higher than that ofZ. coli (49Z). Comparisons of codon usage between the two organisms haverevealed that M. capricolum preferentially uses A- and U-rich codons. Morethan 90Z of the codon third positions and 57Z of the first positions in M.capricolum is either A or U, whereas J!. coll uses A or U for the third andthe first positions at a frequency of 51Z and 36Z, respectively. The biasedchoice of the A- and U-rich codons In this organism has been also observed inthe codon replacements for conservative amino acid substitutions between M.capricolum and 12. coli. These facts suggest that the codon usage of M.capricolum is strongly influenced by the high A + T content of the genome.

INTRODUCTION

The DNA of raycoplasmas is extremely rich in A and T (1,2). It is thus

interesting to see how this characteristic feature effects on the structure

of the mycoplasma genes. Recently, we have cloned a DNA segment containing a

cluster of ribosomal protein genes of Mycoplasma capricolum (3). This paper

reports the nucleotide sequence of the DNA segment coding for ribosomal

proteins S8 and L6, demonstrating that the codon usage in M. capricolum genes

is greatly deviated from that in J2. coli.

MATERIALS AND METHODS

Plasmid DNA was prepared according to Oka et_ jil. (4). DNA sequence

determinations were performed by the chain-termination method of Sanger et^

al. (5) as described by Messing and Vieira (6).

Restriction-endonucleases, T4 DNA-ligase and DNA polymerase I (large

fragment) were purchased from Takara-Shuzo Co. Ltd.(Kyoto). The enzyme

reaction conditions were those recommended by the comercial supplier.

O IRL Press Limited, Oxford, England. 8209

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 2: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

Fig. 1.

32,,

M tC -O MW O C *-<

h U X M

I M L_

(a)PMCB10ee

(bl 1.3Kb DNA

(OORFs

1Kb

0.2Kb

J LORF-1 ORF-2 ORF-3

(a) Restriction map of pMCB1088. The thick line shows the insertedM. capricolum DNA segment and the thin line represents the vectorpBR322 DNA. (b) The strategy for DNA sequence determination of the1.3 Kb Hindlll-fragment from pMCB1088. The arrows indicate thedirection of the sequence determinations. (c) The locations of openreading frumps.

fa- P)dATP was purchased from Amersham Japan. The construction of plasmid

pMCB1088 containing a DNA segment from M. capricolum ATCC27343(KID) was

described elsewhere (3).

RESULTS

Plasmid pMCB1088 contained a 9 kilobase-palr (Kb) Bglll-fragment of M.

capricolum DNA that had been inserted into the BamHI-site of pBR322 (3).

This DNA segment coded for at least eight ribosomal proteins (3). The

restriction map of the insert is shown in Fig. l(a). Deletion mapping

experiments revealed that the 1.3 Kb Hindlll-fragment derived from the

central part of the insert contained the gene for at least a ribosomal small

subunit protein of M. capricolum (MS5; see ref. 3).

The complete nucleotide sequence of the 1.3 Kb (1,251 base-pairs)

fragment was determined according to the strategy shown in Fig. l(b). More

than 90X of the sequences was confirmed by sequencing both strands. Computer

analyses for potential protein coding sequences revealed the presence of

three open reading frames (ORF-1, ORF-2 and ORF-3) on the same DNA strand

(Fig. l(c); see also Fig. 2). The amino acid sequences of the ORF-1 and

ORF-2 deduced from the nucleotide sequences were found to have 54Z and 42X

identities with those of £. coli ribosomal protein S8 (7) and L6 (8),

8210

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 3: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

respectively. The reported N-terminal sequences of proteins S8 and L6 from

Bacillus subtills also shoved high horologies with the corresponding

sequences of the ORFs (9, 10). Thus, we considered the ORF-1 and ORF-2 as

the genes for M. capricolum S8 or rpsH and L6 or rplF, respectively. The

0RF-3 could encode 81 amino acid residues, the sequence of which resembled

that from the N-terminus of the JE. coli ribosomal protein L18 (31Z identity).

We tentatively identified the ORF-3 as a part of the L18 gene or rplR. The

ORFs were interrupted by relatively short (19 and 28 base-pairs) spacers,

that did not contain promoter- or terminater-like sequences, suggesting that

these genes are a part of the operon for ribosomal protein genes. It is

interesting that the order of the genes (rpsH-rplF-rplR) on this fragment was

the same as that in the spc-operon of E. coli, where the genes for ten

ribosomal proteins and two membrane proteins are clustered as a

transcriptional unit (11). This indicates that the order of at least the

three genes mentioned above has been conserved between M. capricolum and I!.

coli.

Since the complete sequence of the j!. coll spc-operon was reported (11),

we can directly compare the sequences of the ribosomal protein genes between

M. capricolum and J!. coli. Fig. 2 shows the nucleotide sequences of the 1.3

Kb DNA (mRNA-like strand) and the predicted amino acid sequences aligned with

those of I!, coli. The M. caprllolum S8 gene was 384 base-pairs long (128

codons including initiation codon), 6 base-pairs shorter than the JS. coli

gene. The L6 gene of M. capricolum was 540 base-pairs long (180 codons), 9

base-pairs longer than that of E . coll. The spacers between the genes of M.

capricolum were longer than those of £. coli.

DISCUSSION

The A + T content of M. capricolum genomic DNA is 75Z (12), one of the

highest in prokaryotes so far known, whereas that of . coli DNA is about

502. Thus, the high A + T content of the genome may directly dictate the

codon usage of the M. capricolum genes. In fact, the A + T content of the M.

capricolum ribosomal protein genes is about 71%, which is much higher than

that of 12. coli (49X). In Table 1 is compared the codon usage in S8 and L6

genes between II. capricolum and . coll. The choice of synonymous codons in

M. capricolum is clearly different from that in 1!. coli and strongly biased.

For example, UUA(Leu), AUU(Ile) and AGA(Arg) are the most frequently used

codons among synonyms in M. capricolum, while they are rarely used in Z. coli

fCUG(Leu), AUC(Ile) and CGU(Arg) are predominant]. The most outstanding

feature of the codon usage in M. capricolum can be seen in a very high

8211

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 4: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

M.c.E.c .

M.c.E.c.

AAGCTrCATGATAGAAAGAGAGAAGATTCAAAAGTATTGTCACCAATTGAATCACGGCAGCTAAAGACAG

1 S8_startThriThrACAJACAAGClATGSerlltat

Met

ATGMet

CAAGin

AspGATGATAsp

Thr Arg II* Arg A»nACT AGA ATT AGA AATACC CGT ATC CGT AACThr Ars lie Arg Asn

M.c.E.c.

M.c.E.c.

M.c.E.c.

M.c.E.c.

M.c.E.c.

V a lGTTGTTV a l

UluGAAGAAGlu

GlyGGTGGCGly

AspGATGACAsp

Leu"jGluTTAlGAAGTGIGCAVajjAla

ValGTTACCThr

AlaiAsnGCTIAATGGTCly,—

l i e AlaATA GCAATC GCCH e Ala

ArgAGAGCCA l a

TyrTACGCGAla

LeuTTAAACAsn

LysAAAAAA

Lys

ThrACTGCT GCGAla Ala

ValGTTCCGPro

VaTGTAGTCVal

H e Ala Aap Met LeuATT GCA CAT ATG CTA 0 0 6 5ATC GCG GAT ATG CTGHe Ala Aap Met Leu

Ser jVa lAGTlCTTACCJATGTh^Mat

Arg T i l eAGAIATTAAC I GTC

Leu Lys G l u Glu Gly Phe H eTTA AAA GAA GAA GGA TTT ATCCTG AAG GAA GAA GGT TTT ATTLeu Lys Glu Glu Gly Phe H e

SerTCAGAAGlu

LysAAAAAGLys

Lys Thr !"lle| Asn f i l e 1 GluAAA ACT i ATT I AAT | ATT i GAACCT GAA[CTG|GAAICTTlACTPro GlujLeuiGlu^LeujThr

Leu Lys TyrTTA AAA TACCTG AAG TATLeu Lys Tyr

M.c.E . c .

M.c.E . c .

M.c,E.c,

AGAGTTV a l

VaTGTAGTTV a l

He Gin GlyjLeu|Lys [lys"lieATT CAA GCA|TTAIAAG, AAA ATT

GAA AGClATTJCAGICCT GTCGlu Serj_IlejGln{_Ar5_ Val

SerTCTACCSer

LysAAACGCArg

Pro Gly Leu ArgCCA GGT TTA AGACCA GGC TTG CGCPro Gly Leu Arg

TTCPhe

ValGTTTACTyr

Pro Ser SerCCG TCT AGC 0122CCT TCC TCCPro Ser Ser

Asp PheGAC TTCGAT TTTAsp Phe

Gin Gly LysCAA GGA AAACAG GGC AAAGin Gly Lys

ThrACT 0182AAALy.

ThrACT 0239GCTAla

TyrTATTATTyr

Ala Gin AlaGCA CAA GCT 0299AAA CGT AAALya Arg Lys

Asn Glu[~lleAAT GAAiATTGAT CAGICTGAsp Gln[_Leu

HeATAGTTVal

Met ThrATG ACTATG ACTMet Thr

ProCCACCCPro

Gin

AAALys

VaTGTAGTTVal

y f y l yGGTiAAAIAAAGAT'CCTJGCA

TCA TAA TAGGAGTTTAAATGCC TAA TCGGACGAAAAA-Ala130

Leu AmTTA AACATG GCGMet Ala

LeuCTACAAGin

1 L6 s tar t

Ala ArgGCT CCAGCG CCCAla Arg

Gly Lau Gly HeGGA TTA GGT ATCGGT CTG GGT ATCGly Leu Gly He

Met Ser ArgATG TCT CGTATG TCT CGTMet Ser Arg

AlaGCTGCTAla

Asn AlaAAT GCT|GGT CTT:Gly Leu

Ser(lieTCAIATTGCAJGTTA l a V a l

Val S e r Thr SerGTT TCA ACA TCAiGTT TCT ACC-TCTV a l S«r Thr Sar

GinCAA

Lys

GlyGGAGGTGly

T3B

0359

GlyGGTGGTGly

Gly GluGGA GAAGGC GAAGly Glu

Val Leu I AlaCTT CTAlGCAATT ATClTGC

H e GlylAsn ArgATA GGTIAAT AGAGTT GCTIAAA GCA

Leu I Leu I GinTTA'TTAICAA

PhejllelTTC|ATT1O419TAC'GTA]TyrLVal]

_Val_AlajLys Ala

VallGluGTT GAAGTT GACVal I Asp

V a lGTTGTAVal

LysAAAAAALys

H eATAATCH e

A l a Glu iAsn} Asn [ l e u l ValGCA GAA! AAC I AAT I TTA J GTAAAC GGT.CAG I GTT1 ATTAsn Gly ( G i n ) V a l [ l i e |

ThrACAACGThr

H eATTATCH e

ThrACA

Lyi

GlyGGAGGTGly

Ser Lys iGln Phe Ser Pro L a u | I l e ' L y s i I l e ~ J G l u Val Glu, rGlu|AsnTCA AAAICAA TTT TCA CCT TTAI ATC i AAA] ATT|GAA GTT GAAIGAA AACACT CCTjACT CTC AAC GAT GCT [CTT< GAG I GTT 'AAA CAT GCAJGAT AATjJhr_ArgjThr Leu Asn A»p

JHis Ala|A»p|Asn

L^Val

Ser LysTCT AAAAAA AACLys Asn

LysAAAACCThr

LeuTTACTGLeu

Arg Leu Asn Glu Gin Lys U l s Thr Lys Gin Leu H i sAGA TTA AAC GAA CAA AAA CAT ACA AAA CAA TTA CACCCG CGT GAT GGT TAC GCA GAC GGT TGG GCA CAG GCTPro Axg Aap Gly Tyr A l a Ajp Gly Trp A l a Gin A l a

tiu^ThrTTAlACTGTT1 ATCVal] He

G l y V a lGGA GTTGGT CTTG l y V a l

Sir"ACTACCThr

GAAGAG

GlyGCAGACAap

Ph«TTTTTCPhe

LysAAAACTThr

LysAAAAAGLys

C l y ThrGGA ACTGGT ACCC l y Thr

Thr Asn S e rACT AAT TCAGCG CGT GCCAla Arg A l a

GluGAAAACLy»

Leu GinTTA CAACTG CAGLeu Gin

Il«"|ThrATTlACTCTGIGTTLeu]Val

ProCCACCTPro

GlyGGAGGCGly

AsnAATGCCAla

ThrACTGAGGlu

Leu

CTGLeu

0540

H eATCACCThr

LeuTTACTCLeu

ClyGCGGCTCly

ThrACTTTTPhe

LeuCTACTGLeu

ValGTTGTAVal

LysAAA 0 6 0 0GGTGly

Cln;CAA|O66OAAC]

ClyCGC 0 7 2 0CGTGly

AlaGCTCCAAla

AlaGCAGCGAla

ValGTTGTTVal

AsnAATAAALys

GiyGGTGGGGly

SerTCTAATAsn

Lys'LeuAAAiTTAGTG'ATTValjjle

AanAATAACAan

LeuTTACTGLeu

SerAGTTCTS e r

LeuTTACTCL«u

GlyGGTCGTCly

TyrTATTTCPhe

S e rTCATCTSer

HisCATCATU l s

ProCCTCCTPro

ValCTTGTTVal

0780

8212

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 5: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

M.cE.c

M.c,E.c.

M.c.E.c.

M.c.E.c.

M.c.E.c.

M.c.E.c.

M.c.E.c.

M.c.E.c.

GlufileGAAlTTT GAAiATTGACJCAT CAGjCTGAspJaLs Gin [Leu

ProCCTCCTPro

Aap

GCCAla

GlyGCTGGTGly

V a l | V a l l i e Gin Ala Val Lys ProGTA]CTG ATT CAA CCA GTT AAA CCA

ATC'ACT GCT GAA TCT CCG ACT CAGl i e Thr Ala Glu Cys Pro Thr Gin

Thr GluACT GAAACT GAAThr Glu

UujAlaiile]TTA. CCT I ATT 10840ATC GTG CTCJl i e , V a l | L e u |

ThrACTAAALys

GlyGGAGGCGlv

HeATCGCTAla

Asp Lys GinCAT AAA CAAGAT AAG CAGAsp Ly» Gin

Leu ValTTA CTTGTG ATCVal l i e

PTO CluCCT GAACCT GAGPro Glu

Pro TyrCCA TATCCT TATPro Tyr

Lys Gly LysAAA GCT AAAAAA GGC AAGLys Gly Lys

GlyGGAGCTGly

GlyGGGGCT

GluGAA

ProCCT

LysAAAAAGLys

A l a ArgGCT AGA

CCTArg

GCTAla

Ala AlaGCA GCTAAG AAGLys Lys

177LysAAAATCHe

18(5Gly LysGCT AAA TAGTAA

Gly Gin Val Ala AlaGCT CAA CTA GCA GCAGGC CAC GTT GCA GCGGly Gin Val Ala Ala

H e LysATT AAAGTT CCTVal Arg

TyxTACTACTvr

Lys AsnAAA AATGCC CACAla Asp

Asn j H eAATI ATTGATJCTCAap[Leu

Arg Ala Tyr ArgAGA GCA TAT AGACGC GCC TAC CCTArg Ala Tyr Arg

LysAAA 0900CCT

GluGAAGAAGlu

TACCAGAGCTTAGGGATTACTAACTGCTAACACT

Thr lile"! Il« ArgACT J ATT | ATT AGACTCIGTCICGC ACCVal[_Val!Arg Thr1 L18~start

0960

ATGMet

Arg LeuAGA TTACGC CTGArg Leu

Thr Lys GlyACT AAA GGAGCT TCT GAAGly Ser Glu

AsnAATCTGVal

ValCTTCTTVal

Arg ArgCCT AGACCT CCTArg Arg

H i sCATGCGAla

Ph«TTCACCThr

A r gAGACGCArg

V a lGTAGCAAla

ArgAGACGCArg

HisCATCGCArg

P h e l L y s S e r ] A s n Thr Asn PheTTT'AAA TCAIAAT ACT AAT TTCCAT]CCT ACC'CCG CGT CAC ATTHlsj_Arg_ThrjPro Arg H i s H e

ThrACA

Asn H e jAAT ATT CAA1GCT"GCT

AAG TAC ACCJGGTJAATLys Tyr T h r G l y

LysAAAAAGLys

Val !Val Gly ThrlAlajGluGTTIGTT GCT ACTJCCTIGAACTC'CAG GAG CTGIGGC'GCA

LeujGln Glu L»uj_GlyjAla

Lys Phe ThrILysAAA TTT ACT|AAAGATAap Lys

TyrTATTACTyr

AlaGCTGCAAla

GinCAACACGin

GTAVal

l i eATTATTH e

H e Asp AspATT GAT GAT U 4 2GCA CCG AACAla Pro Asn

Leu ValTTA GTACTG CTALeu Val

SerTCTGCTAla

Glu Lys ValGAA AAA CTTAAA GAC GCGLys Asp Ala

Ala SerGCT TCTGCT TCTAla Ser

ThrACAACTThr

Leu^Lys Met AspFLeuJLys Ser Lys SerTTA AAA ATG GAT TTAIAAG ACT AAA TCT 1202GTA,GAA AAA GCT ATC|CCT GAA CAA CTGValJGlu Lys A l a i l l e j A l a Glu Gin Leu

AlaGCTGCTAla

Glu Glu,rLeipThr LysGAA GAAITTAlACT AAAGCA GCTJGTG"- GCTAla Ala|_Valj Cly

81Lys AlaAAA GCTAAA GCTLys Ala

1251

77

Fig. 2. The total nucleotlde sequence of the 1.3 Kb Hlndlll-fragment anddeduced amlno acid sequences of the ORFs. The sequences are alignedwith the corresponding E_. coll sequences reported by Cerrettlet_ £l (11). M.c: M. caprlcolum; E.c: E. coll.

frequency of the codons ending in either A or U; 91Z (280/308) of the M.

caprlcolum codons has A or U at the third position, in contrast to only 51Z

(155/307) In E. coll (Table 2). The A + U content of the first position of

the codons in M. caprlcolum is also significantly higher (57Z;174/308) than

that in J3. coli (36X;110/3O7). It is thus evident that the choice of

synonymous codon in M. caprlcolum is biased to the codons much richer in A

and U as compared with E. coli.

To see further the preferential use of A- and U-rich condons in M.

capricolum, individual codons for the S8 and L6 genes of the two organisms

are compared. There exists 143 identical amlno acids at the homologous

positions between M. capricolum and E. coli S8 and L6 (Fig. 2). Among them,

8213

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 6: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

Table 1. Codon usage in S8 and L6 genes of M. capricolum and 12. collSecond

PhePheLeuLeu

LeuLeuLeuLeu

HeHeHeMet

ValValValVal

M.c.

42240

0040

21545

16071

U

E.c*

3301

21018

51208

21535

SerSerSerSer

ProRroProPro

ThrThrThrThr

AlaAlaAlaAla

M.c.

5080

4051

15-060

80110

c

E.c.

5210

7114

8901

9799

TyrTyr

HisHisGinGin

AsnAsnLysLys

AspAspGluGlu

M.c.

5301

21150

134341

51210

A

E.c.

3520

30111

291711

106125

CysCys

Trp

ArgArgAr«Arg

SerSerArRArg

GlyGlyGlyGly

M.c.

0010

1010

31110

140133

G

E.c.

1101

13500

0200

19901

ucAG

U

cAG

UCAG

U

cAG

Taken from Cerretti e£ a^. (11)M.c. : M. capricolum ; E.c. : 15. coli

the corresponding codons for 46 amino acids are identical, while the rest 97

codons are different (synonymous codon replacement). All of these silent

nucleotide substitutions in the 97 synonymous codons are listed in Table 3.

Here, the preferential use of A- and U-rich codons in M. capricolum can

clearly be seen. Sixty-six codons having G or C at the third position in J5.

coll are replaced by synonymous codons ending in A or U with M. capricolum.

Table 2. Nucleotide composition at three different positions of codons inS8 and L6 genes of M. capricolum and J. coli

ucAG

A + U(2)

First

M.c.

5134123100

174(56.5)

E.c.

266784130

110(35.8)

Second

M.c.

936310547

198(64.3)

E.c.

87739552

182(59.3)

Third

M.c.

1161716411

280(90.9)

E.c.

Ill774475

155(50.5)

Total

M.c.

260114392158

625(70.6)

E.c.

224217223257

447(48.5)

M.c. : M. capricolum ; E. c. : E,. coli

8214

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 7: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

Table 3. Synonymous nucleotlde substitutions In S8 and L6 genesbetween M. caprlcolum and E. coll

Ala

Asn

Asp

Arg

Gin

Glu

Gly

lie

Leu

M.c.

GCAGCAGCUGCU

AUUAAC

GAUGAC

AGAAGACGA

CAA

GAA

GGAGGAGGUGGUGGG

AUUAUAAUC

UUAUUACUA

E.c.

GCGGCCGCGGCA

AACAAU

GACGAU

CGCCGUCGC

CAG

GAG

GGCGGUGGCGGGGGU

AACAUCAUU

CUGUUGCUG

(AU)a

(+1)(+D(+1)( 0)

(+1)

(-D

(+1)

(-D(+2)

(+D(+1)

(+1)

(+1)

(+1)( 0)(+1)(+1)

(-D

(+1)(+1)(-1)

(+2)(+1)(+1)

No.b

3211

21

11

231

5

3

47512

321

912

a x b

3210

2-1

1-1

431

5

3

4051-2

32-1

1812

Lys

Phe

Pro

Ser

Thr

Tyr

Val

M.c.

AAA

UUUUUC

CCACCACCG

UCUUCUUCAAGTJAGC

ACAACAACU

UAUUAC

GUAGUAGUU

E.c.

AAG

UUCUUU

CCC

ecuCOT

UCCAGCUCUUCUUCC

ACGACCACC

UACUAU

GUCGUUGUA

Total

(AU)a

(+1)

(+1)(-1)

(+1)( 0)

(-D

(+1)(+1)( 0)( 0)( 0)

(+1)(+1)(+1)

(+1)(-1)

(+1)( 0)( 0)

No.b

8

11

121

11311

112

21

132

97

a x b

8

11

10-1

11000

112

2-1

100

74°

M.c. : M. caprlcolum ; E.c. : E. colla : A + U gains in H. caprlcolum as compared with E. collb : Number of occurrencec : Total A + U gains In 97 synonymously substituted codons

(291 nucleotlde residues) in M. caprlcolum as compared with coll

A striking example is that eight AAG condone for Lys in E. coll are

substituted by AAA in M. caprlcolum.

Fifty conservative amlno acid replacements (i.e., Lys/Arg, Ser/Thr,

Leu/Ile/Val, etc.) can be seen at the homologous positions of the S8 and L6

proteins between the two organisms (Fig. 2). In Table 4 are collected all of

these amlno acid substitutions with their codon replacements. Most of the

conservative amino acid substitutions take place so that the M. capricolum

maintains much higher A + U content in the codons than E^. coli. For example,

seven Arg with codon CGU or CGC In £. coll are replaced at the corresponding

8215

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 8: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

Table 4. Conservative amino acid substitutions in protein S8 and L6between M. capricolum and E. coli with their codon replacements

Substitution

Lys/Arg

Leu/Ile

Leu/Val

Ile/Val

Ser/Thr

Ala/Gly

Asn/Gln

Asp/Glu

M. capricolum

Lys(AAA)Lys(AAA)

Ile(AUU)Ile(AUU)Leu(UUA)Leu(UUA)Leu(CUA)

Leu(UUA)Leu(UAA)Leu(UAA)Val(GUA)

Ile(AUU)Ile(AUU)Ile(AUU)Ile(AUU)Ile(AUA)Ile(AUC)Val(GUA)Val(GUU)Val(GUU)

Ser(AGU)Ser(UCA)Thr(ACA)

Ala(GCU)Gly(GGU)Gly(GGG)

Asn(AAC)Gln(CAA)

Glu(GAA)Glu(GAA)

Total

E. coli

Arg(CGU)Arg(CGC)

Leu(CUG)Leu(CUU)Ile(AUC)Ile(AUU)Ile(AUC)

Val(GUC)Val(GUG)Val(GUU)Leu(CUG)

Val(GUU)Val(GUC)Val(GUA)Val(GUG)Val(GUU)Val(GUU)Ile(AUC)Ile(AUC)Ile(AUU)

Thr(ACC)Thr(ACU)Ser(AGC)

Gly(GGU)Ala(GCU)Ala(GCU)

Gln(CAG)Asn(AAC)

Asp(GAC)Asp(GAU)

(AU-gain)a

(+2)(+3)

(+2)(+1)(+1)( 0)( 0)

(+2)(+2)(+1)(+1)

(+1)(+2)(+1)(+2)(+1)( 0)( 0)( 0)(-1)

(+1)( 0)(+1)

( 0)( 0)(-1)

(+1)( 0)

(+1)( 0)

Number ofoccurrence

61

61131

1211

421121111

211

111

11

21

50

a x b

123

121100

2411

44122000-1

201

00-1

10

20

54C

a : A + U gains in M. capricolum codons as compared with I!, colib : Number of occurrencec : Total A + U gains in 50 conservatively substituted codons (150

nucleotide residues) in M. capricolum as compared with E. coli

positions by Lys with AAA in H. capricolum. Thus, the choice of codons in M.

capricolum seems to occur to discriminate against G and C and to use A and U

wherever possible.

The generalc DNAs of M. capricolum as well as all the known species

8216

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 9: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

belonging to the class Mollicute (Mycoplasma, Acholeplasma, Ureaplasma and

Splroplasma) are rich In A and T (61-78Z), suggesting that the Mollicute has

evolved with a constraint keeping the high A + T contents in their genomes.

The obvious preference of A and T in M. capricolum is clearly seen not only

in the codons, but also in the spacers. In fact, the spacers between rRNA

genes of M. capricolum are extremely rich in A and T (13). Also, the A + U

content of rRNAs from mycoplasmas is highest in prokaryotes (14). Thus, in

H. capricolum, the constraint for the preferential use of A and T has

operated at the DNA level as a selection force upon the codon choice as well

as the construction of other parts of the genome.

ACKNOWLEDGEMENTS

This work was supported by grants from the Ministry of Education,

Science and Culture, Japan (Nos. 57121003, 57121009 and 58220012) and the

Naito Science Foundation Research Grant (81-106).

* On leave from Department of Biochemistry and Biophysics, Research Institute for NuclearMedicine and Biology, Hiroshima University, Hiroshima 734, Japan.

REFERENCES1. Maniloff, J. and Horowitz, H. J. (1972) Bacteriol. Rev. 36, 263-290.2. Razin, S. (1978) Microbial. Rev. 42, 414-470.3. Kawauchi, Y., Muto, A., Yamao, F. and Osawa, S. (1984) Mol. Gen. Genet,

in press.4. Oka, A., Sugisaki, H. and Takanami, M. (1981) J. Mol. Biol. 147,

217-226.5. Sanger, F., Wicken, S. and Coulson, A. R. (1977) Proc. Natl. Acad. Sci.

USA 74, 5463-5467.6. Messing, J. and Vieira, J. (1982) Gene 19, 269-276.7. Allen, G. and Wittmann-Liebold, B. (1978) Hoppe-Seyler's Z. Physiol.

Chem. 359, 1509-1525.8. Chen, R., Afrsten, U. and Chen-Schmeisser (1977) Hoppe-Seyler's Z.

Physiol. Chem. 358, 531-535.9. Higo, K., Otaka, E. and Osawa, S. (1982) Mol. Gen. Genet. 185, 239-244.

10. Higo, K., Itoh, T., Kumazaki, T. and Osawa, S. (1980) in Genetics andEvolution of RNA polymerase, tRNA and Ribosomes. Osawa, S., Ozeki, H.,Uchida, H. and Yura, T. eds. pp.655-666, Tokyo Univ. Press, Tokyo.

11. Cerretti, D. P., Dean, D., Davis, G. R., Bedwell, D. M. and Nomura, M.(1983) Nucl. Acids Res. U , 2599-2616.

12. Neimerk, H. C. (1970) J. Gen. Microbiol. 63, 2491263.13. Sawada, M., Muto, A., Iwami, M., Yamao, F. and Osawa, S. (1984) Mol.

Gen. Genet, in press.14. Ryan, J. L. and Horowitz, H. J. (1969) Proc. Natl. Acad. Sci. USA 63,

1282-1289.

8217

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018

Page 10: O IRL Press Limited, Oxford, England. 8209

Nucleic Acids Research

Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018