coghlan wolfe, 2000

Author: juan-ramirez

Post on 06-Jan-2016




0 download

Embed Size (px)




  • Research Article

    Relationship of codon bias to mRNA concentrationand protein length in Saccharomyces cerevisiae

    Avril Coghlan and Kenneth H. Wolfe*Department of Genetics, Smurfit Institute, University of Dublin, Trinity College, Dublin 2, Ireland

    *Correspondence to:K. H. Wolfe, Department ofGenetics, University of Dublin,Trinity College, Dublin 2, Ireland.E-mail: [email protected]

    Received: 11 February 2000

    Accepted: 17 April 2000


    In 1982, Ikemura reported a strikingly unequal usage of different synonymous codons, in

    five Saccharomyces cerevisiae nuclear genes having high protein levels. To study this trendin detail, we examined data from three independent studies that used oligonucleotide arrays

    or SAGE to estimate mRNA concentrations for nearly all genes in the genome.

    Correlation coefficients were calculated for the relationship of mRNA concentration to

    four commonly used measures of synonymous codon usage bias: the codon adaptation

    index (CAI), the codon bias index (CBI), the frequency of optimal codons (Fop), and the

    effective number of codons (Nc). mRNA concentration was best approximated as an

    exponential function of each of these four measures. Of the four, the CAI was the most

    strongly correlated with mRNA concentration (rs=0.62t0.01, n=2525, p

  • predictor of expression levels (i.e., either proteinlevels or mRNA levels) in S. cerevisiae, althoughthis is now largely obsolete due to the availability ofdata from direct studies of genome-wide transcriptlevels. Here, we retrospectively examine the rela-tionship between codon bias and mRNA levels inyeast.

    The release of the genome sequence (Goffeauet al., 1996) and the development of new technol-ogies have allowed synonymous codon bias andmRNA concentrations to be quantified for all S.cerevisiae genes, and protein concentrations to bemeasured for many. This has prompted three recentstudies of the relationships between these variablesin S. cerevisiae. Futcher et al. (1999) analysed theirown protein concentration data from two-dimensional gel electrophoresis, and a combinationof mRNA concentration data from Velculescuet al.s (1997) serial analysis of gene expression(SAGE) experiment and from the high-densityoligonucleotide array (HDA) experiment ofWodicka et al. (1997). Gygi et al. (1999) comparedtheir own protein concentration data from two-dimensional gel electrophoresis to the same SAGEmRNA concentration data set (Velculescu et al.,1997). Furthermore, in a third recent study, Pavesi(1999) also examined data from the same SAGEexperiment (Velculescu et al., 1997). A strongcorrelation between mRNA concentration andcodon bias [the codon adaptation index (CAI) ofSharp and Li, 1987] was reported by Futcher et al.(1999) for 71 genes; Pavesi (1999) confirmed this forboth CAI and CBI [the codon bias index (CBI) ofBennetzen and Hall, 1982] for 72 genes (nocorrelation coefficients given). Protein concentra-tion was also reported to be strongly correlatedwith codon bias (CAI) (rs=0.80, n=71, p

  • number of codons (Nc), and three H1-basedmeasures, CAI, CBI, and Fop, as described brieflybelow.

    The effective number of codons (Nc; Wright,1990) measures deviation from equal use of synon-ymous codons. The Nc is the number of codonsthat, if used equally, would generate the level ofcodon bias observed. Nc takes values from 20.0(maximum bias) to 61.0 (no bias). The CAI (Sharpand Li, 1987) is a measure of synonymous codonusage bias in the direction of the bias seen in areference set of 24 S. cerevisiae genes having highprotein levels. CAI takes values from 0.0 (no bias)to 1.0 (maximum bias). The CBI (Bennetzen andHall, 1982) is a measure of the frequency of optimalcodons. Its 22 optimal codons are those present inmore than 85% of cases in the S. cerevisiae ADH1,TDH2 and TDH3 genes, and are complementary tothe anticodons of the major S. cerevisiae tRNAspecies. CBI generally takes values from 0.0 (nobias) to 1.0 (maximum bias), although negative CBIvalues can occur if optimal codons occur less oftenin a gene than in a sequence having an equal use ofthe synonymous codons for each amino acid. Thefrequency of optimal codons (Fop; Ikemura, 1985)measures the frequency of optimal codons chosenbased on tRNA anticodon sequences and isoaccep-tor tRNA concentrations. The set of 22 optimalcodons used to calculate Fop is almost the same asthat used for CBI, except that in Fop GCC (Ala) isexcluded and GAC (Asp) is included (Sharp andCowe, 1991). Fop takes values from 0.0 (no bias) to1.0 (maximum bias).

    Codon bias in each open reading frame (ORF)was calculated using a modified version of theFORTRAN 77 program CODONS (version 1.4); (Lloyd and Sharp,1992a). We altered CODONS so that it couldaccept more ORFs, and added a subroutine so itcould calculate CBI, as well as CAI, Fop, and Nc.

    Correspondence analysis of synonymouscodon usage

    Correspondence analysis is a multivariate statisticalmethod often used to analyse synonymous codonusage. It identifies the main trends in data as aseries of orthogonal axes in an n-dimensionalhyperspace. The first axis explains the highestproportion of the variation in the data, andsuccessive axes a decreasing proportion. The first

    axis in correspondence analysis of synonymouscodon usage in S. cerevisiae is hypothesized toreflect a relationship between codon bias andprotein or mRNA concentration (Lloyd andSharp, 1992b; Sharp and Cowe, 1991). To comparethe correlation between the first axis and mRNAlevels to that between mRNA levels and CAI/CBI/Fop/Nc, we carried out correspondence analysis ofthe synonymous codon usage of S. cerevisiae,estimated by relative synonymous codon usages(RSCUs). The RSCU of a codon is the observedfrequency of the codon divided by the frequencyexpected if all the synonyms for that aminoacid were used equally (Sharp et al., 1986).Correspondence analysis on RSCUs was implemen-ted using the program CodonW (version 1.4.2); (Peden J. F.,unpublished).

    Sequence data

    We obtained the S. cerevisiae nuclear genomesequence from (October 1998). This databaseincluded the coding sequence for 6217 currentstandard ORFs, with the introns removed. Weexcluded mitochondrial and plasmid ORFs, sincetheir codon usage differs from that of nuclear genes.

    mRNA concentration data

    We studied genome-wide mRNA concentrationdata from three research groups: one group usedthe SAGE technique (Velculescu et al., 1997), whilethe other two groups used HDAs (Cho et al., 1998;Holstege et al., 1998). Holstege et al. (1998) andVelculescu et al. (1997) estimated mRNA concen-tration in mRNA transcripts per cell. The thirddata set (Cho et al., 1998) is in units of normalizedfluorescence intensity; mRNA concentration isdirectly proportional to fluorescence intensity(Wodicka et al., 1997).

    The SAGE data (Velculescu et al., 1997) cover4665 ORFs, with mRNA concentration estimatesfrom 0.3 to over 200.0 transcripts per cell. Weobtained the data from (December 1998).Velculescu et al. (1997) calculated mRNA concen-tration for each ORF by assuming 15 000 tran-scripts per cell in total. For each SAGE tag, weaveraged the three tag concentration estimates fromlogarithmic phase, from S phase-arrested cells, and

    Unequal usage of different synonymous codons in yeast 1133

    Copyright # 2000 John Wiley & Sons, Ltd. Yeast 2000; 16: 11311145.

  • 1134 A. Coghlan and K. H. Wolfe

    Copyright # 2000 John Wiley & Sons, Ltd. Yeast 2000; 16: 11311145.

  • from G2/M phase-arrested cells. We then calculatedthe mRNA concentration for each ORF to be thesum of the concentrations of its corresponding tags,as did Velculescu et al. (1997). One unsettled issue iswhether data from a non-unique SAGE tag canreliably be divided among its ORFs of origin (e.g.sharing data between the related genes TDH2 andTDH3 from a tag contained in both of them). Weconsidered that including just data from uniquetags for an ORF would introduce less error than(somehow) including data from both non-uniqueand unique tags. Thus, we discarded data fromnon-unique tags, and included data correspondingto unique type 1 SAGE tags (within the ORF) andunique type 2 SAGE tags (within the 3k-untranslatedregion). Our sample of the SAGE data included 5358tags (from 3817 genes). In contrast, other analyses ofthe same SAGE data have included all the tags fromORFs with non-unique tags (Futcher et al., 1999;Gygi et al., 1999; Pavesi, 1999; Velculescu et al.,1997). Our genome-wide correlation coefficients(Table 2) were little affected by excluding the incon-clusive data from non-unique tags.

    The data of Cho et al. (1998) were obtained fromhttp : / / genomics . stanford . edu / yeast / cellcycle. html(October 1998). They estimated mRNA concentra-tions by using commercial oligonucleotide arrays: S.cerevisiae Ye6100 GeneChips from Affymetrix.They took measurements for each of 6218 ORFsin synchronized cells every 10 min during themitotic cell cycle. For each ORF, the data consistof 17 values, which are in units of normalizedfluorescence intensity and correspond to the time-points between 0 and 160 min after cell cyclereinitiation. We used both the average fluorescenceintensity during the cell cycle and the peakfluorescence intensity in our analysis of Cho et al.sdata. They normalized their raw fluorescenceintensity data to account for fluctuations in hybri-dization conditions during the cell cycle, assumingthat total mRNA concentration in the cell remains

    constant throughout. These normalization calcula-tions (described at the web-site above) perhapsreduced their datas accuracy for some ORFs. Toavoid temperature-induced effects caused by arrest-ing the cell cycle, we analysed data only from time-points more than 40 min past release from arrest(just the last 13 data points). We examined dataonly for probes that hybridize to a unique targetORF. Where two different probes were targeted tothe two exons of a split gene, we took data for thesecond exon only, since the first exon is usuallyshort in S. cerevisiae.

    We obtained the data of Holstege et al. (1998)from 1999). Using the same Affymetrix oligo-nucleotide arrays as Cho et al. (1998), theyestimated mRNA concentration for 5460 ORFs.

    Statistical analysis

    We used DataDesk 5.0.1 and Excel 4.0 forstatistical analysis.


    Concurrence between mRNA concentrationsestimated using SAGE and two oligonucleotidearray experiments

    The three whole-genome mRNA concentrationstudies analysed here all employed similar yeaststrains grown under similar conditions, so wecompared the mRNA levels reported for genes inthe different experiments, using loglog plots(Figure 1). Taking data from the two groups thatused the same commercial oligonucleotide array,the correlation between the average fluorescenceintensity during the cell cycle, as measured by Choet al. (1998), and the mRNA concentration data ofHolstege et al. (1998) was rs=0.68t0.01 (n=4360,p

  • estimated mRNA concentrations during particularcell cycle stages, the correlations between the dataof Cho et al. (1998; HDA method) and Velculescuet al. (1997; SAGE method) are rs=0.41t0.02(n=3094, p
  • CBI and Fop intermediate. CAIs performance wassignificantly better than Nc (but not CBI or Fop) forthe SAGE data and for the oligonucleotide arraydata of Holstege et al. (1998). For the oligonucleo-tide array data of Cho et al. (1998), there was nosignificant difference between correlations with thefour different codon bias measures. From corre-spondence analysis on RSCUs, position on axis 1did not correlate with mRNA levels as well as CAI,CBI or Fop or Nc (Table 2). This may be becauseour correspondence analysis included very low biasgenes as we used no low-bias cut-off, unlike for

    CAI/CBI/ Fop /Nc. Alternatively, factors other thanmRNA concentration may also affect axis 1 in S.cerevisiae; for example, we found protein length tobe weakly correlated with axis 1 (rs=0.14t0.02,n=3401, p

  • CBI third best. There is almost no relationship

    between mRNA levels and codon bias (measured by

    any method) for mRNAs with fewer than about five

    transcripts per cell (Figure 2). This is to be expected

    because, first, there is little consistency among the

    different experiments for transcripts present atbelow five transcripts per cell (Figure 1), and

    second, selection for translationally optimal

    codons is hypothesized to be less effective in low-

    abundance mRNAs (Sharp and Li, 1986). For

    comparison with Figure 2A, loglog plots of CAI

    vs. the data of Velculescu et al. and Cho et al. areshown in Figures 3 and 4, respectively.

    Protein length is negatively correlated withmRNA concentration in S. cerevisiae geneswith the same level of codon bias

    To take into account the dependencies of bothprotein length and mRNA concentration on CAI,we controlled for CAI (i.e. the effect of CAI waseliminated) by calculating the partial correlation

    1138 A. Coghlan and K. H. Wolfe

    Copyright # 2000 John Wiley & Sons, Ltd. Yeast 2000; 16: 11311145.

  • coefficient of mRNA concentration and protein

    length, excluding CAI (Bailey, 1995). We excluded

    proteins of < 100 codons to avoid sampling effectsin calculating the CAI. When we controlled for CAI

    in this way, mRNA concentration from Holstege

    et al. (1998) showed a weak negative partial

    correlation with protein length (partial

    rs=0.23t0.01, n=4765, p

  • with the same mRNA concentration. For proteins


  • based on the observed fluorescence intensity, with a

    greater than 95% probability that the actual

    concentration is between 0.5X and 2X. Estimates

    of mRNA concentrations in duplicate array experi-

    ments varied more for some genes than for others;

    these genes were found to be very sensitive to

    Figure 4. Average (A) and maximum (B) normalized fluorescence intensity measurements (a measure of mRNAconcentration) from Cho et al. (1998) plotted on a loglog scale against CAI for 2458 S. cerevisiae ORFs judged to havetranslational codon bias. The y axes in (A) and (B) are the average and maximum, respectively, of 13 fluorescence intensitymeasurements during the mitotic cell cycle. The regression curves for the non-log transformed data (dashed lines) are: (A)y#x11,736.15+e(9.36+0.26x); and (B) y#x2,398.69+e(7.73+1.61x)

    Unequal usage of different synonymous codons in yeast 1141

    Copyright # 2000 John Wiley & Sons, Ltd. Yeast 2000; 16: 11311145.

  • growth conditions such as cell density and micro-environment (Lander, 1999; Wodicka et al., 1997).There is so far no published in-depth study of theprecision of SAGE, although it is likely that SAGEconcentration estimates for low abundance mRNAshave low precision (Futcher et al., 1999; Gygi et al.,1999).

    Accuracy of mRNA concentration estimates canbest be judged by comparing data from differenttechniques, but no previous studies have comparedSAGE results for a large number of genes tomRNA concentrations estimated by other methods(e.g. with oligonucleotide arrays, cDNA microar-rays, or Northern blots). However, in several smallstudies the abundance of SAGE tags agreed wellwith relative mRNA concentration estimates fromcDNA hybridization experiments (Velculescu et al.,1995) or Northern blots (Velculescu et al., 1997;Zhang et al., 1997). HDA mRNA concentrationestimates for 17 S. cerevisiae genes agreed well withNorthern blots (Galitski et al., 1999), but fromunpublished data it has been speculated that arraysunderestimate high mRNA concentrations andoverestimate low mRNA concentrations (Anon-ymous, 1998; Futcher et al., 1999). Thus, it ishypothesized that SAGE is more accurate thanHDAs for abundant mRNAs (Futcher et al., 1999).Indeed, the SAGE estimates of Velculescu et al.(1997) are higher than the HDA estimates ofHolstege et al. (1998) for most abundant mRNAs(Figure 1A).

    In S. cerevisiae mRNA concentration iscorrelated with codon bias

    Why is this so? Two variables can be linearlycorrelated if one is the cause of the other, if theyinteract with each other, or if they are both effectsof another third variable (Campbell, 1974). Thereason why codon bias and mRNA concentrationare correlated is an interesting case of cause andeffect. In a living cell, codon bias may affecttranscriptional rate and mRNA stability and so bea contributory cause of mRNA concentration (Xia,1996). Messenger RNA concentration is withoutdoubt a necessary and sufficient cause of proteinconcentration, and codon bias debatably affectstranslation rate and so probably is also a smallcontributory cause of protein concentration(Kurland, 1987; Sharp and Li, 1986; Sharp et al.,1993; Xia, 1998). Over evolution, selection for

    protein concentration may have been a contributorycause of codon bias, via selection for translationalefficiency, and of mRNA concentration (Sharp andLi, 1986; Sharp et al., 1993). In addition selectionfor mRNA concentration may have been a con-tributory cause of codon bias, via selection fortranscriptional efficiency and mRNA stability (Xia,1996). That is, codon bias and mRNA concentra-tion, codon bias and protein concentration, andmRNA concentration and protein concentrationmay all have interacted over time; these threeinteractions may all have contributed to thecorrelation between codon bias and mRNA con-centration observed in this and other recentstudies in S. cerevisiae (Futcher et al., 1999;Pavesi, 1999) and in C. elegans, D. melanogasterand A. thaliana (Duret and Mouchiroud, 1999).How strongly are codon bias and mRNA concen-tration correlated when the effect of proteinconcentration is eliminated? This would be quanti-fied by partial correlation, which unfortunately wasnot calculated in studies of the relationshipsbetween codon bias and mRNA and protein levelsin S. cerevisiae (Futcher et al., 1999; Gygi et al.,1999).

    CAI is the best predictor of mRNAconcentration in S. cerevisiae

    Of the four codon bias measures studied, the CAIwas the best predictor of mRNA concentration(Table 3). It was not surprising that Nc was themost weakly correlated with mRNA concentration,since Nc is a H0

    *-based codon bias measure, whilethe CAI, CBI and Fop are H1-based measures. Eventhough Nc is a H0

    *-based measure, it was stillstrongly correlated with mRNA concentration(Table 2). This is because GC3s in S. cerevisiae isreasonably close to 50% (it is in the range 3545%)and so there is little influence from mutational biason codon usage (and so on Nc) in S. cerevisiae(Wright, 1990). Nc may, however, be a poorpredictor of mRNA concentration in species inwhich mutational bias is more pronounced (Wright,1990).

    Codon bias is often an imperfect measure ofmRNA abundance due to differential gene regula-tion. For example, Holstege et al. (1998) detectedrelatively low mRNA levels for the high-bias geneENO1 (17.1 transcripts per cell; CAI=0.87). ENO2,its homologue, had transcript levels more usual for

    1142 A. Coghlan and K. H. Wolfe

    Copyright # 2000 John Wiley & Sons, Ltd. Yeast 2000; 16: 11311145.

  • their high codon bias (61.1 transcripts per cell;CAI=0.89). ENO2 is dramatically induced byglucose (McAlister and Holland, 1982), whileENO1 is glucose-repressed (Carmen et al., 1995).Thus, under the glucose-rich growth conditionsused by Holstege et al. (1998), S. cerevisiaetranscribed abundant ENO2 mRNA and littleENO1 mRNA.

    Why other estimators fall short of CAI is aninteresting puzzle. There are three reasons. First,CAI assigns a relative translational adaptivenessto each codon in the range 0.0 (non-optimal) to 1.0(optimal), while CBI and Fop assign each codon anon-relative integer score of either 0 (non-optimal)or 1 (optimal) (Bennetzen and Hall, 1982; Ikemura,1985; Sharp and Li, 1987). Thus, CBI and Fop arecoarse-grained as compared to CAI. Second, simu-lations (Comeron and Aguade, 1998) demonstratedthat CAI has almost no systematic errors dependenton sequence length, and has low dispersion underdifferent sequence length and codon bias condi-tions. CBI and Fop were not analysed but, givenfindings for other codon bias measures (Comeronand Aguade, 1998), it would be worth investigatingwhether CBI and Fop have greater systematic errorsand dispersion than CAI. Third, CAI quantifies theoptimality of codons from their frequencies in a setof reference genes having high protein levels, somay be a good predictor of mRNA concentrationby default.

    CAI is an unreliable predictor of mRNA con-centration, as the residuals for our regression curvewere proportionately large (Figure 2); however, thiswas perhaps partly due to errors in the mRNAconcentration estimates. CAI gives little insight intowhy its codons are preferred since, unlike CBI andFop, its optimal codons were not chosen usingphysicochemical data. Another drawback of CAI isits inability to predict changes in mRNA or proteinconcentrations with changing physiological condi-tions. Indeed, equations constructed from physico-chemical data may be a much more accurate waythan CAI to predict protein concentrations; forexample, from data such as codon usage, tRNAconcentrations, ribosome binding site strengths,codon-anticodon binding energies, transcriptlengths and protein half-lives. One possibility isthat CBI or Fop could be updated using better S.cerevisiae empirical data (Percudani et al., 1997)and theoretical models of protein production(Solomovici et al., 1997; Xia, 1996, 1998). Such an

    improved translational codon bias measure mightprovide insights into the relative importance ofcontributory causes of protein concentration. Incontrast, to detect transcriptional selection oncodon usage (Xia, 1996), a completely new codonbias measure is needed. But will transcriptionalcodon bias turn out to be more or less correlatedwith mRNA concentration than is translationalcodon bias?

    Protein length is negatively correlated withmRNA concentration in S. cerevisiae geneswith the same level of codon bias

    When we controlled for CAI, mRNA concentrationshowed a weak negative partial correlation withprotein length (partial rs=0.23t0.01, n=4765,p

  • effect of mRNA concentration is eliminated (Davis,

    1985). It has been found that CAI and protein

    length are positively correlated in S. cerevisiae

    (Moriyama and Powell, 1998) and E. coli (Eyre-

    Walker, 1996; Moriyama and Powell, 1998) riboso-

    mal protein genes; all ribosomal protein genes have

    approximately equal protein concentrations. It has

    been hypothesized that, since long proteins are

    energetically more expensive to produce, transla-

    tional selection for codons which minimize missense

    errors during translation is more effective in long

    genes (Eyre-Walker, 1996).In contrast to our results, negative correlations

    have been reported between CAI and protein

    length in S. cerevisiae and. D. melanogaster

    (Moriyama and Powell, 1998; Powell and

    Moriyama, 1997), and between Fav (wFop) andprotein length in D. melanogaster, C. elegans and

    A. thaliana (Duret and Mouchiroud, 1999). Why

    did Moriyama and Powell (1998) reach different

    conclusions than us for S. cerevisiae? First, they

    did not consider that dependencies of codon bias

    and protein length on mRNA concentration will

    distort the codon biasprotein length bivariate

    correlation. Second, they used Pearson product

    moment correlation (rp), which requires that both

    variables be Gaussian, otherwise p-values for rp are

    meaningless (Bailey, 1995). Since they excluded

    proteins of

  • between protein and mRNA abundance in yeast. Mol Cell Biol

    19: 17201730.

    Holstege FC, Jennings EG, Wyrick JJ, Lee TI, Hengartner CJ,

    Green MR, Golub TR, Lander ES, Young RA. 1998.

    Dissecting the regulatory circuitry of a eukaryotic genome.

    Cell 95: 717728.

    Ikemura T. 1982. Correlation between the abundance of yeast

    transfer RNAs and the occurrence of the respective codons in

    protein genes. Differences in synonymous codon choice

    patterns of yeast and Escherichia coli with reference to the

    abundance of isoaccepting transfer RNAs. J Mol Biol 158:


    Ikemura T. 1985. Codon usage and tRNA content in unicellular

    and multicellular organisms. Mol Biol Evol 2: 1334.

    Kurland CG. 1987. Strategies for efficiency and accuracy in gene

    expression 1. The major codon preference: a growth optimiza-

    tion strategy. Trends Biochem Sci 12: 126128.

    Lander ES. 1999. Array of hope. Nature Genet 21: 34.

    Lloyd AT, Sharp PM. 1992a. CODONS: a microcomputer

    program for codon usage analysis. J Hered 83: 239240.

    Lloyd AT, Sharp PM. 1992b. Evolution of codon usage patterns:

    the extent and nature of divergence between Candida albicans

    and Saccharomyces cerevisiae. Nucleic Acids Res 20:


    McAlister L, Holland MJ. 1982. Targeted deletion of a yeast

    enolase structural gene. Identification and isolation of yeast

    enolase isozymes. J Biol Chem 257: 71817188.

    Moriyama EN, Powell JR. 1998. Gene length and codon usage

    bias in Drosophila melanogaster, Saccharomyces cerevisiae and

    Escherichia coli. Nucleic Acids Res 26: 31883193.

    Pavesi A. 1999. Relationships between transcriptional and

    translational control of gene expression in Saccharomyces

    cerevisiae: a multiple regression analysis. J Mol Evol 48:


    Percudani R, Pavesi A, Ottonello S. 1997. Transfer RNA gene

    redundancy and translational selection in Saccharomyces

    cerevisiae. J Mol Biol 268: 322330.

    Powell JR, Moriyama EN. 1997. Evolution of codon usage bias

    in Drosophila. Proc Natl Acad Sci USA 94: 77847790.

    Press WH, Teukolsky SA, Vetterling WT, Flannery BP. 1994.

    Numerical Recipes in C, 2nd edn. Cambridge University Press:

    Cambridge; p.275.

    Sharp PM, Cowe E. 1991. Synonymous codon usage in

    Saccharomyces cerevisiae. Yeast 7: 657678.

    Sharp PM, Li WH. 1986. An evolutionary perspective on

    synonymous codon usage in unicellular organisms. J Mol

    Evol 24: 2838.

    Sharp PM, Li WH. 1987. The codon Adaptation Indexa

    measure of directional synonymous codon usage bias, and its

    potential applications. Nucleic Acids Res 15: 12811295.

    Sharp PM, Stenico M, Peden JF, Lloyd AT. 1993. Codon usage:

    mutational bias, translational selection, or both? Biochem Soc

    Trans 21: 835841.

    Sharp PM, Tuohy TM, Mosurski KR. 1986. Codon usage in

    yeast: cluster analysis clearly differentiates highly and lowly

    expressed genes. Nucleic Acids Res 14: 51255143.

    Solomovici J, Lesnik T, Reiss C. 1997. Does Escherichia coli

    optimize the economics of the translation process? J Theor Biol

    185: 511521.

    VanBogelen RA, Greis KD, Blumenthal RM, Tani TH,

    Matthews RG. 1999. Mapping regulatory networks in micro-

    bial cells. Trends Microbiol 7: 320328.

    Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. 1995. Serial

    analysis of gene expression. Science 270: 484487.

    Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai MA,

    Bassett DE Jr, Hieter P, Vogelstein B, Kinzler KW. 1997.

    Characterization of the yeast transcriptome. Cell 88: 243251.

    Wodicka L, Dong H, Mittmann M, Ho MH, Lockhart DJ. 1997.

    Genome-wide expression monitoring in Saccharomyces cerevi-

    siae. Nature Biotechnol 15: 13591367.

    Wright F. 1990. The effective number of codons used in a gene.

    Gene 87: 2329.

    Xia X. 1996. Maximizing transcription efficiency causes codon

    usage bias. Genetics 144: 13091320.

    Xia X. 1998. How optimized is the translational machinery in

    Escherichia coli, Salmonella typhimurium and Saccharomyces

    cerevisiae? Genetics 149: 3744.

    Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH,

    Hamilton SR, Vogelstein B, Kinzler KW. 1997. Gene expres-

    sion profiles in normal and cancer cells. Science 276:


    Unequal usage of different synonymous codons in yeast 1145

    Copyright # 2000 John Wiley & Sons, Ltd. Yeast 2000; 16: 11311145.