Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream standby sites

Amin Espah Borujeni¹, Anirudh S. Channarasappa¹ and Howard M. Salis¹,²,*

¹Department of Chemical Engineering, Penn State University, University Park, PA 16802, USA and ²Department of Agricultural and Biological Engineering, Penn State University, University Park, PA 16802, USA

*To whom correspondence should be addressed. Tel: +1 814 865 1931; Fax: +1 814 863 1031; Email: salis@psu.edu

Received June 29, 2013; Revised October 17, 2013; Accepted October 24, 2013

Nucleic Acids Research, 2014, Vol. 42, No. 4, 2646–2659. Published online 14 November 2013. doi:10.1093/nar/gkt1139

© The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

The ribosome’s interactions with mRNA govern its translation rate and the effects of post-transcriptional regulation. Long, structured 5′ untranslated regions (5′ UTRs) are commonly found in bacterial mRNAs, though the physical mechanisms that determine how the ribosome binds these upstream regions remain poorly defined. Here, we systematically investigate the ribosome’s interactions with structured standby sites, upstream of Shine–Dalgarno sequences, and show that these interactions can modulate translation initiation rates by over 100-fold. We find that an mRNA’s translation initiation rate is controlled by the amount of single-stranded surface area, the partial unfolding of RNA structures to minimize the ribosome’s binding free energy penalty, the absence of cooperative binding and the potential for ribosomal sliding. We develop a biophysical model employing thermodynamic first principles and a four-parameter free energy model to accurately predict the ribosome’s translation initiation rates for 136 synthetic 5′ UTRs with large structures, diverse shapes and multiple standby site modules. The model predicts and experiments confirm that the ribosome can readily bind distant standby site modules that support high translation rates, providing a physical mechanism for observed context effects and long-range post-transcriptional regulation.

INTRODUCTION

Long 5′ untranslated regions (5′ UTRs) are commonly found in bacterial mRNAs, and their interactions with the ribosome play a central role in controlling translation initiation and post-transcriptional regulation (1–3). Changes in the 5′ UTR sequence, structure or overall shape have been found to alter an mRNA’s translation rate, including when conformational changes are triggered by the binding of proteins or small RNAs, or when cis-acting riboswitches bind their ligand (4–8). However, our understanding of the physical mechanisms that determine the ribosome’s ability to bind long 5′ UTRs remains incomplete. While the ribosome’s interactions with Shine–Dalgarno (SD) sequences and nearby inhibitory mRNA structures have been extensively characterized (9–12) and are well-predicted by biophysical models (13–15), many natural 5′ UTRs possess further upstream sequences and structures whose interactions with the ribosome are still poorly defined (Figure 1A). Here, we use a learn-by-design approach to systematically investigate the ribosome’s interactions with long, structured 5′ UTRs, and to identify new physical mechanisms that determine the ribosome’s ability to bind and initiate translation.

According to crystal and cryo-EM structures, the 30S ribosomal complex appears to bind upstream 5′ UTRs through its platform domain, which is a positively charged surface that binds non-specifically to mRNA’s sugar phosphate backbone (3,16–18). The ribosomal platform first binds an initial landing pad, known as a ‘standby site’ (19) (Supplementary Figure S1), followed by coordinated insertion of the mRNA into the entry channel and the 16S rRNA’s base pairing with SD-like sequences to position the start codon over the P-site (18). Ribosomal S1 protein likely plays a role in remodeling mRNA structure, prior to the formation of the 30S-initiation complex, by binding to single-stranded RNA and promoting the unfolding of RNA hairpins and duplexes; however, free energy must be supplied to unfold RNA structures (20). The mRNA and ribosomal platform bind together independently of other participants, including tRNAfMet and the initiation factors (1,21). Interactions that delay binding to the 5′ UTR can slow the mRNA’s translation initiation rate, including the presence of mRNA structures (17,22), competitive binding by small RNAs (23) or the absence of a standby site



altogether (22). We consider a 5′ UTR as composed of one or more standby site modules, followed by a 16S rRNA binding site (an SD sequence), a spacer region, and ending with a protein coding sequence (Figure 1B). An mRNA that folds into several hairpins, all located upstream of the SD sequence, will contain several standby site modules.

Thermodynamic modeling has enabled a comprehensive dissection of the ribosome’s interactions with mRNA, with clearly defined contributions quantitatively described by their free energies (Figure 1C). The most stringent test of our current understanding is to combine thermodynamic modeling with in silico optimization to design completely novel, non-natural RNA sequences with predicted ribosomal interactions and translation initiation rates. The differences between model predictions and experimental measurements objectively draw the boundaries between our current understanding of RNA interactions and its limitations. A growing body of evidence from our work and others indicates the presence of unpredicted interactions between the ribosome and long 5′ UTRs, highlighted as context effects (Supplementary Figure S1C) (14,24–28).

In this article, we decipher the biophysical rules governing the ribosome’s interactions with long, structured 5′ UTRs. We systematically alter 5′ UTR sequence, structure and shape to measure the ribosome’s interactions using in vivo reporter assays. We use thermodynamic modeling to quantify how the ribosome binds less accessible 5′ UTRs, accommodates or partially unfolds RNA structure, selects from multiple binding sites, slides across the mRNA and initiates translation. We develop a predictive biophysical model that combines thermodynamic minimization with a four-parameter free energy model to accurately predict the translation initiation rates of 136 long, diversely structured 5′ UTRs. Finally, we demonstrate that the ribosome can bind to distant sites and engage in long-range RNA interactions, providing a mechanism for the observed context effects and new ways for riboswitches and small RNAs to regulate translation.

MATERIALS AND METHODS

Strains, media and plasmid construction

Luria-Bertani (LB) media (10 g/l tryptone, 5 g/l yeast extract, 10 g/l NaCl) was obtained from BD and supplemented with 50 μg/ml chloramphenicol (Sigma-Aldrich). The expression system is derived from the pFTV1 plasmid (ColE1, CmR) (14) and encodes a σ70 constitutive promoter (BioBrick J23100), a synthetic 5′ UTR and a codon-optimized mRFP1 fluorescent protein. Synthetic 5′ UTRs were inserted by standard cloning techniques, utilizing either BamHI and SacI or XbaI and SacI sites, depending on 5′ UTR length. 5′ UTRs with desired sequences were created either by annealing complementary oligonucleotides with overhangs or by PCR assembly of oligonucleotides with corresponding restriction sites. Plasmid variants were cloned by restriction digest of DNA backbone, purification and ligation to DNA fragments with 5′ UTR sequences, and transformation of chemically competent Escherichia coli DH10B cells via heat shock. Correct clones were identified by DNA sequencing. All the designed 5′ UTR sequences are presented in the Supplementary Data.

Fluorescent protein and mRNA measurements

Fluorescent protein reporter measurements were performed in 96-well format. An initial 96 deep-well plate containing 700 μl LB and 50 μg/ml chloramphenicol was inoculated from single colonies, with up to 30 different E. coli DH10B cultures in an alternating pattern that excluded the outer wells. Cultures were incubated overnight at 37°C with 200 rpm orbital shaking. A fresh 96-well plate containing 200 μl LB and chloramphenicol media was inoculated by overnight cultures using a 1:100 dilution. Plates were incubated at 37°C in a plate-based spectrophotometer (TECAN M1000) with high orbital shaking. OD600 measurements were recorded every 10 min. Once a culture reached an OD600 of 0.15, a sample of each culture was transferred to a new plate containing 200 μl phosphate buffered saline [NaCl 137 mM, KCl 2.7 mM, phosphate buffer 10 mM (Na2HPO4/KH2PO4), pH 7.4, purchased from OmniPur] and 2 mg/ml kanamycin (Sigma-Aldrich) for flow cytometry measurements. Periodic serial dilutions were repeated two more times, maintaining cultures within the exponential growth phase. Three samples were taken from each culture. The single-cell fluorescence distributions of samples were measured using a Fortessa flow cytometer (BD Biosciences). All distributions were unimodal. The autofluorescence distribution of E. coli DH10B cells was also measured. The arithmetic mean of each distribution was taken, and the mean autofluorescence was subtracted from each sample. The reported fluorescence values in this study are the average of three measurements (n = 3).

For mRNA level measurements, single colonies were first grown overnight in 5 ml LB media with chloramphenicol, then diluted to 0.01 OD600 in fresh media and grown to mid-exponential phase until reaching 1.0 OD600, as measured by a cuvette-based spectrophotometer (NanoDrop 2000C). Total RNA was extracted using the Total RNA Purification kit (Norgen Biotek). Treatment with Turbo DNAse (Ambion) eliminated genomic DNA contamination. Reverse transcription was carried out with a High Capacity cDNA Reverse Transcription kit (Applied Biosystems), followed by quantitative PCR with a TaqMan probe (5′-ACCTTCCATACGAACTTT-3′) targeting the internal coding sequence for mRFP1 (forward primer 5′-ACGTTATCAAAGAGTTCATGCGTTTC-3′ and reverse primer 5′-CGATTTCGAACTCGTGACCGTTAA-3′) and using an ABI 7300 real-time cycler. A TaqMan probe for 16S rRNA was used as an endogenous control. Triplicate measurements were each performed on two separate RNA extractions. Ct calculations were then averaged for each sample.

[Figure 1 panel labels: (A) clcA, flgN and yecH 5′ UTRs; (B) standby site modules, SD, AUG; (C) initial and final states of the 30S ribosome and mRNA, with ΔG_mRNA, ΔG_standby, ΔG_mRNA–rRNA, ΔG_spacing, ΔG_start and ΔG_total.]

Figure 1. (A) Natural E. coli 5′ UTRs have diverse lengths and structures, with varying binding free energy penalties to the ribosomal platform. Green and light blue regions represent SD and start codon, respectively. (B) 5′ UTRs are separated into multiple standby site modules, followed by an SD sequence, spacer region and a protein coding sequence. (C) A schematic shows the mRNA regions that contact the ribosome in its initial and final states. The ribosomal platform binds to 5′ UTRs with a binding free energy penalty ΔG_standby. The sum of all binding free energies, ΔG_total, determines how likely the ribosome binds to an mRNA and initiates translation.

Biophysical model calculations

The complete biophysical model employs a free energy model to calculate the total binding free energy between the ribosome and mRNA according to the equation (Figure 1C):

$$\Delta G_{\text{total}} = \Delta G_{\text{mRNA-rRNA}} + \Delta G_{\text{spacing}} + \Delta G_{\text{start}} + \Delta G_{\text{standby}} - \Delta G_{\text{mRNA}} \tag{1}$$

Using statistical thermodynamics, we may relate this total Gibbs free energy change to the mRNA’s translation initiation rate r according to

$$r \propto \exp(-\beta \, \Delta G_{\text{total}}) \tag{2}$$

This relationship has been previously validated on 132 mRNA sequences where the ΔG_total varied from −10 to 16 kcal/mol, resulting in well-predicted translation rates that varied by over 100 000-fold (14). The apparent Boltzmann constant β has been measured as 0.45 ± 0.05 mol/kcal, which was confirmed in a second study (29).

In the initial state, the mRNA exists in a structured conformation, where its free energy of folding is ΔG_mRNA (ΔG_mRNA is negative). After assembly of the 30S ribosomal subunit, the last nine nucleotides of its 16S rRNA have hybridized to the mRNA, while all non-clashing mRNA structures are allowed to fold. The free energy of folding for this mRNA–rRNA complex is ΔG_mRNA–rRNA (ΔG_mRNA–rRNA is negative). mRNA structures that impede 16S rRNA hybridization or overlap with the ribosome footprint remain unfolded in the final state. These Gibbs free energies are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32).

Three additional interactions will alter the translation initiation rate. The tRNAfMet anti-codon loop hybridizes to the start codon (ΔG_start is most negative for AUG and GUG). The 30S ribosomal subunit also prefers a five-nucleotide aligned spacing (33). Non-optimal distances between the 16S rRNA binding site and the start codon lead to an energetic binding penalty. This relationship between the ribosome’s binding penalty (ΔG_spacing > 0) and aligned spacing distance was measured in our previous study (14). Finally, the 5′ UTR binds to the ribosomal platform with a free energy penalty ΔG_standby, which is calculated according to Equation (5) and described in the main text. A maximally accessible standby site module will bind to the ribosomal platform with a zero ΔG_standby. Gibbs free energy calculations for all mRNA sequences are listed in the Supplementary Data.
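As a rough illustration of how Equations (1) and (2) combine, the sketch below sums the five free energy terms and converts the total into a relative translation initiation rate. The function names and all numerical energies are hypothetical and chosen only for illustration; the value of β is the one quoted in the text. Because Equation (2) is a proportionality, only ratios of rates are meaningful here.

```python
import math

BETA = 0.45  # apparent Boltzmann constant in mol/kcal, as quoted in the text


def total_binding_free_energy(dg_mrna_rrna, dg_spacing, dg_start, dg_standby, dg_mrna):
    """Equation (1): final-state terms minus the mRNA folding energy (kcal/mol)."""
    return dg_mrna_rrna + dg_spacing + dg_start + dg_standby - dg_mrna


def relative_initiation_rate(dg_total, beta=BETA):
    """Equation (2), up to an unknown proportionality constant."""
    return math.exp(-beta * dg_total)


# Hypothetical energies (kcal/mol): two mRNAs that differ only in dG_standby.
dg_accessible = total_binding_free_energy(-12.0, 0.5, -1.2, 0.0, -8.0)
dg_structured = total_binding_free_energy(-12.0, 0.5, -1.2, 3.0, -8.0)

fold_change = relative_initiation_rate(dg_accessible) / relative_initiation_rate(dg_structured)
print(f"dG_total: {dg_accessible:.1f} vs {dg_structured:.1f} kcal/mol")
print(f"predicted fold change in initiation rate: {fold_change:.2f}")
```

In this toy case a 3 kcal/mol standby penalty lowers the predicted rate by a factor of exp(0.45 × 3), independent of the other terms, since they cancel in the ratio.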

The biophysical model automatically decomposes a 5′ UTR into one or more standby site modules using the following definitions. A standby site module contains a single mRNA hairpin surrounded by two single-stranded regions: an upstream distal binding site and a downstream proximal binding site. The lengths of the distal and proximal binding sites are denoted by D and P, respectively. The hairpin’s height H is determined by the number of nucleotides from the hairpin’s base to the mid-point of the hairpin’s loop, including bulges, internal loops and base pairs. The heights of branched hairpins are determined by averaging the branches’ heights (Supplementary Figure S3). Hairpins are only considered as components of standby site modules if they are mutually exclusive with the 16S rRNA binding site. A 5′ UTR with multiple upstream hairpins will contain multiple standby site modules. We consider a 5′ UTR without upstream hairpins to contain a single standby site module with zero hairpin height and zero distal site length. Using these definitions, the biophysical model calculates a ΔG_standby for each standby site module.

The available single-stranded surface area A_s of a standby site module is determined according to A_s = 15 + P + D − H. The key aspect of this equation is that proximal length, distal length and hairpin height have equal and interchangeable effects on the amount of single-stranded surface area. The constant 15 was selected to create a positive metric for single-stranded surface area. Based on data shown in Figure 2, increasing the hairpin height beyond 15 nt did not reduce the mRNA translation rate, regardless of the proximal or distal site lengths. Accordingly, the value of H in the equation is not allowed to exceed 15 nt.
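The surface area definition, including the 15 nt cap on hairpin height, can be written as a one-line function. This is a minimal sketch; the function and constant names are illustrative and not taken from the authors' software.

```python
MAX_EFFECTIVE_HEIGHT = 15  # hairpin height H is not allowed to exceed 15 nt
OFFSET = 15                # constant chosen in the text to keep A_s positive


def surface_area(proximal, distal, height):
    """A_s = 15 + P + D - H, with H capped at 15 nt (all lengths in nt)."""
    effective_height = min(height, MAX_EFFECTIVE_HEIGHT)
    return OFFSET + proximal + distal - effective_height


# A module with P = 2 nt, D = 2 nt and H = 15 nt is largely inaccessible.
print(surface_area(proximal=2, distal=2, height=15))   # 4
# Raising the hairpin height beyond 15 nt leaves A_s unchanged.
print(surface_area(proximal=2, distal=2, height=18))   # 4
# No upstream hairpin: zero height and zero distal length.
print(surface_area(proximal=10, distal=0, height=0))   # 25
```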

The ribosomal platform’s distortion energy penalty is a quadratic function of the available RNA surface area A_s (Figure 2), according to

$$\Delta G_{\text{distortion}} = C_1 (A_s)^2 + C_2 (A_s) + C_3 \tag{3}$$

where C1, C2 and C3 are the fitted parameter values, equal to 0.038 kcal/mol/nt², −1.629 kcal/mol/nt and 17.359 kcal/mol, respectively. An alternative constant in the definition of surface area would alter these coefficients, but the overall relationship between the surface area and the distortion free energy penalty would remain the same.
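For a quick numerical check of Equation (3), the sketch below evaluates the quadratic at a few surface areas. The coefficient magnitudes are those given in the text; the sign of C2 is taken as negative, which is the reading under which the penalty falls as surface area grows. The function name is illustrative.

```python
# Fitted coefficients of Equation (3); C2 is read as negative (an assumption
# based on the penalty decreasing with increasing surface area).
C1 = 0.038    # kcal/mol/nt^2
C2 = -1.629   # kcal/mol/nt
C3 = 17.359   # kcal/mol


def distortion_penalty(a_s):
    """Quadratic distortion free energy penalty (kcal/mol) for surface area a_s (nt)."""
    return C1 * a_s ** 2 + C2 * a_s + C3


for a_s in (4, 7, 13):
    print(f"A_s = {a_s:2d} nt -> dG_distortion = {distortion_penalty(a_s):.1f} kcal/mol")
```

With these coefficients, surface areas of 4, 7 and 13 nt give penalties of about 11.5, 7.8 and 2.6 kcal/mol.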

[Figure 2 panels: (A) standby site module schematic with distal site, hairpin height, proximal site, SD and AUG; (B) A_s = 15 + P + D − H; (C–E) fluorescence [au] and ΔG_standby [kcal/mol] versus proximal length (P), distal length (D) and hairpin height (H), in nt; (F, G) fluorescence [au] and ΔG_standby [kcal/mol] versus standby site module surface area A_s.]

Figure 2. (A) Standby site modules are defined by a proximal and distal binding site, separated by an RNA structure. (B) Standby site module surface area is calculated by summing the proximal P and distal D binding site lengths and subtracting the hairpin height H. The surface area is kept positive by adding a constant 15. (C–E) Translation rate and ribosome binding free energy penalties are measured as proximal binding site length, distal binding site length and hairpin height are increased. The ΔG_standby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). (C) Hairpin heights are 9 nt (green stars), 12 nt (black asterisks), 15 nt (red squares) or 18 nt (blue circles). (D) Hairpin heights are 15 nt (red squares) or 18 nt (blue circles). (E) Hairpin height varies from 9 to 12 nt by adding to loop length of a 6 bp stem (blue circles) or from 14 to 18 nt by adding to loop length of a 12 bp stem (red squares). (F) 56 characterized 5′ UTRs (A_s < 24) from parts (C–E) are combined to quantitatively relate the standby site module’s surface area to the ribosome’s translation rate and binding free energy penalty. Dashed line is a best-fit quadratic equation. (G) The validity of the quadratic equation is tested by measuring translation rates from 15 additional 5′ UTRs and comparing to model predictions (R² = 0.78, average error is 0.66 kcal/mol). Data points are the result of three measurements in 1 day. In parts (C–G), the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the reference, unstructured 5′ UTR.

Measurements of translation rate and binding free energy penalties

In our steady-state, long-time growth conditions, translation initiation is a rate-limiting step in codon-optimized mRFP1 fluorescent protein expression, fluorescence levels are proportional to translation initiation rates under our controlled growth conditions (13), and translation initiation rate is related to total binding free energies according to a log-linear formula (Equation 2). We measure the change in apparent binding free energy penalty by comparing the fluorescence of a structured 5′ UTR with the fluorescence of a maximally accessible 5′ UTR according to

$$\Delta G_{\text{standby}} = -\frac{1}{\beta} \ln\left(\frac{X}{X_{\text{ref}}}\right) - \Delta G_{\text{other}} \tag{4}$$

where X is the fluorescence from a sample synthetic 5′ UTR, X_ref is the fluorescence of the unstructured 5′ UTR and β is 0.45 mol/kcal. ΔG_other is the difference in free energies between the sample and reference 5′ UTRs that depends on other ribosomal interactions and not on the standby site. By using identical and carefully designed 16S rRNA binding sites, spacer regions and coding sequences in the 136 synthetic mRNA sequences, the value for ΔG_other was zero for 123 mRNA sequences and a maximum of 0.90 kcal/mol for the remaining sequences.
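Equation (4) can be applied directly to a pair of fluorescence readings. The sketch below is only an illustration of that arithmetic: the function name and the fluorescence values are hypothetical, and the signs follow from combining Equations (1) and (2) for a sample and a reference whose standby penalty is zero.

```python
import math

BETA = 0.45  # mol/kcal, as quoted in the text


def standby_penalty_from_fluorescence(x, x_ref, dg_other=0.0, beta=BETA):
    """Equation (4): apparent dG_standby (kcal/mol) from sample and reference fluorescence."""
    if x <= 0 or x_ref <= 0:
        raise ValueError("fluorescence values must be positive after autofluorescence subtraction")
    return -(1.0 / beta) * math.log(x / x_ref) - dg_other


# Hypothetical readings [au]: a structured 5' UTR that is 100-fold dimmer than the reference.
x_ref = 20000.0
x_sample = 200.0
print(f"{standby_penalty_from_fluorescence(x_sample, x_ref):.2f} kcal/mol")

# A sample as bright as the reference has no apparent penalty.
print(f"{standby_penalty_from_fluorescence(x_ref, x_ref):.2f} kcal/mol")
```

A dimmer sample gives a positive penalty, and any known ΔG_other is subtracted so that only the standby site contribution remains.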

Statistical analysis

Two-sample t-tests at a two-tailed 5% statistical significance level were performed to compare the error distributions of different sets of model predictions. The null hypothesis assumed that the errors in two different sets of model predictions have the same mean and variance. The alternative hypothesis is that the errors in the two sets follow different distributions. The error in model prediction, ΔΔG_standby, was calculated according to the formula ΔΔG_standby = measured ΔG_standby − predicted ΔG_standby. Low probabilities (P < 0.05) reject the null hypothesis, showing that the differences between two sets of model predictions were statistically significant.

RESULTS

Standby site module surface area controls the ribosome’s translation rate

We first employed a learn-by-design approach to determine the ribosomal platform’s interactions with standby sites that contain a single hairpin, and therefore contain a single standby site module. Standby site modules are defined by their hairpin height (H), upstream distal binding site length (D) and downstream proximal binding site length (P) (Figure 2A, ‘Materials and Methods’ section). We designed, constructed and characterized 62 synthetic 5′ UTRs containing single standby site modules, with hairpins from 9 to 18 bp long, distal binding sites from 1 to 15 nt long and proximal binding sites from 1 to 24 nt long. Long mRNA hairpins contain internal loops to avoid potential RNAse activity (34). All mRNA sequences are transcribed by the J23100 constitutive promoter, encode the mRFP1 fluorescent protein and are designed with identical SD and spacer sequences (‘Materials and Methods’ section). Flow cytometry measurements from long-time, log-phase cultures with periodic serial dilutions revealed that mRFP1 fluorescence varied by 221-fold (Figure 2C–E). In contrast, mRNA levels varied by at most 5.8-fold according to RT-qPCR measurements (Supplementary Figure S2). Changes in fluorescence are primarily due to changes in translation rate, and in these growth conditions, fluorescence and translation initiation rates are proportional. All sequences and measurement data are listed in the Supplementary Data.

According to the data, standby site module accessibility plays an important role in modulating the ribosome’s interactions with 5′ UTRs and controlling translation rate across a large range. In particular, lengthening the distal binding site by four nucleotides increased the mRNA’s translation rate by 4.2-fold, whereas lengthening the proximal binding site by the same amount increased it by 4.0-fold. Shortening the hairpin’s height by 4 bp increased the mRNA’s translation rate similarly, by 3.1-fold. More generally, the slopes (first derivatives) of the log-transformed translation rates all have similar magnitudes (Supplementary Data, column E). Highly accessible standby site modules, with long proximal binding sites or short hairpins, also cause the mRNA’s translation rate to reach a plateau, with the same fluorescence level as an mRNA with an unstructured 5′ UTR (Figure 2C). These observations led to the hypothesis that the geometric characteristics of the standby site module (D, P, H) are interchangeable and that its available single-stranded surface area, A_s, determines the ribosomal platform’s ability to bind to it.

A standby site module’s single-stranded surface area is the number of nucleotides capable of single-stranded contact with the ribosomal platform. Longer proximal or distal binding sites increase the standby site module’s surface area, while longer hairpin structures reduce the standby site module’s accessibility. mRNA structures can potentially reduce site accessibility by folding over the neighboring single-stranded sites and sequestering binding regions. A standby site module’s available surface area is calculated by summing its proximal and distal binding site lengths and subtracting the hairpin’s height (Figure 2B, ‘Materials and Methods’ section). A highly accessible standby site module has a large A_s, whereas a less accessible standby site module has a small A_s. We then evaluated whether the standby site module’s surface area is the key determinant by comparing A_s to the measured log-transformed translation rates. For all 62 5′ UTRs with diverse geometric features, we found a single quantitative relationship that described the effect of surface area on the mRNA’s translation rate (Figure 2F).

We now turn to thermodynamic modeling to quantify the ribosome’s interactions with standby site modules. The competitive binding of free ribosomes to the pool of intracellular mRNAs allows one to use statistical thermodynamics to relate a ribosome’s binding free energy to its translation initiation rate according to a constant log-linear relationship (Equation 2) (13,14). We calculate the interaction energy between the ribosome and an mRNA’s standby site modules by comparing its translation rate to the translation rate of a reference, unstructured 5′ UTR, both with an identical SD sequence, spacer region and protein coding sequence (Equation 4). Using the data in Figure 2C–E, we found that the standby site binding free energy penalty, ΔG_standby, varied from 0 to 11.8 kcal/mol when changing its geometric characteristics, and that a simple quadratic function succinctly described the effect of site accessibility on the ribosome’s binding free energy penalty (R² = 0.97, average error is 0.51 kcal/mol) (Figure 2F). A maximally accessible standby site module has a zero binding free energy penalty (ΔG_standby = 0), indicating that the mRNA’s translation initiation rate is not reduced by the ribosomal platform’s interactions. A standby site module with a positive binding free energy penalty (ΔG_standby > 0) has reduced the mRNA’s translation initiation rate by decreasing the ribosomal platform’s ability to initially bind the mRNA.

Next, we critically tested the parameterized model by constructing 15 new 5′ UTRs and measuring their translation initiation rates and ribosome binding free energy penalties. The 5′ UTRs each contain a standby site module with diverse structures between 43 and 91 nt long, including variable-length distal and proximal binding sites and hairpin heights, and including two- and three-branched structures (Supplementary Data). For branched structures, we show that the relevant hairpin height is the average across the multiple branches (Supplementary Figure S3) (see also Supplementary Text). The quadratic relationship accurately predicted the ribosome’s translation rate and binding free energy penalty (R² = 0.78, average error is 0.66 kcal/mol), confirming that the standby site module’s accessibility controls the ribosome’s ability to bind and initiate translation (Figure 2G). The quadratic relationship contains three parameters, which are now treated as constants (Equation 3, ‘Materials and Methods’ section).

Coupled energetic trade-offs when unfolding structures in standby site modules

Prior to initiating translation, the 30S ribosome does not have an external source of energy. Initiation factor 2 does not hydrolyze GTP until after translation has been initiated, during 50S recruitment (22). Consequently, the 30S complex expends its own free energy to unfold mRNA structures. Previous studies have shown that mRNA structures that sequester the 16S rRNA binding site are unfolded prior to initiating translation (9,14,22). The mRNA’s translation initiation rate is reduced according to the free energy penalty for unfolding these structures, as described in Equations (1) and (2) (‘Materials and Methods’ section). However, it remains unclear whether mRNA structures located upstream of SD sequences, within standby site modules, are unfolded or accommodated. We next investigated how the ribosome unfolds hairpins in long 5′ UTRs.
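The question of whether an upstream hairpin is unfolded or accommodated can be framed as a small minimization, which the remainder of this section develops. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes the quadratic distortion penalty of Equation (3) with C2 read as negative, assumes each unfolded base pair lowers H by one and raises P and D by one, and uses a hand-written list of cumulative unfolding costs. The costs for 1 bp and 3 bp (0.90 and 2.7 kcal/mol) are the values quoted later in this section for a module with P = 2, D = 2 and H = 15; the entries for 2, 4 and 5 bp are hypothetical placeholders. In practice these costs would come from an RNA folding package.

```python
# Equation (3) coefficients; the negative sign on C2 is an assumption.
C1, C2, C3 = 0.038, -1.629, 17.359


def surface_area(p, d, h):
    """A_s = 15 + P + D - H, with H capped at 15 nt."""
    return 15 + p + d - min(h, 15)


def distortion_penalty(a_s):
    """Quadratic distortion penalty in kcal/mol."""
    return C1 * a_s ** 2 + C2 * a_s + C3


def best_partial_unfolding(p, d, h, cumulative_unfolding_cost):
    """Scan the number of unfolded base pairs and return (n_bp, minimal total penalty).

    cumulative_unfolding_cost[n] is the free energy (kcal/mol) needed to unfold
    the first n base pairs at the hairpin's base; entry 0 must be 0.0.
    """
    best = None
    for n_bp, dg_unfolding in enumerate(cumulative_unfolding_cost):
        a_s = surface_area(p + n_bp, d + n_bp, h - n_bp)
        total = distortion_penalty(a_s) + dg_unfolding
        if best is None or total < best[1]:
            best = (n_bp, total)
    return best


# Costs for 0..5 unfolded bp; values at 2, 4 and 5 bp are hypothetical.
costs = [0.0, 0.9, 1.8, 2.7, 5.5, 8.5]
n_bp, dg_standby = best_partial_unfolding(p=2, d=2, h=15, cumulative_unfolding_cost=costs)
print(f"unfold {n_bp} bp, total penalty {dg_standby:.1f} kcal/mol")
```

With these inputs the scan settles on unfolding 3 bp with a total penalty of about 5.3 kcal/mol, lower than either leaving the hairpin intact or unfolding further into the hypothetical, more costly base pairs.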

By definition, there is a coupled thermodynamic relationship between the unfolding of an mRNA hairpin and increasing the accessibility of the corresponding standby site module, where the accessibility of the standby site module controls the ribosomal platform’s distortion energy penalty ΔGdistortion (see ‘Materials and Methods’ section). Unfolding a single base pair in a module’s hairpin will increase the standby site module’s surface area by three nucleotides: the hairpin’s height decreases by one, and the lengths of both the proximal and distal binding sites increase by one. Therefore, unfolding a module’s hairpin will lead to higher accessibility, a lower ΔGdistortion and a higher translation rate. However, unfolding base pairs requires an input of free energy (ΔGunfolding > 0), which will lead to a lower translation rate. Consequently, there is an energetic trade-off between unfolding mRNA structures and increasing the surface accessibility of standby site modules. The ribosome may completely unfold a structured standby site module to maximize its surface area, but the energetic cost may be high (high ΔGunfolding penalty with ΔGdistortion = 0). In contrast, the ribosome may fully accommodate a standby site module’s hairpin without unfolding it, even though its low surface accessibility will penalize the ribosome’s ability to bind to it (high ΔGdistortion penalty with ΔGunfolding = 0).

Based on thermodynamic principles, we hypothesize that the ribosome will ‘selectively’ and ‘partially’ unfold structured standby site modules to minimize its total binding free energy penalty (ΔGstandby = ΔGdistortion + ΔGunfolding). Figure 3A and B presents an illustrative example of this principle and our use of minimization to calculate the ribosome’s binding free energy penalty to a standby site module. A 5′ UTR with a single standby site module initially contains a short proximal binding site (P = 2 nt), a short distal binding site (D = 2 nt) and a long hairpin (H = 15 nt), resulting in a largely inaccessible 5′ UTR (As = 4). Based on the standby site module’s surface area, the ribosome has an initial distortion energy penalty of 11.5 kcal/mol. However, the ribosome can selectively unfold the hairpin’s closing AU base pairs, increasing the standby site module’s surface area and lowering its ΔGdistortion at the expense of increasing its ΔGunfolding. Unfolding 1 bp increases the standby site module’s surface area by 3 nt, decreases the distortion energy penalty to 7.8 kcal/mol and increases the unfolding energy penalty to 0.90 kcal/mol, leading to a total binding free energy penalty of 8.7 kcal/mol. The ribosome can continue to unfold base pairs to minimize its total binding free energy penalty: unfolding 3 bp at a cost of 2.7 kcal/mol leads to a distortion energy penalty of 2.6 kcal/mol and a total binding free energy penalty of 5.3 kcal/mol (Figure 3B). Two additional examples are shown in Supplementary Figure S4. The unfolding free energies are calculated using the Vienna RNA suite with the Turner 1999 RNA parameters (see ‘Materials and Methods’ section). Importantly, thermodynamic minimization does not add any new parameters to the model.

We tested the proposed mechanism by characterizing the translation rate of 22 synthetic 5′ UTRs (Figure 3C–E) that contain single standby site modules with varied geometries (D = 0 to 7 nt and P = 1 to 20 nt) and differing hairpin sequences, sizes and folding energetics (Supplementary Figure S5A). We employed the minimization principle to calculate the number of unfolded base pairs and to predict the ribosome’s binding free energy penalty ΔGstandby (Figure 3D). As comparisons, we performed the same calculation while asserting either that no additional energy was inputted to unfold hairpins (high ΔGdistortion and ΔGunfolding = 0) (Figure 3C) or that hairpins were completely unfolded by the ribosome (high ΔGunfolding and ΔGdistortion = 0) (Figure 3E).

The data support our hypothesis that the ribosome partially and selectively unfolds structured standby site modules to minimize its total binding free energy penalty (ΔGstandby), rather than separately minimizing its distortion or unfolding penalties (average error is 0.90 kcal/mol when minimizing ΔGstandby, 2.42 kcal/mol when ΔGunfolding = 0 and 10.24 kcal/mol when ΔGdistortion = 0; comparative P-values are 0.004 and 10⁻¹³, respectively). Remarkably, in all 22 cases, only 1–3 bp of each hairpin are predicted by the model to be unfolded, while the remaining large RNA structures are accommodated.
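The minimization over the number of unfolded base pairs can be sketched in a few lines. The example below uses only the three states quoted for the Figure 3A and B illustration (0, 1 and 3 bp unfolded); the function name and the dictionary layout are illustrative assumptions, and in a full calculation the distortion penalties would come from Equation (3) and the unfolding penalties from an RNA folding package rather than being typed in by hand.

```python
def minimize_standby_penalty(candidates):
    """Pick the number of unfolded base pairs with the lowest total penalty.

    candidates maps unfolded bp -> (dG_distortion, dG_unfolding), in kcal/mol.
    Returns (best_bp, total_penalty).
    """
    if not candidates:
        raise ValueError("at least one candidate state is required")
    best_bp = min(candidates, key=lambda bp: sum(candidates[bp]))
    return best_bp, sum(candidates[best_bp])


# States quoted in the text for the illustrative 5' UTR (P = 2, D = 2, H = 15).
example_states = {
    0: (11.5, 0.0),  # hairpin fully accommodated
    1: (7.8, 0.9),   # one closing base pair unfolded
    3: (2.6, 2.7),   # three closing base pairs unfolded
}

if __name__ == "__main__":
    bp, total = minimize_standby_penalty(example_states)
    print(f"unfold {bp} bp, total penalty {total:.1f} kcal/mol")
```

Because the search is a plain minimum over candidate states, it introduces no additional fitted parameters.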


Non-cooperative binding to multiple standby site modules

We have shown that the ribosome can accommodate large structures of standby site modules and engage in frequent translation, but it is unclear how the presence of multiple standby site modules in a 5′ UTR will affect its translation rate. The presence of multiple standby site modules could allow the ribosome to cooperatively bind mRNA and increase its translation rate. We next characterized nine synthetic 5′ UTRs (76–164 nt long) with up to four standby site modules to investigate the potential for cooperative binding (Figure 4A). If the ribosome can engage in cooperative binding, then the additional standby site modules will increase the overall available surface area, reduce the ribosome’s binding free energy penalty and increase the translation initiation rate. In particular, we expect the greatest cooperativity when the available surface area of a single standby site module is limited.

We found that 5′ UTRs with two, three or four standby site modules were all translated at the same rate, indicating an absence of cooperative binding (Figure 4B), including when the available surface areas for individual standby site modules were significantly lowered. Therefore, even when the combined surface areas of multiple standby site modules could potentially increase contact with the ribosomal platform and increase binding, a cooperative increase in translation rate was not observed. Consequently, the data support a mechanism whereby the ribosome binds to only a single standby site module prior to initiating translation.

Distant standby site modules can support high translation rates

We next investigated whether any standby site module could be bound by the ribosome, even if located far upstream of the start codon. Using the model’s predictions, we forward-engineered multi-standby site module synthetic 5′ UTRs where the most distantly upstream site is the only one capable of supporting high translation rates (Figure 4C). Internal standby site modules have long hairpins and limited surface areas (model predictions: ΔGstandby between 9.0 and 17.4 kcal/mol) compared to the maximally accessible upstream standby site modules. Therefore, if the ribosome binds the farthest upstream standby site module, its translation rate will be at least 127-fold higher than binding to the internal standby site modules. We measured fluorescence levels with flow cytometry and mRNA levels with RT-qPCR. Compared to an unstructured 5′ UTR, the mRNA levels decreased by only 1.3- to 5.5-fold, while the fluorescence levels were correspondingly reduced between 2.5- and 18.3-fold, showing that changes in fluorescence were mainly due to changes in translation rate and ribosome binding free energy penalties (Figure 4D).

[Figure 3 image not reproduced. Panel A: the example 5′ UTR before and after partial unfolding. Panel B: energy (kcal/mol) versus unfolded base pairs and standby site module surface area (As). Panels C–E: fluorescence (au) versus predicted ΔGstandby (kcal/mol) for ‘No RNA Unfolding’, ‘Partial RNA Unfolding’ and ‘Complete RNA Unfolding’.]

Figure 3. (A and B) Competition between ribosomal distortion and selective RNA unfolding is illustrated on a structured 5′ UTR with a weak hairpin. The ribosome’s distortion penalty ΔGdistortion (dashed red line), hairpin unfolding penalty ΔGunfolding (dotted dashed blue line) and their sum ΔGstandby (green solid line) are shown as the hairpin’s closing base pairs are unfolded. The total binding free energy penalty has a local minimum when three base pairs are unfolded. See also Supplementary Figure S4. (C–E) Scatter plots comparing measured translation rates from 22 synthetic 5′ UTRs and model predictions to test three mechanisms for the ribosome’s unfolding of RNA structures. The average difference between model predictions and measurements is (C) 2.42 kcal/mol when not unfolding structures, (D) 0.90 kcal/mol when partially unfolding structures and (E) 10.24 kcal/mol when fully unfolding structures. Comparative P-values from two-sample t-tests are 0.004 (C versus D) and 10⁻¹³ (D versus E). Data points are the result of three measurements in 1 day. In parts (C–E), the diagonal dashed line is the predicted translation rate according to Equation (2).

We found that all the mRNAs’ translation rates remained very high, even when the accessible standby site module was located over 100 nt upstream of the start codon (M4 5′ UTR) (Figure 4E). In particular, the M2 5′ UTR is translated at a 33.8-fold higher rate when compared to the translation rate that could be supported by binding to the standby site module closest to the start codon. Moving the accessible standby site module upstream, away from the start codon, does not significantly reduce the translation rate. Based on these measurements, the apparent standby site binding penalties are 3.6, 6.2 and 6.5 kcal/mol for the M2 to M4 5′ UTRs (Figure 4F), respectively, which are much lower than the internal standby site modules’ predicted binding free energy penalties (9.0–17.4 kcal/mol). These measurements suggest that the ribosome will bind to the standby site module that has the lowest binding free energy penalty, regardless of its distance from the start codon.

The ribosomal platform can slide across upstream structures

Several mechanisms would allow the ribosomal platform to bind distant upstream standby site modules and transition to a pre-initiation complex at the start codon. After the ribosome has bound to the upstream standby site module, the 30S complex could engage in diffusive hopping that would increase its local concentration and assembly rate at the start codon. The rate of assembly would largely depend on the physical distance between the standby site module and start codon, and not on the presence of RNA structures. Alternatively, the ribosome could employ a full-contact sliding mechanism that reorients the mRNA until a stable pre-initiation complex has formed. The positively charged surface of the ribosomal platform makes it ideal for maintaining contact with a 5′ UTR until 16S rRNA hybridization has anchored the mRNA. Unlike a diffusive hopping mechanism, the presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex’s assembly rate.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA’s translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module’s predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data support the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol of free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nucleotide of hairpin height. We then determined whether including this sliding penalty could improve the model’s ability to predict the translation rate of long 5′ UTRs with multiple standby site modules with ‘hairpins of differing heights’. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for the M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome’s binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4 alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale according to the log-linear relationship between translation initiation rate and binding free energy differences (‘Materials and Methods’ section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).

[Figure 4 image not reproduced. Panel A: schematic 5′ UTRs with two to four standby site modules separated by spacing N. Panel B: fluorescence (au) and ΔGstandby (kcal/mol) versus single-stranded length N (nt). Panel C: schematic of the M1 to M4 5′ UTRs with standby site modules I–IV. Panel D: fold reduction in mRNA level and fluorescence for M1 to M4. Panel E: fluorescence (au) for M1 to M4. Panel F: ΔGdistortion and ΔGdistortion + ΔGsliding (kcal/mol) for each standby site module.]

Figure 4. (A) Synthetic 5′ UTRs with multiple evenly spaced standby site modules. Spacing N is varied from 4 to 12 nt. (B) Measured translation rates are compared to standby site module multiplicity, showing the absence of cooperative binding to multiple standby site modules, regardless of limiting surface areas. The ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). (C) Synthetic 5′ UTRs with one upstream and multiple internal standby site modules, labeled with Roman numerals. (D) The reduction in fluorescence (red bars) and mRNA levels (blue bars) of 5′ UTRs M1 to M4 compared to an unstructured 5′ UTR (see also Supplementary Figure S2). (E) Fluorescence measurements of 5′ UTRs M1 to M4 show that upstream standby site modules can support high translation rates. (F) Model predictions are shown for 5′ UTRs M1 to M4 using either ΔGdistortion (gray bars) or ΔGdistortion + ΔGsliding (blue bars) for each standby site module (I–IV, corresponding to labels in part C), compared to the ribosome’s apparent binding free energy penalties (dashed lines, yellow region) (average error with and without the sliding penalty is 1.15 and 3.24 kcal/mol, respectively; P-value is 0.013). ΔGdistortion for each standby site module was calculated according to Equation (3). In parts B and E, the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the unstructured 5′ UTR.
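The sliding penalty described in this section is a simple linear function of hairpin height, and a minimal sketch is given here using the coefficient quoted in the text (c = 0.2 kcal/mol per nt of hairpin height). The function name and the list-of-heights input are illustrative assumptions rather than part of the published software; only the 15 nt hairpin of the M2 5′ UTR is taken from the text.

```python
# Coefficient quoted in the text: kcal/mol per nucleotide of hairpin height.
SLIDING_COEFFICIENT = 0.2


def sliding_penalty(downstream_hairpin_heights, c=SLIDING_COEFFICIENT):
    """Free energy penalty (kcal/mol) for sliding past downstream hairpins.

    downstream_hairpin_heights: heights (nt) of every RNA structure located
    between the bound standby site module and the start codon.
    """
    if any(h < 0 for h in downstream_hairpin_heights):
        raise ValueError("hairpin heights must be non-negative")
    return c * sum(downstream_hairpin_heights)


if __name__ == "__main__":
    # Module closest to the start codon: nothing to slide past.
    print(sliding_penalty([]))
    # Upstream module of the M2 5' UTR: one intervening 15 nt hairpin.
    print(round(sliding_penalty([15]), 2))
```

For the M2 case this reproduces the 3.0 kcal/mol sliding penalty listed above; the individual hairpin heights in M3 and M4 are not given in this section, so they are not reproduced here.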

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combined them to create a single biophysical model. The ribosome’s binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to

$$\Delta G_{standby} = \Delta G_{distortion} + \Delta G_{unfolding} + \Delta G_{sliding} \qquad (5)$$

All three Gibbs free energies are positive, and a long unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32). This model extends our previous biophysical model that focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence’s start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome’s translation initiation rate.
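The per-module bookkeeping and selection step described above can be sketched as follows. The class and function names are hypothetical, and the numbers in the usage example are only loosely based on the M2 5′ UTR values quoted earlier (an internal module at 11.4 kcal/mol and an upstream module at 0.4 kcal/mol plus a 3.0 kcal/mol sliding penalty); how those totals split between distortion and unfolding is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StandbyModule:
    """One candidate standby site module and its three penalties (kcal/mol)."""
    name: str
    dg_distortion: float
    dg_unfolding: float
    dg_sliding: float

    @property
    def dg_standby(self):
        # Sum of the three penalties, following Equation (5).
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def controlling_module(modules):
    """Return the module with the lowest total penalty.

    With no cooperative binding, only this module is taken to set the
    translation initiation rate.
    """
    if not modules:
        raise ValueError("a 5' UTR needs at least one standby site module")
    return min(modules, key=lambda m: m.dg_standby)


if __name__ == "__main__":
    # Illustrative values; the distortion/unfolding split is assumed.
    m2_modules = [
        StandbyModule("internal", 11.4, 0.0, 0.0),
        StandbyModule("upstream", 0.4, 0.0, 3.0),
    ]
    chosen = controlling_module(m2_modules)
    print(chosen.name, round(chosen.dg_standby, 1))
```

Taking a minimum rather than a sum over modules is what encodes the absence of cooperative binding.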

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38), or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of the Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules’ characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA’s translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.

[Figure 5 image not reproduced. Panels A and B: fluorescence (au) and apparent ΔGstandby (kcal/mol) versus predicted ΔGstandby (kcal/mol).]

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).

[Figure 6 image not reproduced. Panel A: ΔGstandby (kcal/mol) versus genomic 5′ UTR length (nt), with an inset for short 5′ UTRs. Panel B: number of genomic 5′ UTRs versus ΔGstandby (kcal/mol). Panel C: number of 5′ UTR isoforms versus maximum change in ΔGstandby (kcal/mol).]

Figure 6. (A) The ribosomal platform binding free energy penalties ΔGstandby for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA, which uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24 nt long single-stranded region (domain 3) is essential for ribosome binding, and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model’s calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, as it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA’s translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 2.5-fold for these mutations.

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome’s interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome’s binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules (Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome’s selective and partial unfolding of RNA structures. Our data support a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

[Figure 7 image not reproduced. Secondary structure diagram of the thrS 5′ UTR showing domains 2, 3 and 4, the SD sequence and start codon, nucleotide positions from −163 to −1, and the transcriptional start sites of ILOΔ1, ILOΔ2, ILOΔ4 and ILOΔ5.]

Figure 7. The sequence and structure of the E. coli thrS 5′ UTR are shown. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(−32)>A; BS4-9CS29, G(−32)>A and G(−46)>A; L19, A(−13)>C; L7, U(−49)>G; CS302, deletion of domain 2 from A(−14) to U(−48); ILOΔ1, transcription begins at G(−159); ILOΔ2, transcription begins at G(−68); ILOΔ4, transcription begins at G(−49), U(−49)>G and A(−13)>C; ILOΔ5, transcription begins at G(−49), U(−49)>G and G(−46)>A. Bolded nucleotides are the positions of new transcriptional start sites.

Importantly, initially designed sequences have well-defined features that perturb the ribosome’s individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome’s binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model’s ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules is a potential binding site for the ribosome, according to its binding free energy penalty, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures such as G-quadruplexes and pseudoknots will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform’s ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data show that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome’s translation initiation rate, for example, when the antibiotic Edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform’s structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome’s distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome’s binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences, and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model’s error can lead to further insight into the ribosome’s interactions. The model’s predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome’s binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites near RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome’s interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface are available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance, and Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.

Nucleic Acids Research, 2014, Vol. 42, No. 4 2657

FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai,A., Petrov,A., Marshall,R.A., Korlach,J., Uemura,S. and Puglisi,J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti,A., Marzi,S., Jenner,L., Myasnikov,A., Romby,P., Yusupova,G., Klaholz,B.P. and Yusupov,M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner,L., Romby,P., Rees,B., Schulze-Briese,C., Springer,M., Ehresmann,C., Ehresmann,B., Moras,D., Yusupova,G. and Yusupov,M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet,L., Dubé,A., Massé,E. and Lafontaine,D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel,C.L. and Smolke,C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease,R.A. and Belfort,M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit,M.H. and Van Duin,J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit,M.H. and Van Duin,J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman,I.A., Evfratov,S.A., Sergiev,P.V. and Dontsova,O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut,A. and Balasubramanian,S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis,H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis,H.M., Mirsky,E.A. and Voigt,C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers,J.M., Goler,J.A., Juminaga,D. and Keasling,J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti,A., Marzi,S., Myasnikov,A.G., Fabbretti,A., Yusupov,M., Gualerzi,C.O. and Klaholz,B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi,S., Myasnikov,A.G., Serganov,A., Ehresmann,C., Romby,P., Yusupov,M. and Klaholz,B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova,G.Z., Yusupov,M.M., Cate,J. and Noller,H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit,M.H. and Van Duin,J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu,X., Lancaster,L., Noller,H.F., Bustamante,C. and Tinoco,I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon,P., Maracci,C., Filonava,L., Gualerzi,C.O. and Rodnina,M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer,S.M. and Joseph,S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille,F., Unoson,C., Vogel,J. and Wagner,E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Mai,Q.-A., Christoffersen,M.J., Martin,L., Yu,A., Lam,C., Rodriguez,C., Bennett,G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou,C., Stanton,B., Chen,Y.-J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri,S., Goodman,D.B., Cambray,G., Mutalik,V.K., Gao,Y., Arkin,A.P., Endy,D. and Church,G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi,S., Furusawa,H., Ueda,T. and Okahata,Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao,Y., Zhang,Z.J., Erickson,D.W., Huang,M., Huang,Y., Li,J., Hwa,T. and Shi,H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia,T., SantaLucia,J., Burkard,M.E., Kierzek,R., Schroeder,S.J., Jiao,X., Cox,C. and Turner,D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber,A.R., Lorenz,R., Bernhart,S.H., Neuböck,R. and Hofacker,I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen,H., Bjerknes,M., Kumar,R. and Jay,E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev,A.V. and Nicholson,A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler,I.M., Collado-Vides,J., Santos-Zavaleta,A., Peralta-Gil,M., Gama-Castro,S., Muñiz-Rascado,L., Bonavides-Martínez,C., Paley,S., Krummenacker,M., Altman,T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik,V.K., Qi,L., Guimaraes,J.C., Lucks,J.B. and Arkin,A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo,G., Landrain,T.E. and Jaramillo,A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura,J.M., Cantor,C.R. and Collins,J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha,J., Reyes,S.J. and Gallivan,J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony,P.C., Perez,C.F., García-García,C. and Block,S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet,G.A. II, Artsimovitch,I., Furman,R., Sosnick,T.R. and Pan,T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch,S.A., Desai,S.K., Sajja,H.K. and Gallivan,J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron,M.-P., Bastet,L., Lussier,A., Simoneau-Roy,M., Massé,E. and Lafontaine,D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove,T., Wilf,N., Sun,X. and Wartell,R.M. (2008) Effect of Hfq on RprA-rpoS mRNA pairing: Hfq-RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot,C., Caillet,J., Graffe,M., Eyermann,F., Ehresmann,B., Ehresmann,C., Springer,M. and Romby,P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili,I.S., Agrawal,R.K., Grassucci,R. and Frank,J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons,W.M., May,J.L., Wimberly,B.T., McCutcheon,J.P., Capel,M.S. and Ramakrishnan,V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti,M., Schlünzen,F., Harms,J., Zarivach,R., Glühmann,M., Avila,H., Bashan,A., Bartels,H., Auerbach,T. and Jacobi,C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson,J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore,J.L., Holmstrom,E.D. and Nesbitt,D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills,J.B., Vacano,E. and Hagerman,P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly(dT) and poly(dA). J. Mol. Biol., 285, 245–257.

52. Egbert,R.G. and Klavins,E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.



altogether (22). We consider a 5′ UTR as composed of one or more standby site modules, followed by a 16S rRNA binding site (a SD sequence), a spacer region, and ended with a protein coding sequence (Figure 1B). An mRNA that folds into several hairpins, all located upstream of the SD sequence, will contain several standby site modules.

Thermodynamic modeling has enabled a comprehensive dissection of the ribosome's interactions with mRNA, with clearly defined contributions quantitatively described by their free energies (Figure 1C). The most stringent test of our current understanding is to combine thermodynamic modeling with in silico optimization to design completely novel, non-natural RNA sequences with predicted ribosomal interactions and translation initiation rates. The differences between model predictions and experimental measurements objectively draw the boundaries between our current understanding of RNA interactions and its limitations. A growing body of evidence from our work and others indicates the presence of unpredicted interactions between the ribosome and long 5′ UTRs, highlighted as context effects (Supplementary Figure S1C) (14,24–28).

In this article, we decipher the biophysical rules governing the ribosome's interactions with long, structured 5′ UTRs. We systematically alter 5′ UTR sequence, structure and shape to measure the ribosome's interactions using in vivo reporter assays. We use thermodynamic modeling to quantify how the ribosome binds less accessible 5′ UTRs, accommodates or partially unfolds RNA structure, selects from multiple binding sites, slides across the mRNA and initiates translation. We develop a predictive biophysical model that combines thermodynamic minimization with a four-parameter free energy model to accurately predict the translation initiation rates of 136 long, diversely structured 5′ UTRs. Finally, we demonstrate that the ribosome can bind to distant sites and engage in long-range RNA interactions, providing a mechanism for the observed context effects and new ways for riboswitches and small RNAs to regulate translation.

MATERIALS AND METHODS

Strains, media and plasmid construction

Luria-Bertani (LB) media (10 g/l tryptone, 5 g/l yeast extract, 10 g/l NaCl) was obtained from BD and supplemented with 50 µg/ml chloramphenicol (Sigma-Aldrich). The expression system is derived from the pFTV1 plasmid (ColE1, CmR) (14), and encodes a σ70 constitutive promoter (BioBrick J23100), a synthetic 5′ UTR and a codon-optimized mRFP1 fluorescent protein. Synthetic 5′ UTRs were inserted by standard cloning techniques, utilizing either BamHI and SacI, or XbaI and SacI, sites depending on 5′ UTR length. 5′ UTRs with desired sequences were created either by annealing of complementary oligonucleotides with overhangs or by PCR assembly of oligonucleotides with corresponding restriction sites. Plasmid variants were cloned by restriction digest of the DNA backbone, purification, ligation to DNA fragments with 5′ UTR sequences and transformation of chemically competent Escherichia coli DH10B cells via heat shock. Correct clones were identified by DNA sequencing. All the designed 5′ UTR sequences are presented in the Supplementary Data.

Fluorescent protein and mRNA measurements

Fluorescent protein reporter measurements were performed in 96-well format. An initial 96 deep-well plate containing 700 µl LB and 50 µg/ml chloramphenicol was inoculated from single colonies, with up to 30 different E. coli DH10B cultures in an alternating pattern that excluded the outer wells. Cultures were incubated overnight at 37°C with 200 rpm orbital shaking. A fresh 96-well plate containing 200 µl LB and chloramphenicol media was inoculated by overnight cultures using a 1:100 dilution. Plates were incubated at 37°C in a plate-based spectrophotometer (TECAN M1000) with high orbital shaking. OD600 measurements were recorded every 10 min. Once a culture reached an OD600 of 0.15, a sample of each culture was transferred to a new plate containing 200 µl phosphate buffered saline [NaCl 137 mM, KCl 2.7 mM, phosphate buffer 10 mM (Na2HPO4/KH2PO4), pH 7.4; purchased from OmniPur] and 2 mg/ml kanamycin (Sigma Aldrich) for flow cytometry measurements. Periodic serial dilutions were repeated two more times, maintaining cultures within the exponential growth phase. Three samples were taken from each culture. The single-cell fluorescence distributions of samples were measured using a Fortessa flow cytometer (BD Biosciences). All distributions were unimodal. The autofluorescence distribution of E. coli DH10B cells was also measured. The arithmetic mean of each distribution was taken and the mean autofluorescence was subtracted from each sample. The reported fluorescence values in this study are the average of three measurements (n = 3).

For mRNA level measurements, single colonies were first grown overnight in 5 ml LB media with chloramphenicol, then diluted to 0.01 OD600 in fresh media and grown to mid-exponential phase until reaching 1.0 OD600, as measured by a cuvette-based spectrophotometer (NanoDrop 2000C). Total RNA was extracted using the Total RNA Purification kit (Norgen Biotek). Treatment with Turbo DNAse (Ambion) eliminated genomic DNA contamination. Reverse transcription was carried out with a High Capacity cDNA Reverse Transcription kit (Applied Biosystems), followed by quantitative PCR with a TaqMan probe (5′-ACCTTCCATACGAACTTT-3′) targeting the internal coding sequence for mRFP1 (forward primer 5′-ACGTTATCAAAGAGTTCATGCGTTTC-3′ and reverse primer 5′-CGATTTCGAACTCGTGACCGTTAA-3′) and using an ABI 7300 real-time cycler. A TaqMan probe for 16S rRNA was used as an endogenous control. Triplicate measurements were each performed on two separate RNA extractions. Ct calculations were then averaged for each sample.

Figure 1. (A) Natural E. coli 5′ UTRs have diverse lengths and structures, with varying binding free energy penalties to the ribosomal platform (examples shown: clcA, flgN and yecH). Green and light blue regions represent SD and start codon, respectively. (B) 5′ UTRs are separated into multiple standby site modules, followed by a SD sequence, spacer region and a protein coding sequence. (C) A schematic shows the mRNA regions that contact the ribosome in its initial and final states. The ribosomal platform binds to 5′ UTRs with a binding free energy penalty ΔGstandby. The sum of all binding free energies, ΔGtotal, determines how likely the ribosome binds to an mRNA and initiates translation.

Biophysical model calculations

The complete biophysical model employs a free energy model to calculate the total binding free energy between the ribosome and mRNA according to the equation (Figure 1C):

$$\Delta G_{\text{total}} = \Delta G_{\text{mRNA-rRNA}} + \Delta G_{\text{spacing}} + \Delta G_{\text{start}} + \Delta G_{\text{standby}} - \Delta G_{\text{mRNA}} \qquad (1)$$

Using statistical thermodynamics, we may relate this total Gibbs free energy change to the mRNA's translation initiation rate r according to

$$r \propto \exp(-\beta \, \Delta G_{\text{total}}) \qquad (2)$$

This relationship has been previously validated on 132 mRNA sequences, where the ΔGtotal varied from −10 to 16 kcal/mol, resulting in well-predicted translation rates that varied by over 100 000-fold (14). The apparent Boltzmann constant β has been measured as 0.45 ± 0.05 mol/kcal, which was confirmed in a second study (29).

In the initial state, the mRNA exists in a structured conformation, where its free energy of folding is ΔGmRNA (ΔGmRNA is negative). After assembly of the 30S ribosomal subunit, the last nine nucleotides of its 16S rRNA have hybridized to the mRNA while all non-clashing mRNA structures are allowed to fold. The free energy of folding for this mRNA–rRNA complex is ΔGmRNA–rRNA (ΔGmRNA–rRNA is negative). mRNA structures that impede 16S rRNA hybridization or overlap with the ribosome footprint remain unfolded in the final state. These Gibbs free energies are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32).
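Equations (1) and (2) can be illustrated with a minimal Python sketch. The function names and the example free energies below are hypothetical and chosen only for illustration; only the form of the two equations and the value β = 0.45 mol/kcal come from the text. Because Equation (2) is a proportionality, the sketch reports ratios of rates, not absolute rates.

```python
import math

BETA = 0.45  # apparent Boltzmann constant in mol/kcal, as reported in the text


def total_binding_free_energy(dg_mrna_rrna, dg_spacing, dg_start, dg_standby, dg_mrna):
    """Equation (1): total binding free energy in kcal/mol."""
    return dg_mrna_rrna + dg_spacing + dg_start + dg_standby - dg_mrna


def relative_initiation_rate(dg_total, beta=BETA):
    """Equation (2): initiation rate up to an unknown proportionality constant."""
    return math.exp(-beta * dg_total)


# Hypothetical free energies (kcal/mol), chosen only for illustration.
# The two mRNAs differ only in their standby site penalty (0 vs 4 kcal/mol).
accessible = total_binding_free_energy(-12.0, 0.5, -1.2, 0.0, -8.0)
structured = total_binding_free_energy(-12.0, 0.5, -1.2, 4.0, -8.0)

fold_change = relative_initiation_rate(accessible) / relative_initiation_rate(structured)
print(f"dG_total: {accessible:.1f} vs {structured:.1f} kcal/mol")
print(f"predicted fold change in initiation rate: {fold_change:.2f}")
```

Since all other terms are identical in the two hypothetical mRNAs, the ratio depends only on the difference in ΔGstandby.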

Three additional interactions will alter the translation initiation rate. The tRNAfMet anti-codon loop hybridizes to the start codon (ΔGstart is most negative for AUG and GUG). The 30S ribosomal subunit also prefers a five nucleotide aligned spacing (33). Non-optimal distances between the 16S rRNA binding site and the start codon lead to an energetic binding penalty. This relationship between the ribosome's binding penalty (ΔGspacing > 0) and aligned spacing distance was measured in our previous study (14). Finally, the 5′ UTR binds to the ribosomal platform with a free energy penalty ΔGstandby, which is calculated according to Equation (5) and described in the main text. A maximally accessible standby site module will bind to the ribosomal platform with a zero ΔGstandby. Gibbs free energy calculations for all mRNA sequences are listed in the Supplementary Data.

The biophysical model automatically decomposes a 5′ UTR into one or more standby site modules using the following definitions. A standby site module contains a single mRNA hairpin surrounded by two single-stranded regions: an upstream distal binding site and a downstream proximal binding site. The lengths of the distal and proximal binding sites are denoted by D and P, respectively. The hairpin's height H is determined by the number of nucleotides from the hairpin's base to the mid-point of the hairpin's loop, including bulges, internal loops and base pairs. The heights of branched hairpins are determined by averaging the branches' heights (Supplementary Figure S3). Hairpins are only considered as components of standby site modules if they are mutually exclusive with the 16S rRNA binding site. A 5′ UTR with multiple upstream hairpins will contain multiple standby site modules. We consider a 5′ UTR without upstream hairpins to contain a single standby site module with zero hairpin height and zero distal site length. Using these definitions, the biophysical model calculates a ΔGstandby for each standby site module.

The available single-stranded surface area As of a standby site module is determined according to As = 15 + P + D − H. The key aspect of this equation is that proximal length, distal length and hairpin height have equal and interchangeable effects on the amount of single-stranded surface area. The constant 15 was selected to create a positive metric for single-stranded surface area. Based on data shown in Figure 2, increasing the hairpin height beyond 15 nt did not reduce the mRNA translation rate, regardless of the proximal or distal site lengths. Accordingly, the value of H in the equation is not allowed to exceed 15 nt.
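The surface area definition can be sketched in a few lines of Python. The function and constant names are hypothetical; the formula As = 15 + P + D − H, the 15 nt cap on H and the averaging of branch heights are taken from the text. Applying the cap after averaging is an assumption of this sketch, since the text does not state the order.

```python
SURFACE_AREA_OFFSET = 15   # constant from the text, keeps A_s positive
MAX_EFFECTIVE_HEIGHT = 15  # nt; H is not allowed to exceed this value


def hairpin_height(branch_heights):
    """Height of a (possibly branched) hairpin as the mean of its branch heights.

    An empty list represents a 5' UTR without an upstream hairpin (H = 0).
    """
    if not branch_heights:
        return 0.0
    return sum(branch_heights) / len(branch_heights)


def surface_area(proximal, distal, height):
    """A_s = 15 + P + D - H, with H capped at 15 nt (cap applied last: an assumption)."""
    return SURFACE_AREA_OFFSET + proximal + distal - min(height, MAX_EFFECTIVE_HEIGHT)


# Hypothetical modules (lengths in nt), for illustration only.
print(surface_area(proximal=4, distal=2, height=12))                      # single hairpin
print(surface_area(proximal=4, distal=2, height=18))                      # height capped at 15
print(surface_area(proximal=4, distal=2, height=hairpin_height([12, 16])))  # branched hairpin
print(surface_area(proximal=10, distal=0, height=hairpin_height([])))       # no hairpin
```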

The ribosomal platform's distortion energy penalty is a quadratic function of the available RNA surface area As (Figure 2), according to

$$\Delta G_{\text{distortion}} = C_1 (A_s)^2 + C_2 (A_s) + C_3 \qquad (3)$$

where C1, C2 and C3 are the fitted parameter values and equal to 0.038 kcal/mol/nt², −1.629 kcal/mol/nt and 17.359 kcal/mol, respectively. An alternative constant in the definition of surface area would alter these coefficients, but the overall relationship between the surface area and the distortion free energy penalty would remain the same.
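As a rough numerical check of Equation (3), the quadratic can be evaluated directly. The sketch below assumes the coefficient values and signs given above; the function name is hypothetical. It only evaluates the fitted curve and makes no claim about how the full model treats surface areas beyond the fitted range (the fit in Figure 2F uses As < 24).

```python
# Fitted coefficients of Equation (3), as given in the text.
C1 = 0.038    # kcal/mol/nt^2
C2 = -1.629   # kcal/mol/nt
C3 = 17.359   # kcal/mol


def distortion_penalty(a_s):
    """Equation (3): distortion free energy penalty (kcal/mol) for surface area a_s (nt)."""
    return C1 * a_s ** 2 + C2 * a_s + C3


# Surface area at which the fitted parabola reaches its minimum.
vertex = -C2 / (2 * C1)

for a_s in (0, 5, 10, 15, 20):
    print(f"A_s = {a_s:2d} nt -> dG_distortion = {distortion_penalty(a_s):6.2f} kcal/mol")
print(f"parabola minimum at A_s = {vertex:.1f} nt")
```

The penalty falls steeply as surface area increases and flattens out near the minimum of the parabola, in line with the plateau described for highly accessible standby site modules.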

Measurements of translation rate and binding free energy penalties

In our steady-state, long-time growth conditions, translation initiation is a rate-limiting step in codon-optimized mRFP1 fluorescent protein expression, fluorescence levels are proportional to translation initiation rates under our controlled growth conditions (13), and translation initiation rate is related to total binding free energies according to a log-linear formula (Equation 2). We measure the change in apparent binding free energy penalty by comparing the fluorescence of a structured 5′ UTR with the fluorescence of a maximally accessible 5′ UTR according to

$$\Delta G_{\text{standby}} = -\frac{1}{\beta} \ln\left(\frac{X}{X_{\text{ref}}}\right) - \Delta G_{\text{other}} \qquad (4)$$

where X is the fluorescence from a sample synthetic 5′ UTR, Xref is the fluorescence of the unstructured 5′ UTR and β is 0.45 mol/kcal. ΔGother is the difference in free energies between the sample and reference 5′ UTRs that depends on other ribosomal interactions, and not on the standby site. By using identical and carefully designed 16S rRNA binding sites, spacer regions and coding sequences in the 136 synthetic mRNA sequences, the value for ΔGother was zero for 123 mRNA sequences and a maximum of 0.90 kcal/mol for the remaining sequences.

Figure 2. (A) Standby site modules are defined by a proximal and distal binding site, separated by an RNA structure. (B) Standby site module surface area is calculated by summing the proximal P and distal D binding site lengths and subtracting the hairpin height H (As = 15 + P + D − H). The surface area is kept positive by adding a constant 15. (C–E) Translation rate and ribosome binding free energy penalties are measured as proximal binding site length, distal binding site length and hairpin height are increased. The ΔGstandby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). (C) Hairpin heights are 9 nt (green stars), 12 nt (black asterisks), 15 nt (red squares) or 18 nt (blue circles). (D) Hairpin heights are 15 nt (red squares) or 18 nt (blue circles). (E) Hairpin height varies from 9 to 12 nt by adding to loop length of a 6 bp stem (blue circles), or from 14 to 18 nt by adding to loop length of a 12 bp stem (red squares). (F) 56 characterized 5′ UTRs (As < 24) from parts (C–E) are combined to quantitatively relate the standby site module's surface area to the ribosome's translation rate and binding free energy penalty. Dashed line is a best-fit quadratic equation. (G) The validity of the quadratic equation is tested by measuring translation rates from 15 additional 5′ UTRs and comparing to model predictions (R² = 0.78; average error is 0.66 kcal/mol). Data points are the result of three measurements in 1 day. In parts (C–G), the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the reference unstructured 5′ UTR.
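The conversion from fluorescence to an apparent standby penalty in Equation (4) can be sketched as follows. The function name and the example fluorescence values are hypothetical; the sign convention follows from combining Equations (1) and (2) with a reference 5′ UTR whose ΔGstandby is zero, and should be read as an assumption of this sketch.

```python
import math

BETA = 0.45  # mol/kcal, apparent Boltzmann constant from the text


def standby_penalty_from_fluorescence(x, x_ref, dg_other=0.0, beta=BETA):
    """Equation (4): apparent dG_standby (kcal/mol) from sample and reference fluorescence.

    x and x_ref are autofluorescence-corrected mean fluorescence values (same units).
    dg_other is the free energy difference from non-standby interactions (kcal/mol).
    """
    if x <= 0 or x_ref <= 0:
        raise ValueError("fluorescence values must be positive")
    return -math.log(x / x_ref) / beta - dg_other


# Hypothetical fluorescence values [au], for illustration only.
reference = 10000.0
sample = 1000.0  # ten-fold lower expression than the unstructured reference
print(f"{standby_penalty_from_fluorescence(sample, reference):.2f} kcal/mol")
```

A sample with the same fluorescence as the reference returns a penalty of zero, matching the definition of a maximally accessible standby site module.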

Statistical analysis

A two-sample, two-tailed t-test at the 5% significance level was performed to compare the error distributions of different sets of model predictions. The null hypothesis assumed that the errors in two different sets of model predictions have the same mean and variance. The alternative hypothesis is that the errors in the two sets follow different distributions. The error in model prediction, ΔΔGstandby, was calculated according to the formula ΔΔGstandby = measured ΔGstandby − predicted ΔGstandby. Low probabilities (P < 0.05) reject the null hypothesis, showing that the differences between two sets of model predictions were statistically significant.
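For readers who want to reproduce this kind of comparison, a minimal sketch of a two-sample t statistic is given below, using only the Python standard library. It assumes the pooled-variance (Student) form, which matches the equal-variance null hypothesis stated above; the text does not say which variant or software was used, and the error values are hypothetical. The resulting statistic would then be compared against a two-tailed critical value of the t distribution at the reported degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(errors_a, errors_b):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    n_a, n_b = len(errors_a), len(errors_b)
    dof = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances, n - 1 denominators).
    pooled_var = ((n_a - 1) * variance(errors_a) + (n_b - 1) * variance(errors_b)) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(errors_a) - mean(errors_b)) / std_err, dof


# Hypothetical prediction errors (kcal/mol) from two model variants.
errors_model_1 = [1.0, 2.0, 3.0]
errors_model_2 = [2.0, 3.0, 4.0]
t_stat, dof = pooled_t_statistic(errors_model_1, errors_model_2)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```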

RESULTS

Standby site module surface area controls the ribosome's translation rate

We first employed a learn-by-design approach to determine the ribosomal platform's interactions with standby sites that contain a single hairpin, and therefore contain a single standby site module. Standby site modules are defined by their hairpin height (H), upstream distal binding site length (D) and downstream proximal binding site length (P) (Figure 2A; 'Materials and Methods' section). We designed, constructed and characterized 62 synthetic 5′ UTRs containing single standby site modules with hairpins from 9 to 18 bp long, distal binding sites from 1 to 15 nt long and proximal binding sites from 1 to 24 nt long. Long mRNA hairpins contain internal loops to avoid potential RNAse activity (34). All mRNA sequences are transcribed by the J23100 constitutive promoter, encode the mRFP1 fluorescent protein and are designed with identical SD and spacer sequences ('Materials and Methods' section). Flow cytometry measurements from long-time, log-phase cultures with periodic serial dilutions revealed that mRFP1 fluorescence varied by 221-fold (Figure 2C–E). In contrast, mRNA levels varied by at most 5.8-fold according to RT-qPCR measurements (Supplementary Figure S2). Changes in fluorescence are primarily due to changes in translation rate, and in these growth conditions, fluorescence and translation initiation rates are proportional. All sequences and measurement data are listed in the Supplementary Data.

According to the data, standby site module accessibility plays an important role in modulating the ribosome's interactions with 5′ UTRs and controlling translation rate across a large range. In particular, lengthening the distal binding site by four nucleotides increased the mRNA's translation rate by 4.2-fold, whereas lengthening the proximal binding site by the same amount increased it by 4.0-fold. Shortening the hairpin's height by 4 bp increased the mRNA's translation rate similarly, by 3.1-fold. More generally, the slopes (first derivatives) of the log-transformed translation rates all have similar magnitudes (Supplementary Data, column E). Highly accessible standby site modules with long proximal binding sites or short hairpins also cause the mRNA's translation rate to reach a plateau with the same fluorescence level as an mRNA with an unstructured 5′ UTR (Figure 2C). These observations led to the hypothesis that the geometric characteristics of the standby site module (D, P, H) are interchangeable and that its available single-stranded surface area As determines the ribosomal platform's ability to bind to it.

A standby site module's single-stranded surface area is the number of nucleotides capable of single-stranded contact with the ribosomal platform. Longer proximal or distal binding sites increase the standby site module's surface area, while longer hairpin structures reduce the standby site module's accessibility. mRNA structures can potentially reduce site accessibility by folding over the neighboring single-stranded sites and sequestering binding regions. A standby site module's available surface area is calculated by summing its proximal and distal binding site lengths and subtracting the hairpin's height (Figure 2B; 'Materials and Methods' section). A highly accessible standby site module has a large As, whereas a less accessible standby site module has a small As. We then evaluated whether the standby site module's surface area is the key determinant by comparing As to the measured log-transformed translation rates. For all 62 5′ UTRs with diverse geometric features, we found a single quantitative relationship that described the effect of surface area on the mRNA's translation rate (Figure 2F).

We now turn to thermodynamic modeling to quantify the ribosome's interactions with standby site modules. The competitive binding of free ribosomes to the pool of intracellular mRNAs allows one to use statistical thermodynamics to relate a ribosome's binding free energy to its translation initiation rate according to a constant log-linear relationship (Equation 2) (13,14). We calculate the interaction energy between the ribosome and an mRNA's standby site modules by comparing its translation rate to the translation rate of a reference unstructured 5′ UTR, both with an identical SD sequence, spacer region and protein coding sequence (Equation 4). Using the data in Figure 2C–E, we found that the standby site binding free energy penalty ΔGstandby varied from 0 to 11.8 kcal/mol when changing its geometric characteristics, and that a simple quadratic function succinctly described the effect of site accessibility on the ribosome's binding free energy penalty (R² = 0.97; average error is 0.51 kcal/mol) (Figure 2F). A maximally accessible standby site module has a zero binding free energy penalty (ΔGstandby = 0), indicating that the mRNA's translation initiation rate is not reduced by the ribosomal platform's interactions. A standby site module with a positive binding free energy penalty (ΔGstandby > 0) has reduced the mRNA's translation initiation rate by decreasing the ribosomal platform's ability to initially bind the mRNA.
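As a rough consistency check, the largest reported penalty can be converted back into a fold change in translation initiation rate using Equation (2). This worked example assumes β = 0.45 mol/kcal and that all other free energy terms are identical between the sample and the reference:

```latex
\frac{r_{\text{ref}}}{r}
  = \exp\left(\beta \, \Delta G_{\text{standby}}\right)
  = \exp\left(0.45 \times 11.8\right)
  = \exp(5.31)
  \approx 202
```

A reduction of roughly 200-fold is of the same order as the 221-fold range in mRFP1 fluorescence reported above.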

Next, we critically tested the parameterized model by constructing 15 new 5′ UTRs and measuring their translation initiation rates and ribosome binding free energy penalties. The 5′ UTRs each contain a standby site module with diverse structures between 43 and 91 nt long, including variable-length distal and proximal binding sites, hairpin heights, and two- and three-branched structures (Supplementary Data). For branched structures, we show that the relevant hairpin height is the average across the multiple branches (Supplementary Figure S3) (see also Supplementary Text). The quadratic relationship accurately predicted the ribosome's translation rate and binding free energy penalty (R² = 0.78; average error is 0.66 kcal/mol), confirming that a standby site module's accessibility controls the ribosome's ability to bind and initiate translation (Figure 2G). The quadratic relationship contains three parameters, which are now treated as constants (Equation 3; 'Materials and Methods' section).

Coupled energetic trade-offs when unfolding structures in standby site modules

Prior to initiating translation, the 30S ribosome does not have an external source of energy. Initiation factor 2 does not hydrolyze GTP until after translation has been initiated, during 50S recruitment (22). Consequently, the 30S complex expends its own free energy to unfold mRNA structures. Previous studies have shown that mRNA structures that sequester the 16S rRNA binding site are unfolded prior to initiating translation (9,14,22). The mRNA’s translation initiation rate is reduced according to the free energy penalty for unfolding these structures, as described in Equations (1) and (2) (‘Materials and Methods’ section). However, it remains unclear whether mRNA structures located upstream of SD sequences, within standby site modules, are unfolded or accommodated. We next investigated how the ribosome unfolds hairpins in long 5′ UTRs.

By definition, there is a coupled thermodynamic relationship between the unfolding of an mRNA hairpin and increasing the accessibility of the corresponding standby site module, where the accessibility of the standby site module controls the ribosomal platform’s distortion energy penalty, ΔGdistortion (see ‘Materials and Methods’ section). Unfolding a single base pair in a module’s hairpin will increase the standby site module’s surface area by three nucleotides: the hairpin’s height decreases by one, and the lengths of both the proximal and distal binding sites increase by one. Therefore, unfolding a module’s hairpin will lead to higher accessibility, a lower ΔGdistortion and a higher translation rate. However, unfolding base pairs requires an input of free energy (ΔGunfolding > 0), which will lead to a lower translation rate. Consequently, there is an energetic trade-off between unfolding mRNA structures and increasing the surface accessibility of standby site modules. The ribosome may completely unfold a structured standby site module to maximize its surface area, but the energetic cost may be high (high ΔGunfolding penalty with ΔGdistortion = 0). In contrast, the ribosome may fully accommodate a standby site module’s hairpin without unfolding it, even though its low surface accessibility will penalize the ribosome’s ability to bind to it (high ΔGdistortion penalty with ΔGunfolding = 0).

Based on thermodynamic principles, we hypothesize that the ribosome will ‘selectively’ and ‘partially’ unfold structured standby site modules to minimize its total binding free energy penalty (ΔGstandby = ΔGdistortion + ΔGunfolding). Figure 3A and B presents an illustrative example of this principle and our use of minimization to calculate the ribosome’s binding free energy penalty to a standby site module. A 5′ UTR with a single standby site module initially contains a short proximal binding site (P = 2 nt), a short distal binding site (D = 2 nt) and a long hairpin (H = 15 nt), resulting in a largely inaccessible 5′ UTR (As = 4). Based on the standby site module’s surface area, the ribosome has an initial distortion energy penalty of 11.5 kcal/mol. However, the ribosome can selectively unfold the hairpin’s closing AU base pairs, increasing the standby site module’s surface area and lowering its ΔGdistortion at the expense of increasing its ΔGunfolding. Unfolding 1 bp increases the standby site module’s surface area by 3 nt, decreases the distortion energy penalty to 7.8 kcal/mol and increases the unfolding energy penalty to 0.90 kcal/mol, leading to a total binding free energy penalty of 8.7 kcal/mol. The ribosome can continue to unfold base pairs to minimize its total binding free energy penalty; unfolding 3 bp at a cost of 2.7 kcal/mol leads to a distortion energy penalty of 2.6 kcal/mol and a total binding free energy penalty of 5.3 kcal/mol (Figure 3B). Two additional examples are shown in Supplementary Figure S4. The unfolding free energies are calculated using the Vienna RNA suite with the Turner 1999 RNA parameters (see ‘Materials and Methods’ section). Importantly, thermodynamic minimization does not add any new parameters to the model.

We tested the proposed mechanism by characterizing the translation rate of 22 synthetic 5′ UTRs (Figure 3C–E) that contain single standby site modules with varied geometries (D = 0 to 7 nt and P = 1 to 20 nt) and differing hairpin sequences, sizes and folding energetics (Supplementary Figure S5A). We employed the minimization principle to calculate the number of unfolded base pairs and to predict the ribosome’s binding free energy penalty, ΔGstandby (Figure 3D). As comparisons, we performed the same calculation while asserting either that no additional energy was inputted to unfold hairpins (high ΔGdistortion and ΔGunfolding = 0) (Figure 3C) or that hairpins were completely unfolded by the ribosome (high ΔGunfolding and ΔGdistortion = 0) (Figure 3E).

The data support our hypothesis that the ribosome partially and selectively unfolds structured standby site modules to minimize its total binding free energy penalty (ΔGstandby), rather than separately minimizing its distortion or unfolding penalties (average error is 0.90 kcal/mol when minimizing ΔGstandby, 2.42 kcal/mol when ΔGunfolding = 0 and 10.24 kcal/mol when ΔGdistortion = 0; comparative P-values are 0.004 and 10⁻¹³, respectively). Remarkably, in all 22 cases, only 1–3 bp of each hairpin are predicted by the model to be unfolded, while accommodating the remaining large RNA structures.
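The minimization illustrated in Figure 3A and B can be sketched as a scan over the number of unfolded closing base pairs. The sketch below rests on stated assumptions: the quadratic coefficients are an illustrative curve through the three distortion values quoted in the text (not the published Equation 3 parameters), the first three per-base-pair costs follow the quoted 0.90 kcal/mol per AU pair, and the costs of the deeper base pairs (2.4 kcal/mol each) are hypothetical. In the study itself, unfolding energies come from the Vienna RNA suite, which is not called here.

```python
def distortion_penalty(surface_area):
    """Illustrative quadratic penalty (kcal/mol) versus surface area As.

    Assumed coefficients: a curve through As = 4, 7, 13 -> 11.5, 7.8, 2.6 kcal/mol,
    not the published Equation (3) values.
    """
    a, b, c = 0.0407, -1.6815, 17.574
    if surface_area >= -b / (2 * a):  # past the minimum: treated as fully accessible
        return 0.0
    return max(0.0, a * surface_area ** 2 + b * surface_area + c)


def minimize_standby_penalty(initial_area, bp_costs):
    """Scan 0..len(bp_costs) unfolded closing base pairs and return
    (n_unfolded, dG_distortion, dG_unfolding, dG_standby) with the lowest total."""
    distortion = distortion_penalty(initial_area)
    best = (0, distortion, 0.0, distortion)
    unfolding = 0.0
    for n, cost in enumerate(bp_costs, start=1):
        unfolding += cost
        # Each unfolded base pair exposes 3 nt of additional surface area.
        distortion = distortion_penalty(initial_area + 3 * n)
        total = distortion + unfolding
        if total < best[3]:
            best = (n, distortion, unfolding, total)
    return best


if __name__ == "__main__":
    # Three AU pairs at 0.9 kcal/mol (from the text), then hypothetical stronger pairs.
    costs = [0.9, 0.9, 0.9, 2.4, 2.4, 2.4]
    print(minimize_standby_penalty(initial_area=4, bp_costs=costs))
```

Under these assumptions the scan stops at three unfolded base pairs with a total penalty near 5.3 kcal/mol, matching the worked example; a different set of per-base-pair costs would shift the minimum.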


Non-cooperative binding to multiple standby site modules

We have shown that the ribosome can accommodate large structures of standby site modules and engage in frequent translation, but it is unclear how the presence of multiple standby site modules in a 5′ UTR will affect its translation rate. The presence of multiple standby site modules could allow the ribosome to cooperatively bind mRNA and increase its translation rate. We next characterized nine synthetic 5′ UTRs (76–164 nt long) with up to four standby site modules to investigate the potential for cooperative binding (Figure 4A). If the ribosome can engage in cooperative binding, then the additional standby site modules will increase the overall available surface area, reduce the ribosome’s binding free energy penalty and increase the translation initiation rate. In particular, we expect the greatest cooperativity when the available surface area of a single standby site module is limited.

We found that 5′ UTRs with two, three or four standby site modules were all translated at the same rate, indicating an absence of cooperative binding (Figure 4B), including when the available surface areas for individual standby site modules were significantly lowered. Therefore, even when the combined surface areas of multiple standby site modules could potentially increase contact with the ribosomal platform and increase binding, a cooperative increase in translation rate was not observed. Consequently, the data support a mechanism whereby the ribosome binds to only a single standby site module prior to initiating translation.

Distant standby site modules can support high translation rates

We next investigated whether any standby site module could be bound by the ribosome, even if located far upstream of the start codon. Using the model’s predictions, we forward-engineered multi-standby site module synthetic 5′ UTRs where the most distantly upstream site is the only one capable of supporting high translation rates (Figure 4C). Internal standby site modules have long hairpins and limited surface areas (model predictions: ΔGstandby between 9.0 and 17.4 kcal/mol) compared to the maximally accessible upstream standby site modules. Therefore, if the ribosome binds the farthest upstream standby site module, its translation rate will be at least 127-fold higher than binding to the internal standby site modules. We measured fluorescence levels with flow cytometry and mRNA levels with RT-qPCR. Compared to an unstructured 5′ UTR, the mRNA levels decreased by only 1.3- to 5.5-fold, while the fluorescence levels were correspondingly reduced between 2.5- and 18.3-fold, showing that changes in fluorescence were mainly due to changes in translation rate and ribosome binding free energy penalties (Figure 4D).

Figure 3. (A and B) Competition between ribosomal distortion and selective RNA unfolding is illustrated on a structured 5′ UTR with a weak hairpin, shown before and after partial unfolding. The ribosome’s distortion penalty ΔGdistortion (dashed red line), hairpin unfolding penalty ΔGunfolding (dotted-dashed blue line) and their sum ΔGstandby (green solid line) are shown as the hairpin’s closing base pairs are unfolded (x-axes: unfolded base pairs and standby site module surface area As; y-axis: energy [kcal/mol]). The total binding free energy penalty has a local minimum when three base pairs are unfolded. See also Supplementary Figure S4. (C–E) Scatter plots comparing measured translation rates (fluorescence [au]) from 22 synthetic 5′ UTRs and model predictions (predicted ΔGstandby [kcal/mol]) to test three mechanisms for the ribosome’s unfolding of RNA structures. The average difference between model predictions and measurements is (C) 2.42 kcal/mol when not unfolding structures, (D) 0.90 kcal/mol when partially unfolding structures and (E) 10.24 kcal/mol when fully unfolding structures. Comparative P-values from two-sample t-tests are 0.004 (C versus D) and 10⁻¹³ (D versus E). Data points are the result of three measurements in 1 day. In parts (C–E), the diagonal dashed line is the predicted translation rate according to Equation (2).

We found that all the mRNAs’ translation rates remained very high, even when the accessible standby site module was located over 100 nt upstream of the start codon (M4 5′ UTR) (Figure 4E). In particular, the M2 5′ UTR is translated at a 33.8-fold higher rate when compared to the translation rate that could be supported by binding to the standby site module closest to the start codon. Moving the accessible standby site module upstream, away from the start codon, does not significantly reduce the translation rate. Based on these measurements, the apparent standby site binding penalties are 3.6, 6.2 and 6.5 kcal/mol for M2 to M4 5′ UTRs (Figure 4F), respectively, which are much lower than the internal standby site modules’ predicted binding free energy penalties (9.0–17.4 kcal/mol). These measurements suggest that the ribosome will bind to the standby site module that has the lowest binding free energy penalty, regardless of its distance from the start codon.

The ribosomal platform can slide across upstream structures

Several mechanisms would allow the ribosomal platform to bind distant upstream standby site modules and transition to a pre-initiation complex at the start codon. After the ribosome has bound to the upstream standby site module, the 30S complex could engage in diffusive hopping that would increase its local concentration and assembly rate at the start codon. The rate of assembly would largely depend on the physical distance between the standby site module and start codon, and not on the presence of RNA structures. Alternatively, the ribosome could employ a full-contact sliding mechanism that reorients the mRNA until a stable pre-initiation complex has formed. The positively charged surface of the ribosomal platform makes it ideal for maintaining contact with a 5′ UTR until 16S rRNA hybridization has anchored the mRNA. Unlike a diffusive hopping mechanism, the presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex’s assembly rate.

Figure 4. (A) Synthetic 5′ UTRs with multiple, evenly spaced standby site modules. Spacing N is varied from 4 to 12 nt. (B) Measured translation rates (fluorescence [au]) are compared to standby site module multiplicity, showing the absence of cooperative binding to multiple standby site modules, regardless of limiting surface areas. The ΔGstandby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). (C) Synthetic 5′ UTRs with one upstream and multiple internal standby site modules, labeled with roman numerals. (D) The fold reduction in fluorescence (red bars) and mRNA levels (blue bars) of 5′ UTRs M1 to M4 compared to an unstructured 5′ UTR (see also Supplementary Figure S2). (E) Fluorescence measurements of 5′ UTRs M1 to M4 show that upstream standby site modules can support high translation rates. (F) Model predictions are shown for 5′ UTRs M1 to M4 using either ΔGdistortion (gray bars) or ΔGdistortion + ΔGsliding (blue bars) for each standby site module (I–IV, corresponding to labels in part C), compared to the ribosome’s apparent binding free energy penalties (dashed lines, yellow region) (average error with and without the sliding penalty is 1.15 and 3.24 kcal/mol, respectively; P-value is 0.013). ΔGdistortion for each standby site module was calculated according to Equation (3). In parts B and E, the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the unstructured 5′ UTR.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA’s translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module’s predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data support the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol of free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nucleotide of hairpin height. We then determined whether including this sliding penalty could improve the model’s ability to predict the translation rate of long 5′ UTRs with multiple standby site modules with ‘hairpins of differing heights’. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome’s binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4, alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale, according to the log-linear relationship between translation initiation rate and binding free energy differences (‘Materials and Methods’ section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).
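The sliding penalty described in this subsection amounts to a single multiplication, sketched below. The coefficient c = 0.2 kcal/mol/nt and the M2 values (3.6, 0.4 kcal/mol, 15 nt) are taken from the text; the function names are illustrative, and the individual hairpin heights used for the longer example are a hypothetical split, since only the resulting penalties are reported here.

```python
# kcal/mol per nucleotide of hairpin height (value quoted in the text).
SLIDING_COEFFICIENT = 0.2


def sliding_penalty(downstream_heights, c=SLIDING_COEFFICIENT):
    """dG_sliding for a standby site module: c times the summed heights (nt)
    of all RNA structures between the module and the start codon."""
    return c * sum(downstream_heights)


def coefficient_from_measurement(apparent_dg, predicted_dg, hairpin_height):
    """Back out the per-nucleotide sliding cost from one construct."""
    return (apparent_dg - predicted_dg) / hairpin_height


if __name__ == "__main__":
    # M2: one downstream hairpin of height 15 nt (from the text).
    print(sliding_penalty([15]))
    # Hypothetical split of a 24 nt total height across two hairpins.
    print(sliding_penalty([15, 9]))
    print(coefficient_from_measurement(3.6, 0.4, 15))
```

At c = 0.2 kcal/mol/nt, the reported sliding penalties of 3.0, 4.8 and 7.8 kcal/mol correspond to summed downstream hairpin heights of 15, 24 and 39 nt.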

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combined them to create a single biophysical model. The ribosome’s binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to

$$\Delta G_{\mathrm{standby}} = \Delta G_{\mathrm{distortion}} + \Delta G_{\mathrm{unfolding}} + \Delta G_{\mathrm{sliding}} \qquad (5)$$

All three Gibbs free energies are positive, and a long, unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32). This model extends our previous biophysical model that focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence’s start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome’s translation initiation rate.
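The selection step described above (sum the three penalties for each module, then take the module with the lowest total) can be sketched as follows. The class and function names are illustrative and are not taken from the authors' software. The example numbers mirror the M2 5′ UTR discussed earlier (0.4 and 11.4 kcal/mol predicted penalties, 3.0 kcal/mol sliding penalty); how each predicted penalty divides between distortion and unfolding is not stated, so the split shown is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandbyModule:
    """Free energy terms (kcal/mol) for one candidate standby site module."""
    name: str
    dg_distortion: float
    dg_unfolding: float
    dg_sliding: float

    @property
    def dg_standby(self):
        # Equation (5): the three penalties are additive.
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def controlling_module(modules):
    """No cooperative binding: the module with the lowest total penalty wins."""
    if not modules:
        raise ValueError("at least one standby site module is required")
    return min(modules, key=lambda m: m.dg_standby)


if __name__ == "__main__":
    # M2-like example; the distortion/unfolding split is assumed for illustration.
    candidates = [
        StandbyModule("upstream (II)", dg_distortion=0.4, dg_unfolding=0.0, dg_sliding=3.0),
        StandbyModule("internal (I)", dg_distortion=11.4, dg_unfolding=0.0, dg_sliding=0.0),
    ]
    best = controlling_module(candidates)
    print(best.name, round(best.dg_standby, 2))
```

Taking a minimum rather than combining modules reflects the observed absence of cooperative binding; a cooperative model would instead pool the surface areas before computing a single penalty.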

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs, where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38), or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).

Figure 6. (A) The ribosomal platform binding free energy penalties, ΔGstandby, for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules’ characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA’s translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA, which uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24-nt-long single-stranded region (domain 3) is essential for ribosome binding, and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model’s calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, as it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA’s translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 2.5-fold for these mutations.

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome’s interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome’s binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules (Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome’s selective and partial unfolding of RNA structures. Our data support a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

Figure 7. The sequence and structure of the E. coli thrS 5′ UTR are shown, with domains 2, 3 and 4 labeled. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(−32)>A; BS4-9CS29, G(−32)>A and G(−46)>A; L19, A(−13)>C; L7, U(−49)>G; CS302, deletion of domain 2 from A(−14) to U(−48); ILOΔ1, transcription begins at G(−159); ILOΔ2, transcription begins at G(−68); ILOΔ4, transcription begins at G(−49), U(−49)>G and A(−13)>C; ILOΔ5, transcription begins at G(−49), U(−49)>G and G(−46)>A. Bolded nucleotides are the positions of new transcriptional start sites.

Importantly, the initially designed sequences have well-defined features that perturb the ribosome’s individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome’s binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model’s ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules is a potential binding site for the ribosome according to its binding free energy penalty, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures, such as G-quadruplexes and pseudoknots, will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform’s ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data show that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome’s translation initiation rate, for example, when the antibiotic edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform’s structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome’s distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome’s binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model’s error can lead to further insight into the ribosome’s interactions. The model’s predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome’s binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites near RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome’s interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface are available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance; and Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.

Nucleic Acids Research 2014 Vol 42 No 4 2657

FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai A, Petrov A, Marshall RA, Korlach J, Uemura S and Puglisi JD (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti A, Marzi S, Jenner L, Myasnikov A, Romby P, Yusupova G, Klaholz BP and Yusupov M (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner L, Romby P, Rees B, Schulze-Briese C, Springer M, Ehresmann C, Ehresmann B, Moras D, Yusupova G and Yusupov M (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel CL, Updegrove TB, Janson BJ and Storz G (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz G, Vogel J and Wassarman KM (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet L, Dubé A, Massé E and Lafontaine DA (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel CL and Smolke CD (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease RA and Belfort M (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit MH and Van Duin J (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit MH and Van Duin J (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman IA, Evfratov SA, Sergiev PV and Dontsova OA (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut A and Balasubramanian S (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis HM (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis HM, Mirsky EA and Voigt CA (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers JM, Goler JA, Juminaga D and Keasling JD (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti A, Marzi S, Myasnikov AG, Fabbretti A, Yusupov M, Gualerzi CO and Klaholz BP (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi S, Myasnikov AG, Serganov A, Ehresmann C, Romby P, Yusupov M and Klaholz BP (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova GZ, Yusupov MM, Cate J and Noller HF (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit MH and Van Duin J (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu X, Lancaster L, Noller HF, Bustamante C and Tinoco I (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon P, Maracci C, Filonava L, Gualerzi CO and Rodnina MV (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer SM and Joseph S (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille F, Unoson C, Vogel J and Wagner EGH (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik VK, Guimaraes JC, Cambray G, Mai Q-A, Christoffersen MJ, Martin L, Yu A, Lam C, Rodriguez C, Bennett G et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale S and Arkin AP (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou C, Stanton B, Chen Y-J, Munsky B and Voigt CA (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri S, Goodman DB, Cambray G, Mutalik VK, Gao Y, Arkin AP, Endy D and Church GM (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi S, Furusawa H, Ueda T and Okahata Y (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao Y, Zhang ZJ, Erickson DW, Huang M, Huang Y, Li J, Hwa T and Shi H (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews DH, Sabina J, Zuker M and Turner DH (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia T, SantaLucia J, Burkard ME, Kierzek R, Schroeder SJ, Jiao X, Cox C and Turner DH (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber AR, Lorenz R, Bernhart SH, Neuböck R and Hofacker IL (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen H, Bjerknes M, Kumar R and Jay E (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev AV and Nicholson AW (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler IM, Collado-Vides J, Santos-Zavaleta A, Peralta-Gil M, Gama-Castro S, Muniz-Rascado L, Bonavides-Martinez C, Paley S, Krummenacker M, Altman T et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik VK, Qi L, Guimaraes JC, Lucks JB and Arkin AP (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo G, Landrain TE and Jaramillo A (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura JM, Cantor CR and Collins JJ (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha J, Reyes SJ and Gallivan JP (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony PC, Perez CF, García-García C and Block SM (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet GA II, Artsimovitch I, Furman R, Sosnick TR and Pan T (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch SA, Desai SK, Sajja HK and Gallivan JP (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron M-P, Bastet L, Lussier A, Simoneau-Roy M, Massé E and Lafontaine DA (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove T, Wilf N, Sun X and Wartell RM (2008) Effect of Hfq on RprA–rpoS mRNA pairing: Hfq–RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot C, Caillet J, Graffe M, Eyermann F, Ehresmann B, Ehresmann C, Springer M and Romby P (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili IS, Agrawal RK, Grassucci R and Frank J (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons WM, May JL, Wimberly BT, McCutcheon JP, Capel MS and Ramakrishnan V (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti M, Schlünzen F, Harms J, Zarivach R, Glühmann M, Avila H, Bashan A, Bartels H, Auerbach T and Jacobi C (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson JR (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore JL, Holmstrom ED and Nesbitt DJ (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills JB, Vacano E and Hagerman PJ (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly(dT) and poly(dA). J. Mol. Biol., 285, 245–257.

52. Egbert RG and Klavins E (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.


also measured. The arithmetic mean of each distribution was taken and the mean autofluorescence was subtracted from each sample. The reported fluorescence values in this study are the average of three measurements (n = 3).

For mRNA level measurements, single colonies were first grown overnight in 5 ml LB media with chloramphenicol, then diluted to 0.01 OD600 in fresh media and grown to mid-exponential phase until reaching 1.0 OD600, as measured by a cuvette-based spectrophotometer (NanoDrop 2000C). Total RNA was extracted using the Total RNA Purification kit (Norgen Biotek). Treatment with Turbo DNAse (Ambion) eliminated genomic DNA contamination. Reverse transcription was carried out with a High Capacity cDNA Reverse Transcription kit (Applied Biosystems), followed by quantitative PCR with a TaqMan probe (5′-ACCTTCCATACGAACTTT-3′) targeting the internal coding sequence for mRFP1 (forward primer 5′-ACGTTATCAAAGAGTTCATGCGTTTC-3′ and reverse primer 5′-CGATTTCGAACTCGTGACCGTTAA-3′) and using an ABI 7300 real-time cycler. A TaqMan probe for 16S rRNA was used as an endogenous control. Triplicate measurements were each performed on two separate RNA extractions. Ct calculations were then averaged for each sample.

Biophysical model calculations

The complete biophysical model employs a free energy model to calculate the total binding free energy between the ribosome and mRNA according to the equation (Figure 1C):

\[ \Delta G_{\mathrm{total}} = \Delta G_{\mathrm{mRNA\text{-}rRNA}} + \Delta G_{\mathrm{spacing}} + \Delta G_{\mathrm{start}} + \Delta G_{\mathrm{standby}} - \Delta G_{\mathrm{mRNA}} \qquad (1) \]

Using statistical thermodynamics, we may relate this total Gibbs free energy change to the mRNA’s translation initiation rate r according to

\[ r \propto \exp(-\beta \, \Delta G_{\mathrm{total}}) \qquad (2) \]

This relationship has been previously validated on 132 mRNA sequences where the ΔGtotal varied from −10 to 16 kcal/mol, resulting in well-predicted translation rates that varied by over 100 000-fold (14). The apparent Boltzmann constant β has been measured as 0.45 ± 0.05 mol/kcal, which was confirmed in a second study (29).

In the initial state, the mRNA exists in a structured conformation, where its free energy of folding is ΔGmRNA (ΔGmRNA is negative). After assembly of the 30S ribosomal subunit, the last nine nucleotides of its 16S rRNA have hybridized to the mRNA while all non-clashing mRNA structures are allowed to fold. The free energy of folding for this mRNA–rRNA complex is ΔGmRNA–rRNA (ΔGmRNA–rRNA is negative). mRNA structures that impede 16S rRNA hybridization or overlap with the ribosome footprint remain unfolded in the final state. These Gibbs free energies are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32).
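As a minimal sketch of how Equations (1) and (2) combine, the Python below sums the free energy terms and converts a difference in ΔGtotal into a fold-change in translation initiation rate. The function names and the example energies are illustrative assumptions, not part of the published software; only β = 0.45 mol/kcal and the form of the two equations come from the text.

```python
import math

BETA = 0.45  # apparent Boltzmann constant in mol/kcal, as quoted in the text


def total_binding_free_energy(dg_mrna_rrna, dg_spacing, dg_start, dg_standby, dg_mrna):
    """Equation (1): final-state terms minus the folding energy of the free mRNA (kcal/mol)."""
    return dg_mrna_rrna + dg_spacing + dg_start + dg_standby - dg_mrna


def relative_initiation_rate(dg_total, beta=BETA):
    """Equation (2), up to an unknown proportionality constant."""
    return math.exp(-beta * dg_total)


def fold_change(dg_total_a, dg_total_b, beta=BETA):
    """Ratio r_a / r_b; the proportionality constant cancels."""
    return math.exp(-beta * (dg_total_a - dg_total_b))


# Hypothetical energies (kcal/mol), chosen only to exercise the functions.
example_total = total_binding_free_energy(-12.0, 0.5, -1.2, 3.0, -8.0)
span = fold_change(-10.0, 16.0)
```

With these definitions, `fold_change(-10.0, 16.0)` evaluates to roughly 1.2 × 10^5, which is consistent with the statement that a ΔGtotal range of −10 to 16 kcal/mol corresponds to rates varying by over 100 000-fold.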

Three additional interactions will alter the translation initiation rate. The tRNAfMet anti-codon loop hybridizes to the start codon (ΔGstart is most negative for AUG and GUG). The 30S ribosomal subunit also prefers a five nucleotide aligned spacing (33). Non-optimal distances between the 16S rRNA binding site and the start codon lead to an energetic binding penalty. This relationship between the ribosome’s binding penalty (ΔGspacing > 0) and aligned spacing distance was measured in our previous study (14). Finally, the 5′ UTR binds to the ribosomal platform with a free energy penalty ΔGstandby, which is calculated according to Equation (5) and described in the main text. A maximally accessible standby site module will bind to the ribosomal platform with a zero ΔGstandby. Gibbs free energy calculations for all mRNA sequences are listed in the Supplementary Data.

The biophysical model automatically decomposes a 5′ UTR into one or more standby site modules using the following definitions. A standby site module contains a single mRNA hairpin surrounded by two single-stranded regions: an upstream distal binding site and a downstream proximal binding site. The lengths of the distal and proximal binding sites are denoted by D and P, respectively. The hairpin’s height H is determined by the number of nucleotides from the hairpin’s base to the mid-point of the hairpin’s loop, including bulges, internal loops and base pairs. The heights of branched hairpins are determined by averaging the branches’ heights (Supplementary Figure S3). Hairpins are only considered as components of standby site modules if they are mutually exclusive with the 16S rRNA binding site. A 5′ UTR with multiple upstream hairpins will contain multiple standby site modules. We consider a 5′ UTR without upstream hairpins to contain a single standby site module with zero hairpin height and zero distal site length. Using these definitions, the biophysical model calculates a ΔGstandby for each standby site module.
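The definitions above can be illustrated with a short Python sketch that reads D, H and P from a dot-bracket string covering only the standby region (the sequence upstream of the 16S rRNA binding site). This is one possible reading, not the authors’ implementation: the function name is hypothetical, the region is assumed to hold at most one unbranched hairpin, and the height is approximated as half of the hairpin’s nucleotide span, which equals stem length plus half the loop length for a simple hairpin. Branched hairpins, whose heights are averaged across branches, are not handled.

```python
def standby_module_geometry(dot_bracket):
    """Return (D, H, P) for a standby region holding at most one unbranched hairpin.

    dot_bracket: structure of the region upstream of the 16S rRNA binding site,
    written 5' to 3' with '(' and ')' for paired and '.' for unpaired nucleotides.
    """
    first_paired = dot_bracket.find("(")
    last_paired = dot_bracket.rfind(")")
    if first_paired == -1 or last_paired == -1:
        # No hairpin: zero height and zero distal length, as stated in the text;
        # every nucleotide is counted as proximal binding site.
        return 0, 0.0, len(dot_bracket)
    distal = first_paired                            # unpaired nt upstream of the hairpin
    proximal = len(dot_bracket) - 1 - last_paired    # unpaired nt downstream of the hairpin
    # Assumption: base-to-loop-midpoint height is half of the hairpin's span.
    height = (last_paired - first_paired + 1) / 2
    return distal, height, proximal


# 2 nt distal site, 6 bp stem with a 6 nt loop, 4 nt proximal site.
geometry = standby_module_geometry("..((((((......))))))....")
```

For the example string, the half-span rule gives a height of 9 nt for a 6 bp stem with a 6 nt loop. Asymmetric bulges or internal loops would make the half-span only an approximation of the stated definition.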

The available single-stranded surface area As of a standby site module is determined according to As = 15 + P + D − H. The key aspect of this equation is that proximal length, distal length and hairpin height have equal and interchangeable effects on the amount of single-stranded surface area. The constant 15 was selected to create a positive metric for single-stranded surface area. Based on data shown in Figure 2, increasing the hairpin height beyond 15 nt did not reduce the mRNA translation rate, regardless of the proximal or distal site lengths. Accordingly, the value of H in the equation is not allowed to exceed 15 nt.
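The surface area definition, including the 15 nt cap on hairpin height, can be written as a one-line function. The sketch below is illustrative; the function and parameter names are assumptions rather than identifiers from the published model.

```python
H_CAP = 15   # nt; hairpin heights above this value are treated as 15, per the text
OFFSET = 15  # constant chosen in the text to keep the metric positive


def surface_area(proximal, distal, height, offset=OFFSET, h_cap=H_CAP):
    """As = 15 + P + D - H, with H not allowed to exceed 15 nt."""
    return offset + proximal + distal - min(height, h_cap)


# Both modules have P = 2 nt and D = 2 nt; the taller hairpin is capped at 15 nt.
area_h15 = surface_area(2, 2, 15)
area_h18 = surface_area(2, 2, 18)
```

Because of the cap, an 18 nt hairpin and a 15 nt hairpin with the same binding site lengths receive the same surface area.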

The ribosomal platform’s distortion energy penalty is a quadratic function of the available RNA surface area As (Figure 2) according to

\[ \Delta G_{\mathrm{distortion}} = C_1 (A_s)^2 + C_2 (A_s) + C_3 \qquad (3) \]

where C1, C2 and C3 are the fitted parameter values and equal to 0.038 kcal/mol/nt², −1.629 kcal/mol/nt and 17.359 kcal/mol, respectively. An alternative constant in the definition of surface area would alter these coefficients, but the overall relationship between the surface area and the distortion free energy penalty would remain the same.
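A minimal sketch of Equation (3) follows. The coefficient magnitudes are taken from the text; the negative sign on C2 is inferred because it reproduces the distortion penalties quoted in the Results for As = 4, 7 and 13 (about 11.5, 7.8 and 2.6 kcal/mol). Clamping the penalty to zero beyond the parabola’s minimum is an assumption made here so that highly accessible modules plateau at zero, as the text describes for maximally accessible standby sites.

```python
C1 = 0.038    # kcal/mol/nt^2
C2 = -1.629   # kcal/mol/nt (sign inferred from the worked example in the Results)
C3 = 17.359   # kcal/mol


def distortion_penalty(a_s):
    """Equation (3), clamped to zero past the parabola's vertex (assumption)."""
    vertex = -C2 / (2.0 * C1)  # surface area at which the quadratic is smallest
    if a_s >= vertex:
        return 0.0
    return max(0.0, C1 * a_s ** 2 + C2 * a_s + C3)


penalties = [round(distortion_penalty(a), 1) for a in (4, 7, 13)]
```

The list `penalties` holds the values for surface areas of 4, 7 and 13 nt, which can be compared directly with the illustrative example in the Results.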

Measurements of translation rate and binding free energy penalties

In our steady-state, long-time growth conditions, translation initiation is a rate-limiting step in codon-optimized mRFP1 fluorescent protein expression; fluorescence levels are proportional to translation initiation rates under our controlled growth conditions (13); and translation initiation rate is related to total binding free energies according to a log-linear formula (Equation 2). We measure the change in apparent binding free energy penalty by comparing the fluorescence of a structured 5′ UTR with the fluorescence of a maximally accessible 5′ UTR according to

\[ \Delta G_{\mathrm{standby}} = -\frac{1}{\beta} \ln\left(\frac{X}{X_{\mathrm{ref}}}\right) - \Delta G_{\mathrm{other}} \qquad (4) \]

where X is the fluorescence from a sample synthetic 5′ UTR, Xref is the fluorescence of the unstructured 5′ UTR and β is 0.45 mol/kcal. ΔGother is the difference in free energies between the sample and reference 5′ UTRs that depends on other ribosomal interactions, and not on the standby site. By using identical and carefully designed 16S rRNA binding sites, spacer regions and coding sequences in the 136 synthetic mRNA sequences, the value for ΔGother was zero for 123 mRNA sequences and a maximum of 0.90 kcal/mol for the remaining sequences.

[Figure 2 image. Panels A and B: schematics of a standby site module showing the distal site, hairpin height, proximal site, SD and AUG, with As = 15 + P + D − H. Panels C–E: fluorescence (au) and ΔGstandby (kcal/mol) versus proximal length P (nt), distal length D (nt) and hairpin height H (nt). Panels F and G: fluorescence (au) and ΔGstandby (kcal/mol) versus standby site module surface area As.]

Figure 2. (A) Standby site modules are defined by a proximal and distal binding site separated by an RNA structure. (B) Standby site module surface area is calculated by summing the proximal (P) and distal (D) binding site lengths and subtracting the hairpin height (H). The surface area is kept positive by adding a constant 15. (C–E) Translation rate and ribosome binding free energy penalties are measured as proximal binding site length, distal binding site length and hairpin height are increased. The ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). (C) Hairpin heights are 9 nt (green stars), 12 nt (black asterisks), 15 nt (red squares) or 18 nt (blue circles). (D) Hairpin heights are 15 nt (red squares) or 18 nt (blue circles). (E) Hairpin height varies from 9 to 12 nt by adding to the loop length of a 6 bp stem (blue circles) or from 14 to 18 nt by adding to the loop length of a 12 bp stem (red squares). (F) 56 characterized 5′ UTRs (As < 24) from parts (C–E) are combined to quantitatively relate the standby site module’s surface area to the ribosome’s translation rate and binding free energy penalty. Dashed line is a best-fit quadratic equation. (G) The validity of the quadratic equation is tested by measuring translation rates from 15 additional 5′ UTRs and comparing to model predictions (R² = 0.78; average error is 0.66 kcal/mol). Data points are the result of three measurements in 1 day. In parts (C–G), the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the reference unstructured 5′ UTR.
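Equation (4), which converts a fluorescence ratio into an apparent standby site penalty, can be sketched in a few lines of Python. The function name and the fluorescence values are hypothetical; β = 0.45 mol/kcal is the value quoted in the text, and ΔGother defaults to zero as it was for most of the characterized sequences.

```python
import math


def standby_penalty(x, x_ref, dg_other=0.0, beta=0.45):
    """Equation (4): apparent standby site penalty (kcal/mol) from fluorescence.

    x        fluorescence of the structured 5' UTR
    x_ref    fluorescence of the unstructured reference 5' UTR
    dg_other free energy difference from other ribosomal interactions
    """
    return -math.log(x / x_ref) / beta - dg_other


# Hypothetical fluorescence values: a 100-fold drop relative to the reference.
penalty_100_fold = standby_penalty(150.0, 15000.0)
```

Under these assumptions, a 100-fold reduction in fluorescence corresponds to a penalty of about 10.2 kcal/mol, and equal fluorescence gives a penalty of zero.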

Statistical analysis

A two-sample t-test (two-tailed, 5% significance level) was performed to compare the error distributions of different sets of model predictions. The null hypothesis assumed that the errors in two different sets of model predictions have the same mean and variance. The alternative hypothesis indicates that the errors in the two sets follow different distributions. The error in each model prediction was calculated as the measured ΔGstandby minus the predicted ΔGstandby. Low probabilities (P < 0.05) reject the null hypothesis, showing that the differences between the two sets of model predictions were statistically significant.

RESULTS

Standby site module surface area controls the ribosome’s translation rate

We first employed a learn-by-design approach to determine the ribosomal platform’s interactions with standby sites that contain a single hairpin, and therefore contain a single standby site module. Standby site modules are defined by their hairpin height (H), upstream distal binding site length (D) and downstream proximal binding site length (P) (Figure 2A; ‘Materials and Methods’ section). We designed, constructed and characterized 62 synthetic 5′ UTRs containing single standby site modules with hairpins from 9 to 18 bp long, distal binding sites from 1 to 15 nt long and proximal binding sites from 1 to 24 nt long. Long mRNA hairpins contain internal loops to avoid potential RNAse activity (34). All mRNA sequences are transcribed by the J23100 constitutive promoter, encode the mRFP1 fluorescent protein and are designed with identical SD and spacer sequences (‘Materials and Methods’ section). Flow cytometry measurements from long-time log-phase cultures with periodic serial dilutions revealed that mRFP1 fluorescence varied by 221-fold (Figure 2C–E). In contrast, mRNA levels varied by at most 5.8-fold according to RT-qPCR measurements (Supplementary Figure S2). Changes in fluorescence are primarily due to changes in translation rate, and in these growth conditions, fluorescence and translation initiation rates are proportional. All sequences and measurement data are listed in the Supplementary Data.

According to the data, standby site module accessibility plays an important role in modulating the ribosome’s interactions with 5′ UTRs and controlling translation rate across a large range. In particular, lengthening the distal binding site by four nucleotides increased the mRNA’s translation rate by 4.2-fold, whereas lengthening the proximal binding site by the same amount increased it by 4.0-fold. Shortening the hairpin’s height by 4 bp increased the mRNA’s translation rate similarly, by 3.1-fold. More generally, the slopes (first derivatives) of the log-transformed translation rates all have similar magnitudes (Supplementary Data, column E). Highly accessible standby site modules with long proximal binding sites or short hairpins also cause the mRNA’s translation rate to reach a plateau with the same fluorescence level as an mRNA with an unstructured 5′ UTR (Figure 2C). These observations led to the hypothesis that the geometric characteristics of the standby site module (D, P, H) are interchangeable and that its available single-stranded surface area As determines the ribosomal platform’s ability to bind to it.

A standby site module’s single-stranded surface area is the number of nucleotides capable of single-stranded contact with the ribosomal platform. Longer proximal or distal binding sites increase the standby site module’s surface area, while longer hairpin structures reduce the standby site module’s accessibility. mRNA structures can potentially reduce site accessibility by folding over the neighboring single-stranded sites and sequestering binding regions. A standby site module’s available surface area is calculated by summing its proximal and distal binding site lengths and subtracting the hairpin’s height (Figure 2B; ‘Materials and Methods’ section). A highly accessible standby site module has a large As, whereas a less accessible standby site module has a small As. We then evaluated whether the standby site module’s surface area is the key determinant by comparing As to the measured log-transformed translation rates. For all 62 5′ UTRs with diverse geometric features, we found a single quantitative relationship that described the effect of surface area on the mRNA’s translation rate (Figure 2F).

We now turn to thermodynamic modeling to quantify the ribosome’s interactions with standby site modules. The competitive binding of free ribosomes to the pool of intracellular mRNAs allows one to use statistical thermodynamics to relate a ribosome’s binding free energy to its translation initiation rate according to a constant log-linear relationship (Equation 2) (13,14). We calculate the interaction energy between the ribosome and an mRNA’s standby site modules by comparing its translation rate to the translation rate of a reference unstructured 5′ UTR, both with an identical SD sequence, spacer region and protein coding sequence (Equation 4). Using the data in Figure 2C–E, we found that the standby site binding free energy penalty ΔGstandby varied from 0 to 11.8 kcal/mol when changing its geometric characteristics, and that a simple quadratic function succinctly described the effect of site accessibility on the ribosome’s binding free energy penalty (R² = 0.97; average error is 0.51 kcal/mol) (Figure 2F). A maximally accessible standby site module has a zero binding free energy penalty (ΔGstandby = 0), indicating that the mRNA’s translation initiation rate is not reduced by the ribosomal platform’s interactions. A standby site module with a positive binding free energy penalty (ΔGstandby > 0) has reduced the mRNA’s translation initiation rate by decreasing the ribosomal platform’s ability to initially bind the mRNA.

Next, we critically tested the parameterized model by constructing 15 new 5′ UTRs and measuring their translation initiation rates and ribosome binding free energy penalties. The 5′ UTRs each contain a standby site module with diverse structures between 43 and 91 nt long, including variable-length distal and proximal binding sites, hairpin heights and two- and three-branched structures (Supplementary Data). For branched structures, we show that the relevant hairpin height is the average across the multiple branches (Supplementary Figure S3) (see also Supplementary Text). The quadratic relationship accurately predicted the ribosome’s translation rate and binding free energy penalty (R² = 0.78; average error is 0.66 kcal/mol), confirming that the standby site module’s accessibility controls the ribosome’s ability to bind and initiate translation (Figure 2G). The quadratic relationship contains three parameters, which are now treated as constants (Equation 3; ‘Materials and Methods’ section).

Coupled energetic trade-offs when unfolding structures in standby site modules

Prior to initiating translation, the 30S ribosome does not have an external source of energy. Initiation factor 2 does not hydrolyze GTP until after translation has been initiated, during 50S recruitment (22). Consequently, the 30S complex expends its own free energy to unfold mRNA structures. Previous studies have shown that mRNA structures that sequester the 16S rRNA binding site are unfolded prior to initiating translation (9,14,22). The mRNA’s translation initiation rate is reduced according to the free energy penalty for unfolding these structures, as described in Equations (1) and (2) (‘Materials and Methods’ section). However, it remains unclear whether mRNA structures located upstream of SD sequences, within standby site modules, are unfolded or accommodated. We next investigated how the ribosome unfolds hairpins in long 5′ UTRs.

By definition, there is a coupled thermodynamic relationship between the unfolding of an mRNA hairpin and increasing the accessibility of the corresponding standby site module, where the accessibility of the standby site module controls the ribosomal platform’s distortion energy penalty, ΔGdistortion (see ‘Materials and Methods’ section). Unfolding a single base pair in a module’s hairpin will increase the standby site module’s surface area by three nucleotides: the hairpin’s height decreases by one, and the lengths of both the proximal and distal binding sites increase by one. Therefore, unfolding a module’s hairpin will lead to higher accessibility, a lower ΔGdistortion and a higher translation rate. However, unfolding base pairs requires an input of free energy (ΔGunfolding > 0), which will lead to a lower translation rate. Consequently, there is an energetic trade-off between unfolding mRNA structures and increasing the surface accessibility of standby site modules. The ribosome may completely unfold a structured standby site module to maximize its surface area, but the energetic cost may be high (high ΔGunfolding penalty with ΔGdistortion = 0). In contrast, the ribosome may fully accommodate a standby site module’s hairpin without unfolding it, even though its low surface accessibility will penalize the ribosome’s ability to bind to it (high ΔGdistortion penalty with ΔGunfolding = 0).

Based on thermodynamic principles, we hypothesize that the ribosome will ‘selectively’ and ‘partially’ unfold structured standby site modules to minimize its total binding free energy penalty (ΔGstandby = ΔGdistortion + ΔGunfolding). Figure 3A and B presents an illustrative example of this principle and our use of minimization to calculate the ribosome’s binding free energy penalty to a standby site module. A 5′ UTR with a single standby site module initially contains a short proximal binding site (P = 2 nt), a short distal binding site (D = 2 nt) and a long hairpin (H = 15 nt), resulting in a largely inaccessible 5′ UTR (As = 4). Based on the standby site module’s surface area, the ribosome has an initial distortion energy penalty of 11.5 kcal/mol. However, the ribosome can selectively unfold the hairpin’s closing AU base pairs, increasing the standby site module’s surface area and lowering its ΔGdistortion at the expense of increasing its ΔGunfolding. Unfolding 1 bp increases the standby site module’s surface area by 3 nt, decreases the distortion energy penalty to 7.8 kcal/mol and increases the unfolding energy penalty to 0.90 kcal/mol, leading to a total binding free energy penalty of 8.7 kcal/mol. The ribosome can continue to unfold base pairs to minimize its total binding free energy penalty; unfolding 3 bp at a cost of 2.7 kcal/mol leads to a distortion energy penalty of 2.6 kcal/mol and a total binding free energy penalty of 5.3 kcal/mol (Figure 3B). Two additional examples are shown in Supplementary Figure S4. The unfolding free energies are calculated using the Vienna RNA suite with the Turner 1999 RNA parameters (see ‘Materials and Methods’ section). Importantly, thermodynamic minimization does not add any new parameters to the model.

We tested the proposed mechanism by characterizing the translation rate of 22 synthetic 5′ UTRs (Figure 3C–E) that contain single standby site modules with varied geometries (D = 0 to 7 nt and P = 1 to 20 nt) and differing hairpin sequences, sizes and folding energetics (Supplementary Figure S5A). We employed the minimization principle to calculate the number of unfolded base pairs and to predict the ribosome’s binding free energy penalty, ΔGstandby (Figure 3D). As comparisons, we performed the same calculation while asserting that either no additional energy was inputted to unfold hairpins (high ΔGdistortion and ΔGunfolding = 0) (Figure 3C) or that hairpins were completely unfolded by the ribosome (high ΔGunfolding and ΔGdistortion = 0) (Figure 3E).

The data supports our hypothesis that the ribosome partially and selectively unfolds structured standby site modules to minimize its total binding free energy penalty (ΔGstandby), rather than separately minimize its distortion or unfolding penalties (average error is 0.90 kcal/mol when minimizing ΔGstandby, 2.42 kcal/mol when ΔGunfolding = 0 and 10.24 kcal/mol when ΔGdistortion = 0; comparative P-values are 0.004 and 10⁻¹³, respectively). Remarkably, in all 22 cases, only 1–3 bp of each hairpin are predicted by the model to be unfolded, while accommodating the remaining large RNA structures.
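The minimization can be illustrated with a few lines of Python. This is a minimal sketch rather than the authors' implementation: it uses only the three states quoted for the Figure 3A and B example (0, 1 and 3 unfolded base pairs), and the names `UnfoldingState` and `minimize_standby_penalty` are hypothetical. In the full model, ΔGdistortion would come from Equation (3) and ΔGunfolding from the Vienna RNA suite for every candidate number of unfolded base pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnfoldingState:
    unfolded_bp: int       # number of closing base pairs unfolded
    dg_distortion: float   # kcal/mol, ribosomal platform distortion penalty
    dg_unfolding: float    # kcal/mol, cost of unfolding those base pairs

    @property
    def surface_area(self) -> int:
        # The example starts at As = 4; each unfolded bp adds 3 nt.
        return 4 + 3 * self.unfolded_bp

    @property
    def dg_standby(self) -> float:
        # Total binding free energy penalty for this state.
        return self.dg_distortion + self.dg_unfolding


# Values quoted in the text for the Figure 3A and B example.
STATES = [
    UnfoldingState(0, 11.5, 0.0),
    UnfoldingState(1, 7.8, 0.90),
    UnfoldingState(3, 2.6, 2.7),
]


def minimize_standby_penalty(states):
    """Return the state with the lowest total binding free energy penalty."""
    return min(states, key=lambda s: s.dg_standby)


best = minimize_standby_penalty(STATES)
print(best.unfolded_bp, best.surface_area, round(best.dg_standby, 2))
```

Among the listed states, the minimum falls at three unfolded base pairs, where neither penalty alone is at its lowest value.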

Nucleic Acids Research 2014 Vol 42 No 4 2651

Non-cooperative binding to multiple standby site modules

We have shown that the ribosome can accommodate large structures of standby site modules and engage in frequent translation, but it is unclear how the presence of multiple standby site modules in a 5′ UTR will affect its translation rate. The presence of multiple standby site modules could allow the ribosome to cooperatively bind mRNA and increase its translation rate. We next characterized nine synthetic 5′ UTRs (76–164 nt long) with up to four standby site modules to investigate the potential for cooperative binding (Figure 4A). If the ribosome can engage in cooperative binding, then the additional standby site modules will increase the overall available surface area, reduce the ribosome’s binding free energy penalty and increase the translation initiation rate. In particular, we expect the greatest cooperativity when the available surface area of a single standby site module is limited.

We found that 5′ UTRs with two, three or four standby site modules were all translated at the same rate, indicating an absence of cooperative binding (Figure 4B), including when the available surface areas for individual standby site modules were significantly lowered. Therefore, even when the combined surface areas of multiple standby site modules could potentially increase contact with the ribosomal platform and increase binding, a cooperative increase in translation rate was not observed. Consequently, the data supports a mechanism whereby the ribosome binds to only a single standby site module prior to initiating translation.

Distant standby site modules can support high translation rates

We next investigated whether any standby site module could be bound by the ribosome, even if located far upstream of the start codon. Using the model’s predictions, we forward-engineered multi-standby site module synthetic 5′ UTRs where the most distantly upstream site is the only one capable of supporting high translation rates (Figure 4C). Internal standby site modules have long hairpins and limited surface areas (model predictions: ΔGstandby between 9.0 and 17.4 kcal/mol) compared to the maximally accessible upstream standby site modules. Therefore, if the ribosome binds the farthest upstream standby site module, its translation rate will be at least 127-fold higher than binding to the internal standby site modules. We measured fluorescence levels with flow cytometry and mRNA levels with RT-qPCR. Compared to an unstructured 5′ UTR, the mRNA levels decreased by only 1.3- to 5.5-fold, while the fluorescence levels were correspondingly reduced between 2.5- and 18.3-fold, showing that changes in fluorescence were mainly due to changes in translation rate and ribosome binding free energy penalties (Figure 4D).

Figure 3. (A and B) Competition between ribosomal distortion and selective RNA unfolding is illustrated on a structured 5′ UTR with a weak hairpin. The ribosome’s distortion penalty ΔGdistortion (dashed red line), hairpin unfolding penalty ΔGunfolding (dotted dashed blue line) and their sum ΔGstandby (green solid line) are shown as the hairpin’s closing base pairs are unfolded. The total binding free energy penalty has a local minimum when three base pairs are unfolded. See also Supplementary Figure S4. (C–E) Scatter plots comparing measured translation rates from 22 synthetic 5′ UTRs and model predictions to test three mechanisms for the ribosome’s unfolding of RNA structures. The average difference between model predictions and measurements is (C) 2.42 kcal/mol when not unfolding structures, (D) 0.90 kcal/mol when partially unfolding structures and (E) 10.24 kcal/mol when fully unfolding structures. Comparative P-values from two-sample t-tests are 0.004 (C versus D) and 10⁻¹³ (D versus E). Data points are the result of three measurements in 1 day. In parts (C–E), the diagonal dashed line is the predicted translation rate according to Equation (2).

We found that all the mRNAs’ translation rates remained very high, even when the accessible standby site module was located over 100 nt upstream of the start codon (M4 5′ UTR) (Figure 4E). In particular, the M2 5′ UTR is translated at a 33.8-fold higher rate when compared to the translation rate that could be supported by binding to the standby site module closest to the start codon. Moving the accessible standby site module upstream, away from the start codon, does not significantly reduce the translation rate. Based on these measurements, the apparent standby site binding penalties are 3.6, 6.2 and 6.5 kcal/mol for M2 to M4 5′ UTRs (Figure 4F), respectively, which is much lower than the internal standby site modules’ predicted binding free energy penalties (9.0–17.4 kcal/mol). These measurements suggest that the ribosome will bind to the standby site module that has the lowest binding free energy penalty, regardless of its distance from the start codon.
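The link between a binding free energy difference and a fold change in translation rate can be illustrated with a short calculation. The sketch below assumes a Boltzmann-type relationship, rate ∝ exp(−β ΔG), with β = 0.45 mol/kcal. Both the functional form and the value of β are carried over from the earlier RBS Calculator work (14) and are assumptions here, because Equations (2) and (4) are defined in the ‘Materials and Methods’ section rather than in this part of the text. The function name `fold_change` is hypothetical.

```python
import math

BETA = 0.45  # mol/kcal; assumed apparent Boltzmann factor, not stated in this section


def fold_change(dg_high: float, dg_low: float, beta: float = BETA) -> float:
    """Fold increase in translation rate when the penalty drops from dg_high to dg_low."""
    return math.exp(beta * (dg_high - dg_low))


# M2 5' UTR (values in kcal/mol): apparent penalty of 3.6 versus the predicted
# penalty of 11.4 for binding to the internal standby site module.
m2_ratio = fold_change(dg_high=11.4, dg_low=3.6)
print(f"{m2_ratio:.1f}-fold")
```

Under these assumptions, a difference of a few kcal/mol already corresponds to a difference of more than an order of magnitude in translation rate.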

The ribosomal platform can slide across upstream structures

Several mechanisms would allow the ribosomal platform to bind distant upstream standby site modules and transition to a pre-initiation complex at the start codon. After the ribosome has bound to the upstream standby site module, the 30S complex could engage in diffusive hopping that would increase its local concentration and assembly rate at the start codon. The rate of assembly would largely depend on the physical distance between the standby site module and start codon, and not on the presence of RNA structures. Alternatively, the ribosome could employ a full contact sliding mechanism that reorients the mRNA until a stable pre-initiation complex has formed. The positively charged surface of the ribosomal platform makes it ideal for maintaining contact with a 5′ UTR until 16S rRNA hybridization has anchored the mRNA. Unlike a diffusive hopping mechanism, the presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex’s assembly rate.

Figure 4. (A) Synthetic 5′ UTRs with multiple, evenly spaced standby site modules. Spacing N is varied from 4 to 12 nt. (B) Measured translation rates are compared to standby site module multiplicity, showing the absence of cooperative binding to multiple standby site modules, regardless of limiting surface areas. The ΔGstandby numbers shown in secondary y-axis are directly related to the data according to Equation (4). (C) Synthetic 5′ UTRs with one upstream and multiple internal standby site modules, labeled with roman numbers. (D) The reduction in fluorescence (red bars) and mRNA levels (blue bars) of 5′ UTRs M1 to M4 compared to an unstructured 5′ UTR (see also Supplementary Figure S2). (E) Fluorescence measurements of 5′ UTRs M1 to M4 show that upstream standby site modules can support high translation rates. (F) Model predictions are shown for 5′ UTRs M1 to M4 using either ΔGdistortion (gray bars) or ΔGdistortion + ΔGsliding (blue bars) for each standby site module (I–IV, corresponding to labels in part C), compared to the ribosome’s apparent binding free energy penalties (dashed lines, yellow region) (average error with and without the sliding penalty is 1.15 and 3.24 kcal/mol, respectively; P-value is 0.013). ΔGdistortion for each standby site module was calculated according to Equation (3). In parts B and E, the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the unstructured 5′ UTR.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA’s translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module’s predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data supports the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol of free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nucleotide of hairpin height. We then determined whether including this sliding penalty could improve the model’s ability to predict the translation rate of long 5′ UTRs with multiple standby site modules with ‘hairpins of differing heights’. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome’s binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4 alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale according to the log-linear relationship between translation initiation rate and binding free energy differences (‘Materials and Methods’ section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).
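The sliding penalty calculation can be sketched as follows. This is a minimal sketch, assuming that each standby site module is described by the heights of the hairpins lying between it and the start codon. The function names `sliding_penalty` and `fit_coefficient` and the list representation are hypothetical, and only the M2 case (one intervening hairpin with a height of 15 nt) uses a hairpin height given in the text.

```python
SLIDING_COEFFICIENT = 0.2  # kcal/mol per nt of hairpin height (c in the text)


def sliding_penalty(downstream_hairpin_heights, c=SLIDING_COEFFICIENT):
    """Free energy penalty (kcal/mol) for sliding past downstream hairpins."""
    return c * sum(downstream_hairpin_heights)


def fit_coefficient(dg_apparent, dg_predicted, hairpin_height):
    """Estimate c from one construct with a single intervening hairpin."""
    return (dg_apparent - dg_predicted) / hairpin_height


# M1: zero sliding penalty, modeled here as no intervening hairpins.
print(sliding_penalty([]))                       # 0.0
# M2: one intervening hairpin with a height of 15 nt.
print(round(sliding_penalty([15]), 2))           # 3.0
# Coefficient implied by the M2 measurement: (3.6 - 0.4) / 15.
print(round(fit_coefficient(3.6, 0.4, 15), 2))   # 0.21
```

The coefficient recovered from the single M2 construct falls within the stated uncertainty of the value used in the model.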

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combine them together to create a single biophysical model. The ribosome’s binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to

$$\Delta G_{\mathrm{standby}} = \Delta G_{\mathrm{distortion}} + \Delta G_{\mathrm{unfolding}} + \Delta G_{\mathrm{sliding}} \qquad (5)$$

All three Gibbs free energies are positive, and a long, unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32). This model extends our previous biophysical model that focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).
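To make the accuracy figures concrete, the sketch below shows one way an apparent ΔGstandby and an average error could be computed from fluorescence data. It assumes that the apparent penalty is ln(F0/F)/β, with F0 the fluorescence of the unstructured 5′ UTR and β = 0.45 mol/kcal; this stands in for Equation (4), which is not reproduced in this section. ‘Average error’ is taken here to mean the mean absolute difference, which is also an assumption. The function names, fluorescence values and predictions are invented for illustration and are not data from the study.

```python
import math

BETA = 0.45  # mol/kcal; assumed, stands in for the constant in Equation (4)


def apparent_dg_standby(fluorescence, fluorescence_unstructured, beta=BETA):
    """Apparent penalty (kcal/mol) relative to an unstructured 5' UTR."""
    return math.log(fluorescence_unstructured / fluorescence) / beta


def average_error(predicted, apparent):
    """Mean absolute difference between predicted and apparent penalties."""
    return sum(abs(p - a) for p, a in zip(predicted, apparent)) / len(predicted)


# Hypothetical fluorescence readings (au); not data from the study.
F_UNSTRUCTURED = 20000.0
fluorescence = [20000.0, 5000.0, 800.0]
predicted = [0.0, 3.5, 7.0]  # hypothetical model predictions, kcal/mol

apparent = [apparent_dg_standby(f, F_UNSTRUCTURED) for f in fluorescence]
print([round(a, 2) for a in apparent])
print(round(average_error(predicted, apparent), 2))
```

Because the assumed relationship is logarithmic, a fixed error in kcal/mol corresponds to a fixed fold error in predicted fluorescence, whatever the absolute expression level.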

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence’s start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome’s translation initiation rate.
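The selection step amounts to evaluating Equation (5) for every standby site module in a 5′ UTR and keeping the minimum. A minimal sketch follows, assuming the three energy terms have already been computed for each module. The `StandbyModule` class, the `controlling_module` function and the example numbers are hypothetical and are not taken from any genomic 5′ UTR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandbyModule:
    name: str
    dg_distortion: float  # kcal/mol
    dg_unfolding: float   # kcal/mol
    dg_sliding: float     # kcal/mol

    @property
    def dg_standby(self) -> float:
        # Equation (5): sum of the three penalties.
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def controlling_module(modules):
    """Pick the single module with the lowest penalty (no cooperative binding)."""
    return min(modules, key=lambda m: m.dg_standby)


# Hypothetical 5' UTR with three modules, listed from upstream to downstream.
modules = [
    StandbyModule("I", dg_distortion=0.4, dg_unfolding=0.0, dg_sliding=6.0),
    StandbyModule("II", dg_distortion=5.0, dg_unfolding=1.8, dg_sliding=3.0),
    StandbyModule("III", dg_distortion=9.5, dg_unfolding=0.9, dg_sliding=0.0),
]

best = controlling_module(modules)
print(best.name, round(best.dg_standby, 2))
```

In this made-up case the most upstream module is selected despite carrying the largest sliding penalty, because its accessibility outweighs that cost.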

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38) or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules’ characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA’s translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown in secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).

Figure 6. (A) The ribosomal platform binding free energy penalties, ΔGstandby, for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA that uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24 nt long single-stranded region (domain 3) is essential for ribosome binding and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model’s calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, where it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA’s translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 2.5-fold for these mutations.

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome’s interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome’s binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules (Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome’s selective and partial unfolding of RNA structures. Our data supports a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

Figure 7. The sequence and structure of E. coli thrS 5′ UTR are shown. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(-32)>A; BS4-9CS29, G(-32)>A and G(-46)>A; L19, A(-13)>C; L7, U(-49)>G; CS302, deletion of domain 2 from A(-14) to U(-48); ILOΔ1, transcription begins at G(-159); ILOΔ2, transcription begins at G(-68); ILOΔ4, transcription begins at G(-49), U(-49)>G and A(-13)>C; ILOΔ5, transcription begins at G(-49), U(-49)>G and G(-46)>A. Bolded nucleotides are the positions of new transcriptional start sites.

Importantly, initially designed sequences have well-defined features that perturb the ribosome’s individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome’s binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model’s ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules extending over 100 nt away from an open reading frame. Any of these standby site modules are potential binding sites for the ribosome, according to their binding free energy penalties, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures such as G-quadruplexes and pseudoknots will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform’s ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data shows that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome’s translation initiation rate, for example, when the antibiotic Edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform’s structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome’s distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome’s binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model’s error can lead to further insight into the ribosome’s interactions. The model’s predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome’s binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites nearby RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome’s interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface is available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance; Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.


FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai, A., Petrov, A., Marshall, R.A., Korlach, J., Uemura, S. and Puglisi, J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti, A., Marzi, S., Jenner, L., Myasnikov, A., Romby, P., Yusupova, G., Klaholz, B.P. and Yusupov, M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner, L., Romby, P., Rees, B., Schulze-Briese, C., Springer, M., Ehresmann, C., Ehresmann, B., Moras, D., Yusupova, G. and Yusupov, M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel, C.L., Updegrove, T.B., Janson, B.J. and Storz, G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz, G., Vogel, J. and Wassarman, K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet, L., Dube, A., Masse, E. and Lafontaine, D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel, C.L. and Smolke, C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease, R.A. and Belfort, M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit, M.H. and Van Duin, J. (1994) Control of translation by mRNA secondary structure in Escherichia coli: A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit, M.H. and Van Duin, J. (1994) Translational initiation on structured messengers: Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman, I.A., Evfratov, S.A., Sergiev, P.V. and Dontsova, O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut, A. and Balasubramanian, S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis, H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis, H.M., Mirsky, E.A. and Voigt, C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers, J.M., Goler, J.A., Juminaga, D. and Keasling, J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti, A., Marzi, S., Myasnikov, A.G., Fabbretti, A., Yusupov, M., Gualerzi, C.O. and Klaholz, B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi, S., Myasnikov, A.G., Serganov, A., Ehresmann, C., Romby, P., Yusupov, M. and Klaholz, B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova, G.Z., Yusupov, M.M., Cate, J. and Noller, H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit, M.H. and Van Duin, J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu, X., Lancaster, L., Noller, H.F., Bustamante, C. and Tinoco, I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon, P., Maracci, C., Filonava, L., Gualerzi, C.O. and Rodnina, M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer, S.M. and Joseph, S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille, F., Unoson, C., Vogel, J. and Wagner, E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik, V.K., Guimaraes, J.C., Cambray, G., Mai, Q.-A., Christoffersen, M.J., Martin, L., Yu, A., Lam, C., Rodriguez, C., Bennett, G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale, S. and Arkin, A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou, C., Stanton, B., Chen, Y.-J., Munsky, B. and Voigt, C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri, S., Goodman, D.B., Cambray, G., Mutalik, V.K., Gao, Y., Arkin, A.P., Endy, D. and Church, G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi, S., Furusawa, H., Ueda, T. and Okahata, Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao, Y., Zhang, Z.J., Erickson, D.W., Huang, M., Huang, Y., Li, J., Hwa, T. and Shi, H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews, D.H., Sabina, J., Zuker, M. and Turner, D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia, T., SantaLucia, J., Burkard, M.E., Kierzek, R., Schroeder, S.J., Jiao, X., Cox, C. and Turner, D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber, A.R., Lorenz, R., Bernhart, S.H., Neubock, R. and Hofacker, I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen, H., Bjerknes, M., Kumar, R. and Jay, E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev, A.V. and Nicholson, A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler, I.M., Collado-Vides, J., Santos-Zavaleta, A., Peralta-Gil, M., Gama-Castro, S., Muniz-Rascado, L., Bonavides-Martinez, C., Paley, S., Krummenacker, M., Altman, T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

2658 Nucleic Acids Research, 2014, Vol. 42, No. 4

36. Mutalik, V.K., Qi, L., Guimaraes, J.C., Lucks, J.B. and Arkin, A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo, G., Landrain, T.E. and Jaramillo, A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura, J.M., Cantor, C.R. and Collins, J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha, J., Reyes, S.J. and Gallivan, J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony, P.C., Perez, C.F., García-García, C. and Block, S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet, G.A. II, Artsimovitch, I., Furman, R., Sosnick, T.R. and Pan, T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch, S.A., Desai, S.K., Sajja, H.K. and Gallivan, J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron, M.-P., Bastet, L., Lussier, A., Simoneau-Roy, M., Masse, E. and Lafontaine, D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove, T., Wilf, N., Sun, X. and Wartell, R.M. (2008) Effect of Hfq on RprA–rpoS mRNA pairing: Hfq–RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot, C., Caillet, J., Graffe, M., Eyermann, F., Ehresmann, B., Ehresmann, C., Springer, M. and Romby, P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili, I.S., Agrawal, R.K., Grassucci, R. and Frank, J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons, W.M., May, J.L., Wimberly, B.T., McCutcheon, J.P., Capel, M.S. and Ramakrishnan, V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti, M., Schlunzen, F., Harms, J., Zarivach, R., Gluhmann, M., Avila, H., Bashan, A., Bartels, H., Auerbach, T. and Jacobi, C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson, J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore, J.L., Holmstrom, E.D. and Nesbitt, D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills, J.B., Vacano, E. and Hagerman, P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly (dT) and poly (dA). J. Mol. Biol., 285, 245–257.

52. Egbert, R.G. and Klavins, E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.

but the overall relationship between the surface area and the distortion free energy penalty would remain the same.

Measurements of translation rate and binding free energy penalties

In our steady-state, long-time growth conditions, translation initiation is a rate-limiting step in codon-optimized mRFP1 fluorescent protein expression, fluorescence levels are proportional to translation initiation rates under our controlled growth conditions (13), and translation initiation rate is related to total binding free energies according to a log-linear formula (Equation 2). We measure the change in apparent binding free energy penalty by comparing the fluorescence of a structured 5′ UTR with the fluorescence of a maximally accessible 5′ UTR according to

$$\Delta G_{\mathrm{standby}} = -\frac{1}{\beta}\,\ln\!\left(\frac{X}{X_{\mathrm{ref}}}\right) - \left(\Delta G_{\mathrm{other}}\right) \qquad (4)$$

where X is the fluorescence from a sample synthetic 5′ UTR, X_ref is the fluorescence of the unstructured 5′ UTR and β is 0.45 mol/kcal. ΔG_other is the difference in free energies between the sample and reference 5′ UTRs that depends on other ribosomal interactions and not on the standby site. By using identical and carefully designed 16S rRNA binding sites, spacer regions and

[Figure 2 graphic. Panel A: schematic of a standby site module (distal site, hairpin of height H, proximal site) upstream of the SD sequence and AUG start codon. Panel B: A_s = 15 + P + D − H. Panels C–E: fluorescence (au) and ΔG_standby (kcal/mol) versus proximal length P (nt), distal length D (nt) and hairpin height H (nt). Panels F and G: fluorescence (au) and ΔG_standby (kcal/mol) versus standby site module surface area A_s.]

Figure 2. (A) Standby site modules are defined by a proximal and distal binding site separated by an RNA structure. (B) Standby site module surface area is calculated by summing the proximal (P) and distal (D) binding site lengths and subtracting the hairpin height (H). The surface area is kept positive by adding a constant, 15. (C–E) Translation rate and ribosome binding free energy penalties are measured as proximal binding site length, distal binding site length and hairpin height are increased. The ΔG_standby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). (C) Hairpin heights are 9 nt (green stars), 12 nt (black asterisks), 15 nt (red squares) or 18 nt (blue circles). (D) Hairpin heights are 15 nt (red squares) or 18 nt (blue circles). (E) Hairpin height varies from 9 to 12 nt by adding to loop length of a 6 bp stem (blue circles) or from 14 to 18 nt by adding to loop length of a 12 bp stem (red squares). (F) 56 characterized 5′ UTRs (A_s < 24) from parts (C–E) are combined to quantitatively relate the standby site module’s surface area to the ribosome’s translation rate and binding free energy penalty. Dashed line is a best-fit quadratic equation. (G) The validity of the quadratic equation is tested by measuring translation rates from 15 additional 5′ UTRs and comparing to model predictions (R² = 0.78; average error is 0.66 kcal/mol). Data points are the result of three measurements in 1 day. In parts (C–G), the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the reference unstructured 5′ UTR.

coding sequences in the 136 synthetic mRNA sequences, the value for ΔG_other was zero for 123 mRNA sequences and a maximum of 0.90 kcal/mol for the remaining sequences.
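As a rough illustration of how Equation (4) converts a pair of fluorescence readings into an apparent penalty, the following minimal Python sketch may help. It assumes the log-linear form with β = 0.45 mol/kcal quoted above; the function name `delta_g_standby` and the fluorescence values are hypothetical and chosen only for illustration.

```python
import math

BETA = 0.45  # mol/kcal, the value quoted for Equation (4)


def delta_g_standby(x, x_ref, delta_g_other=0.0, beta=BETA):
    """Apparent standby-site penalty (kcal/mol) from two fluorescence levels.

    x             fluorescence of the structured (sample) 5' UTR
    x_ref         fluorescence of the unstructured reference 5' UTR
    delta_g_other free energy difference from other ribosomal interactions
    """
    if x <= 0 or x_ref <= 0:
        raise ValueError("fluorescence values must be positive")
    return -math.log(x / x_ref) / beta - delta_g_other


# Hypothetical readings: the structured 5' UTR is 10-fold dimmer than the reference.
penalty = delta_g_standby(x=1_000.0, x_ref=10_000.0)
print(f"{penalty:.2f} kcal/mol")
```

Under these assumptions, a 10-fold drop in fluorescence corresponds to a penalty of roughly 5 kcal/mol, and a sample as bright as the reference gives a penalty of zero.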

Statistical analysis

Two-sample T-test at two-tailed 5% statistical significance was performed to compare the error distribution of different sets of model predictions. The null hypothesis assumed that the errors in two different sets of model predictions have the same mean and variance. The alternative hypothesis indicates that the errors in two sets follow different distributions. The error in model prediction, ΔΔG_standby, was calculated according to the formula ΔΔG_standby = measured ΔG_standby − predicted ΔG_standby. The low probabilities (P < 0.05) reject the null hypothesis, showing that the differences in two sets of model predictions were statistically significant.
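The comparison described above can be sketched in plain Python. This is only an illustration: it assumes a pooled-variance (equal-variance) form of the two-sample T-test, which the text does not specify, it obtains the two-tailed P-value by numerically integrating the Student t density, and the two error lists are invented values rather than data from this study.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=20_000):
    """Two-tailed P-value via Simpson integration of the t density over [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_mass)


def pooled_t_test(errors_a, errors_b):
    """Equal-variance two-sample t-test; returns (t statistic, two-tailed P)."""
    n_a, n_b = len(errors_a), len(errors_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(errors_a)
                  + (n_b - 1) * variance(errors_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(errors_a) - mean(errors_b)) / std_err
    return t, two_sided_p(t, df)


# Hypothetical prediction errors (kcal/mol) from two competing model variants.
errors_partial = [0.4, 1.1, 0.7, 1.3, 0.9]
errors_none = [2.0, 2.9, 2.2, 3.1, 2.4]
t_stat, p_value = pooled_t_test(errors_partial, errors_none)
print("significant at 5%:", p_value < 0.05)
```

A library routine would normally replace the hand-written integration; it is spelled out here only to keep the sketch free of external dependencies.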

RESULTS

Standby site module surface area controls the ribosome’s translation rate

We first employed a learn-by-design approach to determine the ribosomal platform’s interactions with standby sites that contain a single hairpin and therefore contain a single standby site module. Standby site modules are defined by their hairpin height (H), upstream distal binding site length (D) and downstream proximal binding site length (P) (Figure 2A, ‘Materials and Methods’ section). We designed, constructed and characterized 62 synthetic 5′ UTRs containing single standby site modules with hairpins from 9 to 18 bp long, distal binding sites from 1 to 15 nt long and proximal binding sites from 1 to 24 nt long. Long mRNA hairpins contain internal loops to avoid potential RNAse activity (34). All mRNA sequences are transcribed by the J23100 constitutive promoter, encode the mRFP1 fluorescent protein and are designed with identical SD and spacer sequences (‘Materials and Methods’ section). Flow cytometry measurements from long-time log-phase cultures with periodic serial dilutions revealed that mRFP1 fluorescence varied by 221-fold (Figure 2C–E). In contrast, mRNA levels varied by at most 5.8-fold according to RT-qPCR measurements (Supplementary Figure S2). Changes in fluorescence are primarily due to changes in translation rate, and in these growth conditions, fluorescence and translation initiation rates are proportional. All sequences and measurement data are listed in the Supplementary Data.

According to the data, standby site module accessibility plays an important role in modulating the ribosome’s interactions with 5′ UTRs and controlling translation rate across a large range. In particular, lengthening the distal binding site by four nucleotides increased the mRNA’s translation rate by 4.2-fold, whereas lengthening the proximal binding site by the same amount increased it by 4.0-fold. Shortening the hairpin’s height by 4 bp increased the mRNA’s translation rate similarly, by 3.1-fold. More generally, the slopes (first derivatives) of the log-transformed translation rates all have similar magnitudes (Supplementary Data, column E). Highly accessible standby site modules with long proximal binding sites or short hairpins also cause the mRNA’s translation rate to reach a plateau, with the same fluorescence level as an mRNA with an unstructured 5′ UTR (Figure 2C). These observations led to the hypothesis that the geometric characteristics of the standby site module (D, P, H) are interchangeable and that its available single-stranded surface area, A_s, determines the ribosomal platform’s ability to bind to it.

A standby site module’s single-stranded surface area is the number of nucleotides capable of single-stranded contact with the ribosomal platform. Longer proximal or distal binding sites increase the standby site module’s surface area, while longer hairpin structures reduce the standby site module’s accessibility. mRNA structures can potentially reduce site accessibility by folding over the neighboring single-stranded sites and sequestering binding regions. A standby site module’s available surface area is calculated by summing its proximal and distal binding site lengths and subtracting the hairpin’s height (Figure 2B, ‘Materials and Methods’ section). A highly accessible standby site module has a large A_s, whereas a less accessible standby site module has a small A_s. We then evaluated whether the standby site module’s surface area is the key determinant by comparing A_s to the measured log-transformed translation rates. For all 62 5′ UTRs with diverse geometric features, we found a single quantitative relationship that described the effect of surface area on the mRNA’s translation rate (Figure 2F).
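The surface area calculation can be written out as a small Python sketch. It uses the expression from Figure 2B, A_s = 15 + P + D − H; the function name `surface_area` and the example geometry are illustrative assumptions, not part of the published model code.

```python
SURFACE_AREA_OFFSET = 15  # constant in the Figure 2B expression A_s = 15 + P + D - H


def surface_area(proximal_nt, distal_nt, hairpin_height_nt):
    """Available single-stranded surface area A_s of one standby site module."""
    return SURFACE_AREA_OFFSET + proximal_nt + distal_nt - hairpin_height_nt


# Hypothetical module: 5 nt proximal site, 5 nt distal site, 15 nt hairpin.
base = surface_area(proximal_nt=5, distal_nt=5, hairpin_height_nt=15)

# Lengthening either binding site by 4 nt, or shortening the hairpin by 4 nt,
# changes A_s by the same amount, so D, P and H act interchangeably.
longer_proximal = surface_area(9, 5, 15)
longer_distal = surface_area(5, 9, 15)
shorter_hairpin = surface_area(5, 5, 11)
print(base, longer_proximal, longer_distal, shorter_hairpin)
```

This equivalence mirrors the observation that 4 nt changes in D, P or H shift the translation rate by similar factors.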

We now turn to thermodynamic modeling to quantify the ribosome’s interactions with standby site modules. The competitive binding of free ribosomes to the pool of intracellular mRNAs allows one to use statistical thermodynamics to relate a ribosome’s binding free energy to its translation initiation rate according to a constant log-linear relationship (Equation 2) (13,14). We calculate the interaction energy between the ribosome and an mRNA’s standby site modules by comparing its translation rate to the translation rate of a reference unstructured 5′ UTR, both with an identical SD sequence, spacer region and protein coding sequence (Equation 4). Using the data in Figure 2C–E, we found that the standby site binding free energy penalty, ΔG_standby, varied from 0 to 11.8 kcal/mol when changing its geometric characteristics, and that a simple quadratic function succinctly described the effect of site accessibility on the ribosome’s binding free energy penalty (R² = 0.97; average error is 0.51 kcal/mol) (Figure 2F). A maximally accessible standby site module has a zero binding free energy penalty (ΔG_standby = 0), indicating that the mRNA’s translation initiation rate is not reduced by the ribosomal platform’s interactions. A standby site module with a positive binding free energy penalty (ΔG_standby > 0) has reduced the mRNA’s translation initiation rate by decreasing the ribosomal platform’s ability to initially bind the mRNA.

Next, we critically tested the parameterized model by constructing 15 new 5′ UTRs and measuring their translation initiation rates and ribosome binding free energy penalties. The 5′ UTRs each contain a standby site module with diverse structures between 43 and 91 nt long, including variable-length distal and proximal binding sites and hairpin heights, and including two- and three-branched structures (Supplementary Data). For branched structures, we show that the relevant hairpin height is the average across the multiple branches (Supplementary Figure S3) (see also Supplementary Text). The quadratic relationship accurately predicted the ribosome’s translation rate and binding free energy penalty (R² = 0.78; average error is 0.66 kcal/mol), confirming that the standby site module’s accessibility controls the ribosome’s ability to bind and initiate translation (Figure 2G). The quadratic relationship contains three parameters, which are now treated as constants (Equation 3, ‘Materials and Methods’ section).

Coupled energetic trade-offs when unfolding structures in standby site modules

Prior to initiating translation, the 30S ribosome does not have an external source of energy. Initiation factor 2 does not hydrolyze GTP until after translation has been initiated, during 50S recruitment (22). Consequently, the 30S complex expends its own free energy to unfold mRNA structures. Previous studies have shown that mRNA structures that sequester the 16S rRNA binding site are unfolded prior to initiating translation (9,14,22). The mRNA’s translation initiation rate is reduced according to the free energy penalty for unfolding these structures, as described in Equations (1) and (2) (‘Materials and Methods’ section). However, it remains unclear whether mRNA structures located upstream of SD sequences, within standby site modules, are unfolded or accommodated. We next investigated how the ribosome unfolds hairpins in long 5′ UTRs.

By definition, there is a coupled thermodynamic relationship between the unfolding of an mRNA hairpin and increasing the accessibility of the corresponding standby site module, where the accessibility of the standby site module controls the ribosomal platform’s distortion energy penalty, ΔG_distortion (see ‘Materials and Methods’ section). Unfolding a single base pair in a module’s hairpin will increase the standby site module’s surface area by three nucleotides: the hairpin’s height decreases by one, and the lengths of both the proximal and distal binding sites increase by one. Therefore, unfolding a module’s hairpin will lead to higher accessibility, a lower ΔG_distortion and a higher translation rate. However, unfolding base pairs requires an input of free energy (ΔG_unfolding > 0), which will lead to a lower translation rate. Consequently, there is an energetic trade-off between unfolding mRNA structures and increasing the surface accessibility of standby site modules. The ribosome may completely unfold a structured standby site module to maximize its surface area, but the energetic cost may be high (high ΔG_unfolding penalty with ΔG_distortion = 0). In contrast, the ribosome may fully accommodate a standby site module’s hairpin without unfolding it, even though its low surface accessibility will penalize the ribosome’s ability to bind to it (high ΔG_distortion penalty with ΔG_unfolding = 0).

Based on thermodynamic principles, we hypothesize that the ribosome will ‘selectively’ and ‘partially’ unfold structured standby site modules to minimize its total binding free energy penalty (ΔG_standby = ΔG_distortion + ΔG_unfolding). Figure 3A and B presents an illustrative example of this principle and our use of minimization to calculate the ribosome’s binding free energy penalty to a standby site module. A 5′ UTR with a single standby site module initially contains a short proximal binding site (P = 2 nt), a short distal binding site (D = 2 nt) and a long hairpin (H = 15 nt), resulting in a largely inaccessible 5′ UTR (A_s = 4). Based on the standby site module’s surface area, the ribosome has an initial distortion energy penalty of 11.5 kcal/mol. However, the ribosome can selectively unfold the hairpin’s closing AU base pairs, increasing the standby site module’s surface area and lowering its ΔG_distortion at the expense of increasing its ΔG_unfolding. Unfolding 1 bp increases the standby site module’s surface area by 3 nt, decreases the distortion energy penalty to 7.8 kcal/mol and increases the unfolding energy penalty to 0.90 kcal/mol, leading to a total binding free energy penalty of 8.7 kcal/mol. The ribosome can continue to unfold base pairs to minimize its total binding free energy penalty: unfolding 3 bp at a cost of 2.7 kcal/mol leads to a distortion energy penalty of 2.6 kcal/mol and a total binding free energy penalty of 5.3 kcal/mol (Figure 3B). Two additional examples are shown in Supplementary Figure S4. The unfolding free energies are calculated using the Vienna RNA suite with the Turner 1999 RNA parameters (see ‘Materials and Methods’ section). Importantly, thermodynamic minimization does not add any new parameters to the model.

We tested the proposed mechanism by characterizing the translation rate of 22 synthetic 5′ UTRs (Figure 3C–E) that contain single standby site modules with varied geometries (D = 0 to 7 nt and P = 1 to 20 nt) and differing hairpin sequences, sizes and folding energetics (Supplementary Figure S5A). We employed the minimization principle to calculate the number of unfolded base pairs and to predict the ribosome’s binding free energy penalty, ΔG_standby (Figure 3D). As comparisons, we performed the same calculation while asserting that either no additional energy was inputted to unfold hairpins (high ΔG_distortion and ΔG_unfolding = 0) (Figure 3C) or that hairpins were completely unfolded by the ribosome (high ΔG_unfolding and ΔG_distortion = 0) (Figure 3E).

The data support our hypothesis that the ribosome partially and selectively unfolds structured standby site modules to minimize its total binding free energy penalty (ΔG_standby), rather than separately minimizing its distortion or unfolding penalties (average error is 0.90 kcal/mol when minimizing ΔG_standby, 2.42 kcal/mol when ΔG_unfolding = 0 and 10.24 kcal/mol when ΔG_distortion = 0; comparative P-values are 0.004 and 10⁻¹³, respectively). Remarkably, in all 22 cases, only 1–3 bp of each hairpin are predicted by the model to be unfolded, while accommodating the remaining large RNA structures.
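The partial-unfolding trade-off can be illustrated with a short Python sketch of the minimization. Several pieces are assumptions made only for this example: the quadratic coefficients were picked so that the distortion penalty reproduces the worked values quoted above (11.5, 7.8 and 2.6 kcal/mol at A_s = 4, 7 and 13) and stand in for the fitted constants of Equation (3); the penalty is set to zero beyond the vertex of the parabola; and the cumulative unfolding costs other than 0.9 kcal/mol (1 bp) and 2.7 kcal/mol (3 bp) are hypothetical, whereas in the study they come from RNA folding calculations.

```python
# ASSUMED coefficients: chosen to reproduce the worked example in the text,
# standing in for the fitted constants of Equation (3).
C1, C2, C3 = 0.038, -1.629, 17.359
A_S_PLATEAU = -C2 / (2 * C1)  # vertex of the parabola; assumed zero penalty beyond it


def surface_area(p, d, h):
    """A_s = 15 + P + D - H, as in Figure 2B."""
    return 15 + p + d - h


def distortion_penalty(a_s):
    """Quadratic distortion penalty (kcal/mol), clamped at zero."""
    if a_s >= A_S_PLATEAU:
        return 0.0
    return max(0.0, C1 * a_s ** 2 + C2 * a_s + C3)


def minimize_standby_penalty(p, d, h, cumulative_unfolding):
    """Return (unfolded bp, total penalty) minimizing distortion + unfolding.

    cumulative_unfolding[k] is the free energy (kcal/mol) needed to open the
    k closing base pairs of the hairpin, with cumulative_unfolding[0] == 0.
    """
    best = None
    for k, unfolding in enumerate(cumulative_unfolding):
        # Opening k base pairs lengthens P and D by k and shortens H by k,
        # so A_s grows by 3 nt per unfolded base pair.
        total = distortion_penalty(surface_area(p + k, d + k, h - k)) + unfolding
        if best is None or total < best[1]:
            best = (k, total)
    return best


# 0.9 (1 bp) and 2.7 (3 bp) kcal/mol are from the text; other entries are hypothetical.
unfolding_costs = [0.0, 0.9, 1.8, 2.7, 5.6, 8.0]
k_opt, total = minimize_standby_penalty(p=2, d=2, h=15,
                                        cumulative_unfolding=unfolding_costs)
print(k_opt, round(total, 1))  # prints: 3 5.3
```

With these inputs the minimum falls at three unfolded base pairs and about 5.3 kcal/mol, in line with the example of Figure 3B; neither leaving the hairpin intact (about 11.5 kcal/mol) nor unfolding further gives a lower total.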

Non-cooperative binding to multiple standby site modules

We have shown that the ribosome can accommodate large structures of standby site modules and engage in frequent translation, but it is unclear how the presence of multiple standby site modules in a 5′ UTR will affect its translation rate. The presence of multiple standby site modules could allow the ribosome to cooperatively bind mRNA and increase its translation rate. We next characterized nine synthetic 5′ UTRs (76–164 nt long) with up to four standby site modules to investigate the potential for cooperative binding (Figure 4A). If the ribosome can engage in cooperative binding, then the additional standby site modules will increase the overall available surface area, reduce the ribosome’s binding free energy penalty and increase the translation initiation rate. In particular, we expect the greatest cooperativity when the available surface area of a single standby site module is limited.

We found that 5′ UTRs with two, three or four standby site modules were all translated at the same rate, indicating an absence of cooperative binding (Figure 4B), including when the available surface areas for individual standby site modules were significantly lowered. Therefore, even when the combined surface areas of multiple standby site modules could potentially increase contact with the ribosomal platform and increase binding, a cooperative increase in translation rate was not observed. Consequently, the data support a mechanism whereby the ribosome binds to only a single standby site module prior to initiating translation.

Distant standby site modules can support high translation rates

We next investigated whether any standby site module could be bound by the ribosome, even if located far upstream of the start codon. Using the model’s predictions, we forward-engineered multi-standby site module synthetic 5′ UTRs where the most distantly upstream site is the only one capable of supporting high translation rates (Figure 4C). Internal standby site modules have long hairpins and limited surface areas (model predictions

[Figure 3 graphic. Panel A: a structured 5′ UTR drawn before and after partial unfolding of its hairpin, upstream of the SD sequence and AUG start codon. Panel B: energy (kcal/mol) versus number of unfolded base pairs (0–10), with the corresponding standby site module surface area A_s (4–34). Panels C–E: fluorescence (au) versus predicted ΔG_standby (kcal/mol) for no RNA unfolding, partial RNA unfolding and complete RNA unfolding.]

Figure 3. (A and B) Competition between ribosomal distortion and selective RNA unfolding is illustrated on a structured 5′ UTR with a weak hairpin. The ribosome’s distortion penalty ΔG_distortion (dashed red line), hairpin unfolding penalty ΔG_unfolding (dotted dashed blue line) and their sum ΔG_standby (green solid line) are shown as the hairpin’s closing base pairs are unfolded. The total binding free energy penalty has a local minimum when three base pairs are unfolded. See also Supplementary Figure S4. (C–E) Scatter plots comparing measured translation rates from 22 synthetic 5′ UTRs and model predictions to test three mechanisms for the ribosome’s unfolding of RNA structures. The average difference between model predictions and measurements is (C) 2.42 kcal/mol when not unfolding structures, (D) 0.90 kcal/mol when partially unfolding structures and (E) 10.24 kcal/mol when fully unfolding structures. Comparative P-values from two-sample T-tests are 0.004 (C versus D) and 10⁻¹³ (D versus E). Data points are the result of three measurements in 1 day. In parts (C–E), the diagonal dashed line is the predicted translation rate according to Equation (2).

ΔG_standby between 9.0 and 17.4 kcal/mol) compared to the maximally accessible upstream standby site modules. Therefore, if the ribosome binds the farthest upstream standby site module, its translation rate will be at least 127-fold higher than binding to the internal standby site modules. We measured fluorescence levels with flow cytometry and mRNA levels with RT-qPCR. Compared to an unstructured 5′ UTR, the mRNA levels decreased by only 1.3- to 5.5-fold, while the fluorescence levels were correspondingly reduced between 2.5- and 18.3-fold, showing that changes in fluorescence were mainly due to changes in translation rate and ribosome binding free energy penalties (Figure 4D).

We found that all the mRNAs’ translation rates remained very high, even when the accessible standby site module was located over 100 nt upstream of the start codon (M4 5′ UTR) (Figure 4E). In particular, the M2 5′ UTR is translated at a 33.8-fold higher rate when compared to the translation rate that could be supported by binding to the standby site module closest to the start codon. Moving the accessible standby site module upstream, away from the start codon, does not significantly reduce the translation rate. Based on these measurements, the apparent standby site binding penalties are 3.6, 6.2 and 6.5 kcal/mol for M2 to M4 5′ UTRs (Figure 4F), respectively, which is much lower than the internal standby site modules’ predicted binding free energy penalties (9.0–17.4 kcal/mol). These measurements suggest that the ribosome will bind to the standby site module that has the lowest binding free energy penalty, regardless of its distance from the start codon.

The ribosomal platform can slide across upstream structures

Several mechanisms would allow the ribosomal platform to bind distant upstream standby site modules and transition to a pre-initiation complex at the start codon. After the ribosome has bound to the upstream standby site module, the 30S complex could engage in diffusive hopping that would increase its local concentration and assembly rate at the start codon. The rate of assembly would largely depend on the physical distance between the standby site module and start codon, and not on the presence of RNA structures. Alternatively, the ribosome could employ a full-contact sliding mechanism that reorients the mRNA until a stable pre-initiation complex has formed. The positively charged surface of the ribosomal platform makes it ideal for maintaining contact with a 5′ UTR until 16S rRNA hybridization has anchored the mRNA. Unlike a diffusive hopping mechanism, the

[Figure 4 graphic. Panel A: schematics of 5′ UTRs with two, three and four standby site modules separated by single-stranded spacers of length N, upstream of the SD sequence and AUG start codon. Panel B: fluorescence (au) and ΔG_standby (kcal/mol) versus single-stranded length N (nt). Panel C: schematics of the M1 to M4 5′ UTRs with standby site modules labeled I to IV. Panel D: fold reduction in mRNA level and in fluorescence for M1 to M4. Panel E: fluorescence (au) for the structured standby sites M1 to M4. Panel F: ΔG_distortion and ΔG_distortion + ΔG_sliding (kcal/mol) for each standby site module of M1 to M4.]

Figure 4. (A) Synthetic 5′ UTRs with multiple, evenly spaced standby site modules. Spacing N is varied from 4 to 12 nt. (B) Measured translation rates are compared to standby site module multiplicity, showing the absence of cooperative binding to multiple standby site modules, regardless of limiting surface areas. The ΔG_standby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). (C) Synthetic 5′ UTRs with one upstream and multiple internal standby site modules, labeled with roman numbers. (D) The reduction in fluorescence (red bars) and mRNA levels (blue bars) of 5′ UTRs M1 to M4 compared to an unstructured 5′ UTR (see also Supplementary Figure S2). (E) Fluorescence measurements of 5′ UTRs M1 to M4 show that upstream standby site modules can support high translation rates. (F) Model predictions are shown for 5′ UTRs M1 to M4 using either ΔG_distortion (gray bars) or ΔG_distortion + ΔG_sliding (blue bars) for each standby site module (I–IV, corresponding to labels in part C), compared to the ribosome’s apparent binding free energy penalties (dashed lines, yellow region) (average error with and without the sliding penalty is 1.15 and 3.24 kcal/mol, respectively; P-value is 0.013). ΔG_distortion for each standby site module was calculated according to Equation (3). In parts B and E, the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the unstructured 5′ UTR.

Nucleic Acids Research 2014 Vol 42 No 4 2653

presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex’s assembly rate.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA’s translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module’s predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data supports the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nt of hairpin height. We then determined whether including this sliding penalty could improve the model’s ability to predict the translation rate of long 5′ UTRs with multiple standby site modules with ‘hairpins of differing heights’. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome’s binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4, alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale, according to the log-linear relationship between translation initiation rate and binding free energy differences (‘Materials and Methods’ section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).
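The sliding penalty described above reduces to a short calculation. The following Python sketch is illustrative only: the function and constant names are assumptions, and the example uses just the two cases stated in the text (no intervening structure for M1, and a single hairpin with a height of 15 nt for M2).

```python
# Sliding coefficient estimated in the text from the M2 5' UTR (kcal/mol per nt).
SLIDING_COEFFICIENT = 0.2


def sliding_penalty(downstream_hairpin_heights):
    """Return dG_sliding (kcal/mol) for one standby site module.

    downstream_hairpin_heights: heights (nt) of every RNA structure located
    between the standby site module and the start codon.
    """
    return SLIDING_COEFFICIENT * sum(downstream_hairpin_heights)


# M1 has no intervening structure, so its sliding penalty is zero.
print(sliding_penalty([]))
# M2 has a single hairpin with a height of 15 nt (about 3.0 kcal/mol).
print(sliding_penalty([15]))
```

Under the same coefficient, the reported penalties of 4.8 and 7.8 kcal/mol would correspond to summed downstream hairpin heights of 24 and 39 nt; these totals are back-calculated here and are not taken from the sequence designs.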

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combine them together to create a single biophysical model. The ribosome’s binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to:

$$\Delta G_{\mathrm{standby}} = \Delta G_{\mathrm{distortion}} + \Delta G_{\mathrm{unfolding}} + \Delta G_{\mathrm{sliding}} \qquad (5)$$

All three Gibbs free energies are positive, and a long, unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32). This model extends our previous biophysical model that focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence’s start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome’s translation initiation rate.
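The selection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the RBS Calculator implementation: the class and function names are hypothetical, and the example values are loosely based on the M2 5′ UTR numbers quoted earlier, with the split of each total between distortion and unfolding chosen arbitrarily.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandbyModule:
    """Free energy penalties (kcal/mol) for one candidate standby site module."""
    name: str
    dg_distortion: float
    dg_unfolding: float
    dg_sliding: float

    @property
    def dg_standby(self) -> float:
        # Equation (5): the three penalties are summed.
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def controlling_module(modules):
    """Return the module with the lowest total penalty.

    Without cooperative binding, only this module is assumed to set the
    translation initiation rate.
    """
    return min(modules, key=lambda m: m.dg_standby)


# Hypothetical two-module 5' UTR (values are illustrative assumptions).
modules = [
    StandbyModule("upstream", dg_distortion=0.4, dg_unfolding=0.0, dg_sliding=3.0),
    StandbyModule("internal", dg_distortion=11.4, dg_unfolding=0.0, dg_sliding=0.0),
]
best = controlling_module(modules)
print(best.name, round(best.dg_standby, 2))
```

Taking the minimum rather than a weighted sum mirrors the statement that only the single most favorable standby site module controls the rate.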

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38), or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs

[Figure 6 graphic not reproduced. Axes: (A) ΔGstandby [kcal/mol] versus genomic 5′ UTR length [nt], with inset; (B) number of genomic 5′ UTRs versus ΔGstandby [kcal/mol]; (C) number of 5′ UTR isoforms versus maximum change in ΔGstandby [kcal/mol].]

Figure 6. (A) The ribosomal platform binding free energy penalties ΔGstandby for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.

[Figure 5 graphic not reproduced. Axes in both panels: fluorescence [au] and apparent ΔGstandby [kcal/mol] versus predicted ΔGstandby [kcal/mol].]

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).


with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules’ characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA’s translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.
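The correspondence between binding free energy penalties and fold changes in translation rate in this section follows from the log-linear relationship referenced in the ‘Materials and Methods’ section. The sketch below is a hedged illustration: the constant BETA is an assumption that is not stated in this section, and the value 0.45 mol/kcal is used because it reproduces the stated pairing of 17.4 kcal/mol with roughly 2500-fold repression. The function names are hypothetical.

```python
import math

# Assumed apparent Boltzmann constant (mol/kcal); not given in this section.
BETA = 0.45


def fold_repression(dg_standby):
    """Fold reduction in translation initiation rate relative to dG_standby = 0."""
    return math.exp(BETA * dg_standby)


def apparent_penalty(fold_reduction):
    """Inverse mapping: apparent dG_standby (kcal/mol) from a fold reduction."""
    return math.log(fold_reduction) / BETA


# Largest genomic penalty quoted in the text: on the order of 2500-fold.
print(round(fold_repression(17.4)))
# Converting a 2500-fold repression back gives approximately 17.4 kcal/mol.
print(round(apparent_penalty(2500), 1))
```

The same inverse mapping is how a measured fluorescence ratio could be placed on the free energy axis when comparing measurements with predictions.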

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA that uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24 nt long single-stranded region (domain 3) is essential for ribosome binding and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model’s calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, where it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA’s translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 2.5-fold for these mutations.
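As a consistency check on the numbers quoted for the domain 3 deletion mutant, Equation (5) can be evaluated directly. The worked line below assumes that the sliding term is zero for this standby site module; this is not stated explicitly in the text, but it is implied by the reported total.

```latex
\Delta G_{\mathrm{standby}}
  = \Delta G_{\mathrm{distortion}} + \Delta G_{\mathrm{unfolding}} + \Delta G_{\mathrm{sliding}}
  = 0.02 + 5.3 + 0
  = 5.32\ \text{kcal/mol}
```

Partial unfolding is favored in this case because the 5.32 kcal/mol total is well below the 17.4 kcal/mol distortion penalty of binding with no accessible surface area.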

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome’s interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome’s binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules

[Figure 7 graphic not reproduced: secondary structure diagram of the thrS 5′ UTR showing domains 2, 3 and 4, the SD sequence and AUG start codon, nucleotide positions from −163 to −1, and the ILOΔ1, ILOΔ2, ILOΔ4 and ILOΔ5 transcriptional start sites.]

Figure 7. The sequence and structure of the E. coli thrS 5′ UTR are shown. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(-32)>A; BS4-9CS29, G(-32)>A and G(-46)>A; L19, A(-13)>C; L7, U(-49)>G; CS302, deletion of domain 2 from A(-14) to U(-48); ILOΔ1, transcription begins at G(-159); ILOΔ2, transcription begins at G(-68); ILOΔ4, transcription begins at G(-49), U(-49)>G and A(-13)>C; ILOΔ5, transcription begins at G(-49), U(-49)>G and G(-46)>A. Bolded nucleotides are the positions of new transcriptional start sites.


(Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome’s selective and partial unfolding of RNA structures. Our data supports a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

Importantly, initially designed sequences have well-defined features that perturb the ribosome’s individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome’s binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model’s ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules are potential binding sites for the ribosome, according to their binding free energy penalties, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures, such as G-quadruplexes and pseudoknots, will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform’s ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data shows that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome’s translation initiation rate, for example, when the antibiotic Edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform’s structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome’s distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome’s binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model’s error can lead to further insight into the ribosome’s interactions. The model’s predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome’s binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites nearby RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome’s interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface is available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance; Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.


FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai,A., Petrov,A., Marshall,R.A., Korlach,J., Uemura,S. and Puglisi,J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti,A., Marzi,S., Jenner,L., Myasnikov,A., Romby,P., Yusupova,G., Klaholz,B.P. and Yusupov,M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner,L., Romby,P., Rees,B., Schulze-Briese,C., Springer,M., Ehresmann,C., Ehresmann,B., Moras,D., Yusupova,G. and Yusupov,M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet,L., Dube,A., Masse,E. and Lafontaine,D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel,C.L. and Smolke,C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease,R.A. and Belfort,M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit,M.H. and Van Duin,J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit,M.H. and Van Duin,J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman,I.A., Evfratov,S.A., Sergiev,P.V. and Dontsova,O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut,A. and Balasubramanian,S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis,H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis,H.M., Mirsky,E.A. and Voigt,C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers,J.M., Goler,J.A., Juminaga,D. and Keasling,J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti,A., Marzi,S., Myasnikov,A.G., Fabbretti,A., Yusupov,M., Gualerzi,C.O. and Klaholz,B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi,S., Myasnikov,A.G., Serganov,A., Ehresmann,C., Romby,P., Yusupov,M. and Klaholz,B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova,G.Z., Yusupov,M.M., Cate,J. and Noller,H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit,M.H. and Van Duin,J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu,X., Lancaster,L., Noller,H.F., Bustamante,C. and Tinoco,I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon,P., Maracci,C., Filonava,L., Gualerzi,C.O. and Rodnina,M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer,S.M. and Joseph,S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille,F., Unoson,C., Vogel,J. and Wagner,E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Mai,Q.-A., Christoffersen,M.J., Martin,L., Yu,A., Lam,C., Rodriguez,C., Bennett,G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology – identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou,C., Stanton,B., Chen,Y.-J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri,S., Goodman,D.B., Cambray,G., Mutalik,V.K., Gao,Y., Arkin,A.P., Endy,D. and Church,G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi,S., Furusawa,H., Ueda,T. and Okahata,Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao,Y., Zhang,Z.J., Erickson,D.W., Huang,M., Huang,Y., Li,J., Hwa,T. and Shi,H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia,T., SantaLucia,J., Burkard,M.E., Kierzek,R., Schroeder,S.J., Jiao,X., Cox,C. and Turner,D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber,A.R., Lorenz,R., Bernhart,S.H., Neubock,R. and Hofacker,I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen,H., Bjerknes,M., Kumar,R. and Jay,E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev,A.V. and Nicholson,A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler,I.M., Collado-Vides,J., Santos-Zavaleta,A., Peralta-Gil,M., Gama-Castro,S., Muniz-Rascado,L., Bonavides-Martinez,C., Paley,S., Krummenacker,M., Altman,T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik,V.K., Qi,L., Guimaraes,J.C., Lucks,J.B. and Arkin,A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo,G., Landrain,T.E. and Jaramillo,A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura,J.M., Cantor,C.R. and Collins,J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha,J., Reyes,S.J. and Gallivan,J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony,P.C., Perez,C.F., García-García,C. and Block,S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet,G.A. II, Artsimovitch,I., Furman,R., Sosnick,T.R. and Pan,T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch,S.A., Desai,S.K., Sajja,H.K. and Gallivan,J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron,M.-P., Bastet,L., Lussier,A., Simoneau-Roy,M., Masse,E. and Lafontaine,D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove,T., Wilf,N., Sun,X. and Wartell,R.M. (2008) Effect of Hfq on RprA-rpoS mRNA pairing: Hfq-RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot,C., Caillet,J., Graffe,M., Eyermann,F., Ehresmann,B., Ehresmann,C., Springer,M. and Romby,P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili,I.S., Agrawal,R.K., Grassucci,R. and Frank,J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons,W.M., May,J.L., Wimberly,B.T., McCutcheon,J.P., Capel,M.S. and Ramakrishnan,V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti,M., Schlunzen,F., Harms,J., Zarivach,R., Gluhmann,M., Avila,H., Bashan,A., Bartels,H., Auerbach,T. and Jacobi,C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson,J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore,J.L., Holmstrom,E.D. and Nesbitt,D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills,J.B., Vacano,E. and Hagerman,P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly(dT) and poly(dA). J. Mol. Biol., 285, 245–257.

52. Egbert,R.G. and Klavins,E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.


coding sequences in the 136 synthetic mRNA sequences, the value for ΔGother was zero for 123 mRNA sequences and a maximum of 0.90 kcal/mol for the remaining sequences.

Statistical analysis

A two-sample t-test at two-tailed 5% statistical significance was performed to compare the error distribution of different sets of model predictions. The null hypothesis assumed that the errors in two different sets of model predictions have the same mean and variance. The alternative hypothesis indicates that the errors in two sets follow different distributions. The error in model prediction, ΔΔGstandby, was calculated according to the formula: ΔΔGstandby = measured ΔGstandby − predicted ΔGstandby. The low probabilities (P < 0.05) reject the null hypothesis, showing that the differences in two sets of model predictions were statistically significant.
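The comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors’ analysis script: it uses a pooled-variance (Student’s) two-sample t-test, chosen here because the null hypothesis assumes equal mean and variance; it integrates the t density numerically so that only the standard library is needed; and the error values in the example are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value from Simpson's rule over [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


def pooled_t_test(errors_a, errors_b):
    """Return (t statistic, two-tailed P-value) assuming equal variances."""
    n_a, n_b = len(errors_a), len(errors_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(errors_a) + (n_b - 1) * variance(errors_b)) / df
    t = (mean(errors_a) - mean(errors_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, two_tailed_p(t, df)


# Hypothetical prediction errors (kcal/mol) for two versions of a model.
errors_with_sliding = [0.5, 1.2, 0.8, 1.9]
errors_without_sliding = [3.1, 4.0, 2.2, 5.3]
t_stat, p_value = pooled_t_test(errors_with_sliding, errors_without_sliding)
print(f"t = {t_stat:.2f}, P = {p_value:.4f}, significant = {p_value < 0.05}")
```

A statistics package would normally be used for the P-value; the numerical integration is only there to keep the sketch self-contained.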

RESULTS

Standby site module surface area controls the ribosome’s translation rate

We first employed a learn-by-design approach to determine the ribosomal platform’s interactions with standby sites that contain a single hairpin and, therefore, contain a single standby site module. Standby site modules are defined by their hairpin height (H), upstream distal binding site length (D) and downstream proximal binding site length (P) (Figure 2A, ‘Materials and Methods’ section). We designed, constructed and characterized 62 synthetic 5′ UTRs containing single standby site modules with hairpins from 9 to 18 bp long, distal binding sites from 1 to 15 nt long and proximal binding sites from 1 to 24 nt long. Long mRNA hairpins contain internal loops to avoid potential RNAse activity (34). All mRNA sequences are transcribed by the J23100 constitutive promoter, encode the mRFP1 fluorescent protein and are designed with identical SD and spacer sequences (‘Materials and Methods’ section). Flow cytometry measurements from long-time, log-phase cultures with periodic serial dilutions revealed that mRFP1 fluorescence varied by 221-fold (Figure 2C–E). In contrast, mRNA levels varied by at most 5.8-fold according to RT-qPCR measurements (Supplementary Figure S2). Changes in fluorescence are primarily due to changes in translation rate, and in these growth conditions, fluorescence and translation initiation rates are proportional. All sequences and measurement data are listed in the Supplementary Data.

According to the data, standby site module accessibility plays an important role in modulating the ribosome’s interactions with 5′ UTRs and controlling translation rate across a large range. In particular, lengthening the distal binding site by four nucleotides increased the mRNA’s translation rate by 4.2-fold, whereas lengthening the proximal binding site by the same amount increased it by 4.0-fold. Shortening the hairpin’s height by 4 bp increased the mRNA’s translation rate similarly, by 3.1-fold. More generally, the slopes (first derivatives) of the log-transformed translation rates all have similar magnitudes (Supplementary Data, column E). Highly accessible standby site modules with long proximal binding sites or short hairpins also cause the mRNA’s translation rate to reach a plateau with the same fluorescence level as an mRNA with an unstructured 5′ UTR (Figure 2C). These observations led to the hypothesis that the geometric characteristics of the standby site module (D, P, H) are interchangeable and that its available single-stranded surface area, As, determines the ribosomal platform’s ability to bind to it.

A standby site module’s single-stranded surface area is the number of nucleotides capable of single-stranded contact with the ribosomal platform. Longer proximal or distal binding sites increase the standby site module’s surface area, while longer hairpin structures reduce the standby site module’s accessibility. mRNA structures can potentially reduce site accessibility by folding over the neighboring single-stranded sites and sequestering binding regions. A standby site module’s available surface area is calculated by summing its proximal and distal binding site lengths and subtracting the hairpin’s height (Figure 2B, ‘Materials and Methods’ section). A highly accessible standby site module has a large As, whereas a less accessible standby site module has a small As. We then evaluated whether the standby site module’s surface area is the key determinant by comparing As to the measured log-transformed translation rates. For all 62 5′ UTRs with diverse geometric features, we found a single quantitative relationship that described the effect of surface area on the mRNA’s translation rate (Figure 2F).
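The surface area calculation can be sketched in a few lines of Python. The constant offset of 15 is an assumption: it is not stated in this section, but is inferred here from the worked example given later in the text, where P = 2 nt, D = 2 nt and H = 15 nt give As = 4. The exact definition is in the ‘Materials and Methods’ section, and the function name is hypothetical.

```python
# Assumed offset, inferred from the worked example (P=2, D=2, H=15 gives As=4);
# see the 'Materials and Methods' section for the exact definition.
SURFACE_AREA_OFFSET = 15


def surface_area(proximal_nt, distal_nt, hairpin_height,
                 offset=SURFACE_AREA_OFFSET):
    """Available single-stranded surface area As of one standby site module."""
    # Binding sites add to the surface area; hairpin height subtracts from it.
    return offset + proximal_nt + distal_nt - hairpin_height


area_example = surface_area(proximal_nt=2, distal_nt=2, hairpin_height=15)
area_longer_proximal = surface_area(proximal_nt=6, distal_nt=2, hairpin_height=15)
print(area_example, area_longer_proximal)
```

Because P, D and H enter with coefficients of one, lengthening either binding site or shortening the hairpin by the same amount changes As equally, which is the interchangeability hypothesis stated above.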

We now turn to thermodynamic modeling to quantify the ribosome’s interactions with standby site modules. The competitive binding of free ribosomes to the pool of intracellular mRNAs allows one to use statistical thermodynamics to relate a ribosome’s binding free energy to its translation initiation rate according to a constant log-linear relationship (Equation 2) (13,14). We calculate the interaction energy between the ribosome and an mRNA’s standby site modules by comparing its translation rate to that of a reference unstructured 5′ UTR, both with an identical SD sequence, spacer region and protein coding sequence (Equation 4). Using the data in Figure 2C–E, we found that the standby site binding free energy penalty ΔGstandby varied from 0 to 11.8 kcal/mol when changing its geometric characteristics, and that a simple quadratic function succinctly described the effect of site accessibility on the ribosome’s binding free energy penalty (R² = 0.97, average error is 0.51 kcal/mol) (Figure 2F). A maximally accessible standby site module has a zero binding free energy penalty (ΔGstandby = 0), indicating that the mRNA’s translation initiation rate is not reduced by the ribosomal platform’s interactions. A standby site module with a positive binding free energy penalty (ΔGstandby > 0) has reduced the mRNA’s translation initiation rate by decreasing the ribosomal platform’s ability to initially bind the mRNA.
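The log-linear relationship can be illustrated with a short Python sketch that converts a ratio of translation rates into an apparent free energy penalty and back. The coefficient β = 0.45 mol/kcal is an assumption: it is not stated in this section, although it is consistent with the numbers quoted in the paper (for example, 17.4 kcal/mol corresponding to roughly 2500-fold repression). The function names are hypothetical, and Equations (2) and (4) in the ‘Materials and Methods’ section remain the authoritative forms.

```python
import math

# Assumed apparent Boltzmann coefficient (mol/kcal); not stated in this section.
BETA = 0.45


def apparent_penalty(reference_rate, measured_rate, beta=BETA):
    """Apparent dG_standby (kcal/mol) relative to an unstructured reference 5' UTR.

    Both rates must come from constructs with the same SD sequence,
    spacer region and protein coding sequence.
    """
    return math.log(reference_rate / measured_rate) / beta


def fold_repression(penalty_kcal_per_mol, beta=BETA):
    """Fold reduction in translation initiation rate for a given penalty."""
    return math.exp(beta * penalty_kcal_per_mol)


# A 221-fold drop in fluorescence corresponds to roughly 12 kcal/mol here,
# close to the 11.8 kcal/mol maximum reported in the text.
print(round(apparent_penalty(221.0, 1.0), 1))
print(round(fold_repression(17.4)))
```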

Next, we critically tested the parameterized model by constructing 15 new 5′ UTRs and measuring their translation initiation rates and ribosome binding free energy penalties. The 5′ UTRs each contain a standby site module with diverse structures between 43 and 91 nt long, including variable-length distal and proximal binding sites, varied hairpin heights and two- and three-branched structures (Supplementary Data). For branched structures, we show that the relevant hairpin height is the average across the multiple branches (Supplementary Figure S3; see also Supplementary Text). The quadratic relationship accurately predicted the ribosome’s translation rate and binding free energy penalty (R² = 0.78, average error is 0.66 kcal/mol), confirming that a standby site module’s accessibility controls the ribosome’s ability to bind and initiate translation (Figure 2G). The quadratic relationship contains three parameters, which are now treated as constants (Equation 3, ‘Materials and Methods’ section).
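A three-parameter quadratic of this kind can be sketched as follows. The coefficients below are illustrative values chosen to reproduce the penalties quoted elsewhere in the text (about 17.4 kcal/mol at As = 0, 11.5 at As = 4, 7.8 at As = 7 and 2.6 at As = 13). They are not taken from this section, and the fitted constants of Equation (3) in the ‘Materials and Methods’ section should be used in practice. Setting the penalty to zero beyond the minimum of the parabola is also an assumption, reflecting the plateau described for highly accessible modules.

```python
# Illustrative coefficients (kcal/mol), chosen to reproduce values quoted in the
# text; they are NOT the fitted constants of Equation (3).
C1, C2, C3 = 0.038, -1.629, 17.359


def distortion_penalty(surface_area):
    """Quadratic distortion penalty dG_distortion (kcal/mol) versus surface area As."""
    vertex = -C2 / (2.0 * C1)  # surface area at which the parabola is lowest
    if surface_area >= vertex:
        # Assumed plateau: maximally accessible modules carry no penalty.
        return 0.0
    return max(0.0, C1 * surface_area ** 2 + C2 * surface_area + C3)


for area in (0, 4, 7, 13, 30):
    print(area, round(distortion_penalty(area), 1))
```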

Coupled energetic trade-offs when unfolding structures in standby site modules

Prior to initiating translation, the 30S ribosome does not have an external source of energy. Initiation factor 2 does not hydrolyze GTP until after translation has been initiated, during 50S recruitment (22). Consequently, the 30S complex expends its own free energy to unfold mRNA structures. Previous studies have shown that mRNA structures that sequester the 16S rRNA binding site are unfolded prior to initiating translation (9,14,22). The mRNA’s translation initiation rate is reduced according to the free energy penalty for unfolding these structures, as described in Equations (1) and (2) (‘Materials and Methods’ section). However, it remains unclear whether mRNA structures located upstream of SD sequences, within standby site modules, are unfolded or accommodated. We next investigated how the ribosome unfolds hairpins in long 5′ UTRs.

By definition, there is a coupled thermodynamic relationship between the unfolding of an mRNA hairpin and the increased accessibility of the corresponding standby site module, where the accessibility of the standby site module controls the ribosomal platform’s distortion energy penalty, ΔGdistortion (see ‘Materials and Methods’ section). Unfolding a single base pair in a module’s hairpin will increase the standby site module’s surface area by three nucleotides: the hairpin’s height decreases by one and the lengths of both the proximal and distal binding sites increase by one. Therefore, unfolding a module’s hairpin will lead to higher accessibility, a lower ΔGdistortion and a higher translation rate. However, unfolding base pairs requires an input of free energy (ΔGunfolding > 0), which will lead to a lower translation rate. Consequently, there is an energetic trade-off between unfolding mRNA structures and increasing the surface accessibility of standby site modules. The ribosome may completely unfold a structured standby site module to maximize its surface area, but the energetic cost may be high (high ΔGunfolding penalty with ΔGdistortion = 0). In contrast, the ribosome may fully accommodate a standby site module’s hairpin without unfolding it, even though its low surface accessibility will penalize the ribosome’s ability to bind to it (high ΔGdistortion penalty with ΔGunfolding = 0).

Based on thermodynamic principles, we hypothesize that the ribosome will ‘selectively’ and ‘partially’ unfold structured standby site modules to minimize its total binding free energy penalty (ΔGstandby = ΔGdistortion + ΔGunfolding). Figure 3A and B presents an illustrative example of this principle and our use of minimization to calculate the ribosome’s binding free energy penalty to a standby site module. A 5′ UTR with a single standby site module initially contains a short proximal binding site (P = 2 nt), a short distal binding site (D = 2 nt) and a long hairpin (H = 15 nt), resulting in a largely inaccessible 5′ UTR (As = 4). Based on the standby site module’s surface area, the ribosome has an initial distortion energy penalty of 11.5 kcal/mol. However, the ribosome can selectively unfold the hairpin’s closing AU base pairs, increasing the standby site module’s surface area and lowering its ΔGdistortion at the expense of increasing its ΔGunfolding. Unfolding 1 bp increases the standby site module’s surface area by 3 nt, decreases the distortion energy penalty to 7.8 kcal/mol and increases the unfolding energy penalty to 0.90 kcal/mol, leading to a total binding free energy penalty of 8.7 kcal/mol. The ribosome can continue to unfold base pairs to minimize its total binding free energy penalty: unfolding 3 bp at a cost of 2.7 kcal/mol leads to a distortion energy penalty of 2.6 kcal/mol and a total binding free energy penalty of 5.3 kcal/mol (Figure 3B). Two additional examples are shown in Supplementary Figure S4. The unfolding free energies are calculated using the Vienna RNA suite with the Turner 1999 RNA parameters (see ‘Materials and Methods’ section). Importantly, thermodynamic minimization does not add any new parameters to the model.

We tested the proposed mechanism by characterizing the translation rate of 22 synthetic 5′ UTRs (Figure 3C–E) that contain single standby site modules with varied geometries (D = 0 to 7 nt and P = 1 to 20 nt) and differing hairpin sequences, sizes and folding energetics (Supplementary Figure S5A). We employed the minimization principle to calculate the number of unfolded base pairs and to predict the ribosome’s binding free energy penalty, ΔGstandby (Figure 3D). As comparisons, we performed the same calculation while asserting either that no additional energy was inputted to unfold hairpins (high ΔGdistortion and ΔGunfolding = 0) (Figure 3C) or that hairpins were completely unfolded by the ribosome (high ΔGunfolding and ΔGdistortion = 0) (Figure 3E).

The data support our hypothesis that the ribosome partially and selectively unfolds structured standby site modules to minimize its total binding free energy penalty (ΔGstandby), rather than separately minimizing its distortion or unfolding penalties (average error is 0.90 kcal/mol when minimizing ΔGstandby, 2.42 kcal/mol when ΔGunfolding = 0 and 10.24 kcal/mol when ΔGdistortion = 0; comparative P-values are 0.004 and 10^−13, respectively). Remarkably, in all 22 cases, only 1–3 bp of each hairpin are predicted by the model to be unfolded, while the remaining large RNA structures are accommodated.
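The minimization over the number of unfolded base pairs can be sketched in Python for the illustrative hairpin of Figure 3A and B. Several inputs are assumptions: the quadratic coefficients are illustrative values chosen to reproduce the distortion penalties quoted in the text rather than the fitted constants of Equation (3), and the cumulative unfolding energies beyond the three closing base pairs (taken as 0.9 kcal/mol each, as quoted) are hypothetical. In the paper these energies come from the Vienna RNA suite, which is not called here.

```python
# Illustrative quadratic coefficients (kcal/mol); not the fitted Equation (3) constants.
C1, C2, C3 = 0.038, -1.629, 17.359


def distortion_penalty(surface_area):
    """Distortion penalty (kcal/mol); assumed to be zero past the parabola's minimum."""
    if surface_area >= -C2 / (2.0 * C1):
        return 0.0
    return max(0.0, C1 * surface_area ** 2 + C2 * surface_area + C3)


def minimize_standby_penalty(initial_area, cumulative_unfolding):
    """Return (n_bp, total) minimizing dG_distortion + dG_unfolding.

    cumulative_unfolding[n] is the energy needed to open the n outermost
    base pairs of the hairpin; each opened pair adds 3 nt of surface area.
    """
    best = None
    for n_bp, dg_unfolding in enumerate(cumulative_unfolding):
        total = distortion_penalty(initial_area + 3 * n_bp) + dg_unfolding
        if best is None or total < best[1]:
            best = (n_bp, total)
    return best


# Three weak closing pairs at 0.9 kcal/mol each (from the text), followed by
# hypothetical stronger pairs at 3.0 kcal/mol each.
cumulative = [0.0, 0.9, 1.8, 2.7, 5.7, 8.7, 11.7]
n_bp, total = minimize_standby_penalty(initial_area=4, cumulative_unfolding=cumulative)
print(n_bp, round(total, 1))
```

Under these assumptions the minimum falls at three unfolded base pairs with a total penalty near 5.3 kcal/mol, matching the worked example; the two limiting cases (no unfolding, complete unfolding) correspond to the first and last entries of the list.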

Nucleic Acids Research 2014 Vol 42 No 4 2651

Non-cooperative binding to multiple standby site modules

We have shown that the ribosome can accommodate large structures within standby site modules and still engage in frequent translation, but it is unclear how the presence of multiple standby site modules in a 5′ UTR will affect its translation rate. The presence of multiple standby site modules could allow the ribosome to cooperatively bind mRNA and increase its translation rate. We next characterized nine synthetic 5′ UTRs (76–164 nt long) with up to four standby site modules to investigate the potential for cooperative binding (Figure 4A). If the ribosome can engage in cooperative binding, then the additional standby site modules will increase the overall available surface area, reduce the ribosome’s binding free energy penalty and increase the translation initiation rate. In particular, we expect the greatest cooperativity when the available surface area of a single standby site module is limited.

We found that 5′ UTRs with two, three or four standby site modules were all translated at the same rate, indicating an absence of cooperative binding (Figure 4B), including when the available surface areas for individual standby site modules were significantly lowered. Therefore, even when the combined surface areas of multiple standby site modules could potentially increase contact with the ribosomal platform and increase binding, a cooperative increase in translation rate was not observed. Consequently, the data support a mechanism whereby the ribosome binds to only a single standby site module prior to initiating translation.

Distant standby site modules can support high translation rates

We next investigated whether any standby site module could be bound by the ribosome, even if located far upstream of the start codon. Using the model’s predictions, we forward-engineered multi-standby site module synthetic 5′ UTRs where the most distantly upstream site is the only one capable of supporting high translation rates (Figure 4C). Internal standby site modules have long hairpins and limited surface areas (model predictions: ΔGstandby between 9.0 and 17.4 kcal/mol) compared to the maximally accessible upstream standby site modules. Therefore, if the ribosome binds the farthest upstream standby site module, its translation rate will be at least 127-fold higher than binding to the internal standby site modules. We measured fluorescence levels with flow cytometry and mRNA levels with RT-qPCR. Compared to an unstructured 5′ UTR, the mRNA levels decreased by only 1.3- to 5.5-fold, while the fluorescence levels were correspondingly reduced between 2.5- and 18.3-fold, showing that changes in fluorescence were mainly due to changes in translation rate and ribosome binding free energy penalties (Figure 4D).

[Figure 3 image not reproduced. Panels A and B: the example hairpin before and after partial unfolding, and energy (kcal/mol) versus unfolded base pairs and standby site module surface area (As). Panels C–E: fluorescence (au) versus predicted ΔGstandby (kcal/mol) for no RNA unfolding, partial RNA unfolding and complete RNA unfolding.]

Figure 3. (A and B) Competition between ribosomal distortion and selective RNA unfolding is illustrated on a structured 5′ UTR with a weak hairpin. The ribosome’s distortion penalty ΔGdistortion (dashed red line), hairpin unfolding penalty ΔGunfolding (dotted dashed blue line) and their sum ΔGstandby (green solid line) are shown as the hairpin’s closing base pairs are unfolded. The total binding free energy penalty has a local minimum when three base pairs are unfolded. See also Supplementary Figure S4. (C–E) Scatter plots comparing measured translation rates from 22 synthetic 5′ UTRs and model predictions to test three mechanisms for the ribosome’s unfolding of RNA structures. The average difference between model predictions and measurements is (C) 2.42 kcal/mol when not unfolding structures, (D) 0.90 kcal/mol when partially unfolding structures and (E) 10.24 kcal/mol when fully unfolding structures. Comparative P-values from two-sample t-tests are 0.004 (C versus D) and 10^−13 (D versus E). Data points are the result of three measurements in 1 day. In parts (C–E), the diagonal dashed line is the predicted translation rate according to Equation (2).

We found that all the mRNAs’ translation rates remained very high, even when the accessible standby site module was located over 100 nt upstream of the start codon (M4 5′ UTR) (Figure 4E). In particular, the M2 5′ UTR is translated at a 33.8-fold higher rate when compared to the translation rate that could be supported by binding to the standby site module closest to the start codon. Moving the accessible standby site module upstream, away from the start codon, does not significantly reduce the translation rate. Based on these measurements, the apparent standby site binding penalties are 3.6, 6.2 and 6.5 kcal/mol for the M2 to M4 5′ UTRs (Figure 4F), respectively, which are much lower than the internal standby site modules’ predicted binding free energy penalties (9.0–17.4 kcal/mol). These measurements suggest that the ribosome will bind to the standby site module that has the lowest binding free energy penalty, regardless of its distance from the start codon.

The ribosomal platform can slide across upstream structures

Several mechanisms would allow the ribosomal platform to bind distant upstream standby site modules and transition to a pre-initiation complex at the start codon. After the ribosome has bound to the upstream standby site module, the 30S complex could engage in diffusive hopping that would increase its local concentration and assembly rate at the start codon. The rate of assembly would largely depend on the physical distance between the standby site module and start codon, and not on the presence of RNA structures. Alternatively, the ribosome could employ a full-contact sliding mechanism that reorients the mRNA until a stable pre-initiation complex has formed. The positively charged surface of the ribosomal platform makes it ideal for maintaining contact with a 5′ UTR until 16S rRNA hybridization has anchored the mRNA. Unlike a diffusive hopping mechanism, the presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex’s assembly rate.

[Figure 4 image not reproduced. Panels A and C: schematic 5′ UTRs with standby site modules, SD sequence and AUG. Panel B: fluorescence (au) and ΔGstandby (kcal/mol) versus single-stranded length N (nt). Panel D: fold reduction in mRNA level and fluorescence for M1 to M4. Panel E: fluorescence (au) for M1 to M4. Panel F: ΔGdistortion and ΔGdistortion + ΔGsliding (kcal/mol) for each standby site module.]

Figure 4. (A) Synthetic 5′ UTRs with multiple, evenly spaced standby site modules. Spacing N is varied from 4 to 12 nt. (B) Measured translation rates are compared to standby site module multiplicity, showing the absence of cooperative binding to multiple standby site modules, regardless of limiting surface areas. The ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). (C) Synthetic 5′ UTRs with one upstream and multiple internal standby site modules, labeled with Roman numerals. (D) The reduction in fluorescence (red bars) and mRNA levels (blue bars) of 5′ UTRs M1 to M4 compared to an unstructured 5′ UTR (see also Supplementary Figure S2). (E) Fluorescence measurements of 5′ UTRs M1 to M4 show that upstream standby site modules can support high translation rates. (F) Model predictions are shown for 5′ UTRs M1 to M4 using either ΔGdistortion (gray bars) or ΔGdistortion + ΔGsliding (blue bars) for each standby site module (I–IV, corresponding to labels in part C), compared to the ribosome’s apparent binding free energy penalties (dashed lines, yellow region) (average error with and without the sliding penalty is 1.15 and 3.24 kcal/mol, respectively; P-value is 0.013). ΔGdistortion for each standby site module was calculated according to Equation (3). In parts B and E, the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the unstructured 5′ UTR.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA’s translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module’s predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data support the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol of free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nucleotide of hairpin height. We then determined whether including this sliding penalty could improve the model’s ability to predict the translation rate of long 5′ UTRs with multiple standby site modules with ‘hairpins of differing heights’. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for the M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome’s binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4 alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale according to the log-linear relationship between translation initiation rate and binding free energy differences (‘Materials and Methods’ section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).
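The sliding penalty described above is a simple linear function of downstream hairpin heights, sketched here in Python. The coefficient of 0.2 kcal/mol per nucleotide is taken from the text; the function name is hypothetical, and the 15-nt example corresponds to the single intervening structure of the M2 5′ UTR.

```python
# Sliding coefficient from the text: 0.2 kcal/mol per nucleotide of hairpin height.
SLIDING_COEFFICIENT = 0.2


def sliding_penalty(downstream_hairpin_heights, coefficient=SLIDING_COEFFICIENT):
    """dG_sliding (kcal/mol) for a standby site module.

    downstream_hairpin_heights lists the heights (nt) of every RNA structure
    lying between the module and the start codon.
    """
    return coefficient * sum(downstream_hairpin_heights)


m2_penalty = sliding_penalty([15])   # one 15-nt structure, as in the M2 5' UTR
no_structures = sliding_penalty([])  # module adjacent to the start codon
print(round(m2_penalty, 1), no_structures)
```

A module with no intervening structures incurs no sliding penalty, which is why the penalty for the M1 5′ UTR is zero.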

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combined them to create a single biophysical model. The ribosome’s binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to

$$\Delta G_{\mathrm{standby}} = \Delta G_{\mathrm{distortion}} + \Delta G_{\mathrm{unfolding}} + \Delta G_{\mathrm{sliding}} \tag{5}$$

All three Gibbs free energies are positive, and a long, unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32). This model extends our previous biophysical model that focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence’s start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome’s translation initiation rate.
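The per-module calculation and selection step can be sketched in Python as follows. The class and function names are hypothetical. The example values are those quoted in the text for the M2 5′ UTR (0.4 and 11.4 kcal/mol predicted for the upstream and internal modules, and a 3.0 kcal/mol sliding penalty); assigning each module's entire predicted penalty to the distortion term, with zero unfolding, is an assumption made only to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class StandbyModule:
    """One standby site module and its three free energy penalties (kcal/mol)."""
    name: str
    dg_distortion: float
    dg_unfolding: float
    dg_sliding: float

    @property
    def dg_standby(self):
        # Equation (5): sum of distortion, unfolding and sliding penalties.
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def controlling_module(modules):
    """Without cooperative binding, the lowest-penalty module controls initiation."""
    return min(modules, key=lambda module: module.dg_standby)


# Values quoted for the M2 5' UTR; the distortion/unfolding split is assumed.
modules = [
    StandbyModule("internal", dg_distortion=11.4, dg_unfolding=0.0, dg_sliding=0.0),
    StandbyModule("upstream", dg_distortion=0.4, dg_unfolding=0.0, dg_sliding=3.0),
]
selected = controlling_module(modules)
print(selected.name, round(selected.dg_standby, 1))
```

With these inputs the upstream module is selected with a total of about 3.4 kcal/mol, close to the apparent 3.6 kcal/mol penalty reported for M2.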

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38) or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. The RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of the Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules’ characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA’s translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.

[Figure 5 image not reproduced. Panels A and B: fluorescence (au) and apparent ΔGstandby (kcal/mol) versus predicted ΔGstandby (kcal/mol).]

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).

[Figure 6 image not reproduced. Panel A: ΔGstandby (kcal/mol) versus genomic 5′ UTR length (nt), with an inset for short 5′ UTRs. Panel B: number of genomic 5′ UTRs versus ΔGstandby (kcal/mol). Panel C: number of 5′ UTR isoforms versus maximum change in ΔGstandby (kcal/mol).]

Figure 6. (A) The ribosomal platform binding free energy penalties, ΔGstandby, for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA, which uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24-nt-long single-stranded region (domain 3) is essential for ribosome binding, and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model’s calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, as it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA’s translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 2.5-fold for these mutations.

DISCUSSION

We designed and characterized synthetic mRNA se-quences and applied biophysical modeling to decipherthe rules for the ribosomersquos interactions at structured50 UTRs The 30S ribosome searches for a singlestandby site module that supports the highest translationrate regardless of its distance from the start codon Theribosomersquos binding free energy penalty to individualstandby site modules is governed by a competing trade-off between the accessibility of standby site modules

GAACUGGUUAAUAAACAAAUUUUUC

AG

GUGG

U

G

UACCUG

AAUU

G

G

AG

UUUG

AAC C

UG

ACUA

GA

UUCGUG

U

G

A

GCGUU

U

U

CGCAAU

GUUU

G

AA

UU

GC

A

CA

GU

AUUU

GAUACGGC

AU

UAAGGAUAUAAAAUGSD

U

AUGUU

U

UGCAAA

GCUU U

U

CGUG

ACUAGUGC

G

GU

C

CA

Domain 2

Domain 3

Domain 4

minus1

ILOΔ1

ILOΔ2

minus90

minus120

minus150

minus163 minus60 minus13

minus32

ILOΔ4 ILOΔ5

minus49

Figure 7 The sequence and structure of E coli thrS 50 UTR are shown Similar to Sacerdot et al (45) nucleotide numbering is negative for theupstream 50 UTR with respect to the start codon The list of mutations is BS4-9 G(-32)gtA BS4-9CS29 G(-32)gtA and G(-46)gtA L19 A(-13)gtCL7 U(-49)gtG CS302 deletion of domain 2 from A(-14) to U(-48) ILO1 transcription begins at G(-159) ILO2 transcription begins at G(-68)ILO4 transcription begins at G(-49) U(-49)gtG and A(-13)gtC ILO5 transcription begins at G(-49) U(-49)gtG and G(-46)gtA Bolded nucleo-tides are the positions of new transcriptional start sites

2656 Nucleic Acids Research 2014 Vol 42 No 4

(Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome's selective and partial unfolding of RNA structures. Our data support a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

Importantly, initially designed sequences have well-defined features that perturb the ribosome's individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome's binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model's ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules is a potential binding site for the ribosome, according to its binding free energy penalty, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures such as G-quadruplexes and pseudoknots will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform's ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data show that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome's translation initiation rate, for example, when the antibiotic edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform's structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome's distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome's binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences, and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model's error can lead to further insight into the ribosome's interactions. The model's predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome's binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites near RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome's interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface are available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance; Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.


FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai, A., Petrov, A., Marshall, R.A., Korlach, J., Uemura, S. and Puglisi, J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti, A., Marzi, S., Jenner, L., Myasnikov, A., Romby, P., Yusupova, G., Klaholz, B.P. and Yusupov, M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner, L., Romby, P., Rees, B., Schulze-Briese, C., Springer, M., Ehresmann, C., Ehresmann, B., Moras, D., Yusupova, G. and Yusupov, M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel, C.L., Updegrove, T.B., Janson, B.J. and Storz, G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz, G., Vogel, J. and Wassarman, K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet, L., Dubé, A., Massé, E. and Lafontaine, D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel, C.L. and Smolke, C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease, R.A. and Belfort, M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit, M.H. and Van Duin, J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit, M.H. and Van Duin, J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman, I.A., Evfratov, S.A., Sergiev, P.V. and Dontsova, O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut, A. and Balasubramanian, S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis, H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis, H.M., Mirsky, E.A. and Voigt, C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers, J.M., Goler, J.A., Juminaga, D. and Keasling, J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti, A., Marzi, S., Myasnikov, A.G., Fabbretti, A., Yusupov, M., Gualerzi, C.O. and Klaholz, B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi, S., Myasnikov, A.G., Serganov, A., Ehresmann, C., Romby, P., Yusupov, M. and Klaholz, B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova, G.Z., Yusupov, M.M., Cate, J. and Noller, H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit, M.H. and Van Duin, J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu, X., Lancaster, L., Noller, H.F., Bustamante, C. and Tinoco, I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon, P., Maracci, C., Filonava, L., Gualerzi, C.O. and Rodnina, M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer, S.M. and Joseph, S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille, F., Unoson, C., Vogel, J. and Wagner, E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik, V.K., Guimaraes, J.C., Cambray, G., Mai, Q.-A., Christoffersen, M.J., Martin, L., Yu, A., Lam, C., Rodriguez, C., Bennett, G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale, S. and Arkin, A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou, C., Stanton, B., Chen, Y.-J., Munsky, B. and Voigt, C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri, S., Goodman, D.B., Cambray, G., Mutalik, V.K., Gao, Y., Arkin, A.P., Endy, D. and Church, G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi, S., Furusawa, H., Ueda, T. and Okahata, Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao, Y., Zhang, Z.J., Erickson, D.W., Huang, M., Huang, Y., Li, J., Hwa, T. and Shi, H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews, D.H., Sabina, J., Zuker, M. and Turner, D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia, T., SantaLucia, J., Burkard, M.E., Kierzek, R., Schroeder, S.J., Jiao, X., Cox, C. and Turner, D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber, A.R., Lorenz, R., Bernhart, S.H., Neuböck, R. and Hofacker, I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen, H., Bjerknes, M., Kumar, R. and Jay, E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev, A.V. and Nicholson, A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler, I.M., Collado-Vides, J., Santos-Zavaleta, A., Peralta-Gil, M., Gama-Castro, S., Muñiz-Rascado, L., Bonavides-Martínez, C., Paley, S., Krummenacker, M., Altman, T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik, V.K., Qi, L., Guimaraes, J.C., Lucks, J.B. and Arkin, A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo, G., Landrain, T.E. and Jaramillo, A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura, J.M., Cantor, C.R. and Collins, J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha, J., Reyes, S.J. and Gallivan, J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony, P.C., Perez, C.F., García-García, C. and Block, S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet, G.A. II, Artsimovitch, I., Furman, R., Sosnick, T.R. and Pan, T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch, S.A., Desai, S.K., Sajja, H.K. and Gallivan, J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron, M.-P., Bastet, L., Lussier, A., Simoneau-Roy, M., Massé, E. and Lafontaine, D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove, T., Wilf, N., Sun, X. and Wartell, R.M. (2008) Effect of Hfq on RprA-rpoS mRNA pairing: Hfq-RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot, C., Caillet, J., Graffe, M., Eyermann, F., Ehresmann, B., Ehresmann, C., Springer, M. and Romby, P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili, I.S., Agrawal, R.K., Grassucci, R. and Frank, J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons, W.M., May, J.L., Wimberly, B.T., McCutcheon, J.P., Capel, M.S. and Ramakrishnan, V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti, M., Schlünzen, F., Harms, J., Zarivach, R., Glühmann, M., Avila, H., Bashan, A., Bartels, H., Auerbach, T. and Jacobi, C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson, J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore, J.L., Holmstrom, E.D. and Nesbitt, D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills, J.B., Vacano, E. and Hagerman, P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly(dT) and poly(dA). J. Mol. Biol., 285, 245–257.

52. Egbert, R.G. and Klavins, E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.


translation initiation rates and ribosome binding free energy penalties. The 5′ UTRs each contain a standby site module with diverse structures between 43 and 91 nt long, including variable-length distal and proximal binding sites, hairpin heights, and two- and three-branched structures (Supplementary Data). For branched structures, we show that the relevant hairpin height is the average across the multiple branches (Supplementary Figure S3) (see also Supplementary Text). The quadratic relationship accurately predicted the ribosome's translation rate and binding free energy penalty (R² = 0.78; average error is 0.66 kcal/mol), confirming that a standby site module's accessibility controls the ribosome's ability to bind and initiate translation (Figure 2G). The quadratic relationship contains three parameters, which are now treated as constants (Equation 3, 'Materials and Methods' section).

Coupled energetic trade-offs when unfolding structures in standby site modules

Prior to initiating translation, the 30S ribosome does not have an external source of energy. Initiation factor 2 does not hydrolyze GTP until after translation has been initiated, during 50S recruitment (22). Consequently, the 30S complex expends its own free energy to unfold mRNA structures. Previous studies have shown that mRNA structures that sequester the 16S rRNA binding site are unfolded prior to initiating translation (9,14,22). The mRNA's translation initiation rate is reduced according to the free energy penalty for unfolding these structures, as described in Equations (1) and (2) ('Materials and Methods' section). However, it remains unclear whether mRNA structures located upstream of SD sequences, within standby site modules, are unfolded or accommodated. We next investigated how the ribosome unfolds hairpins in long 5′ UTRs.

By definition, there is a coupled thermodynamic relationship between the unfolding of an mRNA hairpin and increasing the accessibility of the corresponding standby site module, where the accessibility of the standby site module controls the ribosomal platform's distortion energy penalty, ΔGdistortion (see 'Materials and Methods' section). Unfolding a single base pair in a module's hairpin will increase the standby site module's surface area by three nucleotides: the hairpin's height decreases by one, and the lengths of both the proximal and distal binding sites increase by one. Therefore, unfolding a module's hairpin will lead to higher accessibility, a lower ΔGdistortion and a higher translation rate. However, unfolding base pairs requires an input of free energy (ΔGunfolding > 0), which will lead to a lower translation rate. Consequently, there is an energetic trade-off between unfolding mRNA structures and increasing the surface accessibility of standby site modules. The ribosome may completely unfold a structured standby site module to maximize its surface area, but the energetic cost may be high (high ΔGunfolding penalty with ΔGdistortion = 0). In contrast, the ribosome may fully accommodate a standby site module's hairpin without unfolding it, even though its low surface accessibility will penalize the ribosome's ability to bind to it (high ΔGdistortion penalty with ΔGunfolding = 0).

Based on thermodynamic principles, we hypothesize that the ribosome will 'selectively' and 'partially' unfold structured standby site modules to minimize its total binding free energy penalty (ΔGstandby = ΔGdistortion + ΔGunfolding). Figure 3A and B presents an illustrative example of this principle and our use of minimization to calculate the ribosome's binding free energy penalty to a standby site module. A 5′ UTR with a single standby site module initially contains a short proximal binding site (P = 2 nt), a short distal binding site (D = 2 nt) and a long hairpin (H = 15 nt), resulting in a largely inaccessible 5′ UTR (As = 4). Based on the standby site module's surface area, the ribosome has an initial distortion energy penalty of 11.5 kcal/mol. However, the ribosome can selectively unfold the hairpin's closing AU base pairs, increasing the standby site module's surface area and lowering its ΔGdistortion at the expense of increasing its ΔGunfolding. Unfolding 1 bp increases the standby site module's surface area by 3 nt, decreases the distortion energy penalty to 7.8 kcal/mol and increases the unfolding energy penalty to 0.90 kcal/mol, leading to a total binding free energy penalty of 8.7 kcal/mol. The ribosome can continue to unfold base pairs to minimize its total binding free energy penalty; unfolding 3 bp at a cost of 2.7 kcal/mol leads to a distortion energy penalty of 2.6 kcal/mol and a total binding free energy penalty of 5.3 kcal/mol (Figure 3B). Two additional examples are shown in Supplementary Figure S4. The unfolding free energies are calculated using the Vienna RNA suite with the Turner 1999 RNA parameters (see 'Materials and Methods' section). Importantly, thermodynamic minimization does not add any new parameters to the model.

We tested the proposed mechanism by characterizing the translation rate of 22 synthetic 5′ UTRs (Figure 3C–E) that contain single standby site modules with varied geometries (D = 0 to 7 nt and P = 1 to 20 nt) and differing hairpin sequences, sizes and folding energetics (Supplementary Figure S5A). We employed the minimization principle to calculate the number of unfolded base pairs and to predict the ribosome's binding free energy penalty, ΔGstandby (Figure 3D). As comparisons, we performed the same calculation while asserting that either no additional energy was inputted to unfold hairpins (high ΔGdistortion and ΔGunfolding = 0) (Figure 3C) or that hairpins were completely unfolded by the ribosome (high ΔGunfolding and ΔGdistortion = 0) (Figure 3E).

The data support our hypothesis that the ribosome partially and selectively unfolds structured standby site modules to minimize its total binding free energy penalty (ΔGstandby), rather than separately minimizing its distortion or unfolding penalties (average error is 0.90 kcal/mol when minimizing ΔGstandby, 2.42 kcal/mol when ΔGunfolding = 0 and 10.24 kcal/mol when ΔGdistortion = 0; comparative P-values are 0.004 and 10⁻¹³, respectively). Remarkably, in all 22 cases, only 1–3 bp of each hairpin are predicted by the model to be unfolded, while accommodating the remaining large RNA structures.
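The minimization in the worked example can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the RBS Calculator implementation. The distortion curve is simply the quadratic passing through the three worked values quoted above (As = 4, 7 and 13 nt giving 11.5, 7.8 and 2.6 kcal/mol), not the fitted Equation 3. The cumulative unfolding costs for 1 and 3 bp follow the worked example, the 2 bp value is assumed by interpolation, and the costs beyond 3 bp are hypothetical placeholders for stronger base pairs. All function names are ours.

```python
# Sketch of the partial-unfolding minimization for one standby site module.
# Function names and the unfolding costs beyond 3 bp are illustrative assumptions.

# Worked values from the text: surface area As (nt) -> distortion penalty (kcal/mol).
WORKED_POINTS = [(4.0, 11.5), (7.0, 7.8), (13.0, 2.6)]


def distortion_penalty(surface_area):
    """Quadratic through the three worked points (Lagrange interpolation).

    This stands in for the paper's fitted quadratic and should only be used
    between roughly 4 and 19 nt, below the minimum of the parabola.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(WORKED_POINTS):
        term = yi
        for j, (xj, _) in enumerate(WORKED_POINTS):
            if i != j:
                term *= (surface_area - xj) / (xi - xj)
        total += term
    return max(0.0, total)


def minimize_standby_penalty(initial_area, cumulative_unfolding):
    """Return (bp unfolded, total penalty) minimizing distortion + unfolding.

    cumulative_unfolding[n] is the free energy cost of opening the first n
    closing base pairs; each opened pair adds 3 nt of surface area
    (P + 1, D + 1 and H - 1).
    """
    best = None
    for n, unfolding in enumerate(cumulative_unfolding):
        total = distortion_penalty(initial_area + 3 * n) + unfolding
        if best is None or total < best[1]:
            best = (n, total)
    return best


# 0-3 bp follow the worked example; 4 and 5 bp are hypothetical stronger pairs.
unfolding_costs = [0.0, 0.9, 1.8, 2.7, 5.5, 8.5]
n_unfolded, penalty = minimize_standby_penalty(4, unfolding_costs)
print(n_unfolded, round(penalty, 2))
```

With these inputs the minimum falls at three unfolded base pairs with a total penalty of 5.3 kcal/mol, matching the worked example; a different set of hypothetical costs for the deeper base pairs would move the minimum.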


Non-cooperative binding to multiple standby site modules

We have shown that the ribosome can accommodate large structures of standby site modules and engage in frequent translation, but it is unclear how the presence of multiple standby site modules in a 5′ UTR will affect its translation rate. The presence of multiple standby site modules could allow the ribosome to cooperatively bind mRNA and increase its translation rate. We next characterized nine synthetic 5′ UTRs (76–164 nt long) with up to four standby site modules to investigate the potential for cooperative binding (Figure 4A). If the ribosome can engage in cooperative binding, then the additional standby site modules will increase the overall available surface area, reduce the ribosome's binding free energy penalty and increase the translation initiation rate. In particular, we expect the greatest cooperativity when the available surface area of a single standby site module is limited.

We found that 5′ UTRs with two, three or four standby site modules were all translated at the same rate, indicating an absence of cooperative binding (Figure 4B), including when the available surface areas for individual standby site modules were significantly lowered. Therefore, even when the combined surface areas of multiple standby site modules could potentially increase contact with the ribosomal platform and increase binding, a cooperative increase in translation rate was not observed. Consequently, the data support a mechanism whereby the ribosome binds to only a single standby site module prior to initiating translation.

Distant standby site modules can support high translation rates

We next investigated whether any standby site module could be bound by the ribosome, even if located far upstream of the start codon. Using the model's predictions, we forward-engineered multi-standby site module synthetic 5′ UTRs where the most distantly upstream site is the only one capable of supporting high translation rates (Figure 4C). Internal standby site modules have long hairpins and limited surface areas (model predictions:

[Figure 3 image: (A) the example 5′ UTR hairpin before and after partial unfolding; (B) energy [kcal/mol] versus the number of unfolded base pairs (0–10) and the corresponding standby site module surface area As (4–34); (C) 'No RNA Unfolding', (D) 'Partial RNA Unfolding' and (E) 'Complete RNA Unfolding' scatter plots of fluorescence [au] versus predicted ΔGstandby [kcal/mol].]

Figure 3. (A and B) Competition between ribosomal distortion and selective RNA unfolding is illustrated on a structured 5′ UTR with a weak hairpin. The ribosome's distortion penalty ΔGdistortion (dashed red line), hairpin unfolding penalty ΔGunfolding (dotted dashed blue line) and their sum ΔGstandby (green solid line) are shown as the hairpin's closing base pairs are unfolded. The total binding free energy penalty has a local minimum when three base pairs are unfolded. See also Supplementary Figure S4. (C–E) Scatter plots comparing measured translation rates from 22 synthetic 5′ UTRs and model predictions to test three mechanisms for the ribosome's unfolding of RNA structures. The average difference between model predictions and measurements is (C) 2.42 kcal/mol when not unfolding structures, (D) 0.90 kcal/mol when partially unfolding structures and (E) 10.24 kcal/mol when fully unfolding structures. Comparative P-values from two-sample t-tests are 0.004 (C versus D) and 10⁻¹³ (D versus E). Data points are the result of three measurements in 1 day. In parts (C–E), the diagonal dashed line is the predicted translation rate according to Equation (2).


ΔGstandby between 9.0 and 17.4 kcal/mol) compared to the maximally accessible upstream standby site modules. Therefore, if the ribosome binds the farthest upstream standby site module, its translation rate will be at least 12.7-fold higher than binding to the internal standby site modules. We measured fluorescence levels with flow cytometry and mRNA levels with RT-qPCR. Compared to an unstructured 5′ UTR, the mRNA levels decreased by only 1.3- to 5.5-fold, while the fluorescence levels were correspondingly reduced between 2.5- and 18.3-fold, showing that changes in fluorescence were mainly due to changes in translation rate and ribosome binding free energy penalties (Figure 4D).

We found that all the mRNAs' translation rates remained very high, even when the accessible standby site module was located over 100 nt upstream of the start codon (M4 5′ UTR) (Figure 4E). In particular, the M2 5′ UTR is translated at a 33.8-fold higher rate when compared to the translation rate that could be supported by binding to the standby site module closest to the start codon. Moving the accessible standby site module upstream, away from the start codon, does not significantly reduce the translation rate. Based on these measurements, the apparent standby site binding penalties are 3.6, 6.2 and 6.5 kcal/mol for M2 to M4 5′ UTRs (Figure 4F), respectively, which are much lower than the internal standby site modules' predicted binding free energy penalties (9.0–17.4 kcal/mol). These measurements suggest that the ribosome will bind to the standby site module that has the lowest binding free energy penalty, regardless of its distance from the start codon.

The ribosomal platform can slide across upstream structures

Several mechanisms would allow the ribosomal platform to bind distant upstream standby site modules and transition to a pre-initiation complex at the start codon. After the ribosome has bound to the upstream standby site module, the 30S complex could engage in diffusive hopping that would increase its local concentration and assembly rate at the start codon. The rate of assembly would largely depend on the physical distance between the standby site module and start codon, and not on the presence of RNA structures. Alternatively, the ribosome could employ a full contact sliding mechanism that reorients the mRNA until a stable pre-initiation complex has formed. The positively charged surface of the ribosomal platform makes it ideal for maintaining contact with a 5′ UTR until 16S rRNA hybridization has anchored the mRNA. Unlike a diffusive hopping mechanism, the

[Figure 4 image: (A) schematics of 5′ UTRs with multiple standby site modules separated by single-stranded spacers of length N; (B) fluorescence [au] and ΔGstandby [kcal/mol] versus single-stranded length N [nt]; (C) schematics of 5′ UTRs M1 to M4 with standby site modules I–IV; (D) mRNA level and fluorescence fold reductions for M1 to M4; (E) fluorescence [au] for M1 to M4; (F) ΔGdistortion and ΔGdistortion + ΔGsliding [kcal/mol] for standby site modules I–IV of M1 to M4.]

Figure 4. (A) Synthetic 5′ UTRs with multiple, evenly spaced standby site modules. Spacing N is varied from 4 to 12 nt. (B) Measured translation rates are compared to standby site module multiplicity, showing the absence of cooperative binding to multiple standby site modules, regardless of limiting surface areas. The ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). (C) Synthetic 5′ UTRs with one upstream and multiple internal standby site modules, labeled with Roman numerals. (D) The reduction in fluorescence (red bars) and mRNA levels (blue bars) of 5′ UTRs M1 to M4 compared to an unstructured 5′ UTR (see also Supplementary Figure S2). (E) Fluorescence measurements of 5′ UTRs M1 to M4 show that upstream standby site modules can support high translation rates. (F) Model predictions are shown for 5′ UTRs M1 to M4 using either ΔGdistortion (gray bars) or ΔGdistortion + ΔGsliding (blue bars) for each standby site module (I–IV, corresponding to labels in part C), compared to the ribosome's apparent binding free energy penalties (dashed lines, yellow region) (average error with and without the sliding penalty is 1.15 and 3.24 kcal/mol, respectively; P-value is 0.013). ΔGdistortion for each standby site module was calculated according to Equation (3). In parts B and E, the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the unstructured 5′ UTR.


presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex’s assembly rate.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA’s translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module’s predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data supports the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nucleotide of hairpin height. We then determined whether including this sliding penalty could improve the model’s ability to predict the translation rate of long 5′ UTRs with multiple standby site modules with ‘hairpins of differing heights’. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome’s binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4 alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale, according to the log-linear relationship between translation initiation rate and binding free energy differences (‘Materials and Methods’ section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).
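The sliding penalty described above is a simple product of a coefficient and a summed hairpin height, and can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `sliding_penalty` and its argument are hypothetical, and the only values taken from the text are the coefficient (0.2 kcal/mol/nt) and the single 15 nt structure in the M2 5′ UTR.

```python
# Sketch of the sliding penalty: dG_sliding = c * (sum of downstream hairpin heights).
# The function name and signature are illustrative assumptions, not the paper's code.

SLIDING_COEFFICIENT = 0.2  # kcal/mol per nt of hairpin height (value quoted in the text)


def sliding_penalty(downstream_heights_nt, c=SLIDING_COEFFICIENT):
    """Return the sliding free energy penalty (kcal/mol) for one standby site module.

    downstream_heights_nt: heights (in nt) of every RNA structure located between
    the standby site module and the start codon.
    """
    return c * sum(downstream_heights_nt)


# M2-like case from the text: one intervening structure with a height of 15 nt.
print(f"{sliding_penalty([15]):.1f} kcal/mol")  # prints "3.0 kcal/mol"
```

A module with no downstream structures receives no penalty, which is consistent with the value of 0 kcal/mol reported for the M1 5′ UTR.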

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combine them together to create a single biophysical model. The ribosome’s binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to

$$\Delta G_{\text{standby}} = \Delta G_{\text{distortion}} + \Delta G_{\text{unfolding}} + \Delta G_{\text{sliding}} \qquad (5)$$

All three Gibbs free energies are positive, and a long unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite version 1.8.5 (32). This model extends our previous biophysical model that focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence’s start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome’s translation initiation rate.
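The selection step described above (sum the three penalties per module, then keep the module with the lowest total) can be sketched as follows. This is a hedged illustration only: the `StandbyModule` class and `select_controlling_module` function are hypothetical names, and the example numbers are illustrative values loosely based on the M2 5′ UTR discussion, with the split between distortion and unfolding assumed for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandbyModule:
    """One candidate standby site module; all energies in kcal/mol (hypothetical container)."""
    name: str
    dg_distortion: float
    dg_unfolding: float
    dg_sliding: float

    @property
    def dg_standby(self) -> float:
        # Equation (5): the total penalty is the sum of the three contributions.
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def select_controlling_module(modules):
    """Return the module with the lowest dG_standby (non-cooperative binding)."""
    if not modules:
        raise ValueError("at least one standby site module is required")
    return min(modules, key=lambda m: m.dg_standby)


# Illustrative values only: an accessible upstream module that must slide past
# one hairpin, versus an internal module with a large distortion penalty.
modules = [
    StandbyModule("upstream", dg_distortion=0.4, dg_unfolding=0.0, dg_sliding=3.0),
    StandbyModule("internal", dg_distortion=11.4, dg_unfolding=0.0, dg_sliding=0.0),
]
best = select_controlling_module(modules)
print(best.name, round(best.dg_standby, 2))
```

Taking a minimum rather than combining the modules reflects the observation that binding is not cooperative; a cooperative model would instead need some way to pool surface area across modules.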

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38), or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs

[Figure 6 plot residue. Panel A: ΔGstandby [kcal/mol] versus genomic 5′ UTR length [nt], with an inset for short 5′ UTRs. Panel B: number of genomic 5′ UTRs versus ΔGstandby [kcal/mol]. Panel C: number of 5′ UTR isoforms versus maximum change in ΔGstandby [kcal/mol].]

Figure 6. (A) The ribosomal platform binding free energy penalties, ΔGstandby, for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.

[Figure 5 plot residue. Panels A and B: fluorescence [au] and apparent ΔGstandby [kcal/mol] versus predicted ΔGstandby [kcal/mol].]

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).


with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules’ characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA’s translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA that uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24 nt long single-stranded region (domain 3) is essential for ribosome binding, and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model’s calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, where it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA’s translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 2.5-fold for these mutations.

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome’s interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome’s binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules

[Figure 7 diagram residue: secondary structure of the E. coli thrS 5′ UTR showing Domain 2, Domain 3 and Domain 4, the SD sequence and the AUG start codon, with nucleotide positions marked from −163 to −1 and the ILOΔ1, ILOΔ2, ILOΔ4 and ILOΔ5 transcriptional start sites labeled.]

Figure 7. The sequence and structure of the E. coli thrS 5′ UTR are shown. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(−32)>A; BS4-9CS29, G(−32)>A and G(−46)>A; L19, A(−13)>C; L7, U(−49)>G; CS302, deletion of domain 2 from A(−14) to U(−48); ILOΔ1, transcription begins at G(−159); ILOΔ2, transcription begins at G(−68); ILOΔ4, transcription begins at G(−49), U(−49)>G and A(−13)>C; ILOΔ5, transcription begins at G(−49), U(−49)>G and G(−46)>A. Bolded nucleotides are the positions of new transcriptional start sites.


(Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome’s selective and partial unfolding of RNA structures. Our data supports a sliding mechanism, whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).
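The thermodynamic minimization mentioned above can be pictured as a one-dimensional search: for each number of unfolded closing base pairs, add the distortion penalty to the unfolding penalty and keep the smallest total. The sketch below assumes the two energy curves have already been computed elsewhere; the function name `best_unfolding` is hypothetical, and the example curves are invented purely for illustration (they are not measurements or model outputs from this study).

```python
def best_unfolding(dg_distortion_by_bp, dg_unfolding_by_bp):
    """Return (n_unfolded, total_penalty) minimizing distortion + unfolding.

    Both inputs are sequences indexed by the number of closing base pairs
    unfolded (index 0 = no unfolding), in kcal/mol.
    """
    if not dg_distortion_by_bp or len(dg_distortion_by_bp) != len(dg_unfolding_by_bp):
        raise ValueError("energy curves must be non-empty and of equal length")
    totals = [d + u for d, u in zip(dg_distortion_by_bp, dg_unfolding_by_bp)]
    n_best = min(range(len(totals)), key=totals.__getitem__)
    return n_best, totals[n_best]


# Hypothetical curves: distortion falls as unfolding exposes more surface area,
# while the cost of breaking base pairs rises.
distortion = [12.0, 8.0, 5.0, 2.5, 1.5, 1.0]
unfolding = [0.0, 1.0, 2.2, 3.5, 5.5, 8.0]
print(best_unfolding(distortion, unfolding))
```

The two limiting mechanisms correspond to always taking index 0 (no unfolding) or always taking the last index (complete unfolding); the partial-unfolding mechanism takes whichever index gives the lowest sum.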

Importantly, initially designed sequences have well-defined features that perturb the ribosome’s individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome’s binding free energy penalty. The forward design of synthetic sequences, according to model predictions, quantitatively tests our knowledge of these interactions and the model’s ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules are potential binding sites for the ribosome according to their binding free energy penalties, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures, such as G-quadruplexes and pseudoknots, will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform’s ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data shows that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome’s translation initiation rate, for example, when the antibiotic Edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform’s structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome’s distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome’s binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences, and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model’s error can lead to further insight into the ribosome’s interactions. The model’s predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome’s binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites nearby RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome’s interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface are available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance; Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.


FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai,A., Petrov,A., Marshall,R.A., Korlach,J., Uemura,S. and Puglisi,J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti,A., Marzi,S., Jenner,L., Myasnikov,A., Romby,P., Yusupova,G., Klaholz,B.P. and Yusupov,M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner,L., Romby,P., Rees,B., Schulze-Briese,C., Springer,M., Ehresmann,C., Ehresmann,B., Moras,D., Yusupova,G. and Yusupov,M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet,L., Dubé,A., Massé,E. and Lafontaine,D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel,C.L. and Smolke,C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease,R.A. and Belfort,M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit,M.H. and Van Duin,J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit,M.H. and Van Duin,J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman,I.A., Evfratov,S.A., Sergiev,P.V. and Dontsova,O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut,A. and Balasubramanian,S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis,H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis,H.M., Mirsky,E.A. and Voigt,C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers,J.M., Goler,J.A., Juminaga,D. and Keasling,J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti,A., Marzi,S., Myasnikov,A.G., Fabbretti,A., Yusupov,M., Gualerzi,C.O. and Klaholz,B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi,S., Myasnikov,A.G., Serganov,A., Ehresmann,C., Romby,P., Yusupov,M. and Klaholz,B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova,G.Z., Yusupov,M.M., Cate,J. and Noller,H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit,M.H. and Van Duin,J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu,X., Lancaster,L., Noller,H.F., Bustamante,C. and Tinoco,I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon,P., Maracci,C., Filonava,L., Gualerzi,C.O. and Rodnina,M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer,S.M. and Joseph,S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille,F., Unoson,C., Vogel,J. and Wagner,E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Mai,Q.-A., Christoffersen,M.J., Martin,L., Yu,A., Lam,C., Rodriguez,C., Bennett,G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou,C., Stanton,B., Chen,Y.-J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri,S., Goodman,D.B., Cambray,G., Mutalik,V.K., Gao,Y., Arkin,A.P., Endy,D. and Church,G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi,S., Furusawa,H., Ueda,T. and Okahata,Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao,Y., Zhang,Z.J., Erickson,D.W., Huang,M., Huang,Y., Li,J., Hwa,T. and Shi,H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia,T., SantaLucia,J., Burkard,M.E., Kierzek,R., Schroeder,S.J., Jiao,X., Cox,C. and Turner,D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber,A.R., Lorenz,R., Bernhart,S.H., Neuböck,R. and Hofacker,I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen,H., Bjerknes,M., Kumar,R. and Jay,E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev,A.V. and Nicholson,A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler,I.M., Collado-Vides,J., Santos-Zavaleta,A., Peralta-Gil,M., Gama-Castro,S., Muñiz-Rascado,L., Bonavides-Martínez,C., Paley,S., Krummenacker,M., Altman,T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik,V.K., Qi,L., Guimaraes,J.C., Lucks,J.B. and Arkin,A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo,G., Landrain,T.E. and Jaramillo,A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura,J.M., Cantor,C.R. and Collins,J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha,J., Reyes,S.J. and Gallivan,J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony,P.C., Perez,C.F., García-García,C. and Block,S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet,G.A. II, Artsimovitch,I., Furman,R., Sosnick,T.R. and Pan,T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch,S.A., Desai,S.K., Sajja,H.K. and Gallivan,J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron,M.-P., Bastet,L., Lussier,A., Simoneau-Roy,M., Massé,E. and Lafontaine,D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove,T., Wilf,N., Sun,X. and Wartell,R.M. (2008) Effect of Hfq on RprA–rpoS mRNA pairing: Hfq–RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot,C., Caillet,J., Graffe,M., Eyermann,F., Ehresmann,B., Ehresmann,C., Springer,M. and Romby,P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili,I.S., Agrawal,R.K., Grassucci,R. and Frank,J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons,W.M., May,J.L., Wimberly,B.T., McCutcheon,J.P., Capel,M.S. and Ramakrishnan,V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti,M., Schlünzen,F., Harms,J., Zarivach,R., Glühmann,M., Avila,H., Bashan,A., Bartels,H., Auerbach,T. and Jacobi,C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson,J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore,J.L., Holmstrom,E.D. and Nesbitt,D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills,J.B., Vacano,E. and Hagerman,P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly (dT) and poly (dA). J. Mol. Biol., 285, 245–257.

52. Egbert,R.G. and Klavins,E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.


Non-cooperative binding to multiple standby site modules

We have shown that the ribosome can accommodate large structures of standby site modules and engage in frequent translation, but it is unclear how the presence of multiple standby site modules in a 5′ UTR will affect its translation rate. The presence of multiple standby site modules could allow the ribosome to cooperatively bind mRNA and increase its translation rate. We next characterized nine synthetic 5′ UTRs (76–164 nt long) with up to four standby site modules to investigate the potential for cooperative binding (Figure 4A). If the ribosome can engage in cooperative binding, then the additional standby site modules will increase the overall available surface area, reduce the ribosome’s binding free energy penalty and increase the translation initiation rate. In particular, we expect the greatest cooperativity when the available surface area of a single standby site module is limited.

We found that 5′ UTRs with two, three or four standby site modules were all translated at the same rate, indicating an absence of cooperative binding (Figure 4B), including when the available surface areas for individual standby site modules were significantly lowered. Therefore, even when the combined surface areas of multiple standby site modules could potentially increase contact with the ribosomal platform and increase binding, a cooperative increase in translation rate was not observed. Consequently, the data supports a mechanism whereby the ribosome binds to only a single standby site module prior to initiating translation.

Distant standby site modules can support high translation rates

We next investigated whether any standby site module could be bound by the ribosome, even if located far upstream of the start codon. Using the model’s predictions, we forward-engineered multi-standby site module synthetic 5′ UTRs where the most distantly upstream site is the only one capable of supporting high translation rates (Figure 4C). Internal standby site modules have long hairpins and limited surface areas (model predictions

[Figure 3 plot residue. Panel A: a structured 5′ UTR with a weak hairpin, drawn before and after partial unfolding. Panel B: energy [kcal/mol] versus unfolded base pairs and standby site module surface area (As). Panels C–E: fluorescence [au] versus predicted ΔGstandby [kcal/mol] for no RNA unfolding, partial RNA unfolding and complete RNA unfolding.]

Figure 3. (A and B) Competition between ribosomal distortion and selective RNA unfolding is illustrated on a structured 5′ UTR with a weak hairpin. The ribosome’s distortion penalty ΔGdistortion (dashed red line), hairpin unfolding penalty ΔGunfolding (dotted dashed blue line) and their sum ΔGstandby (green solid line) are shown as the hairpin’s closing base pairs are unfolded. The total binding free energy penalty has a local minimum when three base pairs are unfolded. See also Supplementary Figure S4. (C–E) Scatter plots comparing measured translation rates from 22 synthetic 5′ UTRs and model predictions to test three mechanisms for the ribosome’s unfolding of RNA structures. The average difference between model predictions and measurements is (C) 2.42 kcal/mol when not unfolding structures, (D) 0.90 kcal/mol when partially unfolding structures and (E) 10.24 kcal/mol when fully unfolding structures. Comparative P-values from two-sample T-tests are 0.004 (C versus D) and 10⁻¹³ (D versus E). Data points are the result of three measurements in 1 day. In parts (C–E), the diagonal dashed line is the predicted translation rate according to Equation (2).


ΔGstandby between 9.0 and 17.4 kcal/mol) compared to the maximally accessible upstream standby site modules. Therefore, if the ribosome binds the farthest upstream standby site module, its translation rate will be at least 127-fold higher than binding to the internal standby site modules. We measured fluorescence levels with flow cytometry and mRNA levels with RT-qPCR. Compared to an unstructured 5′ UTR, the mRNA levels decreased by only 1.3- to 5.5-fold, while the fluorescence levels were correspondingly reduced between 2.5- and 18.3-fold, showing that changes in fluorescence were mainly due to changes in translation rate and ribosome binding free energy penalties (Figure 4D).

We found that all the mRNAs’ translation rates remained very high, even when the accessible standby site module was located over 100 nt upstream of the start codon (M4 5′ UTR) (Figure 4E). In particular, the M2 5′ UTR is translated at a 33.8-fold higher rate when compared to the translation rate that could be supported by binding to the standby site module closest to the start codon. Moving the accessible standby site module upstream, away from the start codon, does not significantly reduce the translation rate. Based on these measurements, the apparent standby site binding penalties are 3.6, 6.2 and 6.5 kcal/mol for M2 to M4 5′ UTRs (Figure 4F), respectively, which is much lower than the internal standby site modules’ predicted binding free energy penalties (9.0–17.4 kcal/mol). These measurements suggest that the ribosome will bind to the standby site module that has the lowest binding free energy penalty, regardless of its distance from the start codon.

The ribosomal platform can slide across upstream structures

Several mechanisms would allow the ribosomal platform to bind distant upstream standby site modules and transition to a pre-initiation complex at the start codon. After the ribosome has bound to the upstream standby site module, the 30S complex could engage in diffusive hopping that would increase its local concentration and assembly rate at the start codon. The rate of assembly would largely depend on the physical distance between the standby site module and start codon, and not on the presence of RNA structures. Alternatively, the ribosome could employ a full-contact sliding mechanism that reorients the mRNA until a stable pre-initiation complex has formed. The positively charged surface of the ribosomal platform makes it ideal for maintaining contact with a 5′ UTR until 16S rRNA hybridization has anchored the mRNA. Unlike a diffusive hopping mechanism, the presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex's assembly rate.

Figure 4. (A) Synthetic 5′ UTRs with multiple evenly spaced standby site modules. Spacing N is varied from 4 to 12 nt. (B) Measured translation rates are compared to standby site module multiplicity, showing the absence of cooperative binding to multiple standby site modules, regardless of limiting surface areas. The ΔGstandby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). (C) Synthetic 5′ UTRs with one upstream and multiple internal standby site modules, labeled with roman numerals. (D) The reduction in fluorescence (red bars) and mRNA levels (blue bars) of 5′ UTRs M1 to M4 compared to an unstructured 5′ UTR (see also Supplementary Figure S2). (E) Fluorescence measurements of 5′ UTRs M1 to M4 show that upstream standby site modules can support high translation rates. (F) Model predictions are shown for 5′ UTRs M1 to M4 using either ΔGdistortion (gray bars) or ΔGdistortion + ΔGsliding (blue bars) for each standby site module (I–IV, corresponding to labels in part C), compared to the ribosome's apparent binding free energy penalties (dashed lines, yellow region) (average error with and without the sliding penalty is 1.15 and 3.24 kcal/mol, respectively; P-value is 0.013). ΔGdistortion for each standby site module was calculated according to Equation (3). In parts B and E, the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the unstructured 5′ UTR.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA's translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module's predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data support the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol of free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nucleotide of hairpin height. We then determined whether including this sliding penalty could improve the model's ability to predict the translation rate of long 5′ UTRs with multiple standby site modules and hairpins of differing heights. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for the M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome's binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4 alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale according to the log-linear relationship between translation initiation rate and binding free energy differences ('Materials and Methods' section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).
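The sliding penalty described in this section is a sum of downstream hairpin heights scaled by a single coefficient. The following is a minimal sketch of that arithmetic, not the authors' implementation: the function name `sliding_penalty` and its argument names are hypothetical, and the only values taken from the text are the coefficient c = 0.2 kcal/mol/nt and the single 15-nt hairpin downstream of the M2 upstream standby site module.

```python
from typing import Iterable

# Coefficient reported in the text, in kcal/mol per nt of hairpin height.
SLIDING_COEFFICIENT = 0.2


def sliding_penalty(downstream_heights_nt: Iterable[float],
                    coefficient: float = SLIDING_COEFFICIENT) -> float:
    """Return the sliding penalty (kcal/mol) for one standby site module.

    downstream_heights_nt holds the heights (in nt) of every RNA structure
    located between the standby site module and the start codon.
    """
    heights = list(downstream_heights_nt)
    if any(height < 0 for height in heights):
        raise ValueError("hairpin heights must be non-negative")
    return coefficient * sum(heights)


# No downstream structure: nothing has to be pushed aside.
no_structure_penalty = sliding_penalty([])

# One 15-nt hairpin downstream of the module, as for the M2 5' UTR.
m2_penalty = sliding_penalty([15])
```

Under these assumptions the M2 case evaluates to about 3.0 kcal/mol, which matches the sliding penalty quoted for M2 in the text.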

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combined them to create a single biophysical model. The ribosome's binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to

$$\Delta G_{\mathrm{standby}} = \Delta G_{\mathrm{distortion}} + \Delta G_{\mathrm{unfolding}} + \Delta G_{\mathrm{sliding}} \qquad (5)$$

All three Gibbs free energies are positive, and a long unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32). This model extends our previous biophysical model that focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence's start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome's translation initiation rate.
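The selection rule in this procedure (sum the three penalties for each module, then keep the module with the lowest total) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the RBS Calculator code: the `StandbyModule` class and `select_module` function are hypothetical names, and the example values are taken from the M2 5′ UTR discussion (0.4 kcal/mol for the upstream module, 3.0 kcal/mol sliding, 11.4 kcal/mol for the internal module). How the 11.4 kcal/mol splits between terms is not stated here, so it is assigned entirely to the distortion term for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StandbyModule:
    """Energy terms (kcal/mol) for one candidate standby site module."""
    label: str
    dg_distortion: float
    dg_unfolding: float
    dg_sliding: float

    @property
    def dg_standby(self) -> float:
        # Equation (5): the total penalty is the sum of the three terms.
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def select_module(modules: List[StandbyModule]) -> StandbyModule:
    """Return the module with the lowest total penalty (no cooperativity)."""
    if not modules:
        raise ValueError("at least one standby site module is required")
    return min(modules, key=lambda module: module.dg_standby)


# Illustrative M2-like example; the split of the internal module's penalty
# into separate terms is an assumption made for this sketch.
m2_modules = [
    StandbyModule("upstream", dg_distortion=0.4, dg_unfolding=0.0, dg_sliding=3.0),
    StandbyModule("internal", dg_distortion=11.4, dg_unfolding=0.0, dg_sliding=0.0),
]
chosen = select_module(m2_modules)
```

With these inputs the upstream module is chosen with a total of about 3.4 kcal/mol, close to the apparent 3.6 kcal/mol reported for M2. Taking a plain minimum rather than a Boltzmann-weighted sum mirrors the stated absence of cooperative binding.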

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38) or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).

Figure 6. (A) The ribosomal platform binding free energy penalties ΔGstandby for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules' characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA's translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.
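The fold changes quoted in this section follow from the log-linear relationship between translation initiation rate and binding free energy. The sketch below converts between the two; it is an illustration, not the authors' code. The function names are hypothetical, and the value of the proportionality constant is not stated in this section: β = 0.45 mol/kcal is assumed here because it reproduces the roughly 2500-fold repression quoted for a 17.4 kcal/mol penalty.

```python
import math

# Assumed apparent Boltzmann factor (mol/kcal); not stated in this section.
BETA = 0.45


def fold_repression(dg_standby: float, beta: float = BETA) -> float:
    """Fold reduction in translation rate for a penalty in kcal/mol.

    Assumes the rate is proportional to exp(-beta * dG), so a penalty of
    zero leaves the rate unchanged (1-fold).
    """
    return math.exp(beta * dg_standby)


def apparent_penalty(fold: float, beta: float = BETA) -> float:
    """Inverse conversion: penalty (kcal/mol) implied by a fold reduction."""
    if fold <= 0:
        raise ValueError("fold reduction must be positive")
    return math.log(fold) / beta


# Largest genomic penalty quoted in the text.
max_genomic_repression = fold_repression(17.4)
```

The same pair of functions can be used to place fluorescence fold reductions and predicted penalties on a common scale, as done for the paired axes in Figures 4 and 5.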

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA, which uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24-nt-long single-stranded region (domain 3) is essential for ribosome binding and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model's calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, as it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA's translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 25-fold for these mutations.
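The ILOΔ4 calculation is an instance of the thermodynamic minimization used for selective unfolding: each candidate number of unfolded base pairs trades a lower distortion penalty against a higher unfolding penalty, and the state with the lowest sum is kept. The sketch below illustrates that choice using only the two states quoted in the text. The function name and data layout are hypothetical, and a full calculation would evaluate every intermediate number of unfolded base pairs.

```python
from typing import Dict, Tuple


def minimize_standby(states: Dict[int, Tuple[float, float]]) -> Tuple[int, float]:
    """Pick the unfolding state with the lowest distortion + unfolding penalty.

    states maps the number of unfolded base pairs to a tuple of
    (dg_distortion, dg_unfolding), both in kcal/mol.
    Returns (unfolded_bp, dg_total).
    """
    if not states:
        raise ValueError("at least one unfolding state is required")
    best_bp = min(states, key=lambda bp: sum(states[bp]))
    return best_bp, sum(states[best_bp])


# The two ILO-delta-4 states quoted in the text; intermediate states are
# not reported in this section and are therefore omitted.
ilo4_states = {
    0: (17.4, 0.0),   # hairpin left intact, no accessible surface area
    6: (0.02, 5.3),   # 6 bp of the domain 2 hairpin unfolded
}
unfolded_bp, dg_total = minimize_standby(ilo4_states)
```

For these two states the minimum lies at 6 unfolded base pairs with a total of about 5.32 kcal/mol, in line with the value quoted for ILOΔ4.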

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome's interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome's binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules (Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome's selective and partial unfolding of RNA structures. Our data support a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

Figure 7. The sequence and structure of the E. coli thrS 5′ UTR are shown, with domains 2, 3 and 4 labeled. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(-32)>A; BS4-9CS29, G(-32)>A and G(-46)>A; L19, A(-13)>C; L7, U(-49)>G; CS302, deletion of domain 2 from A(-14) to U(-48); ILOΔ1, transcription begins at G(-159); ILOΔ2, transcription begins at G(-68); ILOΔ4, transcription begins at G(-49), U(-49)>G and A(-13)>C; ILOΔ5, transcription begins at G(-49), U(-49)>G and G(-46)>A. Bolded nucleotides are the positions of new transcriptional start sites.

Importantly, initially designed sequences have well-defined features that perturb the ribosome's individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome's binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model's ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules is a potential binding site for the ribosome, according to its binding free energy penalty, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures, such as G-quadruplexes and pseudoknots, will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform's ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data show that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome's translation initiation rate, for example, when the antibiotic Edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform's structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome's distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome's binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model's error can lead to further insight into the ribosome's interactions. The model's predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome's binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites near RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome's interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface are available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance, and Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.


FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai,A., Petrov,A., Marshall,R.A., Korlach,J., Uemura,S. and Puglisi,J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti,A., Marzi,S., Jenner,L., Myasnikov,A., Romby,P., Yusupova,G., Klaholz,B.P. and Yusupov,M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner,L., Romby,P., Rees,B., Schulze-Briese,C., Springer,M., Ehresmann,C., Ehresmann,B., Moras,D., Yusupova,G. and Yusupov,M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet,L., Dubé,A., Massé,E. and Lafontaine,D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel,C.L. and Smolke,C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease,R.A. and Belfort,M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit,M.H. and Van Duin,J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit,M.H. and Van Duin,J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman,I.A., Evfratov,S.A., Sergiev,P.V. and Dontsova,O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut,A. and Balasubramanian,S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis,H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis,H.M., Mirsky,E.A. and Voigt,C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers,J.M., Goler,J.A., Juminaga,D. and Keasling,J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti,A., Marzi,S., Myasnikov,A.G., Fabbretti,A., Yusupov,M., Gualerzi,C.O. and Klaholz,B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi,S., Myasnikov,A.G., Serganov,A., Ehresmann,C., Romby,P., Yusupov,M. and Klaholz,B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova,G.Z., Yusupov,M.M., Cate,J. and Noller,H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit,M.H. and Van Duin,J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu,X., Lancaster,L., Noller,H.F., Bustamante,C. and Tinoco,I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon,P., Maracci,C., Filonava,L., Gualerzi,C.O. and Rodnina,M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer,S.M. and Joseph,S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille,F., Unoson,C., Vogel,J. and Wagner,E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Mai,Q.-A., Christoffersen,M.J., Martin,L., Yu,A., Lam,C., Rodriguez,C., Bennett,G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou,C., Stanton,B., Chen,Y.-J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri,S., Goodman,D.B., Cambray,G., Mutalik,V.K., Gao,Y., Arkin,A.P., Endy,D. and Church,G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi,S., Furusawa,H., Ueda,T. and Okahata,Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao,Y., Zhang,Z.J., Erickson,D.W., Huang,M., Huang,Y., Li,J., Hwa,T. and Shi,H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia,T., SantaLucia,J., Burkard,M.E., Kierzek,R., Schroeder,S.J., Jiao,X., Cox,C. and Turner,D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber,A.R., Lorenz,R., Bernhart,S.H., Neuböck,R. and Hofacker,I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen,H., Bjerknes,M., Kumar,R. and Jay,E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev,A.V. and Nicholson,A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler,I.M., Collado-Vides,J., Santos-Zavaleta,A., Peralta-Gil,M., Gama-Castro,S., Muniz-Rascado,L., Bonavides-Martinez,C., Paley,S., Krummenacker,M., Altman,T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik,V.K., Qi,L., Guimaraes,J.C., Lucks,J.B. and Arkin,A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo,G., Landrain,T.E. and Jaramillo,A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura,J.M., Cantor,C.R. and Collins,J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha,J., Reyes,S.J. and Gallivan,J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony,P.C., Perez,C.F., García-García,C. and Block,S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet,G.A. II, Artsimovitch,I., Furman,R., Sosnick,T.R. and Pan,T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch,S.A., Desai,S.K., Sajja,H.K. and Gallivan,J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron,M.-P., Bastet,L., Lussier,A., Simoneau-Roy,M., Massé,E. and Lafontaine,D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove,T., Wilf,N., Sun,X. and Wartell,R.M. (2008) Effect of Hfq on RprA-rpoS mRNA pairing: Hfq-RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot,C., Caillet,J., Graffe,M., Eyermann,F., Ehresmann,B., Ehresmann,C., Springer,M. and Romby,P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili,I.S., Agrawal,R.K., Grassucci,R. and Frank,J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons,W.M., May,J.L., Wimberly,B.T., McCutcheon,J.P., Capel,M.S. and Ramakrishnan,V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti,M., Schlünzen,F., Harms,J., Zarivach,R., Glühmann,M., Avila,H., Bashan,A., Bartels,H., Auerbach,T. and Jacobi,C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson,J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore,J.L., Holmstrom,E.D. and Nesbitt,D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills,J.B., Vacano,E. and Hagerman,P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly(dT) and poly(dA). J. Mol. Biol., 285, 245–257.

52. Egbert,R.G. and Klavins,E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.


ΔGstandby between 9.0 and 17.4 kcal/mol) compared to the maximally accessible upstream standby site modules. Therefore, if the ribosome binds the farthest upstream standby site module, its translation rate will be at least 127-fold higher than binding to the internal standby site modules. We measured fluorescence levels with flow cytometry and mRNA levels with RT-qPCR. Compared to an unstructured 5′ UTR, the mRNA levels decreased by only 1.3- to 5.5-fold, while the fluorescence levels were correspondingly reduced between 2.5- and 18.3-fold, showing that changes in fluorescence were mainly due to changes in translation rate and ribosome binding free energy penalties (Figure 4D).

We found that all the mRNAs’ translation rates remained very high, even when the accessible standby site module was located over 100 nt upstream of the start codon (M4 5′ UTR) (Figure 4E). In particular, the M2 5′ UTR is translated at a 33.8-fold higher rate when compared to the translation rate that could be supported by binding to the standby site module closest to the start codon. Moving the accessible standby site module upstream, away from the start codon, does not significantly reduce the translation rate. Based on these measurements, the apparent standby site binding penalties are 3.6, 6.2 and 6.5 kcal/mol for M2 to M4 5′ UTRs (Figure 4F), respectively, which are much lower than the internal standby site modules’ predicted binding free energy penalties (9.0–17.4 kcal/mol). These measurements suggest that the ribosome will bind to the standby site module that has the lowest binding free energy penalty, regardless of its distance from the start codon.
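The comparison between fold changes in translation rate and free energy penalties rests on a log-linear relationship. The following is a minimal sketch of that conversion, not the authors' implementation. The constant `BETA` (0.45 mol/kcal) is an assumption that is not stated in this section; it is chosen because it roughly reproduces the 2500-fold repression that the paper associates with a 17.4 kcal/mol penalty. The function names are hypothetical.

```python
import math

# ASSUMPTION: apparent Boltzmann factor in mol/kcal; not given in this section.
BETA = 0.45


def fold_change_from_ddg(ddg_kcal_per_mol, beta=BETA):
    """Fold reduction in translation rate implied by a free energy penalty difference."""
    return math.exp(beta * ddg_kcal_per_mol)


def ddg_from_fold_change(fold_change, beta=BETA):
    """Apparent free energy penalty difference (kcal/mol) implied by a fold reduction."""
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return math.log(fold_change) / beta


if __name__ == "__main__":
    # Difference between an internal module (11.4) and the apparent M2 penalty (3.6).
    print(fold_change_from_ddg(11.4 - 3.6))
    # Penalty difference implied by a 33.8-fold change in translation rate.
    print(ddg_from_fold_change(33.8))
```

Under this assumed value of `BETA`, a difference of about 7.8 kcal/mol corresponds to a fold change in the low thirties, in line with the 33.8-fold figure quoted for the M2 5′ UTR.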

The ribosomal platform can slide across upstream structures

Several mechanisms would allow the ribosomal platform to bind distant upstream standby site modules and transition to a pre-initiation complex at the start codon. After the ribosome has bound to the upstream standby site module, the 30S complex could engage in diffusive hopping that would increase its local concentration and assembly rate at the start codon. The rate of assembly would largely depend on the physical distance between the standby site module and start codon, and not on the presence of RNA structures. Alternatively, the ribosome could employ a full contact sliding mechanism that reorients the mRNA until a stable pre-initiation complex has formed. The positively charged surface of the ribosomal platform makes it ideal for maintaining contact with a 5′ UTR until 16S rRNA hybridization has anchored the mRNA. Unlike a diffusive hopping mechanism, the

Figure 4. (A) Synthetic 5′ UTRs with multiple, evenly spaced standby site modules. Spacing N is varied from 4 to 12 nt. (B) Measured translation rates are compared to standby site module multiplicity, showing the absence of cooperative binding to multiple standby site modules, regardless of limiting surface areas. The ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). (C) Synthetic 5′ UTRs with one upstream and multiple internal standby site modules, labeled with Roman numerals. (D) The reduction in fluorescence (red bars) and mRNA levels (blue bars) of 5′ UTRs M1 to M4 compared to an unstructured 5′ UTR (see also Supplementary Figure S2). (E) Fluorescence measurements of 5′ UTRs M1 to M4 show that upstream standby site modules can support high translation rates. (F) Model predictions are shown for 5′ UTRs M1 to M4 using either ΔGdistortion (gray bars) or ΔGdistortion + ΔGsliding (blue bars) for each standby site module (I–IV, corresponding to labels in part C), compared to the ribosome’s apparent binding free energy penalties (dashed lines, yellow region) (average error with and without the sliding penalty is 1.15 and 3.24 kcal/mol, respectively; P-value is 0.013). ΔGdistortion for each standby site module was calculated according to Equation (3). In parts B and E, the horizontal dashed line is the translation rate and ribosome binding free energy penalty of the unstructured 5′ UTR.

presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex’s assembly rate.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA’s translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module’s predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data support the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol of free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nucleotide of hairpin height. We then determined whether including this sliding penalty could improve the model’s ability to predict the translation rate of long 5′ UTRs with multiple standby site modules with ‘hairpins of differing heights’. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome’s binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4 alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale, according to the log-linear relationship between translation initiation rate and binding free energy differences (‘Materials and Methods’ section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).
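The sliding penalty described above is a simple linear function of hairpin height, and can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function name is hypothetical, and only the single 15-nt structure of the M2 5′ UTR is taken from the text. The individual hairpin heights used for the second example are invented so that they sum to 24 nt, the total implied by a 4.8 kcal/mol penalty.

```python
# Coefficient reported in the text: 0.2 kcal/mol per nucleotide of hairpin height.
SLIDING_COEFFICIENT = 0.2


def sliding_penalty(downstream_hairpin_heights, c=SLIDING_COEFFICIENT):
    """Return the sliding free energy penalty (kcal/mol) for one standby site module.

    downstream_hairpin_heights: heights (nt) of every RNA structure located
    between the standby site module and the start codon.
    """
    if any(height < 0 for height in downstream_hairpin_heights):
        raise ValueError("hairpin heights must be non-negative")
    return c * sum(downstream_hairpin_heights)


if __name__ == "__main__":
    # M2: one downstream structure with a height of 15 nt (from the text).
    print(round(sliding_penalty([15]), 2))
    # HYPOTHETICAL split of a 24-nt total height across two hairpins.
    print(round(sliding_penalty([15, 9]), 2))
```

A module with no downstream structures, such as the one closest to the start codon, receives a sliding penalty of zero under this scheme.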

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combined them to create a single biophysical model. The ribosome’s binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to

$$\Delta G_{standby} = \Delta G_{distortion} + \Delta G_{unfolding} + \Delta G_{sliding} \tag{5}$$

All three Gibbs free energies are positive, and a long, unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32). This model extends our previous biophysical model, which focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence’s start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome’s translation initiation rate.
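The selection rule in the preceding paragraph (sum the three penalties per module, then take the module with the lowest total) can be sketched in a few lines. This is a hedged illustration, not the RBS Calculator's implementation: the class and function names are hypothetical, the per-module penalties are assumed to have been computed elsewhere, and the example values are invented for demonstration, loosely echoing the magnitudes discussed for the M2 5′ UTR.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StandbyModule:
    """Free energy penalties (kcal/mol) for one candidate standby site module."""
    label: str
    dg_distortion: float
    dg_unfolding: float
    dg_sliding: float

    @property
    def dg_standby(self) -> float:
        # Equation (5): the total penalty is the sum of the three terms.
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def controlling_module(modules: Sequence[StandbyModule]) -> StandbyModule:
    """Return the module with the lowest total penalty.

    With no cooperative binding, only this module is taken to set the
    translation initiation rate.
    """
    if not modules:
        raise ValueError("at least one standby site module is required")
    return min(modules, key=lambda module: module.dg_standby)


if __name__ == "__main__":
    # HYPOTHETICAL penalties for a 5' UTR with one upstream and one internal module.
    candidates = [
        StandbyModule("upstream", dg_distortion=0.4, dg_unfolding=0.0, dg_sliding=3.0),
        StandbyModule("internal", dg_distortion=11.4, dg_unfolding=0.0, dg_sliding=0.0),
    ]
    best = controlling_module(candidates)
    print(best.label, round(best.dg_standby, 2))
```

In this invented example the upstream module is selected even though it carries a sliding penalty, because its total remains well below that of the internal module.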

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38), or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of the Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs

Figure 6. (A) The ribosomal platform binding free energy penalties, ΔGstandby, for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).

with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules’ characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA’s translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA, which uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24-nt-long single-stranded region (domain 3) is essential for ribosome binding and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model’s calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, where it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA’s translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 25-fold for these mutations.
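As a consistency check on the values quoted for the ILOΔ4 mutant, the decomposition of Equation (5) can be written out term by term. The values are read here as 0.02 and 5.3 kcal/mol for the distortion and unfolding terms. The sliding term is not reported for this case and is assumed to be zero in this sketch.

```latex
\begin{aligned}
\Delta G_{standby} &= \Delta G_{distortion} + \Delta G_{unfolding} + \Delta G_{sliding} \\
                   &= 0.02 + 5.3 + 0 \\
                   &= 5.32\ \text{kcal/mol}
\end{aligned}
```

Under that assumption the three terms reproduce the reported total of 5.32 kcal/mol.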

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome’s interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome’s binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules

Figure 7. The sequence and structure of the E. coli thrS 5′ UTR are shown. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(−32)>A; BS4-9CS29, G(−32)>A and G(−46)>A; L19, A(−13)>C; L7, U(−49)>G; CS302, deletion of domain 2 from A(−14) to U(−48); ILOΔ1, transcription begins at G(−159); ILOΔ2, transcription begins at G(−68); ILOΔ4, transcription begins at G(−49), U(−49)>G and A(−13)>C; ILOΔ5, transcription begins at G(−49), U(−49)>G and G(−46)>A. Bolded nucleotides are the positions of new transcriptional start sites.

(Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome’s selective and partial unfolding of RNA structures. Our data support a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

Importantly, initially designed sequences have well-defined features that perturb the ribosome’s individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome’s binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model’s ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules is a potential binding site for the ribosome, according to its binding free energy penalty, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures such as G-quadruplexes and pseudoknots will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform’s ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data show that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome’s translation initiation rate, for example, when the antibiotic edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform’s structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome’s distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome’s binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences, and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model’s error can lead to further insight into the ribosome’s interactions. The model’s predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome’s binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites near RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome’s interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface are available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance, and Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.

FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai,A., Petrov,A., Marshall,R.A., Korlach,J., Uemura,S. and Puglisi,J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti,A., Marzi,S., Jenner,L., Myasnikov,A., Romby,P., Yusupova,G., Klaholz,B.P. and Yusupov,M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner,L., Romby,P., Rees,B., Schulze-Briese,C., Springer,M., Ehresmann,C., Ehresmann,B., Moras,D., Yusupova,G. and Yusupov,M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet,L., Dubé,A., Massé,E. and Lafontaine,D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel,C.L. and Smolke,C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease,R.A. and Belfort,M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit,M.H. and Van Duin,J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit,M.H. and Van Duin,J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman,I.A., Evfratov,S.A., Sergiev,P.V. and Dontsova,O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut,A. and Balasubramanian,S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis,H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis,H.M., Mirsky,E.A. and Voigt,C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers,J.M., Goler,J.A., Juminaga,D. and Keasling,J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti,A., Marzi,S., Myasnikov,A.G., Fabbretti,A., Yusupov,M., Gualerzi,C.O. and Klaholz,B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi,S., Myasnikov,A.G., Serganov,A., Ehresmann,C., Romby,P., Yusupov,M. and Klaholz,B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova,G.Z., Yusupov,M.M., Cate,J. and Noller,H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit,M.H. and Van Duin,J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu,X., Lancaster,L., Noller,H.F., Bustamante,C. and Tinoco,I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon,P., Maracci,C., Filonava,L., Gualerzi,C.O. and Rodnina,M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer,S.M. and Joseph,S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille,F., Unoson,C., Vogel,J. and Wagner,E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Mai,Q.-A., Christoffersen,M.J., Martin,L., Yu,A., Lam,C., Rodriguez,C., Bennett,G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou,C., Stanton,B., Chen,Y.-J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri,S., Goodman,D.B., Cambray,G., Mutalik,V.K., Gao,Y., Arkin,A.P., Endy,D. and Church,G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi,S., Furusawa,H., Ueda,T. and Okahata,Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao,Y., Zhang,Z.J., Erickson,D.W., Huang,M., Huang,Y., Li,J., Hwa,T. and Shi,H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia,T., SantaLucia,J., Burkard,M.E., Kierzek,R., Schroeder,S.J., Jiao,X., Cox,C. and Turner,D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber,A.R., Lorenz,R., Bernhart,S.H., Neuböck,R. and Hofacker,I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33 ChenH BjerknesM KumarR and JayE (1994) Determinationof the optimal aligned spacing between the ShinendashDalgarnosequence and the translation initiation codon of Escherichia colimRNAs Nucleic Acids Res 22 4953ndash4957

34 PertzevAV and NicholsonAW (2006) Characterization ofRNA sequence determinants and antideterminants of processingreactivity for a minimal substrate of Escherichia coli ribonucleaseIII Nucleic Acids Res 34 3708ndash3721

35 KeselerIM Collado-VidesJ Santos-ZavaletaA Peralta-GilMGama-CastroS Muniz-RascadoL Bonavides-MartinezCPaleyS KrummenackerM AltmanT et al (2011) EcoCyc a

2658 Nucleic Acids Research 2014 Vol 42 No 4

comprehensive database of Escherichia coli biology Nucleic AcidsRes 39 D583ndashD590

36 MutalikVK QiL GuimaraesJC LucksJB and ArkinAP(2012) Rationally designed families of orthogonal RNA regulatorsof translation Nat Chem Biol 8 447ndash454

37 RodrigoG LandrainTE and JaramilloA (2012) De novoautomated design of small RNA circuits for engineering syntheticriboregulation in living cells Proc Natl Acad Sci USA 10915271ndash15276

38 CalluraJM CantorCR and CollinsJJ (2012) Geneticswitchboard for synthetic biology applications Proc Natl AcadSci USA 109 5850ndash5855

39 SinhaJ ReyesSJ and GallivanJP (2010) Reprogrammingbacteria to seek and destroy an herbicide Nat Chem Biol 6464ndash470

40 AnthonyPC PerezCF Garcıa-GarcıaC and BlockSM(2012) Folding energy landscape of the thiamine pyrophosphateriboswitch aptamer Proc Natl Acad Sci USA 109 1485ndash1489

41 PerdrizetGA II ArtsimovitchI FurmanR SosnickTR andPanT (2012) Transcriptional pausing coordinates folding of theaptamer domain and the expression platform of a riboswitchProc Natl Acad Sci USA 109 3323ndash3328

42 LynchSA DesaiSK SajjaHK and GallivanJP (2007) Ahigh-throughput screen for synthetic riboswitches revealsmechanistic insights into their function Chem Biol 14 173ndash184

43 CaronM-P BastetL LussierA Simoneau-RoyM MasseEand LafontaineDA (2012) Dual-acting riboswitch control oftranslation initiation and mRNA decay Proc Natl Acad SciUSA 109 E3444ndashE3453

44 UpdegroveT WilfN SunX and WartellRM (2008) Effect ofHfq on RprArpoS mRNA pairing HfqRNA binding and the

influence of the 50 rpoS mRNA leader region Biochemistry 4711184ndash11195

45 SacerdotC CailletJ GraffeM EyermannF EhresmannBEhresmannC SpringerM and RombyP (1998) The Escherichiacoli threonyl-tRNA synthetase gene contains a split ribosomalbinding site interrupted by a hairpin structure that is essential forautoregulation Mol Microbiol 29 1077ndash1090

46 GabashviliIS AgrawalRK GrassucciR and FrankJ (1999)Structure and structural variations of the Escherichia coli 30 Sribosomal subunit as revealed by three-dimensional cryo-electronmicroscopy J Mol Biol 286 1285ndash1291

47 ClemonsWM MayJL WimberlyBT McCutcheonJPCapelMS and RamakrishnanV (1999) Structure of a bacterial30S ribosomal subunit at 55 A resolution Nature 400 833ndash840

48 PiolettiM SchlunzenF HarmsJ ZarivachR GluhmannMAvilaH BashanA BartelsH AuerbachT and JacobiC(2001) Crystal structures of complexes of the small ribosomalsubunit with tetracycline edeine and IF3 EMBO J 201829ndash1839

49 WilliamsonJR (2000) Induced fit in RNAndashprotein recognitionNat Struct Mol Biol 7 834ndash837

50 FioreJL HolmstromED and NesbittDJ (2012) Entropicorigin of Mg2+-facilitated RNA folding Proc Natl Acad SciUSA 109 2902ndash2907

51 MillsJB VacanoE and HagermanPJ (1999) Flexibility ofsingle-stranded DNA use of gapped duplex helices to determinethe persistence lengths of poly (dT) and poly (dA) J Mol Biol285 245ndash257

52 EgbertRG and KlavinsE (2012) Fine-tuning gene networksusing simple sequence repeats Proc Natl Acad Sci USA 10916817ndash16822

Nucleic Acids Research, 2014, Vol. 42, No. 4


presence of mRNA structures would impede this sliding motion and reduce the pre-initiation complex's assembly rate.

Further analysis of our measurements shows that RNA structures in between the upstream standby site module and start codon partly attenuate the mRNA's translation rate (Figure 4E). The apparent binding free energy penalty of the M2 5′ UTR (3.6 kcal/mol) is higher than the upstream standby site module's predicted binding free energy penalty (0.4 kcal/mol). This energetic difference cannot be explained by binding to the internal standby site module (11.4 kcal/mol) or by the unfolding of the downstream RNA structure (17.5 kcal/mol). Thus, the free energy difference (3.6 − 0.4 = 3.2 kcal/mol) is significant, but much lower than predicted by alternative binding to other standby site modules or from unfolding of RNA structures.

Our data support the presence of a sliding mechanism for the ribosomal platform. Intervening mRNA structures are not unfolded, but are pushed aside with an energetic cost. Using our measurements of the M2 5′ UTR, we estimate that 3.2 ± 0.5 kcal/mol of free energy is required to push aside the single RNA structure with a height of 15 nt, yielding a coefficient of 0.20 ± 0.04 kcal/mol per nucleotide of hairpin height. We then determined whether including this sliding penalty could improve the model's ability to predict the translation rate of long 5′ UTRs with multiple standby site modules with ‘hairpins of differing heights’. For each standby site module, we calculated the heights of all downstream RNA structures and multiplied the sum by the coefficient (c = 0.2 kcal/mol/nt) to yield a free energy penalty of sliding, ΔGsliding. The sliding free energy penalties were 0, 3.0, 4.8 and 7.8 kcal/mol for the M1 to M4 5′ UTRs, respectively. We add together the sliding, distortion and unfolding free energy penalties to calculate the ribosome's binding free energy penalty to each standby site module.

Figure 4E shows the measured translation initiation rates for M1 to M4 alongside the predicted binding free energy penalties (Figure 4F), both with and without the sliding free energy penalty. The data shown in Figure 4E and F are on the same scale, according to the log-linear relationship between translation initiation rate and binding free energy differences (‘Materials and Methods’ section), enabling a comparison between the measurements and the two predictions. The introduction of the sliding free energy penalty enabled the biophysical model to more accurately predict the translation rate from longer 5′ UTRs with several standby site modules (average error is 1.86 kcal/mol with the sliding energy penalty, compared to 4.45 kcal/mol without the sliding energy penalty; P-value is 0.013).
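The sliding penalty described above is a simple linear function of hairpin height, and can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sliding_penalty` and the constant `SLIDING_COEFF` are hypothetical, and only the coefficient (0.2 kcal/mol/nt) and the 15-nt hairpin of the M2 5′ UTR come from the text.

```python
from typing import Iterable

# Coefficient quoted in the text: 0.2 kcal/mol per nucleotide of hairpin height.
SLIDING_COEFF = 0.2


def sliding_penalty(hairpin_heights_nt: Iterable[float],
                    coeff: float = SLIDING_COEFF) -> float:
    """Return the sliding penalty in kcal/mol.

    The penalty is the coefficient multiplied by the summed heights (in nt)
    of all RNA structures downstream of a standby site module.
    """
    return coeff * sum(hairpin_heights_nt)


# M2 has a single downstream hairpin with a height of 15 nt.
print(round(sliding_penalty([15]), 2))  # 3.0

# A module with no downstream structures slides without a penalty.
print(sliding_penalty([]))  # 0.0
```

Under this sketch, the 3.0 kcal/mol penalty reported for M2 follows directly from its single 15-nt hairpin; the hairpin heights of M3 and M4 are not given in this section, so they are not reproduced here.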

A biophysical model predicts translation rates from diversely structured 5′ UTRs

Altogether, the ribosome binds to 5′ UTRs by selecting the most geometrically accessible standby site module that requires the least amount of RNA unfolding. Standby site module selection is not limited by distance to the start codon, and upstream standby site modules can support high rates of translation. We identified that ribosomal distortion, selective RNA unfolding and ribosomal sliding control standby site module selection and binding energetics, and combined them to create a single biophysical model. The ribosome's binding free energy penalty to the standby site modules in a structured 5′ UTR is calculated according to

$$\Delta G_{\mathrm{standby}} = \Delta G_{\mathrm{distortion}} + \Delta G_{\mathrm{unfolding}} + \Delta G_{\mathrm{sliding}} \qquad (5)$$

All three Gibbs free energies are positive, and a long, unstructured 5′ UTR would bind to the ribosomal platform without an energetic penalty (ΔGstandby = 0 kcal/mol). All RNA energetics are calculated using a semi-empirical free energy model of RNA and RNA–RNA interactions (30,31) and the minimization algorithms available in the Vienna RNA suite, version 1.8.5 (32). This model extends our previous biophysical model that focused on downstream mRNA interactions, including the 16S rRNA binding site, spacer region and downstream RNA structures (13,14).

We further tested the accuracy of the biophysical model by characterizing an additional 28 diversely structured 5′ UTRs (68–164 nt long) with two to four standby site modules of varying geometries, RNA structure energetics and hairpin heights (Supplementary Figure S5B). The biophysical model was able to accurately predict the relative translation rates and ribosome binding free energy penalties (average error is 0.79 kcal/mol and R² = 0.83) (Figure 5A). Overall, the biophysical model contains four empirically determined parameter values, but can accurately predict the binding free energy penalties and translation rates of the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89) (Figure 5B).

Genome-wide predictions of 5′ UTR translation rates

We next employed the biophysical model to examine how standby sites in genomic 5′ UTRs control their translation initiation rates. From the annotated E. coli MG1655 genome provided by EcoCyc (35), we collected 3430 5′ UTRs that varied from short leaders to gene-sized regulatory hot spots, with an average 5′ UTR length of 100 nt. The 5′ UTR sequences, from the +1 of the mRNA transcript to +100 after the protein coding sequence's start codon, were inputted into the biophysical model. For individual 5′ UTRs, a list of standby site modules was automatically generated. For each standby site module, the surface area As, distortion energy penalty ΔGdistortion, unfolding energy penalty ΔGunfolding and sliding energy penalty ΔGsliding were calculated. These calculations are combined to determine the ΔGstandby for each standby site module. Because of the absence of cooperative binding, as shown, the standby site module with the lowest binding free energy penalty is selected as the one controlling the ribosome's translation initiation rate.
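The selection rule above (sum the three penalties for each module, then take the module with the lowest total) can be sketched in a few lines. The class `StandbyModule` and the function `controlling_module` are hypothetical names introduced for illustration, and the numbers in the example are illustrative only: they echo values quoted for the M2 5′ UTR, but how they split across the three terms is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandbyModule:
    """Hypothetical container for one standby site module's penalties (kcal/mol)."""
    name: str
    dg_distortion: float
    dg_unfolding: float
    dg_sliding: float

    @property
    def dg_standby(self) -> float:
        # Equation (5): the total penalty is the sum of the three terms.
        return self.dg_distortion + self.dg_unfolding + self.dg_sliding


def controlling_module(modules: List[StandbyModule]) -> StandbyModule:
    """Pick the module with the lowest total penalty (no cooperative binding)."""
    if not modules:
        raise ValueError("at least one standby site module is required")
    return min(modules, key=lambda m: m.dg_standby)


# Illustrative numbers only; the split into terms is assumed.
modules = [
    StandbyModule("upstream", dg_distortion=0.4, dg_unfolding=0.0, dg_sliding=3.0),
    StandbyModule("internal", dg_distortion=11.4, dg_unfolding=0.0, dg_sliding=0.0),
]
best = controlling_module(modules)
print(best.name, round(best.dg_standby, 2))
```

Taking a minimum, rather than a Boltzmann-weighted sum over all modules, mirrors the statement that only a single module controls the translation initiation rate.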

According to the model calculations, we found that the lengths of the natural 5′ UTRs do not correlate with their ability to bind the ribosomal platform (Figure 6A). Short 5′ UTRs with low accessibility can inhibit translation equally well compared to long 5′ UTRs where accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38), or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. The RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of the Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules' characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA's translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown in the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed).

Figure 6. (A) The ribosomal platform binding free energy penalties ΔGstandby for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.
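The fold-changes quoted in this section follow from the log-linear relationship between translation initiation rate and binding free energy. The sketch below shows that conversion under one loud assumption: the slope `BETA = 0.45` mol/kcal is the value reported for the earlier RBS Calculator model and is not stated in this section; the value actually used is defined in the Materials and Methods. The function names are hypothetical.

```python
import math

# ASSUMPTION: apparent Boltzmann slope in mol/kcal, taken from the earlier
# RBS Calculator model; this section does not state the value used here.
BETA = 0.45


def fold_repression(dg_standby: float, beta: float = BETA) -> float:
    """Fold-reduction in translation initiation rate for a given penalty (kcal/mol)."""
    return math.exp(beta * dg_standby)


def dg_from_fold(fold_change: float, beta: float = BETA) -> float:
    """Free energy difference (kcal/mol) that corresponds to a given fold-change."""
    return math.log(fold_change) / beta


# The largest genomic penalty quoted above is 17.4 kcal/mol.
print(round(fold_repression(17.4)))

# Free energy difference implied by the 1088-fold change predicted for mmuP.
print(round(dg_from_fold(1088), 1))
```

With this assumed slope, a 17.4 kcal/mol penalty corresponds to roughly 2500-fold repression, which agrees with the figure quoted in the text, and a 1088-fold change corresponds to about 15.5 kcal/mol.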

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA, which uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24-nt-long single-stranded region (domain 3) is essential for ribosome binding, and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model's calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, because it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA's translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 2.5-fold for these mutations.
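As a check on the ILOΔ4 numbers, the quoted terms can be substituted into Equation (5). The sliding term is not reported for this mutant; taking it as zero is an assumption made here because the distortion and unfolding terms alone account for the stated total.

```latex
\begin{align*}
\Delta G_{\mathrm{standby}}
  &= \Delta G_{\mathrm{distortion}} + \Delta G_{\mathrm{unfolding}} + \Delta G_{\mathrm{sliding}} \\
  &= 0.02 + 5.3 + 0 \\
  &= 5.32\ \mathrm{kcal/mol}
\end{align*}
```

Compared with the 17.4 kcal/mol distortion penalty of binding with no accessible surface area, partially unfolding 6 bp of the domain 2 hairpin lowers the penalty by about 12.1 kcal/mol, which is why the minimization favors unfolding in this case.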

Figure 7. The sequence and structure of E. coli thrS 5′ UTR are shown. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(-32)>A; BS4-9CS29, G(-32)>A and G(-46)>A; L19, A(-13)>C; L7, U(-49)>G; CS302, deletion of domain 2 from A(-14) to U(-48); ILOΔ1, transcription begins at G(-159); ILOΔ2, transcription begins at G(-68); ILOΔ4, transcription begins at G(-49), U(-49)>G and A(-13)>C; ILOΔ5, transcription begins at G(-49), U(-49)>G and G(-46)>A. Bolded nucleotides are the positions of new transcriptional start sites.

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome's interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome's binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules (Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome's selective and partial unfolding of RNA structures. Our data support a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

Importantly, initially designed sequences have well-defined features that perturb the ribosome's individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome's binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model's ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules are potential binding sites for the ribosome, according to their binding free energy penalties, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures such as G-quadruplexes and pseudoknots will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform's ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data show that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome's translation initiation rate, for example, when the antibiotic edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform's structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome's distortion penalty, due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome's binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences, and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model's error can lead to further insight into the ribosome's interactions. The model's predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome's binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites near RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome's interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface are available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance; Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.

FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai,A., Petrov,A., Marshall,R.A., Korlach,J., Uemura,S. and Puglisi,J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti,A., Marzi,S., Jenner,L., Myasnikov,A., Romby,P., Yusupova,G., Klaholz,B.P. and Yusupov,M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner,L., Romby,P., Rees,B., Schulze-Briese,C., Springer,M., Ehresmann,C., Ehresmann,B., Moras,D., Yusupova,G. and Yusupov,M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet,L., Dube,A., Masse,E. and Lafontaine,D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel,C.L. and Smolke,C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease,R.A. and Belfort,M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit,M.H. and Van Duin,J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit,M.H. and Van Duin,J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman,I.A., Evfratov,S.A., Sergiev,P.V. and Dontsova,O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut,A. and Balasubramanian,S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis,H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis,H.M., Mirsky,E.A. and Voigt,C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers,J.M., Goler,J.A., Juminaga,D. and Keasling,J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti,A., Marzi,S., Myasnikov,A.G., Fabbretti,A., Yusupov,M., Gualerzi,C.O. and Klaholz,B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi,S., Myasnikov,A.G., Serganov,A., Ehresmann,C., Romby,P., Yusupov,M. and Klaholz,B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova,G.Z., Yusupov,M.M., Cate,J. and Noller,H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit,M.H. and Van Duin,J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu,X., Lancaster,L., Noller,H.F., Bustamante,C. and Tinoco,I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon,P., Maracci,C., Filonava,L., Gualerzi,C.O. and Rodnina,M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer,S.M. and Joseph,S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille,F., Unoson,C., Vogel,J. and Wagner,E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Mai,Q.-A., Christoffersen,M.J., Martin,L., Yu,A., Lam,C., Rodriguez,C., Bennett,G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou,C., Stanton,B., Chen,Y.-J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri,S., Goodman,D.B., Cambray,G., Mutalik,V.K., Gao,Y., Arkin,A.P., Endy,D. and Church,G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi,S., Furusawa,H., Ueda,T. and Okahata,Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao,Y., Zhang,Z.J., Erickson,D.W., Huang,M., Huang,Y., Li,J., Hwa,T. and Shi,H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia,T., SantaLucia,J., Burkard,M.E., Kierzek,R., Schroeder,S.J., Jiao,X., Cox,C. and Turner,D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber,A.R., Lorenz,R., Bernhart,S.H., Neubock,R. and Hofacker,I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen,H., Bjerknes,M., Kumar,R. and Jay,E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev,A.V. and Nicholson,A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler,I.M., Collado-Vides,J., Santos-Zavaleta,A., Peralta-Gil,M., Gama-Castro,S., Muniz-Rascado,L., Bonavides-Martinez,C., Paley,S., Krummenacker,M., Altman,T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik,V.K., Qi,L., Guimaraes,J.C., Lucks,J.B. and Arkin,A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo,G., Landrain,T.E. and Jaramillo,A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura,J.M., Cantor,C.R. and Collins,J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha,J., Reyes,S.J. and Gallivan,J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony,P.C., Perez,C.F., García-García,C. and Block,S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet,G.A. II, Artsimovitch,I., Furman,R., Sosnick,T.R. and Pan,T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch,S.A., Desai,S.K., Sajja,H.K. and Gallivan,J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron,M.-P., Bastet,L., Lussier,A., Simoneau-Roy,M., Masse,E. and Lafontaine,D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove,T., Wilf,N., Sun,X. and Wartell,R.M. (2008) Effect of Hfq on RprA–rpoS mRNA pairing: Hfq–RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45 SacerdotC CailletJ GraffeM EyermannF EhresmannBEhresmannC SpringerM and RombyP (1998) The Escherichiacoli threonyl-tRNA synthetase gene contains a split ribosomalbinding site interrupted by a hairpin structure that is essential forautoregulation Mol Microbiol 29 1077ndash1090

46 GabashviliIS AgrawalRK GrassucciR and FrankJ (1999)Structure and structural variations of the Escherichia coli 30 Sribosomal subunit as revealed by three-dimensional cryo-electronmicroscopy J Mol Biol 286 1285ndash1291

47 ClemonsWM MayJL WimberlyBT McCutcheonJPCapelMS and RamakrishnanV (1999) Structure of a bacterial30S ribosomal subunit at 55 A resolution Nature 400 833ndash840

48 PiolettiM SchlunzenF HarmsJ ZarivachR GluhmannMAvilaH BashanA BartelsH AuerbachT and JacobiC(2001) Crystal structures of complexes of the small ribosomalsubunit with tetracycline edeine and IF3 EMBO J 201829ndash1839

49 WilliamsonJR (2000) Induced fit in RNAndashprotein recognitionNat Struct Mol Biol 7 834ndash837

50 FioreJL HolmstromED and NesbittDJ (2012) Entropicorigin of Mg2+-facilitated RNA folding Proc Natl Acad SciUSA 109 2902ndash2907

51 MillsJB VacanoE and HagermanPJ (1999) Flexibility ofsingle-stranded DNA use of gapped duplex helices to determinethe persistence lengths of poly (dT) and poly (dA) J Mol Biol285 245ndash257

52 EgbertRG and KlavinsE (2012) Fine-tuning gene networksusing simple sequence repeats Proc Natl Acad Sci USA 10916817ndash16822

Nucleic Acids Research 2014 Vol 42 No 4 2659

Page 10: Translation rate is controlled by coupled trade-offs ......Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream

accessibility is limited by RNA structures. The model calculations indicate that the ribosomal platform can bind to natural 5′ UTRs with a wide range of energetic penalties (up to 17.4 kcal/mol), repressing translation by up to 2500-fold (Figure 6B). Long 5′ UTRs with low ΔGstandby can be subjected to translation repression when trans-acting small RNAs bind and reduce their single-stranded accessibility (23,29,36–38), or when cis-acting riboswitches reduce accessibility when bound to a chemical ligand (39–43). In one example, the standby site in the tisB 5′ UTR is occluded when bound by the IstR-1 small RNA, and its translation is repressed (23). In contrast, long 5′ UTRs with high ΔGstandby can be targets for translation activation when structural remodeling by small RNAs or riboswitches increases their surface accessibility. The RprA small RNA has been shown to bind and up-regulate the translation of rpoS mRNA with the help of the Hfq protein (44).

The model predicts that both types of translation regulation occur in natural 5′ UTRs. In particular, 39% of genomic 5′ UTRs have standby site modules whose distal or proximal binding sites are longer than 15 nt, which are potent targets for translational repression; in contrast, 9% of 5′ UTRs contain standby site modules with low accessibilities (As < 10), making them targets for translation activation. Moreover, 79 diverse 5′ UTRs with an average length of 175 nt have positive sliding energy penalties (ΔGsliding), indicating that distant standby site modules are controlling translation and making them potential targets for long-range regulation.

Further, there are 769 operons where variability in their transcriptional start sites, due to adjacent or overlapping promoters, results in 5′ UTRs with different lengths and mRNA structures. For example, the mmuPM operon has five different transcriptional start sites, leading to 5′ UTR isoforms with lengths from 9 to 205 nt. The biophysical model predicts a 1088-fold change in the translation initiation rate of mmuP as a result of changes in the structure and surface area of the different standby site modules. Overall, across all 5′ UTR isoforms within the genome, model calculations show large changes in translation initiation rates through alterations in the standby site modules' characteristics (Figure 6C). These differences imply that switching promoter usage, due to changing transcription factor or sigma factor levels, can modulate an mRNA's translation rate and potentially create standby site modules whose accessibility can be additionally regulated by translation factors. However, further investigation is necessary to examine the 5′ UTRs of isoforms and the regulation of their translation rates.

Figure 5. (A) A scatter plot showing fluorescence measurements and apparent ΔGstandby energy penalties compared to predicted ΔGstandby energy penalties for 28 synthetic 5′ UTRs with diverse structures (average error is 0.79 kcal/mol and R² = 0.83). (B) The same comparison for the 136 synthetic 5′ UTRs characterized in this study (average error is 0.75 kcal/mol and R² = 0.89). In parts (A and B), the apparent ΔGstandby numbers shown on the secondary y-axis are directly related to the data according to Equation (4). An x = y diagonal line is shown (dashed). Axes: predicted ΔGstandby [kcal/mol] versus fluorescence [au] and apparent ΔGstandby [kcal/mol].

Figure 6. (A) The ribosomal platform binding free energy penalties, ΔGstandby, for 3430 transcribed 5′ UTRs from the E. coli MG1655 genome (EcoCyc release 17.1) are compared to their lengths. The inset shows the ΔGstandby of short 5′ UTRs. (B) The number of genomic 5′ UTRs with different values of ΔGstandby is shown. (C) 5′ UTR isoforms within transcripts across the genome affect their standby site module characteristics, as quantified by the maximum calculated change in ΔGstandby.
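The correspondence between a free energy penalty and a fold change in translation can be checked with a short calculation. The sketch below assumes a Boltzmann-type relation, in which the translation initiation rate scales as exp(−β ΔGstandby), and an apparent β of about 0.45 mol/kcal; both the functional form and the value of β are assumptions drawn from earlier RBS Calculator work and are not stated in this section. The function names are illustrative only.

```python
import math

# Assumed apparent Boltzmann factor in mol/kcal; not stated in this section.
BETA = 0.45

def fold_repression(delta_g, beta=BETA):
    """Fold reduction in translation initiation rate for a penalty in kcal/mol."""
    return math.exp(beta * delta_g)

def penalty_from_fold(fold, beta=BETA):
    """Free energy penalty (kcal/mol) that would produce a given fold reduction."""
    return math.log(fold) / beta

# The largest penalty quoted for natural 5' UTRs is 17.4 kcal/mol.
print(f"{fold_repression(17.4):.0f}-fold repression at 17.4 kcal/mol")
# Conversely, a 2500-fold repression maps back to a penalty near 17.4 kcal/mol.
print(f"{penalty_from_fold(2500):.1f} kcal/mol for 2500-fold repression")
```

Under these assumptions, the 17.4 kcal/mol upper bound and the 2500-fold repression quoted above are mutually consistent, which is the only point the sketch is meant to illustrate.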

Biophysical modeling to calculate translation rates from split ribosome binding sites

Sacerdot et al. (45) studied the translation of the 5′ UTR of E. coli thrS mRNA, which uses a split ribosome binding site. The biophysical model classifies the 5′ UTR as containing three standby site modules with large RNA structures and varying distal and proximal binding site lengths (Figure 7). The authors concluded that a 24-nt long single-stranded region (domain 3) is essential for ribosome binding, and that the ribosome can accommodate a long hairpin (domain 2) without unfolding it, which was later supported by crystallographic data (3). The biophysical model's calculations are strongly consistent with their conclusions. In particular, a mutation that deleted the entire domain 3 and its upstream region (ILOΔ4) repressed the translation rate by over 50-fold, where it significantly reduced the standby site accessibility (from As = 24 and ΔGdistortion = 0 to As = 0 and ΔGdistortion = 17.4 kcal/mol). To offset this large energetic burden, model calculations indicate that the ribosomal platform partially unfolded 6 bp from the domain 2 hairpin, which minimized its binding free energy penalty to a ΔGstandby of 5.32 kcal/mol (As = 20.5, ΔGdistortion = 0.02 and ΔGunfolding = 5.3 kcal/mol). Destabilization of the domain 2 hairpin (ILOΔ5) restored the mRNA's translation to its wild-type rate by increasing the surface area of the standby site to 20.5 nt, resulting in an almost zero ΔGstandby. The biophysical model calculations indicate that other mutations to the thrS 5′ UTR (L6, M1, N1, BS4-9, BS4-9CS29, L19, L7, CS302, ILOΔ1 and ILOΔ2; see Figure 7) did not alter its standby site module surface area or RNA structure; the maximum observed change in translation was 2.5-fold for these mutations.
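The ILOΔ4 numbers above illustrate the trade-off between distortion and unfolding. The following is a minimal sketch, not the authors' implementation: it treats ΔGstandby as the sum of ΔGdistortion and ΔGunfolding (the sliding term is assumed to be zero here) and selects the candidate state with the lowest total. Only the two states quoted in the text are included; the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StandbyState:
    label: str
    unfolded_bp: int      # base pairs unfolded by the ribosomal platform
    dg_distortion: float  # kcal/mol
    dg_unfolding: float   # kcal/mol

    @property
    def dg_standby(self) -> float:
        # Assumption: penalties add, and any sliding penalty is zero.
        return self.dg_distortion + self.dg_unfolding

def preferred_state(states):
    """Return the state with the smallest total binding free energy penalty."""
    return min(states, key=lambda s: s.dg_standby)

# The two ILO-delta-4 states described in the text.
states = [
    StandbyState("domain 2 hairpin intact", 0, 17.4, 0.0),
    StandbyState("6 bp of domain 2 unfolded", 6, 0.02, 5.3),
]

best = preferred_state(states)
print(best.label, round(best.dg_standby, 2))
```

Paying 5.3 kcal/mol to unfold 6 bp is favorable here because it removes almost all of the 17.4 kcal/mol distortion penalty.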

Figure 7. The sequence and structure of the E. coli thrS 5′ UTR are shown. Similar to Sacerdot et al. (45), nucleotide numbering is negative for the upstream 5′ UTR with respect to the start codon. The list of mutations is: BS4-9, G(-32)>A; BS4-9CS29, G(-32)>A and G(-46)>A; L19, A(-13)>C; L7, U(-49)>G; CS302, deletion of domain 2 from A(-14) to U(-48); ILOΔ1, transcription begins at G(-159); ILOΔ2, transcription begins at G(-68); ILOΔ4, transcription begins at G(-49), U(-49)>G and A(-13)>C; ILOΔ5, transcription begins at G(-49), U(-49)>G and G(-46)>A. Bolded nucleotides are the positions of new transcriptional start sites. [Structure diagram not reproduced; labeled features are Domain 2, Domain 3, Domain 4, the SD sequence and the ILOΔ1, ILOΔ2, ILOΔ4 and ILOΔ5 start sites.]

DISCUSSION

We designed and characterized synthetic mRNA sequences and applied biophysical modeling to decipher the rules for the ribosome's interactions at structured 5′ UTRs. The 30S ribosome searches for a single standby site module that supports the highest translation rate, regardless of its distance from the start codon. The ribosome's binding free energy penalty to individual standby site modules is governed by a competing trade-off between the accessibility of standby site modules (Figure 2) and the unfolding of RNA structures to increase accessibility (Figure 3). We demonstrate that thermodynamic minimization predicts the extent of the ribosome's selective and partial unfolding of RNA structures. Our data support a sliding mechanism whereby downstream RNA structures are not unfolded, but attenuate translation rate in a height-dependent manner (Figure 4).

Importantly, initially designed sequences have well-defined features that perturb the ribosome's individual interactions with mRNA, allowing us to precisely measure their effect on the ribosome's binding free energy penalty. The forward design of synthetic sequences according to model predictions quantitatively tests our knowledge of these interactions and the model's ability to predict their strengths. Overall, the biophysical model employs first-principles calculations with four empirically measured constants to accurately predict the translation initiation rates of 136 mRNAs with diversely structured, long 5′ UTRs (Figure 5).
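Model accuracy of this kind is commonly summarized by an average error and an R² value, as in the Figure 5 caption. The sketch below shows one way such statistics could be computed from paired predicted and apparent ΔGstandby values. It assumes that "average error" means the mean absolute error and that R² is the squared Pearson correlation; the source does not define either. The data values are invented for illustration and are not measurements from the study.

```python
def mean_absolute_error(predicted, apparent):
    """Average absolute difference between paired values (kcal/mol)."""
    return sum(abs(p - a) for p, a in zip(predicted, apparent)) / len(predicted)

def squared_pearson(predicted, apparent):
    """Squared Pearson correlation coefficient between paired values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(apparent) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, apparent))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in apparent)
    return cov * cov / (var_p * var_a)

# Hypothetical predicted and apparent penalties in kcal/mol.
predicted = [0.5, 2.0, 4.5, 7.0, 9.5]
apparent = [1.0, 1.5, 5.5, 6.0, 10.0]

print("average error:", mean_absolute_error(predicted, apparent))
print("R^2:", round(squared_pearson(predicted, apparent), 3))
```

A coefficient of determination computed against the x = y line would be an equally defensible reading of R² and would generally give a different number.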

Based on our findings, there is also the potential for long-range regulation of translation. Long, structured 5′ UTRs can feature several upstream standby site modules, extending over 100 nt away from an open reading frame. Any of these standby site modules is a potential binding site for the ribosome, according to its binding free energy penalty, but only a single standby site module is ultimately bound. Both cis-acting and trans-acting factors can modulate individual standby site module surface accessibilities to control standby site module selection and the resulting translation rate (3,17,23). Importantly, the regulation of standby site module selection will produce discrete step changes in translation rate that can further regulate downstream processes according to non-Boolean, multi-state logic.
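The discrete step changes described above follow from the ribosome binding only the single module with the lowest penalty. The sketch below illustrates this with invented penalties for three hypothetical modules. When a regulator raises the penalty of the preferred module, selection switches to the next-best module and the effective penalty jumps rather than changing gradually. The module names and values are assumptions made only for this example.

```python
def select_module(penalties):
    """Return (name, penalty) for the module with the lowest penalty."""
    name = min(penalties, key=penalties.get)
    return name, penalties[name]

# Hypothetical total binding free energy penalties in kcal/mol.
modules = {"distal": 1.0, "middle": 4.0, "proximal": 9.0}

# Hypothetical regulator that occludes the distal module, raising its penalty.
occluded = dict(modules, distal=15.0)

before = select_module(modules)
after = select_module(occluded)
print("unregulated:", before)
print("distal module occluded:", after)
```

Because selection is a minimum over modules, raising the distal penalty beyond 4.0 kcal/mol has no further effect in this example: the output is set entirely by the middle module, which is what gives the multi-state behavior.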

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures, such as G-quadruplexes and pseudoknots, will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.

As suggested by the biophysical model, the ribosomal platform's ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data show that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome's translation initiation rate, for example, when the antibiotic edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform's structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome's distortion penalty due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome's binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences, and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model's error can lead to further insight into the ribosome's interactions. The model's predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome's binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites nearby RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome's interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface are available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance, and Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.

FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai,A., Petrov,A., Marshall,R.A., Korlach,J., Uemura,S. and Puglisi,J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti,A., Marzi,S., Jenner,L., Myasnikov,A., Romby,P., Yusupova,G., Klaholz,B.P. and Yusupov,M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner,L., Romby,P., Rees,B., Schulze-Briese,C., Springer,M., Ehresmann,C., Ehresmann,B., Moras,D., Yusupova,G. and Yusupov,M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet,L., Dubé,A., Massé,E. and Lafontaine,D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel,C.L. and Smolke,C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease,R.A. and Belfort,M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit,M.H. and Van Duin,J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit,M.H. and Van Duin,J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman,I.A., Evfratov,S.A., Sergiev,P.V. and Dontsova,O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut,A. and Balasubramanian,S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis,H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis,H.M., Mirsky,E.A. and Voigt,C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers,J.M., Goler,J.A., Juminaga,D. and Keasling,J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti,A., Marzi,S., Myasnikov,A.G., Fabbretti,A., Yusupov,M., Gualerzi,C.O. and Klaholz,B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi,S., Myasnikov,A.G., Serganov,A., Ehresmann,C., Romby,P., Yusupov,M. and Klaholz,B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova,G.Z., Yusupov,M.M., Cate,J. and Noller,H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit,M.H. and Van Duin,J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu,X., Lancaster,L., Noller,H.F., Bustamante,C. and Tinoco,I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon,P., Maracci,C., Filonava,L., Gualerzi,C.O. and Rodnina,M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer,S.M. and Joseph,S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille,F., Unoson,C., Vogel,J. and Wagner,E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Mai,Q.-A., Christoffersen,M.J., Martin,L., Yu,A., Lam,C., Rodriguez,C., Bennett,G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou,C., Stanton,B., Chen,Y.-J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri,S., Goodman,D.B., Cambray,G., Mutalik,V.K., Gao,Y., Arkin,A.P., Endy,D. and Church,G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi,S., Furusawa,H., Ueda,T. and Okahata,Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao,Y., Zhang,Z.J., Erickson,D.W., Huang,M., Huang,Y., Li,J., Hwa,T. and Shi,H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia,T., SantaLucia,J., Burkard,M.E., Kierzek,R., Schroeder,S.J., Jiao,X., Cox,C. and Turner,D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber,A.R., Lorenz,R., Bernhart,S.H., Neuböck,R. and Hofacker,I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen,H., Bjerknes,M., Kumar,R. and Jay,E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev,A.V. and Nicholson,A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler,I.M., Collado-Vides,J., Santos-Zavaleta,A., Peralta-Gil,M., Gama-Castro,S., Muñiz-Rascado,L., Bonavides-Martinez,C., Paley,S., Krummenacker,M., Altman,T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik,V.K., Qi,L., Guimaraes,J.C., Lucks,J.B. and Arkin,A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo,G., Landrain,T.E. and Jaramillo,A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura,J.M., Cantor,C.R. and Collins,J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha,J., Reyes,S.J. and Gallivan,J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony,P.C., Perez,C.F., García-García,C. and Block,S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet,G.A. II, Artsimovitch,I., Furman,R., Sosnick,T.R. and Pan,T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch,S.A., Desai,S.K., Sajja,H.K. and Gallivan,J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron,M.-P., Bastet,L., Lussier,A., Simoneau-Roy,M., Massé,E. and Lafontaine,D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove,T., Wilf,N., Sun,X. and Wartell,R.M. (2008) Effect of Hfq on RprA-rpoS mRNA pairing: Hfq-RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot,C., Caillet,J., Graffe,M., Eyermann,F., Ehresmann,B., Ehresmann,C., Springer,M. and Romby,P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili,I.S., Agrawal,R.K., Grassucci,R. and Frank,J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons,W.M., May,J.L., Wimberly,B.T., McCutcheon,J.P., Capel,M.S. and Ramakrishnan,V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti,M., Schlünzen,F., Harms,J., Zarivach,R., Glühmann,M., Avila,H., Bashan,A., Bartels,H., Auerbach,T. and Jacobi,C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson,J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore,J.L., Holmstrom,E.D. and Nesbitt,D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills,J.B., Vacano,E. and Hagerman,P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly (dT) and poly (dA). J. Mol. Biol., 285, 245–257.

52. Egbert,R.G. and Klavins,E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.

Page 11: Translation rate is controlled by coupled trade-offs ......Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream

with an average length of 175 nt have positive slidingenergy penalties (Gsliding) indicating that distantstandby site modules are controlling translation andmaking them potential targets for long-range regulationFurther there are 769 operons where variability in their

transcriptional start sites due to adjacent or overlappingpromoters result in 50 UTRs with different lengths andmRNA structures For example the mmuPM operon hasfive different transcriptional start sites leading to 50 UTRisoforms with lengths from 9 to 205 nt The biophysicalmodel predicts a 1088-fold change in the translation initi-ation rate of mmuP as a result of changes in the structureand surface area of the different standby site modulesOverall across all 50 UTR isoforms within the genomemodel calculations show large changes in translation ini-tiation rates through alterations in the standby sitemodulesrsquo characteristics (Figure 6C) These differencesimply that switching promoter usage due to changingtranscription factor or sigma factor levels can modulatean mRNArsquos translation rate and potentially createstandby site modules whose accessibility can be addition-ally regulated by translation factors However furtherinvestigation is necessary to examine the 50 UTRs ofisoforms and the regulation of their translation rates

Biophysical modeling to calculate translation rates fromsplit ribosome binding sites

Sacerdot et al (45) studied the translation of the 50 UTRof E coli thrS mRNA that uses a split ribosome bindingsite The biophysical model classifies the 50 UTR as con-taining three standby site modules with large RNA struc-tures and varying distal and proximal binding site lengths(Figure 7) The authors concluded that a 24 nt long singlestranded region (domain 3) is essential for ribosomebinding and that the ribosome can accommodate a long

hairpin (domain 2) without unfolding it which was latersupported by crystallographic data (3) The biophysicalmodelrsquos calculations are strongly consistent with their con-clusions In particular a mutation that deleted the entiredomain 3 and its upstream region (ILO4) repressed thetranslation rate by over 50-fold where it significantlyreduced the standby site accessibility (from As=24 andGdistortion=0 to As=0 and Gdistortion=174 kcalmol) To offset this large energetic burden model calcu-lations indicate that the ribosomal platform partiallyunfolded 6 bp from the domain 2 hairpin whichminimized its binding free energy penalty to a Gstandby

of 532 kcalmol (As=205 Gdistortion=002 andGunfolding=53 kcalmol) Destabilization of thedomain 2 hairpin (ILO5) restored the mRNArsquos transla-tion to its wild-type rate by increasing the surface area ofthe standby site to 205 nt resulting in an almost zeroGstandby The biophysical model calculations indicatethat other mutations to the thrS 50 UTR (L6 M1 N1BS4-9 BS4-9CS29 L19 L7 CS302 ILO1 andILO2 see Figure 7) did not alter its standby sitemodule surface area or RNA structure the maximumobserved change in translation was 25-fold for thesemutations

DISCUSSION

We designed and characterized synthetic mRNA se-quences and applied biophysical modeling to decipherthe rules for the ribosomersquos interactions at structured50 UTRs The 30S ribosome searches for a singlestandby site module that supports the highest translationrate regardless of its distance from the start codon Theribosomersquos binding free energy penalty to individualstandby site modules is governed by a competing trade-off between the accessibility of standby site modules

GAACUGGUUAAUAAACAAAUUUUUC

AG

GUGG

U

G

UACCUG

AAUU

G

G

AG

UUUG

AAC C

UG

ACUA

GA

UUCGUG

U

G

A

GCGUU

U

U

CGCAAU

GUUU

G

AA

UU

GC

A

CA

GU

AUUU

GAUACGGC

AU

UAAGGAUAUAAAAUGSD

U

AUGUU

U

UGCAAA

GCUU U

U

CGUG

ACUAGUGC

G

GU

C

CA

Domain 2

Domain 3

Domain 4

minus1

ILOΔ1

ILOΔ2

minus90

minus120

minus150

minus163 minus60 minus13

minus32

ILOΔ4 ILOΔ5

minus49

Figure 7 The sequence and structure of E coli thrS 50 UTR are shown Similar to Sacerdot et al (45) nucleotide numbering is negative for theupstream 50 UTR with respect to the start codon The list of mutations is BS4-9 G(-32)gtA BS4-9CS29 G(-32)gtA and G(-46)gtA L19 A(-13)gtCL7 U(-49)gtG CS302 deletion of domain 2 from A(-14) to U(-48) ILO1 transcription begins at G(-159) ILO2 transcription begins at G(-68)ILO4 transcription begins at G(-49) U(-49)gtG and A(-13)gtC ILO5 transcription begins at G(-49) U(-49)gtG and G(-46)gtA Bolded nucleo-tides are the positions of new transcriptional start sites

2656 Nucleic Acids Research 2014 Vol 42 No 4

(Figure 2) and the unfolding of RNA structures toincrease accessibility (Figure 3) We demonstrate thatthermodynamic minimization predicts the extent ofthe ribosomersquos selective and partial unfolding of RNAstructures Our data supports a sliding mechanismwhereby downstream RNA structures are not unfoldedbut attenuate translation rate in a height-dependentmanner (Figure 4)

Importantly initially designed sequences have well-defined features that perturb the ribosomersquos individualinteractions with mRNA allowing us to preciselymeasure their effect on the ribosomersquos binding freeenergy penalty The forward design of synthetic sequencesaccording to model predictions quantitatively tests ourknowledge of these interactions and the modelrsquos abilityto predict their strengths Overall the biophysical modelemploys first-principles calculations with four empiricallymeasured constants to accurately predict the translationinitiation rates of 136 mRNAs with diversely structuredlong 50 UTRs (Figure 5)

Based on our findings there is also the potential forlong-range regulation of translation Long structured50 UTRs can feature several upstream standby sitemodules extending over 100 nt away from an openreading frame Any of these standby site modules arepotential binding sites for the ribosome according totheir binding free energy penalties but only a singlestandby site module is ultimately bound Both cis-actingand trans-acting factors can modulate individual standbysite module surface accessibilities to control standby sitemodule selection and the resulting translation rate(31723) Importantly the regulation of standby sitemodule selection will produce discrete step changes intranslation rate that can further regulate downstreamprocesses according to non-Boolean multi-state logic

In this study, we introduce a new metric to quantify the surface area of a standby site module, which was validated using diverse, but canonical, RNA secondary structures. The metric relates the geometric characteristics of the standby site module (P, D, H) to its available single-stranded surface area, implicitly using coefficients of one to indicate interchangeability. Additional measurements will be needed to precisely measure these coefficients and relate them to the helical shape of A-form RNA. In addition, tertiary structures, such as G-quadruplexes and pseudoknots, will likely have a differently defined metric of surface area. Further investigation will be necessary to determine the effect of tertiary RNA structures on standby site accessibility, ribosome binding and translation initiation.
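To make the notion of interchangeability concrete, the following sketch treats the surface area metric as a linear combination of P, D and H with adjustable coefficients. The function name, the sign convention (single-stranded lengths P and D add area, hairpin height H removes it) and the omission of any constant offset are assumptions of this sketch, not the exact definition used in the model. With unit coefficients, one nucleotide of proximal site is equivalent to one nucleotide of distal site; non-unit coefficients, if later measured, would break that equivalence.

```python
def surface_area(p, d, h, c_p=1.0, c_d=1.0, c_h=1.0):
    """Hypothetical linear surface-area metric, in nucleotide equivalents.

    p, d: lengths of the proximal and distal single-stranded binding sites (nt)
    h:    height of the intervening hairpin (assumed here to reduce area)
    c_*:  weighting coefficients; all equal to one means interchangeable
    """
    return c_p * p + c_d * d - c_h * h

# With unit coefficients, swapping P and D leaves the metric unchanged.
a1 = surface_area(p=8, d=4, h=6)
a2 = surface_area(p=4, d=8, h=6)

# With a hypothetical non-unit distal coefficient, the two designs differ.
b1 = surface_area(p=8, d=4, h=6, c_d=0.5)
b2 = surface_area(p=4, d=8, h=6, c_d=0.5)

print(a1, a2, b1, b2)
```

Comparing predictions for such swapped designs against measurements is one way the coefficients could be estimated.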

As suggested by the biophysical model, the ribosomal platform's ability to bind standby site modules with low surface area relies on its structural flexibility. Previously published cryo-EM data shows that the platform domain in the 30S subunit, in both the free and 50S-bound states, can occupy variable conformational states (46). The two halves of the platform, which are stabilized by ribosomal proteins at its surface, switch conformations during translation (47). Restraining this flexibility will reduce the ribosome's translation initiation rate, for example, when the antibiotic Edeine binds 16S ribosomal RNA helices H23 and H24 (48). Consequently, the ribosomal platform's structural flexibility appears to be an essential aspect of translating mRNAs with structured 5′ UTRs.

Similarly, the rigidity of single-stranded RNA likely plays an important role in modulating the ribosome's distortion penalty due to the entropic differences of binding flexible or rigid macromolecules (49). Similar to DNA, polyA RNA sequences are less flexible than mixed or polyU RNA sequences (50,51). Differences in RNA rigidity have been shown to alter the ribosome's binding affinity to spacer regions separating the SD and start codon sequences (52); binding free energy differences are 2.5 kcal/mol between polyA and polyAU spacer sequences, and 5 kcal/mol between polyA and polyU spacer sequences. Incorporating a sequence-dependent model for RNA rigidity into the distortion penalty calculations could potentially increase their accuracy.

Analysis of the model's error can lead to further insight into the ribosome's interactions. The model's predictions have a normally distributed error for 90% of sequences (Supplementary Figure S6). Thirteen outliers have errors between 2 and 4 kcal/mol (Supplementary Data). Some outliers are due to the effects of increased mRNA degradation of 5′ UTRs with very long proximal or distal binding sites (P, D > 20 nt) (Supplementary Figure S2). In others, extremely short proximal or distal binding sites can result in non-canonical base pairings that alter the ribosome's binding affinity. For example, in the absence of a proximal binding site (P = 0 nt), co-axial stacking will take place between the nucleotides in the distal binding site and the mRNA–rRNA duplex that anchors the ribosome to the mRNA. Similarly, co-axial stacking will play a larger role in predicting the effects of small RNAs that bind to sites nearby RNA structures.

As the list of interactions that control gene expression continues to grow, we will stretch the limits of human pattern recognition and the use of observational correlations. Biophysical modeling offers a comprehensive approach to account for all known interactions with the gene expression machinery, identify gaps, design experiments, precisely measure strengths and make testable predictions. The accuracy of these models will go hand-in-hand with our ability to engineer genetic systems without trial-and-error. We have incorporated the ribosome's interactions with structured standby site modules into an improved biophysical model, called the RBS Calculator v2.0. A software implementation and user-friendly web interface is available at http://salis.psu.edu/software.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha and the Genomics Core Facility at Penn State for technical assistance; Chase Beisel, Richard Lease, Paul Babitzke and the reviewers for providing comments on the manuscript.


FUNDING

DARPA Young Faculty Award [to H.M.S.]; Office of Naval Research [N00014-13-1-0074 to H.M.S.]; National Science Foundation Career Award [CBET-1253641 to H.M.S.]; start-up funds provided by the Penn State Institute for the Energy and Environment. Supported by the NSF Research Experience for Undergraduates program (to A.S.C.). Funding for open access charge: The Office of Naval Research [N00014-13-1-0074 to H.M.S.].

CONTRIBUTIONS

A.E.B. and H.M.S. designed the study, developed the model, analyzed results and wrote the manuscript. A.E.B. and A.S.C. conducted the experiments.

Conflict of interest statement. H.M.S. is the founder of De Novo DNA, a company that commercializes software to design genetic systems. The presented results are independent of financial consequences.

REFERENCES

1. Tsai, A., Petrov, A., Marshall, R.A., Korlach, J., Uemura, S. and Puglisi, J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti, A., Marzi, S., Jenner, L., Myasnikov, A., Romby, P., Yusupova, G., Klaholz, B.P. and Yusupov, M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner, L., Romby, P., Rees, B., Schulze-Briese, C., Springer, M., Ehresmann, C., Ehresmann, B., Moras, D., Yusupova, G. and Yusupov, M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel, C.L., Updegrove, T.B., Janson, B.J. and Storz, G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz, G., Vogel, J. and Wassarman, K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet, L., Dube, A., Masse, E. and Lafontaine, D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel, C.L. and Smolke, C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease, R.A. and Belfort, M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit, M.H. and Van Duin, J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit, M.H. and Van Duin, J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman, I.A., Evfratov, S.A., Sergiev, P.V. and Dontsova, O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut, A. and Balasubramanian, S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis, H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis, H.M., Mirsky, E.A. and Voigt, C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers, J.M., Goler, J.A., Juminaga, D. and Keasling, J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti, A., Marzi, S., Myasnikov, A.G., Fabbretti, A., Yusupov, M., Gualerzi, C.O. and Klaholz, B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi, S., Myasnikov, A.G., Serganov, A., Ehresmann, C., Romby, P., Yusupov, M. and Klaholz, B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova, G.Z., Yusupov, M.M., Cate, J. and Noller, H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit, M.H. and Van Duin, J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu, X., Lancaster, L., Noller, H.F., Bustamante, C. and Tinoco, I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon, P., Maracci, C., Filonava, L., Gualerzi, C.O. and Rodnina, M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer, S.M. and Joseph, S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille, F., Unoson, C., Vogel, J. and Wagner, E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik, V.K., Guimaraes, J.C., Cambray, G., Mai, Q.-A., Christoffersen, M.J., Martin, L., Yu, A., Lam, C., Rodriguez, C., Bennett, G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale, S. and Arkin, A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou, C., Stanton, B., Chen, Y.-J., Munsky, B. and Voigt, C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri, S., Goodman, D.B., Cambray, G., Mutalik, V.K., Gao, Y., Arkin, A.P., Endy, D. and Church, G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi, S., Furusawa, H., Ueda, T. and Okahata, Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao, Y., Zhang, Z.J., Erickson, D.W., Huang, M., Huang, Y., Li, J., Hwa, T. and Shi, H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews, D.H., Sabina, J., Zuker, M. and Turner, D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia, T., SantaLucia, J., Burkard, M.E., Kierzek, R., Schroeder, S.J., Jiao, X., Cox, C. and Turner, D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber, A.R., Lorenz, R., Bernhart, S.H., Neubock, R. and Hofacker, I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen, H., Bjerknes, M., Kumar, R. and Jay, E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev, A.V. and Nicholson, A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler, I.M., Collado-Vides, J., Santos-Zavaleta, A., Peralta-Gil, M., Gama-Castro, S., Muniz-Rascado, L., Bonavides-Martinez, C., Paley, S., Krummenacker, M., Altman, T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik, V.K., Qi, L., Guimaraes, J.C., Lucks, J.B. and Arkin, A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo, G., Landrain, T.E. and Jaramillo, A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura, J.M., Cantor, C.R. and Collins, J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha, J., Reyes, S.J. and Gallivan, J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony, P.C., Perez, C.F., García-García, C. and Block, S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet, G.A. II, Artsimovitch, I., Furman, R., Sosnick, T.R. and Pan, T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch, S.A., Desai, S.K., Sajja, H.K. and Gallivan, J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron, M.-P., Bastet, L., Lussier, A., Simoneau-Roy, M., Masse, E. and Lafontaine, D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove, T., Wilf, N., Sun, X. and Wartell, R.M. (2008) Effect of Hfq on RprA–rpoS mRNA pairing: Hfq–RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot, C., Caillet, J., Graffe, M., Eyermann, F., Ehresmann, B., Ehresmann, C., Springer, M. and Romby, P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili, I.S., Agrawal, R.K., Grassucci, R. and Frank, J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons, W.M., May, J.L., Wimberly, B.T., McCutcheon, J.P., Capel, M.S. and Ramakrishnan, V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti, M., Schlunzen, F., Harms, J., Zarivach, R., Gluhmann, M., Avila, H., Bashan, A., Bartels, H., Auerbach, T. and Jacobi, C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson, J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore, J.L., Holmstrom, E.D. and Nesbitt, D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills, J.B., Vacano, E. and Hagerman, P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly (dT) and poly (dA). J. Mol. Biol., 285, 245–257.

52. Egbert, R.G. and Klavins, E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.

Nucleic Acids Research 2014 Vol 42 No 4 2659

Page 12: Translation rate is controlled by coupled trade-offs ......Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream

(Figure 2) and the unfolding of RNA structures toincrease accessibility (Figure 3) We demonstrate thatthermodynamic minimization predicts the extent ofthe ribosomersquos selective and partial unfolding of RNAstructures Our data supports a sliding mechanismwhereby downstream RNA structures are not unfoldedbut attenuate translation rate in a height-dependentmanner (Figure 4)

Importantly initially designed sequences have well-defined features that perturb the ribosomersquos individualinteractions with mRNA allowing us to preciselymeasure their effect on the ribosomersquos binding freeenergy penalty The forward design of synthetic sequencesaccording to model predictions quantitatively tests ourknowledge of these interactions and the modelrsquos abilityto predict their strengths Overall the biophysical modelemploys first-principles calculations with four empiricallymeasured constants to accurately predict the translationinitiation rates of 136 mRNAs with diversely structuredlong 50 UTRs (Figure 5)

Based on our findings there is also the potential forlong-range regulation of translation Long structured50 UTRs can feature several upstream standby sitemodules extending over 100 nt away from an openreading frame Any of these standby site modules arepotential binding sites for the ribosome according totheir binding free energy penalties but only a singlestandby site module is ultimately bound Both cis-actingand trans-acting factors can modulate individual standbysite module surface accessibilities to control standby sitemodule selection and the resulting translation rate(31723) Importantly the regulation of standby sitemodule selection will produce discrete step changes intranslation rate that can further regulate downstreamprocesses according to non-Boolean multi-state logic

In this study we introduce a new metric to quantify thesurface area of a standby site module which was validatedusing diverse but canonical RNA secondary structuresThe metric relates the geometric characteristics of thestandby site module (P D H) to its available single-stranded surface area implicitly using coefficients of oneto indicate interchangeability Additional measurementswill be needed to precisely measure these coefficientsand relate them to the helical shape of A-form RNA Inaddition tertiary structures such as G-quadruplexes andpseudoknots will likely have a differently defined metricof surface area Further investigation will be necessary todetermine the effect of tertiary RNA structures on standbysite accessibility ribosome binding and translationinitiation

As suggested by the biophysical model the ribosomalplatformrsquos ability to bind standby site modules with lowsurface area relies on its structural flexibility Previouslypublished cryo-EM data shows that the platform domainin the 30S subunit in both the free and 50S-bound statescan occupy variable conformational states (46) The twohalves of the platform which are stabilized by ribosomalproteins at its surface switch conformations during trans-lation (47) Restraining this flexibility will reduce the ribo-somersquos translation initiation rate for example when theantibiotic Edeine binds 16S ribosomal RNA helices H23

and H24 (48) Consequently the ribosomal platformrsquosstructural flexibility appears to be an essential aspect oftranslating mRNAs with structured 50 UTRsSimilarly the rigidity of single-stranded RNA likely

plays an important role in modulating the ribosomersquos dis-tortion penalty due to the entropic differences of bindingflexible or rigid macromolecules (49) Similar to DNApolyA RNA sequences are less flexible than mixed orpolyU RNA sequences (5051) Differences in RNArigidity have been shown to alter the ribosomersquos bindingaffinity to spacer regions separating the SD and startcodon sequences (52) binding free energy differences are25 kcalmol between polyA and polyAU spacer se-quences and 5 kcalmol between polyA and polyUspacer sequences Incorporating a sequence-dependentmodel for RNA rigidity into the distortion penalty calcu-lations could potentially increase their accuracyAnalysis of the modelrsquos error can lead to further insight

into the ribosomersquos interactions The modelrsquos predictionshave a normally distributed error for 90 of sequences(Supplementary Figure S6) Thirteen outliers have errorsbetween 2 and 4 kcalmol (Supplementary Data) Someoutliers are due to the effects of increased mRNA degrad-ation of 50 UTRs with very long proximal or distal bindingsites (P Dgt 20 nt) (Supplementary Figure S2) In othersextremely short proximal or distal binding sites can resultin non-canonical base pairings that alter the ribosomersquosbinding affinity For example in the absence of aproximal binding site (P=0nt) co-axial stacking willtake place between the nucleotides in the distal bindingsite and the mRNAndashrRNA duplex that anchors theribosome to the mRNA Similarly co-axial stacking willplay a larger role in predicting the effects of small RNAsthat bind to sites nearby RNA structuresAs the list of interactions that control gene expression

continues to grow we will stretch the limits of humanpattern recognition and the use of observational correl-ations Biophysical modeling offers a comprehensiveapproach to account for all known interactions with thegene expression machinery identify gaps design experi-ments precisely measure strengths and make testable pre-dictions The accuracy of thesemodels will go hand-in-handwith our ability to engineer genetic systems without trial-and-error We have incorporated the ribosomersquos inter-actions with structured standby site modules into animproved biophysical model called the RBS Calculatorv20 A software implementation and user-friendly webinterface is available at httpsalispsuedusoftware

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online

ACKNOWLEDGEMENTS

The authors would like to thank Manish Kushwaha andthe Genomics Core Facility at Penn State for technicalassistance Chase Beisel Richard Lease Paul Babitzkeand the reviewers for providing comments on themanuscript

Nucleic Acids Research 2014 Vol 42 No 4 2657

FUNDING

DARPA Young Faculty Award [to HMS] Office ofNaval Research [N00014-13-1-0074 to HMS] NationalScience Foundation Career Award [CBET-1253641 toHMS] start-up funds provided by the Penn StateInstitute for the Energy and Environment Supported bythe NSF Research Experience for Undergraduatesprogram (to ASC) Funding for open access chargeThe Office of Naval Research [N00014-13-1-0074 toHMS]

CONTRIBUTIONS

AEB and HMS designed the study developed themodel analyzed results and wrote the manuscriptAEB and ASC conducted the experiments

Conflict of interest statement HMS is the founder of DeNovo DNA a company that commercializes software todesign genetic systems The presented results areindependent of financial consequences

REFERENCES

1 TsaiA PetrovA MarshallRA KorlachJ UemuraS andPuglisiJD (2012) Heterogeneous pathways and timing of factordeparture during translation initiation Nature 487 390ndash393

2 SimonettiA MarziS JennerL MyasnikovA RombyPYusupovaG KlaholzBP and YusupovM (2009) A structuralview of translation initiation in bacteria Cell Mol Life Sci 66423ndash436

3 JennerL RombyP ReesB Schulze-BrieseC SpringerMEhresmannC EhresmannB MorasD YusupovaG andYusupovM (2005) Translational operator of mRNA on theribosome how repressor proteins exclude ribosome bindingScience 308 120ndash123

4 BeiselCL UpdegroveTB JansonBJ and StorzG (2012)Multiple factors dictate target selection by Hfq-binding smallRNAs EMBO J 31 1961ndash1974

5 StorzG VogelJ and WassarmanKM (2011) Regulation bysmall RNAs in bacteria expanding frontiers Mol Cell 43880ndash891

6 BastetL DubeA MasseE and LafontaineDA (2011) Newinsights into riboswitch regulation mechanisms Mol Microbiol80 1148ndash1154

7 BeiselCL and SmolkeCD (2009) Design principles forriboswitch function PLoS Comput Biol 5 e1000363

8 LeaseRA and BelfortM (2000) A trans-acting RNA as acontrol switch in Escherichia coli DsrA modulates function byforming alternative structures Proc Natl Acad Sci USA 979919ndash9924

9 De SmitMH and Van DuinJ (1994) Control of translation bymRNA secondary structure in Escherichia coli A quantitativeanalysis of literature data J Mol Biol 244 144ndash150

10 De SmitMH and Van DuinJ (1994) Translational initiation onstructured messengers Another role for the Shine-Dalgarnointeraction J Mol Biol 235 173ndash184

11 OstermanIA EvfratovSA SergievPV and DontsovaOA(2013) Comparison of mRNA features affecting translationinitiation and reinitiation Nucleic Acids Res 41 474ndash486

12 BugautA and BalasubramanianS (2012) 50-UTR RNA G-quadruplexes translation regulation and targeting Nucleic AcidsRes 40 4727ndash4741

13 SalisHM (2011) The ribosome binding site calculator MethodsEnzymol 498 19ndash42

14 SalisHM MirskyEA and VoigtCA (2009) Automated designof synthetic ribosome binding sites to control protein expressionNat Biotechnol 27 946ndash950

15 CarothersJM GolerJA JuminagaD and KeaslingJD (2011)Model-driven engineering of RNA devices to quantitativelyprogram gene expression Science 334 1716ndash1719

16 SimonettiA MarziS MyasnikovAG FabbrettiAYusupovM GualerziCO and KlaholzBP (2008) Structure ofthe 30S translation initiation complex Nature 455 416ndash420

17 MarziS MyasnikovAG SerganovA EhresmannCRombyP YusupovM and KlaholzBP (2007) StructuredmRNAs regulate translation initiation by binding to the platformof the ribosome Cell 130 1019ndash1031

18 YusupovaGZ YusupovMM CateJ and NollerHF (2001)The path of messenger RNA through the ribosome Cell 106233ndash241

19 De SmitMH and Van DuinJ (2003) Translational standbysites how ribosomes may deal with the rapid folding kinetics ofmRNA J Mol Biol 331 737ndash743

20 QuX LancasterL NollerHF BustamanteC and TinocoI(2012) Ribosomal protein S1 unwinds double-stranded RNA inmultiple steps Proc Natl Acad Sci USA 109 14458ndash14463

21 MilonP MaracciC FilonavaL GualerziCO andRodninaMV (2012) Real-time assembly landscape of bacterial30S translation initiation complex Nat Struct Mol Biol 19609ndash615

22 StuderSM and JosephS (2006) Unfolding of mRNA secondarystructure by the bacterial translation initiation complex MolCell 22 105ndash115

23 DarfeuilleF UnosonC VogelJ and WagnerEGH (2007) Anantisense RNA inhibits translation by competing with standbyribosomes Mol Cell 26 381ndash392

24 MutalikVK GuimaraesJC CambrayG MaiQ-AChristoffersenMJ MartinL YuA LamC RodriguezCBennettG et al (2013) Quantitative estimation of activity andquality for collections of functional genetic elements NatMethods 10 347ndash353

25 CardinaleS and ArkinAP (2012) Contextualizing context forsynthetic biologyndashidentifying causes of failure of syntheticbiological systems Biotechnol J 7 856ndash866

26 LouC StantonB ChenY-J MunskyB and VoigtCA (2012)Ribozyme-based insulator parts buffer synthetic circuits fromgenetic context Nat Biotechnol 30 1137ndash1142

27 KosuriS GoodmanDB CambrayG MutalikVK GaoYArkinAP EndyD and ChurchGM (2013) Composability ofregulatory sequences controlling transcription and translation inEscherichia coli Proc Natl Acad Sci USA 110 14024ndash14029

28 TakahashiS FurusawaH UedaT and OkahataY (2013)Translation enhancer improves the ribosome liberation fromtranslation initiation J Am Chem Soc 135 13096ndash13106

29 HaoY ZhangZJ EricksonDW HuangM HuangY LiJHwaT and ShiH (2011) Quantifying the sequence-functionrelation in gene silencing by bacterial small RNAs Proc NatlAcad Sci USA 108 12473ndash12478

30 MathewsDH SabinaJ ZukerM and TurnerDH (1999)Expanded sequence dependence of thermodynamic parametersimproves prediction of RNA secondary structure J Mol Biol288 911ndash940

31 XiaT SantaLuciaJ BurkardME KierzekR SchroederSJJiaoX CoxC and TurnerDH (1998) Thermodynamicparameters for an expanded nearest-neighbor model for formationof RNA duplexes with Watson-Crick base pairs Biochemistry 3714719ndash14735

32 GruberAR LorenzR BernhartSH NeubockR andHofackerIL (2008) The Vienna RNA websuite Nucleic AcidsRes 36 W70ndashW74

33 ChenH BjerknesM KumarR and JayE (1994) Determinationof the optimal aligned spacing between the ShinendashDalgarnosequence and the translation initiation codon of Escherichia colimRNAs Nucleic Acids Res 22 4953ndash4957

34 PertzevAV and NicholsonAW (2006) Characterization ofRNA sequence determinants and antideterminants of processingreactivity for a minimal substrate of Escherichia coli ribonucleaseIII Nucleic Acids Res 34 3708ndash3721

35 KeselerIM Collado-VidesJ Santos-ZavaletaA Peralta-GilMGama-CastroS Muniz-RascadoL Bonavides-MartinezCPaleyS KrummenackerM AltmanT et al (2011) EcoCyc a

2658 Nucleic Acids Research 2014 Vol 42 No 4

comprehensive database of Escherichia coli biology Nucleic AcidsRes 39 D583ndashD590

36 MutalikVK QiL GuimaraesJC LucksJB and ArkinAP(2012) Rationally designed families of orthogonal RNA regulatorsof translation Nat Chem Biol 8 447ndash454

37 RodrigoG LandrainTE and JaramilloA (2012) De novoautomated design of small RNA circuits for engineering syntheticriboregulation in living cells Proc Natl Acad Sci USA 10915271ndash15276

38 CalluraJM CantorCR and CollinsJJ (2012) Geneticswitchboard for synthetic biology applications Proc Natl AcadSci USA 109 5850ndash5855

39 SinhaJ ReyesSJ and GallivanJP (2010) Reprogrammingbacteria to seek and destroy an herbicide Nat Chem Biol 6464ndash470

40 AnthonyPC PerezCF Garcıa-GarcıaC and BlockSM(2012) Folding energy landscape of the thiamine pyrophosphateriboswitch aptamer Proc Natl Acad Sci USA 109 1485ndash1489

41 PerdrizetGA II ArtsimovitchI FurmanR SosnickTR andPanT (2012) Transcriptional pausing coordinates folding of theaptamer domain and the expression platform of a riboswitchProc Natl Acad Sci USA 109 3323ndash3328

42 LynchSA DesaiSK SajjaHK and GallivanJP (2007) Ahigh-throughput screen for synthetic riboswitches revealsmechanistic insights into their function Chem Biol 14 173ndash184

43 CaronM-P BastetL LussierA Simoneau-RoyM MasseEand LafontaineDA (2012) Dual-acting riboswitch control oftranslation initiation and mRNA decay Proc Natl Acad SciUSA 109 E3444ndashE3453

44 UpdegroveT WilfN SunX and WartellRM (2008) Effect ofHfq on RprArpoS mRNA pairing HfqRNA binding and the

influence of the 50 rpoS mRNA leader region Biochemistry 4711184ndash11195

45 SacerdotC CailletJ GraffeM EyermannF EhresmannBEhresmannC SpringerM and RombyP (1998) The Escherichiacoli threonyl-tRNA synthetase gene contains a split ribosomalbinding site interrupted by a hairpin structure that is essential forautoregulation Mol Microbiol 29 1077ndash1090

46 GabashviliIS AgrawalRK GrassucciR and FrankJ (1999)Structure and structural variations of the Escherichia coli 30 Sribosomal subunit as revealed by three-dimensional cryo-electronmicroscopy J Mol Biol 286 1285ndash1291

47 ClemonsWM MayJL WimberlyBT McCutcheonJPCapelMS and RamakrishnanV (1999) Structure of a bacterial30S ribosomal subunit at 55 A resolution Nature 400 833ndash840

48 PiolettiM SchlunzenF HarmsJ ZarivachR GluhmannMAvilaH BashanA BartelsH AuerbachT and JacobiC(2001) Crystal structures of complexes of the small ribosomalsubunit with tetracycline edeine and IF3 EMBO J 201829ndash1839

49 WilliamsonJR (2000) Induced fit in RNAndashprotein recognitionNat Struct Mol Biol 7 834ndash837

50 FioreJL HolmstromED and NesbittDJ (2012) Entropicorigin of Mg2+-facilitated RNA folding Proc Natl Acad SciUSA 109 2902ndash2907

51 MillsJB VacanoE and HagermanPJ (1999) Flexibility ofsingle-stranded DNA use of gapped duplex helices to determinethe persistence lengths of poly (dT) and poly (dA) J Mol Biol285 245ndash257

52 EgbertRG and KlavinsE (2012) Fine-tuning gene networksusing simple sequence repeats Proc Natl Acad Sci USA 10916817ndash16822

Nucleic Acids Research 2014 Vol 42 No 4 2659

Page 13: Translation rate is controlled by coupled trade-offs ......Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream

FUNDING

DARPA Young Faculty Award [to HMS] Office ofNaval Research [N00014-13-1-0074 to HMS] NationalScience Foundation Career Award [CBET-1253641 toHMS] start-up funds provided by the Penn StateInstitute for the Energy and Environment Supported bythe NSF Research Experience for Undergraduatesprogram (to ASC) Funding for open access chargeThe Office of Naval Research [N00014-13-1-0074 toHMS]

CONTRIBUTIONS

AEB and HMS designed the study developed themodel analyzed results and wrote the manuscriptAEB and ASC conducted the experiments

Conflict of interest statement HMS is the founder of DeNovo DNA a company that commercializes software todesign genetic systems The presented results areindependent of financial consequences

REFERENCES

1. Tsai,A., Petrov,A., Marshall,R.A., Korlach,J., Uemura,S. and Puglisi,J.D. (2012) Heterogeneous pathways and timing of factor departure during translation initiation. Nature, 487, 390–393.

2. Simonetti,A., Marzi,S., Jenner,L., Myasnikov,A., Romby,P., Yusupova,G., Klaholz,B.P. and Yusupov,M. (2009) A structural view of translation initiation in bacteria. Cell. Mol. Life Sci., 66, 423–436.

3. Jenner,L., Romby,P., Rees,B., Schulze-Briese,C., Springer,M., Ehresmann,C., Ehresmann,B., Moras,D., Yusupova,G. and Yusupov,M. (2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120–123.

4. Beisel,C.L., Updegrove,T.B., Janson,B.J. and Storz,G. (2012) Multiple factors dictate target selection by Hfq-binding small RNAs. EMBO J., 31, 1961–1974.

5. Storz,G., Vogel,J. and Wassarman,K.M. (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell, 43, 880–891.

6. Bastet,L., Dubé,A., Massé,E. and Lafontaine,D.A. (2011) New insights into riboswitch regulation mechanisms. Mol. Microbiol., 80, 1148–1154.

7. Beisel,C.L. and Smolke,C.D. (2009) Design principles for riboswitch function. PLoS Comput. Biol., 5, e1000363.

8. Lease,R.A. and Belfort,M. (2000) A trans-acting RNA as a control switch in Escherichia coli: DsrA modulates function by forming alternative structures. Proc. Natl Acad. Sci. USA, 97, 9919–9924.

9. De Smit,M.H. and Van Duin,J. (1994) Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol., 244, 144–150.

10. De Smit,M.H. and Van Duin,J. (1994) Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J. Mol. Biol., 235, 173–184.

11. Osterman,I.A., Evfratov,S.A., Sergiev,P.V. and Dontsova,O.A. (2013) Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res., 41, 474–486.

12. Bugaut,A. and Balasubramanian,S. (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res., 40, 4727–4741.

13. Salis,H.M. (2011) The ribosome binding site calculator. Methods Enzymol., 498, 19–42.

14. Salis,H.M., Mirsky,E.A. and Voigt,C.A. (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol., 27, 946–950.

15. Carothers,J.M., Goler,J.A., Juminaga,D. and Keasling,J.D. (2011) Model-driven engineering of RNA devices to quantitatively program gene expression. Science, 334, 1716–1719.

16. Simonetti,A., Marzi,S., Myasnikov,A.G., Fabbretti,A., Yusupov,M., Gualerzi,C.O. and Klaholz,B.P. (2008) Structure of the 30S translation initiation complex. Nature, 455, 416–420.

17. Marzi,S., Myasnikov,A.G., Serganov,A., Ehresmann,C., Romby,P., Yusupov,M. and Klaholz,B.P. (2007) Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell, 130, 1019–1031.

18. Yusupova,G.Z., Yusupov,M.M., Cate,J. and Noller,H.F. (2001) The path of messenger RNA through the ribosome. Cell, 106, 233–241.

19. De Smit,M.H. and Van Duin,J. (2003) Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. J. Mol. Biol., 331, 737–743.

20. Qu,X., Lancaster,L., Noller,H.F., Bustamante,C. and Tinoco,I. (2012) Ribosomal protein S1 unwinds double-stranded RNA in multiple steps. Proc. Natl Acad. Sci. USA, 109, 14458–14463.

21. Milon,P., Maracci,C., Filonava,L., Gualerzi,C.O. and Rodnina,M.V. (2012) Real-time assembly landscape of bacterial 30S translation initiation complex. Nat. Struct. Mol. Biol., 19, 609–615.

22. Studer,S.M. and Joseph,S. (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell, 22, 105–115.

23. Darfeuille,F., Unoson,C., Vogel,J. and Wagner,E.G.H. (2007) An antisense RNA inhibits translation by competing with standby ribosomes. Mol. Cell, 26, 381–392.

24. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Mai,Q.-A., Christoffersen,M.J., Martin,L., Yu,A., Lam,C., Rodriguez,C., Bennett,G. et al. (2013) Quantitative estimation of activity and quality for collections of functional genetic elements. Nat. Methods, 10, 347–353.

25. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.

26. Lou,C., Stanton,B., Chen,Y.-J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137–1142.

27. Kosuri,S., Goodman,D.B., Cambray,G., Mutalik,V.K., Gao,Y., Arkin,A.P., Endy,D. and Church,G.M. (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc. Natl Acad. Sci. USA, 110, 14024–14029.

28. Takahashi,S., Furusawa,H., Ueda,T. and Okahata,Y. (2013) Translation enhancer improves the ribosome liberation from translation initiation. J. Am. Chem. Soc., 135, 13096–13106.

29. Hao,Y., Zhang,Z.J., Erickson,D.W., Huang,M., Huang,Y., Li,J., Hwa,T. and Shi,H. (2011) Quantifying the sequence-function relation in gene silencing by bacterial small RNAs. Proc. Natl Acad. Sci. USA, 108, 12473–12478.

30. Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

31. Xia,T., SantaLucia,J., Burkard,M.E., Kierzek,R., Schroeder,S.J., Jiao,X., Cox,C. and Turner,D.H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37, 14719–14735.

32. Gruber,A.R., Lorenz,R., Bernhart,S.H., Neuböck,R. and Hofacker,I.L. (2008) The Vienna RNA websuite. Nucleic Acids Res., 36, W70–W74.

33. Chen,H., Bjerknes,M., Kumar,R. and Jay,E. (1994) Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res., 22, 4953–4957.

34. Pertzev,A.V. and Nicholson,A.W. (2006) Characterization of RNA sequence determinants and antideterminants of processing reactivity for a minimal substrate of Escherichia coli ribonuclease III. Nucleic Acids Res., 34, 3708–3721.

35. Keseler,I.M., Collado-Vides,J., Santos-Zavaleta,A., Peralta-Gil,M., Gama-Castro,S., Muñiz-Rascado,L., Bonavides-Martínez,C., Paley,S., Krummenacker,M., Altman,T. et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res., 39, D583–D590.

36. Mutalik,V.K., Qi,L., Guimaraes,J.C., Lucks,J.B. and Arkin,A.P. (2012) Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol., 8, 447–454.

37. Rodrigo,G., Landrain,T.E. and Jaramillo,A. (2012) De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc. Natl Acad. Sci. USA, 109, 15271–15276.

38. Callura,J.M., Cantor,C.R. and Collins,J.J. (2012) Genetic switchboard for synthetic biology applications. Proc. Natl Acad. Sci. USA, 109, 5850–5855.

39. Sinha,J., Reyes,S.J. and Gallivan,J.P. (2010) Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol., 6, 464–470.

40. Anthony,P.C., Perez,C.F., García-García,C. and Block,S.M. (2012) Folding energy landscape of the thiamine pyrophosphate riboswitch aptamer. Proc. Natl Acad. Sci. USA, 109, 1485–1489.

41. Perdrizet,G.A. II, Artsimovitch,I., Furman,R., Sosnick,T.R. and Pan,T. (2012) Transcriptional pausing coordinates folding of the aptamer domain and the expression platform of a riboswitch. Proc. Natl Acad. Sci. USA, 109, 3323–3328.

42. Lynch,S.A., Desai,S.K., Sajja,H.K. and Gallivan,J.P. (2007) A high-throughput screen for synthetic riboswitches reveals mechanistic insights into their function. Chem. Biol., 14, 173–184.

43. Caron,M.-P., Bastet,L., Lussier,A., Simoneau-Roy,M., Massé,E. and Lafontaine,D.A. (2012) Dual-acting riboswitch control of translation initiation and mRNA decay. Proc. Natl Acad. Sci. USA, 109, E3444–E3453.

44. Updegrove,T., Wilf,N., Sun,X. and Wartell,R.M. (2008) Effect of Hfq on RprA–rpoS mRNA pairing: Hfq–RNA binding and the influence of the 5′ rpoS mRNA leader region. Biochemistry, 47, 11184–11195.

45. Sacerdot,C., Caillet,J., Graffe,M., Eyermann,F., Ehresmann,B., Ehresmann,C., Springer,M. and Romby,P. (1998) The Escherichia coli threonyl-tRNA synthetase gene contains a split ribosomal binding site interrupted by a hairpin structure that is essential for autoregulation. Mol. Microbiol., 29, 1077–1090.

46. Gabashvili,I.S., Agrawal,R.K., Grassucci,R. and Frank,J. (1999) Structure and structural variations of the Escherichia coli 30 S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol., 286, 1285–1291.

47. Clemons,W.M., May,J.L., Wimberly,B.T., McCutcheon,J.P., Capel,M.S. and Ramakrishnan,V. (1999) Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature, 400, 833–840.

48. Pioletti,M., Schlünzen,F., Harms,J., Zarivach,R., Glühmann,M., Avila,H., Bashan,A., Bartels,H., Auerbach,T. and Jacobi,C. (2001) Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J., 20, 1829–1839.

49. Williamson,J.R. (2000) Induced fit in RNA–protein recognition. Nat. Struct. Mol. Biol., 7, 834–837.

50. Fiore,J.L., Holmstrom,E.D. and Nesbitt,D.J. (2012) Entropic origin of Mg2+-facilitated RNA folding. Proc. Natl Acad. Sci. USA, 109, 2902–2907.

51. Mills,J.B., Vacano,E. and Hagerman,P.J. (1999) Flexibility of single-stranded DNA: use of gapped duplex helices to determine the persistence lengths of poly (dT) and poly (dA). J. Mol. Biol., 285, 245–257.

52. Egbert,R.G. and Klavins,E. (2012) Fine-tuning gene networks using simple sequence repeats. Proc. Natl Acad. Sci. USA, 109, 16817–16822.

Nucleic Acids Research 2014 Vol 42 No 4 2659
