

3 Fitness Landscapes, Error Thresholds, and Cofactors in Aptamer Evolution

Ádám Kun, Marie-Christine Maurel, Mauro Santos, and Eörs Szathmáry

The Aptamer Handbook. Edited by S. Klussmann. Copyright © 2006 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. ISBN 3-527-31059-2

3.1 Introduction

The idea that RNA was genetic as well as enzymatic material goes back to earlier speculations concerning the possible role of RNA in the origin of life (Woese, 1967; Crick, 1968; Orgel, 1968). The idea was clearly expressed, and the molecular nature of RNA – with genetically specified positioning in three dimensions of different chemical building blocks – should have convinced everybody of the potential enzymatic capacity of RNA almost entirely on first principles. Nevertheless, in line with the deeply non-theoretical nature of molecular biology, the idea was accepted only after the discoveries of RNA self-splicing and the enzymatic activity of RNase P RNA (Kole and Altman, 1981; Zaug and Cech, 1982). The empirical finding that RNA can be a catalyst as well as an information carrier made it strongly plausible that the first genetic systems could indeed have consisted of RNA alone (Pace and Marsh, 1985; Sharp, 1985; Orgel, 1986).

White (1976) had earlier argued that the so-called nucleotide coenzymes (like NAD, NADP, FAD, FMN, etc.) are fossils of an earlier metabolic stage when RNAs acted as enzymes. It is interesting to note that whereas in White’s paper the emphasis was on metabolism, in Gilbert’s (1986) manifesto for the “RNA world” interconversions of RNA molecules are the focus. A much less well-known – but very rigorous – attempt was that of Gánti (1979), who put early enzymatic RNAs into his minimal cell model (see Fernando et al., 2005, for a review on models of minimal cells): the so-called chemoton (see Gánti, 2003a,b). In that model it is assumed that replicative ribozymes catalyze steps of an autocatalytic metabolic cycle, surrounded by a growing membrane. This approach was complemented by a thorough confirmation of White’s insights on coenzymes as remnants of early ribozymes (Korányi and Gánti, 1981). In Gánti’s theoretical world the RNA world was in full bloom by the advent of the experimental demonstration of ribozyme activity in natural systems.

Analysis of a “bag of genes” enclosed in compartments led to the stochastic corrector model (Szathmáry and Demeter, 1987), demonstrating that selection at the compartment (protocell) level is strong enough to oppose (correct) the adverse consequences of within-cell competition among replicative, unlinked genes. The difficulty of internal competition within an early genomic set was first pointed out by Eigen (1971). The stochastic corrector not only solves the competition problem, but is also a theoretical construct to account for selection for ribozyme functions in reproducing compartments (Fig. 3.1), as explicitly stated in the paper.

Fig. 3.1 The stochastic corrector model (Szathmáry and Demeter, 1987). Different templates (open and closed circles) contribute to the well-being of the compartments (protocells) in that they catalyze steps of metabolism, for example. During protocell growth templates replicate at differential expected rates, but stochastically. Upon division (→) there is chance assortment of templates into offspring compartments. Stochastic replication and reassortment generate variation among protocells, on which natural selection at the compartment level can act and oppose (correct) internal deterioration due to within-cell competition. Such compartmentation selects for efficient ribozyme variants.


However, a burning open question, also recognized by Eigen (1971), still remains; namely, the problem of the error threshold of replication (that is, a sharply defined threshold beyond which heredity breaks down and evolutionary adaptation becomes impossible). This concept is the flip side of the coin on which we see “mutational load” on one side. Ever since the pioneering work of Haldane (1937) it has been clear to population geneticists that too high a mutational load (the decrease in average fitness of a population due to recurrent deleterious mutations) could kill a population, but Eigen looked at this problem from another angle. If you fix the mutation rate, how long can a replicator grow before it can no longer be maintained by natural selection, despite it being a fast replicator? Early replicators must have been error prone; therefore, they could not have been very long (the size of a tRNA is usually assumed). Does the stochastic corrector model push up the error threshold so that genomes composed of several to many different genes can be maintained by selection? The answer is encouraging, but not sufficient (Zintzaras et al., 2002): the origin of a sizeable genome is still a problem (see also Santos et al., 2004).

We outline here a possible resolution to this conundrum. In the first part of this chapter we analyze existing ribozymes to obtain a “function landscape,” which assigns an activity value (ideally) to each mutant sequence. Then we use this function landscape as a proxy for the fitness landscape; the crucial assumption being that ribozyme activity affects protocell fitness, hence protocell fitness translates back to some average ribozyme fitness. In conclusion, we argue that the position of the error threshold was previously estimated to be too severe: neutral and compensatory mutations crucially modify the picture. This result strengthens the possibility of an RNA world in protocells (Szathmáry, 1990a).

In the late 1980s one of us was interested in putting the important theoretical considerations of an RNA world to experimental test. Contemporary natural ribozymes almost exclusively conform to Gilbert’s vision of an RNA world, rather than lending support to a metabolically complex RNA era, as envisaged by Benner et al. (1987, 1989). It was clear that a proof of the principle of the general enzymatic capability of RNAs should come from novel experiments. It was suggested that a protocol similar to the production of catalytic antibodies should be followed (Szathmáry, 1989, 1990b). The suggested method envisaged selection of RNAs by binding to transition state analogs of a given reaction, linked to an affinity chromatography column. Amplification would have happened at the DNA level after reverse transcription of the best-binding RNA molecules. Szathmáry (1989) also pointed out that the same protocol could possibly be used to obtain RNA “aptamers” (as we call them now) that could specifically bind amino acids, thus enabling the community to test certain ideas (primarily the stereochemical one) of the origin of the genetic code.

Following this line, in the second half of the chapter we summarize experimental work on co-ribozymes (cofactor-assisted ribozymes) and aptazymes (aptamers attached to ribozymes). Finally, we discuss ideas around how aptamers and nucleotides could have been relevant for the origin of the genetic code and translation.
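The length limit that the error threshold imposes can be made concrete with Eigen's classic approximation for a single-peaked landscape, N_max ≈ ln(σ) / (1 − q). The symbols are not used in this chapter and are introduced here only for illustration: σ is the selective superiority of the master sequence over the mutant cloud and q is the per-nucleotide copying fidelity. The function name and the numerical values below are hypothetical; this is a minimal sketch of the approximation, not the model analyzed later in the chapter.

```python
import math


def max_maintainable_length(superiority: float, fidelity: float) -> float:
    """Approximate error-threshold length, ln(superiority) / (1 - fidelity).

    superiority: selective advantage of the master sequence (sigma, assumed > 1).
    fidelity:    per-nucleotide copying fidelity q, strictly between 0 and 1.
    """
    if not 0.0 < fidelity < 1.0:
        raise ValueError("fidelity must lie strictly between 0 and 1")
    if superiority <= 1.0:
        # Without a selective advantage no sequence can be maintained.
        return 0.0
    return math.log(superiority) / (1.0 - fidelity)


if __name__ == "__main__":
    # Illustrative values only: a ten-fold advantage at several fidelities.
    for q in (0.9, 0.99, 0.999):
        print(q, round(max_maintainable_length(10.0, q)))
```

Because the superiority enters only logarithmically, the copying fidelity dominates the result, which is why error-prone early replicators are usually assumed to have been short.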


3.2 Functionality Landscapes Inferred from Examples

3.2.1 Fitness Landscape

Fitness landscapes or adaptive landscapes are often used in evolutionary biology to envisage the relationship between genotypes – or phenotypes – and Darwinian success (fitness). The idea of studying evolution by visualizing the distribution of fitness values as a kind of landscape was introduced by Wright (1932). Maynard Smith (1970) was the first to coin the term “protein space” for a high-dimensional space in which each sequence of length N amino acids (out of $20^N$ possible sequences) represents one point and is adjacent to the 19N points that represent its one-mutant neighbors. In order to produce an adaptive landscape in sequence space a fitness value has to be assigned to each sequence, and an evolving population of proteins typically climbs uphill in the fitness landscape. This concept has since been used by a number of authors (for example Eigen, 1985; Schuster, 1986, 1987).

The simplest theoretical fitness landscape is the single-peaked fitness landscape used in Eigen’s (1971) study of the error threshold in a replicating population of RNA sequences. In this landscape one sequence (the “master sequence”) has the highest fitness value, and all other sequences have the same, lower fitness. This biologically rather unrealistic fitness landscape still attracts considerable theoretical interest, mainly because it can be tackled analytically (Drossel, 2001). Landscapes borrowed from evolutionary optimization methods such as genetic algorithms (see references in Flamm et al., 1999), and from the concept of a potential or energy function in physics (for example spin-glasses; Bonhoeffer and Stadler, 1993), have also been applied to the study of biological evolution. However, the structure and characteristics of these landscapes are quite unlikely to match the fitness landscapes of biological systems.

The N-K model of Kauffman (Kauffman, 1993) describes a landscape where K out of N elements are involved in some epistatic interaction. The model produces a rugged fitness landscape which was believed to resemble molecular fitness landscapes on the basis of its ruggedness. Mutational additivity usually holds for positions in biological sequences that do not interact, and such mutational additivity has been demonstrated for several proteins (Tekada et al., 1989; Sarai and Tekada, 1989; Sandberg and Terwilliger, 1993; Serrano et al., 1993; Zhang et al., 1995; Skinner and Terwilliger, 1996; Nikolova et al., 1998; Aita et al., 2001, 2002). In these cases, the most realistic of the N-K landscapes is the Mount Fuji type fitness landscape (also known as a multiplicative fitness landscape), where each element in a sequence individually and independently contributes to the fitness and there is a single fittest sequence.
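The two simplest landscapes described above can be sketched in a few lines for RNA, where the alphabet has four letters, so a sequence of length N has 3N one-mutant neighbors. The function names, the peak and background fitness values, and the per-mismatch penalty are all hypothetical choices made for illustration; none of them is taken from the cited studies.

```python
ALPHABET = "ACGU"


def one_mutant_neighbors(seq: str) -> list[str]:
    """All sequences that differ from seq at exactly one position."""
    return [
        seq[:i] + base + seq[i + 1:]
        for i in range(len(seq))
        for base in ALPHABET
        if base != seq[i]
    ]


def single_peak_fitness(seq: str, master: str,
                        peak: float = 10.0, background: float = 1.0) -> float:
    """Single-peaked landscape: only the master sequence has elevated fitness."""
    return peak if seq == master else background


def mount_fuji_fitness(seq: str, master: str, penalty: float = 0.5) -> float:
    """Multiplicative landscape: every mismatch scales fitness by the same factor."""
    mismatches = sum(a != b for a, b in zip(seq, master))
    return penalty ** mismatches


if __name__ == "__main__":
    master = "GGCAUC"
    neighbors = one_mutant_neighbors(master)
    print(len(neighbors))                      # 3 * len(master) neighbors
    print(single_peak_fitness(neighbors[0], master))
    print(mount_fuji_fitness(neighbors[0], master))
```

On the single-peaked landscape every neighbor is equally unfit, whereas on the Mount Fuji landscape fitness declines smoothly with distance from the master sequence.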


3.2.2 Damage Selection Experiments with Ribozymes

A wealth of information has been accumulated on ribozymes since their discovery nearly 20 years ago. Most of the studies can be fitted into one of three main lines of research: (1) characterization of known ribozymes (that is, inferring the structure and mechanisms of catalysis; Lilley, 1999); (2) modification of natural ribozymes to be used in therapeutics (Sullenger and Gilboa, 2002); and (3) in vitro evolution of novel ribozymes (Joyce, 1998, 2002; Landweber et al., 1998; Spirin, 2002). The characterization of ribozymes frequently involved mutagenesis experiments, where the enzymatic activity of certain mutants was measured in order to get insight into either the structure of the molecule or the mechanism of catalysis. While not directed toward the study of fitness landscapes, these experiments certainly contain a wealth of empirical information necessary for assembling the realistic fitness landscape of the studied ribozyme. Although all naturally occurring ribozymes are being studied extensively, there are only a few instances where the realistic fitness landscape can be conveniently investigated. Group I and group II introns, as well as RNase P, have to be excluded because of their rather large size. Furthermore, any sensible investigation of the fitness landscapes of ribozymes has to employ an RNA folding algorithm. Therefore, ribozymes with a pseudo-knot in their structure also have to be excluded because most conventional folding algorithms cannot satisfactorily cope with pseudo-knots. This requirement rules out the hepatitis delta virus ribozyme, which contains such structural elements (Perrotta and Been, 1991).

On the other hand, the hammerhead, hairpin, and Neurospora VS self-cleaving ribozymes can be separated into a substrate and a trans-cleaving ribozyme. For these three ribozymes the trans-cleaving enzyme does not contain a pseudo-knot structure. The hammerhead can be separated into a 13-mer enzyme and a 41-mer oligonucleotide substrate (Jeffries and Symons, 1989). The hairpin can be separated into a 50-mer enzyme and a 15-mer substrate (Fedor, 2000). The trans-acting ribozyme is 144 nucleotides long for the Neurospora VS ribozyme (Fig. 3.2), and the substrate is 20 nucleotides long (Guo and Collins, 1995). We restrict our further analysis to the trans-acting ribozyme, and assume that the substrate is the same as the natural one. Unfortunately for our study, many of the mutagenesis experiments have been directed towards the substrate and substrate-binding regions (Joseph et al., 1993; Joseph and Burke, 1993; Nishikawa et al., 1997; Ananvoranich and Perreault, 1998; Pérez-Ruiz et al., 1999) in order to produce new RNA- or DNA-cleaving ribozymes to be used in therapeutics (Yu et al., 1998; Andäng et al., 1999; Andäng et al., 2004; Zhang et al., 2004).

Based mainly on experiments with the VS and the hairpin ribozymes the following general conclusions can be derived:

• Structure is important. From experimental data on the VS ribozyme Lilley and co-workers (Lafontaine et al., 2002c) state that “the secondary structure of the ribozyme is important, but the nature of most individual base pairs is not. Many can be reversed or replaced by a different pair without major loss of activity, so long as a base pair is retained at a given position.” Similarly, in the hairpin ribozyme all base pairs can be altered (except base pair G11:C/U-2) as long as the base pairing is maintained (Fedor, 2000).

• There are critical regions in the molecule. For the single-stranded regions the structure has to be maintained, but at many such positions the nature of the base located there is also important. For example, most of the bases in the four loops of the hairpin ribozyme are essential, and any change in those positions severely reduces activity (Siwkowski et al., 1997; Shippy et al., 1998). For the VS ribozyme 16 such critical sites were identified (Lafontaine et al., 2002a): these sites are located around the active site, the substrate-binding region, and in the two- or three-way junctions.

• Structure can be varied slightly. The structure of the naturally occurring ribozymes can be slightly varied, as there are regions that are not crucial to function. For example, the stem–loop IV of the VS ribozyme is virtually completely dispensable, but the junction 3–4–5 must be formed (although the ribozyme still has detectable activity after the complete removal of stem–loops IV and V; Sood and Collins, 2002). Similarly, in the hairpin ribozyme the helices H1 and H4 can be shortened and greatly extended without any loss of activity (Fedor, 2000; Sargueil et al., 1995).

Fig. 3.2 Sequence and secondary structure of the enzyme part of the Neurospora VS ribozyme (numbering according to Beattie et al., 1995). Roman numerals indicate the regions of the ribozyme. Capitalized nucleotides indicate positions for which mutagenesis studies are available. Bold nucleotides indicate critical sites.

While the previous general conclusions can be easily incorporated into a model of a fitness landscape, one general difficulty still remains; namely, the combined effect of multiple mutations. Most mutagenesis experiments have investigated only single mutations (or mutations involving a base pair) in the vicinity of the wild type in sequence space and rarely report the activity of double or higher order mutants. In those few instances where the effects of multiple mutations were evaluated, the activities of the single-point mutants were not always included. The only remarkable exception is the study of Lehman and Joyce (1993), starting from an initial pool of the Tetrahymena ribozyme, where they found that in general the mutational effects were multiplicative (which implies mutational additivity for ribozyme activity).

Table 3.1 summarizes the available experimental information on multiple mutational effects for some of the known nucleolytic ribozymes. The plot of the measured enzymatic activities of the double mutants against the activities estimated from the single mutants (Fig. 3.3) clearly suggests that mutational effects are nearly multiplicative, with a slight positive synergy. Such positive synergy was also found for chemical modifications of the hairpin ribozyme (Klosermeier and Millar, 2002). Accordingly, the fitness of a molecule containing n mutations ($w_n$) could be estimated as:

$$
w_n^{\mathrm{multiplicative}} = \prod_{i=1}^{n} w_i, \qquad w_n = w_n^{\mathrm{multiplicative}} \rightarrow \left(a x^{b}\right) \tag{3.1}
$$

where $w_i$ is the fitness of a single-error variant, $w_n^{\mathrm{multiplicative}}$ is the fitness of an n-error variant assuming multiplicative effects, and a and b are parameters to be fitted given the data (Fig. 3.3). We stress, however, that although the data set contains information from three different ribozymes the number of points is still quite small. Therefore, some care should be taken when translating the available empirical information into a fitness function.

Fig. 3.3 Plot of the measured enzymatic activities of the double mutants against the activities estimated from the single mutants for the Neurospora VS ribozyme.

Besides these synergistic effects there are also examples of mutations that “rescue” enzymatic activity to some extent. Mutations that result in the loss of catalytic activity also exist (Table 3.2). Mutants containing these and other point mutations might or might not have detectable activity. To our knowledge these interactions are impossible to predict, thus they can only be incorporated into a definition of a fitness landscape if known from experiments. In conclusion, the easiest way to deal with multiple mutations is to assume mutational independence (multiplicative effects), although it slightly overestimates the decrease in fitness due to multiple mutations. A more realistic assumption can come from taking the synergy into account, although more data would be highly welcome. If rescue mutations or other such effects are known for the ribozyme, then they can also be incorporated to increase the realism of the fitness landscape.
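The estimate in Eq. 3.1 can be illustrated with a short sketch. It assumes one particular reading of the equation: the multiplicative estimate x is rescaled by a power law a·x^b, so that a = b = 1 recovers the purely multiplicative case. The function names and the default parameter values are hypothetical; only the two single-mutant activities in the usage example (U752C and U753C of the VS ribozyme) come from Table 3.1.

```python
from math import prod
from typing import Iterable


def multiplicative_fitness(single_effects: Iterable[float]) -> float:
    """Product of the single-mutant fitness values w_i (mutational independence)."""
    return prod(single_effects, start=1.0)


def synergy_corrected_fitness(single_effects: Iterable[float],
                              a: float = 1.0, b: float = 1.0) -> float:
    """Power-law rescaling a * x**b of the multiplicative estimate x.

    This is an assumed reading of Eq. 3.1; a and b would have to be fitted
    to data, and the defaults simply reproduce the multiplicative estimate.
    """
    x = multiplicative_fitness(single_effects)
    return a * x ** b


if __name__ == "__main__":
    # U752C (0.80) and U753C (0.42): Table 3.1 lists 0.336 as the estimate
    # and 0.52 as the measured activity of the double mutant.
    print(multiplicative_fitness([0.80, 0.42]))
    # Hypothetical exponent below 1, which raises the estimate for x < 1.
    print(synergy_corrected_fitness([0.80, 0.42], a=1.0, b=0.8))
```

With an exponent below 1 the corrected value lies above the multiplicative estimate whenever x < 1, which is the direction of the slight positive synergy described in the text.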


Table 3.1 Available experimental data for the evaluation of multiple mutant effects on fitness

| Mutant 1 | Activity | Mutant 2 | Activity | Activity of the double mutant | Estimated activity of the double mutant | Reference |
|---|---|---|---|---|---|---|
| U39C^a | 0.289 | A11G | 0.32 | 0.093 | 0.095 | Joseph et al., 1993 |
| G21U^a | 0.01 | A20U | 0.9 | 0.01 | 0.009 | Sargueil et al., 2000 |
| G21U^a | 0.01 | A20G | 0.2 | 0.003 | 0.002 | Sargueil et al., 2000 |
| G21U^a | 0.01 | A20C | 0.6 | 0.15 | 0.06 | Sargueil et al., 2000 |
| G21U^a | 0.01 | A43G | 0.4 | < 0.001 | 0.004 | Sargueil et al., 2000 |
| G21U^a | 0.075 | A43G | 0.085 | < 0.001 | 0.0064 | Siwkowski et al., 1997; Sargueil et al., 2000 |
| G21U^a | 0.075 | A43U | 0.002 | < 0.001 | 0.00015 | Siwkowski et al., 1997; Sargueil et al., 2000 |
| A7C^a | 1 | A20C | 0.81 | 1.04 | 0.81 | Anderson et al., 1994 |
| A730C^b | 0.32 | A731C | 0.39 | 0 | 0.12 | Kumar et al., 1992 |
| G726A^b | 0 | A730C | 0.32 | 0 | 0 | Kumar et al., 1992 |
| U752C^c | 0.80 | U753C | 0.42 | 0.52 | 0.336 | Lafontaine et al., 2001b |
| G722C; C763G^c | 0.81 | C723G; G762C | 0.84 | 0.75 | 0.680 | Beattie et al., 1995 |
| G716C^c | 0.21 | U717A | 0.21^d | 0.02 | 0.044 | Beattie et al., 1995 |
| C662G^c | 0.23 | A661U | 0.23^d | 0.06 | 0.053 | Beattie et al., 1995 |

a Hairpin ribozyme (numbering follows Butcher and Burke, 1994a,b).
b Hepatitis delta virus (numbering according to Makino et al., 1987).
c Neurospora VS ribozyme (numbering according to Beattie et al., 1995).
d No data available. The activity of mutant 2 is assumed to be equal to the activity of mutant 1.
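As an illustration of how the parameters a and b of a power law y = a·x^b could be obtained from such data, the sketch below performs an ordinary least-squares fit in log-log space. It uses only the rows of Table 3.1 in which both the estimated and the measured activity are reported as positive numbers (rows with 0 or with a "less than" entry are left out, which is a choice made here). It is not a reproduction of the fit shown in Fig. 3.3, and the function name is hypothetical.

```python
import math

# (estimated, measured) double-mutant activities taken from Table 3.1,
# restricted to rows where both values are positive and exactly reported.
TABLE_3_1_PAIRS = [
    (0.095, 0.093), (0.009, 0.01), (0.002, 0.003), (0.06, 0.15),
    (0.81, 1.04), (0.336, 0.52), (0.680, 0.75), (0.044, 0.02),
    (0.053, 0.06),
]


def fit_power_law(pairs):
    """Least-squares fit of log(y) = log(a) + b * log(x); returns (a, b)."""
    log_x = [math.log(x) for x, _ in pairs]
    log_y = [math.log(y) for _, y in pairs]
    n = len(pairs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    s_xx = sum((x - mean_x) ** 2 for x in log_x)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = s_xy / s_xx
    a = math.exp(mean_y - b * mean_x)
    return a, b


if __name__ == "__main__":
    a, b = fit_power_law(TABLE_3_1_PAIRS)
    print(f"a = {a:.3f}, b = {b:.3f}")
```

With so few points the fitted values are sensitive to individual rows, which echoes the caution in the text about translating the available data into a fitness function.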


3.2.3 Construction of the Fitness Landscape

Based on the foregoing general conclusions we propose a method to generate a fitness landscape for any ribozyme for which enough mutagenesis data are available. The result of the method is an algorithm which assigns a relative activity to each of the $4^N$ possible RNA sequences of length N. For the sake of simplicity we restrict the sequence space to sequences of a given length; however, the method could also be applied if insertions and deletions were considered. The algorithm consists of four steps: (1) compatible structure, (2) mispairs, (3) critical sites, and (4) predicted structure. In each step we calculate an activity value pertaining to the given step. The relative activity of the sequence, which we take as its fitness ($A_{\mathrm{sequence}}$), is the product of the individual activities calculated at each step (that is, $A_{\mathrm{sequence}} = A_{\mathrm{structure}} \times A_{\mathrm{mispair}} \times A_{\mathrm{critical}} \times A_{\mathrm{energy}}$).
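The four-step product can be sketched as a small pipeline. The sketch assumes that each step is available as a function mapping a sequence to an activity factor; the function name, the step stubs, and the early exit once the product reaches zero are illustrative choices made here, not part of the published method. The early exit only serves to avoid running the costly folding step on sequences that are already inactive.

```python
from typing import Callable, Sequence

# A step maps an RNA sequence to its activity factor (between 0 and 1).
Step = Callable[[str], float]


def sequence_activity(seq: str, steps: Sequence[Step]) -> float:
    """Multiply the activity factors of all steps, stopping early at zero."""
    activity = 1.0
    for step in steps:
        activity *= step(seq)
        if activity == 0.0:
            break  # later (possibly expensive) steps cannot change the result
    return activity


if __name__ == "__main__":
    # Hypothetical stand-ins for A_structure, A_mispair, A_critical, A_energy.
    steps = [
        lambda s: 1.0,   # compatible with the target structure
        lambda s: 0.23,  # one tolerated mispair
        lambda s: 0.8,   # a mildly deleterious base at a critical site
        lambda s: 1.0,   # predicted fold matches the target
    ]
    print(sequence_activity("GGGAAACCC", steps))
```

Ordering the cheap checks first and the folding step last keeps the work per sequence small when large parts of sequence space are scanned.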


Table 3.2 Single null mutations and multiple mutations that in some cases “rescue” enzymatic activity to some extent

| Mutation that abolishes activity | Multiple mutant | Relative activity | Ribozyme | Reference |
|---|---|---|---|---|
| A43C | A43C; G21U | 0 | Hairpin | Siwkowski et al., 1997a; Sargueil et al., 2000 |
| G726A | G726A; A730C | 0 | HDV | Kumar et al., 1992 |
| G726A | G726A; G727C; A731C | 0.02 | HDV | Kumar et al., 1992 |
| G726A | G726A; G728A; C729G | 0 | HDV | Kumar et al., 1992 |
| G726A | G726A; G727U; C729G | 0.01 | HDV | Kumar et al., 1992 |
| G726C | G726C; G728U; A730C | 0 | HDV | Kumar et al., 1992 |
| G726U | G726U; G727U; A731C | 0.04 | HDV | Kumar et al., 1992 |
| G727C | G726A; G727C; A731C | 0.02 | HDV | Kumar et al., 1992 |
| G727C | G727C; C729A; A731G | 0 | HDV | Kumar et al., 1992 |
| G728C | G727A; G728C | 0.03 | HDV | Kumar et al., 1992 |
| G728C | G728C; A731U | 0.01 | HDV | Kumar et al., 1992 |
| G728C | G728C; A730G | 0.06 | HDV | Kumar et al., 1992 |
| G728C | G727A; G728C; C729U | 0.04 | HDV | Kumar et al., 1992 |
| C763A | C763A; A766G | 0 | HDV | Kumar et al., 1992 |
| C763A | C763A; G764A | 0 | HDV | Kumar et al., 1992 |
| C763G | C763G; A765U | 0 | HDV | Kumar et al., 1992 |
| C763G | C763G; A765U | 0 | HDV | Kumar et al., 1992 |


3.2.3.1 Compatible Structure

In order to have any enzymatic activity the molecule should fold into the structure of the ribozyme. Here a number of alternative structures can also be considered.

A sequence is said to be compatible with a structure if for every base pair (i,j) in the structure the bases at the ith and jth positions in the sequence can form one of the allowed base pairs (that is, A:U, U:A, G:C, C:G, U:G, G:U). Enforcing strict compatibility might result in an overestimation of the negative effects of mispair mutations. Some mispair mutants retain a relatively high level of activity; for example the C662G mispair mutation in the VS ribozyme decreases activity to 23% of the wild type (Beattie et al., 1995). Thus, even sequences with partial compatibility should be considered compatible in this step. The negative effect of mispairs is taken into account in the next step.

If a sequence is not compatible – even when considering the possibility of mispairs – with any of the possible structures, then it has no activity and its fitness is set to 0. The activity factor for this step ($A_{\mathrm{structure}}$) is the activity associated with the structure to which the sequence can fold. The activities of the various possible structures can be different.
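A compatibility check of this kind can be sketched as follows. The sketch assumes that the target structure is written in dot-bracket notation (a convention not used in this chapter) and that the allowed pairs are the Watson–Crick pairs plus the G:U wobble pair in either orientation. The function names and the `max_mispairs` parameter, which expresses "partial compatibility", are hypothetical.

```python
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def base_pairs(structure: str) -> list[tuple[int, int]]:
    """Parse a dot-bracket string into 0-based (i, j) base pairs."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def find_mispairs(seq: str, structure: str) -> list[tuple[int, int]]:
    """Base pairs of the structure whose two bases cannot pair in seq."""
    return [(i, j) for i, j in base_pairs(structure)
            if (seq[i], seq[j]) not in ALLOWED_PAIRS]


def is_compatible(seq: str, structure: str, max_mispairs: int = 0) -> bool:
    """Compatible if lengths match and at most max_mispairs pairs are broken."""
    if len(seq) != len(structure):
        return False
    return len(find_mispairs(seq, structure)) <= max_mispairs


if __name__ == "__main__":
    target = "(((...)))"
    print(is_compatible("GGGAAACCC", target))                  # strict check
    print(is_compatible("GGGAAACCA", target, max_mispairs=1))  # one mispair tolerated
```

The list returned by `find_mispairs` is what the next step would consume when it assigns an activity penalty to each tolerated mispair.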

3.2.3.2 Mispairs

When a sequence is perfectly compatible with a structure (that is, there are no mispairs in it) then $A_{\mathrm{mispair}} = 1$, otherwise every single allowed mispair decreases activity to some extent. Every mispair has an associated relative enzymatic activity ($A_{\mathrm{mispair},i}$), and the activity factor for this step is the product of the activities of the individual mispairs: $A_{\mathrm{mispair}} = \prod_i A_{\mathrm{mispair},i}$.

If synergistic effects are taken into account, then they should be incorporated in this step and/or the next step. Some regions of the ribozyme can show different sensitivity to mispairs, thus the associated relative enzymatic activity ($A_{\mathrm{mispair},i}$) will be different.
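The mispair factor is a plain product over the mispairs present, which the sketch below makes explicit. The lookup table keyed by a mutation label, the function name, and the `default` used for mispairs without a measured value are assumptions made for illustration; the single value in the usage example (0.23 for C662G in the VS ribozyme) is the one quoted in the text.

```python
from math import prod
from typing import Hashable, Iterable, Mapping


def mispair_activity(mispairs: Iterable[Hashable],
                     activities: Mapping[Hashable, float],
                     default: float = 0.0) -> float:
    """Product of the relative activities of all mispairs in a sequence.

    An empty collection of mispairs gives 1.0 (perfect compatibility).
    Mispairs missing from the table receive `default`, a hypothetical
    fallback that treats unmeasured mispairs as inactivating.
    """
    return prod((activities.get(m, default) for m in mispairs), start=1.0)


if __name__ == "__main__":
    measured = {"C662G": 0.23}  # value quoted in the text (Beattie et al., 1995)
    print(mispair_activity([], measured))
    print(mispair_activity(["C662G"], measured))
```

Region-specific sensitivity could be expressed simply by storing different values for mispairs in different helices.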

3.2.3.3 Critical Sites

The nature of nucleotides at critical sites of the molecule is taken into account in this third step of the algorithm. These sites are well studied, so we can assign a measured activity to nearly every possible nucleotide at these positions. In fact, all possible single mutants of the single-stranded regions of the hairpin ribozyme have been analyzed (Siwkowski et al., 1997; Shippy et al., 1998).

As before, the product of the individual activities ($A_{\mathrm{critical},i}$) gives the activity factor for this step ($A_{\mathrm{critical}} = \prod_i A_{\mathrm{critical},i}$). If synergistic effects are taken into account, then they should be incorporated in this step and/or the previous step. Furthermore, if other epistatic effects were present, they would likely affect positions involved in this step.
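The critical-site factor can be sketched as a lookup of the nucleotide found at each critical position, followed by a product. The table layout (position mapped to a per-nucleotide activity), the 0-based positions, the zero fallback for unmeasured nucleotides, and all numerical values below are hypothetical and chosen only to make the example self-contained.

```python
from math import prod
from typing import Mapping


def critical_site_activity(seq: str,
                           site_table: Mapping[int, Mapping[str, float]],
                           default: float = 0.0) -> float:
    """Product over critical positions of the activity of the base found there.

    site_table maps a 0-based position to {nucleotide: relative activity}.
    Nucleotides without an entry receive `default` (assumed inactivating).
    """
    return prod(
        (site_table[pos].get(seq[pos], default) for pos in site_table),
        start=1.0,
    )


if __name__ == "__main__":
    # Hypothetical table: position 3 tolerates G poorly, position 5 requires U.
    table = {3: {"A": 1.0, "G": 0.4}, 5: {"U": 1.0}}
    print(critical_site_activity("CCCACU", table))
    print(critical_site_activity("CCCGCU", table))
```

Known epistatic interactions between critical sites would need an extra term keyed by pairs of positions, which this sketch does not attempt.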


3.2.3.4 Predicted Structure

The last step in establishing the fitness of a sequence is to predict the secondary structure the sequence will fold into and compare it with the structure resolved in the first step. The folding can be done with any available RNA folding routine, for example MFold (Zuker et al., 1999) or the Vienna RNA Package (Hofacker et al., 1994). It has to be noted at this stage that the predicted minimum free energy structure of the wild-type ribozyme sequence does not always correspond to the actual secondary structure. In this case the predicted wild-type structure can also be accepted as a good structure. Furthermore, if mispairs are allowed then they have to be taken into account during structure comparisons. When the predicted and the target structures are the same $A_{\mathrm{energy}} = 1$, otherwise $A_{\mathrm{energy}} = 0$. This step is undoubtedly the most costly in terms of CPU time.
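The all-or-nothing comparison in this step can be sketched independently of any particular folding program by passing the folding routine in as a function. The function name, the dot-bracket strings, and the stub folder below are hypothetical. With the ViennaRNA Python bindings installed, a wrapper such as `lambda s: RNA.fold(s)[0]` could presumably be supplied instead, but that is an assumption about the reader's environment rather than something this sketch depends on.

```python
from typing import Callable, Iterable


def energy_activity(seq: str,
                    accepted_structures: Iterable[str],
                    fold: Callable[[str], str]) -> float:
    """Return 1.0 if the predicted structure is among the accepted ones, else 0.0.

    accepted_structures may hold the experimental target structure and, where
    the two differ, the predicted fold of the wild-type sequence.
    fold is any routine mapping a sequence to a predicted structure string.
    """
    predicted = fold(seq)
    return 1.0 if predicted in set(accepted_structures) else 0.0


if __name__ == "__main__":
    # Stub folder standing in for a real minimum free energy prediction.
    toy_predictions = {"GGGAAACCC": "(((...)))", "AAAAAAAAA": "........."}

    def toy_fold(seq: str) -> str:
        return toy_predictions.get(seq, "." * len(seq))

    print(energy_activity("GGGAAACCC", ["(((...)))"], toy_fold))
    print(energy_activity("AAAAAAAAA", ["(((...)))"], toy_fold))
```

Keeping the folder as a parameter also makes it easy to cache predictions, which matters because this is the most CPU-intensive step.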

3.2.4 Case Study: The Fitness Landscape of the Neurospora VS Ribozyme

The Neurospora VS ribozyme (Saville and Collins, 1990) can be used as a model system to show the usefulness of the algorithm in generating the functional fitness landscape of the molecule (Fig. 3.2). There is a wealth of mutagenesis information available for this ribozyme (Beattie et al., 1995; Lafontaine et al., 2001a,b, 2002a,b,d; Rastogi et al., 1996; Rastogi and Collins, 1998; Sood and Collins, 2002): out of the 144 positions in the ribozyme (stem–loops II–VI; nucleotides 640–783 according to the conventional numbering), 87 positions have documented mutants (excluding deletions and insertions) and the total number of mutants studied so far is 183.

The ribozyme has six regions (Fig. 3.2), where stem–loop I contains the cleavage site and is the substrate of the reaction, while regions II–VI perform the catalysis. The naturally occurring VS ribozyme is self-cleaving; however, it can be divided into a trans-acting ribozyme plus a substrate system (Guo and Collins, 1995). Unlike trans-cleaving versions of the hammerhead, hairpin, and HDV ribozymes, the VS ribozyme does not use long stretches of complementary base pairing to associate with its substrate. Nonetheless, the ribozyme–substrate interaction is quite tight (Guo and Collins, 1995; Lafontaine et al., 2001b). The pseudo-knot forming between the loops of the substrate and the stem–loop V of the ribozyme is crucial for substrate binding and orientation (Rastogi et al., 1996; Andersen and Collins, 2001).

3.2.4.1 Compatible Structure of the VS Ribozyme

The length of stem–loop IV and the distal part of stem–loop VI can be varied to a great extent. The length of stem–loop V can be varied slightly (Lafontaine et al., 2002b), but the length of stem–loop III cannot be changed (Lafontaine et al., 2002b). There are no data available on stem–loop II. Furthermore, the A718 bulge in region III can be complemented (that is, replaced by a base pair, with the insertion of U after position 660) without significant loss of activity (Beattie et al., 1995; Lafontaine et al., 2002b). These changes constitute the basis of other acceptable structures beside the wild-type structure. Table 3.3 lists various structures and their activities.

To keep the total length constant at the wild-type value N = 144, all structures including deletions or insertions were slightly modified in our study. Single-stranded regions were removed from or added to the beginning and the end of the structure in all possible combinations. We allowed some mispairs in the structure, but a mispair can only be considered if the following criteria are met (in what follows all numbers refer to positions in the wild-type sequence; in the alternative structures these positions change according to the shift due to the insertion or deletion):

• Two mispairs cannot be adjacent. There are six experimentally tested mutants that have mispairs at adjacent positions. Four of them have absolutely no activity ([G727C, U728A], [G722C, C723G], [G762C, C763G], [A759U, C760G]). In the case of the remaining two mutants ([G716C, U717A], [A661U, C662G]) the activities are 0.02 and 0.06, respectively.

• The pairs [653:771], [654:770], and [655:769] in stem–loop II have to be paired. The mispair mutants of the region ([G653C], [C771G], [G655C]) do not have measurable activities, except for the mutant [C769G] whose relative enzymatic activity is 0.18.

• The base pairs adjacent to a junction have to be paired. The stable formation of the junction is critical for enzymatic activity. If the bases next to a junction do not form a base pair, then the structure of the junction changes considerably and enzymatic activity would vanish. Accordingly, the following base pairs have to be paired: [658:721], [663:715], [666:685], [687:709], [722:763].

653.2 Functionality Landscapes Inferred from Examples

Table 3.3 Enzymatic activities of different structures of theNeurospora VS ribozyme

Structure Astructure Reference

Original structure 1.000

Deletion of the A652 bulge 0.013 Beattie et al., 1995

Deletion of the A718 bulge 0.136 Lafontaine et al., 2002b

Pairing of the A718 bulge 0.820 Lafontaine et al., 2002b

Length 8 stem III 0.087 Lafontaine et al., 2002b

Length 7 stem V 0.045 Lafontaine et al., 2002b

Length 6 stem IV 0.470 Lafontaine et al., 2002b

Length 4 stem IV 0.590 Lafontaine et al., 2002b

Length 8 stem VI 0.970 Lafontaine et al., 2001b

Length 6 stem VI 1.000 Lafontaine et al., 2001b

Page 13: 3 Fitness Landscapes, Error Thresholds, and Cofactors in ...plantsys.elte.hu/pdf/kunadam/Kun-AptamerHandbook.pdf · biology – came only after the discoveries of RNA self-splicing

x A stem–loop cannot contain more than two mispairs. Generally,two mispairs in the same region greatly reduce activity (0.78[G704C, U706A], 0.54 [U670A, C672G], 0.28 [A748U, U750A], 0.25[A735U, U737], 0.07 [A690U, C692G], 0.06 [A661U, C662G], and0.02 [G716C, U717A]) or abolish it completely (see the adjacentmispairs mentioned above or the mutant [G679C, A681U]). Weassume that if a stem–loop contains more than two mispairs, thenthe resulting molecule would not have any enzymatic activity.
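The four mispair criteria listed above can be read as a simple filter on a set of mispaired positions. The following is only a minimal sketch under stated assumptions: the function names, the representation of a base pair as a tuple `(i, j)` of wild-type positions, the definition of "adjacent" as stacked pairs `(i, j)` and `(i + 1, j - 1)`, and the `stem_of` lookup are all illustrative choices, not part of the published procedure.

```python
# Base pairs that must remain paired (wild-type numbering, taken from the criteria above).
REQUIRED_PAIRS = {
    (653, 771), (654, 770), (655, 769),                          # stem-loop II
    (658, 721), (663, 715), (666, 685), (687, 709), (722, 763),  # next to a junction
}
MAX_MISPAIRS_PER_STEM = 2


def are_adjacent(pair_a, pair_b):
    """Assumed definition: (i, j) and (i + 1, j - 1) are stacked neighbours."""
    (i, j), (k, l) = sorted([pair_a, pair_b])
    return k == i + 1 and l == j - 1


def mispairs_allowed(mispairs, stem_of):
    """Return True if a set of mispaired base pairs passes all four criteria.

    mispairs: iterable of (i, j) tuples with i < j.
    stem_of:  dict mapping each pair to a stem label (hypothetical input;
              every pair in `mispairs` must have an entry).
    """
    mispairs = list(mispairs)
    # Pairs in stem-loop II and next to junctions may not be mispaired.
    if any(pair in REQUIRED_PAIRS for pair in mispairs):
        return False
    # No two mispairs may be adjacent.
    for index, first in enumerate(mispairs):
        for second in mispairs[index + 1:]:
            if are_adjacent(first, second):
                return False
    # At most two mispairs per stem-loop.
    counts = {}
    for pair in mispairs:
        counts[stem_of[pair]] = counts.get(stem_of[pair], 0) + 1
    return all(count <= MAX_MISPAIRS_PER_STEM for count in counts.values())
```

A caller would supply the stem assignment from the secondary structure in use; any pair list passed without a matching `stem_of` entry raises a `KeyError` rather than being silently accepted.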

3.2.4.2 Allowed Mispairs in the VS Ribozyme

We have distinguished several kinds of mispairs according to their position. First, a mispair next to the active site (that is, at positions [731:754] and [729:758]) decreases activity to A_mispair,i = 0.025 ± 0.011. Second, a mispair next to the loop of stem V (at positions [695:701]) decreases activity to A_mispair,i = 0.05; mutants [U695G] and [A701C] have enzymatic activities of 0.06 and 0.04, respectively. Third, a mispair inside a stem decreases activity to 0.2.

The experimentally tested mispairs that do not fall into any of the previous categories have the following activities: 0.29 [C773G], 0.23 [C662G], 0.21 [G716C], 0.12 [G650C], 0.05 [A720U], 0.00 [U659A]. The mean activity of these mutants is 0.15 ± 0.095, but it should be stressed that these mispairs are located in important parts of the ribozyme. There are no experimentally tested mutants in the functionally less important parts, and it would be reasonable to assume a slightly less severe decrease of activity. Finally, a mispair next to the loops of stem IV and VI will probably not decrease activity to a great extent. For these mispairs we have assumed A_mispair,i = 0.80.
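The position-dependent mispair factors described above amount to a small lookup. The sketch below is illustrative: the constant and function names are assumptions, and because the text does not list the positions of the pairs next to the loops of stems IV and VI, those pairs are a caller-supplied (hypothetical) argument.

```python
# A_mispair,i values by mispair location, as quoted in the text.
NEXT_TO_ACTIVE_SITE = 0.025
NEXT_TO_STEM_V_LOOP = 0.05
INSIDE_STEM = 0.2
NEXT_TO_STEM_IV_OR_VI_LOOP = 0.80

ACTIVE_SITE_PAIRS = {(731, 754), (729, 758)}
STEM_V_LOOP_PAIRS = {(695, 701)}


def mispair_factor(pair, distal_loop_pairs=frozenset()):
    """Return A_mispair,i for one mispaired base pair (i, j), wild-type numbering.

    distal_loop_pairs: pairs next to the loops of stems IV and VI; their
    positions are not given in the text, so they must be supplied by the caller.
    """
    if pair in ACTIVE_SITE_PAIRS:
        return NEXT_TO_ACTIVE_SITE
    if pair in STEM_V_LOOP_PAIRS:
        return NEXT_TO_STEM_V_LOOP
    if pair in distal_loop_pairs:
        return NEXT_TO_STEM_IV_OR_VI_LOOP
    # Any other mispair is treated as lying inside a stem.
    return INSIDE_STEM
```

Treating every remaining mispair as "inside a stem" is a simplification of this sketch; it presumes the pair has already passed the structural criteria of the previous subsection.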

3.2.4.3 Critical Sites in the VS Ribozyme

Critical sites in the VS ribozyme are located in the junctions, in the substrate binding region, and in the A730 internal loop (Table 3.4). The two junctions of the ribozyme (3–4–5 and 2–3–6) play an important role in the formation of the 3D structure of the molecule (Lafontaine et al., 2001a,b). Some critical positions exist in both junctions: positions 656, 657, 665, 686, 710, 712, 713, 767 and 768. The other nucleotides of the junctions can be changed to some other nucleotide with minor loss of activity.

The substrate interacts with the enzyme through base pairing with bases 697–699 located in the loop at the end of region V. Therefore, these positions are also considered as critical for the function of the molecule. Finally, the internal loop of region VI (positions 730, 755, 756, and 757) shows the greatest susceptibility to nucleotide substitution (Lafontaine et al., 2001b; Sood and Collins, 2002) and virtually any change in the sequence leads to severe reduction in cleavage rate. This loop is quite probably the active site of the VS ribozyme. Among these positions nucleotide A756 appears to be the most important, particularly the amino group at position 6 of the purine base (Lafontaine et al., 2002d).


3.2.4.4 Predicted Structure for the VS Ribozyme

The minimum free energy (MFE) structure of the original sequence of the VS ribozyme is not the experimentally determined secondary structure. The two structures differ in only three base pairs in stem–loop VI, and the energy difference between the two is a mere 0.3 kcal. It has to be noted, however, that the nuclear magnetic resonance structure of the isolated stem–loop VI is the same as in the MFE structure (Flinders and Dieckmann, 2004). Thus both structures might play an important role in the catalysis. Accordingly, we accept the sequence if it folds either to the MFE structure or to the experimentally determined structure of the wild-type VS ribozyme. We have used the RNA folding algorithm of Hofacker et al. (1994) for secondary structure predictions.

As previously indicated, the enzymatic activity of the molecule is simply the product of the activity factors estimated in the above four steps. If the resulting activity was less than the lowest activity that can be reliably measured (that is, A_sequence < 10⁻³), then it was set to 0.
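The combination rule stated above (a product of factors, with values below the detection limit set to zero) can be sketched in a few lines. The function and parameter names are illustrative assumptions, and treating the fold check as an all-or-nothing factor is this sketch's reading of "we accept the sequence if it folds" rather than a documented detail.

```python
from math import prod

DETECTION_LIMIT = 1e-3  # lowest activity that can be reliably measured (from the text)


def sequence_activity(a_structure, mispair_factors, critical_factors, folds_correctly):
    """Combine the per-step factors into one relative enzymatic activity.

    a_structure:      activity of the compatible structure (a Table 3.3 value).
    mispair_factors:  iterable of A_mispair,i values, one per mispair.
    critical_factors: iterable of A_critical,i values, one per critical site.
    folds_correctly:  True if the sequence folds into an accepted structure.
    """
    if not folds_correctly:
        return 0.0
    activity = a_structure * prod(mispair_factors) * prod(critical_factors)
    # Activities below the detection limit are treated as no activity at all.
    return activity if activity >= DETECTION_LIMIT else 0.0
```

For instance, two mispairs next to the active site (0.025 each) give 0.000625, which falls below the limit and is therefore reported as 0.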


Table 3.4 Activities of the critical sites in the Neurospora VS ribozyme (A_critical,i for each nucleotide at the given position)

| Critical site | A | U | C | G |
|---|---|---|---|---|
| 656 | 1.000 | 0.002 | 0.003 | 0.061 |
| 657 | 1.000 | 0.063 | 0.063ᵃ | 0.063ᵃ |
| 665 | 0.014 | 0.01ᵇ | 1.000 | 0.01 |
| 686 | 0.006 | 1.000 | 0.006ᵃ | 0.006ᵃ |
| 697 | 0.000 | 0.001 | 0.000 | 1.000 |
| 698 | 1.000 | 0.000 | 0.041 | 0.000 |
| 699 | 0.006 | 0.018 | 1.000 | 0.000 |
| 710 | 0.002 | 1.000 | 0.029 | 0.002 |
| 712 | 1.000 | 0.005 | 0.005ᵇ | 0.006 |
| 713 | 0.019 | 1.000 | 0.136 | 0.019 |
| 730 | 1.000 | 0.036 | 0.055 | 0.011 |
| 755 | 0.840 | 0.170 | 1.000 | 0.019 |
| 756 | 1.000 | 0.001 | 0.002 | 0.002 |
| 757 | 0.044 | 0.018 | 0.014 | 1.000 |
| 767 | 1.000 | 0.047 | 0.047ᵃ | 0.047ᵃ |
| 768 | 0.573 | 0.015 | 0.573ᶜ | 1.000 |

ᵃ No data available. We assume that enzymatic activity is the same as for the known mutant.

ᵇ No data available. We assume that enzymatic activity is the same as for the known mutant with the lower activity.

ᶜ No data available. We assume that enzymatic activity is the same as for the known mutant with the higher activity.
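Table 3.4 is naturally stored as a nested lookup keyed by position and nucleotide. The sketch below transcribes the table values (including the assumed entries marked by footnotes) and multiplies the factors of the critical sites; the names `CRITICAL_SITES`, `critical_site_activity`, and `wild_type_base` are illustrative, and the choice to ignore positions that are not in the table is an assumption of this sketch.

```python
# A_critical,i from Table 3.4: position -> nucleotide -> relative activity.
CRITICAL_SITES = {
    656: {"A": 1.000, "U": 0.002, "C": 0.003, "G": 0.061},
    657: {"A": 1.000, "U": 0.063, "C": 0.063, "G": 0.063},
    665: {"A": 0.014, "U": 0.01, "C": 1.000, "G": 0.01},
    686: {"A": 0.006, "U": 1.000, "C": 0.006, "G": 0.006},
    697: {"A": 0.000, "U": 0.001, "C": 0.000, "G": 1.000},
    698: {"A": 1.000, "U": 0.000, "C": 0.041, "G": 0.000},
    699: {"A": 0.006, "U": 0.018, "C": 1.000, "G": 0.000},
    710: {"A": 0.002, "U": 1.000, "C": 0.029, "G": 0.002},
    712: {"A": 1.000, "U": 0.005, "C": 0.005, "G": 0.006},
    713: {"A": 0.019, "U": 1.000, "C": 0.136, "G": 0.019},
    730: {"A": 1.000, "U": 0.036, "C": 0.055, "G": 0.011},
    755: {"A": 0.840, "U": 0.170, "C": 1.000, "G": 0.019},
    756: {"A": 1.000, "U": 0.001, "C": 0.002, "G": 0.002},
    757: {"A": 0.044, "U": 0.018, "C": 0.014, "G": 1.000},
    767: {"A": 1.000, "U": 0.047, "C": 0.047, "G": 0.047},
    768: {"A": 0.573, "U": 0.015, "C": 0.573, "G": 1.000},
}


def wild_type_base(position):
    """The nucleotide with relative activity 1.000 at a critical site."""
    row = CRITICAL_SITES[position]
    return max(row, key=row.get)


def critical_site_activity(nucleotides):
    """Product of A_critical,i over the critical sites in `nucleotides`.

    nucleotides: dict mapping position -> base; positions that are not
    critical sites are ignored here.
    """
    activity = 1.0
    for position, base in nucleotides.items():
        if position in CRITICAL_SITES:
            activity *= CRITICAL_SITES[position][base]
    return activity
```

Passing a full position-to-base map of a candidate sequence yields the critical-site contribution to its activity; a non-critical position contributes a factor of 1 in this sketch.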


3.2.4.5 Properties of the Estimated Fitness Landscape for the VS Ribozyme

To characterize the fitness landscape some additional analyses were made. All possible one- or two-mutant molecules around the original sequence were generated and their fitness recorded. Molecules containing more than two mutations could not be completely enumerated in reasonable time, thus only one million randomly chosen sequences with 4, 5, 6, 7, 8, 9, and 10 mutations were evaluated (Fig. 3.4).

Of the 432 possible single mutants, 114 (26.4%) had the same activity as the wild-type ribozyme (that is, they are “selectively neutral” neighbors), 222 (51.4%) had an activity higher than 0.1, and 254 (58.8%) still retained “measurable” enzymatic activity. Of the 92 664 possible double mutants only 6 579 (7.1%) had the same activity as the wild type, whereas 29 096 (31.4%) still retained some enzymatic activity (that is, for 68.6% of the sequences A_sequence < 10⁻³).

In addition, we generated 10 million random sequences (a very tiny fraction of the sequence space, which contains approximately 4.973 × 10⁸⁶ sequences). None of them had any enzymatic activity.
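The mutant counts and the size of sequence space quoted above follow from elementary combinatorics: choose k of the N = 144 positions and give each one of the 3 alternative nucleotides. The short sketch below reproduces these figures; the helper name `mutant_count` is an illustrative assumption.

```python
from math import comb

N = 144            # length of the ribozyme considered in the text
ALTERNATIVES = 3   # nucleotides other than the wild-type one at each position


def mutant_count(length, k):
    """Number of distinct sequences that differ from a reference at exactly k sites."""
    return comb(length, k) * ALTERNATIVES ** k


single_mutants = mutant_count(N, 1)   # 432
double_mutants = mutant_count(N, 2)   # 92 664
sequence_space = 4 ** N               # about 4.973e86

# Percentages quoted in the text, recomputed from the raw counts.
neutral_single_pct = round(100 * 114 / single_mutants, 1)    # 26.4
active_double_pct = round(100 * 29096 / double_mutants, 1)   # 31.4
```

The same helper shows why exhaustive enumeration stops at two mutations: the count grows by a factor of roughly 3(N − k)/(k + 1) with each additional mutated site.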

3.3 Error Thresholds Inferred from Functional Landscapes: The “Realistic” Error Threshold of the Neurospora VS Ribozyme

Fig. 3.4 Fraction of all possible mutants of the original sequence folding into a perfect ribozyme (circles), and into a ribozyme with some activity (squares), as a function of the number of point mutations introduced into the sequences.

As previously pointed out, the formal description by Eigen (1971) of the mutation–selection dynamics of a population of biological sequences led to the realization of one of the greatest paradoxes of prebiotic evolution: the error rate poses a limit to the length of the information that can be selectively maintained within the system. A stable cloud of mutants (that is, a quasi-species) can form around a master sequence as long as the chain length (N) and the error rate per site per replication (μ*) satisfy the following expression:

$$N < \frac{\ln s}{\mu^{*}} \qquad (3.2)$$

where s is the selective superiority of the master. Very roughly then, assuming as customary that ln s ≈ 1, Eq. (3.2) states that the maximum selectively maintainable amount of information (N) is about the inverse of the mutation rate per base per replication. Experimental evidence suggests that without the aid of peptide enzymes the upper bound of copying fidelity per nucleotide per replication could not be higher than 0.99 (Johnston et al., 2001), and is quite likely significantly lower than this figure. Accordingly, with an error rate of at least 0.01, the maximum N would be lower than 100 nucleotides.

However, Eigen’s model is based on the assumption that the whole genotype (the master sequence) has to be maintained for functionality. This assumption might be justifiable in a DNA–protein world, but in the RNA world the enzymatic activity of a molecule was mainly based on its three-dimensional structure rather than on the exact order of its building blocks (except for a few critical sites). It is a well-established fact that the number of possible RNA secondary structures is considerably less than the number of possible sequences (Schuster et al., 1994; Stadler and Haslinger, 1999), and it is also possible that two molecules with completely different sequences share the same secondary structure (Huynen, 1996; Huynen et al., 1996; van Nimwegen et al., 1999). In other words, when RNA structure is considered it might be feasible to maintain ribozyme functionality (phenotype) at mutation rates that would not allow the preservation of the master sequence.

In order to determine the “realistic” error threshold for the VS ribozyme we have explored the dynamics of a population of RNA molecules with N = 144 at various mutation rates per nucleotide per replication (μ). At the replication step a sequence is picked at random according to its fitness. Thus, the probability p_i of choosing sequence i with enzymatic activity A_i is:

$$p_i = \frac{A_i}{\sum_j A_j} \qquad (3.3)$$

where Σ_j A_j is the sum of activities for all sequences in the population. The next step is to copy the chosen sequence with error rate μ (only point mutations were considered). Because a quarter of the time (assuming equal probability for each nucleotide) no effective change will occur in the position even though there is a mutational event, the effective mutation rate is μ* = 0.75 μ.

The new sequence then replaces a randomly chosen sequence, which keeps the population of molecules constant and is also equivalent to the assumption that the rate of degradation is the same for all molecules and independent of enzymatic activity (Bonhoeffer and Stadler, 1993). As a final point we should emphasize here that the occurrence of thresholds for error propagation was originally derived as a deterministic kinetic theory that is only valid in the limiting case of an infinite number of molecules. Alves and Fontanari (1998) have extended it to finite populations and found that the critical error rate per site per replication decreases linearly with 1/N (N here being the population size). In our present case we extrapolated to an infinite population size by recording the time to extinction (that is, the number of generations when no functional ribozymes remained in the population) at various error rates and fitting a straight line to those last few points which still showed a downward trend. The error threshold is then the intersection of the line with the error rate axis.

As shown in Fig. 3.5 the “realistic” error threshold for the VS ribozyme was estimated to be μ* = 0.052 (this figure refers to the “effective” mutation rate; see above). To compare this figure with the error threshold that would be obtained without considering secondary structure we have considered two different fitness landscapes: Mount Fuji and single-peaked landscapes.

Fig. 3.5 Error threshold for the VS ribozyme. Maximum (downward triangles), mean (squares) and minimal (upward triangles) observed times to extinction are plotted as a function of the error rate. The population size was 5000 molecules. A generation is defined as a number of replication steps equal to the population size. For each error rate 10 replicates were obtained. A line is fitted to the last four data points (solid line; R = –0.904).

In the Mount Fuji landscape we have assigned an activity value to all possible nucleotides at a given position, with the wild-type nucleotides at each position having a value A_i = 1 for the enzymatic activity. For those positions where experimental data are available we used that value; otherwise we either used the value for the same position derived for another nucleotide (if more than one such value was available, we used the lowest) or used a predefined value. In the last case we considered two scenarios: those mutants have uniformly either A_i = 0.8 or A_i = 0.2. Therefore, in the first case we assumed that those positions are not functionally important, whereas in the second case they are. In no case can the enzymatic activity of the molecule be higher than 1. The fitness value of a sequence was then calculated as the product of the individual activities, and the resulting “Mount Fuji error thresholds” were μ* = 0.032 (A_i = 0.2) and μ* = 0.025 (A_i = 0.8), which are substantially lower than the “realistic” error threshold from the functional landscape. The Mount Fuji landscape retains some characteristics of the functional landscape (the fitness effect of the single-stranded regions) but it is no longer possible to have compensatory mutations by changing one base pair into another. In fact, every mutation affecting a helical region counts as a mispair, which causes the error catastrophe to occur at a lower value.

For the single-peaked landscape we used the same assumption as in Eigen’s model. Thus, the wild-type sequence has A_i = 1 and all other sequences have A_i = 0.217, which is the average activity of all experimentally tested one-point mutants. The “single-peaked error threshold” was found to be μ* = 0.023, lower than in either of the previous cases. The reason is that this landscape retains no information about the structure, and no neutral mutations are possible.

By using Eq. (3.2) above and assuming ln s = 1, the maximal error rate for the VS ribozyme would be μ* ≈ 1/144 ≈ 0.007. This figure is nearly an order of magnitude lower than the one we obtained by using a realistic fitness landscape. Furthermore, according to Eigen’s model the error rate of 0.052 would permit a ribozyme of maximum length of only about 20 (1/0.052 ≈ 19) to be maintained. In summary, it is quite obvious that the inclusion of structural information, as well as information derived from experimental data, crucially alleviates the burden imposed by Eigen’s (1971) error threshold. This is the first report of a realistic error threshold calculated for an existing ribozyme. To the best of our knowledge there is only one other indication of an error threshold calculated using a structural landscape. Huynen and co-workers (1996) used the secondary structure of the phenylalanine tRNA (N = 76) as their object of investigation, and assumed that the fitness decrease of a mutant is proportional to the difference between the target structure and the structure of the mutant sequence. They reported that the error threshold for the tRNA^Phe is 0.0031. This seems to be too low, as even Eigen’s model predicts a higher error threshold (μ* ≈ 0.01). This low value might be an artefact of the folding algorithm, which for tRNAs often predicts a minimum free energy structure unlike the known cloverleaf structures.

There is as yet no reported replicase ribozyme. The most promising result thus far is a ribozyme that can extend a sequence by 14 nucleotides according to a template (Johnston et al., 2001). This ribozyme works with a 0.967 copying fidelity. If a functional replicase ribozyme had the same fidelity (an error rate of 0.033, below the threshold of 0.052), then it could replicate the VS ribozyme without the threat of the error catastrophe.
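The population dynamics used in this section (fitness-proportional choice of a parent as in Eq. 3.3, copying with per-site error rate μ, and replacement of a randomly chosen molecule) can be sketched as follows. This is a minimal illustration and not the authors' program: all function names are assumptions, the `activity` argument stands in for whatever fitness landscape is being studied, and the structure-based landscape of the chapter is not reproduced here.

```python
import random

BASES = "ACGU"


def copy_with_errors(sequence, mu, rng):
    """Copy a sequence; each site suffers a mutational event with probability mu.

    The new base is drawn uniformly from all four nucleotides, so a quarter of
    the events leave the site unchanged (effective rate 0.75 * mu).
    """
    return "".join(
        rng.choice(BASES) if rng.random() < mu else base for base in sequence
    )


def replication_step(population, activity, mu, rng):
    """One replication event in a constant-size population (modified in place).

    Returns False if no sequence has any activity left (extinction).
    """
    weights = [activity(sequence) for sequence in population]
    if sum(weights) == 0:
        return False
    # Parent chosen with probability A_i / sum_j A_j, as in Eq. (3.3).
    parent = rng.choices(population, weights=weights, k=1)[0]
    offspring = copy_with_errors(parent, mu, rng)
    # The offspring replaces a randomly chosen molecule.
    population[rng.randrange(len(population))] = offspring
    return True


def generations_to_extinction(population, activity, mu, rng, max_generations):
    """Generations elapsed before extinction, or None if the population survives.

    One generation is a number of replication steps equal to the population size.
    """
    for generation in range(max_generations):
        for _ in range(len(population)):
            if not replication_step(population, activity, mu, rng):
                return generation
    return None
```

Repeating `generations_to_extinction` over a range of μ values and several random seeds would give the kind of time-to-extinction data to which a straight line is fitted; recomputing all activities at every step keeps the sketch short but would be too slow for populations of thousands of molecules without caching.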

3.4 Looking for Catalytic Partners: Cofactors and Aptamers

The discovery that present-day living cells use RNA catalysts to hydrolyze RNA molecules or to perform the complex reactions of excision, ligation, and cyclization, supporting a limited catalytic diversity, raises the question of the respective domains of RNA and protein catalysis. The recent advances in RNA catalysis using the SELEX method make it possible to enhance the catalytic capabilities of RNA with small molecules as catalytic partners. In this way some RNAs may be analogous to proteins in catalytic competence.

Purine nucleotides, and in particular those containing adenine, participate in a large variety of cellular biochemical processes (Maurel and Décout, 1999). Their best-known function is that of monomeric precursors of RNAs and DNAs. Nevertheless, derivatives of adenine are universal cofactors. They serve in biological systems as a source of energy (ATP), as allosteric regulators of enzymatic activity, and as regulation signals (cyclic AMP). They are also found as acceptors during oxidative phosphorylation (ADP), as components of coenzymes (such as in FAD, NAD, NADP, coenzyme A; Fig. 3.6), as transfer agents of methyl groups (S-adenosylmethionine), as possible precursors of polyprenoids in C5 (adenosylhopane) (Neunlist et al., 1987), and, last but not least, adenine 2451, conserved within the large rRNA in the three kingdoms, would be involved in catalysis during the formation of the peptide bond (Muth et al., 2000, 2001; Green and Lorsch, 2002).

Fig. 3.6 Coenzyme structures.

On the other hand, biosynthesis of the amino acid histidine, which would have appeared late in evolution, begins with 5-phosphoribosyl-1-pyrophosphate (PRPP) that forms N′-(5-phosphoribosyl)-ATP by condensation with ATP. This reaction is akin to the initial reaction of purine biosynthesis. Finally, the ease with which purine bases are formed in prebiotic conditions (Oró, 1960) suggests that these bases were probably essential components of an early genetic system. The nucleotides that by post-transcriptional modification can acquire the majority of functional groups present in amino acids possess a great potential diversity that is expressed in the modified bases of tRNAs and rRNAs and also at the level of ribonucleotide coenzymes (several coenzymes derive from AMP; Fig. 3.7). In particular many coenzymes are nucleotide analogs, and the role of these cofactors at all steps of the current metabolism, and their distribution within the three kingdoms, suggests that a great variety of nucleotides were present in the last common ancestor. It has been suggested (White, 1976; Trémolières, 1980) that coenzymes and modified nucleotides that were present before the appearance of the translation machinery may have played a prominent role in primeval catalysis.

Proteins would have appeared only at a later stage, coenzymes and ribozymes being fossil traces of past catalysts. Indeed, in the living cell, only a minority of enzymes function without a coenzyme; they are mostly hydrolases, and apart from this group, 70% of the enzymes require a coenzyme. If metal coenzymes involved in catalysis are considered, the number of enzymes that depend on coenzymes increases further. Present-day coenzymes, indispensable cofactors for many proteins, would be living fossils of catalysts of primitive metabolism (Maurel and Haenni, 2005).

Most coenzymes are nucleotides (NAD, NADP, FAD, coenzyme A, ATP, etc.) or contain heterocyclic nitrogen bases that can originate from nucleotides (thiamine pyrophosphate, tetrahydrofolate, pyridoxal phosphate, etc.). Consequently the efficiency of selected ribozymes can be further expanded if coenzymes and/or nucleotide analogs containing functionalities (thiols, amino groups, imidazolyl moieties, etc.) are used. Modern selections yield various cofactor-dependent ribozymes, lending support to an RNA-based metabolism.

3.4.1 Co-ribozymes (cofactor-assisted ribozymes)

Aptamers are capable of recognizing targets as small as metal ions (most RNA enzymes are metallo-ribozymes using metals as cofactors). They can interact with a wide variety of molecules that are important for metabolism, including amino acids, porphyrins, nucleotide factors, coenzymes, small peptides, and short oligonucleotides (Illangasekare and Yarus, 1997; Jadhav and Yarus, 2002a; Joyce, 2002; McGinness et al., 2002; Reader and Joyce, 2002) (Table 3.5).

Table 3.5 Small targets of RNA aptamers

| Target | Minimal size (nt) | Kd (μmol/L) | Reference |
|---|---|---|---|
| Adenine | – | 10 | Meli et al., 2002 |
| Guanine | 32 | 1.3 | Kiga et al., 1998 |
| Guanosine | – | 32 | Connell and Yarus, 1994 |
| L-Adenosine | 58 | 1.7 | Klussmann et al., 1996 |
| Theophylline | 38 | 0.1 | Jenison et al., 1994 |
| Caffeine | – | 3 500 | Jenison et al., 1994 |
| Riboflavin | 35 | 0.5 | Burgstaller and Famulok, 1994 |
| NMN/NAD | – | 2.5 | Lauhon and Szostak, 1995 |
| D-ATP | 40 | 14 | Sassanfar and Szostak, 1993 |
| Cyanocobalamin | 35 | 88 nmol/L | Lorsch and Szostak, 1994 |
| L-Valine | – | 12 000 | Majerfeld and Yarus, 1994 |
| L-Citrulline | 44 | 65 | Famulok, 1994 |
| L-Arginine | – | 0.33 | Geiger et al., 1996 |
| D-Arginine | 38 | 135 | Nolte et al., 1996 |
| Biotin | 31 | 6 | Wilson et al., 1998 |

The first aptamer selected for a biological cofactor was an ATP-binding RNA (Sassanfar and Szostak, 1993; Fig. 3.8) showing a change in the conformation of the RNA and a number of close contacts between the ATP and the RNA.

Fig. 3.8 ATP and the corresponding aptamer.

Since the ATP motif also binds adenosine and NAD⁺, the idea was that it could serve to bind adenosine-derived cofactors as well. The presence of adenosine in many common biological cofactors (ATP, CoA, FAD, NAD⁺, SAM, coenzyme B12) has been postulated to reflect an evolutionary origin for modern metabolism. It is even possible to consider that catalytic groups that were part of nucleic enzymes were incorporated in specific amino acids rather than being “retained” as coenzymes. This could be the case for imidazole, the functional group of histidine, whose present synthesis in the cell is triggered by a nucleotide (Maurel and Ninio, 1987; Benner et al., 1989; Maurel, 1992). Further, the use of organic cofactors was illustrated by a DNA enzyme that requires histidine as a cofactor during RNA cleavage (Szazani et al., 2004).

Fig. 3.7 List of coenzymes derived from AMP.

A small aptamer recognizing anionic moieties has also been obtained by Szostak and co-workers (Szazani et al., 2004). In this case the significant interactions are with the phosphate of ATP (Kd of 5 μmol/L compared with the AMP Kd of 5.5 mmol/L). This is of particular interest as two messenger RNA riboswitches from Escherichia coli are known to discriminate between phosphorylated small molecules.

The first self-incorporation of a coenzyme into a ribozyme was performed in 1995 by Breaker and Joyce (1995), who neatly substituted alternative coenzymes into the primary structure of a group I intron by replacing the guanosine substrate with natural coenzymes or analogs (Fig. 3.9a and b respectively).

Fig. 3.9 Self-incorporation of coenzyme (adapted from Breaker and Joyce, 1995).

Looking for significant metabolic reactions and for coenzyme-dependent ribozymes, the primary biological cofactor used in acyl-transfer reactions, coenzyme A (CoA), has been the target of RNA pools leading to a 52-nucleotide minimal aptamer (Burke and Hoffmann, 1998) which recognizes the adenosine moiety of CoA and binds other ATP analogs (Fig. 3.10). The selection of coenzyme synthetase ribozymes is of particular interest in an RNA-catalyzed energy metabolism. Yarus, exploring the origin of ribonucleotide coenzymes, demonstrated the RNA-catalyzed formation of three common coenzymes, CoA, NAD, and FAD, from their precursors, 4′-phosphopantetheine, NMN, and FMN, respectively (Huang et al., 2000). A ribozyme capable of utilizing CoA for the synthesis of acyl-CoA was selected in vitro. The isolated co-ribozyme, an acyl-CoA synthetase, produced acetyl-CoA and butyryl-CoA (Jadhav and Yarus, 2002b).

The use of organic cofactors has been illustrated by a hairpin ribozyme that requires adenine as a cofactor during the reversible RNA self-cleavage reaction (Meli et al., 2003). This may lead to a better understanding of prebiotic cofactors in primeval catalysis. Our working hypothesis is based on the demonstration of esterase activity in a nucleoside analog, N⁶-ribosyladenine (Fuller et al., 1972; Maurel and Ninio, 1987). The activity, due to the presence of an imidazole group that is free and available for catalysis, is comparable to that of histidine placed in the same conditions (Fig. 3.11). We have studied the kinetic behavior of this type of catalyst (Ricard et al., 1996) and have shown that the catalytic effect increases greatly when the catalytic element, the pseudohistidine, is placed in a favorable environment within a macromolecule (Décout et al., 1995). Moreover, primitive nucleotides were not necessarily restricted to the standard nucleotides encountered today, and because of their replicative and catalytic properties, the N⁶- and N³-substituted derivatives of purines could have constituted essential links between the nucleic acid world and the protein world.

Following this line of investigation we started the selection of ribozymes dependent on adenine. Starting from a heterogeneous population of RNAs with 10¹⁵ variants (a population of 10¹⁵ different molecules) we have selected five populations of RNAs capable of specifically recognizing adenine after ten generations (Meli et al., 2002). When cloned, sequenced, and modeled, the best one among the individuals of these populations has a shape reminiscent of a claw capable of grasping adenine. Following this result we have isolated from a degenerated hairpin ribozyme, by in vitro selection, two adenine-dependent ribozymes capable of triggering reversible cleavage reactions (Fig. 3.12). One of them is also active with imidazole alone (Meli et al., 2003).

A quarter of classified enzymatic reactions are redox reactions involved in various biological events, such as metabolism of biological molecules, detoxification, energy production, and regulation of protein functions. An RNA molecule binding hemin and exhibiting a peroxidase activity has been reported (Travascio et al., 1999). In this case, RNA and DNA of the same nucleotide sequence are capable of forming comparable cofactor-binding sites promoting catalysis.


Fig. 3.10 Secondary structures of coenzyme–RNA aptamers.


Recently, Tsukiji et al. (2004) reported a ribozyme that exhibits activity analogous to alcohol dehydrogenase (ADH). The ADH ribozyme was selected in vitro from a pool of random RNA sequences in the presence of NAD⁺ and Zn²⁺, based on the ability to convert a benzyl alcohol derivative to the corresponding aldehyde that was covalently attached to the 5′ end of the RNA pool. In fact, the ribozyme oxidizes the benzyl alcohol to benzaldehyde in an NAD⁺- and Zn²⁺-dependent manner, with a rate acceleration of at least 7 orders of magnitude over the uncatalyzed reaction.


Fig. 3.11 (a) Adenine. (b) Comparison of modified adenosine and histidine. (c) Catalytic activity of adenine residue.

Fig. 3.12 Adenine-dependent hairpin ribozymes (ADHR). Arrowheads: cleavage sites. Gray dots: degenerated (mutated) sites. Vertical bars: separation between the primer binding region and the ribozyme sequence.


3.4.2 Aptazymes

Joining an aptamer and a ribozyme yields an aptazyme. Several examples of new aptazymes have been obtained from random-sequence RNA selections. Breaker and coworkers merged sequences of ATP aptamers into the hammerhead ribozyme to design an allosteric ribozyme controlled by ATP (Tang and Breaker, 1997). The same idea was used by Araki et al. (1998) with the hammerhead ribozyme and a flavin-specific RNA aptamer: FMN binding affects the conformation necessary for catalytic activity. Aptazymes can exhibit high activation; for example, an ATP-dependent ligase shows 830-fold activation and a theophylline-dependent ligase shows 1600-fold activation (Robertson and Ellington, 1999, 2000).

These allosteric ribozymes, composed of two independent structural domains, can activate catalysis in a regulated manner, providing molecular switches made of RNA (Soukup and Breaker, 1999). Furthermore, these kinds of selected ribozymes can have allosteric responses that are orders of magnitude greater than those seen for protein enzymes. The allosteric selection strategy also provides novel RNA molecular switches responding to cNMP targets. For instance, Koizumi et al. (1999) demonstrated specific activation of ribozyme cleavage by cGMP and cAMP, exhibiting a 5000-fold activation in the presence of the effector compounds.

One of the most popular biotech applications of riboswitches under development is gene therapy, in which patients would take pills to switch genes on or off. On the other hand, natural aptamers are involved in riboswitch regulation of transcription termination and translation initiation in bacteria. Either FMN or thiamine acts on bacterial riboswitch RNAs located within the 5′-untranslated region (5′-UTR). Binding of the target ligand alters the conformation, resulting in a change in gene expression (Winkler and Breaker, 2003). Recently Winkler et al. (2004) demonstrated that the glmS ribozyme responds, for the purpose of genetic control, to glucosamine 6-phosphate (GlcN6P), a crucial compound in sugar metabolism in all three kingdoms of life and in cell wall biosynthesis in Gram-positive microorganisms. Also in prokaryotes, mRNA binds coenzyme B12 to modulate expression of a cobalamin transport protein. The complexity of riboswitch structures is supported by numerous short helices and conserved sequence elements, allowing a great functional diversity in modern cells as well as a possible diversity of ancient RNA functions.

3.5 The Use of Coenzymes: From the RNA World to the Protein World via Translation and the Genetic Code

Although conclusive evidence for an RNA world does not exist, there is a widely shared belief that such a world did exist. The main obstacle to the acceptance of this appealing idea is that we do not understand the origin of nucleic acid replication of long templates (which would give unlimited hereditary potential: Maynard Smith and Szathmáry, 1995), despite some excellent results concerning the non-enzymatic replication of short pieces of nucleic acid analogues (von Kiedrowski, 1986), and exponential, non-autonomous amplification of somewhat longer oligonucleotides (the SPREAD procedure; Luther et al., 1998). It is the successful selection for aptamers and ribozymes (see various chapters of this book), more than anything else, that convinces people of the plausibility of an RNA world. In line with an early suggestion (Szathmáry, 1989, 1990b), it now seems that transition state stabilization is a crucial component of successful catalysis by ribozymes (Rupert et al., 2002).

In the previous section we gave an overview of coenzyme usage by aptamers (co-ribozymes). In this final section we comment on the evolutionary significance of these findings. We think, in the footsteps of White (1976), Korányi and Gánti (1981), and Gánti (2003a), that coenzymes have formed a crucial link between the RNA and RNA–protein worlds. If we assume that coenzymes were already of significance in primordial metabolism in the RNA world, then it is difficult to see how they could have been widely replaced by any other molecules in metabolic evolution. One can replace RNA enzymes by protein enzymes (see Maynard Smith and Szathmáry, 1995, for a discussion) through evolution one by one, but any one coenzyme takes part in so many different reactions that unless there is a very strong selective force, replacement is unnecessary and/or unimaginable. The fact that ribozymes can make use of cofactors has been shown in a number of experiments. Roth and Breaker (1998) demonstrated the use of histidine in one DNA enzyme and, recently, Tsukiji et al. (2004) showed that a selected ribozyme can catalyze reduction of an aldehyde using the NADH cofactor.

The replacement of ribozymes by protein enzymes must have been driven by the higher catalytic versatility of the latter (Benner et al., 1987), given the fact that the 20 amino acids offer significantly more functional groups than the four nucleotides (Table 3.5). Thus if such an enzyme takeover has indeed taken place, then modern metabolism is a “palimpsest” (Benner et al., 1989). Yet inspection of Table 3.5 reveals that, when complemented by coenzymes, the catalytic versatility of ribozymes approaches that of proteins, at least based on the diversity of functional groups. But of course it makes a big difference how often an enzyme needs such an aid. Presumably proteins in a rich metabolism need coenzymes less often and/or use them more efficiently than ribozymes.

It now seems that all critical steps of protein synthesis can be catalyzed by ribozymes, including amino acid binding (Famulok and Szostak, 1992) and activation (Kumar and Yarus, 2001), and peptide formation, both as it happens in contemporary ribosomes (Noller et al., 1992) and in artificial systems (e.g. Illangasekare and Yarus, 1999; Sun et al., 2002). The next logical question then is how and why amino acids were introduced into the RNA world in the first place. We cannot review the entire relevant literature here, but rather briefly present our view that amino acids were among the critical coenzymes in the RNA world (Szathmáry, 1990c, 1993, 1996, 1999), and that proteins are evolved descendants of amino acid coenzymes.


Before one accepts this line of argument another question must be considered: why did evolution not raise the number of nucleotides in replicable templates? One constraint is replicability. For example, N6-ribosyl-adenine could be a chemically feasible substitute for the amino acid histidine (Maurel and Ninio, 1987; Décout and Maurel, 1993), but it could not be inserted into RNA by conventional replication. From the chemical point of view even replicable alternative base pairs do not look unfeasible: Piccirilli et al. (1990) have shown that alternative base pairs can be designed, synthesized, and incorporated into nucleic acids. Further additions to the genetic alphabet have also been proposed (reviewed by Szathmáry, 2003). To be sure, the requirement of replicability puts a severe constraint on the diversity of these novel building blocks, but it does seem reasonable to propose that increasing the size of the genetic alphabet in templates would also increase the size of the catalytic alphabet when such molecules are used as enzymes. A theoretical analysis predicts that the increase in catalytic efficiency of ribozyme-like molecules should be slower than exponential with the number of letters in the alphabet (Szathmáry, 1991, 1992). But this advance comes at a price: copying fidelity of these molecules should decrease faster than exponentially, so there must be an optimal alphabet size, where the product of catalytic efficiency and copying fidelity (a measure of fitness, cf. Eigen, 1971) is maximal (reviewed in Szathmáry, 2003). Despite the technical difficulties involved, it would be important to test this prediction in experimental systems.

Assuming that the trade-off between replicative fidelity and enzymatic efficiency holds, the canonical solution to the problem of increasing catalytic versatility should come by the addition of coenzymes to ribozymes, rather than by increasing the alphabet size of templates (cf. Szathmáry and Maynard Smith, 1997). Once again there are two options. Coenzymes can be either covalently linked to ribozymes or they can be used by reversible binding. Most experimental attempts demonstrated the first option (reviewed by Jadhav and Yarus, 2002b) but we emphasize that this is more the result of the applied selection protocol, where selection on cells is not an option. Note that it is possible to obtain nucleic acid enzymes where cofactors are reversibly rather than covalently bound (Roth and Breaker, 1998; Tsukiji et al., 2004). In general we favor the reversible alternative, for the following reason. Imagine a ribozyme (R) catalyzing a step in metabolism, using a cofactor (C). If the cofactor is bound covalently (R–C), then the following constraints hold: (1) there must be a selective charging activity, either within the ribozyme or performed by another ribozyme, specific to the first ribozyme (R + C → R–C); (2) each cofactor molecule is stuck with its own ribozyme molecule; (3) coenzymes cannot be used as cosubstrates at various points in metabolism, since they are irreversibly linked to particular ribozyme molecules. The last point is perhaps the most important one. First, it holds for the contemporary nucleotide-like coenzymes that in any particular reaction they are not catalysts but cosubstrates: only the coenzyme coupling is catalytic, when the coenzyme is recycled by other reactions restoring its original state (Gánti, 2003a). Schematically it holds that A + C → B + C* and D + C* → E + C, where C and C* are the two states of the same coenzyme (e.g. NAD and NADH), and A, B, D, and E are metabolic intermediates. Suppose that a full reaction looks as follows: A + R–C → B + R–C*. Because of the constraints on this ribozyme R, specific for the A → B conversion, it cannot catalyze the reaction D → E that would restore the original state of its coenzyme part C*. Therefore, the only possibility for the coenzyme coupling would be a reaction such as R′ + D + R–C* → R′ + E + R–C, where R′ is a different ribozyme, specific for the D → E reaction! We conclude that efficient use of coenzymes and the realization of coenzyme coupling require in general that coenzymes be reversibly bound by most ribozymes, where the latter are specific to different individual reactions in the network. (Coenzymes in the contemporary protein world work this way.) A good question still is whether we should find coenzymes such as C mostly in free form or bound to a nucleic acid moiety R″ (which, according to the above reasoning, in general need not be a ribozyme). In the scenario of White (1976) and Jadhav and Yarus (2002a,b) the favored alternative is an R–C option. This is consonant with the fact that typically the nucleotide parts act as “handles” (H = R″) only and do not take part in catalysis (Benner et al., 1987). A good reason for this solution is that RNA handles can be more readily recognized and bound by other RNA molecules than anything else (Szathmáry, 1990c, 1993). Thus, rather than forcing all coenzyme-using ribozymes to self-charge the non-nucleic acid part C, they can bind the H–C molecule reversibly, for which there is only one synthetic reaction per coenzyme: S + H + C → S + H–C, where S is a coenzyme synthetase. Occasionally it may happen that the handle is self-charging (H + C → H–C), but in general one would favor handles that are specific but as short as possible, and which hence could not self-charge; therefore in general one would expect synthetases (S) to catalyze the H + C → H–C reaction.

The link between these general considerations on coenzymes and the way out of the RNA world is that, according to our favored scenario, amino acids had been utilized as coenzymes of ribozymes before the genetic code was used in the coded synthesis of proteins in translation. As emphasized before, there is evidence that histidine can complement a nucleic acid enzyme (Roth and Breaker, 1998). Furthermore, various short peptides turn out to be useful in a nucleic acid context (Bergstrom et al., 2001; Viladkar, 2002). We urge more experiments in which the use of various amino acids as coenzymes of ribozymes can be tested.

The rationale for amino acids (aa) acting alone or linked to handles (H–aa), favoring the latter, is similar to the one given above for coenzymes in general (Szathmáry, 1993, 1996, 1999). One can imagine that amino acids are introduced into ribozymes via specific post-replicational modification. Suppose that a ribozyme R uses two different amino acids, aa1 and aa2, at two different points in the molecule. This would need two modifying ribozymes M1 and M2, specific for aa1 and aa2 and for R, similar to tRNA modification after transcription (Björk, 1995). Even if another ribozyme R′ uses the same two amino acids aa1 and aa2, a different pair of modifying ribozymes M1′ and M2′ would be needed to produce the functional form of R′. Introduction of more and more amino acids into more and more ribozymes by post-replicational modification is therefore an impasse (Szathmáry and Maynard Smith, 1997).
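The counting argument behind this impasse can be made explicit with a minimal sketch. It assumes, as in the text, that post-replicational modification needs one modifying ribozyme per (ribozyme, amino acid) pair, and, as a simplification, that the handle scheme needs exactly one synthetase per amino acid. The function names and the data layout are illustrative only.

```python
def modifiers_needed(usage):
    """Post-replicational modification: one modifying ribozyme per
    (ribozyme, amino acid) pair, because each modifier must recognize
    both the amino acid and the target ribozyme."""
    return sum(len(set(amino_acids)) for amino_acids in usage.values())


def synthetases_needed(usage):
    """Handle scheme (simplifying assumption: one synthetase per distinct
    amino acid, H + aa -> H-aa), shared by every ribozyme that binds the
    charged handle reversibly."""
    return len(set().union(*usage.values()))


if __name__ == "__main__":
    # The two-ribozyme case from the text: R and R' both use aa1 and aa2.
    usage = {"R": ["aa1", "aa2"], "R'": ["aa1", "aa2"]}
    print(modifiers_needed(usage), synthetases_needed(usage))

    # Hypothetical larger metabolism: 50 ribozymes using the same two amino acids.
    big = {f"R{i}": ["aa1", "aa2"] for i in range(50)}
    print(modifiers_needed(big), synthetases_needed(big))
```

In the two-ribozyme case the sketch counts four modifiers (M1, M2, M1′, M2′) against two synthetases; the number of modifiers grows with the number of ribozymes, whereas the number of synthetases depends only on the number of amino acids in use.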

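The alphabet-size trade-off discussed earlier in this section (efficiency rising more slowly than exponentially, fidelity falling faster than exponentially) can be illustrated with a toy calculation. The functional forms below, a power law n^a for efficiency and exp(-b n^2) for fidelity, and the parameter values are assumptions chosen only to satisfy those two qualitative conditions; they are not the model of Szathmáry (1991, 1992), and the position of the optimum they yield has no empirical meaning.

```python
import math


def efficiency(n, a):
    """Assumed catalytic efficiency: power law, slower than exponential in n."""
    return n ** a


def fidelity(n, b):
    """Assumed copying fidelity: exp(-b n^2), declining faster than exponentially."""
    return math.exp(-b * n * n)


def fitness(n, a, b):
    """Product of efficiency and fidelity, used as the fitness measure."""
    return efficiency(n, a) * fidelity(n, b)


def optimal_alphabet_size(sizes, a, b):
    """Alphabet size among `sizes` that maximizes the toy fitness."""
    return max(sizes, key=lambda n: fitness(n, a, b))


def continuous_optimum(a, b):
    """Maximum of a*ln(n) - b*n^2: setting a/n - 2*b*n = 0 gives sqrt(a / (2b))."""
    return math.sqrt(a / (2.0 * b))


if __name__ == "__main__":
    a, b = 2.0, 0.05  # arbitrary illustrative parameters
    print(optimal_alphabet_size(range(2, 21), a, b), continuous_optimum(a, b))
```

Whatever the parameter values, the product has a single interior maximum for any positive a and b, which is the qualitative point of the argument.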

A feasible alternative is to bind amino acids reversibly. The simplest known aptamer for isoleucine (Lozupone et al., 2003) is significantly longer than a short RNA hairpin, such as the anticodon stem and loop of tRNA, although the latter would be a specific and stable handle for a linked amino acid (Szathmáry, 1999), provided it would not have to self-charge. A small ribozyme self-charging Phe is 29 nucleotides long (Illangasekare and Yarus, 1999), which is also considerably longer than an anticodon stem–loop. Thus amino acids could be bound specifically and reversibly via oligonucleotide handles by the ribozymes using them. Stability and reversible binding of such handles would have favored an RNA hairpin (Szathmáry, 1999).

A consequence of the above reasoning is that the anticodon hairpin is likely to have been the earliest adapter. Another remarkable benefit is that genetic coding pops out for free: different handles could have been bound safely to the same amino acid, but charging of different amino acids to the same handle would have been forbidden. The origin of the unambiguous but degenerate nature of the genetic code lies in the use of such coding coenzyme handles (CCHs), according to our view (Szathmáry, 1990c). The crucial difference between the CCH hypothesis and the in many ways similar idea of Wong (1991) about selection for RNA peptidation is that in the latter coding does not naturally emerge, because oligopeptides rather than amino acids are assumed to be linked to RNA.

As discussed above, many aptamers that bind amino acids are self-charging, but this is not necessary, as shown by a series of remarkable experiments performed in Hiroaki Suga’s lab. First, they selected for a ribozyme that self-acylates at the 3′ end of its integral tRNA part (Saito et al., 2001a). The leader sequence at the 5′ end can be cut off from this pre-tRNA by RNase P, and the released leader is in turn able to aminoacylate the tRNA part in trans (Saito et al., 2001b). The cognate amino acid Phe is recognized by a U/A-rich (codonic/anticodonic!) region, whereas the 3′ end of tRNA is recognized by a triplet in the ribozyme. Perhaps even more interesting is the next series of experiments, in which a somewhat similar bifunctional ribozyme charging Gln was selected for (Lee et al., 2000). One domain of this ribozyme recognizes an activated glutaminyl ester and charges it to its own 5′-OH group (thus forming a covalent intermediate). The other domain binds tRNA and transfers the aminoacyl group to its 3′ end. The first domain can be separated from the other and can be significantly simplified. The glutamine-binding domain can be reduced to 29 nucleotides and can aminoacylate the other domain in trans (Lee and Suga, 2001). Interestingly, a paired cognate codon:anticodon triplet is essential in this domain for the recognition of Gln.

These experiments lend support to the idea that short adapter molecules (handles) could have been specifically charged by synthetase ribozymes (Szathmáry, 1993, 1999). Where on the adapter this was carried out is an open question. Woese (1967) and Szathmáry (1999) suggested that the primordial site for amino acid linking may have been nucleotide 37, immediately adjacent to the anticodon. Not only is this position universally modified, but several amino acids play a part in the process, in a way that makes sense in the light of scenarios about evolution of the genetic code (cf. Szathmáry, 1999). As suggested by Wong (1991), primordial amino acids may have been N-linked, because the ester bond is too labile.

Surprising suggestive evidence came to light recently. In E. coli the tRNA^Asp molecule is modified in its wobble position. The queuosine residue is glutamylated by a truncated version of glutamyl-tRNA synthetase (Blaise et al., 2004; Salazar et al., 2004) that recognizes the anticodon stem and loop of tRNA^Asp. This finding raises the possibility of a role of other synthetase-related molecules in tRNA modifications as well as the possibility of similar activity in the RNA world, catalyzed by ribozymes. Testing this idea again calls for further experimentation.

We mentioned that, perhaps surprisingly, codons and/or anticodons tend to show up in aptamers selected to bind amino acids. Does nature want to tell us something or is this merely due to chance? The most recent analysis (Caporaso et al., 2005) favors the latter view. As was noted by Szathmáry (1999), it is not only codons but also anticodons that show up more frequently than expected by chance in selected aptamers. Indeed now the data set (including aptamers for Gln, Tyr, Leu, Ile, Phe, His, Trp, Arg) seems to be biased towards anticodons, as advocated in the CCH model (Szathmáry, 1993).

We note that there are still many puzzles around tRNA to be sorted out. Krzyzaniak et al. (1994) found that Phe and Met can be specifically charged to their cognate tRNAs in the absence of enzymes at high pressure. Should this result be confirmed, it would show that at least these two tRNAs are barophilic aptamers, which raises interesting thoughts in connection with the deep-sea hot-spring scenario for the origin of life (reviewed by Hazen et al., 2002) and the need to select for extremophilic aptamers in general. Another odd result is that histidine and its anticodon GpUpG act similarly as catalysts in certain in vitro metabolic reactions (Shimizu, 2004). Clearly, it would be very important to test and possibly generalize these observations. We are confident that many surprising results will come to light concerning aptamers and their role in evolution.
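The coding property that the CCH hypothesis attributes to handle charging in this section, unambiguous but degenerate, can be stated compactly as two checks on a set of (handle, amino acid) charging events. The sketch below is purely illustrative; the handle labels h1 to h3 are hypothetical and do not correspond to any real sequence.

```python
from collections import defaultdict


def is_unambiguous(charging_pairs):
    """True if no handle is charged with more than one amino acid."""
    by_handle = defaultdict(set)
    for handle, amino_acid in charging_pairs:
        by_handle[handle].add(amino_acid)
    return all(len(aas) == 1 for aas in by_handle.values())


def is_degenerate(charging_pairs):
    """True if at least one amino acid is carried by more than one handle."""
    by_amino_acid = defaultdict(set)
    for handle, amino_acid in charging_pairs:
        by_amino_acid[amino_acid].add(handle)
    return any(len(handles) > 1 for handles in by_amino_acid.values())


if __name__ == "__main__":
    # Allowed: two hypothetical handles carry Phe, one carries Gln.
    allowed = [("h1", "Phe"), ("h2", "Phe"), ("h3", "Gln")]
    print(is_unambiguous(allowed), is_degenerate(allowed))

    # Forbidden: the same handle h1 is charged with two different amino acids.
    forbidden = [("h1", "Phe"), ("h1", "Gln")]
    print(is_unambiguous(forbidden), is_degenerate(forbidden))
```

A charging scheme that passes the first check and also satisfies the second is a many-to-one map from handles to amino acids, which is the formal shape of the canonical code.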

3.6 Outlook

Certain stages of evolution can only be reconstructed by inference from results of comparative and experimental analysis. The ultimate goal is to construct scenarios for certain critical transitions that are plausible and preferably make some predictions that were not part of the initial assumptions and that can be tested. Analysis of present-day living systems gives us strong hints that an RNA world in some form existed for a while.

Reconstruction of that world can only be partial, and should rather be based on experiments that prove certain principles. Aptamers eminently serve this purpose for the RNA world: the experimental results of their in vitro evolution can justifiably be called breathtaking. In this chapter we have addressed two key issues: how the fitness of RNA replicators is affected by mutations, and how coenzymes could have complemented the RNA world and how they, in a literary sense, could have betrayed it by paving a way out of it into the protein world. Neither of these issues can be taken as settled, however. Although we have put forward a detailed analysis of the function landscape of an RNA enzyme, and have used this to infer the fitness landscape of RNA replicators, many more data are needed in order to be conclusive. In a similar vein, hypotheses centered on the evolutionary role of coenzymes should be more rigorously tested. More ribozymes utilizing cofactors, including various amino acids, should be selected for. Ribozymes that charge other RNA molecules with amino acids should be analyzed to see whether codonic or anticodonic triplets predominate in the critical binding sites. Ultimately, some crucial steps in the takeover of ribozymes by protein enzymes need to be re-enacted in experiments and detailed models.
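As a starting point for the proposed analysis of binding sites, the sketch below counts codonic and anticodonic triplets in a single site and compares the counts with a naive expectation. It assumes that the anticodon is the reverse complement of the codon (ignoring wobble and modified bases) and that a random site has uniform, independent base composition; the example sequence is invented for illustration and is not a published aptamer. A real test would need many sequences and a proper null model.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_triplets(site, triplets):
    """Count overlapping occurrences of any of `triplets` in `site`."""
    wanted = set(triplets)
    return sum(1 for i in range(len(site) - 2) if site[i:i + 3] in wanted)


def expected_count(site_length, n_triplets):
    """Expected count under uniform, independent base composition."""
    return max(site_length - 2, 0) * n_triplets / 64.0


if __name__ == "__main__":
    phe_codons = {"UUU", "UUC"}  # standard genetic code
    phe_anticodons = {reverse_complement(c) for c in phe_codons}
    site = "GGAAACUUCGG"  # hypothetical binding site, for illustration only
    print(count_triplets(site, phe_codons),
          count_triplets(site, phe_anticodons),
          expected_count(len(site), len(phe_codons)))
```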

Acknowledgments

Á.K. is supported by a postdoctoral fellowship from the Hungarian Scientific Research Fund (OTKA D048406). M.S. is supported by Fundación Ramón Areces (Spain). E.Sz. is supported by the Hungarian Scientific Research Fund (OTKA). The sponsorship from COST-European Cooperation in the field of Scientific and Technical Research, Internal Project number D27/0003/02 “Emergence and Selection of Networks of Catalytic Species” (URL: http://cost.cordis.lu/src/extranet/publish/D27WGP/d27–0003–02.cfm), and from COST D27 working group (D27/0005/03) entitled “Etiology, Replication, Activity and Persistence of RNA” (URL: http://cost.cordis.lu/src/extranet/publish/D27WGP/d27–0005–03.cfm), is very much appreciated.


References

Aita, T., Iwakura, M., Husimi, Y. (2001). A cross-section of the fitness landscape of dihydrofolate reductase. Protein Eng 14, 633–638.

Aita, T., Hamamatsu, N., Nomiya, Y., Uchiyama, H., Shibanaka, Y., Husimi, Y. (2002). Surveying the local fitness landscape of a protein with epistatic sites for the study of directed evolution. Biopolymers 64, 95–105.

Alves, D., Fontanari, J. F. (1998). Error threshold in finite populations. Phys Rev E 57, 7008–7013.

Ananvoranich, S., Perreault, J.-P. (1998). Substrate specificity of δ ribozyme cleavage. J Biol Chem 273, 13182–13188.

Andäng, M., Hinkula, J., Hotchkiss, G., Larsson, S., Britton, S., Wong-Staal, F., Wahren, B., Åhrlund-Richter, L. (1999). Dose-response resistance to HIV-1/MuLV pseudotype virus ex vivo in a hairpin transgenic mouse model. Proc Natl Acad Sci USA 96, 12749–12753.

Andäng, M., Maijgren-Steffensson, C., Hinkula, J., Åhrlund-Richter, L. (2004). Cis-cleavage affects hammerhead and hairpin ribozyme steady-state levels differently and has strong impact on trans-targeting efficiency. Oligonucleotides 14, 11–21.

Andersen, A. A., Collins, R. A. (2001). Intramolecular secondary structure rearrangement by the kissing interaction of the Neurospora VS ribozyme. Proc Natl Acad Sci USA 98, 7730–7735.

Anderson, P., Monforte, J., Tritz, R. R., Nesbitt, S., Hearst, J., Hampel, A. (1994). Mutagenesis of the hairpin ribozyme. Nucleic Acids Res 22, 1096–1100.


Araki, M., Okuno, Y., Hara, Y., Sugiura, Y. (1998). Allosteric regulation of a ribozyme activity through ligand-induced conformational change. Nucleic Acids Res 26, 3379–3384.

Beattie, T. L., Olive, J. E., Collins, R. A. (1995). A secondary-structure model for the self-cleaving region of the Neurospora VS RNA. Proc Natl Acad Sci USA 92, 4686–4690.

Benner, B., Ellington, A. D., Ge, L., Gasfeld, A., Leanz, G. F. (1987). Natural selection, protein engineering, and the last riboorganism: rational model building in biochemistry. Cold Spring Harbor Symp Quant Biol 52, 56–63.

Benner, S. A., Ellington, A. D., Tauer, A. (1989). Modern metabolism as a palimpsest of the RNA world. Proc Natl Acad Sci USA 86, 7054–7058.

Bergstrom, R. C., Mayfield, L. D., Corey, D. R. (2001). A bridge between the RNA and protein worlds? Accelerating delivery of chemical reactivity to RNA and DNA by a specific short peptide (AAKK)4. Chem Biol 8, 199–205.

Björk, G. R. (1995). Biosynthesis and function of modified nucleosides. In: tRNA: Structure, Biosynthesis, and Function, D. Söll, U. L. RajBhandary, eds. Washington, D.C.: ASM Press.

Blaise, M., Becker, H. D., Keith, G., Cambillau, C., Lapointe, J., Giege, R., Kern, D. (2004). A minimalist glutamyl-tRNA synthetase dedicated to aminoacylation of the tRNA^Asp QUC anticodon. Nucleic Acids Res 32, 2768–2775.

Bonhoeffer, S., Stadler, P. F. (1993). Error thresholds on complex fitness landscapes. J Theor Biol 164, 359–372.

Breaker, R. R., Joyce, G. F. (1995). Self-incorporation of coenzymes by ribozymes. J Mol Evol 40, 551–558.

Burgstaller, P., Famulok, M. (1994). Isolation of RNA aptamers for biological cofactors by in vitro selection. Angew Chem Int Ed Engl 33, 1084–1087.

Burke, D. H., Hoffmann, D. (1998). A novel acidophilic RNA motif that recognizes coenzyme A. Biochemistry 37, 4653–4663.

Butcher, S. E., Burke, J. M. (1994a). Structure-mapping of the hairpin ribozyme. Magnesium-dependent folding and evidence for tertiary interactions within the ribozyme-substrate complex. J Mol Biol 244, 52–63.

Butcher, S. E., Burke, J. M. (1994b). A photo-cross-linkable tertiary structure motif found in functionally distinct RNA molecules is essential for catalytic function of the hairpin ribozyme. Biochemistry 33, 992–999.

Caporaso, J. G., Yarus, M., Knight, R. (2005). Error minimization and coding triplet/binding site associations are independent features of the canonical genetic code. J Mol Evol, Epub ahead of print.

Connell, G. J., Yarus, M. (1994). RNAs with dual specificity and dual RNAs with similar specificity. Science 264, 1137–1141.

Crick, F. H. C. (1968). The origin of the genetic code. J Mol Biol 38, 367–379.

Décout, J. L., Maurel, M.-C. (1993). N6-substituted adenine derivatives and RNA primitive catalysts. Origins Life Evol Biosphere 23, 299–306.

Décout, J. L., Vergne, J., Maurel, M.-C. (1995). Synthesis and catalytic activity of adenine containing polyamines. Macromol Chem Phys 196, 2615–2624.

Drossel, B. (2001). Biological evolution and statistical physics. Adv Phys 50, 209–295.

Eigen, M. (1971). Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften 10, 465–523.

Eigen, M. (1985). Macromolecular evolution: Dynamical ordering in sequence space. In: Emerging Synthesis in Science: Proceedings of the Founding Workshops of the Santa Fe Institute, D. Pines, ed. Santa Fe: Santa Fe Institute.

Famulok, M. (1994). Molecular recognition of amino acids by RNA-aptamers: An L-citrulline binding RNA motif and its evolution into an L-arginine binder. J Am Chem Soc 116, 1689–1706.

Famulok, M., Szostak, J. W. (1992). Stereospecific recognition of tryptophan-agarose by in vitro selected RNA. J Am Chem Soc 114, 3990–3991.

Fedor, M. (2000). Structure and function of the hairpin ribozyme. J Mol Biol 297, 269–291.

Fernando, C., Santos, M., Szathmáry, E. (2005). Evolutionary potential and requirements for minimal protocells. Topics in Current Chemistry 259, 167–211.

Flamm, C., Hofacker, I. L., Stadler, P. F. (1999). RNA in silico: The Computational Biology of RNA Secondary Structures. Adv Complex Syst 2, 65–90.

Flinders, J., Dieckmann, T. (2004). The solution structure of the VS ribozyme active site loop reveals a dynamic “hot-spot”. J Mol Biol 341, 935–949.

Fuller, W. D., Sanchez, R. A., Orgel, L. E. (1972). Studies in prebiotic synthesis: VII. Solid-state synthesis of purine nucleosides. J Mol Biol 67, 25–33.

Gánti, T. (1979). Interpretation of prebiotic evolution based on chemoton theory (in Hungarian). Biológia 27, 161–175.

Gánti, T. (2003a). Chemoton Theory. New York, Kluwer Academic/Plenum Publishers.

Gánti, T. (2003b). The Principles of Life. Oxford, Oxford University Press.

Geiger, A., Burgstaller, P., von der Eltz, H., Roeder, A., Famulok, M. (1996). RNA aptamers that bind L-arginine with sub-micromolar dissociation constants and high enantioselectivity. Nucleic Acids Res 24, 1029–1036.

Gilbert, W. (1986). The RNA world. Nature 319, 618.

Green, R., Lorsch, J. R. (2002). The path to perdition is paved with protons. Cell 110, 665–668.

Guo, H. C. T., Collins, R. A. (1995). Efficient trans-cleavage of a stem-loop RNA substrate by a ribozyme derived from Neurospora VS RNA. EMBO J 14, 368–376.

Haldane, J. B. S. (1937). The effect of variation on fitness. Am Naturalist 71, 337–349.

Hazen, R. M., Boctor, N., Brandes, J. A., Cody, G. D., Hemley, R. J., Sharma, A., Yoder, H. S. (2002). High pressure and the origin of life. J Phys Condensed Matter 14, 11489–11494.

Hofacker, I. L., Fontana, W., Stadler, P. F., Bonhoeffer, S., Tacker, M., Schuster, P. (1994). Fast folding and comparison of RNA secondary structures. Monatsh Chem 125, 167–188.

Huang, F., Bugg, C. W., Yarus, M. (2000). RNA-Catalyzed CoA, NAD, and FAD synthesis from phosphopantetheine, NMN, and FMN. Biochemistry 39, 15548–15555.

Huynen, M. A. (1996). Exploring phenotype space through neutral evolution. J Mol Evol 43, 165–169.

Huynen, M. A., Stadler, P. F., Fontana, W. (1996). Smoothness within ruggedness: the role of neutrality in adaptation. Proc Natl Acad Sci USA 93, 397–401.

Illangasekare, M., Yarus, M. (1997). Small-molecule-substrate interactions with a self-aminoacylating ribozyme. J Mol Biol 268,631–639.

Illangasekare, M., Yarus, M. (1999). A tinyRNA that catalyzes both aminoacyl-tRNAand peptidyl-RNA synthesis. RNA 5, 1482–1489.

Jadhav, V. R., Yarus, M. (2002a). Acyl-CoAsfrom coenzyme ribozymes. Biochemistry 41,723–729.

Jadhav, V. R., Yarus, M. (2002b). Coenzymes ascoribozymes. Biochimie 84, 877–888.

Jeffries, A. C., Symons, R. H. (1989). A cata-lytic 13-mer ribozyme. Nucleic Acid Res 17,1371–1377.

Jenison, R. D., Gill, S. C., Pardi, A., Polisky, B.(1994). High-resolution molecular discri-mination by RNA. Science 263, 1425–1429.

Johnston, W. K., Unrau, P. J., Lawrence, M. S.,Glasen, M. E., Bartel, D. P. (2001). RNA-catalyzed RNA polymerization: accurate andgeneral RNA-templated primer extension.Science 292, 1319–1325.

Joseph, S., Burke, J. M. (1993). Optimizationof an anti-HIV hairpin ribozyme by in vitroselection. J Biol Chem 268, 24515–24518.

Joseph, S., Berzal-Herranz, A., Chowrira, B.M., Butcher, S. E., Burke, J. M. (1993).Substrate selection rules for the hairpin ri-bozyme determined by in vitro selection,mutation, and analysis of mismatchedsubstrates. Genes Dev 7, 130–138.

Joyce, G. F. (1998). Nucleic acid enzymes: playing with a fuller deck. Proc Natl Acad Sci USA 95, 5845–5847.

Joyce, G. F. (2002). The antiquity of RNA-based evolution. Nature 418, 214–220.

Kauffman, S. A. (1993). The Origins of Order. Oxford, Oxford University Press.

Kiga, D., Futamura, Y., Sakamoto, K., Yokoyama, S. (1998). An RNA aptamer to the xanthine/guanine base with a distinctive mode of purine recognition. Nucleic Acid Res 26, 1755–1760.

Klostermeier, D., Millar, D. P. (2002). Energetics of hydrogen bond networks in RNA: Hydrogen bonds surrounding G+1 and U42 are the major determinants for the tertiary structure stability of the hairpin ribozyme. Biochemistry 41, 14095–14102.

Klussmann, S., Nolte, A., Bald, R., Erdmann, V. A., Fürste, J. P. (1996). Mirror-image RNA that binds D-adenosine. Nat Biotechnol 14, 1112–1115.

Koizumi, M., Soukup, G. A., Kerr, J. N., Breaker, R. R. (1999). Allosteric selection of ribozymes that respond to the second messengers cGMP and cAMP. Nat Struct Biol 6, 1062–1071.

Kole, R., Altman, S. (1981). Properties of purified ribonuclease P from Escherichia coli. Biochemistry 20, 1902–1906.

Korányi, P., Gánti, T. (1981). Coenzymes as remnants of ancient enzyme-RNA molecules (in Hungarian). Biológia 29, 107–124.

Krzyzaniak, A., Baciszewski, J., Salanski, P., Jurczak, J. (1994). The non-enzymatic specific aminoacylation of transfer RNA at high pressure. Int J Biol Macromol 16, 153–158.

Kumar, R. K., Yarus, M. (2001). RNA-catalyzed amino acid activation. Biochemistry 40, 6998–7004.

Kumar, P. K. R., Suh, Y.-A., Miyashiro, H., Nishikawa, F., Kawakami, J., Taira, K., Nishikawa, S. (1992). Random mutation to evaluate the role of bases at two important single-stranded regions of genomic HDV ribozyme. Nucleic Acid Res 20, 3919–3924.

Lafontaine, D. A., Norman, D. G., Lilley, D. M. J. (2001a). Structure, folding and activity of the VS ribozyme: importance of the 2–3–6 helical junction. EMBO J 20, 1415–1424.

Lafontaine, D. A., Wilson, T. J., Norman, D. G., Lilley, D. M. J. (2001b). The A730 loop is an important component of the active site of the VS ribozyme. J Mol Biol 312, 663–674.

Lafontaine, D. A., Norman, D. G., Lilley, D. M. J. (2002a). Folding and catalysis by the VS ribozyme. Biochimie 84, 889–896.

Lafontaine, D. A., Norman, D. G., Lilley, D. M. J. (2002b). The global structure of the VS ribozyme. EMBO J 21, 2461–2471.

Lafontaine, D. A., Norman, D. G., Lilley, D. M. J. (2002c). The structure and active site of the Varkud satellite ribozyme. Biochem Soc Trans 30, 1170–1175.

Lafontaine, D. A., Wilson, T. J., Zhao, Z.-Y., Lilley, D. M. J. (2002d). Functional group requirements in the probable active site of the VS ribozyme. J Mol Biol 323, 23–34.

Landweber, L. F., Simon, P. J., Wagner, T. A. (1998). Ribozyme engineering and early evolution. BioScience 48, 94–103.

Lauhon, C. T., Szostak, J. W. (1995). RNA aptamers that bind flavin and nicotinamide redox cofactors. J Am Chem Soc 117, 1246–1257.

Lee, N., Suga, H. (2001). A minihelix-loop RNA acts as a trans-aminoacylation catalyst. RNA 7, 1043–1051.

Lee, N., Bessho, Y., Wei, K., Szostak, J. W., Suga, H. (2000). Ribozyme-catalyzed tRNA aminoacylation. Nat Struct Biol 7, 28–33.

Lehman, N., Joyce, G. F. (1993). Evolution in vitro: analysis of a lineage of ribozymes. Curr Biol 3, 723–734.

Lilley, D. M. J. (1999). RNA folding and catalysis. Genetica 106, 95–102.

Lorsch, J. R., Szostak, J. W. (1994). In vitro selection of RNA aptamers specific for cyanocobalamin. Biochemistry 33, 973–982.

Lozupone, C., Changayil, S., Majerfeld, I., Yarus, M. (2003). Selection of the simplest RNA that binds isoleucine. RNA 9, 1315–1322.

Luther, A., Brandsch, R., von Kiedrowski, G. (1998). Surface-promoted replication and exponential amplification of DNA analogues. Nature 396, 245–248.

Majerfeld, I., Yarus, M. (1994). An RNA pocket for an aliphatic hydrophobe. Nature Struct Biol 1, 287–292.

Makino, S., Chang, M. F., Shieh, C. K., Kamahora, T., Vannier, D. M., Govindarajan, S., Lai, M. M. (1987). Molecular cloning and sequencing of a human hepatitis delta (δ) virus RNA. Nature 329, 343–346.

Maurel, M.-C. (1992). RNA in evolution. J Evol Biol 5, 173–188.

Maurel, M.-C., Décout, J. L. (1999). Origins of life: molecular foundations and new approaches. Tetrahedron 55, 3141–3182.

Maurel, M.-C., Haenni, A. L. (2005). The RNA world: Hypothesis, facts and experimental results. In: Lectures in Astrobiology, B. Barbier, M. Gargaud, H. Martin, J. Reisse, eds. Berlin: Springer-Verlag, pp. 557–581.

Maurel, M.-C., Ninio, J. (1987). Catalysis by a prebiotic nucleotide analog of histidine. Biochimie 69, 551–553.

Maynard Smith, J. (1970). Natural selection and the concept of protein space. Science 225, 563–564.

Maynard Smith, J., Szathmáry, E. (1995). The Major Transitions in Evolution. Oxford: W.H. Freeman.

McGinness, K. E., Wright, M. C., Joyce, G. F. (2002). Continuous in vitro evolution of a ribozyme that catalyzes three successive nucleotidyl addition reactions. Chem Biol 9, 585–596.

Meli, M., Vergne, J., Décout, J. L., Maurel, M.-C. (2002). Adenine–aptamer complexes. A bipartite RNA site which binds the adenine nucleic base. J Biol Chem 277, 2104–2111.

Meli, M., Vergne, J., Maurel, M.-C. (2003). In vitro selection of adenine-dependent hairpin ribozymes. J Biol Chem 278, 9835–9842.

Muth, G. W., Ortoleva-Donnelly, L., Strobel, S. A. (2000). A single adenosine with a neutral pKa in the ribosomal peptidyl transferase center. Science 289, 947–950.

Muth, G. W., Chen, L., Kosek, A. B., Strobel, S. A. (2001). pH-dependent conformational flexibility within the ribosomal peptidyl transferase center. RNA 7, 1403–1415.

Neunlist, S., Bisseret, P., Rohmer, M. (1987). The hopanoids of the purple non-sulfur bacteria Rhodopseudomonas palustris and Rhodopseudomonas acidophila and the absolute configuration of bacteriohopanetetrol. Eur J Biochem 171, 245–252.

Nikolova, P. V., Henckel, J., Lane, D. P., Fersht, A. R. (1998). Semirational design of active tumor suppressor p53 DNA binding domain with enhanced stability. Proc Natl Acad Sci USA 95, 14675–14680.

Nishikawa, F., Fauzi, H., Nishikawa, S. (1997). Detailed analysis of base preferences at the cleavage site of a trans-acting HDV ribozyme: a mutation that changes cleavage site specificity. Nucleic Acid Res 25, 1605–1610.

Noller, H. F., Hoffarth, V., Zimniak, L. (1992). Unusual resistance of peptidyl transferase to protein extraction procedures. Science 256, 1416–1419.

Nolte, A., Klussmann, S., Bald, R., Erdmann, V. A., Fürste, J. P. (1996). Mirror-design of L-oligonucleotide ligands binding to L-arginine. Nat Biotechnol 14, 1116–1119.

Orgel, L. E. (1968). Evolution of the genetic apparatus. J Mol Biol 38, 381–393.

Orgel, L. E. (1986). RNA catalysis and the origins of life. J Theor Biol 123, 127–149.

Oró, J. (1960). Synthesis of adenine from ammonium cyanide. Biochem Biophys Res Commun 2, 407–412.

Pace, N. R., Marsh, T. L. (1985). RNA catalysis and the origin of life. Origins Life Evol Biosphere 16, 97–116.

Pérez-Ruiz, Barroso-delJesus, A., Berzal-Herranz, A. (1999). Specificity of the hairpin ribozyme. J Biol Chem 274, 29376–29380.

Perrotta, A. T., Been, M. D. (1991). A pseudoknot-like structure required for efficient self-cleavage of hepatitis delta virus RNA. Nature 350, 434–436.

Piccirilli, J. A., Krauch, T., Moroney, S. E., Benner, S. A. (1990). Enzymatic incorporation of a new base pair into DNA and RNA extends the genetic alphabet. Nature 343, 33–37.

Rastogi, T., Collins, R. A. (1998). Smaller, faster ribozymes reveal the catalytic core of Neurospora VS RNA. J Mol Biol 277, 215–224.

Rastogi, T., Beattie, T. L., Olive, J. E., Collins, R. A. (1996). A long-range pseudoknot is required for activity of the Neurospora VS ribozyme. EMBO J 15, 2820–2825.

Reader, J. S., Joyce, G. F. (2002). A ribozyme composed of only two different nucleotides. Nature 420, 841–844.

Ricard, J., Vergne, J., Décout, J. L., Maurel, M.-C. (1996). The origin of kinetic cooperativity in prebiotic catalysts. J Mol Evol 43, 315–325.

Robertson, M. P., Ellington, A. D. (1999). In vitro selection of an allosteric ribozyme that transduces analytes to amplicons. Nat Biotechnol 17, 62–66.

Robertson, M. P., Ellington, A. D. (2000). Design and optimization of effector-activated ribozyme ligases. Nucleic Acid Res 28, 1751–1759.

Roth, A., Breaker, R. R. (1998). An amino acid as a cofactor for a catalytic polynucleotide. Proc Natl Acad Sci USA 95, 6027–6031.

Rupert, P. B., Massey, A., Sigurdsson, S. T., Ferré-D’Amaré, A. R. (2002). Transition state stabilization by a catalytic RNA. Science 298, 1421–1424.

Saito, H., Kourouklis, D., Suga, H. (2001a). An in vitro evolved precursor tRNA with aminoacylation activity. EMBO J 20, 1797–1878.

Saito, H., Watanabe, K., Suga, H. (2001b). Concurrent molecular recognition of the amino acid and tRNA by a ribozyme. RNA 7, 1867–1878.

Salazar, J. C., Ambrogelly, A., Crain, P. F., McCloskey, J. A., Söll, D. (2004). A truncated aminoacyl-tRNA synthetase modifies RNA. Proc Natl Acad Sci USA 101, 7536–7541.

Sandberg, W. S., Terwilliger, T. C. (1993). Engineering multiple properties of a protein by combinatorial mutagenesis. Proc Natl Acad Sci USA 90, 8367–8371.

Santos, M., Zintzaras, E., Szathmáry, E. (2004). Recombination in primeval genomes: a step forward but still a long leap from maintaining a sizeable genome. J Mol Evol 59, 507–519.

Sarai, A., Takeda, Y. (1989). Lambda repressor recognizes the approximately 2-fold symmetric half-operator sequences asymmetrically. Proc Natl Acad Sci USA 86, 6513–6517.

Sargueil, B., Pecchia, D. B., Burke, J. M. (1995). An improved version of the hairpin ribozyme complex functions as a ribonucleoprotein complex. Biochemistry 34, 7739–7748.

Sargueil, B., McKenna, J., Burke, J. M. (2000). Analysis of the functional role of GA sheared base pair by in vitro genetics. J Biol Chem 275, 32157–32166.

Sassanfar, M., Szostak, J. W. (1993). An RNA motif that binds ATP. Nature 364, 550–553.

Saville, B. J., Collins, R. A. (1990). A site-specific self-cleavage reaction performed by a novel RNA in Neurospora mitochondria. Cell 61, 685–696.

Schuster, P. (1986). The physical basis of molecular evolution. Chem Scripta 26B, 27–41.

Schuster, P. (1987). Structure and dynamics of replication-mutation systems. Phys Scripta 35, 402–416.

Schuster, P., Fontana, W., Stadler, P. F., Hofacker, I. L. (1994). From sequences to shapes and back: a case study in RNA secondary structures. Proc R Soc Lond B 255, 279–284.

Serrano, L., Day, A. G., Fersht, A. R. (1993). Step-wise mutation of barnase to binase. A procedure for engineering increased stability of proteins and an experimental analysis of the evolution of protein stability. J Mol Biol 20, 305–312.

Sharp, P. A. (1985). On the origin of RNA splicing and introns. Cell 42, 397–400.

Shimizu, M. (2004). Histidine and its anticodon GpUpG are similar metabolic reaction rate enhancers: Molecular origin of the genetic code. J Phys Soc Jpn 73, 323–326.

Shippy, R., Siwkowski, A., Hampel, A. (1998). Mutational analysis of loop 1 and 5 of the hairpin ribozyme. Biochemistry 37, 564–570.

Siwkowski, A., Shippy, R., Hampel, A. (1997). Analysis of hairpin ribozyme base mutations in loop 2 and 4 and their effects on cis-cleavage in vitro. Biochemistry 63, 3930–3940.

Skinner, M. M., Terwilliger, T. C. (1996). Potential use of additivity of mutational effects in simplifying protein engineering. Proc Natl Acad Sci USA 93, 10753–10757.

Sood, V. D., Collins, R. A. (2002). Identification of the catalytic subdomain of the VS ribozyme and evidence for remarkable sequence tolerance in the active site loop. J Mol Biol 320, 443–454.

Soukup, G. A., Breaker, R. R. (1999). Engineering precision RNA molecular switches. Proc Natl Acad Sci USA 96, 3584–3589.

Spirin, A. S. (2002). Omnipotent RNA. FEBS Lett 530, 4–8.

Stadler, P. F., Haslinger, C. (1999). RNA structure with pseudo-knots: graph-theoretical and combinatorial properties. Bull Math Biol 61, 437–467.

Sullenger, B. A., Gilboa, E. (2002). Emerging clinical applications of RNA. Nature 418, 252–258.

Sun, L., Cui, Z., Gottlieb, R. L., Zhang, B. (2002). A selected ribozyme catalyzing diverse dipeptide synthesis. Chem Biol 9, 619–628.

Szathmáry, E. (1989). The emergence, maintenance, and transitions of the earliest evolutionary units. Oxford Surveys Evol Biol 6, 169–205.

Szathmáry, E. (1990a). RNA worlds in test tubes and protocells. In: Organizational Constraints on the Dynamics of Evolution, J. Maynard Smith, G. Vida, eds. Manchester: Manchester University Press, pp. 3–14.

Szathmáry, E. (1990b). Towards the evolution of ribozymes. Nature 344, 115.

Szathmáry, E. (1990c). Useful coding before translation: the coding coenzymes handle hypothesis for the origin of the genetic code. In: Evolution: from Cosmogenesis to Biogenesis, B. Lukács, S. Bérczi, I. Molnár, G. Paál, eds. Budapest, KFKI-1990–50/C, pp. 77–83.

Szathmáry, E. (1991). Four letters in the genetic alphabet: a frozen evolutionary optimum? Proc R Soc Lond B 245, 91–99.

Szathmáry, E. (1992). What determines the size of the genetic alphabet? Proc Natl Acad Sci USA 89, 2614–2618.

Szathmáry, E. (1993). Coding coenzyme handles: A hypothesis for the origin of the genetic code. Proc Natl Acad Sci USA 90, 9916–9920.

Szathmáry, E. (1996). Coding coenzyme handles and the origin of the genetic code. In: From Simplicity to Complexity in Chemistry – and Beyond. Part I., A. Müller, A. Dress, F. Vögtle, eds. Braunschweig: Vieweg, pp. 33–41.

Szathmáry, E. (1999). The origin of the genetic code – amino acids as cofactors in an RNA world. Trends Genet 15, 223–229.

Szathmáry, E. (2003). Why are there four letters in the genetic alphabet? Nat Rev Genet 4, 995–1001.

Szathmáry, E., Demeter, L. (1987). Group selection of early replicators and the origin of life. J Theor Biol 128.

Szathmáry, E., Maynard Smith, J. (1997). From replicators to reproducers: The first major transitions leading to life. J Theor Biol 187, 555–571.

Sazani, P. L., Larralde, R., Szostak, J. W. (2004). A small aptamer with strong and specific recognition of the triphosphate of ATP. J Am Chem Soc 126, 8370–8371.

Tang, J., Breaker, R. R. (1997). Rational design of allosteric ribozymes. Chem Biol 4, 453–459.

Takeda, Y., Sarai, A., Rivera, V. M. (1989). Analysis of the sequence-specific interactions between Cro repressor and operator DNA by systematic base substitution experiments. Proc Natl Acad Sci USA 86, 439–443.

Travascio, P., Bennet, A. J., Wang, D. Y., Sen, D. (1999). A ribozyme and a catalytic DNA with peroxidase activity: active sites versus cofactor-binding sites. Chem Biol 6, 779–787.

Trémolières, A. (1980). Nucleotidic cofactors and the origin of the genetic code. Biochimie 62, 493–496.

Tsukiji, S., Pattnaik, S. B., Suga, H. (2004). Reduction of an aldehyde by a NADH/Zn2+-dependent redox active ribozyme. J Am Chem Soc 126, 5044–5045.

van Nimwegen, E., Crutchfield, J. P., Huynen, M. A. (1999). Neutral evolution of mutational robustness. Proc Natl Acad Sci USA 96, 9716–9720.

Viladkar, S. M. (2002). Guanine rich oligonucleotide-amino acid/peptide conjugates: preparation and characterization. Tetrahedron 58, 495–502.

von Kiedrowski, G. (1986). A self-replicating hexadeoxy nucleotide. Angew Chem Int Ed Engl 25, 932–935.

White, H. B. (1976). Coenzymes as fossils of an earlier metabolic state. J Mol Evol 7, 101–104.

Wilson, C., Nix, J., Szostak, J. W. (1998). Functional requirements for specific ligand recognition by a biotin-binding RNA pseudoknot. Biochemistry 37, 14410–14419.

Winkler, W. C., Breaker, R. R. (2003). Genetic control by metabolite-binding riboswitches. ChemBioChem 4, 1024–1032.

Winkler, W. C., Nahvi, A., Roth, A., Collins, J. A., Breaker, R. R. (2004). Control of gene expression by a natural metabolite-responsive ribozyme. Nature 428, 281–286.

Woese, C. R. (1967). The Genetic Code. New York: Harper & Row.

Wong, J. T. F. (1991). Origin of genetically encoded protein synthesis: a model based on selection for RNA peptidation. Origins Life Evol Biosphere 21, 165–176.

Wright, S. (1932). The roles of mutation, inbreeding, crossbreeding, and selection in evolution. Paper presented at: Proceedings of the Sixth International Congress on Genetics.

Yu, Q., Pecchia, D. B., Kingsley, S. L., Heckman, J. E., Burke, J. M. (1998). Cleavage of highly structured viral RNA molecules by combinatorial libraries of hairpin ribozymes. J Biol Chem 273, 23524–23533.

Zaug, A. J., Cech, T. R. (1982). The intervening sequence excised from the ribosomal RNA precursor of Tetrahymena contains a 5′-terminal guanosine residue not encoded by the DNA. Nucleic Acid Res 10, 2823–2838.

Zhang, X. J., Baase, W. A., Shoichet, B. K., Wilson, K. P., Matthews, B. W. (1995). Enhancement of protein stability by the combination of point mutations in T4 lysozyme is additive. Protein Eng 8, 1017–1022.

Zhang, W., Xie, Q., Zhou, X.-Q., Jiang, S., Jin, Y.-X. (2004). Expression and in vitro cleavage activity of anti-caspase-7 hammerhead ribozymes. World J Gastroenterol 10, 2078–2081.

Zintzaras, E., Mauro, S., Szathmáry, E. (2002). “Living” under the challenge of information decay: the stochastic corrector model versus hypercycles. J Theor Biol 217, 167–181.

Zuker, M., Mathews, D. H., Turner, D. H. (1999). Algorithms and thermodynamics for RNA secondary structure prediction: a practical guide. In: RNA Biochemistry and Biotechnology, J. Barciszewski, B. F. C. Clark, eds. Dordrecht: Kluwer Academic.

Part 2 In Vitro Selection of Target-binding Oligonucleotides
