

Review Article

The Origin of Life - A Task in Molecular Engineering

HANS KUHN Max-Planck-Institut für biophysikalische Chemie, P.O.B. 2841, Am Fassberg, D-3400 Göttingen, F.R.G.

(Received: 8 April 1991; accepted: 18 October 1991)

Abstract. In attempting to understand how life originated, we search for a detailed sequence of experimentally testable physico-chemical steps in an appropriately structured system. This goal is approached in two stages. First we search for the organizational structure of processes leading to systems with the basic features of living organisms. This is an engineering problem: finding a certain construct by taking care of logical requirements and restrictions from physics. Then we face this construct with the chemical and geophysical reality, and this leads to the view that systems with the essential features of early living organisms evolve following a distinct pathway. Energy supply and the presence of a particular structure in space and time are necessary to induce and drive the processes triggered by stochastic events; but if these particular conditions are given, the broad line of the evolutionary processes is determined by logical requirements and by chemical and geophysical constraints and invariants. The genetic machinery considered to evolve in this manner agrees, in its organizational structure and in many details, with the actual genetic machinery of biosystems. A surprising simplicity and transparency is observed in the logic of the basic processes involved in the origin of life.

In the present view, the processes leading to the origin of life begin in a very particular, highly structured, small region where the relevant chemistry can be quite different from overall prebiotic chemistry. Energy-rich compounds are present in ample amounts and a succession of physico-chemical processes, which are per se thermodynamically allowed, takes place. This is in contrast to popular views that the origin of life is connected with fundamental thermodynamic questions related to the problem of getting order out of chaos.

Key words. Origin of life, molecular engineering, biology, evolution, genetic code, translation machine, self instruction.

Introduction

The present approach is based on a theoretical model proposed in [1] and later refined in [2-6]. A comparison with other approaches is given there. In [7] we discuss the model in connection with the recent comprehensive approach by Christian de Duve [8]. In papers [1-6] each step is described as realistically as possible by molecular modeling and detailed quantitative estimates. The purpose of the present paper is to focus on essential points, emphasizing the logical simplicity and transparency of the basic processes in the origin of life. The benefit of simple logical considerations in understanding fundamental features of living matter is most apparent from the classical work of von Neumann on self-reproducing automata [9], which resulted in the prediction of essential features of the genetic apparatus.

Molecular Engineering 1: 377-399, 1992. © 1992 Kluwer Academic Publishers. Printed in the Netherlands.


Fig. 1. Diagram: biology and engineering both deal with functional units of cooperating components; at the molecular level the corresponding fields are molecular biology and molecular engineering.

1. Origin of Life: Diversified Microstructure Inducement to Form Intricate Machinery

The problem of how life originated is approached by theoretically modeling a process that leads to the formation of a machine with the properties of simple living systems; i.e., the problem is considered as a task in molecular engineering. This term was used in the early sixties to describe a challenging new task in chemistry (Figure 1). Biosystems and machines have a common feature: their components cooperate. Molecular biology made it clear that this cooperation in biology is based on cooperation at the molecular level: biosystems are considered to be molecular machines. Molecular engineering, the fabrication of systems in which single molecules cooperate to form functional units, appeared as a counterpart of molecular biology: its aim is to plan the chemical synthesis of different kinds of molecules that interlock and interact to form a functional entity. Prototypes of molecular machines were made by using monomolecular layers as pincers to pick up single molecules and by superimposing layers to arrange different kinds of molecules in a planned manner. Today, chemical synthesis aimed at constructing supramolecular entities is in a most active state of development.

Thinking in terms of molecular machines, in addition to constructing them by synthesizing and assembling interlocking functional components, should be important in anticipating future developments in molecular engineering. An obvious question related to the goal of fabricating molecular machines is how the natural molecular machines, the early forms of life, came about.

Firstly, we shall approach this question from a general point of view by considering particular elements exposed to certain conditions and investigating the processes that take place. We ask: what are the particular conditions that drive towards the formation of systems with the basic features of early living organisms? The study (Section 2) shows that the systems evolve according to a distinct logical pattern when free-energy-carrying elements with some specific properties interact. The presence of a particular structure in space and time is necessary to drive the processes that are initiated by stochastic events. The general line of evolution is then essentially determined by logical requirements.

Secondly, we shall ask (Section 3): are these elements and conditions realistic from the viewpoint of physics and chemistry? Can such systems be formed in a sequence of many physico-chemical steps? What are the chemical constraints and invariants in the design of processes that lead to the formation of increasingly complex and intricate interlocking components constituting systems with the basic nature of biosystems? What are the chemical constraints in obtaining possible molecular building blocks to be used in these processes? Can such molecules be available on a primeval earth and can the microstructure be present?

Fig. 2. Specific monomers carrying free energy, placed in a particular micro-structure (periodicity in time, compartmentation, micro-heterogeneity), assemble into increasingly intricate interlocking components that are stepwise less dependent on the highly specific environment. The result is a molecular machine with the basic features of living systems: ability to reproduce; occasional copying errors by mismatch; erroneous copies still able to reproduce.

The paradigm considered here sees the problem of how life originated as a distinct task in molecular engineering: finding the basic structure in space and time that drives the process. It is in contrast to the familiar paradigm of life having originated in the ocean of a primeval soup by a process that can be described, in essence, as self-organization in a homogeneous phase in a stationary state.

The two paradigms give different answers to the familiar question whether the origin of life is thermodynamically possible.

In the primeval soup paradigm the origin of life is seen as a fundamental thermodynamic problem related to the question of the emergence of order out of chaos, and this has often caused difficulties in accepting the origin of life as a consequence of the laws of physics.

In the view of the molecular engineering paradigm, an answer to the question can only be approached by inventing a detailed sequence of concrete, experimentally testable, physico-chemical steps leading to systems with the properties of simple living systems. Physico-chemical processes are per se thermodynamically allowed, and thus confidence in the assumption that the origin of life is thermodynamically allowed grows with each additional successful attempt to establish such a pathway.

According to the molecular engineering paradigm the highly specific structure in time and space is a conditio sine qua non (Figure 2).

- Periodicity in time is required to drive alternation between multiplication and selection phases (e.g., day/night change).

- Compartmentation is required to allow aggregation and interlocking of the components forming functional units, and to keep off competitors (e.g., porous rock impregnated with aqueous solution).

- Micro-heterogeneity is required as the driving force to increase in complexity and intricacy.

In addition, a highly specific structure is obviously needed to guarantee the supply of the molecular building blocks required for engineering life: syntheses of a number of compounds under various conditions (in the gaseous, liquid and solid phase, in water and excluding water), processes to separate and accumulate compounds by drying and dissolving, adsorption and desorption processes, etc.

Micro-porosity is important for early polymers capable of being copied under certain conditions, on the one hand to block the strands and prevent loss by diffusion, on the other hand to allow supply of energy-rich monomers.

2. Engineering Machines with Properties of Early Living Systems

In this section we focus on what is essential in starting the evolutionary machinery and in keeping it going. We consider the logical barriers and ways to overcome them. For that reason we begin with a particularly simple case and study the effect of slowly increasing the complexity of the initial conditions and boundary conditions.

2.1. EVOLVING SYSTEM MADE OUT OF ONE KIND OF MONOMERS

Consider a solution of some monomers. By some exceptional event (say, evaporation of solvent) short strands are assumed to have formed by linking monomers, and a particular strand is assumed to have the property of forming copies under the given conditions, by weakly binding monomers that join to form a daughter strand (Figure 3a; the matrix strand and the daughter strand can have the same direction (above) or the reverse direction (below) depending on geometry and energetics; this has no consequence until the situation discussed in Figure 5 is reached). The two strands separate, e.g., when the temperature rises. Conditions are assumed to change periodically. Then an increasing number of strands is formed by copying in each subsequent period.

Appropriate conditions are assumed to be given in a fine porous region surrounded by a solution of monomers. The monomers can easily diffuse in and out of the porous region. The strands are largely blocked and thus accumulate in the porous region in subsequent periods.

A stationary situation is reached after a number of periods at a certain concentration of strands. The production of new strands is balanced by the loss of strands (strands may leave the porous region or may be decomposed).

Eventually, as a rare event, two strands may fuse to form a longer strand. Such a strand has no advantage in the given region. On the contrary, it has a disadvantage, since copying takes longer. But now assume a neighboring region with larger pores. Short strands entering that region cannot survive: they are not blocked as they were in the region considered first, and they diffuse away. In contrast, the longer strands are blocked, multiply and accumulate in the new region (Figure 3b). Again, a stationary situation will be reached after a number of periods. By occasional fusion, strands of increasing length evolve in an appropriately porous environment.

The longer strands (the systems of higher complexity than the short strands) populate a region where the short strands cannot survive. Thus an evolution in the direction of higher complexity has taken place.

Fig. 3. One kind of monomers. (a) Spontaneous formation of short strands. Reproduction by matrix-oriented polymerization; above: matrix strand and daughter strand have the same direction; below: matrix strand and daughter strand have opposite directions. An environment consisting of fine pores blocks the strands: loss by diffusion is restricted. (b) Occasional formation of longer strands by fusion of short strands. Longer strands populate a region with larger pores where short strands cannot survive.

It may be asked: Why do we have to start with short strands in modelling this evolutionary process? Why not consider longer strands formed by linking monomers? The monomers must be linked in a well-defined geometry in order to obtain a strand that can act as a matrix for copying under the given conditions. Let $\alpha$ be the probability that two monomers are linked correctly. A chain of $n$ monomers acting as matrix requires $n - 1$ correct links. Thus the probability of spontaneous formation of such a chain is $\alpha^{n-1}$. Assuming $\alpha = 1/100$, this probability is $10^{-8}$ for $n = 5$, $10^{-18}$ for $n = 10$, and $10^{-28}$ for $n = 15$. Thus the probability of finding a correct chain among a number of chains obtained by spontaneous condensation of monomers decreases dramatically with increasing chain length. For $n = 10$ we find one correctly linked strand among $10^{18}$ strands, i.e., after drying 10 milliliters of a millimolar solution of monomers.

The evolution of short strands to strands of increasing length must soon come to an end, since there is always a certain probability for errors in the copying process leading to daughter strands that no longer act as matrices for further copying. This probability increases with the number of links, and at a certain number the proportion of error-free copies becomes too small to allow further multiplication. For the matrix-oriented linkage $\alpha$ can be about 0.95 instead of 1/100 for spontaneous linkage, since the monomers arrange at the matrix appropriately for linking correctly. Then the proportion of correct copies $\alpha^{n-1}$ is 0.5 for $n = 15$ and 0.2 for $n = 30$.
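The two estimates above can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python; the function names and the use of Avogadro's number to count strands are illustrative assumptions, not part of the model description. It evaluates the probability that all n - 1 links of an n-monomer chain are correct, for spontaneous linkage (per-link probability 1/100) and for matrix-oriented linkage (per-link probability 0.95).

```python
AVOGADRO = 6.022e23  # particles per mole


def fraction_correct(alpha: float, n: int) -> float:
    """Probability that all n - 1 links of an n-monomer chain are correct,
    given a per-link probability alpha of a correct linkage."""
    return alpha ** (n - 1)


def expected_correct_strands(volume_l: float, conc_molar: float,
                             n: int, alpha: float) -> float:
    """Expected number of correctly linked n-mers if all monomers in the
    given volume condense into n-mers (an idealizing assumption)."""
    monomers = volume_l * conc_molar * AVOGADRO
    strands = monomers / n
    return strands * fraction_correct(alpha, n)


# Spontaneous linkage, alpha = 1/100 (value used in the text).
for n in (5, 10, 15):
    print(f"spontaneous, n = {n:2d}: {fraction_correct(0.01, n):.0e}")

# Matrix-oriented linkage, alpha = 0.95 (value used in the text).
for n in (15, 30):
    print(f"matrix-oriented, n = {n:2d}: {fraction_correct(0.95, n):.2f}")

# 10 mL of a millimolar monomer solution, condensed into 10-mers.
print(expected_correct_strands(0.010, 1e-3, 10, 0.01))
```

With these inputs the number of correct 10-mers expected from 10 mL of a millimolar solution comes out of order one, in line with the estimate given for spontaneous condensation.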

2.2. EVOLVING SYSTEM MADE OUT OF TWO KINDS OF MONOMERS THAT ARE COMPLEMENTARY

We have assumed one kind of monomers. The copying process requires precise docking of monomers at the matrix strand by interlocking equal groups. This should be difficult to achieve in practical systems. Interlocking complementary parts should be easier (Figure 4a).

For that reason we consider a solution of two kinds of monomers that are linked in an exceptional event, forming short strands of arbitrary sequence (Figure 4b). The monomers are complementary, and a particular strand occurs spontaneously that acts as a matrix inducing the formation of a complementary strand under the given conditions. Again, this strand and the daughter strand separate. In an appropriate, periodically changing environment a similar evolutionary process takes place as discussed in Section 2.1. Longer strands evolve from the short strands. Due to the more precise docking, $\alpha$ can be 0.99 instead of 0.95 in the case of Section 2.1. Then the ratio of correct copies, $\alpha^{n-1}$, is 0.5 for $n = 70$ and 0.2 for $n = 200$. Thus strands with up to about 100 links may evolve instead of about 20 links in the case of Section 2.1.
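The gain in attainable length can be made explicit by solving for the chain length at which the fraction of correct copies drops to a chosen value. The short derivation below is a sketch; the symbol $f$ for that fraction is introduced here for illustration only, and the approximation assumes that the per-link error rate $1 - \alpha$ is small.

```latex
\[
  \alpha^{\,n-1} = f
  \quad\Longrightarrow\quad
  n = 1 + \frac{\ln f}{\ln \alpha}
  \;\approx\; 1 + \frac{\ln(1/f)}{1-\alpha}
  \qquad (1-\alpha \ll 1).
\]
\[
  f = \tfrac12:\qquad
  \alpha = 0.95 \;\Rightarrow\; n \approx 1 + \frac{0.693}{0.0513} \approx 15,
  \qquad
  \alpha = 0.99 \;\Rightarrow\; n \approx 1 + \frac{0.693}{0.01005} \approx 70 .
\]
```

Because the attainable length scales roughly as $1/(1-\alpha)$, lowering the per-link error rate from 0.05 to 0.01 lengthens the strands that can still be copied faithfully by about a factor of five.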

We must again distinguish between the two cases where the matrix strand and the daughter strand have the same direction and the reverse direction respectively, depending on the geometry and energetics of the interlocking monomers with the matrix. Figure 4b shows the second case but the first case must equally be kept in mind.

The slight change in the thinking model by proceeding from Section 2.1 to 2.2 leads to a new feature: The strands can form sequence-dependent folding conformations by intramolecular pairing of complementary elements (Figure 5).

Fig. 4. Two kinds of monomers that are complementary. Less mismatch than in the case of Figure 3a. (a) Interlocking of complementary monomers. Left: backbones in the same direction. Right: backbones in opposite directions. (b) Monomers according to Fig. 4a, right. Formation of short strands that replicate. Symbols different from those in Figure 4a (for simplicity).

Above we have considered the occasional occurrence of a mismatch in the copy process leading to a daughter strand that cannot be used as a matrix for further replication. Occasionally, a monomer can be incorporated in the daughter strand that is identical with the corresponding monomer in the original instead of being complementary to it. The sequence is changed by this mismatch without losing the power to act as a matrix for further replication. A change in the sequence results in a change in the folding conformation. In the course of many periods this kind of copying error leads to a continuous production of new sequences and selection of particular folding conformations. Conformations will survive that are compact and therefore protected against decomposition.

Compact conformations are possible, for steric reasons, in the case where the strands in the replication process have reverse direction (Figure 4b), and therefore, where the strand sections linked by intramolecular pairing have reverse direction, in contrast to the case where linked strands have the same direction (Figure 5). Further evolutionary progress is only possible in the first case and only this case is of interest in discussing further steps.

Fig. 5. Intramolecular pairing of complementary elements: sequence-dependent folding. Hairpins: the most compact (resistant) form survives. This conformation is only possible in the case of monomers according to Figure 4a, right (reverse direction of complementary strand sections). Compact conformations are not possible in the case of monomers according to Figure 4a, left (same direction of complementary strand sections).

Fig. 6. Aggregation of hairpins by lateral match and interlocking, illustrated symbolically in the inset. Aggregates are copying-error filters: after a mismatch in the copying process, erroneous copies do not fit in the aggregate and are repelled; the sequence for the hairpin is preserved.

A particularly compact form is a hairpin conformation (Figure 5). We assume that such solid molecular species are able to aggregate by lateral interlocking and binding (Figure 6, symbolically represented in the framed inset). Because of their size, aggregates will be blocked in regions with larger pores and populate such regions where single strands cannot survive. Aggregation, a by-product in the evolution of strands forming various folding conformations, is of fundamental importance. Aggregates are copying-error filters: by a mismatch in the replication process the strand will usually lose its property to form a hairpin conformation and therefore, in the process of aggregation, it will not fit in the aggregate and will diffuse away. By this process copying errors do not accumulate.
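The filtering idea can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not the author's procedure: the two complementary monomer kinds are labelled "a" and "b", a hairpin is idealized as a strand with one unpaired element at each end, a single loop element in the middle, and every other element paired with its mirror position, and the aggregate is modeled simply as accepting only strands that fold this way. All function names are hypothetical.

```python
# Hypothetical labels for the two complementary monomer kinds.
COMPLEMENT = {"a": "b", "b": "a"}


def make_hairpin(end: str, stem: str, loop: str) -> str:
    """Build an idealized hairpin: end + stem + loop + pairing stem + end."""
    pairing_stem = "".join(COMPLEMENT[x] for x in reversed(stem))
    return end + stem + loop + pairing_stem + end


def forms_hairpin(strand: str) -> bool:
    """True if every stem element is complementary to its mirror element.
    The two end elements and the middle (loop) element are left unpaired."""
    n = len(strand)
    if n < 5 or n % 2 == 0:
        return False
    return all(COMPLEMENT[strand[i]] == strand[n - 1 - i]
               for i in range(1, n // 2))


def single_mismatch_copies(strand: str) -> list:
    """All strands that differ from `strand` at exactly one position."""
    return [strand[:i] + COMPLEMENT[strand[i]] + strand[i + 1:]
            for i in range(len(strand))]


def aggregate_filter(strands: list) -> list:
    """Toy aggregate: keeps only strands that fold into a hairpin."""
    return [s for s in strands if forms_hairpin(s)]


hairpin = make_hairpin("a", "abbab", "a")
erroneous = single_mismatch_copies(hairpin)
accepted = aggregate_filter(erroneous)
print(len(erroneous), "erroneous copies,", len(accepted), "still fit")
```

In this toy model every error in the stem destroys the hairpin, so the fraction of erroneous copies that are rejected grows with the stem length; only errors at the unpaired end and loop positions pass the filter.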

The aggregate is a supramolecular machine. It is a functional unit of cooperating molecules. This machine constitutes a mechanism to preserve the sequence required for the hairpin conformation. It makes it possible to surmount the barrier against increasing complexity considered in Section 2.1, the accumulation of copies that cannot serve as matrices for replication.

The size of aggregates is limited (limited speed of formation, limited mechanical strength) and thus the pore size of the regions that can be populated is limited.

2.3. DEVELOPMENT OF A MACHINE FOR MAKING A POROUS STRUCTURE

The question arises: How can regions with even larger pores be populated? This would be possible if the machinery formed its own porous structure, transparent to monomers. A new kind of monomer is supposed to be attached to the hairpins; these monomers form links with each other, and the oligomer thus obtained leaves the aggregate. Oligomers coagulate and form a porous surrounding structure acting as a diffusion barrier for aggregating strands (Figure 7). Being independent of the presence of a given porous structure, such systems can populate less restricted areas. A considerable evolutionary advance has taken place.

It may be asked: Why not begin evolutionary modelling with a given coagulate or coacervate as originally presumed by Oparin? There would be no inducement to develop a machinery for making coagulating oligomers. The situation would not be different from that discussed in the beginning of Section 2.1. In a region of fine pores short strands may be copied and multiply; the strand size is unchanged. Regions of different pore size are needed to induce the development of longer strands and the evolution of increasingly complex systems.

2.4. EMERGENCE OF A TRANSLATION MACHINE AS A BY-PRODUCT

Fig. 7. Machine for fabricating oligomers (from monomers of a second type) that form a coagulate functioning as a self-made porous environment.

Any change in the system discussed in Section 2.3 that speeds up the formation of the aggregate or improves the lateral interlocking between hairpins, and thus improves the functioning of the aggregate as a coagulate-fabricating machine, has a selective advantage. A strand acting as assembler of hairpins should serve both purposes (Figure 8).

To see the simplest possibility of such a system, an interesting feature of the hairpins must be considered. Let us open a hairpin conformation, form a replica and compare the two strands (called the (+) and the (-) strand). Both strands form hairpins which are identical with the exception of the elements in the middle and at the ends, which are complementary (Figure 9). Therefore, (+) and (-) strands can be equally well incorporated in the aggregate, and this is most economic. The assembler can be a strand of arbitrary sequence, and the hairpins bind to this strand by pairing complementary elements (Figure 10a). We assume steric and energetic conditions allowing lateral interlocking and binding of hairpins. This is an idealization, useful for focusing on principles, which must be modified in discussing practical realizations (see Section 2.4 below, Section 3.2).
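The statement that the (+) and the (-) hairpin differ only in the middle and at the ends can be checked on a small example. The sketch below makes the same simplifying assumptions as the rest of this section (two complementary monomer kinds, here labelled "a" and "b"; replica strand antiparallel to the original); the example sequence and the function names are hypothetical.

```python
# Hypothetical labels for the two complementary monomer kinds.
COMPLEMENT = {"a": "b", "b": "a"}


def replica(strand: str) -> str:
    """Complementary copy with reverse direction, read in its own direction."""
    return "".join(COMPLEMENT[x] for x in reversed(strand))


def differing_positions(x: str, y: str) -> list:
    """Indices at which two strands of equal length differ."""
    return [i for i, (p, q) in enumerate(zip(x, y)) if p != q]


# (+) strand: end "a", stem "abbab", loop "a", pairing stem "abaab", end "a".
plus = "aabbabaabaaba"
minus = replica(plus)

print(plus)
print(minus)
print(differing_positions(plus, minus))
```

Within the stem each element is paired with its complement, so reversing and complementing the strand reproduces the stem unchanged; only the unpaired positions (both ends and the loop element) come out complementary to the original.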

Fig. 8. Assembler: assists in assembling hairpins.

Fig. 9. Hairpin ((+) strand) and copy ((-) strand) are identical except that the elements A and A', B and B', C and C' are complementary.

In this way the process of aggregate formation is speeded up and the coagulate-making machinery is improved by the closer lateral binding. The sequence of (+) and (-) hairpins is given by the sequence of elements in the assembler. The (+) and (-) hairpins shown in Figure 10a have complementary elements at the open ends. Assume the presence of two kinds of monomers forming the coagulate, one binding to the hairpin (+) strand, the other to the hairpin (-) strand. Each oligomer thus obtained will have the sequence given by the sequence in the assembler. A translation machinery is present as a by-product.
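The mapping from assembler sequence to oligomer sequence described above can be written as a two-step lookup. The sketch below is purely illustrative: the labels "a" and "b" for the assembler and loop elements, "X" and "Y" for the two coagulate-forming monomers, and the assignment of loop element "a" to the (+) hairpin are all assumptions, and strand direction is ignored.

```python
# All labels below are hypothetical and chosen for illustration only.
COMPLEMENT = {"a": "b", "b": "a"}           # complementary strand elements
LOOP_TO_HAIRPIN = {"a": "+", "b": "-"}      # loop element of each hairpin type
HAIRPIN_TO_MONOMER = {"+": "X", "-": "Y"}   # monomer carried by each hairpin type


def hairpins_on(assembler: str) -> list:
    """Hairpin type bound at each assembler element: the hairpin whose
    loop element is complementary to that assembler element."""
    return [LOOP_TO_HAIRPIN[COMPLEMENT[element]] for element in assembler]


def translate(assembler: str) -> str:
    """Oligomer obtained by linking the monomers carried by the hairpins
    in the order in which they are lined up along the assembler."""
    return "".join(HAIRPIN_TO_MONOMER[h] for h in hairpins_on(assembler))


print(hairpins_on("abba"))
print(translate("abba"))
```

The point of the sketch is only that the oligomer sequence is fixed once the assembler sequence is fixed; no separate instruction for the oligomer is needed.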

By occasional mismatch in the copying process (see Section 2.2) assemblers with many different sequences will appear in the course of many periods. The resulting change in the sequence of the oligomer should not have much influence on its capability to form the coagulate. But an occasionally occurring oligomer with a specific sequence is thought to have an additional property besides acting as a diffusion barrier: it binds to the double helix, saves it from hydrolysis and assists in replication. The oligomer interacts with the growing strand and thus lowers the probability of mismatch (Figure 11). We consider the case that this effect diminishes the copying-error rate to such an extent that the sequence in the assembler does not change during the number of generations required for the selection of the form. In this case the sequence of the translation product is preserved over subsequent generations.

This step constitutes a fundamental breakthrough: a machinery producing and preserving translation products of distinct sequence is present. This is the model proposition to solve the so-called hen-and-egg problem. Systems of increasing complexity and intricacy will evolve that produce an increasing number of different translation products. The copying and translation machinery will develop further and further.

Fig. 10. Translation machine. The sequence of elements in the assembler corresponds to the sequence of (+) and (-) strands in the aggregate. (a) One element in the loop of the hairpin. (b) Hairpin (+) and (-) strands (Figure 9) opened to form a loop with three elements. Assembler with reading frame assisting assembly formation.

As mentioned above, the model must be modified. The assumed steric match between laterally interlocking hairpins is not realistic. For steric reasons the loop in the hairpins should have more than one element. Assume that the two elements next to element B (Figure 9) do not pair with each other for steric reasons but pair with complementary elements of the assembler, and that a compact structure is reached by laterally interlocking and binding hairpins (Figure 10b). In this case the assembler cannot have an arbitrary sequence as in the case of Figure 10a. The elements connected with brackets (Figure 10b) must be arranged according to a distinct reading frame. Thus hairpins are arranged at the assembler appropriately for achieving lateral interlocking. Assisting lateral interlocking is a by-product of the modification.

Fig. 11. Oligomer lowering probability of mismatch.

The bonds of a given element with its neighbors in the strand are assumed to lie on a straight line, and the bond with the complementary partner is assumed to be perpendicular to this line. These special assumptions are unrealistic. In realistic elements the bonds are at fixed angles in space, different from 180° and 90°. Then the arms of the hairpins are helical (Figure 12). Again, this effect assists lateral interlocking of the hairpins in the aggregate as a by-product of the modification.

Fig. 12. Hairpin. The angle between the bonds of a given element to its neighbor elements is no longer 180° as in the case of Figure 9. Such more realistic elements are chiral and consequently the hairpin arms are screw-like. Translation machine improved by better lateral interlocking.

The elements are chiral, and in a realistic system the two mirror-image forms D and L should be considered to be present in equal amounts, but only one form, say the D-form, is used. How can interference with the L-form be avoided? A steric-energetic match of reacting groups is certainly crucial to allow replication (Figure 4). In the present case of chiral elements this requires a pure D or pure L matrix strand, say a D matrix strand, and then D elements will interlock, forming a pure D daughter strand. The a priori probabilities that D-organisms or L-organisms will evolve are equal; the decision is made with the spontaneous formation of either a D or an L strand in the beginning. The probability of spontaneous formation of a D (or an L) strand with $n$ monomers is $(1/2)^n \alpha^{n-1}$. (In the case discussed above, $n = 10$ and $\alpha = 1/100$, this means that one finds one correctly linked strand among $1000 \times 10^{18}$ strands, i.e., in 10 liters of a millimolar solution of monomers.)
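The numerical estimate for the chirally pure strand follows directly from the product of the two factors. Written out, with $P_D$ used here only as an illustrative name for the probability of obtaining a correctly linked all-D strand:

```latex
\[
  P_D(n) = \left(\tfrac12\right)^{n} \alpha^{\,n-1},
  \qquad
  P_D(10)\Big|_{\alpha = 1/100}
  = 2^{-10} \times 10^{-18}
  = \frac{10^{-18}}{1024}
  \approx 10^{-21}
  = \frac{1}{1000 \times 10^{18}} .
\]
```

The requirement of chiral purity thus costs a factor of about 1000 for a strand of ten monomers, which is why the required amount of solution rises from about 10 milliliters to about 10 liters.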

2.5. REBUILDING THE REPRODUCTION MACHINERY

In the present evolutionary stage systems appear that produce an increasing number of translation products with various functions. The translation products interact and some cooperation takes place which develops in the course of further periods. By this cooperation the survival chance of the cluster and its components increases. The cluster grows with each period and occasionally divides in an uncontrolled manner. A multiplication of clusters takes place in the course of many periods (Figure 13).

The clusters become increasingly less dependent on the specific conditions first present. At some stage of the evolutionary process a difficulty appears. To focus on it, we call the assembler strands that carry the message to produce specifically interacting translation products (+) strands. The translation products of the complementary assembler strands, the (-) strands, will in general be nonsensical; it cannot be expected that both translation products will act as specific functional components (Figure 14). In contrast to simple systems, where oligomers with unspecific sequence served as a diffusion barrier, accumulation of such unspecific oligomers leads to increasing difficulties with increasing complexity.

An evolutionary barrier caused by waste accumulation is reached, which can only be surmounted by a basic reconstruction of the genetic machinery. This takes place in several logically conceivable steps discussed in detail elsewhere [3]. The result is the development of a system in which the copying apparatus and the translation apparatus are separated (Figure 15). The blue-print is unified. Work is divided among specialists and this brings advantages, an event occurring again and again in evolutionary processes.

At the present stage complexity greatly increases. The cooperation of an increasing number of translation products takes place, and the system becomes an intricate functional entity, an organism. At this level of sophistication a cell membrane, a closed envelope with channels to control influx and outflux, becomes useful and will be developed. A continuous increase in intricacy of the functional network and in precision of the genetic machinery will take place. The specificity and sensitivity of the membrane's response to environmental influences will develop continuously.


Fig. 13. Aggregates confined by coagulate. Production of oligomers I, II, III, ... with various functions. Some cooperation of these oligomers increases survival chance of cluster. Cluster grows and occasionally divides uncontrolled.

3. Comparison of Logical Construct with Reality

3.1. SIMPLE LIVING SYSTEMS

In living systems the blue-print is unified in the DNA strand and is translated in the ribosomes. Translation products are proteins (made of 20 kinds of L-amino acids in defined sequence). They fold specifically, assemble, and form a functional entity, the organism. They form structures and act as enzymes performing a metabolism.

Figure 16a shows a hypothetical ancestral organism from which all existing living forms have arisen ([8] pp. 99-105). The cell is enclosed by a cell membrane of proteins and lipids. The membrane proteins are assumed to include some transport systems to maintain an adequate intracellular ionic milieu and mediate exchange between cell and environment to allow the cell to be an anaerobic chemo-autotrophic respirer.

Fig. 14. Translation products of (-) assembler nonsensical. Waste-accumulation difficulty prevents further evolution at some stage.

Fig. 15. Rebuilt copying machinery. Blue-print for cooperative unified. Machine for copying this blue-print has separated from translation machine indicated in the figure. Fundamental breakthrough. Strong increase in intricacy at this stage: Controlled border inside/outside (membrane).

Proteins are continuously made by the ribosomes. The number of ribosomes is increasing. The DNA strand is copied. Original strand and copy separate and the cell volume grows until the system becomes unstable and a division into two cells takes place.

The blue-print is transcribed from the DNA to the messenger-RNA and is translated (Figure 16b). Transfer-RNA’s carrying amino acids are attached to the mRNA by complementary pairing in triplets, and the amino acids are linked in the sequence given by the sequence of elements in the mRNA. DNA and RNA’s consist of 4 kinds of monomers (two complementary pairs).

The assembler is seen as the precursor of the messenger-RNA, the hairpin as the precursor of the transfer-RNA, the second type of monomers are considered as amino acids, the carrier of the total genetic message as precursor of the DNA. It is assumed that only one complementary pair of nucleotides (G and C) is present at first and the two most abundant amino acids gly and ala. Oligomers of gly and ala (hydrophobic amino acid) coagulate and therefore have the required property to act as self-made porous environment (Figure 7).

3.2. BIOMOLECULES AS STRUCTURAL COMPONENTS

The question arises: are the theoretical concepts derived in Section 2 in accord with the steric-energetic properties of nucleotides and amino acids?

Fig. 16. Most primitive living organism. (a) Division of blue-print (DNA); growing and division of cell. (b) Copying DNA, transcription into RNA, translation into proteins.

Fig. 17. Computer modelling with nucleotides and amino acids glycine and alanine. Schematic representation for alanine.

Computer simulation [10] (force field calculations) and molecular modeling [5] show that the translation machine proposed in Section 2.4 can well be represented with nucleotides and amino acids. In the computer simulation hairpins are bound to the assembler by triplet base pairing and linked laterally by Mg²⁺, which is assumed to be present in the solution and to bridge between the phosphates in the backbones of the double helices of the hairpins. The amino acids gly and ala are assumed to be activated by CMP and GMP respectively, and to be attached specifically to the (+) and (-) hairpin as indicated in Fig. 17. Binding to the 2' OH group at the 3' end can be easily simulated. The proposed reaction [4], [6] is similar to the reaction

[Reaction scheme: D-ribose and phosphate groups; structure not recoverable from the source.]

which turned out to be the basis for the activity of the ribozymes [11].

A possible explanation for the specific binding of alanine to CMP and glycine to GMP is given by the fact that gly and GMP are eluted at the same speed, while ala and CMP are eluted at another, significantly different speed, offering the possibility of in situ formation of GMP-gly and CMP-ala, while no formation of CMP-gly and GMP-ala can take place [12].

Binding of hairpins to the assembler by triplets of complementary bases can be demonstrated [13] by immobilizing the assembler (fixing to cellulose) and binding the hairpins at 0°C. The strength of the binding was measured by warming up and measuring the temperature of elution. The result was compared with the elution temperature of corresponding single stranded oligonucleotides. By adding Mg²⁺ to the solution of hairpins the temperature of elution is increased, pointing to lateral binding between the hairpins linked to the assembler. Non-denaturing gel electrophoresis of a Mg²⁺-containing solution of assembler and hairpins indicates gap-less linkage of hairpins to the assembler by triplets of complementary bases, as required by the theoretical model.

These results support the idea that the genetic machinery developed from a simple translation machine (Figure 10b) by gradually increasing the number and complexity of the translation products, which requires a gradually increasing number of amino acids.

Eigen [14] proposed the predominance of G and C, and with this specification the original reading frame and code (Figure 10b) should be CGG and CCG or GGC and GCC. GGC and GCC are codons for gly and ala respectively, and we may assume that these are the preserved original codons. The other nucleotides besides G and C can be present and will then be incorporated, say A and U, which form weaker bonding pairs than G and C. Corresponding hairpins, depending on environmental conditions, can have a selectional advantage. The melting temperature, i.e., the transition temperature from the hairpin to the open conformation, depends strongly on the GC/AU ratio, and hairpins with a certain GC/AU ratio in the double helical portion leading to the optimal transition temperature for the given environment will be selected in the course of evolution. In contrast, the GNC reading frame (the strongly binding G and C nucleotides in positions 1 and 3 in the triplet) will be preserved (N = G, C, A, U). This frame is important in this early stage for precise docking. Position 2 can then be G, C, A, or U, so we should expect an extension of the code to GGC, GCC, GAC, GUC, whereby GAC and GUC should code for the most abundant amino acids besides gly and ala. These are asp and val, and indeed GAC and GUC code for asp and val respectively.
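The last statement can be checked against the modern standard code. The following minimal sketch builds the standard codon table from the usual 64-letter amino acid string (bases ordered U, C, A, G; one-letter amino acid codes; `*` for stop) and looks up the four GNC codons. The names `STANDARD_CODE` and `gnc_codons` are illustrative, not taken from the paper.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def gnc_codons():
    """Codons of the GNC reading frame: G at position 1, C at position 3."""
    return ["G" + n + "C" for n in "GCAU"]

for codon in gnc_codons():
    print(codon, STANDARD_CODE[codon])
# GGC G (gly), GCC A (ala), GAC D (asp), GUC V (val)
```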

To stop polymerization, i.e., to prevent a hairpin charged with an amino acid from binding correctly to the messenger, Py (C or U) should be at position 1 and Pu (G or A) at position 3. Hairpins with Pu in the loop bind better to an assembler with complementary Py than vice versa [13], so the best stop signal should have the weakly binding pyrimidine (U) at position 1 and Pu at positions 2 and 3. This early stop codon may have been preserved, giving a possible explanation for UAPu and UGA being the stop codons in the modern genetic code.
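As a small illustration, the four codons with U at position 1 and a purine at positions 2 and 3 can be enumerated and compared with the modern standard code. The table construction and the name `u_pu_pu` are illustrative assumptions of this sketch.

```python
BASES = "UCAG"
# Standard genetic code, one-letter amino acid codes, '*' = stop
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

PURINES = "AG"

# Proposed early stop signal: U at position 1, Pu at positions 2 and 3
u_pu_pu = {"U" + p2 + p3: STANDARD_CODE["U" + p2 + p3]
           for p2 in PURINES for p3 in PURINES}

print(u_pu_pu)
# {'UAA': '*', 'UAG': '*', 'UGA': '*', 'UGG': 'W'}
```

Three of the four codons are stop codons in the modern code; the fourth, UGG, codes for trp.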

The GNC reading frame should be very important in the beginning and should lose its importance with increasing sophistication of the translation apparatus. Positions 1 and 2 in the triplet should remain important for docking: the pairing of the bases at these positions with the complementary bases of the assembler assists lateral interlocking of the hairpin with the adjacent hairpin already incorporated in the aggregate. The early proteins will assist further interlocking, thus position 3 should lose part of its specificity (wobble); then GGN, GCN, GAN, GUN with N = G, C, A, U should code for gly, ala, asp, val respectively. We may assume that at this early stage no distinction is made between asp and glu because of their chemical similarity (acidic) and that the modern code GAPy for asp and GAPu for glu evolved later.

In improving the translation device it will be sufficient to have a purine at position 1. Purine G may be exchanged for purine A. This gives the code for ile (AUN), thr (ACN), ser, asn, lys, arg (APuN). We may assume that no distinction is made between lys and arg (polar basic) (APuPu) and between asn, gln and ser (polar neutral) (APuPy) and that the present code AAPu for lys, AGPu for arg, AAPy for asn, AGPy for ser evolved later. The amino acids considered are those that are available in simulating prebiotic conditions.

In the further improvement of the translation device the PuNN reading frame falls out of use; C and U can now be in position 1. Fully new codes besides the old code will be used for important amino acids (UCN for ser, CGN for arg).

Thus the evolution of the codes of the most abundant and most frequently used amino acids besides leu and pro can be interpreted in this manner. A possible explanation for the codes of leu and pro is to assume, as mentioned above, that no distinction is made between ile, leu and pro (non-polar) (code AUN) until the PuNN reading frame is abandoned, and fully new codes are available and used (CUN and UUPu for leu, CCN for pro). Thus one base (leu) or two bases (pro) must be exchanged in the anticodon, leaving the tRNA-aminoacyl-tRNA synthetase interaction unchanged. This would explain the striking fact emphasized by Christian de Duve that there are 20 synthetases for 20 amino acids, whatever the number of tRNAs ([8], p. 182). The code for gln (CAPu) is rationalized in the same way.

The residual amino acids (not available by prebiotic synthesis and less used in proteins) are assumed to be coded later using empty codons (UUPy for phe, UAPy for tyr, UGPy for cys, CAPy for his). They are left with codons that differ from the stop codons in position 3 (Py instead of Pu). The codes originally used for ile+leu+pro (AUN) and asn+gln (AAPy) code for ile and asn alone, respectively, after the new codes for leu, pro, and gln have appeared. Now all 64 codons are related to an amino acid or to stop. Finally AUG for met and UGG for trp separate from the codes for ile and stop respectively. This gives the modern universal code (Figure 18).
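The staged assignments described in the preceding paragraphs can be collected and compared with the standard genetic code. The sketch below is only a bookkeeping check, not a model of the evolutionary process: it writes Pu as R, Py as Y and any base as N (IUPAC-style letters, a notational choice of this sketch), lists the final pattern for each amino acid in roughly the order the text introduces it, and treats UGPu as stop until UGG splits off for trp (an interpretation of the text). All names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acid codes, '*' = stop
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Base classes used in codon patterns: N = any, R = Pu, Y = Py
CLASSES = {"N": "GCAU", "R": "AG", "Y": "CU"}

def expand(pattern):
    """All codons matching a 3-letter pattern such as 'GAY'."""
    choices = [CLASSES.get(ch, ch) for ch in pattern]
    return ["".join(c) for c in product(*choices)]

STAGES = [
    # G at position 1 (GNC frame, then wobble at position 3)
    ("GGN", "G"), ("GCN", "A"), ("GAY", "D"), ("GAR", "E"), ("GUN", "V"),
    # purine A at position 1
    ("AUN", "I"), ("ACN", "T"), ("AAR", "K"), ("AGR", "R"),
    ("AAY", "N"), ("AGY", "S"),
    # PuNN frame abandoned: new codes with C or U at position 1
    ("UCN", "S"), ("CGN", "R"), ("CUN", "L"), ("UUR", "L"),
    ("CCN", "P"), ("CAR", "Q"),
    # residual amino acids on empty codons, and stop
    ("UUY", "F"), ("UAY", "Y"), ("UGY", "C"), ("CAY", "H"),
    ("UAR", "*"), ("UGR", "*"),
    # late branchings from ile and stop
    ("AUG", "M"), ("UGG", "W"),
]

def build_code(stages):
    code = {}
    for pattern, aa in stages:
        for codon in expand(pattern):
            code[codon] = aa  # later stages overwrite earlier ones
    return code

code = build_code(STAGES)
mismatches = {c: (code.get(c), aa) for c, aa in STANDARD_CODE.items()
              if code.get(c) != aa}
print(len(code), mismatches)
```

With these patterns all 64 codons are assigned and no codon differs from the standard table.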

The mitochondrial codes differ from the universal code ([8] p. 93). As emphasized by de Duve, the deviations from the universal code are most likely late acquisitions. The late deviation is supported by the present approach. The branching between ile and met is different: in human, Saccharomyces and Drosophila mitochondria AUA codes for met instead of ile. The branching between stop and trp is different: in human, Neurospora, Saccharomyces and Drosophila mitochondria UGA codes for trp instead of stop. Differences in the late additional codes for broadly used amino acids occur: in Saccharomyces mitochondria CUA codes for thr instead of leu; in human mitochondria AGPu codes for stop, and in Drosophila mitochondria for ser, instead of arg. It is assumed that the old code for arg, AGPu, was given up after the new code CGN occurred, and AGPu was then available and used for stop and ser respectively.
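These deviations can be represented as small overrides on the universal code. The sketch below encodes only the deviations listed in this paragraph (it is not a complete description of any mitochondrial code), and the test sequence is an arbitrary example; all names are illustrative.

```python
BASES = "UCAG"
# Standard (universal) genetic code, one-letter amino acid codes, '*' = stop
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Deviations as listed in the text above (AGPu = AGA and AGG)
DEVIATIONS = {
    "human":         {"AUA": "M", "UGA": "W", "AGA": "*", "AGG": "*"},
    "saccharomyces": {"AUA": "M", "UGA": "W", "CUA": "T"},
    "drosophila":    {"AUA": "M", "UGA": "W", "AGA": "S", "AGG": "S"},
    "neurospora":    {"UGA": "W"},
}

def mito_code(organism):
    """Universal code with the listed mitochondrial overrides applied."""
    code = dict(STANDARD_CODE)
    code.update(DEVIATIONS[organism])
    return code

def translate(mrna, code):
    """Translate triplet by triplet until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

mrna = "AUGAUAUGAAGAUAA"  # arbitrary example: AUG AUA UGA AGA UAA
print(translate(mrna, STANDARD_CODE))          # MI
print(translate(mrna, mito_code("human")))      # MMW
print(translate(mrna, mito_code("drosophila"))) # MMWS
```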

Initially, purine G and pyrimidine C at position 2 code for polar gly and nonpolar ala respectively: the base at this position is related to the bases at the open end of the hairpin which interact with the activated amino acid. This relation is essentially preserved: Pu at position 2 codes for a polar amino acid and Py for a nonpolar amino acid (exceptions: ACN for thr, UCN for ser).

It is not surprising that an early code has been preserved to a considerable extent. Modeling detailed evolutionary steps elucidates the general difficulty of later changing well-established functional and structural interconnections.

Nucleotides in biosystems are composed of D-ribose. This is rationalized by the general consideration in Section 2.4. The use of the D-form (instead of the a priori equivalent L-form) is a frozen accident. The translation machinery in biosystems uses L-amino acids. This can be rationalized by force field calculations [10] showing that the activated complex of L-ala with GMP (containing D-ribose) should be considerably more stable than the corresponding complex with D-ala. This suggests an explanation for the fact that D-ribose and L-amino acids are used in biosystems and not D-ribose and D-amino acids (a priori equally probable would be the use of L-ribose and D-amino acids instead of D-ribose and L-amino acids).

The theoretical model gives a possible explanation for details in the construction of the genetic apparatus as derived from the given steric-energetic conditions (Figure 17):

- coding by triplets,
- reading in 5'-3' direction of mRNA by attaching tRNA strands in opposite 3'-5' direction,
- amino acids bound at 3' ends of tRNAs.

3.3. SELF INSTRUCTION OF NUCLEOTIDES AND AMINO ACIDS?

Can nucleotides and amino acids exist in sufficient concentrations under prebiotic conditions?

In the classical experiments by Stanley Miller [15] and Juan Oro [16] amino acids gly, ala, asp and nucleic acid base adenine were obtained under simple conditions without additional external instruction. All four nucleotide bases adenine, guanine, cytosine and uracil show this most astonishing coincidence of biological functionality and the property of being formed largely by self instruction under conditions that should be easily available on the prebiotic planet, where reactions under exclusion of dioxygen and water and reactions in water are supposed to take place.

In contrast to nucleotide bases, experimental evidence for obtaining nucleotides under prebiotic conditions is still missing. The fundamental difficulty was seen in the lack of experimental evidence for obtaining ribose under prebiotic conditions, but very recently a fundamental breakthrough occurred: Albert Eschenmoser [17,18] obtained ribose-2,4-diphosphate in 17% yield in an alkaline solution of glycolaldehyde phosphate and formaldehyde. Sugar phosphates are of particular interest since they are easy to concentrate by ion exchange.

The fact that ribose-2,4-diphosphate is so easily formed by self instruction is important for judging future prospects. It is not known if self instruction of nucleotides and nucleic acids is possible under prebiotic conditions, but some optimism is justified, considering a most astonishing example of self instruction of a biomolecule given by Albert Eschenmoser [19]: uroporphyrinogen III, an important intermediate in the biosynthesis of vitamin B12, is formed as a highly specific reaction product in a prebiotic synthesis, and the ring contraction takes place spontaneously in metal complexes of model compounds.

As pointed out by Albert Eschenmoser [18], the structural complexity of biomolecules results, to a large extent, from self instruction, and a complex structure can be very simple if complexity is measured by the intricacy of its production, i.e., by the number of changes of conditions necessary to synthesize the compound from simple constituents.

Finding appropriate conditions for the proposed enzyme-free polymerization of nucleotides on nucleic acid templates is a great challenge. Very promising results were recently obtained [20]. Complementary strand RNA synthesis by a multisubunit ribozyme catalyst [21] can be another way to obtain self-replication of a nucleic acid. In attempting to engineer artificial evolving systems the spontaneous formation of double helices of poly-bipyridinium complexed by Cu(I)-ions is of great interest [22].

In the present view the origin of life must be approached by modeling a pathway leading to a particular molecular machine. Considerations on the mechanisms involved should be useful in future attempts at engineering artificial systems acting as self-reproducing supramolecular machines of increasing complexity by Darwinian competition.

References

1. H. Kuhn: Angew. Chem. (Internat. Ed.) 11, 798 (1972).
2. H. Kuhn: Naturwissenschaften 63, 68 (1976).
3. H. Kuhn and J. Waser: Angew. Chem. (Internat. Ed.) 20, 500 (1981).
4. H. Kuhn: in 'Darwin today', VIII. Kühlungsborner Kolloquium 8.11.-12.11.1981. Abhandlungen und Mitteilungen der Akademie der Wissenschaften, ed. E. Geissler, Akademie-Verlag Berlin (1983).
5. H. Kuhn and J. Waser: in Biophysics, eds. W. Hoppe, W. Lohmann, H. Markl, and H. Ziegler, Springer-Verlag, Berlin, 1983, p. 830.
6. H. Kuhn and J. Waser: Experientia 39, 834 (1983).
7. H. Kuhn: submitted to J. Mol. Evol.
8. C. de Duve: Blueprint for a Cell: The Nature and Origin of Life, Neil Patterson Publishers, Burlington (1991).
9. J. v. Neumann: Theory of Self-Reproducing Automata. University of Illinois Press, Urbana (1966); see also H. Atlan: L'organisation biologique et la théorie de l'information, Hermann, Paris (1972).
10. E. v. Kitzing: Dissertation, University of Göttingen, Germany (1985).
11. A. J. Zaug and T. R. Cech: Science 231, 470 (1986).
12. U. Lehmann and H. Kuhn: Adv. Space Res. 4, 153 (1984); U. Lehmann: BioSystems 17, 193 (1985).
13. U. Baumann, U. Lehmann, K. Schwellnuss, J. H. van Boom, and H. Kuhn: Eur. J. Biochem. 170, 267 (1987); U. Baumann, U. R. Frank, and H. Blöcker: Biochem. Biophys. Res. Commun. 157, 986 (1988); Anal. Biochem. 183, 152 (1989).
14. M. Eigen and R. Winkler: Naturwissenschaften 68, 217 (1981).
15. S. L. Miller: Science 117, 528 (1953); J. Am. Chem. Soc. 77, 2351 (1955).
16. J. Oro: Biochem. Biophys. Res. Commun. 2, 407 (1960); J. Oro and A. P. Kimball: Arch. Biochem. Biophys. 94, 217 (1961).
17. D. Müller, S. Pitsch, A. Kittaka, E. Wagner, C. E. Wintner, and A. Eschenmoser: Helv. Chim. Acta 73, 1410 (1990).
18. A. Eschenmoser: Verhandlungen der Ges. Deutscher Naturf. und Aerzte, 116. Versammlung, Berlin, September (1990).
19. A. Eschenmoser: Angew. Chem. 100, 5 (1988).
20. G. v. Kiedrowski, B. Wlotzka, J. Helbing, M. Matzen, and S. P. Jordan: Angew. Chem. 103, 456 (1991); W. S. Zielinski and L. E. Orgel: Nature (London) 327, 346 (1987).
21. J. A. Doudna, S. Couture, and J. W. Szostak: Science 251, 1605 (1991).
22. J.-M. Lehn: Angew. Chem. 100, 91 (1988).