origin of life on earth and shannon’s
Post on 02-Jun-2018
8/10/2019 Origin of Life on Earth and Shannons
H.P. Yockey / Computers & Chemistry 24 (2000) 105-123
1 Introduction
Socrates (The Republic, Book VI, p. 745) had noted, in an earlier
conversation with Glaucon, that students of geometry and reckoning
first set up postulates appropriate to each branch of science, treating
them as known absolute assumptions, taking it for granted that they are
obvious to everybody. Thus, as Socrates taught us, the essence of
science is measuring, counting and weighing together with reasoning
from postulates or axioms. When science attempts to proceed from
qualitative arguments, there is a danger that we may be looking at
Rorschach inkblots, so to speak, and seeing what we want to see. The
more discussions can be made quantitative and avoid ad hoc explanations
the better theoretical biology is served. Only by expressing our
knowledge in numbers and employing calculation and reasoning can we
separate fact from the tricks of illusion in determining whether any
proposal on the nature, origin and evolution of life can be retained or
discarded. This breaks molecular biology out of sophisticated Kipling's
Just So Stories into the quantitative mode, used by natural scientists
(Wolynes, 1998).
Charles Darwin (1809-1882), and almost all naturalists of his time,
believed in the blending theory of genetics (Jenkin, 1867). However,
Gregor Johann Mendel (1822-1884) published a paper in 1865 (Mendel,
1865) that proved inherited characteristics are segregated and do not
blend. It was not until 1930 that Fisher (Fisher, 1930) showed that the
blending theory is incorrect and that natural selection proceeds
according to the particulate laws of Gregor Mendel. The next important
discovery in genetics was that the mechanism of inheritance is linear.
El momento de la verdad came in 1953, when Watson and Crick discovered
that the formation of biomolecules is controlled by the sequences of
nucleotides recorded in the double helix of DNA (or RNA in some
viruses). Thus the genetic information system is segregated, linear and
digital. It is astonishing, furthermore, that the technology of
information theory and coding theory, just now being discovered, has
been in place in biology for at least 3.850 billion years. Information
theory and coding theory and their tools of measuring information in
sequences are essential to understanding the crucial questions of the
nature and origin of life.
The development of a human being is guided by just 750 megabytes of
digital information (Olson, 1995). It could be stored on a single
CD-ROM in the biologist's personal computer. The genome in principle
contains all the information necessary to bridge the gap between
genotype and phenotype. All information necessary to determine the
three-dimensional structure of proteins lies in the form of amino acid
sequences, yet we remain unable to predict their tertiary structures
(Huynen and Bork, 1998).
Chemical or physical reactions in non-living matter are not controlled
by a message. If the genetical processes were purely chemical, as
mechanists-reductionists believe, the law of mass action and
thermodynamics would govern the placement of amino acids in the protein
sequences according to their concentration. Therefore, if a scenario
for the origin of life is to be acceptable, it must show how a genetic
message was generated and attained a minimum threshold of complexity,
which is measurable in bits and bytes, in a molecule characteristic of
life and needed for the assimilation of carbon dioxide and nitrogen by
the protobiont.
First, how large an information content does a genetic message require
to be characteristic of life? Second, do the laws of physics and
chemistry have enough information content so that non-living matter may
organize itself and become living matter, say in the manner that
crystals are formed? Computer users are familiar with the rather large
memory requirements, measured in bytes, of complicated programs and the
capacities of their hard drives. The same applies to programs that
purport to generate the genome of the protobiont.
2 The mathematical theory of sequences
I shall therefore approach the subject from the mathematical theory of
sequences and the codes between sequences. First we must establish a
mathematical meaning and a measure of the information in a message.
Shannon (1948) established information theory as a mathematical
discipline in his classic paper and pointed out, in the second
paragraph:
"The fundamental problem of communication is that of reproducing at one
point either exactly or approximately a message selected at another
point. Frequently the messages have meaning; that is they refer to or
are correlated according to some system with certain physical or
conceptual entities. These semantic aspects of communication are
irrelevant to the engineering (biological) problem. The significant
aspect is that the actual message is selected from a set of possible
messages. The system must be designed to operate for each possible
selection, not just the one which will be actually chosen since this is
unknown at the time of design."
Similarly, the genetic logic system must be capable of accommodating
the genetic messages of all organisms that have ever lived, live now or
will be evolved in the future. The DNA sequences that make up the
genome of any organism are selected from a set of possible genetic
messages. Thus the biological information system or genetic logic
system is independent of the specificity or meaning of the genetic
message in the genome. Likewise, communication systems will process
meaningless noise as well as a play by Sophocles.
To some people it may seem counter-intuitive that the information in a
sequence or a message can be measured, while there is no measure for
meaning, knowledge, erudition or specificity. We have accepted this as
true in our daily life in many ways. As computers, hard drives, modems
and other devices that measure information in bits and bytes are
becoming increasingly commonplace, one becomes aware that information
and meaning are separate concepts.
The meaning, if any, of words, that is, a sequence of
letters, is arbitrary. It is determined by the natural
language and is not a property of the letters or their
arrangement. For example, the English word 'hell'
means bright in German, 'fern' means far, 'Gift' means
poison, 'bald' means soon, 'Boot' means boat, 'singe'
means sing. In French 'pain' means bread, 'ballot'
means a bundle, 'coin' means a corner or a wedge,
'chair' means flesh, 'cent' means hundred, 'son' means
his, 'tire' means a pull, 'ton' means your. This confusion
of meaning goes as far as sentences. For example,
'O singe fort' has no meaning as a sentence in English,
although each is an English word, yet in German it
means 'O sing on' and in French it means 'O strong
monkey.' However, while the meaning of a sequence of
letters is arbitrary, the meaning or specificity of a
sequence of amino acids composing a protein is not
arbitrary, but rather is determined by nature.
In formal probability theory, one must establish the probability sample
space Ω consisting of the set A of events x_i under discussion, which
are random variables with the corresponding probabilities p_i. The set
of probabilities p_i forms a probability vector p. The probability
sample space is designated (Ω, A, p). All messages are formed from a
sequence of the members of a finite alphabet. For example, the primary
alphabet used in electronic communication is [0, 1]; in molecular
biology the alphabets are the nucleotide bases in DNA and RNA and the
amino acids in protein. The members of these alphabets are the events
in the probability sample space.
Gilbert (1987) and Gilbert et al. (1997) calculated the
number of possible sequences in polypeptide chains of
length N by the following expression:

$$20^{N} \qquad (1)$$

Eq. (1) gives the total number of sequences we must be concerned with
only if all events are equally probable. However, many events in
general and amino acids in particular do not have the same probability.
Does neglecting that fact affect our analysis? Shannon (1948) addressed
this problem as follows: Let us consider a long sequence of N symbols
selected from an alphabet of A letters or events. In the present case
the letters or events will be the alphabet of either codons or amino
acids. With high probability the sequence will contain Np(i) of the ith
symbol. Let P be the probability of the sequence. Then:

$$P = \prod_i p(i)^{p(i)N} \qquad (2)$$

Taking the logarithm of both sides:

$$\log_2 P = N \sum_i p(i) \log_2 p(i) \qquad (3)$$

$$\log_2 P = N \sum_i p(i) \log_2 p(i) = -NH \qquad (4)$$

where

$$H = -\sum_i p(i) \log_2 p(i) \qquad (5)$$

H is the information or Shannon entropy of the probability space
containing the events i. Accordingly, the probability of a sequence of
N symbols or events is approximately:

$$P = 2^{-NH} \qquad (6)$$

The number of sequences of length N is approximately:

$$2^{NH} \qquad (7)$$

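
The step from Eq. (2) to Eq. (6) can be checked numerically. The following is a minimal sketch, not taken from the paper: the four-letter probability distribution, the sequence length and the function names are assumptions chosen only so that every count Np(i) is a whole number.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H in bits per symbol, as in Eq. (5)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def log2_sequence_probability(probs, n):
    """log2 of Eq. (2): a sequence holding exactly n*p(i) copies of symbol i."""
    return sum(n * p * math.log2(p) for p in probs if p > 0)

# Hypothetical four-letter alphabet with unequal probabilities (illustration only).
probs = [0.5, 0.25, 0.125, 0.125]
n = 80  # chosen so that n * p(i) is an integer for every symbol

H = shannon_entropy(probs)
log2_P = log2_sequence_probability(probs, n)

# Eq. (4) states that log2 P equals -N*H, so the two printed values should agree.
print(H, log2_P, -n * H)
```

For this distribution the logarithm of the sequence probability and -NH coincide, which is the content of Eqs. (4) and (6).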
Notice that the expression for H was not introduced ad
hoc; rather it comes out of the woodwork, so to speak.
Let us apply Eq. (7) to calculating the number of
sequences in 100 throws of a dice game where the
probabilities of all events are known exactly and are
not all equal. For a given throw the probability of 2
and 12 is 1/36 whereas the probability of 7 is 6/36
because there are six ways to roll a 7 and only one way
to roll 2 or 12. Substituting the probabilities of each of
the possible events in Eq. (5) to calculate H, one finds
from Eq. (7) that the number of sequences is only
$2.69 \times 10^{-6}$ of that calculated from the counterpart
of Gilbert's expression (Eq. (1)), $11^{N}$.
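
The dice calculation can be reproduced in a few lines. This is a sketch under the assumptions stated in the text (two fair dice, 100 throws, 11 possible totals); the variable names are illustrative only.

```python
import math
from collections import Counter
from itertools import product

# Number of ways to obtain each total 2..12 with two fair dice.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
probs = {total: count / 36 for total, count in ways.items()}

# Shannon entropy of one throw, Eq. (5), in bits.
H = -sum(p * math.log2(p) for p in probs.values())

N = 100
log2_high_probability = N * H         # log2 of Eq. (7), 2^(NH)
log2_all = N * math.log2(len(probs))  # log2 of 11^N, all totals treated as equally likely
ratio = 2 ** (log2_high_probability - log2_all)

print(f"H = {H:.4f} bits per throw")
print(f"ratio = {ratio:.3g}")
```

The ratio is computed from base-2 logarithms so that the very large numbers $2^{NH}$ and $11^{N}$ never have to be formed explicitly.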
We have approached the calculation of the number
of sequences of length N in two apparently correct ways
and the question arises as to what happened to the
sequences left out of the total possible by the second
method. This is explained by the Shannon-McMillan-Breiman
theorem (Shannon, 1948; McMillan, 1953;
Breiman, 1957):
For sequences of length N being sufficiently long, all
sequences being chosen from an alphabet of A symbols,
the ensemble of sequences can be divided into two
groups such that:
1. The probability P of any sequence in the first group
is equal to $2^{-NH}$.
2. The sum of the probabilities of all sequences in the
second group is less than ε, a very small number.
The Shannon-McMillan-Breiman theorem tells us that the number of
sequences in the first or high probability group is $2^{NH}$ and they
are all nearly equally probable. We can ignore all those in the second
or low probability group because, if N is large, their total
probability is very small.
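
The division into two groups can be illustrated with an exact count for a binary alphabet. This is only a sketch: the source probability 0.1, the length 1000 and the tolerance that defines the high probability group are assumptions made for illustration, and the group used here is drawn more loosely than in the theorem, so its size is somewhat larger than $2^{NH}$ while still being a vanishing fraction of all $2^{N}$ sequences.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-letter alphabet with probabilities p and 1 - p."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Hypothetical binary source: the letter '1' has probability 0.1.
p, N = 0.1, 1000
H = binary_entropy(p)

# High probability group: sequences whose number of ones lies within 40
# of the expected value N*p = 100 (tolerance chosen for illustration).
k_lo, k_hi = 60, 140
group_mass = sum(math.comb(N, k) * p**k * (1 - p) ** (N - k)
                 for k in range(k_lo, k_hi + 1))
group_size = sum(math.comb(N, k) for k in range(k_lo, k_hi + 1))
log2_size = math.log2(group_size)

print(f"N*H = {N * H:.1f} bits")
print(f"log2(number of sequences in the group) = {log2_size:.1f}")
print(f"total probability of the group = {group_mass:.6f}")
print(f"fraction of all 2^N sequences = 2^({log2_size - N:.1f})")
```

Nearly all of the probability falls in a group that is a negligible fraction of the possible sequences, which is why the remaining sequences can be ignored.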
The expression for the measure of information that appears in Shannon
(1948) is:

$$H = -K \sum_i p_i \log p_i \qquad (8)$$

where the p_i are the probabilities of the ith member of the alphabet.
Shannon was quick to explain that K is a positive constant that 'merely
amounts to a choice of a unit of measure'. K appears from the
mathematical argument that justifies Eq. (8). It has no physical
dimensions and when the logarithm is taken to the base 2 and K = 1, H
is measured in bits, a contraction of binary digit. Like all messages,
the life message is non-material but has a measurable information
content. Of course, the genetic message, although non-material, must
be recorded in matter or energy. Now in this paper, the term
information has the meaning given in Eq. (8). It does not mean
knowledge, although a message composed of a sequence of symbols may
transfer knowledge to the receiver of the message.
The expression given in Eq. (8) is very similar to that
for entropy in statistical mechanics. As I explained in
Yockey (1992), entropy is a general term and must be
defined as a function of the probability space and the
probability distribution of the elements or events to
which it refers. As Hamming (1986) wrote:
"For us entropy is simply a function of a probability
distribution p_i ... The confusion at this point has been
very great for outsiders who glance at information
theory."
Every probability sample space has an entropy. For example, the
probability sample space in classical statistical mechanics, called
phase space by theoretical physicists, is six-dimensional and the p_i
are defined by the position and momentum vectors of the particles in
the ensemble. The function for entropy in both classical statistical
mechanics and quantum statistical mechanics has the dimensions of the
Boltzmann constant k and has to do with energy and momentum, not
information. However, entropy in information theory and probability
theory has no mechanical dimensions. There are no counterparts in
communication theory to temperature, energy, pressure, work or volume.
There is, furthermore, no counterpart to the First Law of
Thermodynamics, namely, the conservation of the energy of a system. To
illustrate this point further, one may consider the probability space
of a dice game that consists of the numbers 2 through 12 as random
variables and calculate the corresponding entropy (Yockey, 1992, see
exercise on page 88). Clearly, the entropy of a dice game has nothing
whatever to do with statistical mechanics or thermodynamics. It may
have something to do with information theory since a sequence of
symbols selected from the alphabet is generated as a stochastic process
by a series of tosses of the dice. Such a sequence of letters forms a
message in which some gamblers find meaning or knowledge by which they
make their bets. On the other hand, information theory is concerned
with messages expressed in sequences of letters selected from a finite
alphabet by a Markov process. The probability sample space is
constructed of the letters of the alphabet under consideration as
random variables and the p_i are defined accordingly.
3 The extensions of an alphabet: definition of the byte
A binary alphabet consisting of zero and one, each containing one bit
of information, is used in the source alphabet of computers and all
communication technology. The receiving alphabets in all applications
have more than two letters. The binary source alphabet must be extended
by forming pairs, triplets, quadruplets and so forth to form receiving
alphabets larger than two. These alphabets are called the first,
second, third and so forth extensions of the primary binary alphabet.
These extensions are called a byte.
Packaged items in stores have a bar code attached
that permits the cashier to record the price of the item.
The Postal Service in the United States has established
a ZIP + 4 bar code in which mailing addresses can be
written. The alphabets of the postal and other bar
codes are composed of short bars and long bars, that is,
the source alphabet is binary. There are 32 ($2^5 = 32$) members of the
fifth extension of the source binary bar code alphabet. Thus the postal
bar code has a five bit byte. An important receiving alphabet is one in
which each of the ten numbers from 0 to 9 has two ones or two zeros.
The source probability space alphabet (Ω, A, p) is mapped on the
receiver probability space alphabet (Ω, B, p). The Postal Service has
assigned arbitrarily the ten members of the fifth extension that have
two ones to one of the ten digits in the decimal system. The ten
members selected from the fifth extension are called sense code
letters. The assignment of sense code letters in the fifth extension
alphabet of the postal ZIP + 4 code is shown by the first row in
Table 1.
The other code letters are called non-sense because they have been
given no sense or meaning assignment in the receiving alphabet.
(Remember that non-sense does not mean nonsense or foolishness.) I have
listed those non-sense code letters of the fifth extension of the
binary code with just one change from the sense code letters in each
column, five rows down. Notice that the postal ZIP + 4 code is an error
detecting code. A single error does not change one sense code letter to
another sense code letter. If, due to smudging or other malfunction,
the sorting machine reads a non-sense code letter the mail piece is
rejected to be examined by a postal employee.
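
The error detecting property follows from the fact that every sense code letter has exactly two ones, so a single changed bit leaves one or three ones. The sketch below checks this by enumeration; the sense code letters are copied from the first row of Table 1, and the helper names are assumptions introduced for illustration.

```python
from itertools import product

# Sense code letters from the first row of Table 1 (five-bit letter -> digit).
SENSE = {
    "11000": 0, "00011": 1, "00101": 2, "00110": 3, "01001": 4,
    "01010": 5, "01100": 6, "10001": 7, "10010": 8, "10100": 9,
}

def single_bit_errors(letter):
    """All code letters that differ from `letter` in exactly one position."""
    flip = {"0": "1", "1": "0"}
    return [letter[:i] + flip[letter[i]] + letter[i + 1:]
            for i in range(len(letter))]

# The fifth extension of the binary alphabet: all 32 five-bit letters.
all_letters = ["".join(bits) for bits in product("01", repeat=5)]
two_ones = [w for w in all_letters if w.count("1") == 2]

print(len(all_letters), len(two_ones))
print(sorted(two_ones) == sorted(SENSE))  # sense letters are exactly the two-ones letters

# Error detection: no single error turns one sense letter into another.
detected = all(err not in SENSE
               for letter in SENSE
               for err in single_bit_errors(letter))
print(detected)
```

A code built this way detects any single error but cannot correct it, which is consistent with the rejected mail piece being passed to a postal employee.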

Table 1
The postal ZIP + 4 code

0      1      2      3      4      5      6      7      8      9
11000  00011  00101  00110  01001  01010  01100  10001  10010  10100
01000  10011  10101  10110  11001  11010  11100  00001  00010  00100
10000  01011  01101  01110  00001  00010  00100  11001  11010  11100
11100  00111  00001  00010  01101  01110  01000  10101  10110  10000
11010  00001  00111  00100  01011  01000  01110  10011  10000  10110
11001  00010  00100  00111  01000  01011  01101  10000  10011  10101

Computer equipment uses the seventh extension, ASCII [A(merican)
S(tandard) C(ode for) I(nformation) I(nterchange)] computer code. The
byte is 7 bits, the amount of information in one printed character in
the receiver alphabet. There are $2^7 = 128$ members of the seventh
extension alphabet. Members of this seventh extension of the primary
alphabet are assigned arbitrarily to the letters in the receiving
alphabet of the word processor. This assignment is appropriate to the
language for which the word processor is designed.
4 The genetic code
In another corner of the groves of academe, remote
from the activities of mathematicians and communica-
tion engineers, after the discoveries of Watson and
Crick in 1953, it was realized that there must be a code
that performs a mapping between the sequences of the
four nucleotides in DNA to the sequences of the 20
amino acids in protein. George Gamow made the first
suggestion of an overlapping code. It was soon proved
to be incorrect, but the search was on. It is easy to see
that overlapping codes impose severe restrictions on the
allowed amino acid sequences. Since it was clear that
there are 64 possible triplets and only 20 amino acids,
the genetic code was believed to be 'degenerate' and
some codons were believed to be 'nonsense'. Both terms are
unnecessarily pejorative and non-constructive. The first
sentence of Sonneborn (1965) expresses the attitude of
the time: "Part I of this paper deals with the amount of
degeneracy in the genetic code or, looked at in reverse,
the amount of nonsense." Unfortunately, the use of the
word 'nonsense' persists in some current publications
(Maeshiro and Kimura, 1998).
The apartheid in the groves of academe between
mathematicians, electrical engineers and molecular biologists
prevented the notion of block codes, initiators and enders
from being applied to the understanding of the
genetic code and the genetic logic system. For example,
there are three different frames in which a sequence of
nucleotides in DNA can be read. Crick et al. (1957)
assumed that there must be 'commas' to separate a
string of nucleotides forming codons to prevent them
from reading out of frame. They attempted to show
that the maximum number of sense codons cannot be
greater than 20 and gave a solution for the 20. Unfortu-
nately they proved (sic) that the triplet AAA and other
triplets must be nonsense (sic). Table 2 shows that
AAA codes for lysine, CCC codes for proline, UUU
codes for phenylalanine and GGG codes for glycine.
Furthermore, a tRNA species possesses an anticodon
complementary to UGA, one of the nonsense codons,
which codes for selenocysteine, thus making 21, not 20
amino acids (Hawkes and Tappel, 1983; Sunde and
Evenson, 1987; Leinfelder et al., 1988; Mizutani and
Hitaka, 1988). The use of an initiator codon and
codons to terminate the sequence and the realization
that the genetic code is a block code, like the bar codes
discussed above, makes it unnecessary to assume that
commas are needed.
The source alphabet of the genetic code is the four
nucleotides of DNA and mRNA. Thus each nucleotide
has a two-bit byte. The first extension of that quater-
nary alphabet has sixteen letters. That is not sufficient
to code the canonical 20 amino acids that are tran-
scribed in protein; accordingly, Nature has gone to the
second extension 64-letter alphabet. Thus the genetic
code has a six-bit byte, usually called a codon. Some-
times the codon is called a word, but clearly this is
inappropriate. The codon is a letter in the second
extension of the mRNA alphabet; the protein is a word.
The genetic code, shown in Table 2, shares a number of
properties with the postal ZIP + 4 code and the ASCII
computer codes. The genetic code is a block code
because all codons are triplets. The genetic code is
distinct and uniquely decodable because the single
methionine codon AUG, and sometimes the leucine
codons UUG and CUG, serve as a starting signal for
the protein sequence and perform the same function as
the long frame bar at the beginning of the postal
message in the ZIP + 4 code. The non-sense codons
UGA, UAA and UAG stop the translation of the
protein from the mRNA and initiate the release of the
protein sequence from the mRNA (Maeshiro and
Kimura, 1998). They perform the same function as the
long frame bar at the end of the postal bar code
message.
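
The sizes of the extensions of the four-letter nucleotide alphabet can be enumerated directly. This is a minimal sketch; the function name `extension` is an assumption, and the numbering follows the usage in the text, where pairs form the first extension and triplets the second.

```python
import math
from itertools import product

NUCLEOTIDES = "UCAG"  # mRNA source alphabet, two bits per letter

def extension(alphabet, length):
    """All letters of the block alphabet built from `length` source symbols."""
    return ["".join(block) for block in product(alphabet, repeat=length)]

for length in (1, 2, 3):
    letters = extension(NUCLEOTIDES, length)
    bits = math.log2(len(letters))
    enough = len(letters) >= 20  # can it cover the 20 canonical amino acids?
    print(length, len(letters), bits, enough)

codons = extension(NUCLEOTIDES, 3)
```

Only the triplet alphabet has at least 20 letters, and each of its letters carries six bits when all letters are counted equally.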
Table 2
The mRNA genetic code

Amino acid   Triplet codons     Amino acid      Triplet codons   Amino acid      Triplet codons
Glycine      GGG GGC GGU GGA    Phenylalanine   UUU UUC          Leucine         UUA UUG
Proline      CCG CCC CCU CCA    Cysteine        UGU UGC          Tryptophan      UGG
                                                                 Nonsense        UGA
Leucine      CUG CUC CUU CUA    Glutamine       CAA CAG          Histidine       CAU CAC
Arginine     CGG CGC CGU CGA    Asparagine      AAU AAC          Lysine          AAA AAG
Threonine    ACG ACC ACU ACA    Glutamic acid   GAA GAG          Aspartic acid   GAU GAC
Valine       GUG GUC GUU GUA    Isoleucine      AUU AUC AUA      Methionine      AUG
Alanine      GCG GCC GCU GCA    Nonsense        UAA UAG          Tyrosine        UAU UAC
Serine       UCG UCC UCU UCA    Arginine        AGA AGG
                                Serine          AGU AGC

5 The origin and evolution of the genetic code
The Standard Genetic Code as represented in Table 2
was once thought to be universal in all biology, indeed,
in the earlier papers it was sometimes referred to as the
Universal Genetic Code. Crick (1968) proposed that the
genetic code was a frozen accident. It has been learned
since that the genetic code is not universal, nor is it
fixed or frozen. Placing this problem in the hands of the
fickle goddess Fortuna begs the question. There has
been considerable speculation about the order in the
genetic code. Any small group of chance events seems
to have order. This is the gambler's ruin. As the
ancient Greeks and Romans presumed to see the future
in the flight of birds, the gambler sees a pattern or order in the
fall of the dice or the cards. Fortuna, the goddess of
chance, may allow him a brief episode of luck but
inevitably she turns against him to his ruin (Feller,
1957).
Jukes (1966, 1983) suggested that the second exten-
sion triplet genetic code may have evolved from a
preceding first extension doublet code. Jukes' archetypal
genetic code had 16 codons that translated 14 or 15
amino acids together with a stop codon (Jukes, 1966).
All amino acid codons except Methionine and Tryp-
tophan have some redundance in the third place in the
codon; this lends some support for that suggestion.
That redundance provides some protection from muta-
tional error. I have given a detailed discussion of the
consequences of the evolution of the standard and
mitochondrial genetic codes from an archetypal doublet
code in which the third nucleotide is silent (Yockey,
1992).
Crick (1968) pointed out, correctly, that any change
in the code would cause errors throughout the process
of protein formation, resulting in a disruption that
would be lethal. However, Jukes (1993) noted that
codon use varies considerably and that in some organ-
isms certain codons fall out of use altogether. He
suggested that after a time such codons had been reassigned or
captured. For example, in vertebrate mitochondria AUA codes for
methionine and UGA for tryptophan. The most recent studies of codons
that code for amino acids that differ from those in the Standard
Genetic Code in Table 2 are in Osawa (1995), Osawa et al. (1990, 1992)
and Maeshiro and Kimura (1998).
6 The Central Dogma of Francis Crick
With the appearance of the experimental knowledge about the transfer of
information from the four letter alphabet of DNA and of mRNA, the
question emerged among molecular biologists of how a four letter
alphabet could send information to a 20 letter alphabet. Crick (1968),
in a remarkably prescient paper, suggested that information could be
transferred from DNA to RNA and from RNA to protein, but not from
protein to protein. He called this by the somewhat ecclesiastical name
The Central Dogma of molecular biology (Crick, 1970). The reason for
these statements, as we have seen above, is that Nature had extended
the primary four letter alphabet to the six-bit, 64-codon alphabet of
the genetic code. Thus the genetic code is a several-to-one code. Note
that the dice game is also a several-to-one code and for that reason
has a Central Dogma. The source alphabet has 36 members, namely, all
pairs of the numbers 1 through 6. These pairs of numbers are analogous
to codons in the genetic code. The receiver alphabet is formed by
adding the two numbers and has only 11 members, the numbers 2 through
12. These numbers are analogous to the amino acids in protein
sequences. Except for 2 and 12, all totals can be made by more than one
combination of numbers read from the dice. Thus there is a loss of
information (and knowledge) when the numbers are added because the
logic operation lacks a single-valued inverse. This is known as a logic
ADD gate and is irreversible. Thus the Central Dogma is a theorem in
coding theory and the logic of the genetic communication system. It
does not come from biochemistry. In the full glory of its mathematical
generality it applies to all codes where the information entropy of the
source alphabet is larger than that of the receiver alphabet
(Kolmogorov, 1958; Billingsley, 1965; Shields, 1973; Ornstein, 1974;
Yockey, 1992).
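
The several-to-one character of the dice game can be made explicit by listing, for each total, the pairs that produce it. This is a sketch assuming two fair dice with ordered pairs as the source alphabet; the variable names are illustrative.

```python
import math
from collections import defaultdict
from itertools import product

# Source alphabet: the 36 ordered pairs. Receiver alphabet: their totals.
pairs = list(product(range(1, 7), repeat=2))
preimages = defaultdict(list)
for pair in pairs:
    preimages[sum(pair)].append(pair)

# Entropy of the source (36 equiprobable pairs) and of the receiver (totals).
H_source = math.log2(len(pairs))
H_receiver = -sum((len(v) / 36) * math.log2(len(v) / 36)
                  for v in preimages.values())

# Totals that can be traced back to a single pair.
unique_inverse = sorted(t for t, v in preimages.items() if len(v) == 1)

print(len(pairs), len(preimages))
print(unique_inverse)
print(round(H_source, 3), round(H_receiver, 3))
```

The entropy of the receiver alphabet is smaller than that of the source alphabet, and only two totals can be traced back to a unique pair, so the mapping has no single-valued inverse.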
It is obvious that if the source and receiver alphabets have the same
number of symbols, and a one-to-one correspondence between the members
of the alphabets, the logic operation has a single valued inverse and
information may be passed, without loss, in either direction. Thus,
since RNA and DNA both have a four-letter alphabet, genetic messages
may be passed from RNA to DNA. The passage of information from protein
to RNA or DNA is prohibited for two reasons: (1) the logic OR gate is
irreversible, and (2) there is not enough information in the 20 letter
protein alphabet to determine the 64 letter mRNA alphabet from which it
is translated (Yockey, 1992, 1995a). There is no mathematical
restriction of the transfer of information from protein to protein.
This was once thought to be the means by which prions were formed but
that is no longer believed to be the case. There appears to be no
biological mechanism by which information can be exchanged between two
proteins.
7. The genetic code and error in the genetic message
Let us refer to the second paragraph in Shannon's 1948 paper where he wrote:

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.

Shannon modeled the generation of a message as a Markov process. At all stages of the communication system the message may be acted upon by a second Markov process leading to an interchange in some of the letters in the message in a random and non-reproducible fashion. The result of this process is called noise.
Fig. 1 (from Information Theory and Molecular Biology (1992), Cambridge University Press, with permission) shows the basic elements of a communications system as presented by Shannon. In Fig. 2 (from Information Theory and Molecular Biology, Cambridge University Press, with permission) we see the communication of information from the genome as recorded in DNA to the message that determines the protein. The sequence of nucleotides that compose DNA plays the same role as the tape in a tape recording machine. The protein molecule, which is the destination of the message, is also a tape. Thus the one-dimensional genetic message is recorded in a sequence of amino acids, which folds up to become a three-dimensional active protein molecule. One is reminded of the linear sequence of signals from a television station that, because of the raster, defines a two-dimensional object, namely, the picture on the screen.

Fig. 1. The transmission of information from source to destination as conceived in electrical engineering. Noise occurs in all stages but is shown according to accepted practice in electrical engineering. (Diagram labels: source tape, source code, encoder, transmitter, channel, channel code, noise, receiver, decoder, destination tape.)

Fig. 2. The transmission of genetic messages from the DNA to the protein tape as conceived in molecular biology. Genetic noise occurs in all stages but is lumped in the figure to fix the idea. (Diagram labels: DNA tape carrying the source message in DNA code; transcription by RNA polymerase from DNA code to mRNA code; channel carrying the mRNA message plus genetic noise; translation (decoding) with tRNA, amino acids and aminoacyl synthetases, including mischarged tRNA; protein tape carrying the destination message in protein code.)
A reading error that afflicts the reading of a message in the postal code, and indeed in all codes, is called genetic noise in Fig. 2. There is very little error detection or correction attributed to the structure of the genetic code. Nevertheless, the genetic information system is extremely accurate. The error frequency in the first aminoacylation of an amino acid on its adapter tRNA is about $3 \times 10^{-4}$. It is a curious phenomenon that there is a proofreading or double sieving mechanism that checks this process by discarding 'wrong' amino acids. It also has an error frequency of about $3 \times 10^{-4}$. The error frequency after the operation of both mechanisms is the product $3 \times 10^{-4} \times 3 \times 10^{-4} = 9 \times 10^{-8}$ (Freist et al., 1998).
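The arithmetic behind the combined error frequency can be written out as a short sketch. It assumes, as the product in the text implies, that the two sieving steps fail independently; the variable names are illustrative and not from the paper.

```python
# Error frequencies quoted in the text (Freist et al., 1998).
first_sieve_error = 3e-4      # initial aminoacylation step
proofreading_error = 3e-4     # proofreading (second sieve) step

# Assumption: the two steps fail independently, so a wrong amino acid
# survives only if both steps fail, and the frequencies multiply.
combined_error = first_sieve_error * proofreading_error

print(f"combined error frequency: {combined_error:.1e}")
```

Under the independence assumption, a second check of the same quality squares the error frequency rather than merely halving it.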
8. Shannon's Channel Capacity Theorem and the role of error in protein function and specificity
Shannon's Channel Capacity Theorem proved that codes exist such that sufficient redundance can be introduced so that a message can be sent from source to receiver with as few errors as may be specified. The theorem is not constructive so one must use one's ingenuity to construct suitable codes. Error detecting and correction codes are formed by going to higher extensions and using the redundance of the extended symbols for error detection and correction (Hamming, 1986). The intense academic and industrial interest in the improvements in communication promised by Shannon's Channel Capacity Theorem led to the discovery by mathematicians and electrical engineers of a number of error correcting codes.
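How redundance buys error correction can be illustrated with the simplest possible code. The sketch below is a triple repetition code with majority-vote decoding, chosen here purely for illustration; it is not one of the codes discussed by Hamming (1986) and it is far from the efficiency the Channel Capacity Theorem promises. The function names are hypothetical.

```python
def encode(bits, copies=3):
    """Add redundance by repeating every bit `copies` times."""
    return [bit for bit in bits for _ in range(copies)]


def decode(received, copies=3):
    """Recover each bit by majority vote over its block of copies."""
    return [
        1 if sum(received[start:start + copies]) > copies // 2 else 0
        for start in range(0, len(received), copies)
    ]


message = [1, 0, 1, 1]
sent = encode(message)

# Simulate noise: flip one symbol inside the first block.
corrupted = list(sent)
corrupted[0] ^= 1

print(decode(corrupted) == message)
```

One flipped symbol per block is corrected; two flips in the same block are not, which is why more elaborate codes are needed to approach the error rates the theorem allows.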
Biologists, working in another part of the groves of academe, confronted the problem of error in the formation of protein from the sequence of nucleotides in DNA. Although they had the idea of error accumulation and its effect on the viability of organisms, regrettably they attempted to consider the problem in terms of the errors themselves (Orgel, 1963, 1970). This led many of them to speculate that an 'error catastrophe' would occur inevitably as errors accumulated, leading to the death of the organism. Eigen (1971, 1992) and Eigen and Schuster (1977) also address the question of the effect of errors in the formation of proteins from considerations of the errors themselves. These authors find an 'error threshold' to apply to the transfer of information from DNA through mRNA to protein.

Some error can be tolerated in the process of protein formation. Specific protein molecules having amino acids that differ from those coded for in DNA may have full specificity if the mutation is to a functionally equivalent amino acid. Protein molecules rendered inactive because of mischarged amino acids that are not functionally equivalent are degraded by proteolytic enzymes. It is only when the supply of essential proteins decays below a critical level that protein error becomes lethal. Although other considerations on the aging process have been made (Holliday, 1986), the accuracy of protein biosynthesis is still of topical interest (Freist et al., 1998).
Since Shannon regarded the generation of a message to be a Markov process, it was natural to measure the effect of errors due to noise by the conditional entropy, $H(x|y)$, between the source probability space $(\Omega, A, p)$ with an input alphabet $A$ with elements $x_i$ and the receiving probability space $(\Omega, B, p)$ with receiving alphabet $B$ with elements $y_j$. The conditional probability matrix $P$ with matrix elements $p(i|j)$ gives the probability that if letter $y_j$ appears at the receiver that letter $x_i$ was sent. The probabilities $p_i$ in $(\Omega, A, p)$ and $p_j$ in $(\Omega, B, p)$ are related by the following equation:
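$$p_j = \sum_i p_i \, p(j|i) \qquad (9)$$
where $p(j|i)$ is the probability that letter $y_j$ is received when letter $x_i$ was sent.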
The conditional entropy, $H(x|y)$, is written in terms of the components of $p$ and the elements of $P$:
$$H(x|y) = -\sum_{i,j} p_j \, p(i|j) \log_2 p(i|j) \qquad (10)$$
The mutual entropy $I(A;B)$ that measures the amount of shared or mutual information of the input sequence and the output is:
$$I(A;B) = H(x) - H(x|y) \qquad (11)$$
It proves more convenient to deal with the message at the source and therefore with the $p_i$ and the $p(j|i)$ (Yockey, 1974, 1992). From Bayes' theorem on conditional probabilities, we have (Feller, 1957):
$$p(i|j) = p_i \, p(j|i) / p_j \qquad (12)$$
Substituting this expression for $p(i|j)$ in Eq. (10) we have:
$$I(A;B) = H(x) - H(y|x) - \sum_{i,j} p_i \, p(j|i) \log_2 (p_j / p_i) \qquad (13)$$
where
$$H(y|x) = -\sum_{i,j} p_i \, p(j|i) \log_2 p(j|i) \qquad (14)$$
$H(y|x)$ vanishes if there is no noise because the matrix elements $p(j|i)$ are all either 0 or 1. The third term in Eq. (13) is the information that cannot be transmitted to the receiver if the entropy of $(\Omega, A, p)$ is greater than the entropy of $(\Omega, B, p)$. For illustration we may set all the matrix elements $p(j|i)$ of $P$ to the values given in Table 3, where $\alpha$ is the probability of misreading one nucleotide. Substituting these matrix elements in Eqs. (13) and (14) and replacing the logarithm by its expansion, keeping only terms of second degree, we have:
$$I(A;B) = H(x) - 1.7915 + 34.2018\,\alpha^2 + 6.803\,\alpha \log_2 \alpha \qquad (15)$$
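The agreement between Eq. (11) and Eq. (13) can be checked numerically. The sketch below is an illustration under stated assumptions, not the calculation behind Eq. (15): it takes a source distribution `p_src` (the $p_i$) and a matrix `channel[i][j]` standing for $p(j|i)$, both hypothetical names, and evaluates the mutual entropy both ways, using Bayes' theorem (Eq. (12)) to obtain $p(i|j)$.

```python
from math import log2


def mutual_entropy_two_ways(p_src, channel):
    """Return (Eq. 11 value, Eq. 13 value, H(y|x), third term of Eq. 13).

    p_src[i] is p_i and channel[i][j] is p(j|i); both are illustrative inputs.
    """
    n_out = len(channel[0])
    # Receiver probabilities p_j obtained from p_i and p(j|i).
    p_out = [
        sum(p_src[i] * channel[i][j] for i in range(len(p_src)))
        for j in range(n_out)
    ]
    h_x = -sum(p * log2(p) for p in p_src if p > 0)

    h_x_given_y = 0.0   # Eq. (10)
    h_y_given_x = 0.0   # Eq. (14)
    third_term = 0.0    # last sum in Eq. (13)
    for i, p_i in enumerate(p_src):
        for j, p_j in enumerate(p_out):
            p_ji = channel[i][j]
            if p_i == 0 or p_ji == 0:
                continue  # zero-probability terms contribute nothing
            p_ij = p_i * p_ji / p_j  # Bayes' theorem, Eq. (12)
            h_x_given_y -= p_j * p_ij * log2(p_ij)
            h_y_given_x -= p_i * p_ji * log2(p_ji)
            third_term += p_i * p_ji * log2(p_j / p_i)

    eq11 = h_x - h_x_given_y
    eq13 = h_x - h_y_given_x - third_term
    return eq11, eq13, h_y_given_x, third_term


# Noisy channel: two equiprobable letters, each misread with probability 0.1.
noisy = mutual_entropy_two_ways([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]])

# Noiseless many-to-one code: four equiprobable letters mapped onto two.
merged = mutual_entropy_two_ways(
    [0.25] * 4, [[1, 0], [1, 0], [0, 1], [0, 1]]
)
print(noisy)
print(merged)
```

In the second example $H(y|x)$ is zero because every $p(j|i)$ is 0 or 1, and the whole loss appears in the third term, as the text describes for a source alphabet with more entropy than the receiver alphabet.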
The genetic code cannot transfer 1.7915 bits of its six-bit alphabet to the protein sequences even when free of errors. The reader may find it instructive and amusing to carry out these calculations for the dice game. How much information is lost by adding the numbers on the dice? One must write zero instead of $\alpha$ for the matrix element in Table 3 if the replacement amino acid is functionally equivalent at each site in the protein sequence. I have pursued this discussion to calculate the information content of the cytochrome c molecule to be 374 bits (Yockey, 1992).
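The dice calculation suggested above can be sketched as follows. The sketch assumes two fair, distinguishable dice, so that the 36 ordered pairs are equally probable; that assumption and the variable names are introduced here and are not stated in the paper.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# Source alphabet: 36 equiprobable ordered pairs (assumes fair dice).
pairs = list(product(range(1, 7), repeat=2))
h_source = entropy([1 / len(pairs)] * len(pairs))

# Receiver alphabet: the 11 totals, weighted by how many pairs give each.
total_counts = Counter(first + second for first, second in pairs)
h_receiver = entropy([count / len(pairs) for count in total_counts.values()])

# Information that cannot reach the receiver when the numbers are added.
information_lost = h_source - h_receiver

print(f"source:   {h_source:.4f} bits")
print(f"receiver: {h_receiver:.4f} bits")
print(f"lost:     {information_lost:.4f} bits")
```

With these assumptions the source carries about 5.17 bits per throw, the totals carry about 3.27 bits, and roughly 1.90 bits are lost, which plays the same role for the dice game as the 1.7915 bits quoted for the genetic code.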
We can replace the sender and receiver and calculate the mutual entropy of any sequences or families of sequences however they may have been generated. The mutual entropy will give us a measure of the 'similarity' of the sequences or families of sequences (Shannon, 1948; Yockey, 1992; Tononi et al., 1996, 1999). Perhaps because of the apartheid in the groves of academe, the measure of the similarity of sequences is given in the literature as 'percent identity'. 'Percent identity' (Altschul et al., 1997; Levitt and Gerstein, 1998) is only an ad hoc score of similarity for the same reasons that error frequency is not an acceptable measure of protein error. Given a specific site in an alignment of two sequences or families of sequences, one must consider the closely related functionally equivalent amino acids. 'Percent identity' does not take into account that amino acids are almost always not equally probable and for this reason leads to illusions. Since, as Socrates said, "... but satisfactory means have been found for dispelling these illusions by measuring, counting and weighing", the mutual entropy should be used as the only correct measure of 'similarity'.
According to mechanists-reductionists, non-living matter can self-organize, and so life and evolution are manifestations of the spontaneous emergence of 'order'. Living matter, they say, has both 'order' and 'complexity.' Information theory shows that the ideas of 'complexity' and 'order' are contradictory and mutually exclusive. Crystals are highly ordered due to the regularities in the placement of their molecules as directed by the laws of physics and chemistry. However, these regularities, which allow crystals to be described by a short sequence or algorithm, mean that crystals have a very low information content. Information theory narrows this definition to say that we may only call a sequence 'highly ordered' if it has regularities and can be described by a much shorter sequence or algorithm. The information content of a sequence is measured in bits (or bytes).

Table 3. Genetic code probability matrix elements $p(j|i)$, relating the mRNA codons to the amino acids in terms of the misreading probability $\alpha$ (reproduced from Yockey, H.P., J. Theor. Biol., with permission). [Table body not recoverable.]
Sequences of symbols that exhibit no orderly pattern from which the rest of the sequence can be predicted may, nevertheless, have low complexity because they can be computed from an algorithm of finite information content (Pincus and Kalman, 1997; Pincus and Singer, 1996). For example, π and e have been calculated to sequences of more than a billion digits. They exhibit no orderly pattern, yet each digit in turn is computable. We may conclude, in general, that a finite message, specified by a computer program, may exist that carries all the information contained in an infinite sequence even though there is no discernible pattern in the sequence of symbols.
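A concrete instance of such a finite message is a short program that emits the digits of π one after another. The sketch below uses Gibbons' unbounded spigot algorithm, a technique chosen here for illustration and not one cited in the paper; the generator name `pi_digits` is hypothetical.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi indefinitely (Gibbons' spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough precision yet: absorb one more term of the series.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


first_digits = list(islice(pi_digits(), 10))
print(first_digits)  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
```

The program is a few hundred characters long, yet it specifies every digit of the sequence, which is the sense in which the digits carry very little information despite showing no orderly pattern.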
A sequence of symbols is highly 'complex' when it has little or no redundance or 'order' and cannot be described by a much shorter sequence or calculated by an algorithm of finite length. A random sequence has the highest degree of complexity, has no redundance and cannot be described except by the sequence itself. Thus π and e, although their digits have no orderly pattern, are neither complex nor random. Thus when we speak of the amount of complexity in a sequence we are speaking about the amount of its randomness as well (Yockey, 1990).
The principle, that computer data compression software may reduce the order, redundancies, patterns and regularities peculiar to natural languages, is based on a theorem in information theory by Shannon (1948). The result, achieved after the compression, is nearly indistinguishable from a random sequence since most of the patterns and regularities have been removed.
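The effect of compression on ordered and on patternless sequences can be observed with a general-purpose compressor. The sketch below uses Python's standard `zlib` module as a stand-in for the compression software mentioned in the text; the test sequences and the helper `compression_ratio` are illustrative assumptions, and a pseudo-random byte string stands in for a random sequence.

```python
import random
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (smaller means more order)."""
    return len(zlib.compress(data, 9)) / len(data)


# A highly ordered sequence: one short pattern repeated many times.
ordered = b"AB" * 5000

# A patternless sequence: pseudo-random bytes from a seeded generator.
rng = random.Random(0)
patternless = bytes(rng.getrandbits(8) for _ in range(10000))

# Output of the compressor, fed to the compressor a second time.
already_compressed = zlib.compress(ordered * 50, 9)

print(f"ordered:            {compression_ratio(ordered):.3f}")
print(f"patternless:        {compression_ratio(patternless):.3f}")
print(f"already compressed: {compression_ratio(already_compressed):.3f}")
```

The ordered sequence shrinks to a small fraction of its length, while the patternless one does not shrink at all, consistent with the statement that compressed output has had most of its regularities removed.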
9. Information theory and the origin of life: the gap between living and nonliving
It is a curious fact, highly relevant to the origin of life, that the genetic code is constructed to confront and solve the problems of communication and recording by the same principles found both in the genetic information system and in modern computer and communication codes. The origin of a rather accurate genetic code is a pons asinorum that must be crossed to pass over the abyss that separates crystallography, high polymer chemistry and physics from biology. As I mentioned above, there is nothing in the physico-chemical world that remotely resembles reactions being determined by a sequence and codes between sequences. The existence of a genome and the genetic code divides living organisms from non-living matter.
The origin of life has concerned philosophers and theologians since antiquity, for example, the idealist philosopher Immanuel Kant (1724-1804) in his Critique of Judgment. (A quotation in English translation can be found in Wächtershäuser (1997).) It was not until the nineteenth century that these considerations can be called scientific. There were, in general, three philosophical approaches to the origin of life.

The first was vitalism, the belief there is a metaphysical, supernatural, non-material, idealist élan vital, a vital principle or life force, which distinguishes living from non-living matter. Vitalism has its roots in the German idealist philosophy of F.W.A. Schelling (1775-1854) and others in the nineteenth century, members of a romantic philosophic movement, Naturphilosophie, who believed all creation was a manifestation of a World Spirit. They believed all matter possessed this Spirit and organized bodies had it to an intense degree. In the nineteenth century it was quite possible to be a vitalist, believing in a vital force or élan vital, without thinking of the vital force being supernatural. At the time it was as valid to attribute the laws and effects of vitality to a nonmaterial vital force as it was to attribute the laws and effects of gravity to a nonmaterial gravitational force. As Darwin put it, "Who can explain what is the essence of the attraction of gravity? No one now objects to following out the results consequent on this unknown element of attraction." (Origin of Species, Chapter XV) Vitalism is no longer considered since the discovery by Watson and Crick of the role played by DNA and RNA in the formation of protein.

The second was mechanist-reductionism. A group of young physiologists at an 1869 meeting in Innsbruck, Austria, declared: "The ultimate objective of the natural sciences is to reduce all processes in Nature to the movements that underlie them and to find their driving forces, that is, to reduce them to mechanics" (Mayr, 1982). Wächtershäuser (1997) is a current supporter of mechanist-reductionism, the view that if the historic process of the origin and evolution of life could be followed it would prove to be a purely chemical process.

The third was the dialectical materialism of Friedrich Engels (1954), supported by Oparin (1957). According to dialectical materialism, the appearance of life is achieved, not through the laws of physics and chemistry, but through the Law of the Transformation of Quantity into Quality. For that reason the proponents of dialectical materialism are extremely hostile to mechanist-reductionism. Engels (1820-1895) had observed in one of the fragments of Dialectics of Nature:
Mechanism applied to life is a hopeless category, at
the most we could speak of chemism, if we do not
want to renounce all understanding of names.
Oparin, upon reading that, abandoned reductionism-mechanism and became an immediate convert to dialectical materialism. Oparin, in his later books, went into great detail to denounce reductionism-mechanism and support dialectical materialism. Western intellectual apologists for Oparin who are not trained in the intricacies of philosophy may find the relationship between reductionism-mechanism and dialectical materialism to be a distinction without a difference (Lazcano, 1997; Miller et al., 1997).
I shall pursue those proposals that purport to generate a genome and a genetic code. A central speculation involves the premise that life originated from a primeval soup of organic substances in the ocean by prebiotic or chemical evolution, a belief that life would arise spontaneously in flagrante delicto from this non-living matter and that it would almost inevitably arise on 'sufficiently similar young planets elsewhere'. George Gaylord Simpson (1964) pointed out that the National Aeronautics and Space Administration (NASA) has a program in 'space bioscience'. "There is even increasing recognition of a new science of extraterrestrial life, sometimes called exobiology - a curious development in view of the fact that this 'science' has yet to demonstrate that its subject matter exists." Thirty-five years later that statement is still true.
Stanley Miller (1953) is usually given credit for being the first to generate amino acids in a prebiotic atmosphere. However, Walther Löb (1913) and Oskar Baudisch (1913) anticipated him by 40 years. These two workers, both addressing the question of the assimilation of nitrogen to form protein, showed that amino acids are generated in the silent electrical discharge (Löb, 1913) only in a reducing atmosphere, and by UV (Baudisch, 1913), also only in a reducing atmosphere. I have given a review of the views of Darwin, Oparin and Miller together with a criticism that there is no geological evidence that such a primeval soup ever existed (Yockey, 1992, 1995a,b, 1997). Miller et al. (1997) and Lazcano (1997) admit that there is no geological or geochemical evidence that a primeval soup ever existed. Mojzsis et al. (1998) remarked: "However, it is now held highly unlikely that the conditions used in these experiments (silent electrical discharge) could represent those in the Archaean atmosphere. Even so, scientific articles still occasionally appear that report experiments modelled on these conditions and explicitly or tacitly claim the presence of resulting products in reactive concentrations 'on the primordial Earth' or in a 'prebiotic soup.' (See for example, Deamer, 1997; Smith, 1998, 1999). The idea of such a 'soup' containing all the desired organic molecules in concentrated form in the ocean has been a misleading concept against which objections were raised early (Sillén, 1965)." This remark belies the title of the book in which it appears.
Exactly by whom and when the modern notion that life originated from colloids or coacervates generated from the organic substances in the early ocean was first expounded in specific form is hard to find. Haeckel (1834-1919) claimed priority. He wrote: "The monistic hypothesis of abiogenesis, or autogeny in the strictly scientific sense of the word, was first formulated by me in 1866 in the second book of the General Morphology." (Today autogeny is called self-organization.) As Haeckel (1866) explained:

"The chemical processes which first set in at this stage of development must have been catalysis, which led to the formation of albuminous combinations, and eventually of plasm. The earliest organisms to be thus formed can only have been plasmodomous Monera, structureless organisms without organs; the first forms in which living matter individualized were probably homogeneous globs of plasm, like certain of the actual chromacea (chroococcus). The first cells were developed secondarily from these primitive Monera, by separation of the central caryoplasm (nucleus) and peripheral cytoplasm (cell body)."
Haeckel's views were well-known in the nineteenth and early twentieth centuries for he was widely published in professional books and journals and in best-selling popular books translated from the German to several languages. Haeckel's discussion of the origin of life from a primordial soup in the early ocean was so well known among scientists, theologians and the theater-going public in 1878 that Sir William Schwenck Gilbert (1836-1911) had Pooh-Bah, a comic, greedy and conceited character in The Mikado, introduce himself as follows:
I am in point of fact, a particularly haughty and exclusive person, of pre-Adamite ancestral descent. You will understand this when I tell you that I can trace my ancestry back to a protoplasmal primordial atomic globule (Gilbert, 1878).
Louis Pasteur (1822-1895) showed that properly sterilized
cultures always became infected when exposed to air. Alpine air,
which was almost free of germs, seldom produced a growth of
organisms. Thoroughly sterilized cultures remained so, for
years, in the absence of germs added from without. Life,
therefore, must come only from life. Haeckel maintained that
Pasteur had only settled the negative in certain circumstances.
It being very difficult or impossible to prove a negative, many
scientists in the nineteenth century, and many today, support
spontaneous origin of life from a primeval chemical soup in the
early ocean.
The 'warm little pond' quotation (Darwin, 1898) from Darwin's
private correspondence does not reflect Charles Darwin's views
on the origin of life. Had he
thought the 'warm little pond' idea worthy of publication, he
would certainly have done so. Significantly, Darwin avoided the
origin of life controversy in the sixth edition (1872) of The
Origin of Species:
It is no valid objection that science as yet throws no light on
the far higher problem of the essence or origin of life. Who can
explain what is the essence of the attraction of gravity? No one
now objects to following out the results consequent on this
unknown element of attraction; notwithstanding that Leibnitz
formerly accused Newton of introducing 'occult qualities into
philosophy' (Chapter XV). (My italics)
Those who speculated about the origin of life proposed various
models similar to Haeckel's. Before the discovery of the genetic
code, Jacques Loeb in 1906 objected to the proposal that life
emerged from colloids through catalysis (Loeb, 1906). Loeb was
an expert in the chemistry of colloids and very well-known in
the first part of the twentieth century. In his book, The
Dynamics of Living Matter, published in 1906, Loeb wrote:
But we see that plants and animals during their growth
continually transform dead into living matter, and that the
chemical processes in living matter do not differ in principle
from those in dead matter. There is, therefore, no reason to
predict that abiogenesis is impossible and I believe that it can
only help science if younger investigators realize that
experimental biogenesis is the goal of biology. On the other
hand, our lectures show clearly that we can only consider the
problem of abiogenesis solved when the artificially produced
substance is capable of development, growth, and reproduction.
It is not sufficient for this purpose to make protein
synthetically, or to produce in gelatin or other colloidal
material round granules that have an external resemblance to
living cells.
Loeb rejected the colloid, or Pooh-Bah's protoplasmal primordial
atomic globule, model of the origin of life, however much it may
have appealed to his mechanist ideology. He saw that colloids
lack the characteristic chemical processes, namely enzymes, by
which organisms make sugars, fats, proteins and other molecules
essential to their metabolism. In the terms I am discussing,
they have no genome to control the formation of these critical
compounds. It is a travesty that Loeb's comments are not now
mentioned in the literature.
Self-replication is a fundamental requirement in speculations on
the origin of life. Self-replication mechanisms at the origin of
life must have operated without the action of enzymes, because
enzymes themselves are produced by instructions in the genetic
message. All schemes for non-enzymatic self-replication require
a skilled chemist acting as a deus ex machina, using pure
chemicals in a well-equipped chemical laboratory. The search for
non-enzymatic self-replicating chemical systems is a current
field of biochemistry. The function of enzymes is to speed up or
catalyze chemical reactions. On the biologically significant
time scale of seconds to years, some enzymes may catalyze as
many as a million reactions per minute, whereas in the absence
of the enzyme the reaction may take many years.
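
To give a feel for the scale of the figures just quoted, the following is a minimal back-of-the-envelope sketch in Python. The text says only that the uncatalyzed reaction "may take many years", so the uncatalyzed times used here (1, 10 and 100 years per reaction) are assumptions chosen for illustration, and the function name rate_enhancement is hypothetical rather than taken from the paper.

```python
# Back-of-the-envelope comparison of catalyzed and uncatalyzed reaction rates.
# ASSUMPTION: "many years" is modelled as one reaction per `uncatalyzed_years`;
# the text gives no figure, so the values below are illustrative only.

MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes in a Julian year


def rate_enhancement(catalyzed_per_minute: float, uncatalyzed_years: float) -> float:
    """Return the dimensionless ratio of catalyzed to uncatalyzed rate."""
    if catalyzed_per_minute <= 0 or uncatalyzed_years <= 0:
        raise ValueError("rates and times must be positive")
    # One uncatalyzed reaction per `uncatalyzed_years`, expressed per minute.
    uncatalyzed_per_minute = 1.0 / (uncatalyzed_years * MINUTES_PER_YEAR)
    return catalyzed_per_minute / uncatalyzed_per_minute


if __name__ == "__main__":
    # A million reactions per minute is the upper figure quoted in the text.
    for years in (1, 10, 100):
        ratio = rate_enhancement(1e6, years)
        print(f"{years:>4} yr per uncatalyzed reaction: enhancement ~ {ratio:.1e}")
```

Even under the most conservative of these assumptions (one uncatalyzed reaction per year), the ratio is roughly 5 x 10^11, which is why a self-replication scheme that must run without enzymes faces so large a kinetic handicap.
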
Directed Panspermia, the suggestion that life was introduced
from outer space, raises its head from time to time. Hermann
Ludwig Ferdinand von Helmholtz (1821-1894) and Lord Kelvin,
William Thomson (1824-1907), proposed the Directed Panspermia
theory in the nineteenth century. Helmholtz and Arrhenius (1908)
proposed that life is eternal and that meteors that roam about
the solar system might contain germs of organisms that, under
favorable conditions, might reach the Earth and other planets.
Crick (1968) suggested that 'Possibly the first enzyme was an
RNA molecule with replicase properties'; this proposal is now
known as The RNA World, a name coined by Gilbert (1986). The
discovery of ribozymes, in which RNA plays both roles, namely,
that of recording and transfer of information and the catalysis
role of enzymes, proved this prophetic (Cech, 1986; Zaug and
Cech, 1986). Some thought ribozymes provided the pathway from
the primeval soup to the protobiont. The RNA World was proposed
as a self-replicating system preceding the present system, one
that did not depend on the enzyme action of proteins but rather
on a pre-enzyme action of nucleic acids (Joyce, 1998). This
suggestion seems to move the problem a step nearer to the
protobiont but still encounters the primary questions of the
generation of the genetic message and of the origin and
evolution of the genetic code.
Taken at best, however, this proposal simply moved the search
for an origin of life scenario one step from the primeval soup,
where it encounters the need for a chemical procedure for the
prebiotic formation of the sugar ribose, a component of RNA.
There is only one plausible synthesis of ribose that may be
considered in the prebiotic milieu, namely, the polymerization
of formaldehyde. Ribose is only one of a number of sugars and
never the primary product (Orgel, 1986). Furthermore, the
condensation of adenine or guanine with ribose leads to mixtures
of the optical isomers that are very complex. From the point of
view of the biochemist, this problem is as difficult as the one
the discovery of ribozymes was supposed to solve. Furthermore,
non-biological reactions in the prebiotic milieu lead to 2',5'
isomers (Orgel, 1986). This does not lead in the direction of
the origin of life because the reactions that are typical of
biochemical products are almost