Helix–Coil Transition in Heterogeneous Chains. II. DNA Model


BIOPOLYMERS VOL. 9, PP. 205-221 (1970)

Helix-Coil Transition in Heterogeneous Chains. II. DNA Model

Synopsis

Melting curves are calculated for infinitely long DNA-like random copolymers composed of AT and GC pairs of nucleotides. The entropy of random coil rings formed on melting is explicitly included through use of the Jacobson-Stockmayer ring-weighting factors. Transition curves are calculated for values of the cooperativity parameter $\sigma$ in the range $10^{-4} < \sigma < 10^{-2}$. Ninety percent of the melting occurs in ca. 0.2° for $\sigma \le$ [...] regardless of the mole fraction of GC. We conclude that observed breadths of thermal denaturation curves for native DNAs result from a superposition of essentially all-or-none melting of various regions of the molecule. It is argued that refined approximations to the ring-weighting factors are probably not important when compared with the effects produced by long-range base sequence correlations which are known to occur in native DNA.

Introduction

Thermal denaturations of helical macromolecules of biological importance have been intensively studied for some time. The initial theoretical work on polypeptides clearly demonstrated the effects to be expected of cooperativity, and provided an understanding of the relatively sharp transition profiles exhibited by biopolymers. Further questions regarding the effects of compositional heterogeneity have recently been answered, and an exact solution for copolymeric polypeptides²¹ has been found. Thus, one may say with some assurance that the problem of the thermal denaturation of polypeptides has been solved, at least insofar as the particular model conforms to physical reality.

Molecules such as double-stranded DNA are not so well understood, however. By virtue of the double-strandedness, the entropy gained on melting interior portions of helical DNA does not equal that characterizing melting from the ends of the molecule. The presence of randomly coiling rings of unbonded bases forces the use of approximations for description of such rings, and further complicates the mathematical analysis. Recently, an approximation scheme designed to incorporate both the effects of double-strandedness and of base-sequence heterogeneity has been introduced.²³ Judged on comparison²²,²³ with both exact²¹ and Monte Carlo calculations²² for a polypeptide model, the approximate methods introduced by Fixman and Zeroka²³ (hereafter known as FZ) to treat compositional heterogeneity have been successful. Thus, with some confidence in the FZ polypeptide model, we have sought to understand the effects produced by inclusion of the internal rings or loops characteristic of double-stranded DNA.

* Present address: Department of Chemistry, University of Washington, Seattle, Washington 98105.

© 1970 by John Wiley & Sons, Inc.

EICHINGER AND FIXMAN

The formal similarities between polypeptides and nucleic acids are striking, and in many respects the two may be discussed interchangeably. This concept was formulated by FZ, and is here re-expressed in the theoretical section. Incorporation of ring-weighting factors in the random flight approximation²⁶ is there carried out in the manner suggested earlier. Results of the calculations are reported and discussed. Arguments are presented to support the use of the Jacobson-Stockmayer theory, and finally, the information content of denaturation profiles is discussed.

Theoretical

Description of a random copolymer, the residues of which may exist in either of two states, requires specification of two numbers (in the Ising model) for each residue $i$ ($1 \le i \le N$) in a chain of $N$ residues. Let the first of these, $n_i$, refer to the state such that

$$n_i = \begin{cases} +1 & \text{(helix)} \\ -1 & \text{(coil)} \end{cases}$$

where, for DNA, "helix" signifies a bonded base pair (residue) and "coil" an unbonded base pair. Further specification of the chain requires description of the chemical nature of each residue; the set of numbers $\{b\}$, with members $b_i$ for each residue $i$, is sufficient for this purpose.

It is appropriate to consider an ensemble with chains distributed with respect to $\{b\}$, and characterized by the probability $P\{b\}$. Since the free energy $E\{n|b\}$ (in units of $kT$) of a particular chain of sequence $\{b\}$ depends upon $\{n\}$, the probability $P\{n,b\}$ of occurrence of chains specified by the sets $\{n\}$ and $\{b\}$ is

$$P\{n,b\} = \exp[-E\{n|b\}]\,P\{b\}/Q\{b\} \tag{1}$$

where

$$Q\{b\} = \sum_{\{n\}} \exp[-E\{n|b\}]$$

is the partition function for chains of given $\{b\}$. The ratio $P\{b\}/Q\{b\}$ may be interpreted as an effective probability $F\{b\}$, inasmuch as this ratio is a positive, normalizable function of $\{b\}$.

As stated above, the free energy of a molecule depends on its specific configuration. Following FZ, let

[...]

where $S\{n\}$ is the ring entropy and $U$ is a stacking free energy. Since a factor $-2U$ enters at each junction between helix and coil stretches one may identify $\exp\{-4U\}$ with the cooperativity parameter $\sigma$ of established usage.

Assuming now that $n_1 = 1$ and $n_N = -1$, we may substitute a set of integers $\{x\}$ for the set $\{n\}$, with $x_1 = \Delta x_1$ the number of residues ($n = +1$) in the first helical stretch, $x_2 - x_1 = \Delta x_2$ the number of residues in the succeeding stretch of coil ($n = -1$), and so on. Thus, $x_k$ is the index of the last residue in a stretch of helix ($k$ odd) or coil ($k$ even) and $\Delta x_k$ is the number of residues in that stretch. Furthermore, let there be $L + 1$ regions of helix and coil (i.e., there are $L$ junctions), with $x_0 = 0$ and $x_{L+1} = N$.
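The change of variables from residue states to stretch boundaries can be illustrated with a short Python sketch. It is only an illustration: the function name `stretches` and the list representation of the state sequence are assumptions made here, not part of the original treatment.

```python
from itertools import groupby


def stretches(n):
    """Map a state sequence {n} (+1 helix, -1 coil) onto stretch data.

    Returns (x, dx): dx[k-1] is the number of residues in stretch k and
    x[k-1] is the 1-based index of the last residue in stretch k.
    Hypothetical helper; assumes n[0] = +1 and n[-1] = -1 as in the text.
    """
    if n[0] != 1 or n[-1] != -1:
        raise ValueError("sequence must start with helix (+1) and end with coil (-1)")
    # Each run of equal states is one stretch of helix or coil.
    dx = [len(list(group)) for _, group in groupby(n)]
    x = []
    total = 0
    for length in dx:
        total += length
        x.append(total)
    return x, dx


n = [1, 1, 1, -1, -1, 1, 1, 1, 1, -1]
x, dx = stretches(n)
junctions = len(dx) - 1  # L junctions separate L + 1 stretches
print(x, dx, junctions)
```

For this ten-residue example the last entry of `x` equals the chain length, as required by the boundary condition on the final stretch.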

Since we are not concerned with long range excluded volume interactions, it is legitimate to assume that the entropy of the chain is additive over stretches of helix and coil. Thus,

[...]

$$h_k(y) = \begin{cases} 1 & \text{if } k \text{ odd (helix) or terminal } (k = L + 1) \text{ coil} \\ h_c(y) & \text{if } k \text{ even} \end{cases} \tag{5}$$

By assigning the helical regions zero entropy in eq. (5) we have taken the standard state to be that of the random coil with two free ends. The true entropy of the helix relative to this standard state is lumped into $U$ and $\{b\}$ and hence does not appear explicitly. Specification of the function $h_c(y)$ is left until a later stage.

The Boltzmann factor for our single chain of given $\{b\}$ and $\{x\}$ is, apart from the unimportant constant $\exp\{C\}$,

$$\exp[-E\{n|b\}] = e^{-2UL} \prod_{k=1}^{L+1} h_k(\Delta x_k) \exp\{-(-1)^k \beta_k\} \tag{6}$$

with

$$\beta_k = \sum_{i=x_{k-1}+1}^{x_k} b_i \tag{7}$$

The partition function for chains distributed with respect to $\{n\}$ and $\{b\}$ can now be written in terms of a normalized $F\{b\}$ as

$$Z = \sum_{\{n\},\{b\}} \exp[-E\{n|b\}]\,F\{b\} \tag{8}$$


Eq. (8) is equivalent to an average of $\ln Q\{b\}$ over all $\{b\}$ weighted by the appropriate a priori probability $P\{b\}$ (proof given in the Appendix). After the sum over $\{b\}$ is carried out, for fixed

[...]

where

[...]

To make further progress at this point, we require that correlations among base pairs be short ranged. This condition could be realized in practice by chemical synthesis or by use of a terminal transferase enzyme with short memory. Whether or not native DNA conforms to this approximation is an important point which will be discussed. Considering $U$, and hence $\Delta x_k$, to be large, FZ introduced the approximation

$$F\{\beta\} = \prod_{k=1}^{L+1} F_1(\beta_k) \tag{12}$$

which allows treatment of correlations between residues $b$ within a stretch of helix or coil, but ignores correlations between different $\beta_k$.

Evaluation of Partition Function

The distribution of $\beta_k$ is completely specified by the cumulants of that distribution, which may, in turn, be evaluated by introducing the factor

[...]

into the partition function and by differentiating with respect to $\phi$. Thus

[...]

where

$$\lambda_k(\phi) = -(-1)^k + \phi \tag{15}$$

The cumulants of $P(\beta)$, designated $R_k$, are thus obtained from

$$R_k = (1/N)\,[\partial^k \ln Z(\phi,N)/\partial\phi^k]_{\phi=0} \tag{16}$$

(19)

(20)

(21)

(22)

(23)

where the $B_n$ are Bernoulli numbers and

(24)

(25)

The functions $D_k$ are similarly defined as the $k$th derivatives of $D(\phi)$. Substitution of eq. (23) into eq. (20) yields

$$Z(\phi,N) = \exp\{N S(\phi)\}\,Z_0(\phi,N) \tag{26}$$

where

[...] (27)

The partition function may now be evaluated by introducing the factor $\exp(-tN)$ into eq. (27) and by summing over all $N$ as for a grand ensemble,

[...]

thus giving

$$z(\phi,t) = \sum_{n=1}^{\infty} \sigma^n\,[w_h(t - D)\,w_c(t + D)]^n \tag{29}$$

where

[...]

The weighting function for helical stretches is unity, by virtue of eq. (5). Following the method of Lifson and Zimm, $Z_0(\phi,N)$ will be given by

$$Z_0(\phi,N) = \exp\{GN\} \tag{32}$$

for large $N$, where $G(\phi)$ locates the positive pole of $z(\phi,t)$ in eq. (29); $G$ satisfies the condition

$$1 = \sigma\,w_h(G - D)\,w_c(G + D) \tag{33}$$

Since $Z(\phi,N)$ in eq. (26) also diverges for large $N$ as $\exp\{RN\}$, we have

$$R_k = S_k + G_k, \qquad k \ge 1 \tag{34}$$

by combination of eqs. (16), (26), and (32). The mean helix fraction $\theta$ may now be evaluated as

[...]

This operation may be conveniently accomplished by substituting $D'$ for $D$ in every odd term in eq. (27), and by subsequent differentiation according to

$$\theta = N^{-1}(\partial \ln Z/\partial D')_{D'=D} = (\partial G/\partial D')_{D'=D}$$

From eqs. (30) and (33) we thus obtain

$$e^{G-D'} - 1 = \sigma\,w_c(G + D) \tag{35}$$

and

$$\theta = \{1 - \sigma \exp(D_0 - G_0)\,\partial w_c(G_0 + D_0)/\partial G_0\}^{-1} \tag{36}$$

where $D_0 = D(0)$, with a corresponding definition for $G_0$.


Ring-Weighting Functions

The treatment according to FZ that we have reiterated has avoided specialization to either polypeptides or DNA. Further progress necessitates specification of the functions $h_c(x)$ appearing in eq. (31). For polypeptides, the case already treated by FZ, $h_c(x) = 1$ for all $x$, by virtue of choice of standard states. The same weighting function is applicable to DNA if melting occurs only from the ends. In that case the configurational entropy of the two unbonded strands, each of $x$ bases, affixed at a common junction will be indistinguishable from that for a single-stranded polymer with $2x$ bases, provided of course that $x$ is large enough so that spatial correlations at the junction are unimportant.

Melting from interior portions of the DNA molecule will be far more extensive than melting from the ends however, as has been shown by Crothers.²⁴ Moreover, use of the grand ensemble method presupposes very long chains, for which we are justified in allowing internal melting only.

The configurational entropy of random coil rings relative to freely jointed chains of the same number of units was calculated some years ago by Jacobson and Stockmayer.²⁶ They showed that the entropy loss on forming a bond between the two terminal groups of a freely jointed chain is proportional to $n^{-3/2}$, where $n$ is the number of statistical units in the chain.

For DNA, we require the probability of ring closure for a chain of $2x + 2$ bonds, relative to a standard state, e.g., a ring of two bonds as represented by successive pairs of bonded bases. On this basis, and in conformity with the calculations of Zimm⁶ and others, we take

$$h_c(x) = (x + 1)^{-3/2} \tag{37}$$

other approximations to the ring-weighting factors notwithstanding. Substitution of eq. (37) into eq. (33) followed by elementary rearrangements yields

$$e^{G-D} = \sigma e^{G+D}\,\Phi(G + D, 3/2) + 1 - \sigma \tag{38}$$

where

$$\Phi(y,s) = \sum_{x=1}^{\infty} x^{-s} \exp\{-xy\} \tag{39}$$

Truesdell²⁷ has summed functions of the type given by eq. (39) for small $y$, and his result

$$\Phi(y,s) = \Gamma(1 - s)\,y^{s-1} + \sum_{n=0}^{\infty} \zeta(s - n)(-y)^n/n! \tag{40}$$

will be used throughout for numerical evaluation of eq. (38) and its derivatives. In no instance is it necessary to carry the sum in eq. (40) beyond three terms. In eq. (40) $\Gamma(1 - s)$ is the Gamma function, and $\zeta(s - n)$ is the Riemann zeta function.²⁸
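The claim that three terms of the series suffice can be checked numerically. The following Python sketch compares a brute-force evaluation of eq. (39) with the three-term form of eq. (40) for $s = 3/2$. The function names and the hard-coded zeta values (standard tabulated constants) are choices made for this illustration, not taken from the original computation.

```python
import math

# Tabulated Riemann zeta values needed when s = 3/2.
ZETA = {1.5: 2.6123753486854883, 0.5: -1.4603545088095868, -0.5: -0.2078862249773546}


def phi_direct(y, s, tol=1e-16):
    """Term-by-term summation of eq. (39); practical only if y is not too small."""
    total = 0.0
    x = 1
    while True:
        term = x ** (-s) * math.exp(-x * y)
        total += term
        if term < tol:
            return total
        x += 1


def phi_truesdell(y, s=1.5, terms=3):
    """Eq. (40) with the sum truncated after `terms` terms (s = 3/2 only here)."""
    total = math.gamma(1.0 - s) * y ** (s - 1.0)
    for n in range(terms):
        total += ZETA[s - n] * (-y) ** n / math.factorial(n)
    return total


y = 0.1
direct = phi_direct(y, 1.5)
series = phi_truesdell(y)
print(direct, series, abs(direct - series))
```

At $y = 0.1$ the two evaluations differ only in the sixth decimal place, and the agreement improves as $y$ decreases, which is the regime of interest for small $\sigma$.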

Method of Calculation

For completion of the model we require the correct cumulants of $P(\beta)$. For random placement of base pairs the individual $b_i$ are uncoupled, so that

$$\langle \exp\{\phi\beta\}\rangle = \prod_{i=1}^{N} \langle \exp\{\phi b_i\}\rangle = (X_A \exp\{\phi b_A\} + X_B \exp\{\phi b_B\})^N \tag{41}$$

where $X_A = 1 - X_B$ is the mole fraction of AT base pairs. The cumulants $R_k$ are obtained from

$$R(\phi) = N^{-1} \ln \langle \exp\{\phi\beta\}\rangle \tag{42}$$

by differentiation in accordance with eq. (25) with $R(\phi)$ replacing $S(\phi)$. The $b$'s have been previously defined by FZ as

[...] (43)

The method of calculation used by FZ is used here also, i.e., values of $\theta$ and $\sigma$ are input, and the equations are solved for the temperature. From eqs. (36) and (38) we obtain, after elimination of $\exp\{G_0 - D_0\}$,

[...] (44)

which may be solved for $G_0 + D_0$ by iteration. The individual values of $G_0$ and $D_0$ are then obtained from the solution of eq. (38) with $G$ and $D$ evaluated at $\phi = 0$. We might mention here that analytical approximations to eqs. (38) and (44) are certainly justified for small $G_0 + D_0$ and $G_0 - D_0$, which obtain for small $\sigma$. However, subsequent differentiations of eq. (38) are conveniently performed with the exact expression.
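The first step of this procedure (solving for $G_0 + D_0$ at given $\theta$ and $\sigma$) can be sketched in Python. The sketch does not reproduce the printed eq. (44); instead it substitutes eq. (38) into eq. (36) numerically and finds the root by bisection. It assumes that $w_c(y) = e^{y}\Phi(y,3/2) - 1$, which follows from summing the weights of eq. (37), and it uses the small-$y$ series of eq. (40). All function names and the zeta constants are choices made for this illustration.

```python
import math

# Tabulated Riemann zeta values zeta(3/2), zeta(1/2), zeta(-1/2), zeta(-3/2).
ZETA = {1.5: 2.6123753486854883, 0.5: -1.4603545088095868,
        -0.5: -0.2078862249773546, -1.5: -0.0254852018898330}


def phi(y, s, terms=3):
    """Small-y series of eq. (40), usable here for s = 3/2 and s = 1/2."""
    total = math.gamma(1.0 - s) * y ** (s - 1.0)
    for n in range(terms):
        total += ZETA[s - n] * (-y) ** n / math.factorial(n)
    return total


def helix_fraction(y, sigma):
    """theta from eq. (36), with exp{G0 - D0} taken from eq. (38); y = G0 + D0."""
    exp_g_minus_d = sigma * math.exp(y) * phi(y, 1.5) + 1.0 - sigma  # eq. (38)
    # Assumed form: w_c(y) = e^y Phi(y, 3/2) - 1, so that
    # dw_c/dy = e^y [Phi(y, 3/2) - Phi(y, 1/2)], a negative quantity.
    dwc = math.exp(y) * (phi(y, 1.5) - phi(y, 0.5))
    return 1.0 / (1.0 - sigma * dwc / exp_g_minus_d)


def solve_y(theta, sigma, lo=1e-12, hi=1.0):
    """Bisection for y = G0 + D0; relies on theta increasing with y."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if helix_fraction(mid, sigma) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


sigma, theta = 1e-3, 0.5
y = solve_y(theta, sigma)
g_minus_d = math.log(sigma * math.exp(y) * phi(y, 1.5) + 1.0 - sigma)
print(y, g_minus_d)
```

For $\sigma = 10^{-3}$ and $\theta = 0.5$ both $G_0 + D_0$ and $G_0 - D_0$ come out very small, in line with the remark that analytical approximations are justified for small $\sigma$. Bisection is used only for transparency; it is not claimed to be the iteration scheme of the original program.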

In accordance with eqs. (24) and (34), we require the derivatives $D_k$ and $G_k$, which are obtained from eq. (38). The first derivative with respect to the implicit variable $\phi$ yields, after some rearrangement

[...] (45)

with

$$\Omega_k = \Phi(G_0 + D_0, 3/2 - k) + (\sigma^{-1} - 1) \exp\{-(G_0 + D_0)\}$$

$$\psi_k = 2\Phi(G_0 + D_0, 5/2 - k) - \Phi(G_0 + D_0, 3/2 - k) + (\sigma^{-1} - 1) \exp\{-(G_0 + D_0)\} \tag{48}$$

We denote as the $M$th cumulant approximation that choice of $c_k$ which yields the correct $R_k$ for $1 \le k \le M$. Higher $c_k$, $k > M$, are suppressed. An analytical result may be obtained for the second cumulant approximation as follows: $D_k$ and $R_{k+1}$ vanish for $k > 1$ as may be verified from eqs. (21) and (23). According to eqs. (24) and (34) we have

$$R_1 = -A(1 - T_m/T) = S_1 + G_1 = D_0 + \rho D_1 \tag{49}$$

[...] (50)

is the mean melting temperature. The latter equality in eq. (49) arises from eq. (24) where, in the second approximation, $S_1 = D_0$. It will be noted that the last equality in eq. (49) is correct only in the second cumulant approximation. We also have, from eqs. (45) and (46),

[...] (51)

For $R_2$ we obtain by similar steps

$$R_2 = D_1 + D_1^2\,[\,\ldots\,] \tag{52}$$

in which the factor multiplying $D_1^2$ is a function of $\theta$ and $\sigma$ only. Equations (49) and (52) are readily solved for both $D_1$ and $T$, given of course the values of $\theta$ and $\sigma$ from which $D_0$ is obtained. The two equations may be solved either exactly or, since $R_2$ [...], the estimate $R_2 = R_2(T_m)$ may be used. Results are insensitive to the choice provided $\theta$ is not too close to unity. Numerical computations were performed in sixteen digit double precision.

Higher cumulant approximations were solved by stepping $T$ in small increments, the smallest being [...] K. The $G_k$ in eqs. (34) were eliminated by use of eqs. (45)-(48), and pairs of eqs. (34) were further used to eliminate some of the $D_k$. The sign of successive increments of $T$ was determined by comparison of values of $D_k$ obtained from at least two different sets of $R_k$.

In particular, for the third cumulant approximation, a value for $D_1$ was obtained from an initial $T$ (from the second approximation), and two different values of $D_2$ were computed from $R_1(T)$ and $D_1$, and from $R_3(T)$ and $D_1$. The temperature was then adjusted by an increment whose magnitude depended on the number of iterations and whose sign was determined by the sign of the difference between the two estimates of $D_2$. Iterations were stopped and the current $T$ printed when the two values of $D_2$ differed by less than ca. $10^{-5}$ (various tolerances in the range $10^{-4}$ to $10^{-6}$ were used, with no noticeable effect on results).
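The temperature-stepping scheme described for the higher cumulant approximations can be illustrated by a generic Python sketch. The text states only that the size of the increment depended on the iteration count and that its sign followed the sign of the difference between two estimates. The step-halving rule used below, the function names, and the toy estimate functions are therefore assumptions made for illustration, not the original program.

```python
def solve_temperature(estimate_a, estimate_b, t_start, step=1.0, tol=1e-5, max_iter=500):
    """Adjust T until two estimates of the same quantity agree within tol.

    Assumptions of this sketch: the step is halved whenever the sign of the
    difference flips, and estimate_a(T) - estimate_b(T) increases with T.
    """
    t = t_start
    previous_sign = 0
    for _ in range(max_iter):
        diff = estimate_a(t) - estimate_b(t)
        if abs(diff) < tol:
            return t
        sign = 1 if diff > 0 else -1
        if previous_sign and sign != previous_sign:
            step *= 0.5  # overshoot detected: refine the increment
        t -= sign * step
        previous_sign = sign
    raise RuntimeError("no convergence within max_iter iterations")


# Toy stand-ins for the two estimates of D_2 (purely illustrative functions).
t_root = solve_temperature(lambda t: 0.01 * (t - 350.3), lambda t: 0.0, t_start=352.0)
print(t_root)
```

In an actual calculation the two callables would return the values of $D_2$ obtained from the two different pairs of cumulant equations at the trial temperature.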


Numerical Results

Results of the second and third cumulant approximations are depicted in Figure 1 for random chains of $X_A = 0.5$ and for a variety of values of $\sigma$. It will be noted that the transitions are extremely sharp, with 90% of the melting occurring over a range of ca. 0.2°C for $\sigma =$ [...]. Three things are to be observed in the calculated curves: (1) the fraction of helix goes to zero with a discontinuous derivative; (2) curves for the third cumulant approximation lie some 2°C below those for the second; and (3) curves for $\sigma =$ [...] differ but little from one another, and are substantially narrower than the curve for $\sigma = 10^{-2}$. Results (1) and (3) have been observed previously by Zimm⁶ for homogeneous chains without sliding degeneracy,¹² and our calculations (not shown) for such chains are in substantial agreement with his earlier results.

Observation (1) is readily explained by inspection of the equations. From eqs. (44) and (39) it is seen that as $\theta \to 0$, $G_0 + D_0 \to 0$, and by eq. (35), $G_0 - D_0 \to \ln\{1 + \sigma[\zeta(3/2) - 1]\}$, a finite number. In the limit $\theta \to 0$ then, $D_0$ is finite and forces $R_1$ to remain finite. [To conform to the model, solutions of eqs. (49) and (52) with negative $D_1$ must be discarded.] In contrast, for the polypeptide model $D_0 \to -\infty$ as $\theta \to 0$, which forces $T$ to become large. The result is a sigmoidal shape for melting curves of polypeptides as opposed to abrupt melting for DNA.
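The size of this finite limit is easily evaluated. The short Python sketch below computes $\ln\{1 + \sigma[\zeta(3/2) - 1]\}$ for a few values of $\sigma$. The function name and the particular values of $\sigma$ are chosen here for illustration, and $\zeta(3/2)$ is entered as a tabulated constant.

```python
import math

ZETA_3_2 = 2.6123753486854883  # Riemann zeta(3/2), tabulated value


def limiting_g_minus_d(sigma):
    """Limit of G0 - D0 as theta -> 0, using h_c(x) = (x + 1)^(-3/2)."""
    return math.log(1.0 + sigma * (ZETA_3_2 - 1.0))


for sigma in (1e-2, 1e-3, 1e-4):
    print(sigma, limiting_g_minus_d(sigma))
```

For small $\sigma$ the limit is close to $1.61\,\sigma$, so $D_0$ stays bounded as the helix fraction vanishes, in contrast to the polypeptide case.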

Observation (2) is very different from what one expects, especially by comparison with the calculations of FZ for polypeptides. We may, however, trace the effect through the equations for the third cumulant approximation, which are:

$$\begin{aligned} R_1 &= D_0 + (1/3)\,D_2 + \rho D_1 \\ R_2 &= D_1 + \rho D_2 + C_1 D_1^2 \\ R_3 &= D_2[1 + 3C_1 D_1] + C_2 D_1^3 \end{aligned} \tag{53}$$

where $C_1$ and $C_2$ are functions of the $\Omega_k$ and $\psi_k$, and hence of $\theta$ and $\sigma$ only. For $X_A = 0.5$, $R_3 = 0$, so that the last of eqs. (53) may be simplified. It is easy to show that $C_1 > 0$ and $C_2 < 0$; hence that $D_2 > 0$. Confining attention now to $\rho = 0$, i.e., $\theta = 1/2$, we see that $R_1(M = 3) > R_1(M = 2)$. This implies $T(M = 3) < T(M = 2)$. The same reasoning holds for $\rho \ne 0$, but is complicated considerably.

The last observation, (3), though not of the same nature as (1) and (2), is important, in that the most reliable estimates¹² of $\sigma$ are in the range $10^{-5} < \sigma <$ [...], with perhaps [...] the best choice. Since calculated curves for $\sigma =$ [...] and $\sigma =$ [...] are in close agreement we can be reasonably certain that our conclusions will not be vitiated by uncertainties in the value of $\sigma$.

Calculated transitions for various compositions $X_A$ in the range $0 < X_A < 1$ gave essentially the same results as the curves for $X_A = 0.5$. For $X_A = 0$ or 1, the second cumulant approximation is exact, and as a result the curves for homopolymers occur only slightly above the preassigned melting temperature (342.5°K and 383.5°K, respectively). Inclusion of sliding degeneracy according to the method of Crothers and Zimm¹² led to results in good agreement with the previous calculations.¹²

Fig. 1. Calculated melting curves for random copolymer DNA with 50% G + C for various values of the cooperativity parameter $\sigma$ as indicated. Curves at the higher temperatures are for the second cumulant approximation, whereas those at lower temperatures obtain for the third cumulant approximation. Contrary to appearances, each family of curves does not intersect at a point. Abscissa: $T - T_m$.

Higher approximations than $M = 3$ were computed for $X_A = 0.5$, $\theta = 0.5$, and $\sigma = 0.05$. The results of these calculations will be found in Table I. Calculations for $M > 3$ were carried out by using analytical approximations for the Truesdell functions: errors arising from this source are not expected to exceed ±0.01°. From the results in Table I we conclude that the third cumulant approximation is sufficiently accurate. The ease with which results can be obtained for $M = 3$ in comparison to the slow convergence of the program for higher approximations made further calculations for $M > 3$ unattractive. Results for $M = 3$ and [...] $< \sigma <$ [...] are not expected to differ from those for $M = \infty$ by more than ca. 0.1°K; no gross detail is added for $M > 2$ in fact.

TABLE I
Results of Calculations for $M$th Cumulant Approximation
($X_A = 0.5$, $\theta = 0.5$, $\sigma = 0.05$)

  M     T - T_m, °K
  2     2.069
  3     1.263
  4     1.170
  5     1.177
  6     1.177
 10     1.176


Comparison with Previous Calculations

As already mentioned, results obtained here are in substantial agreement with the previous calculations of Zimm⁶ and of Crothers and Kallenbach.¹⁶ More recent calculations of Crothers²⁴ are not in accord with our findings, however. In seeking a possible explanation for the discrepancy we were led to calculations that tried to duplicate, insofar as possible, the methods used by Crothers.

Crothers performed Monte Carlo calculations as follows: Infinite molecular chains of repeating sequence were generated as random binary numbers.* The melting profile of each chain was calculated by using the matrix method of Zimm⁶,¹⁶ with the approximation that blocks of 30 or 50 base pairs could be treated as a single unit capable of existing in all-helix or all-coil forms. Results for 50 such chains were averaged and compared with the transition curve for T2 DNA. Good agreement was established, as is evidenced in Figure 2.

Proceeding from the premise that nothing essential was missing from our calculations, we attempted to understand these differences in results. We considered this especially important, because of the seemingly excellent agreement between the calculations of FZ²³ and of Fink and Crothers²² for polypeptide chains. Perhaps the FZ method, adequate for polypeptides, is qualitatively erroneous in its treatment of ring entropy, but we see no a priori basis for this suspicion. There should be no essential difference between our calculations and Crothers' except insofar as approximations in either method affect the results.

To investigate the effects of "coarse graining," i.e., the lumping together of 50 base pairs to make a single unit, we modified the Jacobson-Stockmayer factors in a manner identical to that used by Crothers.²⁴ Sums over the modified ring-weighting functions were computed in three parts: as the Truesdell function, as a sum to $L - 1$ terms of the difference between the required sum and the Truesdell function, and as an integral of the remaining terms from $L$ to $\infty$. It was found that for $L = 20$ values of the sum plus integral were convergent to within at least 0.01%. These calculations, performed only in the second cumulant approximation for $X_A = 0.5$, yielded transition breadths (measured between $\theta = 0.3$ and 0.7) approximately 0.5°K wider than the exact results for $\sigma = 10^{-3}$. Midpoints of transitions occurred ca. 0.6°K higher in temperature for lumped bases than for exact results. For the same value of $\sigma$ Crothers lumped together 30 bases, so that breadths calculated by his method would be less broad than the 0.3°K observed here. Clearly the magnitude of the effect is insufficient to remove the discrepancy.

Errors in the Monte Carlo method might also be expected to arise from periodicity of structure, but are more difficult to assess. Whether repeating structure broadens or narrows the transition would seem to depend on the specific sequence. In the absence of specific conjectures, one may assert that for small values of $\theta$, where the average length of coil stretches approaches or exceeds the repeating unit size, the calculations will be in error. Presumably such an effect would occur at such small values of $\theta$ as to be of negligible importance.

* Sequences of 600 or 1000 base pairs distributed randomly were generated. High molecular weight chains were represented as repeating structures made up of these "monomer" units.

Fig. 2. Comparison of Crothers' Monte Carlo calculations with expected transition curves for mixtures of molecules with compositions distributed about the mean: (- - -) results of Crothers for (1) $\sigma = 10^{-4}$, $N = 1000$ and (2) $\sigma = 10^{-3}$, $N = 600$; curves calculated from eq. (55) of the text with $U$ effectively infinite, (1) for $N = 1000$ and (2) for $N = 600$. Abscissa: $T - T_m$.

A third, and more subtle, source of discrepancy is to be found in fluctuations of composition to be expected for finite sequences generated on the computer. Presuming random number generators to be truly random, the sequence of a particular chain conforms to the statistics of Bernoulli trials, and for large collections of chains each composed of $N$ base pairs the fraction $P_N(X_A)\,dX_A$ of chains with composition between $X_A$ and $X_A + dX_A$ will be given by

$$P_N(X_A)\,dX_A = (2\pi \bar{x}_A \bar{x}_B/N)^{-1/2} \exp\{-(X_A - \bar{x}_A)^2/2(\bar{x}_A \bar{x}_B/N)\}\,dX_A \tag{54}$$

where $\bar{x}_A$ and $\bar{x}_B = 1 - \bar{x}_A$ are the a priori probabilities of A's and B's. For $\bar{x}_A = 0.5$ the standard deviation is 0.016 for chains of 1000 base pairs and 0.020 for chains with $N = 600$. If the melting of each chain of composition $X_A$ is given by $\theta(T,X_A)$, the observed behavior of the ensemble, $\bar{\theta}(T,\bar{x}_A)$, will be given by

$$\bar{\theta}(T,\bar{x}_A) = \int P_N(X_A)\,\theta(T,X_A)\,dX_A \tag{55}$$
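The quoted standard deviations follow directly from the binomial variance $\bar{x}_A \bar{x}_B/N$. A minimal Python check is given below; the function names are introduced here for illustration only.

```python
import math


def composition_std(x_a, n):
    """Standard deviation of the AT fraction for a chain of n random base pairs."""
    return math.sqrt(x_a * (1.0 - x_a) / n)


def p_n(x, x_a, n):
    """Gaussian density of eq. (54) evaluated at composition X_A = x."""
    var = x_a * (1.0 - x_a) / n
    return math.exp(-(x - x_a) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


for n in (1000, 600):
    print(n, round(composition_std(0.5, n), 3))
```

Rounded to three decimals the two values reproduce the figures 0.016 and 0.020 given in the text.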

If we assume that $\theta(T,X_A)$ is a step function (which the calculated $\theta$ nearly is) with discontinuity at $T_m = X_A T_A + X_B T_B$, the result of the integration of eq. (55) is an error function. Figure 2 compares predictions from eq. (55) with calculations of Crothers²⁴ using parameters corresponding to his, i.e., for $\sigma = 10^{-4}$, $N = 1000$ and for $\sigma = 10^{-3}$, $N = 600$.
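The error-function result can be made concrete with a small Python sketch. It assumes a step-function melting curve for each chain, the Gaussian composition distribution of eq. (54), and the homopolymer melting temperatures 342.5°K and 383.5°K quoted earlier. Assigning the lower temperature to AT and the higher to GC is an assumption of this sketch, as are the function names.

```python
import math

T_AT, T_GC = 342.5, 383.5  # homopolymer melting temperatures (K); AT/GC assignment assumed


def ensemble_theta(t, x_a=0.5, n=1000):
    """Eq. (55) for a step-function theta(T, X_A) and the Gaussian of eq. (54)."""
    std = math.sqrt(x_a * (1.0 - x_a) / n)
    # A chain is still helical at T if T_m = X_A*T_AT + (1 - X_A)*T_GC exceeds T,
    # i.e. if its AT fraction X_A lies below x_star.
    x_star = (T_GC - t) / (T_GC - T_AT)
    return 0.5 * (1.0 + math.erf((x_star - x_a) / (std * math.sqrt(2.0))))


def breadth(n, x_a=0.5, lo=0.3, hi=0.7, step=1e-3):
    """Temperature interval between theta = hi and theta = lo, found by a scan."""
    t_mid = x_a * T_AT + (1.0 - x_a) * T_GC
    ts = [t_mid - 5.0 + i * step for i in range(int(10.0 / step) + 1)]
    t_hi = max(t for t in ts if ensemble_theta(t, x_a, n) >= hi)
    t_lo = min(t for t in ts if ensemble_theta(t, x_a, n) <= lo)
    return t_lo - t_hi


print(ensemble_theta(363.0), breadth(1000), breadth(600))
```

Under these assumptions composition fluctuations alone give a breadth of about 0.7°K for $N = 1000$, and a larger breadth for $N = 600$.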


An alternative choice for $\theta(T,X_A)$ from the results of our calculations was also used in conjunction with eq. (55) for the numerical calculation of $\bar{\theta}(T,\bar{x}_A)$. It was found that for $0.45 \le X_A \le 0.55$ the present theory carried to the third cumulant approximation yields curves for $\sigma =$ [...] that may be adequately represented by

$$\theta(T,X_A) = (2.015 + T - T_m)/(1.993 + T - T_m) \tag{56}$$

Numerical integration of eq. (55) with use of eqs. (56) and (54) (for $N = 1000$) yielded transitions broader by only 0.02°K (as measured between $\bar{\theta} = 0.7$ and 0.3) than those obtained from eqs. (54) and (55) with $\theta(T,X_A)$ a step function. Thus, within small limits the breadth of individual $\theta(T,X_A)$ curves has negligible effect on the melting curve for the ensemble. Only for values of $\bar{\theta}(T,\bar{x}_A)$ very near unity will one see the effects of the more gradual decrease of $\theta(T,X_A)$ from unity that is characteristic of the profiles in Figure 1.
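The sharpness implied by the fitted form, eq. (56), can be checked by inverting it. The Python sketch below assumes that eq. (56) applies from low temperatures up to the point where the helix fraction reaches zero, and that the fraction is zero above that point. The function names are introduced here for illustration.

```python
def theta_56(dt):
    """Eq. (56) with dt = T - T_m; zero once the numerator vanishes (assumed)."""
    if dt >= -2.015:
        return 0.0
    return (2.015 + dt) / (1.993 + dt)


def dt_at(theta):
    """Invert eq. (56): the value of T - T_m at helix fraction theta (< 1)."""
    return (1.993 * theta - 2.015) / (1.0 - theta)


range_90 = dt_at(0.0) - dt_at(0.9)       # interval over which theta falls from 0.9 to 0
breadth_30_70 = dt_at(0.3) - dt_at(0.7)  # interval between theta = 0.7 and 0.3
print(round(range_90, 3), round(breadth_30_70, 3))
```

The first interval comes out close to 0.2°K, consistent with the statement that ninety percent of the melting occurs within about that range. The second is a few hundredths of a degree, far smaller than the breadth produced by composition fluctuations.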

Discussion

The occurrence of internal random coil rings in DNA molecules at thermal denaturation temperatures obviously has important consequences. In effect, internal loops impose an infinite ranged cooperativity in addition to the finite ranged cooperativity induced by the base stacking free energy, represented by the parameter $\sigma$. It is well to re-examine the approximations inherent in dealing with these loops in view of their overwhelming effect on calculated transition behavior.

Treatment of single stranded DNA as a freely jointed chain is open to criticism. According to sedimentation and viscosity results,²⁹ single-stranded DNA is highly flexible. One may estimate parameters for the equivalent freely jointed chain based on the mean-square radius of gyration obtained from light scattering of φX174 DNA.³⁰ For these calculations we have used a phosphorus-phosphorus distance of 7.1 Å (corresponding to the P-P distance³¹ in the double helix) and an average nucleotide molecular weight of 298 g/mole. The equivalent segment is made up of approximately four nucleotides. Thus, the ring-weighting function $(x + 1)^{-3/2}$ is clearly not exact; it should be modified by a proportionality constant. Such a constant, however, has already been absorbed into $\sigma$ as it is evaluated from homopolymer melting,¹² so that no error will be made on this account.

A more serious objection to the freely jointed chain approximation is that small rings will certainly not be represented accurately by this artifice. Hence, the first few terms of the sum in eq. (31) will be incorrect. The entropy of large rings may be affected by excluded volume interactions, but we do not anticipate radical departures from $(x + 1)^{-3/2}$, especially in light of the above-mentioned viscosity results. As was pointed out in the preceding section, the abruptness of the calculated transitions is the result of the sum in eq. (31) remaining small for small values of the argument. Breakdown of the freely jointed chain approximation, which alters the first few terms of the sum, is not expected to alter the asymptotic form of eq. (31), and therefore will not yield more than minor corrections to the calculated curves.

Excluded volume effects (or, perhaps more exactly, electrostatic repulsions) become important for low salt concentrations. The exponent of 3/2 would necessarily become larger in such circumstances, and the melting profiles would become even more abrupt, as already emphasized by Fisher.25 Regardless, we can be confident that neglect of charges will not be a significant omission for high salt concentrations.

The factorization approximation of eq. (12) is more questionable in the case of DNA than for polypeptides. For small values of the argument, the sum in eq. (31) attains half its value within six terms. That is, half the rings are of six or fewer bases. Correlations among bases might be expected to extend beyond this range, thereby introducing appreciable end effects. However, it has been shown by FZ that such end effects may be absorbed into the effective u, and only affect the results insofar as -ln u may differ from the true base stacking free energy. The greater u is, the broader the calculated curve, but all such curves approach $\theta = 0$ with a discontinuous derivative.

Conclusions

Having thus convinced ourselves that nothing essential is missing from the theory, we can only point to the known heterogeneity of sequence in DNA's as the agent responsible for spreading the transitions over several degrees.

Consider a molecule composed of independent stretches of nucleotides, each of which is composed of N base pairs and has uniform, random composition but whose neighbors have differing compositions. Then, provided the stretches are sufficiently long for adequate description by the methods of the grand ensemble, the melting of the molecule as a whole will be described by eq. (55) with $P_N(x_A)$ appropriate for the compositional distribution of stretches and with $\theta(T, x_A)$ given by theory. (Distributions in lengths of stretches could also be included in the model.) Since $\theta(T, x_A)$ is nearly a step function for reasonable values of u, $P_N(x_A)$ would then be obtained as the first derivative of the observed denaturation curve.
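A minimal sketch of this superposition follows. It treats each stretch as melting in a strict step at its own melting temperature and sums the steps with weights given by the compositional distribution. The linear dependence of melting temperature on composition, its end-point values, and the Gaussian distribution of stretch compositions are all illustrative assumptions, not quantities taken from the paper.

```python
import math


def melting_temperature(x_a, t_gc=380.0, t_at=340.0):
    """Hypothetical linear melting temperature (K) versus AT mole fraction x_a."""
    return t_gc - (t_gc - t_at) * x_a


def helix_fraction(temperature, compositions, probabilities):
    """Superpose step-function curves: a stretch is helical below its own T_m."""
    return sum(p for x_a, p in zip(compositions, probabilities)
               if temperature < melting_temperature(x_a))


# Assumed Gaussian distribution of stretch compositions (illustration only).
compositions = [0.3 + 0.001 * k for k in range(401)]
raw = [math.exp(-0.5 * ((x - 0.5) / 0.05) ** 2) for x in compositions]
norm = sum(raw)
probabilities = [r / norm for r in raw]

d_t = 0.05
temperatures = [340.0 + d_t * k for k in range(801)]
theta = [helix_fraction(t, compositions, probabilities) for t in temperatures]

# The negative slope of the melting curve maps the compositional
# distribution onto the temperature axis.
slope = [-(theta[k + 1] - theta[k]) / d_t for k in range(800)]
mean_tm = sum(0.5 * (temperatures[k] + temperatures[k + 1]) * slope[k] * d_t
              for k in range(800))
print(f"mean melting temperature recovered from the slope: {mean_tm:.2f} K")
```

In this idealization the width of the melting curve comes entirely from the spread of stretch compositions, which is why the slope of the curve returns the assumed distribution.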

A model of independent stretches is consistent with the observations of Thomas and co-workers32,34 on the distribution of fragments from sonicated DNA. We can only guess how closely intact DNA conforms to this model. The results of Skalka et al.33 confirm the presence of long-ranged compositional heterogeneity and their results are also suggestive of sharp distinctions between regions of differing composition, which could induce independent regional melting. However, it would be possible to refine our arguments so that the presence of abrupt compositional demarcations was not a necessary condition for regional melting.


In a recent paper Applequist35 has interpreted data on the poly(riboadenylic acid) transition in a manner that yields values of the stacking parameter u in the range 0.05-0.11; ring-weighting functions are found to be of the form $(x + 1)^{-a}$, with the effective value of $a$ in the range 1.3-1.5. The latter result can be interpreted as a confirmation of the Jacobson-Stockmayer factors for chains which remain in register. The alternative interpretation favored by Applequist is that there is some mismatching or slip degeneracy, and that the Jacobson-Stockmayer exponent of 3/2 is too small due to departure of the chain from random-flight statistics. In view of the considerable difference of structure between antiparallel double-stranded helices and the parallel chain helices of poly A,36 it is perhaps not surprising that they are described by different values of the stacking free energy.

Putting further speculation on numerical values of parameters aside, the fact that the transitions observed by Applequist are discontinuous at $\theta = 0$ is in accord with our observations. Should further analysis reveal the failure of the random-flight approximation for DNA, then our conclusions stand to be altered only in respect to exact numerical values. This point has been stressed above in the Discussion.

APPENDIX

Here we prove the equivalence of eq. (8) to the ensemble average of $\ln Q\{b\}$. Define

$$-E_\psi\{n|b\} = U \sum_{i=1}^{N} n_i n_{i+1} + \sum_{i=1}^{N} n_i (b_i + \psi) + T S\{n\} \tag{A-1}$$

Obviously $E_0\{n|b\} = E\{n|b\}$, in conformity with eq. (3). Then,

$$Q_\psi\{b\} = \sum_{\{n\}} \exp\left(-E_\psi\{n|b\}\right) \tag{A-2}$$

and the equality

$$N^{-1}\left[\partial \ln Q_\psi\{b\} / \partial\psi\right]_{\psi=0} = 2\theta - 1 \tag{A-3}$$

is easily established. We require the ensemble average of $\theta$, $\langle\theta\rangle$, which is obtained by

$$N^{-1}\left[\partial \langle \ln Q_\psi\{b\} \rangle / \partial\psi\right]_{\psi=0} = 2\langle\theta\rangle - 1 \tag{A-4}$$

But

$$\langle \ln Q_\psi\{b\} \rangle = \sum_{\{b\}} P\{b\} \ln Q_\psi\{b\} \tag{A-5}$$

where $P\{b\}$ is the a priori probability for the set of sequences $\{b\}$. Computation of the derivative required in eq. (A-4) yields a result identical to that obtained from similar operations with eqs. (1)-(8) of the text.
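The identity (A-3) can be checked by brute-force enumeration on a short chain. The sketch below makes several assumptions that are ours rather than the paper's: the site variables take the values $n_i = \pm 1$ with $+1$ denoting helix, energies are in units of kT, the chain is open, and the ring-entropy term is omitted (it does not depend on $\psi$ and so would not alter the structure of the identity). The coupling constant and the sequence `b` are arbitrary illustrative numbers.

```python
import itertools
import math


def minus_energy(n, b, coupling, psi):
    """-E_psi{n|b} in units of kT for an open chain (ring entropy omitted)."""
    stacking = coupling * sum(n[i] * n[i + 1] for i in range(len(n) - 1))
    field = sum(n_i * (b_i + psi) for n_i, b_i in zip(n, b))
    return stacking + field


def log_q(b, coupling, psi):
    """ln Q_psi{b} by explicit summation over all 2**N configurations."""
    return math.log(sum(math.exp(minus_energy(n, b, coupling, psi))
                        for n in itertools.product((-1, 1), repeat=len(b))))


def helix_fraction(b, coupling):
    """Boltzmann-averaged fraction of sites with n_i = +1 at psi = 0."""
    q = 0.0
    weighted = 0.0
    for n in itertools.product((-1, 1), repeat=len(b)):
        w = math.exp(minus_energy(n, b, coupling, 0.0))
        q += w
        weighted += w * sum(1 for n_i in n if n_i == 1) / len(b)
    return weighted / q


b = [0.4, -0.3, 0.4, 0.4, -0.3, -0.3, 0.4, -0.3]  # arbitrary two-letter sequence
coupling = 1.0
h = 1e-5  # step for the central finite difference in psi

lhs = (log_q(b, coupling, h) - log_q(b, coupling, -h)) / (2 * h * len(b))
rhs = 2 * helix_fraction(b, coupling) - 1
print(f"N^-1 dlnQ/dpsi = {lhs:.8f},  2*theta - 1 = {rhs:.8f}")
```

Averaging both sides over sequences with weights $P\{b\}$ then gives the ensemble statement, since the derivative and the sequence average commute.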

This work was supported in part by NIH Grant GM 13556-03.


References

1. S. A. Rice and A. Wada, J. Chem. Phys., 29, 233 (1958).
2. L. Peller, J. Phys. Chem., 63, 1194 (1959).
3. J. H. Gibbs and E. A. DiMarzio, J. Chem. Phys., 30, 271 (1959).
4. T. L. Hill, J. Chem. Phys., 30, 383 (1959).
5. B. H. Zimm and J. K. Bragg, J. Chem. Phys., 31, 526 (1959).
6. B. H. Zimm, J. Chem. Phys., 33, 1349 (1960).
7. S. Lifson and B. H. Zimm, Biopolymers, 1, 15 (1963).
8. S. Lifson, Biopolymers, 1, 25 (1963).
9. S. Lifson and G. Allegra, Biopolymers, 2, 65 (1964).
10. M. Ozaki, M. Tanaka, and E. Teramoto, J. Phys. Soc. Japan, 18, 551 (1963).
11. Y. Kawai, M. Ozaki, M. Tanaka, and E. Teramoto, J. Phys. Soc. Japan, 20, 1457 (1965).
12. D. M. Crothers and B. H. Zimm, J. Mol. Biol., 9, 1 (1964).
13. D. M. Crothers, N. R. Kallenbach, and B. H. Zimm, J. Mol. Biol., 11, 802 (1965).
14. H. Reiss, D. A. McQuarrie, J. P. McTague, and E. R. Cohen, J. Chem. Phys., 44, 4567 (1966).
15. P. J. Flory and W. G. Miller, J. Mol. Biol., 15, 284 (1966).
16. D. M. Crothers and N. R. Kallenbach, J. Chem. Phys., 45, 917 (1966).
17. E. Montroll and S. Goel, Biopolymers, 4, 855 (1966).
18. D. M. Crothers, Biopolymers, 4, 1025 (1966).
19. S. Strässler, J. Chem. Phys., 46, 3037 (1967).
20. M. Ozaki, M. Tanaka, Y. Kawai, and E. Teramoto, Progr. Theor. Phys. (Kyoto), 38, 9 (1967).
21. G. W. Lehman, in Statistical Mechanics, Foundations and Applications, T. A. Bak, Ed., Benjamin, New York, 1967, pp. 204-236.
22. T. R. Fink and D. M. Crothers, Biopolymers, 6, 863 (1968).
23. M. Fixman and D. Zeroka, J. Chem. Phys., 48, 5223 (1968).
24. D. M. Crothers, Biopolymers, 6, 1391 (1968).
25. M. E. Fisher, J. Chem. Phys., 45, 1469 (1966).
26. H. Jacobson and W. H. Stockmayer, J. Chem. Phys., 18, 1600 (1950).
27. C. Truesdell, Ann. Math. [2], 46, 144 (1945).
28. H. B. Dwight, Mathematical Tables, Dover Publications, New York, 1961, pp. 210-213.
29. J. Eigner and P. Doty, J. Mol. Biol., 12, 549 (1965).
30. R. L. Sinsheimer, J. Mol. Biol., 1, 43 (1959).
31. J. D. Watson and F. H. C. Crick, Nature, 171, 964 (1953).
32. Y. Miyazawa and C. A. Thomas, J. Mol. Biol., 11, 228 (1965).
33. A. Skalka, E. Burgi, and A. D. Hershey, J. Mol. Biol., 34, 1 (1968).
34. C. A. Thomas and L. A. MacHattie, Ann. Rev. Biochem., 36, II, 485 (1967).
35. J. Applequist, J. Chem. Phys., 50, 600 (1969).
36. A. Rich, D. R. Davies, F. H. C. Crick, and J. D. Watson, J. Mol. Biol., 3, 71 (1961).

Received May 12, 1969
Revised August 11, 1969