

Incorporating the Protein-Dipole Langevin-Dipole Model into Tanford-Kirkwood Theory

XIANG ZHEXIN, HUANG FUHUA, and SHI YUNYU* Department of Biology, University of Science and Technology of China, Hefei, Anhui, 230026, China

XU YINWU Center for Fundamental Physics, University of Science and Technology of China, Hefei, Anhui, 230026, China

Received 15 March 1994; accepted 19 October 1994

A new technique incorporating the protein-dipole Langevin-dipole (PDLD) model into the Tanford-Kirkwood (TK) formula is proposed, which provides a rather detailed description of solvent and ionic strength effects on the electrostatic energies. The method has been applied to realistic problems: the solvation energies of four residues of bovine pancreatic trypsin inhibitor (BPTI) and the pK shift of His-64 of mutant subtilisin BPN'. We focus our calculation on the back-field effects of bulk solvent.† Our calculations indicate that the back-field effects of bulk solvent on protein dipoles can simply be ignored, introducing a relative error less than 3%, whereas such back-field effects on protein net charges are relatively important and cannot simply be ignored, especially when considering a system of highly charged species. © 1995 by John Wiley & Sons, Inc.

*Author to whom all correspondence should be addressed.

†The determination of protein-induced dipoles is cumbersome due to the solvent screening effects. The protein-induced dipoles are dependent on their local electric fields, which come from the protein net charges, the surrounding water molecules, and bulk solvent. The bulk solvent will usually screen the electric fields from the protein net charges and dipoles, which are defined here as the back-field effects of bulk solvent on protein net charges and dipoles, respectively.

Journal of Computational Chemistry, Vol. 16, No. 12, 1468-1473 (1995) © 1995 by John Wiley & Sons, Inc. CCC 0192-8651/95/121468-06

Introduction

Electrostatic interactions are among the key factors in determining the structure and function of biomolecules. They are involved in enzymic mechanisms,¹ protein folding,² and binding energies³ as well as other essential phenomena, such as allosteric control.⁴ Although the importance of electrostatic effects has been acknowledged, finding a proper method to simulate this interaction is still problematic.⁵ A major problem has been that of accounting for solvent effects far from the first solvation shell.⁶ Since the pioneering work of Kirkwood,⁷ several methods have been developed to solve this problem, of which the Tanford-Kirkwood method⁸ and the PDLD model of Warshel⁹ have made important contributions.

The well-known PDLD model was developed by Warshel and his co-workers.⁹,¹⁰ The PDLD method accounts for effects due to individual protein dipoles by treating each protein atom as bearing a point charge and a point polarizability, and it represents the surrounding solvent by the Langevin equation. This method has worked extremely well for the problems to which it has been applied. In particular, its ability to treat the self-energy contributions made it possible to treat electrostatic contributions to enzyme catalysis in a physically rigorous fashion for the first time.

On the other hand, the success of the Tanford-Kirkwood (TK) model in predicting protein titration curves is encouraging. This macroscopic approach, which was proposed before X-ray structures of proteins were available, views the protein as a medium of low dielectric constant and predicts that all ionizable groups must be near the surface of the protein. Experimental findings of some ionized groups inside proteins indicate that the model should be modified. An attempt to overcome this problem while retaining a macroscopic approach was made by Gurd and co-workers,¹¹,¹² who modified the Tanford-Kirkwood model (TKM) by reducing the electrostatic energy of each group in direct proportion to its solvent accessibility parameter.¹³ However, some limitations still exist in this model. The most serious limitation of the TK model and its modified versions may be the implication that electrostatic energies in proteins are determined and controlled by charge-charge interactions. Although charge-charge interactions are important, neglecting the key role of protein dipoles in determining self-energies of charged groups is risky. Thus, to treat the energies of ion pairs in a protein correctly, one must use a microscopic approach that takes into account the local environment.

For these reasons, we propose a microscopic approach based on the TK model that takes into account the protein dipoles and the surrounding solvent. This is done by combining PDLD with the Tanford-Kirkwood model. We then apply this hybrid method to realistic problems as a test.

General Formalism

We consider a protein of arbitrary shape immersed in a solvent with dielectric constant D (for water, D is taken as 80 in this article). First, we take a sphere s to enclose the whole protein, as shown in Figure 1. The whole system can then be divided into four regions: (1) the charged group of interest; (2) the rest of the protein interior region; (3) the surrounding waters in the sphere s; and (4) the region outside the sphere s, modeled by dielectric constant D. In what follows, charge, distance, and energy are given in units of e, Å, and kcal/mol, respectively.

The electrostatic potential $\psi_p^t$ at any point p inside the sphere s can be represented as follows:

$$\psi_p^t = \psi_q(p) + \psi_u(p) + \psi_w(p) \tag{1}$$

where $\psi_q(p)$ is the Coulombic potential contribution from the protein atomic charges in regions 1 and 2, $\psi_u(p)$ is the Coulombic potential contribution from protein-induced dipoles in regions 1 and 2 plus solvent Langevin dipoles in region 3, and $\psi_w(p)$ is the potential contribution from the bulk solvent outside the sphere s in region 4. The potential $\psi_w(p)$ comes from two sources:

$$\psi_w(p) = \psi_{qw}(p) + \psi_{uw}(p) \tag{2}$$

FIGURE 1. The four-region model used to calculate the electrostatic energies in proteins. Region 1 includes the charged group of interest. Region 2 includes the permanent and induced dipoles of the protein. Region 3 includes surrounding waters that can be simulated by the Langevin-dipole model. Region 4 is the bulk solvent.


where $\psi_{qw}(p)$ is the potential contribution at point p from the bulk solvent polarized by protein charges in regions 1 and 2, and $\psi_{uw}(p)$ is the contribution from the bulk solvent polarized by dipoles in regions 1, 2, and 3. It is well known from the Kirkwood model⁷ that

$$\psi_q(p) = \sum_{i \neq p} \frac{q_i}{r_{ip}} \tag{3}$$

$$\psi_{qw}(p) = \sum_n \sum_i B_n q_i F_n^i \tag{4}$$

$$F_n^i = \frac{r_i^n r_p^n P_n(\cos\theta_{ip})}{a^{2n}} \tag{5}$$

where n is an integer varying from zero to positive infinity ($+\infty$), i is an integer varying from one to the number of protein atomic charges in the sphere s, and $r_{ip}$ is the distance between charge i and point p. The quantity $P_n(\cos\theta)$ is the Legendre function of rank n, where $\theta_{ip}$ is the angle between $\mathbf{r}_i$ and $\mathbf{r}_p$ and a is the radius of the sphere. $B_n$ can be obtained from the boundary conditions⁷ for the case of a zero ionic exclusion region. Taking the dielectric constant in the sphere as $D_i = 1$, we have⁷

$$B_n = \frac{1}{a}\,\frac{(n+1)(1-D)}{(n+1)D+n} - \frac{1}{a}\,\frac{2n+1}{2n-1}\,\frac{D}{[(n+1)D+n]^2}\,\frac{x^2 K_{n-1}(x)}{K_{n+1}(x) + \dfrac{n(D-1)}{(n+1)D+n}\,\dfrac{x^2 K_{n-1}(x)}{(2n-1)(2n+1)}} \tag{6}$$

except for n = 0, where

$$B_0 = \frac{1}{a}\left[\frac{1}{D(1+x)} - 1\right] \tag{7}$$

Here the polynomials $K_n(x)$ can be expressed in the form⁷

$$K_n(x) = \sum_{s=0}^{n} \frac{2^s\, n!\,(2n-s)!}{s!\,(2n)!\,(n-s)!}\, x^s \tag{8}$$

$$x = ka \tag{9}$$

where k is the Debye screening constant. The second part of eq. (1) comes from the dipole contribution, which can be easily obtained as follows:

$$\psi_u(p) = \sum_j \frac{\mathbf{u}_j \cdot \mathbf{r}_{jp}}{r_{jp}^3} \tag{10}$$

where j runs over all the dipoles in regions 1, 2, and 3, including the protein-induced dipoles and solvent Langevin dipoles, and $\mathbf{r}_{jp}$ is the vector from dipole j to point p. To obtain $\psi_{uw}(p)$, it is necessary to consider that a dipole $\mathbf{u}$ is formed by two charges (i.e., $+q_u$ and $-q_u$) while keeping

$$\mathbf{u} = q_u \mathbf{L} \tag{11}$$

where $\mathbf{L}$ is the vector directed from $-q_u$ to $+q_u$, the magnitude of which is the distance between these two charges. So the second part of eq. (2) will be

$$\psi_{uw}(p) = \sum_n \sum_j B_n\, q_u^j \left(F_n^{j+} - F_n^{j-}\right) \tag{12}$$

Here the superscripts + and − represent the charges $+q_u$ and $-q_u$, respectively. It is clear that

$$F_n^{j+} - F_n^{j-} = L\, \mathbf{l}' \cdot \nabla F_n^j \tag{13}$$

where $\mathbf{l}' = \mathbf{L}/L$ and the gradient is taken with respect to the position of the dipole. Incorporating eqs. (11) and (13) into eq. (12), we have

$$\psi_{uw}(p) = \sum_n \sum_j B_n\, \mathbf{u}_j \cdot \nabla F_n^j \tag{14}$$
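The coefficients in eqs. (4)-(9) are straightforward to evaluate numerically. The following Python sketch is illustrative only and is not the authors' code: the function names (`kirkwood_K`, `B_coeff`, `legendre_P`, `bulk_potential_from_charges`) and the series cutoff `n_max` are assumptions introduced here, potentials are returned in units of e/Å, and the ionic part of `B_coeff` assumes the standard Tanford-Kirkwood boundary-condition result for an interior dielectric constant of 1 and no ion-exclusion layer.

```python
import math

def kirkwood_K(n, x):
    """Kirkwood polynomial K_n(x) of eq. (8)."""
    total = 0.0
    for s in range(n + 1):
        coeff = (2 ** s) * math.factorial(n) * math.factorial(2 * n - s)
        coeff /= math.factorial(s) * math.factorial(2 * n) * math.factorial(n - s)
        total += coeff * x ** s
    return total

def B_coeff(n, a, D, x):
    """Coefficient B_n for sphere radius a (Å), solvent dielectric D, x = k*a.

    Assumes interior dielectric 1 and zero ion-exclusion layer.
    """
    if n == 0:
        return (1.0 / (D * (1.0 + x)) - 1.0) / a
    g = (n + 1) * D + n
    dielectric = (n + 1) * (1.0 - D) / g           # pure dielectric response
    k_minus = kirkwood_K(n - 1, x)
    k_plus = kirkwood_K(n + 1, x)
    denom = k_plus + n * (D - 1.0) / g * x * x * k_minus / ((2 * n - 1) * (2 * n + 1))
    ionic = (2 * n + 1) / (2 * n - 1) * D / (g * g) * x * x * k_minus / denom
    return (dielectric - ionic) / a

def legendre_P(n, c):
    """Legendre polynomial P_n(c) by the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p_curr = 1.0, c
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * c * p_curr - k * p_prev) / (k + 1)
    return p_curr

def bulk_potential_from_charges(charges, p, a, D, x, n_max=30):
    """Bulk-solvent potential at p due to charges, eq. (4), truncated at n_max.

    charges: list of (q, (x, y, z)) with coordinates relative to the sphere center.
    """
    r_p = math.sqrt(sum(c * c for c in p))
    psi = 0.0
    for q, pos in charges:
        r_i = math.sqrt(sum(c * c for c in pos))
        if r_i > 0.0 and r_p > 0.0:
            cos_t = sum(u * v for u, v in zip(pos, p)) / (r_i * r_p)
        else:
            cos_t = 1.0  # angle is irrelevant: only the n = 0 term survives
        ratio = r_i * r_p / (a * a)
        for n in range(n_max + 1):
            psi += B_coeff(n, a, D, x) * q * ratio ** n * legendre_P(n, cos_t)
    return psi
```

For a unit charge at the center of a 22 Å sphere at zero ionic strength, only the n = 0 term contributes and the function returns the Born-like value (1/D − 1)/a. The truncated series converges slowly for points close to the sphere surface, so `n_max` would need to be raised there.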

For our purpose of considering the protein-induced dipoles, a consistent procedure to calculate the electric field at any point in the sphere must be obtained. These induced dipoles are due to the distortion of the electron clouds by the permanent dipoles of the protein and water. These effects can be simulated simply by assigning an induced point dipole to each atom in the protein:

$$\mathbf{u}_p = \alpha \mathbf{E}_p^t \tag{15}$$

where $\alpha$ is the scalar atomic polarizability of atom p and $\mathbf{E}_p^t$ is the total local electric field acting on it, which can be expressed as $\mathbf{E}_p^t = -\nabla\psi_p^t$, with $\psi_p^t$ governed by eq. (1).

For water molecules considered in region 3, the Langevin equation can be applied to determine their dipoles:

$$\mathbf{u}_i = \mathbf{e}_0 u_0 \left[\coth(x_i) - 1/x_i\right] \tag{16}$$

$$x_i = u_0 \left(\mathbf{E}_i^t - \mathbf{E}_n\right) \cdot \mathbf{e}_0 / k_B T \tag{17}$$

$$\mathbf{e}_0 = \mathbf{E}_i^q / E_i^q \tag{18}$$

where i denotes the ith water molecule. The permanent dipole of the solvent is $u_0$, which can be taken as 0.35 eÅ.¹³ $\mathbf{E}_i^q$ is the vacuum field from the protein charges, $\mathbf{E}_i^t$ is the total local field, and $\mathbf{E}_n$ is the field from the nearest neighbors. $k_B$ is the Boltzmann constant and T is the absolute temperature.
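As a rough illustration of eqs. (16)-(18), the Python sketch below evaluates one Langevin dipole from given field vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `langevin` and `langevin_dipole`, the temperature of 300 K, and the explicit conversion factor 332.06 kcal Å/(mol e²) (needed because fields are taken in e/Å² and energies in kcal/mol) are all assumptions made here.

```python
import math

KB = 0.0019872     # Boltzmann constant in kcal/(mol K)
COULOMB = 332.06   # converts e^2/Å to kcal/mol (assumed unit convention)

def langevin(x):
    """Langevin function coth(x) - 1/x, using the series x/3 near zero."""
    if abs(x) < 1e-4:
        return x / 3.0
    return 1.0 / math.tanh(x) - 1.0 / x

def langevin_dipole(E_q, E_t, E_n, u0=0.35, T=300.0):
    """Dipole (eÅ) of one solvent site following eqs. (16)-(18).

    E_q: vacuum field of the protein charges at the site (e/Å^2)
    E_t: total local field at the site
    E_n: field from the nearest neighbors
    T is an assumed temperature; the text does not state one.
    """
    norm = math.sqrt(sum(c * c for c in E_q))
    if norm == 0.0:
        return (0.0, 0.0, 0.0)
    e0 = [c / norm for c in E_q]                       # eq. (18)
    proj = sum((t - n) * e for t, n, e in zip(E_t, E_n, e0))
    x = COULOMB * u0 * proj / (KB * T)                 # eq. (17)
    magnitude = u0 * langevin(x)                       # eq. (16)
    return tuple(magnitude * e for e in e0)
```

In weak fields the dipole grows linearly with the field, and in strong fields its magnitude saturates at `u0`, which is the behavior the Langevin function is meant to capture.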

If we are given the charges and dipoles inside the sphere, we can obtain the potential at any position p in s by eq. (1). As such, the protein-induced dipoles and solvent Langevin dipoles should be determined first. They can be obtained using eqs. (15) and (16)-(18), respectively. Because electrostatic fields must satisfy $\mathbf{E}_p^t = -\nabla\psi_p^t$, an iterative scheme is needed to achieve a balance between eq. (1) and eqs. (15)-(18). After such a balance has been obtained, we can calculate the charge-solvent and charge-dipole interactions. Such energy interactions have been discussed many times in different articles and are not repeated here.
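The iterative balance described above can be sketched for the simplest case. The Python example below is a deliberately reduced illustration, not the authors' scheme: it iterates only the induced point dipoles of eq. (15) in the field of fixed charges and of each other, and omits the Langevin dipoles and the bulk-solvent terms. The function names, the damping factor `mix`, and the convergence tolerance are assumptions introduced here. Units are e, Å, and Å³ for polarizabilities.

```python
import math

def charge_field(q, r):
    """Field at displacement r (Å) from a point charge q (e)."""
    d = math.sqrt(sum(c * c for c in r))
    return [q * c / d ** 3 for c in r]

def dipole_field(u, r):
    """Field at displacement r (Å) from a point dipole u (eÅ)."""
    d = math.sqrt(sum(c * c for c in r))
    ur = sum(a * b for a, b in zip(u, r))
    return [(3.0 * ur * rc / d ** 2 - uc) / d ** 3 for uc, rc in zip(u, r)]

def solve_induced_dipoles(positions, alphas, charges, tol=1e-8, max_iter=500, mix=0.5):
    """Damped fixed-point iteration of u_p = alpha_p * E_p.

    positions: atom coordinates; alphas: polarizabilities (Å^3);
    charges: list of (q, position). Returns (dipoles, iterations).
    """
    n = len(positions)
    # Field of the fixed charges at every polarizable site.
    E0 = []
    for pos in positions:
        field = [0.0, 0.0, 0.0]
        for q, q_pos in charges:
            r = [a - b for a, b in zip(pos, q_pos)]
            if any(r):  # skip a charge sitting exactly on the site
                field = [x + y for x, y in zip(field, charge_field(q, r))]
        E0.append(field)
    dipoles = [[alphas[p] * e for e in E0[p]] for p in range(n)]
    for iteration in range(1, max_iter + 1):
        updated = []
        for p in range(n):
            field = list(E0[p])
            for j in range(n):
                if j != p:
                    r = [a - b for a, b in zip(positions[p], positions[j])]
                    field = [x + y for x, y in zip(field, dipole_field(dipoles[j], r))]
            updated.append([alphas[p] * e for e in field])
        change = max(abs(a - b)
                     for new, old in zip(updated, dipoles)
                     for a, b in zip(new, old))
        dipoles = [[mix * a + (1.0 - mix) * b for a, b in zip(new, old)]
                   for new, old in zip(updated, dipoles)]
        if change < tol:
            return dipoles, iteration
    raise RuntimeError("induced dipoles did not converge")
```

Damping the update with `mix` is a common way to stabilize such fixed-point iterations when neighboring polarizable sites are close together; a full treatment would recompute the Langevin dipoles and the bulk-solvent potential inside the same loop.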

Applications

Any model must be tested against realistic problems. In such a process, a poor model will be discarded, and a good model may be further improved. Probably the most crucial test of our method is the evaluation of the electrostatic energetics of a single charged group in proteins. To test the validity of this method, we first apply it to a number of amino acids and four residues of bovine pancreatic trypsin inhibitor upon ionizing them in water, and then we evaluate the pK_a shift of 64-His of subtilisin BPN'. In the calculation procedure, the water molecules in region 3 are modeled by the Langevin equation, in which the water permanent dipoles are taken as 0.35 eÅ. The polarizabilities of the protein atoms are determined by eq. 7 in ref. 14, whereas the polarizability of hydrogen is set as 0.5 Å³. The charges of the protein atoms are taken from the GROMOS parameter set.¹⁵ The dielectric constants inside and outside the sphere are taken as 1 and 80, respectively. The available X-ray coordinates from the Protein Data Bank (PDB entries 4PTI and SNI) are taken as an average structure. Polar hydrogen positions were generated using the facility of the GROMOS package.¹⁵ The center of the sphere was chosen at the average position of the molecular atoms. A sphere with a proper radius enclosing the whole protein is assumed for the molecule. The surrounding waters in region 3 were inserted into the sphere by immersing the whole sphere containing the protein in an equilibrium configuration of bulk SPC water¹⁶ and subsequently removing all water molecules that are outside the sphere. In the following tables, ΔΔG_q is the charge-charge interaction energy between regions 1 and 2; ΔΔG_u is the charge-dipole interaction energy between charges in regions 1 and 2 and dipoles in regions 1, 2, and 3; ΔΔG_w is the interaction energy between charges in regions 1 and 2 and bulk solvent outside the sphere s; and ΔΔG_t is the sum of ΔΔG_q, ΔΔG_u, and ΔΔG_w. In what follows, the position of a group means the average position of all of the group's atoms.
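The geometric part of this setup (sphere center at the average atomic position, then discarding waters outside the sphere) can be sketched in a few lines of Python. This is only an illustration of the steps as described; the function names and the plain-tuple coordinate format are assumptions, and removal of waters that overlap protein atoms, which the text does not describe, is not attempted.

```python
import math

def centroid(atoms):
    """Average position of a list of (x, y, z) atom coordinates."""
    n = len(atoms)
    return tuple(sum(atom[k] for atom in atoms) / n for k in range(3))

def enclosing_radius(atoms, center):
    """Smallest radius about center that encloses every atom."""
    return max(math.dist(atom, center) for atom in atoms)

def waters_in_sphere(waters, center, radius):
    """Keep only the water sites that lie inside the sphere."""
    return [w for w in waters if math.dist(w, center) <= radius]
```

A usage pattern would be to compute `center = centroid(protein_atoms)`, check that the chosen radius (22 Å for BPTI in the text) is at least `enclosing_radius(protein_atoms, center)`, and then filter the bulk-water configuration with `waters_in_sphere`.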

SOLVATION ENERGY CHANGE OF THE FOUR RESIDUES OF BPTI

We first calculated the change of solvation energies of the four residues (3Asp, 7Glu, 49Glu, and 50Asp), respectively, when all other ionizable groups were neutral. The radius of the sphere was chosen as 22 Å. To evaluate the back-field effects of bulk solvent outside the sphere on the final results, we did the calculations in three steps. First, the back-field effects of bulk solvent were fully considered, the results of which are listed in Table I. Second, the back-field effects on the protein dipoles (including protein permanent and induced dipoles) were ignored; the deviation of the solvation energy change due to this is shown in column a of Table II. For groups with the sum of charges equivalent to zero, we can treat them as protein permanent dipoles. Third, the back-field effects of bulk solvent on the protein net charges were ignored, and the deviation of the change of solvation energy due to this is listed in column b of Table II. Because the other ionizable groups are neutral, the net charges are only on the ionized groups.

TABLE I. Comparison of the Calculated Solvation Energies with the Observed Data for the Four Amino Residues in BPTI.

Residue   ΔΔG_q   ΔΔG_u   ΔΔG_w   ΔΔG_t   ΔΔG_exp
3Asp       -0.9   -15.7   -54.9   -71.5     -71.3
7Glu      -15.2    -8.7   -43.2   -67.1     -70.6
49Glu     -27.0   -12.2   -34.8   -74.0     -70.3
50Asp     -32.0   -10.0   -25.3   -67.3     -70.3

Those water molecules more than 10 Å away from the residues of interest in region 3 were ignored. All parameters were set as in the text. ΔΔG_exp is the experimental value from ref. 10. All values are given in kcal/mol.

TABLE II. Deviation of Solvation Energy Change due to Ignoring the Back-Field Effects of Bulk Solvent.

Residue      a      b
3Asp      -1.5   -3.1
7Glu      -0.2   -0.0
49Glu     -0.7   -0.9
50Asp     -0.5   -0.5

Column a is the case without considering the back-field effects of bulk solvent on protein dipoles, whereas column b is the case without considering the back-field effects of bulk solvent on the protein net charges. The deviation was obtained by comparing the solvation energy change between two cases when the back-field effects of bulk solvent were fully considered or partly ignored. All values are given in kcal/mol. All parameters were set as in Table I.

The results in Table I indicate that our method provides a reliable means for quantitative estimation of solvation energy. From Table II, we know that the back-field effects of bulk solvent on the protein dipoles are unimportant, introducing an error of at most 1.5 kcal/mol in magnitude (a relative error of about 2%), whereas such back-field effects on only one residue with a unit net charge are somewhat important, introducing an error of up to -3.1 kcal/mol. The largest deviation of -3.1 kcal/mol on 3Asp and the smallest deviation of zero on 7Glu are to be expected because 3Asp is the nearest to the sphere surface whereas 7Glu is the farthest. For highly charged species such as nucleic acids, ignoring the back-field effects of bulk solvent on the nonzero charged groups might introduce a larger error in the final results. This assumption was justified by the following calculations, with each acid's nearest neighbor acid group ionized.

Table III shows the energy contribution to the ionization of the aforementioned four residues with the back-field effects of bulk solvent fully considered, whereby each residue's nearest neighbor acid group is ionized. Calculation was then performed in the case without considering the back-field effects of bulk solvent on protein dipoles and net charges (the deviation of the solvation energy change due to this is shown in Table IV).

TABLE III. Calculated Energy Contribution to the Ionization of the Four Acidic Groups in BPTI when Each Amino Residue's Nearest Neighbor Acid Group Is Ionized.

Residue   ΔΔG_q   ΔΔG_u   ΔΔG_w   ΔΔG_t   ΔΔG_exp
3Asp       30.7   -16.0   -86.8   -72.1     -70.5
7Glu       16.4    -9.1   -80.8   -73.5     -69.8
49Glu      15.1   -12.3   -68.7   -65.9     -69.3
50Asp      10.0   -10.1   -65.2   -65.3     -69.3

Those water molecules more than 15 Å away from the residues of interest in region 3 were ignored. All parameters were set as in the text. ΔΔG_exp is the experimental value from ref. 10. All values are given in kcal/mol.

TABLE IV. Deviation of Solvation Energy Change due to Ignoring the Back-Field Effects of Bulk Solvent.

Residue   Deviation
3Asp         -3.6
7Glu         -6.4
49Glu        -2.1
50Asp        -4.0

The back-field effects of bulk solvent on protein dipoles and net charges were not considered. All parameters were set as in Table III. All values are given in kcal/mol. Each amino residue's nearest neighbor acid group is not shown in this table but is listed in Table III.

For the present case with two ionized groups, ignoring the back-field effects of bulk solvent may create a deviation of the solvation energy change of up to -6.4 kcal/mol, as shown in Table IV, much bigger than that with only one group ionized, as shown in Table II. To show the influence of ionic strength on the change of solvation energy, we performed calculations for the case of an ionic strength of 0.5 mol/L. Ionic strength effects in region 4 can be handled automatically by the TK model. However, ionic strength effects in region 3 must be considered specifically. Moreover, ions in region 3 are usually more important because they are closer to the protein than those in region 4. This ionic strength in region 3 can be handled with the finite element method.¹⁷ Although ionic strength plays an important role in estimation of the pK-shift value, its effect in determining solvation energies is comparatively negligible. Even in the case of a high ionic strength of 0.5 mol/L, the effects of ionic strength on the total solvation energies of the four acids 3Asp, 7Glu, 49Glu, and 50Asp are only about -1.0, -0.4, -1.3, and -0.7 kcal/mol, respectively.

TABLE V. Calculated and Experimental pK Difference of His-64 of Mutant Subtilisin BPN'.

Residue substitution   ΔpK_a calculated   ΔpK_a experimental
99Asp → Ser                  0.22                0.42
156Glu → Ser                 0.25                0.42
36Asp → Gln                  0.07                0.18
213Lys → Thr                 0.05                0.08

Those water molecules more than 20 Å away from His-64 in region 3 were ignored. Ionic strength is 0.01 mol/L. Experimental values are taken from refs. 18 and 19.

THE pK SHIFT OF 64-HIS OF MUTANT SUBTILISIN BPN'

We now apply the method to the calculation of the pK shift of 64-His of the mutant subtilisin BPN'. A sphere with a radius of 33 Å was used to enclose the whole BPN'. Because we do not know which of the two nitrogen sites in 64-His is actually protonated, we chose the site at their average position. The pK shift due to the change at each of the four charged residues, respectively (99Asp → Ser, 36Asp → Gln, 156Glu → Ser, and 213Lys → Thr), was calculated on His-64 of mutant subtilisin BPN' under an ionic strength of 0.01 mol/L, as shown in Table V. These results suggest that quantitative calculations of the pK difference are practically feasible, although the agreement is not completely satisfying. In our calculation of the pK difference, the back-field effects of bulk solvent were fully considered.
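The text does not spell out how an electrostatic energy difference is converted to a pK difference. A minimal sketch, assuming the standard thermodynamic relation ΔpK = ΔΔG / (2.303 RT) and a temperature of 298.15 K (both assumptions, as are the function names), is shown below in Python. The sign convention is left to the caller.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def pk_shift_from_energy(ddG, T=298.15):
    """pK units corresponding to a free energy difference ddG (kcal/mol)."""
    return ddG / (math.log(10.0) * R_KCAL * T)

def energy_from_pk_shift(dpk, T=298.15):
    """Free energy difference (kcal/mol) corresponding to dpk pK units."""
    return dpk * math.log(10.0) * R_KCAL * T
```

Under these assumptions one pK unit corresponds to roughly 1.36 kcal/mol, so the calculated and experimental shifts of 0.22 and 0.42 for 99Asp → Ser in Table V differ by only a few tenths of a kcal/mol, far smaller than the solvation energies in Tables I and III.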

Discussion

Our approach, based on incorporating PDLD into the TK model, provides a reliable means for quantitative treatment of the solvation problem. Our calculations show that the back-field effects of the bulk solvent on the protein dipoles can be ignored, whereas such effects on the net charges are somewhat important, especially in a highly charged system. Because the bulk solvent and protein-induced dipoles polarize each other, ignoring the interaction between them may speed the calculation greatly. However, in general, one must examine the importance of such back-field effects of the bulk solvent in different systems.

Compared with the previous Tanford-Kirkwood (TK) method, in which protein dipoles are ignored in determining self-energies of charged groups, our method is a step forward in the simulation of biomolecular functions that occur at the molecular level.

Acknowledgments

We thank Professor Wang Cunxin and Dr. Wu Jihui for their kind help in preparing this work. This work was co-supported by grant 863-103-22-01 for protein engineering from the Chinese National High Technology Project and grant 8512-11-12 from the National Commission of Science and Technology of China.

References

1. A. Warshel, Biochemistry, 20, 3167 (1981).
2. J. M. Thornton, Nature, 295, 13 (1982).
3. M. Gilson and B. Honig, Proteins, 4, 7 (1988).
4. J. M. Baldwin, Prog. Biophys. Mol. Biol., 29, 225 (1975).
5. M. E. Davis and J. A. McCammon, Chem. Rev., 90, 509 (1990).
6. A. A. Rashin, J. Phys. Chem., 94, 1725 (1990).
7. J. G. Kirkwood, J. Chem. Phys., 2, 351 (1934).
8. C. Tanford and J. G. Kirkwood, J. Am. Chem. Soc., 79, 5333 (1957).
9. A. Warshel and M. Levitt, J. Mol. Biol., 103, 227 (1976).
10. S. T. Russell and A. Warshel, J. Mol. Biol., 185, 389 (1985).
11. S. J. Shire, G. I. H. Hanania, and F. R. N. Gurd, Biochemistry, 13, 2967 (1974).
12. J. B. Matthew, G. I. H. Hanania, and F. R. N. Gurd, Biochemistry, 18, 1919 (1979).
13. A. Warshel, S. T. Russell, and A. K. Churg, Proc. Natl. Acad. Sci. USA, 81, 4785 (1984).
14. Y. W. Xu, C. X. Wang, and Y. Y. Shi, J. Comp. Chem., 13, 1109 (1992).
15. W. F. Van Gunsteren and H. J. C. Berendsen, Groningen Molecular Simulation (GROMOS) Library Manual, Biomos, Groningen, 1987.
16. H. J. C. Berendsen, J. P. M. Postma, W. F. Van Gunsteren, and J. Hermans, In Intermolecular Forces, B. Pullman, ed., Reidel, Dordrecht, The Netherlands, p. 331, 1981.
17. F. S. Lee, Z. T. Chu, and A. Warshel, J. Comp. Chem., 14, 161 (1993).
18. M. K. Gilson and B. Honig, Nature, 330, 84 (1987).
19. M. J. E. Sternberg, F. R. F. Hayes, A. J. Russell, P. G. Thomas, and A. R. Fersht, Nature, 330, 86 (1987).
