university of groningen molecular dynamics simulations of

University of Groningen

Molecular dynamics simulations of haloalkane dehalogenaseLinssen, A.B M

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite fromit. Please check the document version below.

Document VersionPublisher's PDF, also known as Version of record

Publication date:1998

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):Linssen, A. B. M. (1998). Molecular dynamics simulations of haloalkane dehalogenase: A statisticalanalysis of protein motions. s.n.

CopyrightOther than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of theauthor(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

The publication may also be distributed here under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license.More information can be found on the University of Groningen website: https://www.rug.nl/library/open-access/self-archiving-pure/taverne-amendment.

Take-down policyIf you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediatelyand investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons thenumber of authors shown on this cover page is limited to 10 maximum.

Download date: 30-10-2021

https://research.rug.nl/en/publications/molecular-dynamics-simulations-of-haloalkane-dehalogenase(e774a47b-c5de-4681-bde5-6d28ca0adb34).html

Chapter 3

Essential Dynamics of

Proteins1

3.1 Introduction

Functional proteins are generally stable mechanical constructs that allowcertain types of internal motion to enable their biological function. Theinternal motions may allow the binding of a substrate or coenzyme, theadaptation to a di�erent environment as in speci�c aggregation, or thetransmission of a conformational adjustment to a�ect the binding or re-activity at a remote site, as in allosteric e�ects. Such functional internalmotions may be subtle and involve complex correlations between atomicmotions, but their nature is inherent in the structure and interactionswithin the molecule. It is a challenge to derive such motions from themolecular structure and interactions, to identify their functional role,and to reduce the complex protein dynamics to its essential degrees offreedom.

We investigate the correlations between atomic positional uctua-tions in a protein, as derived from (nanosecond) molecular dynamics

1Most of this chapter was published in: A. Amadei, A.B.M. Linssen and H.J.C.Berendsen, Essential Dynamics of Proteins, Proteins: (1993), 17, 412-425

33

34 CHAPTER 3. ESSENTIAL DYNAMICS OF PROTEINS

(MD), both in vacuum and in aqueous environment. By diagonalizingthe covariance matrix of the atomic displacements, we �nd that most ofthe positional uctuations are concentrated in correlated motions in asubspace of only a few ( not more than 1%) degrees of freedom, while allother degrees of freedom represent much less important, basically inde-pendent, Gaussian uctuations orthogonal to the "essential" subspace.The motion outside the essential subspace can be considered as essentiallyconstrained. This o�ers the possibility of representing protein dynamicsin the essential subspace only.

Our treatment di�ers from a harmonic or quasi-harmonic normalmode analysis [27, 28, 29, 30, 31] in two ways. First, we do not ana-lyze the motion but rather the positional uctuations, without involvingthe atomic masses in the analysis. Our purpose is to identify an "irrele-vant" subspace which may be considered essentially constrained. Second,we do not attempt to describe the motion in the "essential" subspace asharmonic, or even as mutually uncoupled, because it is neither harmonicnor uncoupled and such a treatment would restrict the mechanics of aprotein to the level of uninteresting vibrations. The projection of a MDtrajectory onto normal mode axes as carried out by Horiuchi and G�o[28] bears a resemblance to our analysis of displacements in the essentialsubspace, be it that the spaces onto which the motion is projected arenot the same: they consider a dihedral angle subspace de�ned by normalmodes of low frequency: we retain Cartesian coordinates and de�ne thesubspace from the covariance matrix. They �nd that the motions in thelower modes are restricted in narrower ranges than those derived by theharmonic approximation; we �nd a similar restriction due to nonlinearbehaviour and the presence of nonlinear constraints within the essentialsubspace. Our analysis is in fact identical to the one described by Garcia[32]. He found that the largest linearly correlated motions in a proteinare de�ned by the eigenvectors of the covariance matrix with the largesteigenvalue. Also he recognized that these motions are far from harmonic.In fact, his and our approach correspond to Principal Component Anal-ysis (PCA) of the con�gurational space.

3.2. THEORY 35

The covariance matrix of atomic displacements has been used pre-viously for quasiharmonic analysis [29, 30], for entropy determination[33, 34, 35], and for an analysis of collective motions [27]. We note thatIchiye and Karplus [27] construct the N � N covariance matrix, whereN is the number of atoms in the system, of the vectorial inner products,while we construct the 3N �3N matrix of Cartesian displacements. Theformer indicates whether two particles move in the same or opposite di-rections, while the latter includes more complex correlations such as twistand mutually perpendicular displacements.

3.2 Theory

We consider the dynamics of a protein in equilibrium in a given envi-ronment at temperature T . Assume that a trajectory in phase space isavailable from a reliable MD simulation. We �rst eliminate the overalltranslational and rotational motion because these are irrelevant for theinternal motion we wish to analyze. The precise method of eliminatingthe overall motion is not important: either the linear and angular mo-ments are removed every step in the simulation, or the molecular axesare constructed each step by a least-squares translational and rotational�t. The result in any case is a Cartesian molecular coordinate system inwhich the atomic motions can be expressed. The internal motion is nowdescribed by a trajectory x(t), where x can represent a subset of atoms.The correlation between atomic motions can be expressed in the covari-ance matrix C of the positional deviations:

C = cov(x) = h(x� hxi)(x� hxi)T i (3.1)

where hi denote an average over time. The symmetric matrix C canalways be diagonalyzed by an orthogonal coordinate transformation T :

x� hxi = Tq or q = T T (x� hxi) (3.2)


which transforms C into a diagonal matrix � = hqqT i of eigenvalues �i:

C = T�T T or � = T TCT (3.3)

The ith column of T is the eigenvector belonging to �i. When a su�cientnumber of independent con�gurations (at least 3N + 1) is available toevaluate C, there will be 3N eigenvalues, of which at least 6 representingoverall translation and rotation are nearly zero. When a number of con-�gurations, S, less then 3N +1, is analyzed, the total number of nonzeroeigenvalues is at most S�1 since the covariance matrix will not have fullrank (see appendix B).

The matrix C has the property of being always connected to theholonomic constraints of the system. In appendix A we show that asubspace which is forbidden (or almost forbidden) for the motion is al-ways fully de�ned by a subset of eigenvectors of the matrix C with zero(or approximately zero) eigenvalues. It is also important to note thatthe probability distribution of the displacements along the eigenvectors,although linearly uncorrelated, is not necessarily statistically indepen-dent. On the other hand, if a linear orthogonal transformation de�nes asubset of statistically independent generalized coordinates, then the unitvectors corresponding to this subset will always be eigenvectors of thecovariance matrix C. The total positional uctuation

Pih(xi � hxii)2i

can be thought to be built up from the contributions of the eigenvectors:

Pi h(xi � hxii)2 > = h(x� hxi)T (x� hxi)i =hqTT TTqi = hqTqi = P

i hq2i i =P

i �i(3.4)

We choose to sort �i in order of decreasing value. Thus the �rst eigenvec-tors represent the largest positional deviations, and most of the positional uctuations reside in a limited subset of the �rst n eigenvalues, where nis small compared to a total of 3N .

3.3. METHODS 37

3.3 Methods

Analysis was performed on the trajectories of two distinct simulations ofhen eggwhite lysozyme.A simulation in vacuum was performed by the authors, using the GRO-MOS simulation package and the GROMOS force �eld [15]. A startingstructure was taken from the Brookhaven Protein Data Bank [36], en-try 3LYZ. Including polar hydrogens, the system contained 1258 atoms.Nonpolar hydrogens were incorporated implicitly by the use of unitedatoms. In total a simulation of 1 ns was performed, with a step size of 2fs. The temperature was kept at 298 K by coupling to an external tem-perature bath [37] , with a coupling constant � = 0:01 ps. Bond lenghtswere constrained using the procedure SHAKE [16]. Rotational motionaround, and translationalmotion of the center of mass was removed every0.5 ps to prevent conversion of thermal motions into overall rotationaland translational ones. Nonbonded interactions were evaluated using ashort cuto� range of 0.8 nm, within which interactions were calculatedevery time step. Interactions in the range of 0.8-1.2 nm were updatedevery 20 fs. During the simulation, con�gurations were saved every 0.5 ps.

A. Mark kindly o�ered a 900 ps (100-1000 ps) trajectory of a simula-tion of lysozyme. This simulation, which included 5,345 water molecules,was performed at 300 K, also using the GROMOS package and and thecorresponding force �eld [15] . Here con�gurations were saved every 0.05ps. For further details concerning this calculation we refer to Smith et al[38].

Before the covariance matrix was built, all con�gurations were �t-ted to the �rst con�guration by �rst �tting the center of mass and nextperforming a least square �t procedure [39] on the C� coordinates. Co-variance matrices C were constructed from the position coordinates ofthe atoms (all atoms or C� atoms only) according to:

Cij =1

S

Xt

fxi(t)� hxiigfxj(t) � hxjig; (3.5)


where S is the total number of con�gurations, t = 1; 2; :::S, xi(t) are theposition coordinates with i = 1; 2; ::::3N , N is the number of atoms fromwhich C is constructed and hxii is the average of coordinate i over allcon�gurations. The system contained 129 C� atoms, having 387 positioncoordinates. Eigenvalues and their corresponding eigenvectors were cal-culated using the QL algorithm [40]. Diagonalization of the C� matrices,of size 387 by 387, required 24 s of CPU time on a single processor ofa CONVEX 240, while diagonalizing the all atom covariance matrix ofthe solvent simulation of size 3792 by 3792 required 20.4 hr on the samemachine.

3.4 Results and Discussion

Three di�erent covariance matrices were diagonalized. The correspond-ing eigenvalues are shown in �g. 3.1, plotted in descending order againstthe corresponding eigenvector indices. Fig 3.1a shows the eigenvaluesfrom the matrix that was constructed from (387) C� coordinates in thevacuum simulation. In �g. 3.1b we show the eigenvalues as obtained fromthe (387) C� coordinates from the solvent simulation. Finally, �g. 3.1cshows the eigenvalues obtained by analyzing the covariance matrix con-structed from all atom coordinates (3792) of the protein from the solventsimulation. In this case the �rst few eigenvalues (mean square displace-ments) are one order of magnitude larger then in the previous plots,because the number of atoms involved in these displacements is approx-imately 10 times larger. Since the eigenvalues are mean square displace-ments, it is clear from �g. 3.1 that the con�gurational space of the proteinis not a homogeneous space, in terms of the motion along the eigenvectordirections. As can be seen from �g. 3.1a and b, the eigenvalues from thesolvent simulation show a steeper decrease than those from the vacuumsimulation. One reason for this may be the fact that the force �eldsused are not equivalent. In the vacuum force �eld, full charges have beenreplaced by dipoles. This produces a weakening of the electrostatic in-

3.4. RESULTS AND DISCUSSION 39

0 50 100 150 200

a

b

c

eigenvector index

0.0

10.0

20.0

0.0

0.5

1.0

eige

nval

ue (

nm2 )

0.0

0.5

1.0

Figure 3.1: Eigenvalues, in decreasing order of magnitude, obtained from(a) C� coordinates covariance matrix from the vacuum simulation; (b)C� coordinates covariance matrix from the solvent simulation; (c) allatom coordinates covariance matrix from the solvent simulation.

teractions that, as we found, mainly a�ects the near constraints. As faras the methodology presented here is concerned there are no basic di�er-ences between vacuum and solvent simulation; so in the subsequent textwe will show only the results obtained from the solvent simulation. Atthe end of this section the vacuum and solvent results will be compared.

The amount of motion associated to a subspace spanned by the �rstn eigenvectors can be de�ned as the corresponding subspace positional uctuation (eq. 3.4, for the summation over n) where the eigenvalues areordered in descending order. In �g. 3.2 we show this relative subspacepositional uctuation (with respect to the total positional uctuation)versus the increasing number of eigenvectors that span the subspace. In


100 200 300 400eigenvector index

0.0

0.5

1.0

b

100 200 300 4000.0

0.5

1.0

a0.9

20

0.9

35

rela

tive

posi

tiona

l flu

ctua

tion

Figure 3.2: (a) Relative positional uctuation (see text) of the motionsalong the eigenvectors obtained from the C� coordinates matrix matrix(solvent simulation). (b) Relative positional uctuation of motions alongthe eigenvectors obtained from the all atom coordinates covariance ma-trix (solvent simulation).

�g. 3.2a we show the results as obtained from the C� matrix eigenvec-tors. Fig. 3.2b shows the results from the all atom analysis. From �g. 3.2it can be seen that 90% of the total motion is described by the �rst 20eigenvectors out of 387. If we analyze the motion due to all atoms (�g. 3.2b) we see that the �rst 35 eigenvectors out of 3792 contribute to90% of the overall motion. This shows that most of the internal motionof the protein is con�ned within a subspace of very small dimension.

To have a closer look at the motion along the eigenvector directionsone can project the trajectory onto these individual eigenvectors. In�g. 3.3 some projections of the C� trajectory on the eigenvectors obtainedfrom the C� covariance matrix are plotted against time. It is clear from


100.0 300.0 500.0 700.0 900.0-3.0

-2.0

-1.0

0.0

1.0

2.0

3.0

vec 5

-3.0

-2.0

-1.0

0.0

1.0

2.0

3.0di

spla

cem

ent (

nm)

vec 3

-3.0

-2.0

-1.0

0.0

1.0

2.0

3.0

vec 1

100.0 300.0 500.0 700.0 900.0

vec 6

vec 4

vec 2

100.0 300.0 500.0 700.0 900.0time (ps)

-3.0

-2.0

-1.0

0.0

1.0

2.0

3.0

vec 20

-3.0

-2.0

-1.0

0.0

1.0

2.0

3.0

disp

lace

men

t (nm

)

vec 9

-3.0

-2.0

-1.0

0.0

1.0

2.0

3.0

vec 7

100.0 300.0 500.0 700.0 900.0time (ps)

vec 50

vec 10

vec 8

Figure 3.3: Motions along several eigenvectors obtained from the C�

coordinates covariance matrix (solvent simulation)


-2.0 -1.0 0.0 1.0 2.00.0

1.0

2.0

3.0

4.0 vec 5

0.0

1.0

2.0

3.0

4.0

prob

abili

ty

vec 3

0.0

1.0

2.0

3.0

4.0 vec 1

-2.0 -1.0 0.0 1.0 2.00.0

1.0

2.0

3.0

4.0 vec 6

0.0

1.0

2.0

3.0

4.0 vec 4

0.0

1.0

2.0

3.0

4.0 vec 2

-2.0 -1.0 0.0 1.0 2.0displacement (nm)

0.0

1.0

2.0

3.0

4.0 vec 20

0.0

1.0

2.0

3.0

4.0

prob

abili

ty

vec 9

0.0

1.0

2.0

3.0

4.0 vec 7

-2.0 -1.0 0.0 1.0 2.0displacement (nm)

0.0

5.0

10.0

15.0vec50

0.0

1.0

2.0

3.0

4.0 vec 10

0.0

1.0

2.0

3.0

4.0 vec 8

Figure 3.4: Probability distributions for the displacements along severaleigenvectors obtained from the C� coordinates covariance matrix (sol-vent simulation). Solid line: Gaussian distributions derived from theeigenvalues of the corresponding eigenvectors. Dashed line: samplingdistributions.


100.0 300.0 500.0 700.0 900.0-10.0

-5.0

0.0

5.0

10.0

vec 5

-10.0

-5.0

0.0

5.0

10.0di

spla

cem

ent (

nm)

vec 3

-10.0

-5.0

0.0

5.0

10.0

vec 1

100.0 300.0 500.0 700.0 900.0

vec 6

vec 4

vec 2

100.0 300.0 500.0 700.0 900.0time (ps)

-10.0

-5.0

0.0

5.0

10.0

vec 20

-10.0

-5.0

0.0

5.0

10.0

disp

lace

men

t (nm

)

vec 9

-10.0

-5.0

0.0

5.0

10.0

vec 7

100.0 300.0 500.0 700.0 900.0time (ps)

vec 50

vec 10

vec 8

Figure 3.5: Motions along several eigenvectors obtained from the all atomcoordinates covariance matrix (solvent simulation)


-10.0 -5.0 0.0 5.0 10.00.0

0.5

1.0

vec 5

0.0

0.5

1.0

prob

abili

ty

vec 3

0.0

0.5

1.0

vec 1

-10.0 -5.0 0.0 5.0 10.00.0

0.5

1.0

vec 6

0.0

0.5

1.0

vec 4

0.0

0.5

1.0

vec 2

-10.0 -5.0 0.0 5.0 10.0displacment (nm)

0.0

0.5

1.0

vec 20

0.0

0.5

1.0

prob

abili

ty

vec 9

0.0

0.5

1.0

vec 7

-10.0 -5.0 0.0 5.0 10.0displacement (nm)

0.0

0.5

1.0

1.5

2.0

2.5 vec 50

0.0

0.5

1.0

vec 10

0.0

0.5

1.0

vec 8

Figure 3.6: Probability distributions for the displacements along severaleigenvectors obtained from the all atom coordinates covariance matrix(solvent simulation). Solid line: Gaussian distributions derived from theeigenvalues of the corresponding eigenvectors. Dashed line: samplingdistributions.


this �gure that all motions that have not yet reached their equilibrium uctuation belong to the �rst 10 eigenvectors. Fig. 3.4 shows the observeddistribution functions for the displacements along the same eigenvectors,as well as the corresponding Gaussian functions with the same varianceand average value. Obviously the only non-Gaussian distributions areagain found within the �rst 10 eigenvectors. Fig. 3.5 and 3.6 show thesame, but now projections and distributions have been evaluated usingthe eigenvectors that were obtained from the all atom covariance matrix.Just as in the case for the C� analysis we �nd that all the motions thathave not yet reached equilibrium uctuation are con�ned within the �rst10 eigenvectors. Also the only non-Gaussian distributions appear withinthe same eigenvectors.

We also noticed a great similarity between the motions along the �rstfew eigenvectors of the C� matrix and those along the �rst few eigen-vectors derived from the all atom matrix. To investigate this similar-ity further, we extracted the components from the all atom eigenvectorsthat corresponded to the C� coordinates and normalized the vectors thatwe obtained in this way. In �g. 3.7 the projections of these vectors onthe eigenvectors of the C� matrix are plotted. It is clear that the �rst8 extracted vectors correspond to the �rst 8 C� matrix eigenvectors. Itshould be mentioned that the length of these extracted vectors is approxi-mately 20% of the whole length of the corresponding all atom eigenvector,whereas the total number of C� atoms is about 10% of the total numberof atoms in the protein. This indicates that the essential internal motionof the protein mainly involves the backbone atoms. We also noted thatthe displacements along the �rst 5 eigenvectors produced a large motionnear the active site of the molecule. Fig. 3.8 shows a superposition of 10sequential projections of the C� motion onto the �rst eigenvector, eachseparated by 100 ps (compare with �g. 3.3). The catalytic site residuesGlu-35 and Asp-52 are rigid, but the entrance to the active site cleft,including residues involved in substrate binding (59,62,63,101,107) [41]shows extensive exibility. This motion, which also involves other loopsin the protein, possibly a�ects the association and dissociation of sub-


0 5 10 15 200.0

0.5

0.0

0.5

0.0

0.5

abs.

val

ue o

f pro

ject

ion

of e

xtra

cted

vec

tor

0.0

0.5

0.0

0.5

0 5 10 15 200.0

0.5

0.0

0.5

0.0

0.5

0.0

0.5

0.0

0.5

C-alpha eigenvector index

Figure 3.7: Absolute value of the projections of the (normalized) ex-tracted vectors coming from the �rst 10 eigenvectors of the all atom co-ordinate covariance matrix (see text) on the eigenvectors obtained fromthe C� covariance matrix (solvent simulation)

strates and products.

Fig. 3.9 shows the trajectory projected on four planes, each de�nedby two all atom matrix eigenvectors. In the planes of �g 3.9a and b (re-spectively, eigenvectors 1 and 2 and eigenvectors 2 and 3) the trajectoriesare con�ned within narrower ranges than those expected from indepen-dent motions, suggesting the presence of a coupled force �eld. In �g. 3.9d(eigenvectors 20 and 50) the trajectories �ll the expected ranges almostcompletely. This means that we are dealing with basically independentmotions. We analyzed the vacuum simulation and compared the motionin the essential space with that of the solvent simulation. The motionin the vacuum simulation appears to be largely restricted to the carboxyterminal strand; the motion near the active site is no longer present.


35

52 5962

63

101107 35

52 5962

63

101107

Figure 3.8: Superposition of 10 con�gurations obtained by projectingthe C� motion onto the �rst eigenvector. Con�gurations are separatedby 100 ps. Residues involved in the catalytic reaction (35 and 52) and inthe binding of the substrate (59,62,63,101, and 107) are indicated


-10.0 -5.0 0.0 5.0 10.0nanometer

-10.0

-5.0

0.0

5.0

10.0

nano

met

er

-10.0 -5.0 0.0 5.0 10.0-10.0

-5.0

0.0

5.0

10.0

nano

met

er

-2.0 -1.0 0.0 1.0 2.0nanometer

-2.0

-1.0

0.0

1.0

2.0

-10.0 -5.0 0.0 5.0 10.0-10.0

-5.0

0.0

5.0

10.0

Figure 3.9: Projection of the trajectory (solvent simulation) on the planesde�ned by two eigenvectors from the all atom coordinates covariance ma-trix. (a) Horizontal axis: displacement along �rst eigenvector. Verticalaxis: displacement along second eigenvector. (b) Horizontal axis: dis-placement along second eigenvector. Vertical axis: displacement alongthird eigenvector. (c) Horizontal axis: displacement along �rst eigenvec-tor. Vertical axis: displacement along 50th eigenvector. (d) Horizontalaxis: displacement along 20th eigenvector. Vertical axis: displacementalong 50th eigenvector. (note the di�erence in scale)

3.5. CONCLUSIONS 49

3.5 Conclusions

The analysis given in this article shows that the essential dynamics oflysozyme, and presumably of other globular proteins, can be described ina subspace of very small dimension (less then 1% of the original Carte-sian space) consisting of linear combinations of Cartesian degrees of free-dom de�ned in a molecule-�xed coordinate system. All other degrees offreedom can be considered as corresponding to irrelevant Gaussian uc-tuations, behaving like near-constraints. The essential subspace itselfis de�ned by the near-constraints, which are related to the mechanicalstructure of the molecule in a given conformation. We have strong ev-idence from inspection of a few proteins studied up to now (lysozyme,thermolysin, and a subtilisin analog) that these motions are related tothe functional behavior of the proteins such as opening and closing ofthe active site and hinge-bending motions between two domains enclos-ing the active site. The analysis of this behaviour will be the subjectof a subsequent study. A (major) conformational change to a di�erentfolded conformation may alter the characteristics of the essential sub-space, while unfolding will lead to an increase of its dimensionality. Thefact that active site motions are not present in the essential space of thevacuum simulation suggests strongly that vacuum simulations are notsuitable for the study of biologically relevant motions.

3.6 Acknowledgments

We thank Dr. A.E. Mark from ETH in Z�urich for o�ering a 900 ps tra-jectory of lysozyme in solvent. We are also grateful to Daan van Aaltenwho applied the method presented in this paper to a few other pro-teins. Finally, we gratefully acknowledge the Italian foundation "CenciBolognetti", Istituto Pasteur, for �nancially supporting Dr. A. Amadeiin this project.

university of groningen molecular dynamics simulations of

Documents