conformational dynamics of a crystalline protein from microsecond-scale molecular ... ·...

6
Conformational dynamics of a crystalline protein from microsecond-scale molecular dynamics simulations and diffuse X-ray scattering Michael E. Wall a,1 , Andrew H. Van Benschoten b , Nicholas K. Sauter c , Paul D. Adams c,d , James S. Fraser b , and Thomas C. Terwilliger e a Computer, Computational, and Statistical Sciences Division and e Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545; b Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158; c Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720; and d Department of Bioengineering, University of California, Berkeley, CA 94720 Edited by Peter B. Moore, Yale University, New Haven, CT, and approved November 4, 2014 (received for review September 1, 2014) X-ray diffraction from protein crystals includes both sharply peaked Bragg reflections and diffuse intensity between the peaks. The information in Bragg scattering is limited to what is available in the mean electron density. The diffuse scattering arises from correlations in the electron density variations and therefore contains information about collective motions in proteins. Pre- vious studies using molecular-dynamics (MD) simulations to model diffuse scattering have been hindered by insufficient sampling of the conformational ensemble. To overcome this issue, we have performed a 1.1-μs MD simulation of crystalline staphylococcal nuclease, providing 100-fold more sampling than previous studies. This simulation enables reproducible calculations of the diffuse intensity and predicts functionally important motions, including transitions among at least eight metastable states with different active-site geometries. The total diffuse intensity calculated using the MD model is highly correlated with the experimental data. In particular, there is excellent agreement for the isotropic compo- nent of the diffuse intensity, and substantial but weaker agree- ment for the anisotropic component. Decomposition of the MD model into protein and solvent components indicates that pro- teinsolvent interactions contribute substantially to the overall diffuse intensity. We conclude that diffuse scattering can be used to validate predictions from MD simulations and can provide in- formation to improve MD models of protein motions. diffuse scattering | protein crystallography | molecular-dynamics simulation | protein dynamics | staphylococcal nuclease P roteins explore many conformations while carrying out their functions in biological systems (13). X-ray crystallography is the dominant source of information about protein structure; however, crystal structure models usually consist of just a single major conformation and at most a small portion of the model as alternate conformations. Crystal structures therefore are missing many details about the underlying conformational ensemble (4). Proteins assembled in crystalline arrays, like proteins in solu- tion, exhibit rich conformational diversity (4) and often can perform their native functions (5). Many methods have emerged for using Bragg data to model conformational diversity in protein crystals (617). The development of these methods has been important as conformational diversity can lead to inaccuracies in protein structure models (9, 1820). A key limitation of using the Bragg data, however, is that different models of conformational diversity can yield the same mean electron density. Whereas the Bragg scattering only contains information about the mean electron density, diffuse scattering (diffraction result- ing in intensity between the Bragg peaks) is sensitive to spatial correlations in electron density variations (2128) and therefore contains information about the way that atomic positions vary together in protein crystals. Because models that yield the same mean electron density can yield different correlations in electron density variations, diffuse scattering provides a means to increase the accuracy of crystallography for determining protein confor- mational variations (29). Peter Moore (30) and Mark Wilson (31) have argued that diffuse scattering should be used to test models of conformational diversity in X-ray crystallography. Several pioneering studies used diffuse scattering to reveal insights into correlated motions in proteins (17, 30, 3249). Some of these studies used diffuse scattering to experimentally validate predictions of correlated motions from molecular-dynamics (MD) simulations (3537, 40, 4244). These studies revealed important insights but were limited by inadequate sampling of the conformational ensemble, leading to lack of convergence of the diffuse scattering calculations (35). Microsecond-scale simu- lations of staphylococcal nuclease were predicted to be adequate for convergence of diffuse scattering calculations (42). Modern simulation algorithms and computer hardware now enable mi- crosecond or longer MD simulations of protein crystals (50). Here, we present calculations of diffuse X-ray scattering using a 1.1-μs MD simulation of crystalline staphylococcal nuclease. The results demonstrate that we have overcome the past limi- tation of inadequate sampling. We chose staphylococcal nuclease because the experiments of Wall et al. (49) still represent the only complete, high-quality, 3D diffuse scattering data set from a protein crystal. The calculated diffuse intensity is very similar using two independent halves of the trajectory; the results therefore are reproducible and can be meaningfully compared with the experimental data. The MD simulation provides a rich Significance A major challenge of protein crystallography is to accurately describe protein structure variations. Whereas Bragg peaks only yield a picture in which the variations are superimposed, diffuse X-ray scattering (diffraction between the peaks) reports on the way different atoms move together and can be used to increase the accuracy of models. We have performed a molec- ular-dynamics (MD) simulation of crystalline staphylococcal nuclease, yielding a rich picture of functionally important motions. This simulation is 100-fold longer than previous studies and yields reproducible calculations of diffuse intensity, overcoming a major challenge in the field. Experimental data support the MD models and indicate that diffuse scattering can be used both to validate and to improve MD simulations. Author contributions: M.E.W. designed research; M.E.W. performed research; M.E.W., N.K.S., and T.C.T. contributed new reagents/analytic tools; M.E.W., A.H.V.B., and T.C.T. analyzed data; and M.E.W., A.H.V.B., N.K.S., P.D.A., J.S.F., and T.C.T. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. Freely available online through the PNAS open access option. 1 To whom correspondence should be addressed. Email: [email protected]. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1416744111/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1416744111 PNAS | December 16, 2014 | vol. 111 | no. 50 | 1788717892 BIOPHYSICS AND COMPUTATIONAL BIOLOGY Downloaded by guest on October 23, 2020

Upload: others

Post on 05-Aug-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Conformational dynamics of a crystalline protein from microsecond-scale molecular ... · Conformational dynamics of a crystalline protein from microsecond-scale molecular dynamics

Conformational dynamics of a crystalline protein frommicrosecond-scale molecular dynamics simulations anddiffuse X-ray scatteringMichael E. Walla,1, Andrew H. Van Benschotenb, Nicholas K. Sauterc, Paul D. Adamsc,d, James S. Fraserb,and Thomas C. Terwilligere

aComputer, Computational, and Statistical Sciences Division and eBioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545; bDepartmentof Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158; cPhysical Biosciences Division, Lawrence Berkeley NationalLaboratory, Berkeley, CA 94720; and dDepartment of Bioengineering, University of California, Berkeley, CA 94720

Edited by Peter B. Moore, Yale University, New Haven, CT, and approved November 4, 2014 (received for review September 1, 2014)

X-ray diffraction from protein crystals includes both sharplypeaked Bragg reflections and diffuse intensity between the peaks.The information in Bragg scattering is limited to what is availablein the mean electron density. The diffuse scattering arises fromcorrelations in the electron density variations and thereforecontains information about collective motions in proteins. Pre-vious studies using molecular-dynamics (MD) simulations to modeldiffuse scattering have been hindered by insufficient sampling ofthe conformational ensemble. To overcome this issue, we haveperformed a 1.1-μs MD simulation of crystalline staphylococcalnuclease, providing 100-fold more sampling than previous studies.This simulation enables reproducible calculations of the diffuseintensity and predicts functionally important motions, includingtransitions among at least eight metastable states with differentactive-site geometries. The total diffuse intensity calculated usingthe MD model is highly correlated with the experimental data. Inparticular, there is excellent agreement for the isotropic compo-nent of the diffuse intensity, and substantial but weaker agree-ment for the anisotropic component. Decomposition of the MDmodel into protein and solvent components indicates that pro-tein–solvent interactions contribute substantially to the overalldiffuse intensity. We conclude that diffuse scattering can be usedto validate predictions from MD simulations and can provide in-formation to improve MD models of protein motions.

diffuse scattering | protein crystallography |molecular-dynamics simulation | protein dynamics | staphylococcal nuclease

Proteins explore many conformations while carrying out theirfunctions in biological systems (1–3). X-ray crystallography

is the dominant source of information about protein structure;however, crystal structure models usually consist of just a singlemajor conformation and at most a small portion of the model asalternate conformations. Crystal structures therefore are missingmany details about the underlying conformational ensemble (4).Proteins assembled in crystalline arrays, like proteins in solu-

tion, exhibit rich conformational diversity (4) and often canperform their native functions (5). Many methods have emergedfor using Bragg data to model conformational diversity in proteincrystals (6–17). The development of these methods has beenimportant as conformational diversity can lead to inaccuracies inprotein structure models (9, 18–20). A key limitation of using theBragg data, however, is that different models of conformationaldiversity can yield the same mean electron density.Whereas the Bragg scattering only contains information about

the mean electron density, diffuse scattering (diffraction result-ing in intensity between the Bragg peaks) is sensitive to spatialcorrelations in electron density variations (21–28) and thereforecontains information about the way that atomic positions varytogether in protein crystals. Because models that yield the samemean electron density can yield different correlations in electrondensity variations, diffuse scattering provides a means to increase

the accuracy of crystallography for determining protein confor-mational variations (29). Peter Moore (30) and Mark Wilson(31) have argued that diffuse scattering should be used to testmodels of conformational diversity in X-ray crystallography.Several pioneering studies used diffuse scattering to reveal

insights into correlated motions in proteins (17, 30, 32–49). Someof these studies used diffuse scattering to experimentally validatepredictions of correlated motions from molecular-dynamics(MD) simulations (35–37, 40, 42–44). These studies revealedimportant insights but were limited by inadequate sampling ofthe conformational ensemble, leading to lack of convergence ofthe diffuse scattering calculations (35). Microsecond-scale simu-lations of staphylococcal nuclease were predicted to be adequatefor convergence of diffuse scattering calculations (42). Modernsimulation algorithms and computer hardware now enable mi-crosecond or longer MD simulations of protein crystals (50).Here, we present calculations of diffuse X-ray scattering using

a 1.1-μs MD simulation of crystalline staphylococcal nuclease.The results demonstrate that we have overcome the past limi-tation of inadequate sampling. We chose staphylococcal nucleasebecause the experiments of Wall et al. (49) still represent theonly complete, high-quality, 3D diffuse scattering data set froma protein crystal. The calculated diffuse intensity is very similarusing two independent halves of the trajectory; the resultstherefore are reproducible and can be meaningfully comparedwith the experimental data. The MD simulation provides a rich

Significance

A major challenge of protein crystallography is to accuratelydescribe protein structure variations. Whereas Bragg peaksonly yield a picture in which the variations are superimposed,diffuse X-ray scattering (diffraction between the peaks) reportson the way different atoms move together and can be used toincrease the accuracy of models. We have performed a molec-ular-dynamics (MD) simulation of crystalline staphylococcalnuclease, yielding a rich picture of functionally importantmotions. This simulation is 100-fold longer than previousstudies and yields reproducible calculations of diffuse intensity,overcoming a major challenge in the field. Experimental datasupport the MD models and indicate that diffuse scattering canbe used both to validate and to improve MD simulations.

Author contributions: M.E.W. designed research; M.E.W. performed research; M.E.W.,N.K.S., and T.C.T. contributed new reagents/analytic tools; M.E.W., A.H.V.B., and T.C.T.analyzed data; and M.E.W., A.H.V.B., N.K.S., P.D.A., J.S.F., and T.C.T. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.1To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1416744111/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1416744111 PNAS | December 16, 2014 | vol. 111 | no. 50 | 17887–17892

BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

Dow

nloa

ded

by g

uest

on

Oct

ober

23,

202

0

Page 2: Conformational dynamics of a crystalline protein from microsecond-scale molecular ... · Conformational dynamics of a crystalline protein from microsecond-scale molecular dynamics

picture of conformational diversity in the energy landscape ofa protein crystal, consisting of at least eight metastable states.Like previous MD studies of crystalline staphylococcal nuclease(42–44), the agreement of the simulation with the total experi-mental diffuse intensity is excellent, supporting the use of MDsimulations to model diffuse scattering data. Unlike previousMD studies, we separately compared the more finely structured,anisotropic component of the diffuse intensity with experimentaldata. The agreement is substantial but weaker than for the iso-tropic component, indicating there are inaccuracies in the MDmodels. Our results therefore point toward using diffuse scat-tering to improve MD models of protein motions.

ResultsFunctionally Important Motions of Crystalline Staphylococcal Nuclease.We performed a 1.1-μs MD simulation of a staphylococcal nucleaseunit cell (Fig. S1 and Methods), revealing a rich conformationallandscape. To analyze the trajectory, we performed principal com-ponent analysis (Methods) and gradually added sequential timepoints to a scatter plot in the space of the two dominant principalcomponents. The points clustered visually into ellipsoidal shapedregions (Fig. 1).We identified at least eight metastable states by noting when

sharp transitions occurred between the regions (Fig. 1, curvedblack line with tick marks). The time between the transitionsvaries considerably from 32 ns (Fig. 1, orange basin) to 540 ns(Fig. 1, green basin); notably, these times are longer than the 10-nsduration of the previous MD simulation of a staphylococcalnuclease unit cell (42) (Fig. 1, gray region). The cyan basin isvisited twice: the first time for 52 ns, and the second time for66 ns. The system spends nearly half the duration of the simu-lation in the green basin (540 ns); visualization in three dimen-sions revealed that this basin has a fine structure of substatesthat, compared with the separations in Fig. 1, lie close togetherin the space of the dominant principal components.To gain insight into the functional significance of the states in

Fig. 1, we created movies to visualize the motions of the twodominant principal components (Movies S1 and S2). Both com-ponents are dominated by internal motions rather than overallrotations and translations of the protein. The residues in the active

site (51) cluster into localized regions whose motions are highlycorrelated (Fig. 2): (A) Glu43, at the beginning of the omega loop(Fig. 2, red sticks); (B) Arg35, Tyr85, and Arg87, at the top of thebinding pocket (Fig. 2, Arg35 and Arg87 in orange sticks, andTyr85 in blue sticks); and (C) Tyr113 and Tyr115, at the bottom ofthe binding pocket (Fig. 2, cyan sticks). Both of the principalcomponents involve substantial motions of regions A and C withlittle motion in region B. Especially prominent are a clampingmotion of region C against region B, which would likely modulatebinding interactions (Fig. 2, blue double-headed arrow), and anopening of region A away from region B, modulating the envi-ronment of a water molecule putatively involved in the hydrolysisof the 5′-phosphodiester bond (Fig. 2, red double-headed arrow)(52). The motions vary for the four copies of the protein in theunit cell (Fig. 2, Inset): the region C motion is most pronouncedfor the pink chain; the region A motion is most pronounced forthe blue and yellow chains; and the region A and C motions aresmallest in the green chain. The states therefore correspond tostructures with different active site geometries, resulting in iden-tifiable functional consequences for binding and catalysis.

MD Model of Diffuse Scattering. Wall et al. (49) obtained 3D ex-perimental diffuse scattering data by averaging measurementsof the continuous intensity DoðsÞ at scattering vectors s in theneighborhood of each Bragg peak. This procedure yielded asingle diffuse intensity DoðhklÞ at integer reciprocal lattice pointshkl (SI Text); the diffuse lattice showed the expected P4/m Pat-terson symmetry, which was used to create an eightfold re-dundant expanded lattice with symmetry averaged observations(49). We similarly computed the diffuse intensity DmdðsÞ at eachreciprocal lattice point and expanded the lattice using the P4/mPatterson symmetry (Methods). To assess the reproducibility ofdiffuse intensity calculations, we compared DmdðhklÞ calculatedusing either the first or second half of the following subsectionsextracted from the full 1.1-μs trajectory: the first 10 ns; the first

Fig. 1. Scatter plot of structures extracted from the MD trajectory projectedon the first two principal components of the α-carbon position covariancematrix. The first component corresponds to the x axis, and the second cor-responds to the y axis. Gray regions correspond to the first 10 ns and last50 ns of the trajectory; colored regions correspond to metastable states (theyellow region overlaps the first gray region). The curved line indicates therough trajectory of the system; tick marks indicate approximate times atwhich state transitions occur.

Fig. 2. Active-site conformational dynamics in crystalline staphylococcalnuclease. The backbone is rendered using a ribbon, and the P41 unit cellpacking along the c axis is shown in the Inset (the screw axis translation isinto the page, with the green copy closest and the orange copy farthestaway). The residues are shown using sticks, proceeding counterclockwise:Glu43 (red), Arg35 (orange on the β-sheet), Arg87 (orange on the loop),Tyr85 (blue), Tyr115 (cyan), and Tyr113 (cyan). The rest of the protein isrendered as a gray cartoon. The inhibitor thymidine 3′,5′-bisphosphate isshown using spheres and the calcium ion using a yellow sphere. Arrows in-dicate the direction of motion in the two dominant principal components ofthe microsecond MD simulation. The loop containing Glu43 (red) moves inthe approximate direction indicated by the transparent red double-headedarrow. The loop containing Tyr113 and Tyr115 (cyan) moves in the approx-imate direction indicated by the transparent blue double-headed arrow. Theregion containing Tyr85 (blue) and Arg35 and Arg87 (orange) moves muchless by comparison. The image was created using PyMOL (66).

17888 | www.pnas.org/cgi/doi/10.1073/pnas.1416744111 Wall et al.

Dow

nloa

ded

by g

uest

on

Oct

ober

23,

202

0

Page 3: Conformational dynamics of a crystalline protein from microsecond-scale molecular ... · Conformational dynamics of a crystalline protein from microsecond-scale molecular dynamics

100 ns; and the last 1,000 ns. Correlation coefficients using all ofthe DmdðhklÞ lattice points (∼138,500 in total, of which ∼17,300are independent, due to symmetry) were calculated between theresults obtained using either the first or second half (Methods).Correlations of the total intensity, r12, were excellent for each ofthe three subsections of the trajectory: r12 = 0.994 for 10 ns, 0.995for 100 ns, and 0.998 for 1,000 ns.We performed a more detailed assessment of the reproducibility

by first decomposing the diffuse intensity into components that aredistributed either isotropically or anisotropically about the origin(Methods). We then calculated correlations for just the anisotropiccomponent, D′mdðhklÞ, which corresponds to the striking 3D fea-tures visible in the experimental data (see, e.g., figure 3 of ref. 49).The correlations for the anisotropic intensity, r′12, were lower thanfor the total intensity: r′12 = 0.646 for 10 ns, 0.654 for 100 ns, and0.832 for 1,000 ns. The first and second halves of the 1,000-nstrajectory sample a similar range in the dominant principal com-ponent (Fig. 1), which is consistent with the high correlation forthat trajectory. The difference between the values of r12 and r′12indicates that the total intensity correlation is dominated by thelarge isotropic component of the signal that varies gradually withresolution. The increase in the anisotropic intensity correlationwith simulation time indicates that diffuse scattering calculationsbecome more reproducible for longer simulations, as expected.The fact that r′12 = 0.832 for the 1,000-ns trajectory demonstratesthat sampling is not a limitation for our validation of MD modelsof diffuse X-ray scattering.Meinhold and Smith (42) analyzed diffuse scattering using an

approximation in which the variations in atomic positions arenormally distributed. In this case, the diffuse intensity in Eq. 1can be computed using the covariance matrix of atomic dis-placements hukuTk′i between all atom pairs ðk; k′Þ (equation 2 inref. 42). The reproducibility of diffuse scattering calculationsthen depends on the reproducibility of these matrix elementcalculations. Similar to Meinhold and Smith (42), we assessedthe reproducibility of the matrix elements by just examiningα-carbon coordinates. The covariance matrices were calculatedfrom either the early or late halves of the 1,000-ns trajectory(Methods), and were compared by calculating the Pearson cor-relation coefficient between all elements. The resulting value of0.517 is much lower than the values of r12 (0.9975) or r′12 (0.832)computed from the diffuse intensity, indicating that calculationsof the covariance matrix were not as reproducible as diffusescattering calculations. A possible explanation is that the diffuseintensity results from the statistically accumulated signal of manyatom pairs, whereas the covariance matrix has an element foreach individual atom pair.

Experimental Validation of the MD Model. We compared the MDmodel to the 64,335 experimental observations collected by Wallet al. (49). The observations were placed on a symmetry ex-panded diffuse intensity lattice containing 120,845 points(Methods). For each simulation, we multiplied all of the DmdðhklÞby a single scalar weight w and optimized the value of w tominimize the root-mean-squared deviation (RMSD) of the calcu-lated DmdðhklÞ with respect to the experimental DoðhklÞ values.Comparisons were performed by calculating correlation coefficientsusing the lattice points where both calculations and observationswere available (∼118,500, of which ∼14,800 are independent due tosymmetry). Correlations were computed for four trajectories, eitherwith or without using hydrogen atoms: 0–10, 0–100, 0–1,100, and100–1,100 ns (Table S1). The correlation roc of the total intensityranged from 0.90 to 0.94 for the all-atom calculations, and from 0.85to 0.92 for the heavy-atom calculations. The correlations for heavy-atom models were in each case lower than for all-atom models(difference of 0.02–0.05), indicating that adding hydrogens some-what improves the model of total intensity. The correlations in-crease as the trajectories increase in duration. Excellent agreement

was achieved for the 0- to 1,100-ns and 100- to 1,100-ns all-atomsimulations, where roc = 0.94.We also calculated Meinhold and Smith’s R-factor–like

agreement statistic (equation 3 in ref. 42) between DoðhklÞ andthe total DmdðhklÞ computed from the 100- to 1,100-ns simula-tion. This involved a minimization with respect to an overallweighting factor and baseline shifting of the DmdðhklÞ. We founda value of 0.029, which is much lower than the minimum value of0.081 found by Meinhold and Smith (table 2 in ref. 42). Theagreement factor therefore is substantially improved for ourmicrosecond simulation compared with their 10-ns simulation.The isotropically averaged diffuse intensity from the MD

model and experimental data were very similar (Fig. 3A), whichis consistent with the global correlation results. To better un-derstand the origin of the isotropic intensity, we decomposedDmdðhklÞ for the 1,000-ns trajectory into protein (Dmd;p), solvent(Dmd;s), and protein–solvent (Dmd;x) terms (Methods). (Note thatthe contribution of Dmd;x to the sum is negative.) Each compo-nent contributes about equally to the total signal (Fig. 3B).The protein and solvent curves are similar to those calculated

by Meinhold and Smith (42). The cross term, Dmd;x, starts high atsmall scattering vectors, dips below zero at about 0.3 Å−1, andsettles to zero at ∼0.35 Å−1. To gain further insight into thenature of the correlations implied by this behavior, we fit thecross term to a model of correlations that decay exponentiallyin real space over a length γ (Methods). A reasonable fit wasobtained using γ = 0:89± 0:08 Å (Fig. S2), suggesting that thecross term is dominated by short-range correlations between thefluctuations of the protein and solvent (Discussion).As for the model D′md, we calculated the anisotropic compo-

nent of the experimentally observed diffuse intensity, D′o; bysubtracting the isotropic component (Methods). For both theexperimental data and the MD model, at each resolution the SDof the anisotropic intensity is less than 10% of the isotropic in-tensity (Fig. 3A, error bars). The anisotropic component there-fore is weaker than the isotropic component. The anisotropicintensity due the individual protein, solvent, and protein–solventcross terms show differences in magnitude (Fig. 3B). At scat-tering vectors greater than 0.1 Å−1, the protein has the strongestanisotropic intensity (largest green error bars). The protein–solvent cross term is weaker (small cyan error bars), and thesolvent is weakest (very small magenta error bars). Below 0.1 Å−1,each component has substantial anisotropic intensity; however,different components nearly cancel, summing to a very smalltotal anisotropic intensity (blue bars).To assess the agreement of the anisotropic component of the

MD model (D′md) with experimental data (D′o), we visualized

Fig. 3. Analysis of isotropic diffuse intensity. (A) Comparison of isotropicdiffuse intensity for data (red) to the MD model (blue). (B) Decomposition ofthe total simulated isotropic intensity (blue) into contributions from protein(green), solvent (magenta), and the protein–solvent cross term (cyan). Thetotal intensity is equal to the protein term plus the solvent term minus thecross term (Methods). In both A and B, the SD of the anisotropic diffuseintensity is indicated using error bars.

Wall et al. PNAS | December 16, 2014 | vol. 111 | no. 50 | 17889

BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

Dow

nloa

ded

by g

uest

on

Oct

ober

23,

202

0

Page 4: Conformational dynamics of a crystalline protein from microsecond-scale molecular ... · Conformational dynamics of a crystalline protein from microsecond-scale molecular dynamics

level surfaces of D′o (Fig. 4A), D′md (Fig. 4B), and D′o −D′md(Fig. 4C). The comparison revealed places where the MD modelintensity was either higher (Fig. 4C, green) or lower (Fig. 4C,red) than the experimental intensity. As for the total intensity, toquantify the agreement we calculated the anisotropic correlationcoefficient r′oc (Methods) between D′md and D′o for each of thetrajectories, either with or without hydrogens (Table S1). Thecorrelation r′oc of the anisotropic intensity ranged from 0.35 to 0.43for the all-atom calculations, and from 0.34 to 0.40 for the heavy-atom calculations. The highest anisotropic correlation (r′oc = 0.43)is strongly positive, but is lower than the lowest total intensitycorrelation (roc = 0.85), indicating that the MD model more ac-curately describes the isotropic than the anisotropic diffuse in-tensity. The best all-atom correlation is for the 0- to 10-ns trajectory(r′oc = 0.43), and the worst is for the 0- to 100-ns trajectory (r′oc =0.35). The 0- to 1,000-ns and 0- to 1,100-ns trajectories yield in-termediate correlations (r′oc = 0.40).Correlations for heavy-atom models were lower than for all-

atom models by 0.01 or less (Table S1), indicating that addinghydrogens did not substantially improve the model of anisotropicintensity. The correlation for the protein component of the MDmodel (Dmd;p) was r′oc = 0.38 (not listed in Table S1), which isvery similar to that for the entire MD model. This supports thefinding that the protein is the dominant contributor to the an-isotropic intensity (Fig. 3A, error bars).We also calculated the anisotropic correlation within resolu-

tion shells, and found the highest value was 0.58 in the 0.27 Å−1

resolution shell. Visualization of the experimental and MDmodel

intensity in this shell shows arcs of diffuse intensity extendingbetween the two poles, and rich large-scale features near theequator (Fig. 4D). A liquid-like motions model showed similarpatterns (figure 5 in ref. 49), suggesting that they can be explainedlargely by the protein structure factor (equation 4 in ref. 49).

DiscussionThe present 1.1-μs simulation, which is 100-fold longer than theprevious 10-ns simulation of a staphylococcal nuclease unit cell(42), revealed eight metastable states with lifetimes that arelonger than the duration of the shorter simulation. Analysis ofthe motions revealed that the basins correspond to structureswith different active-site geometries, resulting in identifiablefunctional consequences for binding and catalysis.Diffuse scattering can independently validate the predictions

of MD simulations. Conversely, interpretation of the diffuse in-tensity in terms of a multistate conformational ensemble is nowpossible (Fig. 1). Although there is room for improvement inmodeling the isotropic intensity at resolutions exceeding 0.3 Å−1

(Fig. 3A), the agreement of the MD model with the total ex-perimental diffuse intensity is remarkable considering that nofree parameters were adjusted. Moreover, this agreement is ro-bust to sampling (Table S1) and to a model perturbation thatincludes a change of force field (SI Text).Compared with the total intensity, the MD model had a lower,

but still strongly positive, correlation with the anisotropic diffuseintensity. The lower correlation of the anisotropic intensityindicates there are inaccuracies in the MD models. Maximizingthe correlation of the anisotropic intensity is therefore a poten-tial strategy for increasing the accuracy of the MD models. Forexample, both the simulation of a single unit cell with periodicboundary conditions and the assumption of independent unitcells inherent in Eq. 1might limit the accuracy of the MD model.One way to test this possibility is to simulate larger sections ofthe crystal with several unit cells (50), as this will enable corre-lations between unit cells to be included in the model.Decomposition of the diffuse intensity into protein and solvent

components revealed that the protein component Dmd;p domi-nates the overall magnitude of the anisotropic intensity (Fig. 3A,error bars). In addition, the MD model of anisotropic intensityusing the protein alone (r′oc = 0.38) was almost as accurate asusing the protein and solvent (r′oc = 0.40). These results supportthe hypothesis that the anisotropic intensity reports largely onvariations in the protein structure (49).The decomposition also revealed that the protein–solvent

interactions contribute a strong negative component to the sum(Dmd;x is mostly positive in Fig. 3B and enters with a minus sign).Combined with the subatomic correlation length derived from thefit to Eq. 2, these results suggest that the cross term might bemainly due to negative correlations in the density fluctuationsthat occur in the region between the protein and solvent atoms.One possible explanation for this is that, when atoms move to-gether, the density shifts, resulting in a positive fluctuation (+Δρ)for the leading edge density of one atom (e.g., a protein atom)coupled a negative fluctuation (−Δρ) for the trailing edge densityof a neighboring atom (e.g., a solvent atom) (Fig. S2, Inset).Increasing the duration of the MD trajectory yielded more

reproducible calculations of diffuse intensity and better modelsof the total intensity. Calculations of the covariance matrix ofα-carbon displacements, however, were less reproducible thanthe calculations of diffuse intensity. In addition, only the cyanbasin in Fig. 1 is visited more than once. Thus, even though wehave achieved reproducible diffuse scattering calculations usingan MD model, the sampling of the conformational ensemble isstill incomplete. Millisecond all-atom simulations (53), advancedsampling methods such as parallel tempering (54), and acceler-ation schemes such as Markov state models (55) and parallel

Fig. 4. Comparison of anisotropic diffuse intensity between models andexperimental data. (A–C) Isosurfaces in the (A) experimental (D′o), (B) scaledMD model (D′md ), and (C) difference (D′o −D′md ) intensity maps. Positiveintensity is shown in green, and negative intensity in red. All isosurfacesincluding the difference map are displayed at an intensity level equal to theSD of D′o in the solvent ring. The values of D′md were multiplied by a uni-form scale factor to yield the same SD in the solvent ring as D′o. The x di-rection in each panel corresponds to the a* axis, varying from −0.5 Å−1 at theleft to 0.5 Å−1 at the right; the y direction corresponds to the b* axis, varyingfrom −0.5 Å−1 at the bottom to 0.5 Å−1 at the top. (D) Visualizations of theexperimental (Left) and MD model (Right) anisotropic diffuse intensity in the0.27-Å−1 resolution shell, for which the agreement is best. The images wereconstructed as in figure 3 of ref. 49, using the shimlt and seesh routines inLUNUS (63, 72). The y direction corresponds to the polar angle in the shell asmeasured from the c* axis, varying from 0 at the top to π at the bottom ofeach image. The x direction corresponds to the azimuthal angle as measuredfrom the a* axis in a right-handed sense, varying from −π at the left to π atthe right. Pixel values are displayed as the deviation from the mean ona linear gray scale, with −500 corresponding to black, and 500 correspondingto white.

17890 | www.pnas.org/cgi/doi/10.1073/pnas.1416744111 Wall et al.

Dow

nloa

ded

by g

uest

on

Oct

ober

23,

202

0

Page 5: Conformational dynamics of a crystalline protein from microsecond-scale molecular ... · Conformational dynamics of a crystalline protein from microsecond-scale molecular dynamics

replica MD (56, 57) can be used to pursue increased samplingand improved convergence of the calculations.Our comparisons point to opportunities for improving

models of protein motions using diffuse scattering. For exam-ple, if the accuracy of the model of anisotropic diffuse intensitycan be improved, it might become possible to use diffusescattering to improve MD force fields. Indeed, use of NMRdata for force field improvement provides an example of thepotential impact of diffuse scattering. Early comparisons ofMD to NMR data for staphylococcal nuclease revealed dif-ferences in the residue-by-residue backbone flexibility (58), andmore recent force fields now do a much better job of repro-ducing the NMR data (59–62). Another possibility is to usediffuse scattering for validating results obtained using schemesfor increasing sampling (54–57).There is also room for improvement in integration of experi-

mental diffuse scattering data from diffraction images. For ex-ample, our current methods aim to extract the large-scale features(due to short-range correlations within the unit cell) and to ignorethe small-scale, streaked features (due to long-range correlationsacross unit cells) (63). However, these methods can be improvedby accounting for better separation of the large-scale from small-scale features, or a finer sampling of all diffuse features (29). In-creased diffuse scattering data collection efforts and advances inX-ray detector technology will further this goal (29).

MethodsMD Simulations. The 48.5 Å × 48.5 Å × 63.5-Å P1 unit cell model was pre-pared from the Research Collaboratory for Structural Bioinformatics(RCSB) Protein Data Bank (PDB) (64) (www.rcsb.org) entry 1STN (65) usingthe UnitCell code in AmberTools (ambermd.org/#AmberTools). The RMSDof the atomic coordinates [calculated using PyMOL (66)] between 1STN andPDB entry 4WOR (49) (the crystal structure refined against the Bragg data) is0.70 Å, with backbone deviations concentrated in loops near the ligandbinding site (Fig. S3). Waters and counterions were added using genbox andgenion in GROMACS (67), resulting in a model with 8,904 protein atoms(four proteins arranged in a P41 configuration), 6,477 TIP3P water atoms,and 40 Cl− ions (Fig. S1). The OPLS-AA force field (68, 69) was selected. Aninitial model was prepared using energy minimization followed by 100-psequilibration in a constant-particle number, -volume, and -temperature(NVT) ensemble at 300 K, using harmonic restraints for the protein atomcoordinates. The modified Berendsen thermostat with a 0.1-ps time constantwas used for temperature equilibration.

The production 1.1-μs simulation was performed in GROMACS (67) ona 128 core 2.00 GHz Intel Xeon X6550 machine using a 2-fs time step witha constant-particle number, -pressure, and -temperature (NPT) ensemble at300 K and 1 bar. The Parrinello–Rahman barostat with a 2.0-ps time constantwas used for pressure equilibration. At a rate of 44.6-ns system time per dayof wall clock time, it took roughly 30 d to complete, including periods ofdown time. Snapshots were recorded every 2 ps in compressed (.xtc) format,yielding a total of 5.5 × 108 frames in a 31.1-GB file.

Upon the switch from the restrained NVT to the unrestrained NPT simu-lation, the unit cell dimensions began fluctuating (all scaled simultaneouslywith the volume fluctuations) and the mean values rapidly shrank by 1.3% oftheir initial values. The structure also drifted away from the crystal structure:the RMSD between the mean coordinates from the simulation and PDB entry4WOR was 1.8 Å (1.3 Å) for heavy atoms (α-carbons). These values are similarto those obtained by Meinhold and Smith (42) for their 10-ns simulation:they reported deviations of less than 0.8% in unit cell dimensions, anda RMSD between the mean simulation coordinates and PDB entry 2SNS (52)(an experimentally determined crystal structure available at the time) of1.7 Å (1.3 Å) for heavy atoms (α-carbons).

MD Model Diffuse Scattering Calculations and Reproducibility. To calculate thediffuse intensity for a given section of the trajectory, ensembles of atomiccoordinates were extracted from the full trajectory in 5-ns chunks. Thisprocedure yielded protein and solvent coordinates of the P1 unit cell ina series of separate files (SI Text).

Each of the files was then processed to calculate the diffuse intensity. Thestructure factor, fnðhklÞ, for each unit cell structure n was calculated to 1.6-Åresolution at Miller indices hkl using the iotbx package in the computationalcrystallography toolbox (CCTBX) (70). The average structure factor, ÆfnðhklÞæn,

and the average squared structure factor, ÆjfnðhklÞj2æn, were calculated usinga modified version of the Phenix (71) Python script get_struct_fact_from_md.py, which we called get_diffuse_from_md.py. Averages for longer sectionsof the trajectory were accumulated from averages of the 5-ns chunks. Theintensity of the diffuse scattering for the first ½Dmd1ðhklÞ� and second½Dmd2ðhklÞ� parts of the trajectory were calculated using the following equation:

DmdðhklÞ= Æ��fnðhklÞ

��2æn −��ÆfnðhklÞæn

��2: [1]

Eq. 1 appears in the classic text of Guinier (24) and assumes independentunit cell fluctuations. It states that the diffuse intensity is equal to thevariance of the unit cell structure factor, and therefore contains in-formation that is distinct from the Bragg peak intensities, which are de-termined by the square of the mean structure factor (second term of Eq. 1).Eq. 1 is an established method for calculating diffuse scattering from pro-tein MD simulations (35, 42–44).

To decompose the diffuse intensity into isotropic and anisotropic com-ponents, reciprocal space was subdivided into concentric spherical shells, eachwith a thickness equal to the reciprocal unit cell diagonal (Δs = 0.0336 Å−1).The isotropic intensity DmdðsnÞ was calculated as the mean intensity atscattering vector sn at the midpoint of each shell n. The anisotropic intensityD′mdðhklÞ was then calculated at each lattice point hkl by subtracting theisotropic intensity DmdðshklÞ from the original signal DmdðhklÞ. The value ofDmdðshklÞ at scattering vector shkl in the range ðsn,sn+1Þ was obtained bylinear interpolation of DmdðsnÞ. The same method was used to obtain iso-tropic ½DoðsnÞ� and anisotropic ½D′oðhklÞ� components of the experimentallyobserved diffuse intensity.

Because the experimental diffuse intensity shows symmetry consistentwith the P41 symmetry of the unit cell (49), we enforced the P4/m Pattersonsymmetry (corresponding to the P41 unit cell symmetry) by replacing eachDmdðhklÞ value with the average over all symmetry equivalent hkl positionsin the map. The resulting symmetry-expanded map had ∼17,300 in-dependent values on ∼138,500 lattice points with eightfold redundancy. Thevalues were very similar to the original map (Pearson correlation coefficientrsym = 0.996 for total intensity and r′sym = 0:789 for anisotropic intensity).

To assess the reproducibility of diffuse scattering calculations in the MDmodel, we divided the extracted trajectory into first and second halves ofequal duration. We then calculated the simulated diffuse scattering from thefirst and second parts independently, and compared the results quantita-tively. A global comparison of DmdðhklÞ and D′mdðhklÞ was made using thePearson correlation coefficients r12 and r′12, respectively.

Covariance Matrices and Principal Components. To calculate covariance ma-trices and to perform principal component analysis, the trajectory was adjustedto remove discontinuous jumps of atomic positions to symmetry-relatedpositions during the course of the simulation (trjconv –pbc nojump). Next,the coordinates were adjusted to remove translational drift (trjconv –fittranslation) and to preserve the covalent bonding structure of the protein(trjconv –pbc mol). Finally the α-carbon coordinates were selected and used tocalculate and diagonalize the covariance matrix (g_covar). Comparisons ofcovariance matrices and projections of trajectories onto principal componentswere performed using the tool g_anaeig. Visualization of the energy land-scape was performed by displaying the projected coordinates in a 2Dscatter plot.

Decomposition into Protein and Solvent Components. We used Eq. 1 and theequation Dmd =Dmd,p +Dmd,s −Dmd,x to decompose Dmd into proteinðDmd,p = Æjfprot j2æ− jÆfprot æj2Þ, solvent ðDmd,s = Æjfsolv j2æ− jÆfsolv æj2Þ and pro-tein– solvent ðDmd,x =Dmd,p +Dmd,s −DmdÞ terms, where fprot and fsolvare the structure factors computed from just the protein or solvent component,respectively. Note that positive values of Dmd,x contribute negatively to Dmd .

We modeled Dmd,x using the following function (34, 49):

FT½e−γr �= 8πγ3�1+ ð2πsγÞ2�2

, [2]

where s is the scattering vector, γ is the correlation length of atomic dis-placements, and FT½e−γr � indicates the 3D Fourier transform of the real-spacefunction e−γr , which decays exponentially with distance r. The fitting of Dmd,x

to Eq. 2 was performed using the nonlinear-least-squares fitting feature ofgnuplot (www.gnuplot.info).

ACKNOWLEDGMENTS. This work was supported by the US Department ofEnergy under Contract DE-AC52-06NA25396 through the Laboratory-Directed Research and Development Program at Los Alamos National

Wall et al. PNAS | December 16, 2014 | vol. 111 | no. 50 | 17891

BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

Dow

nloa

ded

by g

uest

on

Oct

ober

23,

202

0

Page 6: Conformational dynamics of a crystalline protein from microsecond-scale molecular ... · Conformational dynamics of a crystalline protein from microsecond-scale molecular dynamics

Laboratory. N.K.S. is supported by NIH Grant GM095887. P.D.A. and T.C.T.are supported by NIH Grant 1P01GM063210. J.S.F. is a Searle Scholar and

a Pew Scholar, and is supported by NIH Grants OD009180 and GM110580,and National Science Foundation Grant STC-1231306.

1. Austin RH, et al. (1973) Dynamics of carbon monoxide binding by heme proteins.Science 181(4099):541–543.

2. Weber G (1972) Ligand binding and internal equilibria in proteins. Biochemistry 11(5):864–878.

3. Frauenfelder H, Parak F, Young RD (1988) Conformational substates in proteins. AnnuRev Biophys Biophys Chem 17:451–479.

4. Frauenfelder H, Petsko GA, Tsernoglou D (1979) Temperature-dependent X-ray dif-fraction as a probe of protein structural dynamics. Nature 280(5723):558–563.

5. Doscher MS, Richards FM (1963) The activity of an enzyme in the crystalline state:Ribonuclease S. J Biol Chem 238(7):2393–2398.

6. Burnley BT, Afonine PV, Adams PD, Gros P (2012) Modelling dynamics in proteincrystal structures by ensemble refinement. Elife 1:e00311.

7. Chaudhry C, Horwich AL, Brunger AT, Adams PD (2004) Exploring the structural dy-namics of the E. coli chaperonin GroEL using translation-libration-screw crystallo-graphic refinement of intermediate states. J Mol Biol 342(1):229–245.

8. Chen Z, Chapman MS (2001) Conformational disorder of proteins assessed by real-space molecular dynamics refinement. Biophys J 80(3):1466–1472.

9. DePristo MA, de Bakker PI, Blundell TL (2004) Heterogeneity and inaccuracy in proteinstructures solved by X-ray crystallography. Structure 12(5):831–838.

10. Fraser JS, et al. (2009) Hidden alternative structures of proline isomerase essential forcatalysis. Nature 462(7273):669–673.

11. Fraser JS, et al. (2011) Accessing protein conformational ensembles using room-temperature X-ray crystallography. Proc Natl Acad Sci USA 108(39):16247–16252.

12. Kuriyan J, et al. (1991) Exploration of disorder in protein structures by X-ray re-strained molecular dynamics. Proteins 10(4):340–358.

13. Kuriyan J, Weis WI (1991) Rigid protein motion as a model for crystallographic tem-perature factors. Proc Natl Acad Sci USA 88(7):2773–2777.

14. Pellegrini M, Grønbech-Jensen N, Kelly JA, Pfluegl GM, Yeates TO (1997) Highlyconstrained multiple-copy refinement of protein crystal structures. Proteins 29(4):426–432.

15. Sternberg MJ, Grace DE, Phillips DC (1979) Dynamic information from protein crys-tallography. An analysis of temperature factors from refinement of the hen egg-white lysozyme structure. J Mol Biol 130(3):231–252.

16. van den Bedem H, Bhabha G, Yang K, Wright PE, Fraser JS (2013) Automated iden-tification of functional dynamic contact networks from X-ray crystallography. NatMethods 10(9):896–902.

17. Wall ME, Clarage JB, Phillips GN (1997) Motions of calmodulin characterized usingboth Bragg and diffuse X-ray scattering. Structure 5(12):1599–1612.

18. García AE, Krumhansl JA, Frauenfelder H (1997) Variations on a theme by Debye andWaller: From simple crystals to proteins. Proteins 29(2):153–160.

19. Furnham N, Blundell TL, DePristo MA, Terwilliger TC (2006) Is one solution goodenough? Nat Struct Mol Biol 13(3):184–185, discussion 185.

20. Kuzmanic A, Pannu NS, Zagrovic B (2014) X-ray refinement significantly under-estimates the level of microscopic heterogeneity in biomolecular crystals. Nat Com-mun 5:3220.

21. Zachariasen W (1945) Theory of X-Ray Diffraction in Crystals (Wiley, New York).22. James R (1948) The Optical Principles of the Diffraction of X-Rays (Bell, London).23. Wooster WA (1962) Diffuse X-Ray Reflections from Crystals (Oxford Univ Press,

Oxford).24. Guinier A (1963) X-ray Diffraction in Crystals, Imperfect Crystals, and Amorphous

Bodies (W. H. Freeman and Company, San Francisco).25. Amorós JL, Amorós M (1968) Molecular Crystals; Their Transforms and Diffuse Scat-

tering (Wiley, New York).26. Warren BE (1969) X-Ray Diffraction (Addison-Wesley, Reading, MA).27. Willis BTM, Pryor AW (1975) Thermal Vibrations in Crystallography (Cambridge Univ

Press, Cambridge, UK).28. Welberry TR (2004) Diffuse X-Ray Scattering and Models of Disorder (Oxford Univ

Press, Oxford).29. Wall ME, Adams PD, Fraser JS, Sauter NK (2014) Diffuse X-ray scattering to model

protein motions. Structure 22(2):182–184.30. Moore PB (2009) On the relationship between diffraction patterns and motions in

macromolecular crystals. Structure 17(10):1307–1315.31. Wilson MA (2013) Visualizing networks of mobility in proteins. Nat Methods 10(9):

835–837.32. Caspar DL, Clarage J, Salunke DM, Clarage M (1988) Liquid-like movements in crys-

talline insulin. Nature 332(6165):659–662.33. Chacko S, Phillips GN, Jr (1992) Diffuse x-ray scattering from tropomyosin crystals.

Biophys J 61(5):1256–1266.34. Clarage JB, Clarage MS, Phillips WC, Sweet RM, Caspar DL (1992) Correlations of

atomic movements in lysozyme crystals. Proteins 12(2):145–157.35. Clarage JB, Romo T, Andrews BK, Pettitt BM, Phillips GN, Jr (1995) A sampling

problem in molecular dynamics simulations of macromolecules. Proc Natl Acad SciUSA 92(8):3288–3292.

36. Doucet J, Benoit JP (1987) Molecular dynamics studied by analysis of the X-ray diffusescattering from lysozyme crystals. Nature 325(6105):643–646.

37. Faure P, et al. (1994) Correlated intramolecular motions and diffuse X-ray scatteringin lysozyme. Nat Struct Biol 1(2):124–128.

38. Glover ID, Harris GW, Helliwell JR, Moss DS (1991) The variety of X-ray diffuse-scat-tering frommacromolecular crystals and its respective components. Acta Crystallogr BStruct Sci 47(Pt 6):960–968.

39. Helliwell JR, Glover ID, Jones A, Pantos E, Moss DS (1986) Protein dynamics—use ofcomputer-graphics and protein crystal diffuse-scattering recorded with synchrotronX-radiation. Biochem Soc Trans 14(3):653–655.

40. Héry S, Genest D, Smith JC (1998) X-ray diffuse scattering and rigid-body motion incrystalline lysozyme probed by molecular dynamics simulation. J Mol Biol 279(1):303–319.

41. Kolatkar AR, Clarage JB, Phillips GN, Jr (1994) Analysis of diffuse scattering from yeastinitiator tRNA crystals. Acta Crystallogr D Biol Crystallogr 50(Pt 2):210–218.

42. Meinhold L, Smith JC (2005) Fluctuations and correlations in crystalline protein dy-namics: A simulation analysis of staphylococcal nuclease. Biophys J 88(4):2554–2563.

43. Meinhold L, Smith JC (2005) Correlated dynamics determining x-ray diffuse scatteringfrom a crystalline protein revealed by molecular dynamics simulation. Phys Rev Lett95(21):218103.

44. Meinhold L, Smith JC (2007) Protein dynamics from X-ray crystallography: Anisotropic,global motion in diffuse scattering patterns. Proteins 66(4):941–953.

45. Meinhold L, Merzel F, Smith JC (2007) Lattice dynamics of a protein crystal. Phys RevLett 99(13):138101.

46. Mizuguchi K, Kidera A, G�o N (1994) Collective motions in proteins investigated byX-ray diffuse scattering. Proteins 18(1):34–48.

47. Phillips GN, Jr, Fillers JP, Cohen C (1980) Motions of tropomyosin. Crystal as metaphor.Biophys J 32(1):485–502.

48. Riccardi D, Cui Q, Phillips GN, Jr (2010) Evaluating elastic network models of crystallinebiological molecules with temperature factors, correlated motions, and diffuse x-rayscattering. Biophys J 99(8):2616–2625.

49. Wall ME, Ealick SE, Gruner SM (1997) Three-dimensional diffuse x-ray scattering fromcrystals of Staphylococcal nuclease. Proc Natl Acad Sci USA 94(12):6180–6184.

50. Janowski PA, Cerutti DS, Holton J, Case DA (2013) Peptide crystal simulations revealhidden dynamics. J Am Chem Soc 135(21):7938–7948.

51. Grissom CB, Markley JL (1989) Staphylococcal nuclease active-site amino acids: pHdependence of tyrosines and arginines by 13C NMR and correlation with kineticstudies. Biochemistry 28(5):2116–2124.

52. Cotton FA, Hazen EE, Jr, Legg MJ (1979) Staphylococcal nuclease: Proposed mecha-nism of action based on structure of enzyme-thymidine 3′,5′-bisphosphate-calciumion complex at 1.5-Å resolution. Proc Natl Acad Sci USA 76(6):2551–2555.

53. Dror RO, Dirks RM, Grossman JP, Xu H, Shaw DE (2012) Biomolecular simulation:A computational microscope for molecular biology. Annu Rev Biophys 41:429–452.

54. Earl DJ, Deem MW (2005) Parallel tempering: Theory, applications, and new per-spectives. Phys Chem Chem Phys 7(23):3910–3916.

55. Pande VS, Beauchamp K, Bowman GR (2010) Everything you wanted to know aboutMarkov state models but were afraid to ask. Methods 52(1):99–105.

56. Martínez E, Uberuaga BP, Voter AF (2014) Sublattice parallel replica dynamics. PhysRev E Stat Nonlin Soft Matter Phys 89(6):063308.

57. Voter AF (1998) Parallel replica method for dynamics of infrequent events. Phys Rev BCondens Matter 57(22):13985–13988.

58. Chatfield DC, Szabo A, Brooks BR (1998) Molecular dynamics of staphylococcal nu-clease: Comparison of simulation with 15N and 13C NMR relaxation data. J Am ChemSoc 120(21):5301–5311.

59. Showalter SA, Brüschweiler R (2007) Validation of molecular dynamics simulationsof biomolecules using NMR spin relaxation as benchmarks: Application to theAMBER99SB force field. J Chem Theory Comput 3(3):961–975.

60. Lindorff-Larsen K, et al. (2010) Improved side-chain torsion potentials for the Amberff99SB protein force field. Proteins 78(8):1950–1958.

61. Lindorff-Larsen K, et al. (2012) Systematic validation of protein force fields againstexperimental data. PLoS One 7(2):e32131.

62. Beauchamp KA, Lin YS, Das R, Pande VS (2012) Are protein force fields getting better?A systematic benchmark on 524 diverse NMR measurements. J Chem Theory Comput8(4):1409–1414.

63. Wall ME (2009) Methods and software for diffuse X-ray scattering from proteincrystals. Methods Mol Biol 544:269–279.

64. Berman HM, et al. (2000) The Protein Data Bank. Nucleic Acids Res 28(1):235–242.65. Hynes TR, Fox RO (1991) The crystal structure of staphylococcal nuclease refined at

1.7 Å resolution. Proteins 10(2):92–105.66. DeLano WL, The PyMOL Molecular Graphics System (Schrödinger, LLC, New York),

Version 1.7.1.1.67. Berendsen HJC, van der Spoel D, van Drunen R (1995) GROMACS: A message-passing

parallel modecular dynamics implementation. Comput Phys Commun 91(1-3):43–56.68. Jorgensen WL, Maxwell DS, Tirado-Rives J (1996) Development and testing of the

OPLS all-atom force field on conformational energetics and properties of organicliquids. J Am Chem Soc 118(45):11225–11236.

69. Kaminski GA, Friesner RA, Tirado-Rives J, Jorgensen WL (2001) Evaluation and rep-arametrization of the OPLS-AA force field for proteins via comparison with accuratequantum chemical calculations on peptides. J Phys Chem B 105(28):6474–6487.

70. Grosse-Kunstleve RW, Sauter NK, Moriarty NW, Adams PD (2002) The ComputationalCrystallography Toolbox: Crystallographic algorithms in a reusable software frame-work. J Appl Cryst 35(Pt 1):126–136.

71. Adams PD, et al. (2010) PHENIX: A comprehensive Python-based system for macro-molecular structure solution. Acta Crystallogr D Biol Crystallogr 66(Pt 2):213–221.

72. Wall ME (1996) Diffuse features in X-ray diffraction from protein crystals. PhD thesis(Princeton University, Princeton).

17892 | www.pnas.org/cgi/doi/10.1073/pnas.1416744111 Wall et al.

Dow

nloa

ded

by g

uest

on

Oct

ober

23,

202

0