Coordinates and Pathways in MM and QM/MM modeling
Haiyan Liu
School of Life Sciences, University of Science and Technology of China
In MM and QM/MM modeling of biomolecules, we often aim to understand the mechanisms of processes, many of which are too slow to be investigated by direct simulation.
Examples
To study protein functions:
Possible chemical/conformational (sub)states? Mechanisms of transitions between them?
To study protein/peptide folding:
Any preferred “pathways” or “order of events”? Roles of topologies and sequences?
Two (?) basic causes for macroscopic slowness
• Need to overcome major enthalpic barriers (e.g., chemical reactions…)
• Need to “zoom” into a very limited region of the conformational space (e.g., protein folding, binding…)
Among major obstacles in simulations
• Sampling (in)efficiency
[Figure: schematic of a trajectory over time moving between an A state and a B state, with the waiting time and the transition time marked]
Two basic types of approaches
A. Connecting known terminal states
A1 “Forced” barrier crossing: umbrella sampling, targeted or steered MD
Drawbacks: projecting a many-dimensional system onto a few pre-assumed reaction coordinates
A projected representation of the many-dimensional problem
[Figure: schematic with axes labeled reaction coordinates (Rc) and environmental degrees of freedom]
Problems associated with improper projection:
• Restrained optimization: discontinuous environment
• Potential of mean force along Rc: sampling minima but not transition states
A2 Chain-of-states or path optimization methods
Discrete representation of pathways (a pathway is represented by a chain of replicas)
“enforced” continuity of the pathway
A parametric representation of the many-dimensional problem
B. Introducing more frequent transitions between states
Accelerate minimum-escaping (elevated temperature simulations, conformational flooding or local elevation, parallel replica simulations, potential energy function deformation)
The key is to avoid over-expanding the accessible conformational space.
Accelerated sampling approaches
• Potential energy-based vs. kinetic energy-based
• Equilibrium vs. non-equilibrium sampling
• Degree of freedom (DOF)-specific vs. DOF-nonspecific; DOF-specific methods act on delocalized (collective) DOFs or local DOFs
Coordinates (or order parameters) are essential, provided that we have a good enough energy model…
• “Forced” transitions and free energy surfaces: which coordinates to project onto?
• Chain-of-states method: enforcing continuity on which coordinates?
• Accelerated sampling: which coordinates to apply the bias to?
Examples
• Local elevation
– Potential energy-based, non-equilibrium, DOF-specific, local DOFs
• Conformational flooding
– Potential energy-based, non-equilibrium, DOF-specific, delocalized DOFs
• Temperature REMD
– Kinetic energy-based, equilibrium, DOF-nonspecific
• Amplified collective motion (ACM) model
– Kinetic energy-based, non-equilibrium, DOF-specific, delocalized DOFs…
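Temperature REMD is classified above as an equilibrium method. The reason is the Metropolis test applied to each attempted swap between neighboring temperatures, which can be sketched as follows. This is a minimal illustration, not code from the talk: the function names, the energy units (kJ/mol) and the example temperatures are assumptions.

```python
import math
import random

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K); the unit choice is an assumption


def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis probability of swapping the configurations of two replicas.

    e_i, e_j: potential energies of the configurations currently held at
    temperatures t_i and t_j.
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    # Accept with probability min(1, exp(delta)).
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(e_i, e_j, t_i, t_j, rng=random):
    """Return True if the swap is accepted (hypothetical helper)."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)
```

A swap that hands the lower-energy configuration to the colder replica is always accepted; the reverse swap is accepted only with a Boltzmann-weighted probability, which is what keeps each temperature sampling its own canonical ensemble.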
Our work in recent years
Amplified collective motion MD simulation (B)
Obtaining minimum energy paths in QM/MM modeling of enzymatic reactions with a modified nudged elastic band method (A2)
Coarsely guided sampling of folding trajectories of a small protein domain in implicit solvent (A1)
Hamiltonian replica exchange simulation with free-energy-surface-derived umbrella potentials (B)
Accelerating conformational search by amplifying collective motions
Collective coordinates have long been used in the analysis of protein dynamics:
Normal mode analysis
Principal component (or essential dynamics) analysis of conformational sets
Coarse-grained elastic network models
Several important observations from such studies:
Protein motions (e.g. atomic positional fluctuations) are dominated by a very small number of slow modes.
These slow modes often correspond to functional motions.
The low frequency space is insensitive to details of models
Zhang et al., Biophys. J., 2003, 84, 3583; He et al., J. Chem. Phys., 2003, 119, 4005.
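Coarse-grained elastic network potential and its second derivatives at the reference conformation:
$$U = \frac{1}{2}\sum_{i<j} k_{ij}\left(r_{ij}-r_{ij}^{0}\right)^{2}$$
$$\frac{\partial^{2}U}{\partial \mathbf{r}_i\,\partial \mathbf{r}_j} = -k_{ij}\,\frac{(\mathbf{r}_i^{0}-\mathbf{r}_j^{0})(\mathbf{r}_i^{0}-\mathbf{r}_j^{0})^{T}}{(r_{ij}^{0})^{2}},\quad i\neq j;\qquad \frac{\partial^{2}U}{\partial \mathbf{r}_i\,\partial \mathbf{r}_i} = \sum_{j\neq i} k_{ij}\,\frac{(\mathbf{r}_i^{0}-\mathbf{r}_j^{0})(\mathbf{r}_i^{0}-\mathbf{r}_j^{0})^{T}}{(r_{ij}^{0})^{2}}$$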
Derive low-frequency collective modes using the coarse-grained elastic network model:
• No need for an exact minimum; only a single conformation is used
• Low-frequency modes can be updated on the fly in a simulation
• Correctly captures the low-frequency modes along the “valley” on the energy surface (for compact structures)
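As an illustration of how low-frequency modes can be derived from a single conformation, the sketch below builds the Hessian of a harmonic elastic network and keeps its slowest non-trivial eigenvectors. It is a minimal sketch under stated assumptions: one site per residue, a uniform force constant `k`, and a distance `cutoff` in nm are illustrative choices, and the function names are hypothetical rather than taken from the cited work.

```python
import numpy as np


def enm_hessian(coords, cutoff=1.0, k=1.0):
    """3N x 3N Hessian of a harmonic elastic network around `coords`.

    coords: (N, 3) array with one site per residue (assumed coarse-graining).
    Pairs farther apart than `cutoff` are not connected.
    """
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[i] - coords[j]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue
            block = k * np.outer(d, d) / r2
            # Off-diagonal blocks carry -block, diagonal blocks accumulate +block.
            hess[3 * i:3 * i + 3, 3 * j:3 * j + 3] -= block
            hess[3 * j:3 * j + 3, 3 * i:3 * i + 3] -= block
            hess[3 * i:3 * i + 3, 3 * i:3 * i + 3] += block
            hess[3 * j:3 * j + 3, 3 * j:3 * j + 3] += block
    return hess


def slowest_modes(coords, n_modes=3, cutoff=1.0, k=1.0):
    """Eigenvalues and eigenvectors of the slowest internal modes."""
    evals, evecs = np.linalg.eigh(enm_hessian(coords, cutoff, k))
    # The first six eigenvectors are rigid-body translations and rotations
    # (this assumes a connected, non-collinear network).
    return evals[6:6 + n_modes], evecs[:, 6:6 + n_modes]
```

Because only the current coordinates enter, calling `slowest_modes` again on a later frame is all that "updating on the fly" requires in this sketch.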
Advantages
Sampling in conformational space extended along “valleys” of the energy landscape. No “melting” of local structures.
Lower frequency subspace updated on the fly.
No deformation of potential energy surface.
No pre-definition of “path” or “reaction coordinates”.
Drawbacks
• Functionally important motions may not correspond to the slowest few modes
• Does not correspond to any equilibrium ensemble; difficult to make quantitative
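The kinetic-energy-based amplification can be pictured as splitting the velocities into a component inside the collective subspace and a remainder, then holding the two at different temperatures. The sketch below is a simplified illustration, not the published algorithm: plain velocity rescaling stands in for whatever thermostat coupling is actually used, the modes are assumed orthonormal in mass-weighted coordinates, no degrees of freedom are subtracted for overall translation or rotation, and the default temperatures (800 K and 300 K) simply mirror the T4 lysozyme test mentioned in this talk.

```python
import numpy as np

K_B = 0.0083144626  # kJ/(mol K); consistent with amu, nm and ps units


def amplify_collective_velocities(vel, masses, modes,
                                  t_collective=800.0, t_rest=300.0):
    """Rescale velocities so the collective subspace is hotter than the rest.

    vel: (N, 3) velocities in nm/ps; masses: (N,) in amu;
    modes: (3N, m) matrix with orthonormal columns (assumed mass-weighted).
    """
    sqrt_m = np.sqrt(np.repeat(masses, 3))
    u = vel.reshape(-1) * sqrt_m       # mass-weighted velocities
    u_coll = modes @ (modes.T @ u)     # projection onto the collective subspace
    u_rest = u - u_coll                # orthogonal remainder

    def rescale(part, n_dof, t_target):
        # Kinetic energy is |part|^2 / 2, so T = |part|^2 / (n_dof * k_B).
        t_now = part @ part / (n_dof * K_B)
        return part * np.sqrt(t_target / t_now) if t_now > 0 else part

    n_coll = modes.shape[1]
    n_rest = u.size - n_coll
    u_new = rescale(u_coll, n_coll, t_collective) + rescale(u_rest, n_rest, t_rest)
    return (u_new / sqrt_m).reshape(vel.shape)
```

Since only kinetic energy along a few directions is raised, the potential energy surface is left untouched, which is consistent with the advantages listed above; the price is that the resulting ensemble is not an equilibrium one.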
Test systems
Inter-domain motions of T4 lysozyme in explicit solvent.
Folding of an S-peptide analog (in implicit solvent described by a generalized Born model)
[Figure: Bacteriophage T4 lysozyme X-ray structures. One plot of RMSD (nm); one plot of env 2 (nm) against env 1 (nm), labeled env 1 (0.40 nm) and env 2 (0.13 nm).]
First three modes of the coarse grained model: 80% of the variations
Atomic positional RMS fluctuations in MD (300 K, dashed line) and ACM-MD (three slowest modes: 800 K; other modes: 300 K)
ACM-MD produces larger fluctuations
Zhang et al., Biophys. J., 2003, 84, 3583
Projection onto the two largest principal components of the crystal structures (dots), the MD trajectory (red), and the ACM-MD trajectory (blue).
ACM-MD sampled larger variations along the two PCA directions.
Zhang et al., Biophys. J., 2003, 84, 3583
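A projection of this kind can be sketched in a few lines: principal axes are extracted from a set of reference structures, and trajectory frames are then projected onto them. The sketch assumes that all conformations have already been superimposed on a common reference and flattened to vectors of length 3N; the function names are hypothetical.

```python
import numpy as np


def principal_axes(structures, n_components=2):
    """Mean structure and leading principal axes of a conformational set.

    structures: (M, 3N) array of superimposed conformations.
    Returns (mean, axes) with axes of shape (n_components, 3N).
    """
    mean = structures.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(structures - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(trajectory, mean, axes):
    """Project (T, 3N) trajectory frames onto the principal axes."""
    return (trajectory - mean) @ axes.T
```

Projecting an MD and an ACM-MD trajectory with the same `mean` and `axes` puts both on one plot, so the extent of sampling along each direction can be compared directly.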
[Figure: N-terminal and C-terminal domains; RMSD from the native structure and number of residues in secondary structures. Solid: MD. Dotted: ACM-MD.]
ACM-MD and normal MD are similar in intra-domain motions
Zhang et al., Biophys. J., 2003, 84, 3583
[Figure: panels a and b, each comparing MD and ACM-MD]
Folding of an S-peptide analog
[Figure: RMSD (nm) versus time (ps) in four panels (a to d): MD starting from native; ACM-MD starting from native; MD starting from unfolded; ACM-MD starting from unfolded.]
Solid: RMS deviation from the unfolded structure as a function of time. Dotted: RMSD from the native structure as a function of time.
ACM-MD refolds the peptide while normal MD cannot
Zhang et al., Biophys. J., 2003, 84, 3583
The ACM method: collective DOFs; kinetic energy-based; improves sampling; non-equilibrium ensemble, thus difficult to make quantitative
Application by another group: Biochemistry , 2006, 45 (51) : 15269-15278
Chain of states method in path optimization
The nudged elastic band method
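Each replica moves to minimize the force perpendicular to the path and to maintain an even distribution of the replicas along the path.
Force:
$$F_i = F_i^{s}\big|_{\parallel} - \nabla V(X_i)\big|_{\perp}$$
$$\nabla V(X_i)\big|_{\perp} = \nabla V(X_i) - \left(\nabla V(X_i)\cdot\hat{\tau}_i\right)\hat{\tau}_i$$
$$F_i^{s}\big|_{\parallel} = k\left(|X_{i+1}-X_i| - |X_i-X_{i-1}|\right)\hat{\tau}_i$$
where $\hat{\tau}_i$ is the unit tangent to the path at replica $i$.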
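The NEB force on each interior replica combines the component of the true force perpendicular to the path with a spring force along the path. A minimal sketch is given below; the central-difference tangent is a simplifying assumption (production implementations often use more careful tangent estimates), and `grad` is a hypothetical user-supplied function returning the gradient of the potential.

```python
import numpy as np


def neb_forces(images, grad, k=1.0):
    """NEB forces on a chain of replicas.

    images: (P, D) float array, one row per replica; the two end points are
    kept fixed (zero force). grad(x) returns the gradient of V at x.
    """
    forces = np.zeros_like(images)
    for i in range(1, len(images) - 1):
        ahead = images[i + 1] - images[i]
        behind = images[i] - images[i - 1]
        # Unit tangent from neighboring replicas (assumed estimate).
        tau = images[i + 1] - images[i - 1]
        tau = tau / np.linalg.norm(tau)
        g = grad(images[i])
        g_perp = g - (g @ tau) * tau  # gradient component perpendicular to the path
        f_spring = k * (np.linalg.norm(ahead) - np.linalg.norm(behind)) * tau
        forces[i] = f_spring - g_perp
    return forces
```

Each replica's force depends only on its own gradient and on the positions of its two neighbors, which is why the gradient evaluations can be distributed across processors.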
Reaction coordinate driven
Problems for enzymatic reactions
Enzyme systems contain many floppy degrees of freedom.
Impractically small radius of convergence.
Advantages:
No pre-assumed reaction coordinate.
Well suited to parallel computation.
Soft spectator degree of freedom y spoils the NEB calculation
$$E(x, y) = \cos(2x) + \frac{\cos(y)}{20}$$
Xie et al., J. Chem. Phys., 2004, 120, 8039.
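$$S_{i,j} = \left[\sum_{k=1}^{N}\left(d_{i,k}-d_{j,k}\right)^{2}\right]^{1/2}$$
$$F_i^{s}\big|_{\parallel} = k\left(S_{i+1,i}-S_{i,i-1}\right)\hat{\tau}_i$$
[Equation and plot: f(d), a piecewise function of the distance d with separate branches on either side of a threshold d_0; one branch is f(d) = d]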
Heuristic solution: Exclude spectator degrees of freedom
Use a set of inter-atomic distances (chemical subspace)
Multiple step reactions
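Measuring the separation between replicas only in a chosen set of inter-atomic distances means that motion of spectator atoms no longer affects the spring term. The sketch below illustrates that idea under stated assumptions: the separation is taken as the root of the summed squared differences of the selected distances, the piecewise scaling f(d) from the slide is omitted, and all function names and the `pairs` argument are hypothetical.

```python
import numpy as np


def chemical_distances(coords, pairs):
    """Selected inter-atomic distances (the 'chemical subspace').

    coords: (N, 3) Cartesian coordinates; pairs: list of (a, b) atom indices.
    """
    a, b = np.array(pairs).T
    return np.linalg.norm(coords[a] - coords[b], axis=1)


def replica_separation(coords_i, coords_j, pairs):
    """Separation between two replicas measured in the selected distances."""
    d_i = chemical_distances(coords_i, pairs)
    d_j = chemical_distances(coords_j, pairs)
    return np.sqrt(np.sum((d_i - d_j) ** 2))


def spring_magnitude(prev, cur, nxt, pairs, k=1.0):
    """Magnitude of the spring term along the path for the middle replica."""
    return k * (replica_separation(nxt, cur, pairs)
                - replica_separation(cur, prev, pairs))
```

With this measure, two replicas that differ only in the position of a floppy side chain or a distant water molecule have zero separation, so the springs act only on the coordinates chosen to describe the chemistry.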
Xie et al., J. Chem. Phys., 2004, 120, 8039.
Active-site groups of class A beta-lactamase
The acylation step of class A beta-lactamase
Xie et al., J. Chem. Phys., 2004, 120, 8039.
Energy decomposition
TS stabilization
Xie et al., J. Chem. Phys., 2004, 120, 8039.
An application
Metal-preferences of metallo-proteases
E. coli peptide deformylase: prefers Fe2+ over Zn2+
Thermolysin: prefers Zn2+
Dong et al., J. Phys. Chem. B, 2008, 112, 10280-10290.
Comparative modeling of Zn-TLN and Zn-PDF using NEB
Dong et al., J. Phys. Chem. B, 2008, 112, 10280-10290.
Ab initio QM/MM potential energy surfaces reproduce metal preferences
Dong et al., J. Phys. Chem. B, 2008, 112, 10280-10290.
Summary
• Some general discussion of “coordinates”-based or DOF-specific approaches to accelerating the modeling of slow processes
• Two particular types of approaches
– Amplified collective motions
– NEB adapted for the simulation of enzyme reactions
• An example showing that comparative modeling provides biochemical insights
Acknowledgements
Zhiyong Zhang, Jianbin He (ACM)
Li Xie (adapted NEB)
Minghui Dong (PDF and TLN)
All former and current group members
Adapted NEB: Weitao Yang and group
Funding: CAS, NSFC
Thanks!