movimientos en proteínas ≡ dinámica molecular



Page 1: Movimientos en proteínas ≡ Dinámica molecular

XIIIth edition of the Máster en Bioinformática y Biología Computacional (2015-2016 academic year)

Movimientos en proteínas ≡ Dinámica molecular (Protein motions ≡ molecular dynamics)

Prof. Federico Gago

Departamento de Ciencias Biomédicas

"the period 1965-1975 may be described as the decade of the rigid macromolecule. Brass models of DNA and a variety of proteins dominated the scene and much of the thinking".

D.C. Phillips, Biomolecular Stereodynamics, 1981

'... a protein cannot be said to have "a" secondary structure but exists mainly as a group of structures not too different from one another in free energy but frequently differing considerably in energy and entropy. In fact, the molecule must be conceived as trying every possible structure...'

K. U. Linderstrøm-Lang & J. A. Schellman, Enzymes 1, 443-465 (1959)

[Figure: CALMODULIN]

“Certainly no subject or field is making more progress on so many fronts, at the present moment, than biology, and if we were to name the most powerful assumption of all, which leads one on and on in an attempt to understand life, it is that all things are made of atoms and that everything that living things do can be understood in terms of the jigglings and wigglings of atoms.”

The Relation of Physics to Other Sciences, chapter 3. Feynman, R. P.; Leighton, R. B.; Sands, M. The Feynman Lectures on Physics; Addison-Wesley: Reading, MA, 1963; Vol. I, p 3.6.

Page 2: Movimientos en proteínas ≡ Dinámica molecular

Biological macromolecules exhibit a wide range of time scales over which specific processes take place, such as:

• Local motions (0.01 to 5 Å, 10⁻¹⁵ to 10⁻¹ s): atomic fluctuations, side-chain movements, loop movements
• Rigid-body motions (1 to 10 Å, 10⁻⁹ to 1 s): helical movements, domain motions (hinge bending), subunit motions
• Large-scale motions (> 5 Å, 10⁻⁷ to 10⁴ s): helix-coil transitions, dissociation/association, folding and unfolding

[Figure: amplitude of motion (Å, 1 to 100) against time (s, 10⁻¹⁵ to 10³)]

Proteins are flexible molecules

Motions of macromolecules are often the essential link between structure and function:

• Catalysis
• Regulation of activity
• Transport of metabolites
• Formation of large assemblies
• Cellular locomotion...

[Diagram: STRUCTURE, FLEXIBILITY, BIOLOGICAL FUNCTION]

Page 3: Movimientos en proteínas ≡ Dinámica molecular

STABILIZATION OF THE TRANSITION STATE CONNECTING SUBSTRATES AND PRODUCTS, AND DECREASE OF THE ACTIVATION FREE ENERGY

PRE-ORGANIZED ENVIRONMENT OF THE ENZYME ACTIVE SITE

http://www.ebi.ac.uk/thornton-srv/databases/enzymes/

The switching cycle of the elongation factor EF-Tu delivers aminoacyl-tRNAs to the ribosome

[Figure labels: EF-Tu with GTP bound; GTP-EF-Tu binds to aminoacyl-tRNA; aminoacyl-tRNA on ribosome; GTP hydrolysis leads to release from the tRNA; EF-Tu with GDP bound; interaction of GDP-bound EF-Tu with EF-Ts (nucleotide exchange factor); GDP; GTP]

The switching cycle of the GTPases (e.g. Ras) involves interactions with proteins that facilitate binding of GTP and stimulation of GTPase activity

[Figure label: signal]

Page 4: Movimientos en proteínas ≡ Dinámica molecular

Obtaining Dynamic Information

• Experimental Approaches
– X-ray Crystallography: proteins solved in multiple conformations
– Nuclear Magnetic Resonance
– Hydrogen/Deuterium Exchange
– Amide H/D exchange mass spectrometry
– Single-Molecule Electron Transfer

• Computational Approaches
– Normal Mode Analysis
– Gaussian Network Model
– COREX / BEST
– Molecular Dynamics

High frequency: structural stability
Low frequency: flexible region

Protein Morpher

The purpose of morphing is to smooth the visual transition, making it easier to see the structural relationships between the two empirical conformations.

http://www.umass.edu/microbio/chime/morpher/

• For some proteins, major changes in secondary, tertiary, or quaternary structure are essential to function.
• In some cases, investigators have succeeded in obtaining empirically determined structures for a protein in two conformations.
• The challenge for visualization is then to be able to follow the movements of each portion between the two conformations.
• A series of intermediate conformations is generated by linear interpolation of alpha carbon positions.

[Figure: recoverin, 1iku → 1jsa]

It is crucial to realize that while the interpolated intermediate conformations aid visualization, they are otherwise meaningless!
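The interpolation step described above can be sketched in a few lines. This is a minimal illustration, not the Protein Morpher code: the function name `morph` and the toy coordinates are hypothetical, and the two conformations are assumed to be already superimposed and to list equivalent Cα atoms in the same order.

```python
def morph(start, end, n_intermediates):
    """Linearly interpolate Cα positions between two conformations.

    start, end: lists of (x, y, z) tuples for equivalent atoms, same order.
    Returns n_intermediates + 2 frames (both end points included).
    """
    if len(start) != len(end):
        raise ValueError("both conformations need the same number of atoms")
    steps = n_intermediates + 1
    frames = []
    for k in range(steps + 1):
        t = k / steps  # interpolation parameter running from 0 to 1
        frames.append([
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ])
    return frames


# Toy example (hypothetical coordinates): one atom moving 4 Å along x.
frames = morph([(0.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)], n_intermediates=3)
```

Because each atom travels along a straight line, intermediate frames can contain distorted bond lengths and angles, which is why such frames only serve visualization.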

http://molmovdb.mbb.yale.edu/molmovdb/

Krebs W.G., Gerstein M. Nucleic Acids Res. 28, 1665-1675 (2000)

© 1997-2003 Werner Krebs, Nat Echols, Mark Gerstein

MovieMaker Version 1.0
http://wishart.biology.ualberta.ca/moviemaker/

Page 5: Movimientos en proteínas ≡ Dinámica molecular

Differences between shear (sliding) and hinge motions

Feature | Shear mechanism | Hinge mechanism
Well-packed interfaces | MAINTAINED throughout motion | NOT MAINTAINED; rather created, burying surface
Main-chain packing | Constrained by close packing | Free to kink at hinge
Main-chain torsions | Many small changes | A few large changes
Motion overall | Concatenation of small local motions | Identical to twisting at hinge
Motion at interface | Parallel to plane of interface (shear) | Perpendicular to interface
Side-chain packing | Same packing in both forms | New contacts; packing at base of hinge crucial
Side-chain torsions | Mostly small changes | Some large changes
Simple example | Trp repressor, insulin | Lactoferrin, calmodulin

Gerstein M, Krebs W. A database of macromolecular motions. Nucleic Acids Res. 26, 4280-4290 (1998)

[Figure: shear motion and hinge motion, with the interfaces and the hinge labelled]

Hingefind: an algorithm to investigate domain motions in proteins.

http://biomachina.org/disseminate/hingefind/hingefind.html

• Identification, characterization and visualization of domain movements by effective rotation axes (hinges).
• The program compares two known structures (e.g. two different crystal structures of a protein or the results of molecular dynamics simulations) and partitions the protein with a prespecified tolerance in preserved subdomains.
• It then determines effective rotation axes which characterize the domain movements with respect to the reference domain ("rigid core").
• The method does not require any previous knowledge about functionally relevant domains or hinged domain motions.

W. Wriggers & K. Schulten. Protein Domain Movements: Detection of Rigid Domains and Visualization of Effective Rotations in Comparisons of Atomic Coordinates. Proteins: Structure, Function, and Genetics 29, 1-14 (1997).

DynDom: a program to determine domains, hinge axes and hinge bending residues in proteins where two conformations are available

http://www.cmp.uea.ac.uk/dyndom/

• MinActionPath calculates the most probable trajectory between two known structural states of a protein, in the sense of maximum likelihood or Minimum Action, as introduced by Onsager and Machlup (1953).
• It differs fundamentally from "morphing", which is purely numerical interpolation between 2 structures using either internal angles or intramolecular distances.
• The "real" trajectory is calculated using path-integral techniques.

The main approximation of the current implementation is that the energy landscape around each of the two states is harmonic (the Elastic Network Model (ENM) centered on each state).

Page 6: Movimientos en proteínas ≡ Dinámica molecular

I. Motions of Fragments Smaller than Domains

A. Motion is predominantly shear - Proteins for which two or more conformations are known: dihydrofolate reductase, insulin, thymidylate synthase, bacteriorhodopsin…

http://molmovdb.mbb.yale.edu/molmovdb/

B. Motion is predominantly hinge - Proteins for which two or more conformations are known: annexin V (Trp motion), cystatin, enolase, HIV-1 protease, HhaI methyltransferase, immunoglobulin (CDR motion), isocitrate dehydrogenase, lactate dehydrogenase, lipase, malate dehydrogenase, seryl-tRNA synthetase, triglyceride lipase, triose phosphate isomerase, Yersinia protein tyrosine phosphatase, ras protein…

II. Domain Motions

A. Motion is predominantly shear - Proteins for which two or more conformations are known: alcohol dehydrogenase, aspartate amino transferase, citrate synthase, endothiapepsin, glyceraldehyde-3-phosphate dehydrogenase, glycerol kinase, hexokinase, human interleukin 5, phosphofructokinase (not allosteric transition), Trp repressor…

[Figure label: glucose]

B. Motion is predominantly hinge - Proteins for which two or more conformations are known: acetylcholinesterase, adenylate kinase, annexin V (breathing motion), calbindin, calmodulin, canine lymphoma immunoglobulin (Fc-Fab hinge), catabolite gene activator protein (CAP), cell adhesion molecule CD2, DNA polymerase beta, diphtheria toxin, E. coli periplasmic dipeptide binding protein, family-5 endoglucanase CelC, formate dehydrogenase, glutamate dehydrogenase, glutamine binding protein, GroEL domain, heat shock transcription factor, interferon-gamma, iron sulfur protein (bc1 complex), lactoferrin, Lysine/Arginine/Ornithine (LAO) binding protein, maltodextrin binding protein, phosphoglycerate kinase, recoverin, T4 lysozyme mutants (Ile3->Pro & Met6->Ile), TBSV coat protein, troponin-C, tryptophan synthase, c-Src tyrosine kinase, cAMP-dependent protein kinase (catalytic domain)…

Page 7: Movimientos en proteínas ≡ Dinámica molecular

Periplasmic Binding Proteins: a structurally conserved family of bilobate, soluble receptor proteins

• Glucose/Galactose Binding Protein
• Ribose Binding Protein
• Leu, Ile, Val Binding Protein
• Dipeptide Binding Protein
• Maltose Binding Protein
• Arabinose Binding Protein
• Leucine Binding Protein
• D-Allose Binding Protein

II. Domain Motions

C. Motion involves partial refolding of tertiary structure - Proteins for which two or more conformations are known: Gα, HIV-1 reverse transcriptase, haemagglutinin, serpins…

[Figure: HIV-1 reverse transcriptase p66 subunit, 1RTD → 1FK9; efavirenz]

III. Larger Movements than Domain Movements involving the Motion of Subunits

A. Motion involves an allosteric transition - Proteins for which two or more conformations are known: aspartate transcarbamoylase, fructose-1,6-bisphosphatase, glycogen phosphorylase, hemoglobin, Lac repressor core (allosteric motion), Lac repressor upon binding DNA (subunit motion via tetramerization domain), phosphofructokinase…

B. Motion does not involve an allosteric transition - Proteins for which two or more conformations are known: aspartate receptor, Bam HI endonuclease, immunoglobulin (VL-VH movement), S. cerevisiae PPR1 Zn-finger DNA recognition protein, erythropoietin receptor, F1-ATPase, polymerase processivity factor PCNA…

Page 8: Movimientos en proteínas ≡ Dinámica molecular

http://idp1.force.cs.is.nagoya-u.ac.jp/pscdb/

[domain motions induced upon ligand binding] [domain motions regardless of ligand binding]

[local motions induced upon ligand binding] [local motions regardless of ligand binding]

[imaginable motions required to hold ligand inside protein]

Obtaining Dynamic Information

• Experimental Approaches
– X-ray Crystallography: proteins solved in multiple conformations
– Nuclear Magnetic Resonance
– Hydrogen/Deuterium Exchange
– Amide H/D exchange mass spectrometry
– Single-Molecule Electron Transfer

• Computational Approaches
– Normal Mode Analysis
– Gaussian Network Model
– Molecular Dynamics

High frequency: structural stability
Low frequency: flexible region

In Silico Approaches

• Computer modelling of protein dynamics: different representations of protein structures.
– Molecular Dynamics Simulation
– Normal Mode Analysis
– Gaussian Network Analysis

[Figure axes: Accuracy, Detail, Computation Time]

Normal Mode Analysis

A simple approach based on harmonic oscillators that analyzes dynamics near a local minimum.

Details of the protein structure are reduced by converting from Cartesian coordinates to internal coordinates (in some cases based on dihedral angles).

Ma JP. Usefulness and limitations of normal mode analysis in modeling dynamics of biomolecular complexes. Structure 13 (3): 373-380 (2005)

Page 9: Movimientos en proteínas ≡ Dinámica molecular

Normal Mode Analysis

• Fast and simple method to calculate vibrational modes and protein flexibility.
• Atoms (sometimes only Cα) are modelled as point masses connected by springs, which represent the interatomic force fields.
• One particular type is the Elastic Network Model: the springs connecting each node to all other neighbouring nodes are of equal strength, and only the atom pairs within a cutoff distance are considered.

[Figure: detailed specific potentials compared with an approximate uniform potential]

“A single parameter potential is sufficient to reproduce the slow dynamics in good detail”
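To make the spring picture concrete, the following sketch builds an elastic network from a reference structure and evaluates the harmonic energy of a deformed structure. It is only an illustration under stated assumptions: the function names, the 8 Å cutoff, the unit spring constant and the toy coordinates are hypothetical choices, not values taken from the slides or from any particular ENM server.

```python
import math
from itertools import combinations


def build_springs(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` with a spring.

    Returns a list of (i, j, rest_length); the rest length is the distance
    observed in the reference structure, so the reference has zero energy.
    """
    springs = []
    for i, j in combinations(range(len(coords)), 2):
        d0 = math.dist(coords[i], coords[j])
        if d0 <= cutoff:
            springs.append((i, j, d0))
    return springs


def enm_energy(coords, springs, k=1.0):
    """Harmonic energy 0.5 * k * sum (d_ij - d0_ij)^2, same k for all springs."""
    return 0.5 * k * sum(
        (math.dist(coords[i], coords[j]) - d0) ** 2 for i, j, d0 in springs
    )


# Toy chain of four nodes spaced 3.8 Å apart (hypothetical coordinates).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
springs = build_springs(reference, cutoff=8.0)

# Pull the last node 1 Å further along x and evaluate the energy penalty.
stretched = reference[:3] + [(12.4, 0.0, 0.0)]
penalty = enm_energy(stretched, springs)
```

The only structural input is the list of contacts, which is the sense in which the topology of contacts controls the slow dynamics.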

Normal mode calculation and visualisation using PyMOL

http://lorentz.immstr.pasteur.fr/nomad-ref.php

The most important characteristic controlling the dynamics of folded structures is the TOPOLOGY of CONTACTS

Normal modes are used as a basis set of collective movements for macromolecules, which can then be used to:

• generate alternative models with correct stereochemistry but large amplitude movements
• refine models against experimental data such as X-ray diffraction or cryo-EM data
• refine docking solutions in cases where it is known that the receptor is flexible

Page 10: Movimientos en proteínas ≡ Dinámica molecular

Welcome to elNémo!

http://www.sciences.univ-nantes.fr/elnemo/index.html

elNémo is the Web interface to the Elastic Network Model (ENM), a fast and simple way of computing the low frequency normal modes of a macromolecule.

• Thanks to the ‘rotations-translations of blocks’ (RTB) approximation, this server can perform calculations for all-atom systems.
• One major application of normal modes is the identification of potential conformational changes, e.g. of enzymes upon ligand binding, membrane channel opening, analysis of structural movements of systems as large as the ribosome.
• In 50% of the cases where protein structures are available in two different conformations, the related motion can be described by using only one or two low frequency normal modes.
• Application in X-ray crystallography data phasing: use of normal mode perturbed models as templates in molecular replacement.
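How well a single mode "describes" an observed conformational change is commonly quantified by an overlap, the normalized projection of the displacement between the two structures onto the mode. The slides do not give a formula, so the sketch below is an assumption about the measure; the function name and the toy vectors are hypothetical.

```python
import math


def overlap(mode, start, end):
    """Return |d . m| / (|d| |m|), with d = end - start.

    mode, start, end: lists of (x, y, z) tuples, one per node, same order.
    A value of 1 means the mode points exactly along the observed change.
    """
    d = [e - s for p, q in zip(start, end) for s, e in zip(p, q)]
    m = [component for vector in mode for component in vector]
    dot = sum(a * b for a, b in zip(d, m))
    norm = math.sqrt(sum(a * a for a in d)) * math.sqrt(sum(b * b for b in m))
    return abs(dot) / norm


# Toy two-node system (hypothetical): node 2 moves 1 Å along x.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
end = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

along = overlap([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], start, end)
across = overlap([(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)], start, end)
shared = overlap([(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], start, end)
```

Here `along` is 1 (mode parallel to the change), `across` is 0 (perpendicular), and `shared` is about 0.71 because half of the mode's motion sits on a node that does not move.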

http://imods.chaconlab.org/

López-Blanco JR, Garzón JI, Chacón P. iMod: multipurpose normal mode analysis in internal coordinates. Bioinformatics 27(20): 2843-2850 (2011). doi: 10.1093/bioinformatics/btr497

López-Blanco JR, Aliaga JI, Quintana-Ortí ES, Chacón P. iMODS: internal coordinates normal mode analysis server. Nucleic Acids Res. 42 (Web Server issue): W271-W276 (2014). doi: 10.1093/nar/gku339

[Figure: GroEL monomer cis→trans transition]

Page 11: Movimientos en proteínas ≡ Dinámica molecular

PyANM: a PyMOL plugin for Anisotropic Network Model (ANM) building and visualization

http://pymolwiki.org/index.php/PyANM

Yuan Wang, Robert Jernigan Lab, Iowa State University

[Figure: HIV protease (PDB entry 1T3R)]

Gaussian Network Model

A new way of representing structure!

Models protein structure as a 3-D elastic network with Cα atoms acting as nodes for matrix connectivity.

Can we predict fluctuation dynamics from native state topology only?

[Figure: ribbon diagram and GNM representation]

An elastic network model is proposed for the interactions between closely (≤ 7.0 Å) located Cα pairs in folded proteins.

A single-parameter harmonic potential is adopted for the fluctuations of residues about their mean positions in the crystal structure.

The model is based on writing the Kirchhoff adjacency matrix for a protein, defining the proximity of residues in space.

The elements of the inverse of the Kirchhoff matrix give directly the auto-correlations or cross-correlations of atomic fluctuations.

Bahar, I., Atilgan, A.R. & Erman, B. (1997) Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Folding & Design 2(3), 173-181.

Flory, P.J. (1976) Statistical thermodynamics of random networks. Proc. Roy. Soc. London A 351, 351-380.

Page 12: Movimientos en proteínas ≡ Dinámica molecular

Gaussian Network Model (GNM)

[Figure: schematic representation of the equilibrium positions of the ith and jth nodes with respect to a laboratory-fixed coordinate system; GNM, N-1 DOF]

Atilgan et al. Biophys. J. 80 (2001); Chennubhotla et al. Phys. Biol. 2 (2005)

GNM Model Definition

• Kirchhoff matrix for the network topology

[Equation: example Kirchhoff matrix, with the number of contacts of each node on the diagonal (1, 2, 3, …, 2, 1) and −1 for every pair of nodes in contact]

• Cross-correlation between residue fluctuations, for any given i and 1 ≤ j ≤ N

i = j ⇒ mean-squared fluctuations
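The formulas on this slide did not survive as text. For orientation, the definitions below follow the standard GNM formulation of the cited Bahar et al. (1997) paper; the symbols $r_c$ (cutoff distance), $R_{ij}$ (distance between nodes i and j), $\gamma$ (spring constant) and $k_B T$ are introduced here and are assumptions about the slide's notation.

```latex
\Gamma_{ij} =
\begin{cases}
-1 & \text{if } i \neq j \text{ and } R_{ij} \le r_c \\
0 & \text{if } i \neq j \text{ and } R_{ij} > r_c \\
-\sum_{k \neq i} \Gamma_{ik} & \text{if } i = j
\end{cases}
\qquad
\langle \Delta R_i \cdot \Delta R_j \rangle = \frac{3 k_B T}{\gamma} \left[ \Gamma^{-1} \right]_{ij}
```

Because every row of $\Gamma$ sums to zero, the matrix has one zero eigenvalue, and $\Gamma^{-1}$ is understood as the pseudo-inverse taken over the N-1 non-zero modes. Setting $i = j$ gives the mean-squared fluctuation of residue i.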

GNM Model Analysis

• Eigendecomposition of the Kirchhoff matrix
• Mean-square fluctuation of residue i
• Fluctuations are assumed to be isotropic
• Only the amplitudes of motions are predicted; there is no directional information
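The analysis can be sketched with the standard library alone. Note the swap: instead of an eigendecomposition, this sketch obtains the pseudo-inverse of the Kirchhoff matrix from the identity Γ⁺ = (Γ + J/N)⁻¹ − J/N (J being the all-ones matrix), which holds for a connected network. Function names, the toy chain and the 4 Å cutoff are hypothetical; the returned values are relative fluctuations, i.e. the diagonal of Γ⁺ without the 3k_BT/γ prefactor.

```python
import math


def kirchhoff(coords, cutoff=7.0):
    """Kirchhoff matrix: -1 for node pairs within cutoff, contact counts on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: is the network connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def relative_fluctuations(coords, cutoff=7.0):
    """Diagonal of the pseudo-inverse of the Kirchhoff matrix (one value per node)."""
    gamma = kirchhoff(coords, cutoff)
    n = len(gamma)
    # Shift the zero mode to eigenvalue 1, invert, then remove it again.
    shifted = [[gamma[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inverse = invert(shifted)
    return [inverse[i][i] - 1.0 / n for i in range(n)]


# Toy chain (hypothetical): three nodes 3.8 Å apart, only neighbours in contact.
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
msf = relative_fluctuations(toy, cutoff=4.0)
```

For this chain the two terminal nodes get 5/9 and the central node 2/9, so the less connected ends fluctuate more. Only amplitudes come out, in line with the isotropy assumption above.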

Gaussian Network Model (GNM)

• A coarse-grained model to study vibrational dynamics of proteins in the folded state.
• Interactions between residues are replaced by linear springs, in analogy with the elasticity theory of random polymer networks.
• The above approximation is based on a Gaussian distribution of inter-atomic distances about their equilibrium values.

http://klee.bme.boun.edu.tr/research1.html

[Figure: the mean-square fluctuations of the Cα atoms of apomyoglobin, $\langle \Delta R_i^2 \rangle$, as a function of residue index i. The dashed line shows the experimental results for the temperature factor, $B_i = 8\pi^2 \langle \Delta R_i \cdot \Delta R_i \rangle / 3$.]

Page 13: Movimientos en proteínas ≡ Dinámica molecular

iGNM 1.2 (Internet Accessible GNM Software): A Database of Protein Functional Motions Based on Gaussian Network Model

Dr. Ivet Bahar

http://ignm.ccbb.pitt.edu/

oGNM (On-line GNM)

1. oGNM provides direct computation of dynamics for a protein's biologically functional unit by switching the radio selection from 'PDB File' to 'Biological Unit File'.
- Note that oGNM takes the entire PDB file (all the MODEL entries) as a whole and treats it as an integral complex if you check 'Biological Unit File'. For instance, you DO NOT want to check 'Biological Unit File' if you submit an NMR structure containing multiple models. By checking 'PDB File', oGNM will only compute the normal modes for the first model in an NMR structure.
2. Selectable cutoff distances for Cα-Cα in proteins (ranging from 6 to 20 Å) and P-P in nucleotides (ranging from 9 to 40 Å for the 1-node per nucleotide model; 6.5 to 40 Å for the 3-node per nucleotide model). The cutoff for Cα-P is the arithmetic average of the two.
3. ALL of the Cα and nucleotide atoms (P or P/C4*/C2) in the uploaded structure will be taken as NODEs, including both standard and non-standard amino acids and nucleotides.
4. The connectivity matrix is stored in a sparse format in the .kdat file, where only non-zero contacts are recorded.
5. BLZPACK is used as the current eigensolver.
6. Theoretically predicted B-factors (time-averaged fluctuations over all modes) are calculated by the PowerB algorithm.
7. The current output includes:
• the mobility profiles of residues corresponding to the 20 slowest modes of motion predicted by the GNM;
• the average profile resulting from the first 2 slowest modes;
• the associated eigenvalues (21 of them, including the zero eigenvalue);
• the predicted and experimental B-factors, and the correlation coefficient between the two sets of B-factors;
• the spring constant (γ) in units of kcal/(mol·Å²);
• the cross-correlation between residue fluctuations, plotted as a correlation map (for structures containing fewer than 500 nodes);
• the nodes included in the GNM analysis, summarized in the .ca file.

Potential Energy Functions

• Theoretical studies of biological molecules make it possible to study the relationships between structure, function and dynamics in atomic detail.
• Since many of the problems that one would like to address in biological systems involve many atoms, it is not yet feasible to treat these systems using quantum mechanics.
• The problems become much more tractable when turning to empirical potential energy functions, which are much less computationally demanding than quantum mechanics.
• Current generation force fields provide a reasonably good compromise between accuracy and computational efficiency.

Page 14: Movimientos en proteínas ≡ Dinámica molecular

Molecular Mechanics (MM)

• A computational technique used to model the conformational behaviour and energetic properties of molecules.
• The molecule is treated at the atomic level, i.e. the electrons are not treated explicitly.
• MM uses an energy function, defined so that, given a particular conformation (i.e. a set of spatial coordinates for all the atoms), the energy of the molecule can be calculated.
• The energy function is empirical, i.e. it is not entirely derived from rigorous theories.
• The energy function makes a distinction between 'bonded' and 'non-bonded' interactions.

Force fields

• Force fields are often calibrated to experimental results and quantum mechanical calculations of small model compounds.
• Their ability to reproduce physical properties measurable by experiment is tested; these properties include structural data obtained from X-ray crystallography and NMR, dynamic data obtained from spectroscopy and inelastic neutron scattering, and thermodynamic data.
• The development of parameter sets is a very laborious task, requiring extensive optimization.
• This is an area of continuing fundamental and applied research, and many groups have been working over the past two decades to derive functional forms and parameters for potential energy functions of general applicability to biological molecules.

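Molecular Mechanics

$$E_{\mathrm{potential}} = E_{\mathrm{bonded}} + E_{\mathrm{non\text{-}bonded}}$$

$$E_{\mathrm{bonded}} = \sum_i E_{\mathrm{bonds}} + \sum_i E_{\mathrm{angles}} + \sum_i E_{\mathrm{dihedrals}}$$

$$E_{\mathrm{non\text{-}bonded}} = \sum_i E_{\mathrm{electrostatic}} + \sum_i E_{\mathrm{van\ der\ Waals}}$$

TINKER's "Molecular Mechanics" logo. Illustration by Jay Nelson. Courtesy of Prof. Robert Paine, Chemistry Dept., Univ. of New Mexico.

BONDED TERMS

$$E_{\mathrm{bonds}} = \sum_{\mathrm{bonds}} \frac{1}{2} k_b (b - b_0)^2$$

$$E_{\mathrm{angles}} = \sum_{\mathrm{angles}} \frac{1}{2} k_\theta (\theta - \theta_0)^2$$

$$E_{\mathrm{dihedrals}} = \sum_{\mathrm{dihedrals}} \frac{1}{2} k_d \left[ 1 + \cos(\phi - \phi_0) \right]$$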
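The three bonded terms translate directly into code. The sketch below is a minimal illustration, not any package's implementation: function names and parameter values are hypothetical, angles are in radians, and the multiplicity `n` in the dihedral term is an added assumption (it defaults to 1, which reproduces the form on the slide; most force fields use a periodic term with an explicit multiplicity).

```python
import math


def bond_energy(b, b0, kb):
    """Harmonic bond stretching: 0.5 * kb * (b - b0)^2."""
    return 0.5 * kb * (b - b0) ** 2


def angle_energy(theta, theta0, ktheta):
    """Harmonic angle bending: 0.5 * ktheta * (theta - theta0)^2."""
    return 0.5 * ktheta * (theta - theta0) ** 2


def dihedral_energy(phi, phi0, kd, n=1):
    """Periodic torsion: 0.5 * kd * [1 + cos(n*phi - phi0)]; n is an assumption."""
    return 0.5 * kd * (1.0 + math.cos(n * phi - phi0))


def bonded_energy(bonds, angles, dihedrals):
    """Sum the three bonded contributions.

    bonds:     iterable of (b, b0, kb)
    angles:    iterable of (theta, theta0, ktheta)
    dihedrals: iterable of (phi, phi0, kd)
    """
    return (sum(bond_energy(*term) for term in bonds)
            + sum(angle_energy(*term) for term in angles)
            + sum(dihedral_energy(*term) for term in dihedrals))


# Hypothetical values: one bond stretched by 0.1 Å, one angle at equilibrium,
# one dihedral sitting at the minimum of its cosine term.
total = bonded_energy(
    bonds=[(1.1, 1.0, 300.0)],
    angles=[(1.91, 1.91, 80.0)],
    dihedrals=[(math.pi, 0.0, 2.0)],
)
```

With these numbers only the stretched bond contributes, 0.5 × 300 × 0.1² = 1.5 in the energy unit of the force constants.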

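Page 15: Movimientos en proteínas ≡ Dinámica molecular

NON-BONDED TERMS

$$E_{\mathrm{electrostatic}} = \frac{1}{4\pi\varepsilon_0} \sum_{ij} \frac{q_i q_j}{r_{ij}}$$

$$E_{\mathrm{Lennard\text{-}Jones}} = \sum_{ij} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right)$$

[Figure: interaction between atoms i and j carrying like (+ +) and opposite (+ –) charges; Lennard-Jones curve with the repulsion and attraction regions labelled]

John Kendrew’s forest of rods

CVFF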

Summary of interactions included in a representative molecular mechanics force field

• The empirical potential energy function is differentiable with respect to the atomic coordinates.
• This gives the value and the direction of the force acting on each atom and can thus be used in a molecular dynamics simulation.

“Number crunching” (numerical “experiments”)
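Differentiability is what turns an energy function into forces: the force on each atom is minus the gradient of the energy. The sketch below does this analytically for a Lennard-Jones plus Coulomb pair term. It is an illustration under simplifying assumptions: a single A and B is used for all pairs, the names are hypothetical, and the constant 332.06 is the approximate factor that converts e²/Å into kcal/mol.

```python
import math

COULOMB = 332.06  # approx. kcal*Å/(mol*e^2); plays the role of 1/(4*pi*eps0)


def pair_energy(r, a, b, qi, qj):
    """E(r) = A/r^12 - B/r^6 + COULOMB*qi*qj/r for one atom pair."""
    return a / r**12 - b / r**6 + COULOMB * qi * qj / r


def pair_force(r, a, b, qi, qj):
    """-dE/dr for one atom pair; positive means the atoms repel."""
    return 12.0 * a / r**13 - 6.0 * b / r**7 + COULOMB * qi * qj / r**2


def atomic_forces(coords, charges, a, b):
    """Force vector on every atom, summed over all pairs (no cutoff)."""
    forces = [[0.0, 0.0, 0.0] for _ in coords]
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            delta = [ci - cj for ci, cj in zip(coords[i], coords[j])]
            r = math.sqrt(sum(c * c for c in delta))
            f = pair_force(r, a, b, charges[i], charges[j])
            for k in range(3):
                # Equal and opposite contributions (Newton's third law).
                forces[i][k] += f * delta[k] / r
                forces[j][k] -= f * delta[k] / r
    return forces


# Hypothetical neutral pair beyond the Lennard-Jones minimum: attraction.
forces = atomic_forces([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], [0.0, 0.0], a=1.0, b=2.0)
```

With A = 1 and B = 2 the Lennard-Jones minimum lies at r = 1, so at r = 1.5 the two atoms are pulled towards each other: the force on the first atom points along +x and the force on the second along −x.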

Page 16: Movimientos en proteínas ≡ Dinámica molecular

Some popular force fields

AMBER (Assisted Model Building with Energy Refinement)
http://amber.scripps.edu/

CHARMm® (Chemistry at HARvard Macromolecular Mechanics)
http://www.charmm.org/

CVFF (Consistent-Valence Force Field)

GROMOS (GROningen MOlecular Simulation package)
http://www.igc.ethz.ch/gromos/ | http://www.gromos.net/

OPLS (Optimized Potentials for Liquid Simulations)

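AMBER (Assisted Model Building with Energy Refinement)

Have the bugs magically disappeared?

http://ambermd.org/

$$E_{\mathrm{total}} = \sum_{\mathrm{bonds}} k_b (b - b_0)^2 + \sum_{\mathrm{angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\mathrm{dihedrals}} \frac{V_n}{2} \left[ 1 + \cos(n\phi - \gamma) \right] + \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} + \frac{q_i \cdot q_j}{\varepsilon\, r_{ij}} \right] + \sum_{\mathrm{H\text{-}bonds}} \left[ \frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}} \right]$$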

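AMBER: some recent history

http://ambermd.org/

CHARMm® (Chemistry at HARvard Macromolecular Mechanics)

$$E = \sum_{\mathrm{bonds}} k_b (r - r_0)^2 + \sum_{\mathrm{angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\mathrm{proper\ dihedrals}} \left[ |k_\phi| - k_\phi \cos(n\phi) \right] + \sum_{\mathrm{improper\ dihedrals}} k_\omega (\omega - \omega_0)^2$$

$$\quad + \sum_{\mathrm{pairs}\ i,j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} + \frac{q_i \cdot q_j}{\varepsilon\, r_{ij}} \right] + \sum \left[ \frac{A'}{r_{AD}^{i}} - \frac{B'}{r_{AD}^{j}} \right] \cos^m(\theta_{A-H-D}) \cos^n(\theta_{AA-A-H}) + \sum K_i (r_i - r_{i0})^2 + \sum K_i (\phi_i - \phi_{i0})^2$$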

Page 17: Movimientos en proteínas ≡ Dinámica molecular

NAMD

• A parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
• Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations.
• Uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.
• Distributed free of charge with source code; binaries can be built for a wide variety of platforms.

http://www.ks.uiuc.edu/Research/namd/

[Figure: a viral ion channel]

CVFF (Consistent-Valence Force Field)

GROMOS (Dynamic Modelling of Molecular Systems)
Informatikgestützte Chemie, ETH Zurich

GROMACS (GROningen MAchine for Chemical Simulations): a molecular dynamics simulation package
http://www.gromacs.org/

Page 18: Movimientos en proteínas ≡ Dinámica molecular

OPLS (Optimized Potentials for Liquid Simulations)

$$E_{\mathrm{total}} = \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{C_{ij}}{r_{ij}^{6}} + \frac{q_i \cdot q_j}{\varepsilon\, r_{ij}} \right] + \sum_{\mathrm{torsions}} \frac{V_1}{2} \left( 1 + \cos\phi \right) + \sum_{\mathrm{torsions}} \frac{V_2}{2} \left( 1 - \cos 2\phi \right)$$

BOSS (Biochemical and Organic Simulation System): a general-purpose molecular modeling program
http://zarbi.chem.yale.edu/software.html
http://www.schrodinger.com/

Prof. William L. Jorgensen

ALGORITHMS FOR ENERGY MINIMIZATION AND MOLECULAR DYNAMICS

[Flow chart labels]
• Atomic positions (coordinate file)
• Covalent structure (topology file)
• Potential energy function (parameter file)
• Additional atoms (hydrogens; heterogroups; solvent; counterions)
• Special features (periodic boundary conditions; constant pressure; constant temperature)
• Total potential energy
• Forces on each atom
• Atomic velocities
• Effective temperature

Molecular Dynamics

Newton’s Second Law

$$-\frac{dE}{dx} = F = m \cdot a = m \cdot \frac{dv}{dt} = m \cdot \frac{d^2 x}{dt^2}$$

"If I have seen further, it is only by standing on the shoulders of giants."

Page 19: Movimientos en proteínas ≡ Dinámica molecular

Molecular Dynamics Simulations

The most detailed modelling of protein fluctuations: every atom is accounted for.

1977 - First protein modelled by molecular dynamics simulation (< 10 ps): Bovine Pancreatic Trypsin Inhibitor (BPTI).

1979 - Simulations resulted in the recognition that B factors can be used to infer internal motions.

With the improvement of computational technology, simulations can run up to hundreds of ns for larger systems (10⁴ to 10⁶ atoms).

Karplus M, McCammon JA. Molecular dynamics simulations of biomolecules. Nature Structural Biology 9 (9): 646-652 (2002)

First Movie of Protein Molecular Dynamics, simulated by Michael Levitt in 1979:
https://www.youtube.com/watch?v=_hMa6G0ZoPQ

All hydrogen atoms were included to make sure the structure remained close to the native X-ray structure.

“for the development of multiscale models for complex chemical reactions”

[Figure: + = QM/MM]

Page 20: Movimientos en proteínas ≡ Dinámica molecular

Shaw et al.: 250 ns of simulated motion, sampled every 0.25 ns. Science 330: 341-346 (2010)

Dynamics and Relaxation

• Time scales and molecular motions
– Atomic fluctuations, vibrations: 10⁻¹⁵ to 10⁻¹² s, < 1 Å
– Group motions (covalently linked units): 10⁻¹² to 10⁻³ s, from < 1 Å to 50 Å
– Molecular rotation, reorientation: 10⁻¹² to 10⁻⁹ s
– Molecular translation, diffusion
– Rotation of methyl groups: 10⁻¹² to 10⁻⁹ s
– Flips of aromatic rings: 10⁻⁹ to 10⁻⁶ s
– Domain motions: 10⁻⁸ to 10⁻³ s
– Proline isomerization: > 10⁻³ s

• Chemical exchange (e.g. two protein conformations)
• Amide exchange
• Ligand binding

DSMM – Database of Simulated Molecular Motions

http://projects.villa-bosch.de/dbase/dsmm/

DNA Polymerase Beta

• The purpose of this database is to provide an easily-searchable source of information about movies showing biomolecular motions that have been generated by computer simulation. All of the movies are available through the internet.

• Molecules simulated include proteins, DNA, RNA, sugars and lipids. Simulation techniques include Molecular Dynamics, Brownian Dynamics and

Automated Docking procedures.

Finocchiaro G., Wang T., Hoffmann R., Gonzalez A. and Wade R.C. Nucleic Acids Research 31: 456-457 (2003)

molecular dynamics

calculation of properties

comparative modelling

ligand docking

binding site description DRUG DESIGN

VIRTUAL SCREENING

STRUCTURE-ACTIVITY RELATIONSHIPS

PROTEIN-PROTEIN INTERACTION NETWORKS

Dynamic modelling: molecular dynamics (MD)

• The force-field equations can be integrated numerically using small time increments ∆t (~10⁻¹⁵ s), which makes it possible to obtain a trajectory of the system (atomic positions as a function of time). A million steps are necessary to provide only 1 ns!

• Due to the atoms’ inertia, MD allows the system to surmount energy barriers which are of the order of kT, given that the average kinetic energy per degree of freedom is ½kT.

• MD at elevated temperatures can be used to generate a variety of configurations of the molecular system.

• MD explores a larger part of configuration space in search of local energy minima and generally ends up in a lower energy minimum than do ordinary energy minimizers.
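The numerical integration described above can be sketched in a few lines of Python. This is only an illustration, assuming the velocity Verlet scheme (the slides do not name an integrator) applied to a single particle in a one-dimensional harmonic well; the function names, the reduced units and the force constant are hypothetical choices made for the example.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's second law (F = m*a) with the velocity Verlet scheme."""
    trajectory = [x]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append(x)
    return x, v, trajectory

# Hypothetical 1D harmonic oscillator in reduced units (k = m = 1)
k, m = 1.0, 1.0

def harmonic_force(x):
    return -k * x  # F = -dE/dx for E = k*x^2/2

def total_energy(x, v):
    return 0.5 * m * v**2 + 0.5 * k * x**2

dt, n_steps = 0.01, 1000
x_end, v_end, traj = velocity_verlet(1.0, 0.0, harmonic_force, m, dt, n_steps)

# Number of 1 fs steps needed to cover 1 ns of simulated time
steps_per_ns = round(1e-9 / 1e-15)
```

With a small enough time step the total energy stays close to its initial value, which is one practical check that ∆t has been chosen sensibly.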

MD simulations allow the study of complex, dynamic

processes that take place in biological systems, such as:

• Protein stability

• Conformational changes

• Protein folding

• Molecular recognition: ligands, proteins, DNA, membranes...

• Transport of ions in biological systems

and provide the means to carry out the following studies:

- structure determination by X-ray diffraction and NMR

spectroscopy

- calculation of differences in ligand affinity

MD simulations can be used to:

• Predict structures and properties of proteins differing in only a few amino acids from an initial known structure

• Refine model-built homologous protein structures

• Study protein stability with respect to changes in pH, temperature or solvent

• Study the conformational changes associated with the catalytic properties of enzymes (transition state stabilization)

• Study the stability of the proposed binding mode for a ligand

• Study binding affinities and/or specificities

Setting Up and Running a Molecular Dynamics Simulation

[Flowchart: Initial coordinates → Energy minimization → Assign initial velocities → Heating dynamics → Equilibration dynamics → Temp OK? (No: rescale velocities and continue; Yes: proceed) → Production dynamics → Analysis of trajectories. Side box: water molecules?]
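The "Temp OK? / Rescale velocities" loop of the flowchart can be illustrated with a short sketch. It assumes the simplest possible scheme, uniform velocity rescaling to a target temperature, with the effective temperature obtained from the kinetic energy as T = 2·KE / (N_dof·kB). The function names and the reduced units (kB = 1 by default) are assumptions made for the example, not the protocol of a specific MD package.

```python
import numpy as np

def instantaneous_temperature(masses, velocities, k_b=1.0):
    """Effective temperature from the kinetic energy: T = 2*KE / (N_dof * k_B)."""
    kinetic = 0.5 * np.sum(masses[:, None] * velocities**2)
    n_dof = velocities.size  # 3N; constraints and centre-of-mass removal are ignored
    return 2.0 * kinetic / (n_dof * k_b)

def rescale_velocities(masses, velocities, target_temperature, k_b=1.0):
    """Scale all velocities by sqrt(T_target / T_current)."""
    current = instantaneous_temperature(masses, velocities, k_b)
    return velocities * np.sqrt(target_temperature / current)

# Hypothetical system: 100 identical particles with random initial velocities
rng = np.random.default_rng(0)
masses = np.full(100, 12.0)
velocities = rng.normal(size=(100, 3))

scaled = rescale_velocities(masses, velocities, target_temperature=300.0)
temperature_after = instantaneous_temperature(masses, scaled)
```

Because the kinetic energy is quadratic in the velocities, one rescaling step brings the instantaneous temperature exactly to the target; production codes usually prefer gentler thermostats.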

Modeling biomolecule-solvent interactions

• Solvent models
– Explicit
    • Molecular dynamics
    • Monte Carlo
– Integral equation
    • RISM
    • 3D methods
    • DFT
– Primitive
    • Poisson equation
– Phenomenological
    • Generalized Born
    • Modified Coulomb’s law

• Ion models
– Explicit
    • Molecular dynamics
    • Monte Carlo
– Integral equation
    • RISM
    • 3D methods
    • DFT
– Field theory
    • Poisson-Boltzmann
    • Extended PB, etc.
– Phenomenological
    • Generalized Born
    • Debye-Hückel

[Side arrows next to each list: Level of detail; Computational cost]

Explicit solvent simulations

• Sample the configuration space of the system: ions, atomically detailed water, solute
• Sampling performed with respect to an ensemble: NpT, NVT, etc.
• Algorithms: molecular dynamics and Monte Carlo
• Advantages:
– High levels of detail
– Easy inclusion of additional degrees of freedom
– All interactions considered explicitly
• Disadvantages:
– Slow (and uncertain) convergence
– Time-consuming
– Boundary effects
– Poor scaling to larger systems
– Some effects still not considered in many force fields…

216 water molecules

water model descriptions

http://www1.lsbu.ac.uk/water/water_models.html

Stochastic boundary conditions Periodic boundary conditions

Periodic Boundary Conditions

1) If a molecule leaves

the Central Box

2) Then one of its images

will enter through the

opposite face
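The two rules above can be sketched for a cubic box. The example assumes a box with one corner at the origin; the function names are hypothetical. It also shows the closely related minimum image convention, in which each atom interacts with the nearest periodic image of its partner.

```python
import numpy as np

def wrap_into_box(positions, box_length):
    """Re-insert coordinates that left the central box through the opposite face."""
    return positions - box_length * np.floor(positions / box_length)

def minimum_image_distance(r_i, r_j, box_length):
    """Distance between atom i and the nearest periodic image of atom j."""
    delta = r_i - r_j
    delta = delta - box_length * np.round(delta / box_length)
    return float(np.linalg.norm(delta))

box = 10.0  # hypothetical cubic box edge, in Å
escaped = np.array([10.5, -0.2, 3.0])   # has left the box along x and y
wrapped = wrap_into_box(escaped, box)   # re-enters through the opposite faces

# Two atoms close to opposite faces are actually near neighbours
d = minimum_image_distance(np.array([0.5, 0.0, 0.0]),
                           np.array([9.5, 0.0, 0.0]), box)
```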

Periodic Boundary Conditions: The truncated octahedron

A truncated octahedron has the advantage of being more nearly spherical than most other MD cells.

This can be very useful when simulating a large molecule in solution, where fewer solvent molecules are required for a given simulation cell width.

Catalytic mechanism of Escherichia coli Thioredoxin Reductase

[Figure: electron flow NADPH/NADP+ → FADred/FADox → Thioredoxin reductase (disulfide –S–S– / dithiol –SH –SH) → Thioredoxin (disulfide –S–S– / dithiol –SH –SH) → Reductive cellular processes]

Negri A, Rodríguez-Larrea D, Marco E, Jiménez-Ruiz A, Sánchez-Ruiz JM, Gago F.

Proteins. 78(1):36-51 (2010). doi: 10.1002/prot.22490

Implicit solvent simulations: background

• Solute typically only accounts for 5-10% of atoms in explicit solvent simulation
• Implicit methods:
– Solvent treated as continuum of infinitesimal dipoles
– Ions treated as continuum of charge
• Some deficiencies:
– Polarization response is linear and local
– Mean field ion distribution ignores fluctuations and correlations
– Apolar effects treated by various, heuristic methods

Electrostatics software

Software package | Description | URL | Availability
APBS | Solves PBE in parallel with FD MG and FE AMG solvers. Provides limited GB support. | http://www.poissonboltzmann.org/apbs | Windows, All Unix. Free, open source.
DelPhi | Solves PBE sequentially with highly optimized FD GS solver. | http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:DelPhi | SGI, Linux, AIX. $250 academic.
GRASP | Visualization program with emphasis on graphics; offers sequential calculation of qualitative PB potentials. | http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:GRASP | SGI. $500 academic.
MEAD | Solves PBE sequentially with FD SOR solver. | http://stjuderesearch.org/site/lab/bashford/ | Windows, All Unix. Free, open source.
UHBD | Multi-purpose program with emphasis on SD; offers sequential FD SOR PBE solver. | http://chemcca51.ucsd.edu/uhbd.html | All Unix. $300 academic.
MacroDox | Multi-purpose program with emphasis on SD; offers sequential FD SOR PBE solver. | http://pirn.chem.tntech.edu/macrodox.html | SGI. Free, open source.
Jaguar | Multi-purpose program with emphasis on QM; offers sequential FE MG, SOR, and CG PBE solvers. Offers GB support. | http://www.schrodinger.com/Products/jaguar.html | Most Unix. Commercial.
CHARMM | Multi-purpose program with emphasis on MD; offers sequential FD MG PBE solver and can be linked with APBS. Offers GB support. | http://www.charmm.org/ | All Unix. $600 academic.
AMBER | Multi-purpose program with emphasis on MD; offers GB support. | http://ambermd.org/ | All Unix. $400 academic.

APBS: Adaptive Poisson-Boltzmann Solver

Software for evaluating the electrostatic properties of nanoscale biomolecular systems

http://www.poissonboltzmann.org/apbs/

• Designed to efficiently evaluate electrostatic properties in:
– simulations of diffusional processes to determine ligand-protein and protein-protein binding kinetics,
– implicit solvent molecular dynamics of biomolecules,
– solvation and binding energy calculations to determine ligand-protein and protein-protein equilibrium binding constants,
– biomolecular titration studies involving tens to millions of atoms and a wide range of time scales.

• Uses Parallel Algebraic MultiGrid code and the Finite Element ToolKit (http://www.fetk.org) to solve the Poisson-Boltzmann equation numerically.

Complex systems are often characterized by rough and complicated energy landscapes.

Such features can yield complicated dynamics evolving over long time scales.

Use of Experimental Data in MD Simulations

• The conformational space accessible to proteins (or biological macromolecules in general) is enormous.

• No simulation, irrespective of its length, will ever cover the conformational space.

• From experiment (e.g. X-ray, NMR) it is known that biomolecules tend to populate a particular fold or class of conformations.

• The experimental information can be used in a computer simulation to restrict the conformational space accessible to the macromolecular system.

• Those parts of configurational space that contain molecular configurations incompatible with the experimental information are made difficult to access in the simulation: penalty functions.

• The more the actual configuration of the system violates the experimental data, the higher the value of Vrestr.

Restrained Molecular Dynamics Simulations

• A penalty function or restraining potential energy term, Vrestr, is added to the physical potential energy function, Vphys:

V = Vphys({ri}) + Vrestr({ri})

• Different types of experimental information can be incorporated into Vrestr to force the trajectory to satisfy the experimental data:
– Atom-atom distance upper bounds (e.g. from NOE measurements).
– Torsional angles from NMR J-coupling data or structure factor amplitudes from X-ray diffraction measurements.

• Replacing instantaneous values by time-averaged values leads to time-dependent restraints.
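One common way to write such a penalty for distance upper bounds is a half-harmonic term that is zero while the restraint is satisfied and grows quadratically with the violation. The sketch below assumes that functional form; the force constant, the coordinates and the function name are hypothetical and not taken from the slides.

```python
import numpy as np

def distance_restraint_energy(coords, restraints, force_constant=10.0):
    """Half-harmonic Vrestr for atom-atom distance upper bounds (e.g. from NOEs).

    restraints is a list of (i, j, upper_bound) tuples; a pair contributes
    0.5 * K * (d - upper_bound)^2 only when its distance d exceeds the bound.
    """
    energy = 0.0
    for i, j, upper_bound in restraints:
        d = np.linalg.norm(coords[i] - coords[j])
        violation = max(0.0, d - upper_bound)
        energy += 0.5 * force_constant * violation**2
    return energy

# Hypothetical three-atom system (coordinates in Å)
coords = np.array([[0.0, 0.0, 0.0],
                   [3.0, 0.0, 0.0],
                   [0.0, 6.0, 0.0]])
restraints = [(0, 1, 5.0),   # satisfied: 3 Å < 5 Å
              (0, 2, 5.0)]   # violated by 1 Å

v_restr = distance_restraint_energy(coords, restraints)
```

The more a configuration violates the bounds, the larger v_restr becomes, which is the behaviour required of Vrestr in V = Vphys + Vrestr.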

• The occurrences of large displacements and transitions over high free energy barriers typically have small probabilities.

• Though rare, such events are often important compared with higher frequency fluctuations, e.g. switching of states of bistable systems or systems with multiple (meta)stable states, such as one sees in protein conformational changes in the regulation of cellular components.

• However, while it is relatively easy to study dynamics on short time scales (typically at nanoseconds to microseconds for biomolecules) using simulations, it is painful to extend the naive molecular dynamics simulations to the long time scales that are often of biological interest (often as long as microseconds to seconds for biomolecular systems).

Structural Homology between PBP and S1S2 Glu R2

Periplasmic Gln-binding Protein Glutamate Receptor 2 (+kainate)

Domain 1 Domain 1

Domain 2 Domain 2

hinge

Glutamate receptor

Amino

Terminal

Domain

1FTL 1GR2 1TXF

glutamate (−)

kainate

transmembrane segment

insertion loop

glutamate (+)

glutamate (−)

distance between the Cα atoms of Ser-158 and Arg-108

Mendieta, J.; Ramírez, G.; Gago, F. Proteins: Structure, Function & Genetics, 44(4): 460-469 (2001)

Special case: simulated annealing or ‘quenched’ molecular dynamics

• A coarse search is carried out at a high temperature, and a local minimum is approached during the cooling stage.

• The “temperature” of the system is not a physical temperature but rather a control parameter that determines whether the system can escape certain local minima.

• It may include non-physical energy terms.

Special case: Locally Enhanced Sampling (LES)

• A mean-field technique that allows increasing sampling of the region of interest by constructing multiple copies of this region, e.g. a loop, a ligand, etc.

• Copies do not interact with each other and interact with the rest of the system in an average way.

• This average is an average force or energy from all of the individual copy contributions, not one force or energy from an average conformation of the copies.

• Multiple trajectories of this region can be obtained while carrying out a single simulation.

• Barriers to conformational transitions are reduced.

Special case: Replica Exchange MD

• The frequent switching of simulation temperatures (based on a Metropolis criterion) between a number of concurrent simulations allows for significantly enhanced sampling of conformations.

β1 = 1/kBT1 ; β2 = 1/kBT2

T1 and T2 = the temperatures

kB = Boltzmann’s constant

U1 and U2 = the potential energies of replicas 1 and 2, respectively.

• Replica exchange simulations can be used to systematically improve the structure of a peptide towards the correct native structure in solvent, even when starting from a random, extended state.

• They usually allow enough sampling to deduce a detailed Gibbs free energy landscape in three dimensions.
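The Metropolis criterion itself is not written out in the text above. In its standard form, which is assumed here, the swap between replicas 1 and 2 is accepted with probability min(1, exp[(β1 − β2)(U1 − U2)]), using the symbols defined above. The function name, the energies and the temperatures in the sketch are illustrative.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1

def exchange_probability(u1, u2, t1, t2, k_b=K_B):
    """Metropolis acceptance probability for swapping replicas 1 and 2."""
    beta1 = 1.0 / (k_b * t1)
    beta2 = 1.0 / (k_b * t2)
    delta = (beta1 - beta2) * (u1 - u2)
    return 1.0 if delta >= 0.0 else math.exp(delta)

# Hypothetical potential energies (kcal/mol) of replicas at 300 K and 310 K
p_downhill = exchange_probability(-100.0, -105.0, 300.0, 310.0)  # always accepted
p_uphill = exchange_probability(-105.0, -100.0, 300.0, 310.0)    # accepted sometimes
```

A swap that moves the lower-energy configuration to the colder replica is always accepted; the reverse swap is accepted only with a Boltzmann-weighted probability.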

Special case: ‘steered’, ‘biased’, or ‘activated’ molecular dynamics

• Suitable for studying processes that are intrinsically fast but constitute rare events (average frequency ≪ 10¹¹ s⁻¹) because they are limited by one or more energy barriers.

• The addition of external forces reduces the energy barriers and increases the probability of unlikely configurations.

• This approach corresponds closely to micromanipulation through atomic force microscopy or optical tweezers.

Special case: ‘targeted’ molecular dynamics or ‘template forcing’

• An additional term is added to the potential energy function based on the mass-weighted root-mean-square deviation of a set of atoms in the current structure compared to a reference structure.

• At each MD step, the algorithm performs a best-fit of the reference structure to the simulation structure and calculates the rmsd for the selected atoms.

$$E = K \left[ \frac{1}{N} \sum_{pairs}^{N} \left( R_i - R_i^{template} \right)^{2} \right]^{1/2}$$

$$E = \sum_{pairs}^{N} K_i \left( R_i - R_i^{template} \right)^{2}$$

Special case: Free Energy Perturbation and Thermodynamic Integration calculations

• Used to determine the free energy difference between two states which differ in composition.

• A reaction coordinate defines the fundamental “mutational” change involved as a function of λ within the context of an otherwise free molecular dynamics simulation.

• The free energy changes that accumulate as the mutated atoms grow or shrink, i.e. as their parameters are translated from those of the initial state (λ = 0) to those of the final state (λ = 1), are calculated.

Free Energy Perturbation

[Diagram: three alternative paths from State A (λ1) to State B (λN), each passing through intermediate states λ2, λ3, …, λN-1 with free energy increments ∆G1, ∆G2, …, ∆GN-1]

Σ∆Gi (path 1) = Σ∆Gi (path 2) = Σ∆Gi (path 3)

G(λ) = – kT ln ∆(λ)

∆GBA + ∆GAB = 0
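The increment ∆Gi for each λ window is commonly estimated with the exponential-averaging (Zwanzig) formula, ∆G = −kT ln⟨exp(−∆U/kT)⟩, where ∆U is the potential energy difference between neighbouring λ states sampled in the reference state. That formula is assumed here because the slides do not write it out; the energy samples below are invented for illustration only.

```python
import numpy as np

def fep_delta_g(delta_u, kt):
    """Zwanzig estimate: dG = -kT * ln < exp(-dU / kT) > over sampled dU values."""
    x = -np.asarray(delta_u, dtype=float) / kt
    x_max = x.max()  # shift the exponent for numerical stability
    log_mean = x_max + np.log(np.mean(np.exp(x - x_max)))
    return float(-kt * log_mean)

kt = 0.0019872041 * 298.15  # kcal/mol at 298.15 K

# Hypothetical dU samples (kcal/mol) for three consecutive lambda windows
windows = [
    [0.9, 1.1, 1.0, 1.2],
    [0.4, 0.6, 0.5, 0.5],
    [-0.2, 0.0, -0.1, 0.1],
]
window_dg = [fep_delta_g(w, kt) for w in windows]
total_dg = sum(window_dg)  # sum of the dG_i increments along one path
```

Because of the exponential weighting, each estimate is never larger than the plain average of ∆U for that window.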

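Thermodynamic cycles

Cycle 1 (two receptors, one ligand):
RA + L → RAL (∆G1)
RB + L → RBL (∆G2)
RA + L → RB + L (∆G3); RAL → RBL (∆G4)

Cycle 2 (one receptor, two ligands):
R + L1 → RL1 (∆G1)
R + L2 → RL2 (∆G2)
R + L1 → R + L2 (∆G3); RL1 → RL2 (∆G4)

∆∆G = ∆G2 – ∆G1 = ∆G4 – ∆G3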

• The free energy is a thermodynamic state function: as long as the system changes in a reversible way, the free energy change, ∆G, will be independent of the pathway.

• If the non-physical processes are simulated in identical conditions, the errors inherent to this approximation can be cancelled out.
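As a small worked example of the cycle relation ∆∆G = ∆G4 – ∆G3, the sketch below turns two alchemical free energies into a relative binding free energy and the corresponding ratio of binding constants, exp(−∆∆G/RT). The numerical values are invented for illustration and the variable names are hypothetical.

```python
import math

R = 0.0019872041  # gas constant in kcal mol^-1 K^-1

def relative_binding_free_energy(dg_unbound, dg_bound):
    """ddG = dG4 - dG3: mutation in the complex minus mutation free in solution."""
    return dg_bound - dg_unbound

def affinity_ratio(ddg, temperature=298.15):
    """Ratio of binding constants K2/K1 = exp(-ddG / RT)."""
    return math.exp(-ddg / (R * temperature))

# Hypothetical results of the two non-physical (alchemical) simulations, kcal/mol
dg3 = 10.0   # L1 -> L2 free in solution
dg4 = 11.4   # L1 -> L2 bound to the receptor

ddg = relative_binding_free_energy(dg3, dg4)
fold_loss = 1.0 / affinity_ratio(ddg)  # how many times weaker L2 binds than L1
```

With these numbers ∆∆G is +1.4 kcal/mol, which corresponds to roughly a tenfold loss of affinity at room temperature.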

[Figure: Free Energy Perturbation (“Computational Alchemy”) alongside Escher’s Metamorphosis (“Escher’s Magic”). Chemical structures: a brominated compound (IC50 = 6.4 ± 1.1 µM) and an analogue in which one Br is replaced by H (IC50 = 25%)]

de la Fuente et al. J. Med. Chem. 46(24): 5208-5221 (2003)

Special case: Potential of Mean Force calculation

• Used to determine the free energy difference between two states which differ in conformation, rather than in composition, e.g. rotation about a Phe ring, base pair opening in DNA, etc.

• A reaction coordinate defines the fundamental structural changes involved as a function of λ within the context of an otherwise free molecular dynamics simulation.

• Free energy changes that accumulate as the internal constraints are translated from those of the initial state (λ = 0) to those of the final state (λ = 1) are calculated.

Analysis of Molecular Dynamics Simulations

• Coordinates and velocities that are saved from the MD simulation are then used for the analysis.

• Time-dependent properties can be displayed graphically, where one of the axes corresponds to time and the other to the quantity of interest, such as energy, root-mean-square deviation (rmsd), etc.

• Other representations include the time dependence of certain distances (e.g. hydrogen bonds) and angle rotations (dihedrals), as well as correlation functions.

• Average structures can be calculated and compared to experimental structures.

• Molecular dynamics simulations can help visualize and understand conformational changes at the atomic level when combined with molecular graphics programs which can display the structural parameters of interest in a time-dependent way.

Analysis of Molecular Dynamics Simulations

Some quantities that are routinely calculated from a molecular dynamics simulation include:

• mean energy:

$$\langle E \rangle = \frac{1}{N} \sum_{i=1}^{N} E_i$$

• RMS difference between two structures:

$$RMSD = \left\langle \left( r^{\alpha} - r^{\beta} \right)^{2} \right\rangle^{1/2} = \left[ \frac{1}{N} \sum_{i} \left( r_i^{\alpha} - r_i^{\beta} \right)^{2} \right]^{1/2}$$

• RMS fluctuations:

$$RMS_i^{fluct} = \left[ \frac{1}{N_f} \sum_{f} \left( r_i^{f} - r_i^{ave} \right)^{2} \right]^{1/2}$$

• radius of gyration:

$$r_g = \left[ \frac{1}{N} \sum_{i} \left( r_i - r_{cm} \right)^{2} \right]^{1/2}$$

where ri - rcm is the distance between atom i and the center of mass of the molecule.
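These quantities are straightforward to compute from stored coordinates. The NumPy sketch below assumes that the structures have already been superimposed and that a trajectory is held as an array of shape (frames, atoms, 3); the function names and the toy coordinates are illustrative only.

```python
import numpy as np

def rmsd(coords_a, coords_b):
    """RMS difference between two (N, 3) structures, assumed already superimposed."""
    diff = coords_a - coords_b
    return float(np.sqrt(np.mean(np.sum(diff**2, axis=1))))

def rms_fluctuations(trajectory):
    """Per-atom RMS fluctuation about the average structure."""
    average = trajectory.mean(axis=0)
    sq_dev = np.sum((trajectory - average)**2, axis=2)  # shape (frames, atoms)
    return np.sqrt(sq_dev.mean(axis=0))

def radius_of_gyration(coords, masses):
    """RMS distance of the atoms from the center of mass of the molecule."""
    center_of_mass = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - center_of_mass)**2, axis=1)
    return float(np.sqrt(sq_dist.mean()))

# Toy data: two atoms, two frames (coordinates in Å)
frame_a = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
frame_b = frame_a + np.array([3.0, 4.0, 0.0])          # rigid shift by 5 Å
trajectory = np.array([[[-1.0, 0.0, 0.0], [5.0, 0.0, 0.0]],
                       [[ 1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]])
masses = np.array([12.0, 12.0])

shift_rmsd = rmsd(frame_a, frame_b)
fluct = rms_fluctuations(trajectory)
rg = radius_of_gyration(frame_a, masses)
```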

[Figure legends: Cα eEF1A vs initial; Cα eEF1A·DB vs initial; Cα eEF1A vs average; Cα eEF1A·DB vs average. Axis label: energy (kcal mol⁻¹)]

Essential dynamics: a useful method for analyzing trajectories generated by molecular dynamics

• Separating functionally important motions from random thermal fluctuations is a major challenge in analyzing MD trajectories.

• Principal component analysis (PCA) of MD trajectory data, often called essential dynamics (ED), is frequently used to separate large-scale correlated motions from local harmonic fluctuations.

Interactive Essential Dynamics (IED)

http://mccammon.ucsd.edu/software.html

http://mccammon.ucsd.edu/ied/

Essential dynamics analysis

• ED analysis constructs a new orthogonal basis set for the atomic coordinates in a trajectory, such that the greatest variance occurs along the first vector, with monotonically decreasing variance along successive vectors.

• These vectors are often called principal components or eigenvectors, since their derivation involves an eigen decomposition.

• The eigenvalues from the eigen decomposition represent the relative amount of molecular motion that occurs along each eigenvector.

Essential dynamics analysis

• Most of the molecular motion can be described by displacements along the first few eigenvectors.

• A trajectory can be projected onto a subset of selected eigenvectors so only motion along the selected vectors is allowed.

• The most commonly selected subset is the first n eigenvectors such that a given percentage of the molecular motion occurs within the subspace formed by the selected eigenvectors.

• Projection onto these vectors filters out thermal noise, making the functionally interesting motions easier to appreciate.

Essential dynamics analysis

• To perform ED, coordinate data from each timestep are fitted to a reference structure to remove translational and rotational motion.

• The fitted trajectory data are used to construct a covariance matrix C according to:

$$C = \left\langle \left( x - \langle x \rangle \right) \left( x - \langle x \rangle \right)^{T} \right\rangle$$

where <> represents the mean across all timesteps, and the T superscript represents transpose.

• An eigen decomposition (or diagonalization) of the symmetric matrix C is performed to identify Λ, a diagonal matrix of eigenvalues, and T, a matrix of column eigenvectors forming a new orthonormal basis set, satisfying

$$C = T \Lambda T^{T}$$

Essential dynamics analysis

• A zero-mean trajectory matrix, X, can be constructed by subtracting <x> from the coordinate vector for each timestep to form the rows of X.

• The matrix of the projections of each timestep onto each eigenvector, P, is obtained by multiplying the trajectory matrix, X, by T:

$$P = XT$$

• A matrix of filtered trajectory data, F, can be calculated by multiplying a subset of the (column) projection vectors in P by the corresponding subset of the eigenvectors in T^T. This way F contains only motions that occur along the eigenvectors selected from P and T, since motions along other eigenvectors are represented by projections omitted from the calculation of F.
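The ED steps described above (covariance matrix, eigen decomposition, projection and filtering) map directly onto a few NumPy calls. The sketch below assumes the trajectory has already been fitted to a reference structure and flattened to shape (timesteps, 3N). Adding the mean structure back to F is a choice made here so that the filtered data are in the original coordinate frame; the synthetic trajectory is invented for illustration.

```python
import numpy as np

def essential_dynamics(trajectory, n_modes):
    """PCA of a fitted trajectory of shape (timesteps, 3N).

    Returns eigenvalues (descending), eigenvector matrix T (columns),
    projections P = X T, and the trajectory F filtered onto the first n_modes.
    """
    mean = trajectory.mean(axis=0)
    X = trajectory - mean                    # zero-mean trajectory matrix
    C = X.T @ X / X.shape[0]                 # covariance matrix
    eigvals, T = np.linalg.eigh(C)           # C = T diag(eigvals) T^T
    order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
    eigvals, T = eigvals[order], T[:, order]
    P = X @ T                                # projections onto each eigenvector
    F = P[:, :n_modes] @ T[:, :n_modes].T + mean   # filtered trajectory
    return eigvals, T, P, F

# Synthetic "trajectory": one large collective motion plus small thermal noise
rng = np.random.default_rng(1)
direction = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
amplitude = rng.normal(scale=5.0, size=500)
traj = amplitude[:, None] * direction + rng.normal(scale=0.1, size=(500, 6))

eigvals, T, P, F = essential_dynamics(traj, n_modes=1)
fraction_first_mode = eigvals[0] / eigvals.sum()
```

In this toy case almost all of the variance lies along the first eigenvector, which recovers the collective direction built into the data.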

Essential Dynamics results ≈ NORMAL MODE ANALYSIS results

In silico pharmacology

Databases of MD trajectories

http://mmb.pcb.ub.es/MoDEL/

http://mmb.pcb.ub.es/modelLinks/MDGridLink.htm

http://www.dynameomics.org/

Challenges in MD Simulations of Biomolecular Systems

• System size
The computational complexity of most algorithms used to simulate biomolecular systems scales with the number of atoms N, or sometimes N log N.

• Sampling
To obtain statistically robust computational results that are more directly comparable to experimentally measurable quantities, one needs to carry out several independent runs.

• Time/Length scale
If the energy barrier that separates two states is ∆G, the time needed to overcome this barrier is ∝ exp(∆G/kT).
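The exponential dependence on the barrier height is easy to underestimate. As a rough illustration that uses only the proportionality τ ∝ exp(∆G/kT) (no prefactor is assumed, so only ratios of times are meaningful), the sketch below compares two hypothetical barriers at room temperature.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1

def relative_waiting_time(barrier, temperature=298.15):
    """exp(dG / kT): waiting time relative to a barrier-free process."""
    return math.exp(barrier / (K_B * temperature))

# Hypothetical barriers of 5 and 10 kcal/mol
slowdown = relative_waiting_time(10.0) / relative_waiting_time(5.0)

# Extra barrier height that makes a transition ten times slower
tenfold_barrier = K_B * 298.15 * math.log(10.0)
```

Doubling the barrier from 5 to 10 kcal/mol makes the transition thousands of times slower, and each additional ~1.4 kcal/mol costs another factor of ten.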

Enhanced Sampling in MD Simulations via Metadynamics

• Introducing collective variables that describe the relevant degrees of freedom in the free-energy surface (FES).

• Introducing a non-Markovian potential energy term in the collective variable space that discourages visiting regions already visited. This term is chosen in such a way that the real free-energy of the system is estimated for asymptotic times.

Laio A, Parrinello M. Escaping free-energy minima. Proc Natl Acad Sci U S A. 99(20):12562-6 (2002)
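In the usual formulation, assumed here, the history-dependent term is a sum of Gaussian "hills" deposited at the values of the collective variable visited so far. The sketch below evaluates such a bias for one collective variable; the hill height, the width and the visited values are invented for illustration.

```python
import numpy as np

def metadynamics_bias(s, hill_centers, height, width):
    """History-dependent bias: sum of Gaussians centred on previously visited s."""
    centers = np.asarray(hill_centers, dtype=float)
    if centers.size == 0:
        return 0.0
    return float(np.sum(height * np.exp(-(s - centers)**2 / (2.0 * width**2))))

# Hypothetical collective-variable values recorded at three deposition times
hills = [0.0, 0.0, 0.1]

bias_at_minimum = metadynamics_bias(0.0, hills, height=0.5, width=0.2)
bias_far_away = metadynamics_bias(5.0, hills, height=0.5, width=0.2)
```

The bias is largest where the system has already spent time and vanishes in unexplored regions, which is what pushes the simulation out of free-energy minima.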

http://md.chem.rug.nl/cgmartini/index.php/home

• Which molecular dynamics (MD) techniques are available for drug design?

• How is MD used to investigate ligand–macromolecule complexes?

• How are MD studies applied to human and non-human therapeutic targets?

• What are the latest advances in the field of MD?

doi:10.1016/j.drudis.2015.01.003

Molecular Dynamics, believe it or not

Anthony Nicholls

(OpenEye Scientific Software, Cambridge, MA)

• MD is not a useless technique but it’s not held up to the same standards as other techniques, and therefore its true utility is at best unknown,

• MD can accomplish in days what other techniques can achieve in seconds or hours,

• MD can look and feel “real” and seductive,

• Using jargon, movies and the illusion of reality, MD oversells itself to the public and to journals.

Baltimore Lectures on

Molecular Dynamics

and the Wave Theory of Light

William Thomson, 1st Baron

Kelvin

Lecture XI (1884)

In resolution, he plunged himself so deeply in his reading of these books, as he

spent many times in the lecture of them whole days and nights; and in the end,

through his little sleep and much reading, he dried up his brains in such sort as he

lost wholly his judgment. His fantasy was filled with those things that he read, of

enchantments, quarrels, battles, challenges, wounds, wooings, loves, tempests,

and other impossible follies. And these toys did so firmly possess his imagination

with an infallible opinion that all that machina of dreamed inventions which he

read was true, as he accounted no history in the world to be so certain and

sincere as they were.

Don Quixote (1605)

En resolución, él se enfrascó tanto en su lectura, que se le pasaban las noches

leyendo de claro en claro, y los días de turbio en turbio, y así, del poco dormir y

del mucho leer, se le secó el cerebro, de manera que vino a perder el juicio.

Llenósele la fantasía de todo aquello que leía en los libros, así de encantamientos,

como de pendencias, batallas, desafíos, heridas, requiebros, amores, tormentas y

disparates imposibles, y asentósele de tal modo en la imaginación que era verdad

toda aquella máquina de aquellas soñadas invenciones que leía, que para él no

había otra historia más cierta en el mundo.

Don Quijote (1605)

Then... REMEMBER:

Simulations are fiction aspiring to emulate reality.

Pretty pictures and even a few good numbers do not guarantee good science.

Peter J. Steinbach

QUESTIONS WELCOME

e-mail: [email protected]