The Sheffield Solvation Model and Other Stunningly
Simple Ideas to Improve Molecular Modelling
Barry Pickup and Andrew Grant
Football reference:
June 20th 2007 Holland beat England (U21 ECup semi-final)
13-12 aet on penalties.
Why solvent models?
(a) to model the screening of the Coulomb force, which weakens
the interaction between charges
(b) to describe solvation, which opposes the removal of charged
(or polar) groups by burial into the interior of a molecule,
membrane, binding site and so on
What is it about this Coulomb Force anyway?
(a) The Coulomb Force is strong
How strong:
If I were standing at arm's length from Anthony and we each had
one percent more electrons than protons, then the repulsive force
would be enough to lift the weight of:
(i) Bob
(ii) the Empire State Building
(iii) the Earth
(b) The Coulomb Force is long range
the potential varies with distance as 1/r
cf. the attractive part of the van der Waals interaction, which varies as 1/r^6
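A rough order-of-magnitude check of the "How strong" question can be sketched in C. The body mass (70 kg), the separation (0.7 m) and the rule of thumb of one electron per two nucleons are assumptions made only for this illustration. With these numbers the repulsion comes out within a factor of a few of the weight that the Earth's mass would have at surface gravity, which points towards answer (iii).

```c
#include <assert.h>

/* Physical constants in SI units. */
#define COULOMB_K    8.9875517923e9   /* N m^2 C^-2 */
#define ELEM_CHARGE  1.602176634e-19  /* C */
#define NUCLEON_MASS 1.67e-27         /* kg, approximate */
#define EARTH_MASS   5.972e24         /* kg */
#define GRAVITY      9.81             /* m s^-2 */

/* Approximate number of electrons in a body of the given mass,
 * assuming about one electron per two nucleons (an assumption). */
double electron_count(double mass_kg)
{
    assert(mass_kg > 0.0);
    return 0.5 * mass_kg / NUCLEON_MASS;
}

/* Coulomb repulsion (N) between two equal bodies that each carry
 * the given fractional excess of electrons. */
double excess_charge_force(double mass_kg, double excess_fraction,
                           double separation_m)
{
    assert(separation_m > 0.0);
    double q = excess_fraction * electron_count(mass_kg) * ELEM_CHARGE;
    return COULOMB_K * q * q / (separation_m * separation_m);
}

/* Weight (N) that the Earth's mass would have at surface gravity. */
double earth_weight(void)
{
    return EARTH_MASS * GRAVITY;
}
```

Calling `excess_charge_force(70.0, 0.01, 0.7)` and dividing by `earth_weight()` gives the ratio discussed above.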
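[Figure: thermodynamic cycle between vacuum (P = 0) and water (P ≠ 0), with steps labelled ΔG_1, ΔG_np, ΔG_2 and ΔG_solv]
$$\Delta G_{solv} = \Delta G_{es} + \Delta G_{np}$$
$$\Delta G_{es} = \Delta G_1 + \Delta G_2$$
The Sheffield Model is concerned with the calculation of $\Delta G_{es}$.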
The Sheffield Model is a continuum model
Continuum Solvation models capture average or “continuum”
behaviour of water. These theories favour information about
the role of water on, say, proteins or small molecules, rather
than information about structural features of water acting as a
solvent.
Water is treated as a (linearly) polarizable dielectric.
The relatively high dielectric constant of water (ε = 80) means that it is
much more easily polarized than most solutes (proteins, small
molecules etc.).
From a long view of the history of
mankind – seen from, say, ten thousand
years from now – there can be little
doubt that the most significant event of
the 19th century will be judged as
Maxwell’s discovery of the laws of
electrodynamics. The American civil
war will pale into provincial
insignificance in comparison with this
important scientific event of the same
decade.
Continuum Models are
all based upon the
equations of Maxwell.
$$\nabla \cdot \mathbf{D}(\mathbf{r}) = \rho(\mathbf{r})$$
The Protagonists
Born (1920), Martin (1929), Bell (1931) (charge/dipole in a
spherical cavity)
Kirkwood (1934) (off-centred charges in spherical cavities)
Onsager (1935), Böttcher (1938) (polarizable dipole)
Debye-Hückel (1923) (theory of electrolytes)
Scholte (1949) (dipole in ellipsoidal cavity)
Tanford and Kirkwood (1957) (protein electrostatic models in the
days when proteins were spheres)
Classical electrostatics has also proved to be a successful
quantitative tool, yielding accurate descriptions of:
• electrical potentials,
• diffusion-limited processes,
• pH-dependent properties of proteins,
• ionic strength-dependent phenomena,
• the solvation free energies of organic molecules.
Poisson Boltzmann Equation
(NB: not really atom based – dependency is on the spatial variation of the dielectric boundary)
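For reference, one standard way of writing the linearized Poisson-Boltzmann equation is given below. It is written in SI units with a relative dielectric function ε(r), so that it reduces to ∇·D = ρ when there are no mobile ions. The symbols φ (electrostatic potential) and κ (inverse Debye screening length, zero inside the solute) are notation assumed here, not taken from the slides.

```latex
\nabla \cdot \left[ \varepsilon(\mathbf{r}) \, \nabla \phi(\mathbf{r}) \right]
  - \varepsilon(\mathbf{r}) \, \kappa^2(\mathbf{r}) \, \phi(\mathbf{r})
  = -\frac{\rho(\mathbf{r})}{\varepsilon_0}
```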
So what is wrong with the Poisson Boltzmann Equation?
Well, nothing much – it is the correct physical theory, but...
• Implementations are difficult (they require multigrid FD solvers etc.)
• Gradients are difficult to compute (especially high order
gradients)
• Computationally expensive relative to most classical modelling
approaches
• Difficult to combine with conventional modelling approaches
The alternative: Design bad atom-based solvation models
Take as starting point the fact that linear models can be written as
$$E^{solv} = \sum_I E^{self}_I + \sum_{I<J} E^{pair}_{IJ}$$
$E^{self}_I$ is the interaction of charged
atom(I) with the polarization it
produces in the dielectric
$E^{pair}_{IJ}$ is the interaction of charged
atom(I) with the polarization
produced in the dielectric by
charged atom(J)
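How does this help?
$$f_\varepsilon = \left(\frac{1}{\varepsilon_{in}} - \frac{1}{\varepsilon_{out}}\right)$$
Born energy of an ion of radius $\rho_I$:
$$E^{Born}_I = -\frac{f_\varepsilon Q_I^2}{2\,(4\pi\varepsilon_0)\,\rho_I}$$
The self energy of atom I, written in the same form, defines its Born radius $\rho^{Born}_I$:
$$E^{self}_I = -\frac{f_\varepsilon Q_I^2}{2\,(4\pi\varepsilon_0)\,\rho^{Born}_I}$$
$$E^{Born}_I \rightarrow E^{self}_I, \qquad \rho_I \rightarrow \rho^{Born}_I$$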
So far no Maxwell Equation has been hurt in the making of
this atom-based equation
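As a small numerical illustration of the Born expression, the sketch below evaluates the self energy of a single charged sphere. It assumes the sign convention E = -f Q^2 / (2 (4 pi eps0) rho) with f = 1/eps_in - 1/eps_out, and it works in common modelling units (charges in e, radii in Angstrom, energies in kcal/mol), where 1/(4 pi eps0) is about 332.06. The function names are hypothetical.

```c
#include <assert.h>

/* 1/(4 pi eps0) in kcal mol^-1 Angstrom e^-2. */
#define COULOMB_KCAL 332.0637

/* Dielectric prefactor f = 1/eps_in - 1/eps_out (positive when eps_out > eps_in). */
double dielectric_factor(double eps_in, double eps_out)
{
    assert(eps_in > 0.0 && eps_out > 0.0);
    return 1.0 / eps_in - 1.0 / eps_out;
}

/* Born self energy (kcal/mol) of a charge (in e) centred in a sphere
 * of the given radius (in Angstrom). */
double born_self_energy(double charge, double radius,
                        double eps_in, double eps_out)
{
    assert(radius > 0.0);
    return -dielectric_factor(eps_in, eps_out) * COULOMB_KCAL
           * charge * charge / (2.0 * radius);
}
```

For a unit charge in a 2 Angstrom sphere with eps_in = 1 and eps_out = 80, `born_self_energy(1.0, 2.0, 1.0, 80.0)` is about -82 kcal/mol, which shows how large these self terms are compared with typical binding energies.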
But now comes GB theory
Invoke the “Coulomb Field” approximation
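$$\mathbf{D}(\mathbf{r}_1) = \frac{Q_I}{4\pi}\,\frac{\hat{\mathbf{r}}_{1I}}{r_{1I}^2}$$
(here $\mathbf{r}_{ab} = \mathbf{r}_a - \mathbf{r}_b$, and $\hat{\mathbf{r}}_{ab}$ is the corresponding unit vector)
which simplifies the potential produced as a result of dielectric polarization
$$\phi^{pol}(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int d\mathbf{r}_2\,\left(\frac{1}{\varepsilon_{in}} - \frac{1}{\varepsilon(\mathbf{r}_2)}\right)\frac{\mathbf{D}(\mathbf{r}_2)\cdot\hat{\mathbf{r}}_{12}}{r_{12}^2}$$
to:
$$\phi^{pol}_{I,GB}(\mathbf{r}_1) = \frac{Q_I}{4\pi\varepsilon_0}\,\frac{1}{4\pi}\int d\mathbf{r}_2\, f_\varepsilon(\mathbf{r}_2)\,\frac{\hat{\mathbf{r}}_{12}\cdot\hat{\mathbf{r}}_{2I}}{r_{12}^2\, r_{2I}^2}$$
NB:
$$f_\varepsilon(\mathbf{r}_1) = \left(\frac{1}{\varepsilon_{in}} - \frac{1}{\varepsilon(\mathbf{r}_1)}\right)$$
NB:
$$E^{self}_I = -\frac{Q_I^2}{8\pi\varepsilon_0}\,\frac{1}{4\pi}\int d\mathbf{r}_1\,\frac{f_\varepsilon(\mathbf{r}_1)}{r_{1I}^4}$$
$$E^{self}_I = -\frac{f_\varepsilon Q_I^2}{2\,(4\pi\varepsilon_0)\,\rho^{Born}_I}$$
$$\frac{f_\varepsilon}{\rho^{Born}_I} = \frac{1}{4\pi}\int d\mathbf{r}_1\,\frac{f_\varepsilon(\mathbf{r}_1)}{r_{1I}^4}$$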
GB theory is the computation of $\rho^{Born}_I$,
augmented by a simple ansatz to account for the more complex pair
terms
$$E^{pair}_{IJ} = -\frac{f_\varepsilon Q_I Q_J}{4\pi\varepsilon_0\,\sqrt{R_{IJ}^2 + \rho^{B}_I\rho^{B}_J\, e^{-R_{IJ}^2/(c\,\rho^{B}_I\rho^{B}_J)}}}$$
Computation of Born radii is complicated because of the arbitrary
shape of solvent regions displaced by atoms.
Stunningly Simple idea (number 1)
Construct a dielectric function based on atoms modelled
as Gaussian functions.
Stunningly Simple idea (number 2)
Also use a Gaussian “masking” function to
remove Coulombic singularities (from the expressions
for Born radii)
Subsequent Gaussian “mathematics” leads to analytic formulae
for Born radii, with smooth gradients to all orders. This is the
GGB model, which contains only a single adjustable parameter,
kappa, (associated with the range of the masking function).
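What is so good about these Born radii anyhow?
$$E^{pair}_{IJ} = -\frac{f_\varepsilon Q_I Q_J}{4\pi\varepsilon_0\,\sqrt{R_{IJ}^2 + \rho^{B}_I\rho^{B}_J\, e^{-R_{IJ}^2/(c\,\rho^{B}_I\rho^{B}_J)}}}$$
or
Another Stunningly Simple idea – dispense with Born Radii:
Introducing the “Sheffield Solvation Model”.
$$E^{solv}_{Sheff} = -\frac{f_\varepsilon}{2\,(4\pi\varepsilon_0)}\sum_{I,J}\frac{Q_I Q_J}{\sqrt{b\,R_{IJ}^2 + a\,\rho_I\rho_J}}$$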
‘a’ acts to globally scale the atomic radii (i.e. it models an average
local environment for a given type of atom characterised by a
radius of ρ).
‘b’ controls the rate of switching of the pair term between long- and
short-range behaviour.
Does the Sheffield Model actually work?
[Figure: structure drawing of an example molecule]
$G^{elec}_{solv}$ (kcal mol$^{-1}$):
PB (6gpa): -18.61
Sheffield (a = 1.553149, b = 0.735694): -18.63
GGB (kappa = 0.057768): -18.48
Obtaining ‘a’ and ‘b’?
Minimize a suitable “fitting” function, e.g.
$$F(a,b) = \sum_{k}^{N_{mols}}\left(E^{PB}_k - E^{Shef}_k(a,b)\right)^2$$
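The fitting function is a plain sum of squared differences, which can be sketched as follows. The sketch assumes that the PB and Sheffield energies for each molecule have already been computed for a given (a, b) and stored in two arrays; the function name and array layout are hypothetical, and the minimizer that varies a and b is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Sum over molecules of squared differences between the PB energy
 * and the Sheffield energy evaluated at the current (a, b). */
double fit_objective(const double *e_pb, const double *e_sheff,
                     size_t n_mols)
{
    assert(e_pb != NULL && e_sheff != NULL);
    double f = 0.0;
    for (size_t k = 0; k < n_mols; k++) {
        double diff = e_pb[k] - e_sheff[k];
        f += diff * diff;
    }
    return f;
}
```

An outer optimizer would recompute the Sheffield energies for each trial (a, b) and call this function to score the trial.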
For example:
we considered 64 000 neutral and 387 charged molecules taken from the MDDR.
Structures built with Corina
radii and charges assigned based on the MMFF force field
F was minimized for ten different subsets of 10 000 randomly chosen molecules
Results
a = 1.553149 (0.0146)
b = 0.735694 (0.0108)
Comparison of solvation energies computed using the
Sheffield Solvation Model and the PB equation (for the MDDR set).
Percentage differences between solvation energies computed using
atom-based models and Poisson Boltzmann.
Calculation of Molecular Solution Entropies.
The solution entropy contains translational, rotational, and vibrational
contributions. Solvent effects were included in the vibrational term by
computing second derivatives of the molecular solvation energy. Normal mode
analysis yields harmonic force constants (from which $S^{vib}_{solv}$ is computed).
Conformational energies
An outlier:
Oh dear
$$E_{PB} = -10.38\ \mathrm{kcal\,mol^{-1}}$$
$$E_{Sheff}(a = 1.553149,\ b = 0.735694) = -11.74\ \mathrm{kcal\,mol^{-1}}$$
Optimizing a/b for this molecule
$$E_{Sheff}(a = 1.892990,\ b = 0.716365) = -10.38\ \mathrm{kcal\,mol^{-1}}$$
What about proteins ?
Globally it is fine for solvation energies, as the graph below
(on a crude energy scale) shows.
However, a/b probably need a dependence on, say, solvent
accessibility or depth from the surface for a more precise description
of protein solvation.
What about Experiment?
• Data Set of 200 polar molecules
(Courtesy of Rob Rizzo, UCSF)
ZAP using:               GAST    MMFF    AM1BCC    AM1BCC_U
RMS vs expt (kcal/mol):  3.62    2.68    1.55      1.31
With AM1BCC_U (RMS vs expt, kcal/mol):
ZAP:        1.31
GB:         1.491
Sheffield:  1.495
Summary
The Sheffield Solvation Model is an exceptionally simple formula
even Barry can program it:
/* dist2(i,j): squared distance between atoms i and j (assumed helper) */
for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++)  /* full double sum, self terms (i == j) included */
        Solvation += -charges[j]*charges[i] / sqrt(A*dist2(i,j) + B*radii[j]*radii[i]);
}
Solvation *= 0.50;  /* pairs are visited twice; self terms keep the Born factor 1/2 */
it is a local model of solvation (not designed to predict individual pair
or self energies).
its physics content is low, but the model contains a minimal number of
parameters and delivers relatively high accuracy.
it could be considered a baseline against which other, more complex
models can be judged.