Theory of binless multi-state free energy estimation with applications to protein-ligand binding Zhiqiang Tan, Emilio Gallicchio, Mauro Lapelosa, and Ronald M. Levy Citation: J. Chem. Phys. 136, 144102 (2012); doi: 10.1063/1.3701175 View online: http://dx.doi.org/10.1063/1.3701175 View Table of Contents: http://jcp.aip.org/resource/1/JCPSA6/v136/i14 Published by the American Institute of Physics. Additional information on J. Chem. Phys. Journal Homepage: http://jcp.aip.org/ Journal Information: http://jcp.aip.org/about/about_the_journal Top downloads: http://jcp.aip.org/features/most_downloaded Information for Authors: http://jcp.aip.org/authors Downloaded 06 May 2012 to 192.12.88.130. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions


THE JOURNAL OF CHEMICAL PHYSICS 136, 144102 (2012)

Theory of binless multi-state free energy estimation with applications to protein-ligand binding

Zhiqiang Tan,1,a) Emilio Gallicchio,2,a) Mauro Lapelosa,2,b) and Ronald M. Levy2

1Department of Statistics, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, USA

2BioMaPS Institute for Quantitative Biology and Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, USA

(Received 22 December 2011; accepted 21 March 2012; published online 9 April 2012)

The weighted histogram analysis method (WHAM) is routinely used for computing free energies and expectations from multiple ensembles. Existing derivations of WHAM require observations to be discretized into a finite number of bins. Yet, WHAM formulas seem to hold even if the bin sizes are made arbitrarily small. The purpose of this article is to demonstrate both the validity and value of the multi-state Bennett acceptance ratio (MBAR) method seen as a binless extension of WHAM. We discuss two statistical arguments to derive the MBAR equations, in parallel to the self-consistency and maximum likelihood derivations already known for WHAM. We show that the binless method, like WHAM, can be used not only to estimate free energies and equilibrium expectations, but also to estimate equilibrium distributions. We also provide a number of useful results from the statistical literature, including the determination of MBAR estimators by minimization of a convex function. This leads to an approach to the computation of MBAR free energies by optimization algorithms, which can be more effective than existing algorithms. The advantages of MBAR are illustrated numerically for the calculation of absolute protein-ligand binding free energies by alchemical transformations with and without soft-core potentials. We show that binless statistical analysis can accurately treat sparsely distributed interaction energy samples as obtained from unmodified interaction potentials that cannot be properly analyzed using standard binning methods. This suggests that binless multi-state analysis of binding free energy simulations with unmodified potentials offers a straightforward alternative to the use of soft-core potentials for these alchemical transformations. © 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.3701175]

I. INTRODUCTION

The weighted histogram analysis method (WHAM) (Ref. 1) has emerged as an effective, general method for computing free energies and expectations from multiple ensembles, for example, at different temperatures or with different biasing potentials.2,3 There are a variety of ways to derive and understand WHAM, including the self-consistency approach1,4 and the maximum likelihood approach.2,5,6 However, all existing derivations in the computational physics literature involve discretizing observations into a finite number of bins in order to construct proper histograms. On the other hand, it has been recognized that WHAM formulas remain mathematically defined even if the bin sizes are made arbitrarily small or, equivalently, if the actual data instead of their discretizations are used (e.g., Sec. 8.3.2 of Ref. 4). However, no formal account exists in the chemical physics literature of whether and under what conditions such a binless extension is valid.

At the same time, there have been extensive developments in the mathematical and statistical fields of theory and methods leading to essentially the binless extension of WHAM.7–10 Shirts and Chodera11 presented the binless method as the result of making the optimal choice among a large class of estimators,10 and called it the multi-state Bennett acceptance ratio method (MBAR) because the method reduces to the optimal Bennett acceptance ratio (BAR) (Refs. 12 and 13) in the case of only two ensembles. In this article, we discuss two statistical arguments to derive MBAR equations, in parallel to the self-consistency and maximum likelihood derivations already known for WHAM. Disseminating these concepts to the chemical physics community is helpful to better appreciate the theoretical foundations of the method and to highlight the connections between MBAR and WHAM, building on the established familiarity and expertise of practitioners with the latter.

a)Authors to whom correspondence should be addressed. Electronic addresses: [email protected] and [email protected].

b)Present address: Department of Chemical and Biological Engineering, Drexel University, Philadelphia, Pennsylvania 19104, USA.

To understand the binless formulation of WHAM from a theoretical perspective, an important quantity to consider is the measure of states, a non-negative measure from which the density of states is defined as the (Radon-Nikodym) derivative with respect to the counting or Lebesgue measure.14 From this perspective, the validity of MBAR as binless WHAM can be seen as follows. The measure of states can be consistently estimated in the sense that integrals of the density of states can be estimated with standard errors inversely proportional to the square root of the sample size, even though the density of states, in general, cannot be pointwise estimated at the usual rate of standard errors. Examples of integrals include the partition function or the probability that the value of a system observable falls into a given bin.

0021-9606/2012/136(14)/144102/14/$30.00 © 2012 American Institute of Physics

We also provide a number of analytically and computationally useful results on MBAR from the statistical literature. The maximum likelihood derivation shows that the MBAR estimators can be obtained by minimizing a convex objective function, equivalent to solving a system of self-consistent equations. Various fast and reliable numerical algorithms have been developed for such optimization problems. For example, the trust region algorithm is globally convergent at the second order.15 Computing MBAR estimates by these optimization algorithms can be more effective than by algorithms in current use;11 relevant comparisons have recently been reported in the context of solving WHAM equations.6

Statistical large-sample theory gives not only conditions under which the MBAR estimates are consistent and asymptotically normal but also formulas for asymptotic variance matrices, as the sample size grows to infinity. Although the theory can be applied to correlated data,7 the variance formulas are much simplified if the observations from each ensemble are independent.10,16 These formulas can be used for variance estimation provided that observations are subsampled to be approximately independent. Alternatively, as also done here, block bootstrapping17 can be used to estimate statistical uncertainties taking into account data correlations.
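The block bootstrap mentioned above can be illustrated with a minimal sketch. It is not the procedure used in this article: the function name `block_bootstrap_se`, the moving-block resampling scheme, the default of 200 replicates, and the fixed seed are all assumptions made for illustration, and the estimator is a plain mean rather than an MBAR estimate.

```python
import random
import statistics
from typing import Callable, List, Sequence


def block_bootstrap_se(
    series: Sequence[float],
    block_len: int,
    n_boot: int = 200,
    estimator: Callable[[List[float]], float] = statistics.fmean,
    seed: int = 0,
) -> float:
    """Moving-block bootstrap standard error of `estimator` on a time series.

    Contiguous blocks of length `block_len` are drawn with replacement and
    concatenated, so correlations shorter than a block are preserved.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    rng = random.Random(seed)
    starts = range(n - block_len + 1)      # admissible block start indices
    n_blocks = -(-n // block_len)          # ceil(n / block_len)
    replicates = []
    for _ in range(n_boot):
        sample: List[float] = []
        for _ in range(n_blocks):
            s = rng.choice(starts)
            sample.extend(series[s:s + block_len])
        # Trim to the original length before applying the estimator.
        replicates.append(estimator(sample[:n]))
    return statistics.stdev(replicates)
```

In a multi-state analysis one would presumably resample blocks within each simulation separately and recompute the free energy estimates on every replicate; the block length should exceed the correlation time of the data.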

We illustrate the advantages of MBAR, based on the sampled values directly without binning, over conventional WHAM, with binning, on the calculation of absolute protein-ligand binding free energies by alchemical transformations. These calculations take various forms18 but they all consist of collecting samples from simulations distributed along a suitable thermodynamic path connecting the coupled and uncoupled states of the ligand-receptor complex. The path is parameterized by a progress parameter $\lambda$ whereby, for example, $\lambda = 0$ corresponds to the uncoupled state and $\lambda = 1$ to the coupled state. The progress parameter $\lambda$, in turn, dials the parameters of a hybrid potential in such a way that at $\lambda = 1$ it represents the bound complex and at $\lambda = 0$ the ligand and receptor are not interacting.19–23

In typical applications, the binding free energy is computed from the free energy differences between neighboring $\lambda$-states, using only data collected at these states, with the pairwise exponential or the more accurate BAR free energy estimators.13,24,25 These and analogous binding free energy estimators are notoriously affected by end point numerical instabilities near $\lambda = 0$, when the ligand and the receptor are nearly uncoupled. Under these conditions, conformations are generated in which receptor and ligand atoms interpenetrate each other, yielding very large interaction energies. These cause instabilities which are difficult to overcome unless specialized soft-core potentials are employed.22,26–28

Multi-state free energy estimation methods such as WHAM and MBAR (Refs. 3 and 11) are beginning to be employed in binding free energy calculations. The general idea behind these methods is to efficiently extract information from all of the intermediate states so as to achieve binding free energy estimates with smaller statistical variance. One example in this class of methods is the binding energy distribution analysis method (BEDAM),29,30 which is employed here. The method is based on the analysis of samples of the binding energy of the complex (defined as the change in the effective potential energy of the complex with implicit solvation for bringing receptor and ligand from infinite separation to the bound conformation) without internal conformational rearrangements. In BEDAM, the end point problem with unmodified potentials is manifested by the occurrence near $\lambda = 0$ of large binding energy values spread over an extremely wide range, which, as we will show, makes the application of binning-based methods such as WHAM infeasible. Binless methods such as MBAR do not suffer from the same issues and are shown to be able to treat data sets of this kind. This observation opens the possibility that using binless multi-state inference methods such as MBAR in conjunction with standard functional forms for the interaction potentials could be as effective as using modified soft-core potentials to circumvent the end point problem of binding free energy calculations.

II. THEORY AND METHODS

A. Setup

Consider a generalized ensemble whose Boltzmann probability density function is

$$ \frac{1}{Z_\theta}\, e^{-\theta^T u(x)}, \tag{1} $$

where $u$ is a column vector of $d$ generalized energy functions of the configuration $x$ of the system, $\theta$ is a column vector, also of length $d$, of corresponding coefficients, and

$$ Z_\theta = \int e^{-\theta^T u(x)} \, dx \tag{2} $$

is the generalized configurational partition function in physics or the normalizing constant in statistics. Throughout, a superscript $T$ denotes transpose so that for two vectors $a$ and $b$ each of length $d$,

$$ a^T b = \sum_{k=1}^{d} a_k b_k, \tag{3} $$

where $a_k$ and $b_k$ are vector elements, gives the inner product of $a$ and $b$.

The foregoing notation is suitable to accommodate various applications. For example, the canonical ensemble at inverse temperature $\beta = 1/k_B T$ and potential energy function $U(x)$ is recovered by setting $d = 1$, $\theta = \beta$, and $u(x) = U(x)$ in Eq. (3). Similarly, the isothermal grand-canonical ensemble for a neat substance is recovered with $d = 2$, $\theta = (\beta, \beta\mu)$, and $u(x) = (U(x), N)$, where $\mu$ is the chemical potential and $N$ the number of particles, so that $\theta^T u(x) = \beta(U(x) + \mu N)$. (Note that in this case the system configuration $x$ includes atomic coordinates as well as the number of particles $N$, and Eq. (2) includes a summation over $N$.) A variety of ensembles commonly used in molecular simulations can also be accommodated by this notation. For example, each replica of a temperature replica exchange simulation is a canonical ensemble at the corresponding temperature as described above.


Free energy perturbation and “umbrella sampling” setups are obtained by setting the potential energy vector as $u(x) = (U_0(x), \omega_1(x), \ldots, \omega_d(x))$, where $U_0(x)$ is the reference potential and $\omega_k(x)$ is the perturbation or umbrella potential in window $k$, and by setting the coefficient vector in window $k$ as $\theta_k = (\beta, 0, \ldots, 0, \beta, 0, \ldots, 0)$, in which all elements are zero except for the first (corresponding to the reference potential $U_0$) and the $(k + 1)$th element, corresponding to the perturbation potential $\omega_k(x)$. For the binding free energy application illustrated below, we adopt the latter setup but with a simplified notation afforded by the particular linear form, $\omega_k(x) = \lambda_k b(x)$, of the perturbation (see Sec. III).

The notation introduced above is also useful to obtain compact expressions for thermodynamic observables. For example, the distribution and expectation of some observable $c(x)$ under Eq. (1) can be obtained in compact form (see, for example, Eq. (22)) by formally including $c(x)$ as a component of the generalized energy vector $u(x)$ with the corresponding coefficient in $\theta$ set to zero, so as to leave the physical system energy $\theta^T u(x)$ unchanged. In the following, we will implicitly assume that the generalized energy vector $u(x)$ includes components related to system observables.

Assume that simulations are conducted at $m$ coefficient vectors $\theta_j$ ($j = 1, \ldots, m$) and with the same energy vector $u(x)$. (Note that in this notation the dimensionality, $d$, of the $\theta$ and $u$ vectors and the number of simulations, $m$, are, in general, distinct; for example, for temperature replica exchange $d = 1$ while $m$ is the number of replicas.) Denote by $\{x_{ji}: i = 1, \ldots, n_j\}$ the set of configurations of size $n_j$ obtained from the $j$th simulation, and denote by $u_{ji} = u(x_{ji})$ the corresponding generalized energy vectors, which, as discussed above, may also include system observables. The total sample size is $n = \sum_{j=1}^{m} n_j$. Typically, the low-dimensional vectors $u_{ji}$ are stored, instead of the high-dimensional, full configurations $x_{ji}$. For example, in the case of free energy perturbation calculations, $u_{ji} = (U_0(x_{ji}), \omega_1(x_{ji}), \ldots, \omega_d(x_{ji}))$ contains the value of the perturbation potential, $\omega_j(x_{ji})$, corresponding to the same window as the observed conformation, $x_{ji}$, as well as values of the perturbation potential, $\omega_k(x_{ji})$, $k \neq j$, for all other windows for the same conformation. This specification of $u(x)$ well captures the type of data manipulations needed in multi-state inference methods such as WHAM and, as will be seen, the binless extension of WHAM.
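The bookkeeping for the free energy perturbation setup can be made concrete with a small sketch. The helper names `theta_for_window`, `energy_vector`, and `inner`, as well as the numerical values, are hypothetical; only the layout of $\theta_k$ and $u(x)$ follows the text.

```python
from typing import List, Sequence


def theta_for_window(k: int, d: int, beta: float) -> List[float]:
    """Coefficient vector theta_k = (beta, 0, ..., 0, beta, 0, ..., 0).

    Index 0 multiplies the reference potential U0; index k (window number,
    1 <= k <= d) is the (k + 1)th element and multiplies omega_k.
    """
    if not 1 <= k <= d:
        raise ValueError("window index k must satisfy 1 <= k <= d")
    theta = [0.0] * (d + 1)
    theta[0] = beta
    theta[k] = beta
    return theta


def energy_vector(u0: float, omegas: Sequence[float]) -> List[float]:
    """Generalized energy vector u(x) = (U0(x), omega_1(x), ..., omega_d(x))."""
    return [u0, *omegas]


def inner(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product a^T b of Eq. (3)."""
    return sum(ak * bk for ak, bk in zip(a, b))


# Hypothetical numbers: one stored configuration evaluated in d = 3 windows.
beta = 2.0
u = energy_vector(-10.0, [0.0, 1.5, 3.0])
theta_2 = theta_for_window(2, d=3, beta=beta)
reduced_energy = inner(theta_2, u)   # equals beta * (U0 + omega_2)
```

Storing one such vector per saved configuration is what allows every observation to be re-evaluated under every window, which is the data layout the multi-state estimators below rely on.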

Under Eq. (1), the induced probability density function of $u(x)$ at $\theta$ is of the form

$$ \frac{1}{Z_\theta}\, \Omega(u)\, e^{-\theta^T u}, \tag{4} $$

where $\Omega(u)$, formally defined as

$$ \Omega(u) = \int \delta(u(x) - u) \, dx, \tag{5} $$

is a generalized density of states, which does not depend on $\theta$. The partition function $Z_\theta$ can also be determined from $\Omega(u)$ as

$$ Z_\theta = \int \Omega(u)\, e^{-\theta^T u} \, du. \tag{6} $$

The density function (1) and relationship (2) are replaced by Eqs. (4) and (6), respectively, when the data are reduced from $x_{ji}$ to $u_{ji}$ ($i = 1, \ldots, n_j$; $j = 1, \ldots, m$).

B. From WHAM to binless WHAM

The WHAM, first proposed by Ferrenberg and Swendsen,1 can be used to compute various quantities of interest. The method involves constructing a histogram, $N_j(u)$, from each sample $\{u_{ji}: i = 1, \ldots, n_j\}$, where $N_j(u)$ indicates the number of observations falling into a bin about $u$, for example, an interval or a rectangle if $u(x)$ is one- or two-dimensional. Then $\Omega(u)$ is estimated by

$$ \hat\Omega(u)\, \Delta u = \frac{\sum_{r=1}^{m} N_r(u)}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{-\theta_r^T u}}, \tag{7} $$

where the partition function estimators $(\hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m})$ are defined by self-consistency according to Eq. (6),

$$ \hat Z_{\theta_k} = \sum_{u} \frac{\sum_{r=1}^{m} N_r(u)}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{(\theta_k - \theta_r)^T u}} \quad (k = 1, \ldots, m), \tag{8} $$

where the summation $\sum_u$ is taken over all possible bins centered at $u$ of size $\Delta u$. The estimators $(\hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m})$ are determined up to a multiplicative constant. It is customary to pick a reference value, for example, $\hat Z_{\theta_1}$, and then estimate the ratios $(\hat Z_{\theta_2}/\hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m}/\hat Z_{\theta_1})$ from Eq. (8).

Again by relationship (6), the partition function $Z_\theta$ at any other parameter value is estimated by

$$ \hat Z_\theta = \sum_{u} \frac{\sum_{r=1}^{m} N_r(u)}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^T u}}. \tag{9} $$

Furthermore, let $h(u)$ be a function of $u$, for example, a component of $u$, and denote by $\langle h \rangle_\theta$ the expectation of $h(u)$ under Eq. (4), that is, the expectation of $h(u(x))$ under Eq. (1). From Eqs. (4) and (7), the WHAM estimate $\hat h_\theta$ for $\langle h \rangle_\theta$ is

$$ \hat h_\theta = \frac{1}{\hat Z_\theta} \sum_{u} h(u)\, \frac{\sum_{r=1}^{m} N_r(u)}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^T u}}. \tag{10} $$

This estimator depends on $(\hat Z_\theta, \hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m})$ up to a multiplicative constant, that is, only depends on the ratios $(\hat Z_\theta/\hat Z_{\theta_1}, \hat Z_{\theta_2}/\hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m}/\hat Z_{\theta_1})$. It is interesting to note that the summation over bins in Eq. (10) can be equivalently expressed in terms of a weighted average over observations

$$ \hat h_\theta = \sum_{j,i} h(u^b_{ji})\, F_{ji}(\theta), \tag{11} $$

where $u^b_{ji}$ is a representative generalized energy of the bin containing $u_{ji}$, and $F_{ji}$ is the “WHAM weight” of $u_{ji}$ that, by comparing Eqs. (10) and (11), is defined as

$$ F_{ji}(\theta) = \frac{\hat Z_\theta^{-1}}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^T u^b_{ji}}} = \frac{1}{\hat Z_\theta}\, e^{-\theta^T u^b_{ji}}\, G_{ji}, \tag{12} $$

where

$$ G_{ji} = \frac{1}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{-\theta_r^T u^b_{ji}}} \tag{13} $$

is the $\theta$-independent component of the WHAM weight $F_{ji}(\theta)$ for each observation.
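The binned self-consistency equations (7) and (8) can be iterated directly. The following is a minimal sketch for a scalar energy ($d = 1$), written in log space to avoid overflow; the function names, the fixed iteration count, and the idealized counts in the usage lines are assumptions for illustration, not the implementation used in this work.

```python
import math
from typing import List, Sequence


def logsumexp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def wham_log_z(
    bin_centers: Sequence[float],
    pooled_counts: Sequence[int],
    n: Sequence[int],
    thetas: Sequence[float],
    n_iter: int = 200,
) -> List[float]:
    """Fixed-point iteration of Eq. (8) for scalar u.

    bin_centers[b]   representative energy u of bin b
    pooled_counts[b] sum over states r of the histogram counts N_r(u)
    n[r], thetas[r]  sample size and coefficient of state r
    Returns log Z of each state relative to state 0 (log Z_0 = 0).
    """
    m = len(thetas)
    log_z = [0.0] * m
    occupied = [(u, c) for u, c in zip(bin_centers, pooled_counts) if c > 0]
    for _ in range(n_iter):
        # Log of the density-of-states estimate of Eq. (7), bin by bin.
        log_omega = [
            math.log(c)
            - logsumexp([math.log(n[r]) - log_z[r] - thetas[r] * u for r in range(m)])
            for u, c in occupied
        ]
        new = [
            logsumexp([lo - thetas[k] * u for lo, (u, _) in zip(log_omega, occupied)])
            for k in range(m)
        ]
        log_z = [v - new[0] for v in new]  # remove the arbitrary constant
    return log_z


# Idealized counts: three bins with a flat density of states, sampled at
# theta = 0 (3 observations) and theta = log 2 (7 observations).
example_log_z = wham_log_z([0.0, 1.0, 2.0], [5, 3, 2], [3, 7], [0.0, math.log(2.0)])
```

For these idealized counts the exact partition functions are $Z_{\theta_1} = 3$ and $Z_{\theta_2} = 7/4$, so the iteration should return a log ratio of $\log(7/12)$.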

Equation (11) states that the expectation value of any observable can be obtained by attaching a statistical weight $F_{ji}(\theta)$ to each observation $u_{ji}$, which depends on the bin to which it is assigned. An obvious simplification is to express the WHAM estimate of $\langle h \rangle_\theta$ and the WHAM weights [Eqs. (11) and (12)] in terms of the actual observations $u_{ji}$ rather than their closest bin representatives $u^b_{ji}$. This idea, which has been noted before without formal justification in the computational physics literature,4,31 leads naturally to a binless extension of WHAM. A closely related formalism has been developed in statistics for computing normalizing constants.7–10 Although this method can be derived by various statistical arguments, it is essentially an extension of WHAM without binning data. Below we give a formal derivation of the binless method by importance weighting and self-consistency.

To understand binless WHAM, it is useful to introduce the concept of the measure $G$ defined by

$$ dG = \Omega(u) \, du, \tag{14} $$

that is, $G(A) = \int_A \Omega(u) \, du$ for every measurable set $A$ of $u$. Informally, Eq. (14) says that for an infinitesimal bin about $u$ of size $du$, the weight assigned under $G$ is $\Omega(u) \, du$. Hereafter $G$ is called the measure of states. The concept of measure can be used to reformulate the ideas developed above. Denote by $F_\theta$ the probability distribution of $u(x)$ under Eq. (1), that is, the probability distribution with density function (4). Then, from Eqs. (4) and (14), $F_\theta$ is related to $G$ as

$$ dF_\theta = \frac{1}{Z_\theta}\, e^{-\theta^T u}\, \Omega(u) \, du = \frac{1}{Z_\theta}\, e^{-\theta^T u} \, dG, \tag{15} $$

that is, $F_\theta(A) = Z_\theta^{-1} \int_A e^{-\theta^T u} \, dG$ for every measurable set $A$ of $u$. For an infinitesimal bin about $u$ of size $du$, the probability assigned under $F_\theta$ is the density function (4) times $du$ and hence is $Z_\theta^{-1} e^{-\theta^T u}$ times the weight assigned under $G$. The partition function $Z_\theta$ by Eq. (6) can then be expressed as

$$ Z_\theta = \int e^{-\theta^T u} \, dG. \tag{16} $$

See, for example, Ref. 14 for discussion of measure-theoretic concepts.

The pooled data $\{u_{ji}: i = 1, \ldots, n_j;\ j = 1, \ldots, m\}$ can be regarded as an approximate sample from the mixture distribution, $F_*$, whose components are $(F_{\theta_1}, \ldots, F_{\theta_m})$ with proportions $(n_1/n, \ldots, n_m/n)$. (Note that the pooled data are not strictly an independent and identically distributed sample from $F_*$, which would involve randomly selecting a distribution $F_{\theta_r}$ with probability $n_r/n$ ($r = 1, \ldots, m$), simulating one observation from $F_{\theta_r}$, and then repeating this process $n$ times. The numbers of observations from $(F_{\theta_1}, \ldots, F_{\theta_m})$ would be random, instead of being fixed at $(n_1, \ldots, n_m)$. To highlight main ideas, this difference is ignored in the derivation below. The resulting estimators are, however, evaluated in Sec. II D without making this simplification.) Then, in analogy with Eq. (15), $F_*$ is related to $G$ as

$$ dF_* = \left\{ \sum_{r=1}^{m} \frac{n_r}{n}\, Z_{\theta_r}^{-1} e^{-\theta_r^T u} \right\} \Omega(u) \, du = \left\{ \sum_{r=1}^{m} \frac{n_r}{n}\, Z_{\theta_r}^{-1} e^{-\theta_r^T u} \right\} dG. \tag{17} $$

For an infinitesimal bin about $u$ of size $du$, the probability assigned under $F_*$ is the expression in the curly brackets times the weight assigned under $G$. Dividing both sides of Eq. (17) by the quantity in the curly brackets gives

$$ dG = \left\{ \sum_{r=1}^{m} \frac{n_r}{n}\, Z_{\theta_r}^{-1} e^{-\theta_r^T u} \right\}^{-1} dF_*. \tag{18} $$

For an infinitesimal bin about $u$ of size $du$, the weight assigned under $G$ is the inverse of the quantity in the curly brackets times the probability assigned under $F_*$.

Relationship (18) can be used for estimating $G$ from the pooled data by importance weighting. Recall that the pooled data form an approximate sample from $F_*$. Then $F_*$ can be estimated by the empirical distribution $\hat F_*$ for which each observation $u_{ji}$ is assigned the probability $n^{-1}$. By Eq. (18), the resulting estimator $\hat G$ is a discrete measure for which each observation $u_{ji}$ is assigned the weight

$$ \hat G(u_{ji}) = \frac{1}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{-\theta_r^T u_{ji}}}, \tag{19} $$

where $(\hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m})$ are defined by self-consistency according to Eq. (16),

$$ \hat Z_{\theta_k} = \sum_{j=1}^{m} \sum_{i=1}^{n_j} e^{-\theta_k^T u_{ji}}\, \hat G(u_{ji}) = \sum_{j=1}^{m} \sum_{i=1}^{n_j} \frac{1}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{(\theta_k - \theta_r)^T u_{ji}}} \quad (k = 1, \ldots, m). \tag{20} $$

Formulas (19) and (20) provide a binless extension of Eqs. (7) and (8) in WHAM.
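A minimal sketch of the self-consistent iteration of Eqs. (19) and (20) follows. It assumes the data are supplied as a table `q` with `q[i][r]` equal to $\theta_r^T u_i$ for pooled observation $i$; this table layout, the function names, and the fixed iteration count are illustrative choices rather than part of the derivation.

```python
import math
from typing import List, Sequence


def logsumexp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def binless_log_z(
    q: Sequence[Sequence[float]], n: Sequence[int], n_iter: int = 500
) -> List[float]:
    """Fixed-point iteration of Eq. (20) on the pooled, unbinned data.

    q[i][r] : theta_r^T u_i for pooled observation i and state r
    n[r]    : number of observations drawn from state r (must be > 0)
    Returns log Z of each state relative to state 0 (log Z_0 = 0).
    """
    m = len(n)
    log_n = [math.log(x) for x in n]
    log_z = [0.0] * m
    for _ in range(n_iter):
        # Log of the weight G(u_i) of Eq. (19) for every observation.
        log_g = [
            -logsumexp([log_n[r] - log_z[r] - qi[r] for r in range(m)]) for qi in q
        ]
        new = [logsumexp([lg - qi[k] for lg, qi in zip(log_g, q)]) for k in range(m)]
        log_z = [v - new[0] for v in new]  # remove the arbitrary constant
    return log_z


# Hypothetical pooled energies: 3 observations at theta = 0, then 7 at theta = log 2.
pooled_u = [0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0]
thetas = [0.0, math.log(2.0)]
q_table = [[t * u for t in thetas] for u in pooled_u]
example_log_z = binless_log_z(q_table, [3, 7])
```

Because the hypothetical energies take only the values 0, 1, and 2, this example coincides with a three-bin WHAM calculation, and the log ratio should converge to $\log(7/12)$. With continuous energies no binning step is needed at all.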

Again by relationship (16), the partition function $Z_\theta$ at any other parameter value is estimated by

$$ \hat Z_\theta = \sum_{j=1}^{m} \sum_{i=1}^{n_j} e^{-\theta^T u_{ji}}\, \hat G(u_{ji}) = \sum_{j=1}^{m} \sum_{i=1}^{n_j} \frac{1}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^T u_{ji}}}. \tag{21} $$

The expectation $\langle h \rangle_\theta$ is by definition $Z_\theta^{-1} \int h(u)\, e^{-\theta^T u} \, dG$ and hence estimated by

$$ \frac{1}{\hat Z_\theta} \sum_{j=1}^{m} \sum_{i=1}^{n_j} h(u_{ji})\, e^{-\theta^T u_{ji}}\, \hat G(u_{ji}) = \frac{1}{\hat Z_\theta} \sum_{j=1}^{m} \sum_{i=1}^{n_j} \frac{h(u_{ji})}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^T u_{ji}}}. \tag{22} $$

Formulas (21) and (22) provide a binless extension of Eqs. (9) and (10) in WHAM. In addition, we see that the WHAM weights (13), identified heuristically earlier, coincide (except for the difference between $u^b_{ji}$ vs. $u_{ji}$) with the discrete measure with weights (19) derived from the statistical theory sketched out above. Therefore, the binless formulation of WHAM, while it appears straightforward, is nevertheless rooted in fundamental statistical concepts.
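Equations (21) and (22) amount to reweighting the pooled observations to a target parameter value. The sketch below assumes that the log partition functions of the sampled states have already been obtained (for example by iterating Eq. (20)) and that the data are stored as reduced energies; the function name `reweight` and its argument layout are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def logsumexp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def reweight(
    q: Sequence[Sequence[float]],
    n: Sequence[int],
    log_z: Sequence[float],
    q_target: Sequence[float],
    h: Sequence[float],
) -> Tuple[float, float]:
    """Estimate log Z_theta (Eq. (21)) and <h>_theta (Eq. (22)).

    q[i][r]     : theta_r^T u_i for the sampled states
    log_z[r]    : log partition function estimates of the sampled states
    q_target[i] : theta^T u_i at the target parameter value
    h[i]        : observable evaluated on pooled observation i
    """
    m = len(n)
    # log of exp(-theta^T u_i) * G(u_i), with G(u_i) from Eq. (19)
    log_w: List[float] = [
        -qt - logsumexp([math.log(n[r]) - log_z[r] - qi[r] for r in range(m)])
        for qt, qi in zip(q_target, q)
    ]
    log_z_target = logsumexp(log_w)
    h_hat = sum(hi * math.exp(lw - log_z_target) for hi, lw in zip(h, log_w))
    return log_z_target, h_hat
```

The returned `log_z_target` is expressed relative to the same reference as `log_z`, so only differences of log partition functions (free energy differences) are meaningful.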

It is worth emphasizing that the binless method, like WHAM, can be used not only to estimate partition functions $Z_\theta$ and equilibrium expectations $\langle h \rangle_\theta$, but also to estimate equilibrium distributions $F_\theta$. Recall that $u(x)$ is in general a vector of multiple components and $F_\theta$ is the joint distribution of those components under Eq. (1). By relationship (15), $F_\theta$ is estimated by a discrete distribution $\hat F_\theta$ on the pooled data with probabilities

$$ \hat F_\theta(u_{ji}) = \frac{\hat Z_\theta^{-1}}{\sum_{r=1}^{m} n_r \hat Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^T u_{ji}}}. \tag{23} $$

In other words, $F_\theta$ is approximated by attaching weight (23) to each observation $u_{ji}$ in the pooled data, where the weights sum up to 1 by Eq. (21). As a result of this approximation, the marginal distribution of $h(u(x))$ under Eq. (1) is approximated by attaching the same weight (23) to $h(u_{ji})$ for each $u_{ji}$ in the pooled data. Then the expectation $\langle h \rangle_\theta$ is approximated as before (Eq. (22)) by a weighted average of the form $\sum_{j=1}^{m} \sum_{i=1}^{n_j} h(u_{ji})\, \hat F_\theta(u_{ji})$.

The above approximation to the marginal distribution of $h(u(x))$ under Eq. (1) can be visualized as a weighted histogram with suitable bins. The height of each bin is the sum of $\hat F_\theta(u_{ji})$ such that $h(u_{ji})$ falls into the bin for $u_{ji}$ in the pooled data. The histogram can be normalized into a probability density plot, where the height of each bin is divided by the bin size. If $\theta = \theta_k$ for some $k$, this weighted histogram based on the pooled data provides a better approximation than the raw histogram of $h(u_{ki})$ based on the observations $u_{ki}$ from $F_{\theta_k}$ only. On the other hand, a comparison of these two histograms can be used to assess goodness of simulations. A substantial discrepancy between the two histograms suggests that the quality of simulations is questionable, that is, the simulated data are actually not distributed according to Eq. (4).
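The weighted histogram described above can be sketched as follows. The function assumes that the weights $\hat F_\theta(u_{ji})$ of Eq. (23) have already been computed and sum to 1; the function name, the half-open bin convention, and the decision to drop values outside the edges are illustrative assumptions.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def weighted_histogram(
    values: Sequence[float], weights: Sequence[float], edges: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Weighted histogram of h(u_ji) with weights F_theta(u_ji).

    edges must be sorted; bin b covers the half-open interval
    [edges[b], edges[b + 1]). Values outside the edges are ignored.
    Returns (probability mass per bin, probability density per bin).
    """
    mass = [0.0] * (len(edges) - 1)
    for v, w in zip(values, weights):
        b = bisect_right(edges, v) - 1
        if 0 <= b < len(mass):
            mass[b] += w
    density = [mb / (edges[b + 1] - edges[b]) for b, mb in enumerate(mass)]
    return mass, density
```

Calling the same function with uniform weights $1/n_k$ on the observations from state $k$ alone gives the raw histogram, so the two can be compared bin by bin as a consistency check on the simulations.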

C. Maximum likelihood

We describe a derivation of binless WHAM by the method of nonparametric maximum likelihood taking $G$ as an infinite-dimensional unknown parameter.9 The likelihood of the $j$th sample from $F_{\theta_j}$ is, by Eq. (15),

$$ L_j = \prod_{i=1}^{n_j} \left\{ \frac{1}{Z_{\theta_j}}\, e^{-\theta_j^T u_{ji}}\, G(u_{ji}) \right\}, \tag{24} $$

where $G(u_{ji})$ is the mass assigned to the singleton $u_{ji}$ under $G$, and $Z_{\theta_j} = \int e^{-\theta_j^T u} \, dG$, a functional of $G$, by Eq. (16). The likelihood of the pooled sample is then $L = \prod_{j=1}^{m} L_j$. The method of nonparametric maximum likelihood is to find $\hat G$ which maximizes the likelihood $L$ among all possible non-negative measures including discrete measures.

There are two steps to find the maximum likelihood estimator $\hat G$. First, it is sufficient to restrict our search to discrete measures supported on the set of pooled data $\{u_{ji}: i = 1, \ldots, n_j;\ j = 1, \ldots, m\}$. If a positive mass is assigned under $G$ to any set outside the pooled data, then relocating the mass evenly to each observation in the pooled data only increases $L$. Second, for a discrete measure $G$, put $w_{ji} = G(u_{ji})$. The likelihood at $G$ is

$$ L = \prod_{j=1}^{m} \prod_{i=1}^{n_j} \left\{ \frac{1}{Z_{\theta_j}}\, e^{-\theta_j^T u_{ji}}\, w_{ji} \right\}, \tag{25} $$

where $Z_{\theta_r} = \sum_{j=1}^{m} \sum_{i=1}^{n_j} e^{-\theta_r^T u_{ji}}\, w_{ji}$ for $r = 1, \ldots, m$. Taking the log of the likelihood gives

$$ \log L = \left\{ \sum_{j=1}^{m} \sum_{i=1}^{n_j} \log w_{ji} - \sum_{r=1}^{m} n_r \log Z_{\theta_r} \right\} - \sum_{j=1}^{m} \sum_{i=1}^{n_j} \theta_j^T u_{ji}. \tag{26} $$

The term outside the curly brackets does not depend on $w_{ji}$ and can be ignored. Setting the partial derivative of $\log L$ with respect to $w_{ji}$ to zero gives

$$ \frac{1}{w_{ji}} - \sum_{r=1}^{m} n_r\, \frac{e^{-\theta_r^T u_{ji}}}{Z_{\theta_r}} = 0 \tag{27} $$

or

$$ w_{ji} = \frac{1}{\sum_{r=1}^{m} n_r Z_{\theta_r}^{-1} e^{-\theta_r^T u_{ji}}}, \tag{28} $$

which leads to the basic formulas (19) and (20). Furthermore, substituting the expression of $w_{ji}$ into the term inside the curly brackets in Eq. (26), and writing $Z_r$ for $Z_{\theta_r}$, yields

$$ -\sum_{j=1}^{m} \sum_{i=1}^{n_j} \log \left\{ \sum_{r=1}^{m} n_r Z_r^{-1} e^{-\theta_r^T u_{ji}} \right\} - \sum_{r=1}^{m} n_r \log Z_r, \tag{29} $$

which is a function of $(Z_1, \ldots, Z_m)$ only. This function multiplied by $-n^{-1}$ and then subtracted by $\log n$ gives the function $\kappa$ below (Eq. (31)).

It is interesting that the maximum likelihood estimator $\hat G$ is always a discrete measure, even though the actual measure $G$ is not. This discrete approximation of $G$ by $\hat G$ serves precisely our computational purpose. A complication is that even though there is a general statistical theory to justify the method of maximum likelihood with a finite-dimensional unknown parameter, the validity of the estimators obtained by the method of nonparametric likelihood needs to be established on a case-by-case basis. Fortunately, a statistical theory of binless WHAM has been rigorously developed in statistics, and is reviewed in Sec. II D.

The foregoing derivation takes the measure of states $G$ as the underlying unknown parameter. Equivalently, the method of nonparametric maximum likelihood can be applied with a reparameterization taking $F_{\theta_0}$ as the unknown parameter for some fixed, reference value $\theta_0$. By Eq. (15), $F_\theta$ is related to $F_{\theta_0}$ as

$$ dF_\theta = \frac{Z_{\theta_0}}{Z_\theta}\, e^{-(\theta - \theta_0)^T u} \, dF_{\theta_0}. \tag{30} $$

By invariance of maximum likelihood under reparameterization, the resulting estimator of $F_{\theta_0}$ is the same as Eq. (23) with $\theta$ set to $\theta_0$. Furthermore, the formulas (20)–(22) remain the same as before. This derivation is essentially an extension of the derivation of WHAM by Bartels and Karplus, and Gallicchio et al.2,5
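As a quick check of Eq. (30), the change of reference can be written out in two steps, using only Eq. (15) evaluated at $\theta_0$ and at $\theta$:

```latex
\begin{align*}
dG &= Z_{\theta_0}\, e^{\theta_0^T u}\, dF_{\theta_0}
   && \text{(Eq. (15) at } \theta_0 \text{, solved for } dG\text{)} \\
dF_\theta &= \frac{1}{Z_\theta}\, e^{-\theta^T u}\, dG
   = \frac{Z_{\theta_0}}{Z_\theta}\, e^{-(\theta - \theta_0)^T u}\, dF_{\theta_0}
   && \text{(substituted into Eq. (15) at } \theta\text{)}
\end{align*}
```

The measure of states $G$ drops out, which is why either $G$ or $F_{\theta_0}$ can serve as the unknown parameter.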

D. Statistical theory

As seen from Sec. II B, the estimators in binless WHAM are similar to those in WHAM, but based on the actual data without binning. While this construction seems heuristically easy, a central issue is to evaluate statistical and computational properties of binless WHAM. We point out a number of results which demonstrate the usefulness of the binless formulation of WHAM, by drawing on related statistical work. Although there are results applicable to correlated data,7 we assume for simplicity that $\{u_{ji}: i = 1, \ldots, n_j;\ j = 1, \ldots, m\}$ are independent.

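First, the estimators $(\hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m})$ are defined by Eq. (20), a system of nonlinear equations. Remarkably, an equivalent characterization is that $\log(\hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m})$ are jointly a minimizer of the criterion function7,16

$$ \kappa(\log z_1, \ldots, \log z_m) = \frac{1}{n} \sum_{j=1}^{m} \sum_{i=1}^{n_j} \log \left\{ \sum_{r=1}^{m} \frac{n_r}{n}\, z_r^{-1} e^{-\theta_r^T u_{ji}} \right\} + \sum_{r=1}^{m} \frac{n_r}{n} \log z_r. \tag{31} $$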

See Sec. II C above for the derivation of $\kappa$ by maximum likelihood. The function $\kappa$ is invariant under translation: $\kappa(a + \log z_1, \ldots, a + \log z_m) = \kappa(\log z_1, \ldots, \log z_m)$ for an arbitrary constant $a$, in agreement with the fact that $\log(\hat Z_{\theta_1}, \ldots, \hat Z_{\theta_m})$ are only determined up to an additive constant. Moreover, $\kappa$ is bounded from below, by application of Jensen's inequality to the log of the term in the curly brackets

$$ \log \left\{ \sum_{r=1}^{m} \frac{n_r}{n}\, z_r^{-1} e^{-\theta_r^T u_{ji}} \right\} \geq \sum_{r=1}^{m} \frac{n_r}{n} \log \left\{ z_r^{-1} e^{-\theta_r^T u_{ji}} \right\} = -\sum_{r=1}^{m} \frac{n_r}{n} \log z_r - \sum_{r=1}^{m} \frac{n_r}{n}\, \theta_r^T u_{ji}. \tag{32} $$

Finally, if one of $(\log z_1, \ldots, \log z_m)$ is fixed, for example $\log z_1 = 0$, then $\kappa$ is strictly convex.16 The convexity can be directly shown by the fact that

$$
\sum_{r=1}^{m} \frac{n_r}{n} z_r^{-1} e^{-\theta_r^{T} u_{ji}} \tag{33}
$$

is a sum of exponentials of linear functions of $(\log z_1, \ldots, \log z_m)$, and consequently the log of this term is convex in $(\log z_1, \ldots, \log z_m)$. Therefore, $\log(Z_{\theta_2}/Z_{\theta_1}, \ldots, Z_{\theta_m}/Z_{\theta_1})$ can be obtained as a unique minimizer of $\kappa(0, \log z_2, \ldots, \log z_m)$. This approach of minimizing a convex function can be more effective than solving the system of nonlinear equations (20) by the self-consistency or the Newton-Raphson algorithm.11 See Appendix A for details.
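The minimization can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the energies $u$ are scalars, the toy states have density proportional to $e^{-\theta u}$ on $u \ge 0$ (so that $Z_\theta = 1/\theta$ is known exactly), and plain gradient descent stands in for the optimizer described in Appendix A. The function names `kappa_and_gradient` and `minimize_kappa` are hypothetical.

```python
import math
import random


def kappa_and_gradient(log_z, thetas, counts, samples):
    """Evaluate the criterion kappa and its gradient for scalar energies u."""
    n = sum(counts)
    m = len(thetas)
    value = sum(counts[r] / n * log_z[r] for r in range(m))
    grad = [counts[r] / n for r in range(m)]
    for u in samples:  # pooled observations u_ji from all states
        terms = [counts[r] / n * math.exp(-log_z[r] - thetas[r] * u)
                 for r in range(m)]
        total = sum(terms)
        value += math.log(total) / n
        for r in range(m):
            grad[r] -= terms[r] / (total * n)
    return value, grad


def minimize_kappa(thetas, counts, samples, steps=400, rate=1.0):
    """Gradient descent on kappa with log z_1 held fixed at 0."""
    log_z = [0.0] * len(thetas)
    for _ in range(steps):
        _, grad = kappa_and_gradient(log_z, thetas, counts, samples)
        for r in range(1, len(thetas)):  # index 0 (log z_1) is never updated
            log_z[r] -= rate * grad[r]
    return log_z


# Toy data (assumption): exponential states with Z_theta = 1 / theta.
random.seed(0)
thetas = [1.0, 1.5, 2.0]
counts = [400, 400, 400]
samples = [random.expovariate(t) for t, k in zip(thetas, counts) for _ in range(k)]

log_z = minimize_kappa(thetas, counts, samples)
exact = [math.log(thetas[0] / t) for t in thetas]
print("estimated log(Z_r / Z_1):", [round(v, 3) for v in log_z])
print("exact     log(Z_r / Z_1):", [round(v, 3) for v in exact])
```

Setting the gradient to zero recovers a self-consistency condition for each state, which is why the minimizer and the solution of the nonlinear system coincide. A second-order or trust-region method would normally converge in far fewer iterations than the fixed-step descent used here.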

Second, the estimators $(Z_{\theta_2}/Z_{\theta_1}, \ldots, Z_{\theta_m}/Z_{\theta_1})$ are always consistent (that is, converge in probability to the true values) and asymptotically normally distributed as the sample size $n_j$ tends to infinity and $n_j/n$ is fixed for each $j$.10,16 The connectedness condition required for the general result of Gill et al.16 and Tan10 is satisfied here because the weighting function $e^{-\theta^{T} u}$ is positive. Moreover, the estimator $Z_{\theta}/Z_{\theta_1}$ is consistent and asymptotically normally distributed provided that the variance under $F_*$ of the density ratio of $F_\theta$ over $F_*$ is finite

$$
\int \left\{ \sum_{r=1}^{m} \frac{n_r}{n} Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^{T} u} \right\}^{-2} dF_* < \infty . \tag{34}
$$

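Similarly, the estimator of $\langle h \rangle_\theta$ is consistent and asymptotically normally distributed provided that the variance under $F_*$ of $h(u)$ times the density ratio of $F_\theta$ over $F_*$ is finite

$$
\int h^2(u) \left\{ \sum_{r=1}^{m} \frac{n_r}{n} Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^{T} u} \right\}^{-2} dF_* < \infty . \tag{35}
$$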

These conditions require that the mixture "umbrella" distribution $F_*$ should provide sufficient coverage of $F_\theta$, so that observations from $F_*$ can be weighted by the density ratio of $F_\theta$ over $F_*$ to estimate $F_\theta$. Therefore, interpolation is in general valid, but extrapolation needs to be considered more carefully. For example, for the application in Sec. III, it is important to obtain observations from the end thermodynamic states, in addition to intermediate states, in order to estimate the free energy differences between them. Obtaining observations from thermodynamic states however close to the end states, but not at end states, would require extrapolation whereby condition (34) would be difficult to verify.

Third, the asymptotic variance matrix of $(Z_{\theta_2}/Z_{\theta_1}, \ldots, Z_{\theta_m}/Z_{\theta_1})$ and $Z_{\theta}/Z_{\theta_1}$ jointly can be consistently estimated without using any generalized inverse such as the Moore-Penrose inverse.10 This approach differs from that of Kong et al.9 and Shirts and Chodera11 based on the asymptotic variance matrix of $(Z_{\theta_1}, Z_{\theta_2}, \ldots, Z_{\theta_m})$, which necessarily involves use of generalized inverses. Similarly, the asymptotic variance of the estimator of $\langle h \rangle_\theta$ can be consistently estimated. The resulting variance formula is appropriate even when $h(u)$ is not always non-negative, in contrast with Shirts and Chodera (Sec. IV of Ref. 11). See Appendix B for details.

Fourth, when $m = 2$, the estimator $Z_{\theta_2}/Z_{\theta_1}$ is equivalent to Bennett's optimal acceptance ratio method (BAR),12 which attains the smallest asymptotic variance among bridge sampling estimators of the form8,13

$$
\frac{n_1^{-1} \sum_{i=1}^{n_1} \alpha(u_{1i})\, e^{-(\theta_2 - \theta_1)^{T} u_{1i}}}{n_2^{-1} \sum_{i=1}^{n_2} \alpha(u_{2i})} , \tag{36}
$$

where $\alpha(\cdot)$ is an arbitrary function, for example, $\alpha(u) = \min(e^{-(\theta_1 - \theta_2)^{T} u}, 1)$. In general, the estimators $(Z_{\theta_2}/Z_{\theta_1}, \ldots, Z_{\theta_m}/Z_{\theta_1})$ and $Z_{\theta}/Z_{\theta_1}$ jointly attain the smallest asymptotic variance matrix in the order on positive-definite matrices among a class of extended bridge sampling estimators based on Eq. (36).10,11 Similarly, the estimator of $\langle h \rangle_\theta$ attains the smallest variance among corresponding extended bridge sampling estimators. For this reason, the binless method was called the multi-state Bennett acceptance ratio method (MBAR) by Shirts and Chodera.11
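The two-state bridge sampling form can be tried out on a toy problem. The Python sketch below is illustrative only: it assumes scalar energies, states with density proportional to $e^{-\theta u}$ on $u \ge 0$ (so the exact ratio $Z_{\theta_2}/Z_{\theta_1} = \theta_1/\theta_2$ is available for comparison), and two choices of $\alpha$, the example given in the text and a hypothetical alternative `alpha_steep` that has no special status.

```python
import math
import random


def bridge_ratio(u1, u2, theta1, theta2, alpha):
    """Bridge sampling estimate of Z_theta2 / Z_theta1 for scalar energies."""
    numerator = sum(alpha(u) * math.exp(-(theta2 - theta1) * u) for u in u1) / len(u1)
    denominator = sum(alpha(u) for u in u2) / len(u2)
    return numerator / denominator


# Toy states (assumption): u >= 0 with density theta * exp(-theta * u).
random.seed(1)
theta1, theta2 = 2.0, 1.0
u1 = [random.expovariate(theta1) for _ in range(5000)]  # sample from state 1
u2 = [random.expovariate(theta2) for _ in range(5000)]  # sample from state 2


def alpha_min(u):
    """The example alpha from the text: min(exp(-(theta1 - theta2) u), 1)."""
    return min(math.exp(-(theta1 - theta2) * u), 1.0)


def alpha_steep(u):
    """A hypothetical alternative choice, used only to show alpha is arbitrary."""
    return math.exp(-2.0 * u)


exact = theta1 / theta2
est_min = bridge_ratio(u1, u2, theta1, theta2, alpha_min)
est_steep = bridge_ratio(u1, u2, theta1, theta2, alpha_steep)
print("exact ratio:", exact)
print("alpha_min estimate:", round(est_min, 3))
print("alpha_steep estimate:", round(est_steep, 3))
```

Both choices target the same ratio; they differ only in variance, which is the quantity that the optimal choice of $\alpha$ minimizes.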

III. APPLICATION: ESTIMATION OF BINDING FREE ENERGIES

This section illustrates the application of the binless method, using both the MBAR software (as developed by Shirts and Chodera11) and our computational implementation based on Sec. II D (referred to as unbinned WHAM or UWHAM), to the estimation of protein-ligand binding free energies. As we will show, due to the wide range of values of the binding energies involved, it is difficult to apply the conventional WHAM binning method to this problem unless soft-core potentials are employed. In contrast, the binless approach yields consistent results in all cases.

The binding free energy measures the propensity of a receptor R to be associated in solution with a ligand L. The binding free energy is by definition the difference between the free energy of the receptor-ligand complex and the free energy of the dissociated receptor and ligand. In this work, binding free energies are estimated by simulation in the context of the BEDAM,29 which, in the present formalism, can be summarized as follows.

Working within the implicit solvent representation, the potential energy of a conformation $x$ of the complex can be written as18,29

$$
U_{RL}(x) = U_R(x) + U_L(x) + b(x) , \tag{37}
$$

where $U_R$ and $U_L$ are the potential energies of the dissociated receptor and ligand in solution and $b(x)$ is the binding energy of conformation $x$ of the complex, defined as the change in potential energy for bringing into contact the receptor and ligand from infinite separation without intramolecular conformational rearrangements. Based on the notation developed in Sec. II A, we recognize that the coupled (ligand and receptor fully interacting) and decoupled (non-interacting ligand and receptor) ensembles can be cast in the form of the generalized ensemble representation of Eqs. (1)–(3) with a two-dimensional potential energy function vector $u = (U_0, b)$ where

$$
U_0(x) = U_R(x) + U_L(x) \tag{38}
$$

is the reference potential energy function corresponding to the uncoupled state and $b$ is the binding energy function. Using Eq. (37), the potential energy of the decoupled state corresponds to the coefficient vector $\theta_{\rm dcpld} = (\beta, 0)$ and the one for the coupled ensemble is $\theta_{\rm cpld} = (\beta, \beta)$. The binding free energy is then given by the ratio of the corresponding partition functions $Z_{\theta_{\rm cpld}}$ and $Z_{\theta_{\rm dcpld}}$:

$$
\Delta G_b = -kT \log \frac{Z_{\theta_{\rm cpld}}}{Z_{\theta_{\rm dcpld}}} . \tag{39}
$$

Note that the observable standard binding free energy also includes a standard state concentration-dependent term18,19,29 which, being constant among the systems investigated, is included in the results30 but not further discussed in this work.

A series of intermediate states $k = 1, \ldots, m$ are introduced with potential energies

$$
U_k(x) = U_0(x) + \lambda_k b(x) , \tag{40}
$$

where $\lambda_1 = 0$ corresponds to the decoupled state and $\lambda_m = 1$ corresponds to the coupled state. The intermediate states with $\lambda_k$ between 0 and 1 serve as interpolating states in which receptor and ligand partially interact to connect, in a free energy sense, the two end states.25 In general, as stated in Sec. II A, an $(m + 1)$-dimensional potential energy vector $u = (U_0(x), \omega_1(x), \ldots, \omega_m(x))$, with $\omega_k(x) = \lambda_k b(x)$, and corresponding $(m + 1)$-dimensional $\theta$ vectors are necessary to describe this collection of ensembles. However, in this case, taking advantage of the particular linear expression of $\omega_k(x)$, it is convenient to move the $\lambda_k$ dependence into the coefficient vector $\theta$ so as to lower the dimensionality of the generalized energy vector. By doing so, each of the states corresponds to a two-dimensional $\theta$ vector of the form $\theta_k = (\beta, \beta\lambda_k)$ which multiplies the potential energy vector $u = (U_0, b)$ introduced above to yield, by means of Eq. (3), the potential energy functions in Eq. (40).

The partition function of each state is computed from Eq. (22) setting $Z_{\theta_{\rm dcpld}} = Z_{\theta_1} = 1$. Using Eq. (3) and the above, it is easy to see that the term $(\theta_k - \theta_r)^{T} u_{ji}$ in Eq. (20) in this case simplifies to

$$
(\theta_k - \theta_r)^{T} u_{ji} = \beta(\lambda_k - \lambda_r)\, b_{ji} , \tag{41}
$$

which does not include the total reference potential energy $U_0$ and depends only on the binding energy $b_{ji}$ of the $i$th sampled conformation $x_{ji}$ from a simulation at $\lambda = \lambda_j$. Analogously, it is straightforward to show that Eq. (31) simplifies to

$$
\kappa(\log z_1, \ldots, \log z_m) = c + \frac{1}{n} \sum_{j=1}^{m} \sum_{i=1}^{n_j} \log\left\{ \sum_{r=1}^{m} \frac{n_r}{n} z_r^{-1} e^{-\beta \lambda_r b_{ji}} \right\} + \sum_{r=1}^{m} \frac{n_r}{n} \log z_r , \tag{42}
$$

where $c$ is a constant that depends only on the observations of $U_0$ and does not affect the position of the minimum.
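The two simplifications follow directly from $u = (U_0, b)$ and $\theta_k = (\beta, \beta\lambda_k)$. The steps are spelled out below as a sketch; the explicit expression for $c$ is worked out here for illustration, since the text states only that it depends on the observations of $U_0$.

```latex
\begin{align*}
\theta_k^{T} u_{ji} &= \beta\, U_0(x_{ji}) + \beta \lambda_k\, b_{ji}, \\
(\theta_k - \theta_r)^{T} u_{ji} &= (\beta - \beta)\, U_0(x_{ji}) + \beta(\lambda_k - \lambda_r)\, b_{ji}
  = \beta(\lambda_k - \lambda_r)\, b_{ji}, \\
\log\left\{ \sum_{r=1}^{m} \frac{n_r}{n} z_r^{-1} e^{-\theta_r^{T} u_{ji}} \right\}
  &= -\beta\, U_0(x_{ji})
     + \log\left\{ \sum_{r=1}^{m} \frac{n_r}{n} z_r^{-1} e^{-\beta \lambda_r b_{ji}} \right\}, \\
c &= -\frac{\beta}{n} \sum_{j=1}^{m} \sum_{i=1}^{n_j} U_0(x_{ji}) .
\end{align*}
```

Because $c$ does not involve $(\log z_1, \ldots, \log z_m)$, it can be dropped when minimizing.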

Similarly, in the denominator of the WHAM equation (Eq. (8)), the $(\theta_k - \theta_r)^{T} u$ term reduces to $\beta(\lambda_k - \lambda_r) b$, which depends only on the binned value $b$ of the binding energy. Furthermore, expressing Eq. (8) as

$$
Z_{\lambda_k} = \sum_{U_0} \sum_{b} \frac{\sum_{r=1}^{m} N_r(U_0, b)}{\sum_{r=1}^{m} n_r Z_{\lambda_r}^{-1} e^{\beta(\lambda_k - \lambda_r) b}} = \sum_{b} \frac{\sum_{r=1}^{m} N_r(b)}{\sum_{r=1}^{m} n_r Z_{\lambda_r}^{-1} e^{\beta(\lambda_k - \lambda_r) b}} , \tag{43}
$$

we see that the two-dimensional histogram $N_r(u) = N_r(U_0, b)$ can be replaced by the one-dimensional marginal histogram $N_r(b) = \sum_{U_0} N_r(U_0, b)$ of the binding energy. Consequently, in both the WHAM and MBAR calculations that follow it has been sufficient to collect only the binding energy samples from the molecular simulations.

Binding energies are collected from Hamiltonian replica exchange all-atom molecular dynamics simulations of the protein complexes as described29,30,32 for a series of $\lambda$ values from 0 (decoupled state) to 1 (coupled state). The binding energy data are then fed into Eq. (8), using binning, or Eq. (42), without binning, to compute the ratios of partition functions and ultimately the binding free energy from Eq. (39). See below for a description of the biological systems and simulation settings.

A. WHAM estimates with binning

The distributions of binding energies depend critically on the $\lambda$ value at which they are obtained. At $\lambda = 1$, when the ligand and the receptor fully interact, binding energies are typically centered around favorable (negative) values (see Fig. 1). In contrast, at $\lambda = 0$, in the absence of receptor-ligand interactions, the ligand is likely to sample conformations with unfavorable clashes between receptor and ligand atoms, corresponding to large unfavorable (positive) values of the binding energy (see Fig. 2). In principle, because the Lennard-Jones and Coulomb interatomic potentials tend to infinity at zero interatomic separation, there is no finite upper limit to the range of binding energies that can be observed. As shown here, this causes major difficulties for the binning of binding energy data to be used in conjunction with WHAM (Eq. (8)), since in this case the binding energy samples are spread out very sparsely over a region spanning many orders of magnitude, which is impossible to bin reliably without using very wide bins leading to large integration errors.

Conventional wisdom dictates that the number of bins should be small enough so that each bin contains more than a few samples so as to minimize statistical noise in the resulting histograms. On the other hand, the binning resolution should be sufficiently fine so as to avoid significant integration errors when replacing the integral in Eq. (6) with the summation over bins in Eq. (8). It is not always clear how to balance these opposing requirements especially when, as in this case, the range of values to be binned is unbounded. Of course, as shown above, we now know that it is justifiable to increase the number of bins indefinitely, reaching the limit where the WHAM formula is indistinguishable from the MBAR formula, which is based on the sampled values directly without binning.

[Figure 1: probability density, in (kcal/mol)$^{-1}$, versus binding energy, in kcal/mol; series: UWHAM, samples at $\lambda = 1$.]

FIG. 1. Computed probability density at $\lambda = 1$, $p_1(b)$, for the complex with ligand 6 with the unmodified potential.30 The line represents the UWHAM estimate from the data collected from all $\lambda$-replicas. The crosses correspond to the probability density computed from the histogram of the binding energy data at only $\lambda = 1$. Good correspondence between the two densities is observed.

[Figure 2: probability density, in (kcal/mol)$^{-1}$, on a logarithmic scale versus binding energy, in kcal/mol, on a logarithmic scale for positive values; series: UWHAM, samples at $\lambda = 0$.]

FIG. 2. Computed probability density at $\lambda = 0$, $p_0(b)$, for the complex with ligand 6 with the unmodified potential.30 The line represents the UWHAM estimate from the data collected from all $\lambda$-replicas. The crosses correspond to the probability density computed from the histogram of the binding energy data at only $\lambda = 0$. There is good correspondence between the two densities in the range explored by the $\lambda = 0$ replica. The binding energy grid used for this plot has 200 bins, equally spaced (0.5 kcal/mol bin sizes) for negative binding energies and with exponentially increasing spacing for positive values up to $10^9$ kcal/mol. Even though $p_0(b)$ is predicted to be maximal at approximately $b = 20$ kcal/mol, it is rare to observe binding energies in that range because of the small integrated cumulative probability at low binding energies (note the logarithmic layout of the binding energy axis). The UWHAM estimate instead extends to as low as $-40$ kcal/mol (the lowest observed sample at all $\lambda$'s) with an estimated probability density on the order of $10^{-28}$ (kcal/mol)$^{-1}$ (not shown for clarity).

In Table I we report WHAM binding free energy estimates for the complex with ligand 2 (see below for a description of protein-ligand complexes) varying the number of bins. In these calculations a uniform grid spacing has been used in the favorable binding energy range and an exponentially increasing bin spacing for unfavorable binding energies. We see that the results change significantly as the grid resolution is increased. With fewer bins and coarser bin widths WHAM under-predicts binding affinities. As the number of bins is increased the estimate of the binding free energy gets closer to the limiting value of $\Delta G_b \simeq -2.2$ kcal/mol obtained with, effectively, an unlimited number of bins (see results in Table III). These results indicate that binning the unfavorable range of binding energies, even with a thousand bins, leads to large errors.

TABLE I. WHAM results for ligand 2 with the unmodified potential varying the number of bins.

| $N_{\rm bins}$ | $\Delta G_b$ (a) |
|---|---|
| 100 | 3.50 |
| 120 | 7.62 |
| 150 | 0.81 |
| 200 | −0.07 |
| 250 | −0.51 |
| 1000 | −1.45 |
| ∞ (b) | −2.21 |

(a) In kcal/mol.
(b) MBAR/UWHAM result from Table III.

TABLE II. WHAM results for ligand 2 with the unmodified potential changing the energy limit of the last bin with $N_{\rm bins} = 250$.

| $b_c$ (a) | $\Delta G_b$ (a) |
|---|---|
| 20 | −5.92 |
| 80 | −5.23 |
| 200 | −4.88 |
| $10^7$ | −1.46 |
| $10^9$ | −0.51 |

(a) In kcal/mol.

1. Data censoring bias

One simple way to circumvent the need for binning a very large range of binding energies is to terminate the binning at a large but finite grid value $b_c$ and assign all of the samples with values larger than the maximum to this last bin (data censoring).29 This approach intuitively appears valid based on the argument that unfavorable binding energies much larger than thermal energy are equally unlikely to be sampled by the complex regardless of their specific value. However, as shown in Table II, this leads to significant bias in the binding free energy estimates. The estimates in Table II are obtained for ligand 2 with 250 bins and the binning limit, $b_c$, indicated. The results show that using a small $b_c$ (but still much larger than binding energy values achievable at $\lambda = 1$ at standard temperature) leads to overestimation of binding affinities, and that the estimates progressively shift to less negative values as $b_c$ is increased, but overshoot the correct value because, with a fixed number of bins, the bins become too coarse as the energy limit of the last bin increases.

The origin of the data censoring bias can be understood in general terms by recognizing that it amounts to assuming that the potential energy of the system is bounded although no limit is actually present. In other words, the data are being analyzed with a statistical model inconsistent with the system that generated the data. To understand the effect in numerical terms consider the denominator in Eq. (43) for $b = b_c \gg kT$. When $\lambda_k = 1$, the quantities $\exp[\beta(1 - \lambda_r) b_c]$ are all positive and some are very large. It follows that the sum in the denominator is large and the contribution to $Z_{\lambda=1}$ from the bin at $b_c$ is negligible regardless of the specific value of $b_c$. A similar conclusion can be reached for any large value of $\lambda_k$. However, for small values of $\lambda_k$ such that $\beta \lambda_k b_c \lesssim 1$ the values of the quantities $\exp[\beta(\lambda_k - \lambda_r) b_c]$ can vary significantly depending on the specific value of $b_c$. This leads to incorrect estimates of the free energy profile at small $\lambda$'s and, in turn, of the total free energy change.
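The size of these exponents can be checked with a few lines of Python. This is a rough numerical illustration only: the value of $kT$ (0.596 kcal/mol, roughly 300 K) and the short $\lambda$ ladder are assumptions chosen for the example, not values taken from the simulations.

```python
# kT in kcal/mol near 300 K (assumed; the temperature is not stated here)
KT = 0.596
BETA = 1.0 / KT

# Illustrative lambda ladder (assumption), from decoupled (0) to coupled (1)
lambdas = [0.0, 1e-3, 1e-2, 0.1, 0.5, 1.0]


def exponents(lambda_k, b_c):
    """Exponents beta * (lambda_k - lambda_r) * b_c for every state r."""
    return [BETA * (lambda_k - lambda_r) * b_c for lambda_r in lambdas]


for b_c in (20.0, 200.0, 1.0e7):
    for lambda_k in (1.0, 1e-3):
        largest = max(exponents(lambda_k, b_c))
        print(f"b_c = {b_c:g}, lambda_k = {lambda_k:g}: largest exponent = {largest:.4g}")
```

For $\lambda_k = 1$ the largest exponent is already above 30 at $b_c = 20$ kcal/mol, so the censored bin contributes negligibly whatever $b_c$ is. For $\lambda_k = 10^{-3}$ the largest exponent moves from a few hundredths to many thousands as $b_c$ grows, so the contribution of the censored bin depends strongly on where the grid is cut.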

2. Soft-core binding energy function

Another approach that we have explored in this work is to, in effect, prevent the generation of large binding energies by adopting a soft core potential in the simulations. Soft core potentials are commonly used to attempt to improve the convergence of free energy calculations.26,27,33 In this work, a soft core potential is introduced in terms of a modified binding energy function $b'(x)$ of the form30,34

$$
b'(x) = \begin{cases} b_{\max} \tanh[\,b(x)/b_{\max}\,] & b(x) > 0 \\ b(x) & b(x) \le 0 \end{cases} , \tag{44}
$$

where $b_{\max}$ is some large positive value, set in this work to either $10^3$ kcal/mol (soft core) or $10^9$ kcal/mol (referred to below as the "unmodified" binding energy function). The modified binding energy function $b'(x)$ serves the purpose of capping the maximum value of the binding energy while leaving unchanged the values of favorable binding energies. Here it is used throughout in the molecular simulations and the statistical analysis in place of the actual binding energy function. The potential energy of the $\lambda = 0$ state is equal to $U_0(x)$ (see Eq. (40)) and is unaffected by the binding energy function. Furthermore, the $\lambda = 1$ state with the soft core binding energy function is virtually indistinguishable from the original one as large positive values of the binding energy are never sampled during the simulation. We conclude therefore that the free energy difference between the $\lambda = 0$ and $\lambda = 1$ states (that is, the binding free energy) is not significantly affected by the introduction of the modified binding energy function.28 Indeed, as shown below, we obtain statistically indistinguishable binding free energy estimates with the two binding energy functions, with any small difference possibly attributable to other factors, such as insufficient equilibration and convergence.
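The capping behavior of the modified binding energy function is easy to verify numerically. The following Python sketch is a direct transcription of Eq. (44); the function name `soft_core_binding_energy` and the sample values are chosen here for illustration.

```python
import math


def soft_core_binding_energy(b, b_max):
    """Modified binding energy b'(x) of Eq. (44); b and b_max in kcal/mol."""
    if b > 0.0:
        return b_max * math.tanh(b / b_max)
    return b  # favorable (non-positive) binding energies are left unchanged


for b in (-25.0, 1.0, 1.0e3, 1.0e9):
    soft = soft_core_binding_energy(b, 1.0e3)
    unmodified = soft_core_binding_energy(b, 1.0e9)
    print(f"b = {b:g}: soft core = {soft:.6g}, 'unmodified' = {unmodified:.6g}")
```

Values well below $b_{\max}$ pass through almost unchanged, while values far above it saturate at $b_{\max}$, which is why the soft core data never exceed $10^3$ kcal/mol.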

Table III reports WHAM binding free energy estimates obtained with the soft core binding energy function (Eq. (44)) with $b_{\max} = 10^3$ kcal/mol. These calculations employed a grid with 250 bins similar to the one used above for the unmodified binding energy function (Table I) but extending only up to $b = b_{\max}$ since no samples are present beyond this value. The limited extent of the range of binding energies makes it possible to select a sufficiently fine binning grid with a reasonable number of bins. Because the binding free energy estimates so obtained are in agreement with the MBAR/UWHAM estimates (see below) obtained with the unmodified potential function, we conclude that the soft core WHAM results indeed reflect the correct binding free energies for this system. Conversely, based on the results above, we conclude that application of WHAM to the data with the unmodified potential leads to incorrect results with any reasonable binning choice we attempted.

B. MBAR/UWHAM estimates without binning

As discussed in Sec. II, binless free energy estimation methods make it unnecessary to bin the data in order to compute free energies. Binding energy samples $b_{ji}$, $i = 1, \ldots, n_j$, from each simulation at $\lambda = \lambda_j$ are simply fed into Eq. (20), which is solved for the $Z_{\lambda_j}$'s by self-consistency11 (referred to as the MBAR implementation) or by minimization of Eq. (42) (referred to here as the UWHAM implementation); see below for details on the numerical implementation. The resulting binding free energy estimates for the six protein-ligand systems are given in Table III. Identical results are obtained with either the MBAR or UWHAM implementations. Also reported in Table III are the results obtained with WHAM on the data with the soft core potential.

TABLE III. Comparison of MBAR/UWHAM and WHAM computed binding free energies for the six complexes of FKBP with ("soft-core") and without ("unmodified") the soft-core binding energy function.

| Ligand | Expt (a) | WHAM (a,b), soft-core | MBAR/UWHAM (a,b), unmodified | MBAR/UWHAM (a,b), soft-core | MBAR/UWHAM (a,c) (sub-sampled), unmodified | MBAR/UWHAM (a,c) (sub-sampled), soft-core |
|---|---|---|---|---|---|---|
| 2 | −7.80 ± 0.1 | −2.46 ± 0.18 | −2.21 ± 0.12 | −2.56 ± 0.19 | −2.22 ± 0.45 | −2.10 ± 0.51 |
| 3 | −8.40 ± 0.1 | −3.90 ± 0.24 | −3.86 ± 0.20 | −4.01 ± 0.23 | −4.63 ± 0.47 | −3.85 ± 0.48 |
| 5 | −9.50 ± 0.1 | −3.85 ± 0.37 | −4.13 ± 0.23 | −3.98 ± 0.32 | −4.03 ± 0.50 | −4.30 ± 0.50 |
| 6 | −10.80 ± 0.3 | −3.74 ± 0.36 | −3.74 ± 0.21 | −3.86 ± 0.31 | −3.77 ± 0.49 | −3.79 ± 0.49 |
| 8 | −10.90 ± 0.1 | −4.45 ± 0.28 | −5.42 ± 0.14 | −4.59 ± 0.30 | −5.81 ± 0.51 | −3.61 ± 0.53 |
| 9 | −11.10 ± 0.2 | −6.19 ± 0.35 | −6.03 ± 0.19 | −6.31 ± 0.32 | −6.20 ± 0.55 | −6.34 ± 0.55 |

(a) In kcal/mol.
(b) Using all of the data, statistical errors computed by block-bootstrapping.
(c) Using 1-in-50 sub-sampled data and statistical errors computed as described in Appendix B.

We immediately notice that the MBAR/UWHAM results with the unmodified potential agree very closely with those using the soft core potential. The fact that we obtained consistent results from two independent sets of simulations, each providing very different binding energy datasets, is a strong indication that both of these results reflect the actual binding free energies for these systems. This is a significant result because it shows that binless methods are capable of treating correctly the distribution at high binding energies even though this extends to extremely large values ($10^9$ kcal/mol) and it is extremely sparsely sampled. For example, for the complex with ligand 6 there is on average only one observation every 10 000 kcal/mol in the range between $10^6$ and $10^7$ kcal/mol, a regime in which, clearly, binning is not a feasible option. As discussed above, reliable WHAM results could be obtained only for the soft core data because of challenges with binning the unmodified binding energy data. The agreement between MBAR/UWHAM soft core and unmodified potential results and with the WHAM soft core results confirms the ability of the binless inference methods to handle non-soft-core data reliably.

The MBAR results obtained are based on the same equation (Eq. (20)) derived here.11 The only difference is the computational procedure to solve it. UWHAM uses a minimization procedure with the criterion function (Eq. (42)), whereas MBAR employs a self-consistent procedure optionally supplemented by Newton-Raphson iterations. Here we have used the simple self-consistent solution starting with the default initial guess $Z_{\lambda_k} = 1$ (the same initial guess used for UWHAM). In our experience, UWHAM has provided a converged solution in significantly less computational time than MBAR (seconds vs. minutes typically), a feature that has been particularly helpful in block-bootstrapping uncertainty calculations involving 100 independent free energy evaluations per ligand. MBAR and UWHAM yielded virtually identical results, thereby validating numerically the new minimization procedure presented here.
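For comparison with the minimization route, the self-consistent procedure can be sketched in a few lines of Python. This is not the MBAR software: it is a bare fixed-point iteration written for this illustration, using the reduced exponent $\beta(\lambda_k - \lambda_r) b_{ji}$, a hypothetical Gaussian toy model for the binding energies (for which the exact answer is known), and units in which $\beta = 1$.

```python
import math
import random


def sweep(z, lambdas, counts, b, beta):
    """One self-consistent update of all Z_k, renormalized so that Z_1 = 1."""
    m = len(lambdas)
    new_z = []
    for k in range(m):
        total = 0.0
        for b_ji in b:  # pooled binding energies from all lambda states
            denom = sum(counts[r] / z[r]
                        * math.exp(beta * (lambdas[k] - lambdas[r]) * b_ji)
                        for r in range(m))
            total += 1.0 / denom
        new_z.append(total)
    return [value / new_z[0] for value in new_z]


def self_consistent(lambdas, counts, b, beta, iterations=100):
    z = [1.0] * len(lambdas)  # initial guess Z_k = 1
    for _ in range(iterations):
        z = sweep(z, lambdas, counts, b, beta)
    return z


# Toy model (assumption): at lambda = 0 the binding energy is Gaussian with
# mean MU and standard deviation SIGMA. State lambda is then Gaussian with
# mean MU - lambda * SIGMA**2, and
# log(Z_lambda / Z_0) = -lambda * MU + (lambda * SIGMA)**2 / 2.
random.seed(2)
MU, SIGMA, BETA = 1.0, 0.5, 1.0
lambdas = [0.0, 0.5, 1.0]
counts = [300, 300, 300]
b = [random.gauss(MU - lam * SIGMA ** 2, SIGMA)
     for lam, k in zip(lambdas, counts) for _ in range(k)]

z = self_consistent(lambdas, counts, b, BETA)
exact = [-lam * MU + (lam * SIGMA) ** 2 / 2 for lam in lambdas]
print("estimated log Z:", [round(math.log(v), 3) for v in z])
print("exact     log Z:", [round(v, 3) for v in exact])
```

Each sweep costs about as much as one gradient evaluation of the criterion function, so the difference in run time between the two routes comes from the number of iterations required, which grows for the fixed-point iteration when neighboring states overlap poorly.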

The last two columns in Table III are the results based on subsampled data,35 including the point estimates and analytical errors computed as described in Appendix B. For each system, a subsample of size 20 has been selected, with 1 in every 50 time points, from the original sample of size 1000. The point estimates are reasonably close to those based on the original samples. The analytical errors, assuming uncorrelated data, are approximately 0.50 kcal/mol for all the systems, with or without the soft-core potential. Adjusting for sample sizes, the errors based on uncorrelated data of size 1000 would be about $0.50/\sqrt{50} \approx 0.07$ kcal/mol. Comparison of such adjusted errors with the block-bootstrap errors for the original data then indicates statistical inefficiency36 caused by correlations. For example, for ligand 2, the factor of statistical inefficiency due to correlated data is $(0.12/0.07)^2 = 2.9$ for the unmodified potential and $(0.19/0.07)^2 = 7.4$ for the soft-core potential. It is interesting to note that the statistical inefficiencies with the soft-core potential are consistently larger than those with the unmodified potential, implying smaller correlations and faster convergence of binding free energies with the latter.
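The arithmetic behind these inefficiency factors can be reproduced directly. The snippet below is a small worked check using the rounded numbers quoted in the text; the helper name `inefficiency` is introduced here for illustration.

```python
import math

subsampled_error = 0.50   # kcal/mol, analytical error from the 1-in-50 subsample
thinning = 50             # 1000 original points reduced to 20
adjusted_error = subsampled_error / math.sqrt(thinning)


def inefficiency(bootstrap_error, uncorrelated_error):
    """Ratio of variances: block-bootstrap error versus uncorrelated-data error."""
    return (bootstrap_error / uncorrelated_error) ** 2


print(round(adjusted_error, 2))              # 0.07
print(round(inefficiency(0.12, 0.07), 1))    # 2.9 (ligand 2, unmodified)
print(round(inefficiency(0.19, 0.07), 1))    # 7.4 (ligand 2, soft-core)
```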

1. MBAR/UWHAM probability densities

The BEDAM binding free energy theory highlights the fundamental importance of the probability densities $p_\lambda(b)$ of the binding energy as a function of the progress parameter $\lambda$. For example, we have shown29 that the binding free energy (Eq. (39)) can be written as

$$
\Delta G_b = -kT \log \int p_0(b)\, e^{-\beta b}\, db , \tag{45}
$$

where $p_0(b)$ is the probability density of the binding energies at $\lambda = 0$, that is, in the absence of ligand-receptor interactions. The probability density of binding energies $p_1(b)$ of the ligand-receptor coupled state is also of special interest. The mean of $p_1(b)$ is the average binding energy $\langle b \rangle_1$, which measures the driving force toward binding provided by favorable ligand-receptor interactions. The difference between the binding free energy and the average binding energy is the binding reorganization free energy that measures energetic strain and entropic factors which oppose binding. In addition to thermodynamic decompositions of this kind, $p_1(b)$ also leads to conformational decompositions of the binding free energy. $p_1(b)$ can be interpreted as the contribution to the binding affinity of conformations with binding energy $b$ and, consequently, distinct macrostates of the complex contribute to binding affinity proportionally to the integrated intensity of the corresponding components of $p_1(b)$.18,29

It is straightforward to estimate these probability densities by binning and WHAM using Eqs. (4) and (7), which for the present application can be condensed as2

$$
p_\lambda(b)\, \Delta b = \frac{1}{Z_\lambda}\, \frac{\sum_r N_r(b)}{\sum_r n_r Z_{\lambda_r}^{-1} e^{\beta(\lambda - \lambda_r) b}} , \tag{46}
$$

where $p_\lambda(b)$ is the estimate of the probability density in correspondence with a bin centered at $b$, with bin width $\Delta b$, and $N_r(b)$ is the number of observations in that bin from the simulation at $\lambda = \lambda_r$. As presented above (see Eqs. (22) and (23)), the procedure to obtain probability densities and their moments (such as expectation values) is somewhat different when using binless methods. First, each binding energy observation $b_{ji}$ is assigned a $\lambda$-dependent statistical weight given in this case by

$$
F_\lambda(b_{ji}) = \frac{1}{Z_\lambda}\, \frac{1}{\sum_r n_r Z_{\lambda_r}^{-1} e^{\beta(\lambda - \lambda_r) b_{ji}}} . \tag{47}
$$

The sum of statistical weights over the samples is automatically unitary. Expectation values are computed as weighted averages using the weights in Eq. (47). For example, the average binding energy at $\lambda$ is

$$
\langle b \rangle_\lambda = \sum_{ji} b_{ji}\, F_\lambda(b_{ji}) . \tag{48}
$$

Averages of other properties can be obtained similarly by replacing $b_{ji}$ in Eq. (48) with any property of the sampled conformation $ji$. As discussed in Sec. II, this expression can also be used to estimate probability densities, such as the binding energy densities $p_\lambda(b)$. These can be approximated by the relationship

$$
p_\lambda(b_k)\, \Delta b_k \simeq \langle \delta_{b_k}(b) \rangle_\lambda , \tag{49}
$$

where $\delta_{b_k}(b)$ is a function defined as 1 if the argument falls within the bin centered at binding energy $b_k$ with width $\Delta b_k$ and zero otherwise. Then the average in Eq. (49) is computed using the equivalent of Eq. (48)

$$
p_\lambda(b_k)\, \Delta b_k \simeq \sum_{ji} \delta_{b_k}(b_{ji})\, F_\lambda(b_{ji}) . \tag{50}
$$

So the UWHAM calculation of $p_\lambda(b)$ basically consists of binning samples based on their binding energies and then creating a histogram in which the height of each bin is the sum of the weights $F_\lambda(b_{ji})$ of the observations collected in that bin.
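The weighting and histogramming steps can be sketched as follows. This is an illustrative Python fragment, not the authors' R module: the function names are hypothetical, the six binding energies and the partition function values in `z` are made-up inputs, and $\beta = 1$ is assumed. The weights are normalized by their own sum, which plays the role of the partition function estimate at the target $\lambda$.

```python
import math


def uwham_weights(lam, lambdas, counts, z, b, beta):
    """Normalized statistical weights of the pooled binding energies at lambda."""
    raw = []
    for b_ji in b:
        denom = sum(counts[r] / z[r] * math.exp(beta * (lam - lambdas[r]) * b_ji)
                    for r in range(len(lambdas)))
        raw.append(1.0 / denom)
    z_lam = sum(raw)  # partition function estimate at the target lambda
    return [w / z_lam for w in raw]


def weighted_histogram(b, weights, edges):
    """Sum of weights falling in each bin [edges[k], edges[k + 1])."""
    heights = [0.0] * (len(edges) - 1)
    for b_ji, w in zip(b, weights):
        for k in range(len(heights)):
            if edges[k] <= b_ji < edges[k + 1]:
                heights[k] += w
                break
    return heights


# Hypothetical inputs for illustration only
lambdas = [0.0, 1.0]
counts = [3, 3]
z = [1.0, 2.5]                       # made-up partition functions, Z_1 = 1
b = [4.0, 1.5, 0.5, -1.0, -2.0, -0.5]
beta = 1.0
edges = [-3.0, -1.0, 1.0, 5.0]

w1 = uwham_weights(1.0, lambdas, counts, z, b, beta)
w0 = uwham_weights(0.0, lambdas, counts, z, b, beta)
mean_b1 = sum(x * w for x, w in zip(b, w1))   # average binding energy at lambda = 1
mean_b0 = sum(x * w for x, w in zip(b, w0))   # average binding energy at lambda = 0
heights = weighted_histogram(b, w1, edges)
density = [h / (edges[k + 1] - edges[k]) for k, h in enumerate(heights)]
print("mean b at lambda = 1:", round(mean_b1, 3))
print("density at lambda = 1:", [round(d, 4) for d in density])
```

Dividing each bin height by its width converts the summed weights into a density, which matters when the bins are not equally spaced.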

Figures 1 and 2 illustrate the $p_1(b)$ and $p_0(b)$ probability densities obtained by UWHAM and Eq. (50) for the complex with ligand 6.30 These are compared with the corresponding probability density estimates from the histograms of the data collected only at $\lambda = 1$ and $\lambda = 0$, respectively. There is good agreement between the two estimates in the region of binding energies well sampled at the respective $\lambda$ values, further validating the UWHAM results. The tails of the probability densities are estimated much more accurately by UWHAM than by the direct histograms because these are rarely sampled by the simulations conducted only at a specific $\lambda$. The UWHAM probability densities are instead estimated from data obtained from simulations at multiple $\lambda$ values between 0 and 1 which explore a much wider range of binding energies. Obtaining accurate tails of probability densities is very important in a variety of applications, for example when employing Eq. (45) to estimate the binding free energy from $p_0(b)$ (see Fig. 2). Due to the exponential term in the integrand of Eq. (45), the $p_0(b)$ density in the range $-40 < b < -10$ kcal/mol dominates the estimate of the binding free energy, and the data collected at $\lambda = 0$ constitute a very poor estimate of $p_0(b)$ in this region of binding energies (although this is difficult to see in Fig. 2 because of the log-log representation).

C. Simulation setup and numerical analysis

BEDAM calculations29 were performed for six complexes of FKBP with ligands 2, 3, 5, 6, 8, and 9 from Ref. 37, from a ligand set which was the subject of previous binding free energy calculations.30,38,39 Complexes were prepared as described30 based on the crystal structures of ligands 8 and 9 (PDB IDs 1FKG and 1FKH, respectively). Two BEDAM calculations were conducted for each complex, both employing Eq. (44) to represent the protein-ligand interaction potential, one with $b_{\max} = 10^9$ kcal/mol (referred to as the unmodified potential) and the other with $b_{\max} = 10^3$ kcal/mol (referred to as the soft-core potential). Soft-core calculations employed 15 BEDAM replicas at $\lambda = 0$, $10^{-3}$, $2 \times 10^{-3}$, $4 \times 10^{-3}$, $6 \times 10^{-3}$, $8 \times 10^{-3}$, $10^{-2}$, $2 \times 10^{-2}$, $6 \times 10^{-2}$, 0.1, 0.25, 0.5, 0.75, 0.9, and 1. Calculations with the unmodified potential employed 18 replicas at $\lambda = 0$, $10^{-9}$, $10^{-8}$, $10^{-7}$, $10^{-6}$, $10^{-5}$, $10^{-4}$, $10^{-3}$, $10^{-2}$, $10^{-1}$, 0.15, 0.25, 0.35, 0.5, 0.75, 0.9, and 1. Hamiltonian replica exchange simulations were conducted for 2 ns per replica (396 ns total simulation time). Binding energies were recorded at 1 ps intervals during the second half of the simulations, yielding 1000 observations per replica.

WHAM analysis was performed employing Eqs. (43), (45), and (46) as described² on the collected binding energy data, $b_{ji}$, using a binning grid starting at −40 kcal/mol (the lowest recorded binding energy value) and extending to a set maximum (see Tables I and II) for the unmodified binding energy function, or to the maximum allowed value ($b_{\max} = 10^{3}$ kcal/mol) for the soft-core data. Grid spacing was set to 0.3 kcal/mol in the (−40, −10) kcal/mol binding energy range, increasing exponentially starting from this value at a rate adjusted to reach the variable maximum set value with the given number of bins.
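One way to build a grid of this kind is sketched below. The text does not give the exact functional form of the exponential growth, so the geometric bin widths `h0 * r**k`, the bisection search for the growth rate `r`, the function names, and the default of 50 tail bins are all assumptions made for illustration.

```python
def exponential_tail_edges(b_start, b_max, h0, n_bins, tol=1e-12):
    """Edges of n_bins bins from b_start to b_max with widths h0*r, h0*r**2, ..."""
    span = b_max - b_start
    if span <= n_bins * h0:
        raise ValueError("span too short for bin widths that grow from h0")

    def total_width(r):
        return sum(h0 * r ** k for k in range(1, n_bins + 1))

    lo, hi = 1.0, 2.0
    while total_width(hi) < span:      # bracket the growth rate
        hi *= 2.0
    while hi - lo > tol:               # bisection on the growth rate
        mid = 0.5 * (lo + hi)
        if total_width(mid) < span:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)

    edges = [b_start]
    for k in range(1, n_bins + 1):
        edges.append(edges[-1] + h0 * r ** k)
    edges[-1] = b_max                  # remove round-off in the last edge
    return r, edges


def binning_grid(b_min=-40.0, b_split=-10.0, b_max=1.0e3, h0=0.3, n_tail_bins=50):
    """Uniform bins of width h0 on [b_min, b_split], then growing bins up to b_max."""
    n_uniform = round((b_split - b_min) / h0)
    uniform = [b_min + i * h0 for i in range(n_uniform)]
    r, tail = exponential_tail_edges(b_split, b_max, h0, n_tail_bins)
    return r, uniform + tail


growth_rate, edges = binning_grid()
print(len(edges) - 1, "bins, growth rate", growth_rate)
```

With these defaults the grid has 100 uniform bins followed by 50 growing bins; changing `b_max` or `n_tail_bins` changes the growth rate, which mirrors the "rate adjusted to reach the variable maximum" described in the text.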

UWHAM analysis was conducted on the same binding energy data to obtain the logarithm of the partition functions $\log Z_{\lambda_k}$ (relative to $\log Z_0$, which is set to 0) by minimization of the function $\kappa(\log z_1, \ldots, \log z_m)$ in Eq. (42) with respect to $\log z_k$, setting the free energy of the unbound state to zero ($\log z_1 = \log Z_0 = 0$). For the minimization, we used the trust region algorithm¹⁵ as implemented in the R package “trust”.⁴⁰ A similar procedure has been recently proposed in the context of WHAM.⁶ The code for the UWHAM R module we employed and a set of use examples in R are available from the authors upon request. MBAR calculations were performed using the code kindly provided by Chodera and Shirts.¹¹

Statistical uncertainties were computed using the block bootstrapping method.¹⁷ Similarly to the traditional block averaging approach,⁴¹ the method consists of dividing the sampled data into $N_b$ time-contiguous blocks ($N_b = 20$ in this work). However, in this context blocks span all of the replicas; that is, each block contains the data generated from all of the replicas in the same time window. Each block is then assigned an integer identifier, and the list of block identifiers plays the same role as the data samples in the standard bootstrap method. Namely, a new block identifier list of length $N_b$ is created by sampling with replacement from the original list, and a corresponding new binding energy dataset is generated by collating the data contained in the blocks of the new list. This is repeated a number of times (100 times in this case), and the statistical uncertainty of the binding free energy is estimated from the standard deviation of the free energy values obtained from the bootstrap samples. The advantage of this bootstrap technique is that it accounts for time correlation of samples originating from each replica as well as cross-correlations between replicas due to λ exchanges.
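The resampling procedure just described can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the paper: the function and argument names are ours, the data layout (`data[k][t]` for replica `k` at time index `t`) is an assumption, and the pooled mean used in the example is only a placeholder for the free energy estimator.

```python
import random
import statistics


def block_bootstrap_error(data, estimator, n_blocks=20, n_boot=100, seed=0):
    """Standard deviation of estimator over block-bootstrap resamples.

    data[k][t] is the observation at time index t from replica k; all
    replicas are assumed to have the same length. Observations beyond
    n_blocks * block_len are ignored.
    """
    rng = random.Random(seed)
    block_len = len(data[0]) // n_blocks
    # Block identifier b refers to the same time window in every replica.
    block_ids = list(range(n_blocks))
    estimates = []
    for _ in range(n_boot):
        # Resample block identifiers with replacement.
        chosen = [rng.choice(block_ids) for _ in range(n_blocks)]
        resampled = []
        for series in data:
            new_series = []
            for b in chosen:
                new_series.extend(series[b * block_len:(b + 1) * block_len])
            resampled.append(new_series)
        estimates.append(estimator(resampled))
    return statistics.stdev(estimates)


def pooled_mean(dataset):
    """Placeholder estimator: mean over all replicas and times."""
    values = [x for series in dataset for x in series]
    return sum(values) / len(values)


# Toy data: 3 replicas with 1000 uncorrelated observations each.
toy_rng = random.Random(1)
toy = [[toy_rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
err = block_bootstrap_error(toy, pooled_mean)
print(err)
```

Because the same list of block identifiers is applied to every replica, each resampled dataset keeps together the observations that were generated in the same time window, which is what preserves the cross-replica correlations mentioned in the text.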

IV. CONCLUSION

We demonstrate the statistical validity and usefulness of interpreting MBAR as a binless formulation of WHAM. Like WHAM, the binless formulation can be used not only to estimate free energies and equilibrium expectations, but also to estimate equilibrium distributions. This development allows practitioners to easily build on their current applications of WHAM, but without discretizing observations into bins, which may sometimes incur substantial biases. This is illustrated for alchemical absolute binding free energy calculations using the BEDAM technique. While the UWHAM and MBAR¹¹ binless implementations yield equivalent results for both the unmodified and soft-core potentials, binning of the unmodified data leads to substantial biases which vary depending on the level of discretization. These results indicate that binless multi-state inference approaches are potentially a straightforward alternative to soft-core potentials for alchemical binding free energy calculations.

ACKNOWLEDGMENTS

This work has been supported in part by research grants from the National Institutes of Health (Grant No. GM30580) and the National Science Foundation (CDI type II Grant No. 1125332 and DMS-0749718). The calculations reported in this work have been performed at the BioMaPS High Performance Computing Center at Rutgers University, funded in part by the NIH shared instrumentation (Grant Nos. 1 S10 RR022375 and 1 S10 RR027444), and on the Lonestar4 cluster at the Texas Advanced Computing Center under TeraGrid/XSEDE National Science Foundation allocation (Grant No. MCB100145). The authors are grateful to John Chodera and Michael Shirts for providing valuable suggestions and guidance.

APPENDIX A: COMPUTING POINT ESTIMATORS

To compute $\log(Z_{\theta_2}/Z_{\theta_1}, \ldots, Z_{\theta_m}/Z_{\theta_1})$, we minimize $\kappa(0, \zeta_2, \ldots, \zeta_m)$ using the trust region algorithm implemented by the R package trust.⁴⁰ This algorithm is globally convergent at the second order (Sec. 4.2 of Ref. 15). Below we provide formulas for evaluating $\kappa$ and its gradient and Hessian, which are required by the trust region algorithm.

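Arrange the pooled data into a column vector $(u_1, \ldots, u_n)^{\mathrm T}$. Let $\Lambda_s$ be the $m \times m$ diagonal matrix with $(j, j)$th element $n_j/n$, and $Q_s$ and $W_s$ be the $n \times m$ matrices, respectively, with $(i, j)$th element

$$
Q_{ij} = e^{-\zeta_j} e^{-\theta_j^{\mathrm T} u_i}, \qquad
W_{ij} = \frac{e^{-\zeta_j} e^{-\theta_j^{\mathrm T} u_i}}{\sum_{r=1}^{m} \frac{n_r}{n}\, e^{-\zeta_r} e^{-\theta_r^{\mathrm T} u_i}}.
$$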

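Write $\zeta = (\zeta_1, \ldots, \zeta_m)^{\mathrm T}$ and $1_m$ (or $1_n$) as the column vector of $m$ (or $n$) ones. Then

$$
\kappa(\zeta) = \frac{1_n^{\mathrm T}}{n} \log(Q_s \Lambda_s 1_m) + \sum_{r=1}^{m} \frac{n_r}{n}\, \zeta_r,
$$

$$
\frac{\partial \kappa}{\partial \zeta}(\zeta) = -\Lambda_s W_s^{\mathrm T} \frac{1_n}{n} + \Lambda_s 1_m,
$$

$$
\frac{\partial^2 \kappa}{\partial \zeta\, \partial \zeta^{\mathrm T}}(\zeta) = -\frac{1}{n} \Lambda_s W_s^{\mathrm T} W_s \Lambda_s + \operatorname{diag}\!\left(\Lambda_s W_s^{\mathrm T} \frac{1_n}{n}\right),
$$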

where $\operatorname{diag}(c)$ is the diagonal matrix with $(j, j)$th element $c_j$ for a vector $c = (c_1, \ldots, c_m)$. The gradient (or Hessian) of $\kappa(0, \zeta_2, \ldots, \zeta_m)$ is formed by deleting the first element (or the first row and column) from that of $\kappa(\zeta)$.
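The formulas of this appendix can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the UWHAM R module: the names `kappa_terms`, `solve`, and `minimize_kappa` and the layout `energies[i][j]` (holding $\theta_j^{\mathrm T} u_i$) are ours, and a plain Newton iteration with step halving is used as a simple stand-in for the trust region algorithm of the R package trust. The toy problem has two one-dimensional states with $\theta_1^{\mathrm T} u = x^2/2$ and $\theta_2^{\mathrm T} u = 2x^2$, for which $\log(Z_{\theta_2}/Z_{\theta_1}) = -\log 2$ exactly.

```python
import math
import random


def kappa_terms(zeta, energies, counts):
    """Return kappa(zeta), its gradient, and its Hessian.

    energies[i][j] holds theta_j^T u_i for pooled observation i and state j;
    counts[j] is n_j, the number of observations drawn from state j.
    """
    n, m = len(energies), len(counts)
    frac = [c / n for c in counts]                  # n_j / n
    kappa = sum(f * z for f, z in zip(frac, zeta))
    mean_p = [0.0] * m
    outer = [[0.0] * m for _ in range(m)]
    for row in energies:
        a = [math.log(frac[j]) - zeta[j] - row[j] for j in range(m)]
        a_max = max(a)                              # log-sum-exp for stability
        log_den = a_max + math.log(sum(math.exp(x - a_max) for x in a))
        kappa += log_den / n
        p = [math.exp(x - log_den) for x in a]      # (n_j / n) * W_ij
        for j in range(m):
            mean_p[j] += p[j] / n
            for k in range(m):
                outer[j][k] += p[j] * p[k] / n
    grad = [frac[j] - mean_p[j] for j in range(m)]
    hess = [[(mean_p[j] if j == k else 0.0) - outer[j][k] for k in range(m)]
            for j in range(m)]
    return kappa, grad, hess


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(size)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for r in reversed(range(size)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (aug[r][size] - s) / aug[r][r]
    return x


def minimize_kappa(energies, counts, tol=1e-8, max_iter=50):
    """Newton iteration with step halving; zeta[0] is held at 0 (needs m >= 2)."""
    zeta = [0.0] * len(counts)
    for _ in range(max_iter):
        kappa, grad, hess = kappa_terms(zeta, energies, counts)
        g = grad[1:]                                # delete the first element
        h = [row[1:] for row in hess[1:]]           # delete first row and column
        if max(abs(x) for x in g) < tol:
            break
        step = solve(h, [-x for x in g])
        scale = 1.0
        while scale > 1e-8:                         # halve until kappa does not increase
            trial = [0.0] + [z + scale * s for z, s in zip(zeta[1:], step)]
            if kappa_terms(trial, energies, counts)[0] <= kappa + 1e-12:
                break
            scale *= 0.5
        zeta = trial
    return zeta


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]     # samples from state 1
xs += [rng.gauss(0.0, 0.5) for _ in range(1000)]    # samples from state 2
energies = [[0.5 * x * x, 2.0 * x * x] for x in xs]
zeta_hat = minimize_kappa(energies, [1000, 1000])
print(zeta_hat[1], -math.log(2.0))
```

The quantity `p` in `kappa_terms` is the product $(n_j/n)\,W_{ij}$, so the gradient and Hessian are assembled directly from its sample averages; the rows of `p` sum to one, which is why the full Hessian is singular and the first row and column have to be deleted before solving.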

APPENDIX B: COMPUTING VARIANCE MATRICES

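Suppose that $Z_\theta/Z_{\theta_1}$ is computed for $k$ values, $\theta_{m+1}, \ldots, \theta_{m+k}$, of $\theta$. Let $R$ be the $n \times m$ matrix with $(i, j)$th element $(n_j/n)^{-1}$ if $u_i$ is sampled from $F_{\theta_j}$ and 0 otherwise, and $W$ be the $n \times (m + k)$ matrix with $(i, j)$th element

$$
W_{ij} = \frac{(Z_{\theta_j}/Z_{\theta_1})^{-1} e^{-\theta_j^{\mathrm T} u_i}}{\sum_{r=1}^{m} \frac{n_r}{n}\, (Z_{\theta_r}/Z_{\theta_1})^{-1} e^{-\theta_r^{\mathrm T} u_i}}.
$$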

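Let $I_{m+k}$ be the identity matrix of size $m + k$, $0_{(m+k) \times k}$ be the $(m + k) \times k$ matrix of zeros, and

$$
O = \frac{1}{n} W^{\mathrm T} W, \qquad B = \{O_s \Lambda_s,\; 0_{(m+k) \times k}\} - I_{m+k},
$$

$$
D = \frac{1}{n} W^{\mathrm T} R, \qquad A = O - D \Lambda_s D^{\mathrm T},
$$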

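where $O_s$ is the $(m + k) \times m$ matrix consisting of the first $m$ columns in $O$. The $(r, j)$th element of $D$ is the sample average of the $(i, r)$th elements of $W$ over those $i = 1, \ldots, n$ such that $u_i$ is sampled from $F_{\theta_j}$. The asymptotic variance matrix of $\log(Z_{\theta_2}/Z_{\theta_1}, \ldots, Z_{\theta_m}/Z_{\theta_1}, Z_{\theta_{m+1}}/Z_{\theta_1}, \ldots, Z_{\theta_{m+k}}/Z_{\theta_1})$ can be consistently estimated by

$$
\frac{1}{n}\, B_{(1)}^{-1} A_{(1)} \bigl(B_{(1)}^{\mathrm T}\bigr)^{-1}, \tag{B1}
$$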

where $A_{(1)}$ and $B_{(1)}$ are formed by deleting the first row and column from $A$ and $B$. Alternatively, formula (B1) can be used with $A$ replaced by $O - O_s \Lambda_s O_s^{\mathrm T}$. The resulting formula does not require the use of the information about which observation is sampled from which distribution.

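Suppose that $\langle h \rangle_\theta$ is estimated for $\theta = \theta_1, \ldots, \theta_m, \theta_{m+1}, \ldots, \theta_{m+k}$. Write formula (22) as the ratio $Z^h_\theta / Z_\theta$, where

$$
Z^h_\theta = \sum_{j=1}^{m} \sum_{i=1}^{n_j} \frac{h(u_{ji})}{\sum_{r=1}^{m} n_r Z_{\theta_r}^{-1} e^{(\theta - \theta_r)^{\mathrm T} u_{ji}}}.
$$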

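Redefine $W$ as the $n \times (m + k)$ matrix with $(i, j)$th element

$$
W_{ij} = \frac{e^{-\theta_j^{\mathrm T} u_i}}{\sum_{r=1}^{m} \frac{n_r}{n}\, (Z_{\theta_r}/Z_{\theta_1})^{-1} e^{-\theta_r^{\mathrm T} u_i}}.
$$

Let $W^h$ be the $n \times (m + k)$ matrix with $(i, j)$th element

$$
W^h_{ij} = \frac{h(u_i)\, e^{-\theta_j^{\mathrm T} u_i}}{\sum_{r=1}^{m} \frac{n_r}{n}\, (Z_{\theta_r}/Z_{\theta_1})^{-1} e^{-\theta_r^{\mathrm T} u_i}}.
$$

Now replace $W$ by $(W, W^h)$ throughout and redefine

$$
B = \{O_s C^{-1} \Lambda_s C^{-1},\; 0_{(2m+2k) \times (m+2k)}\} - I_{2m+2k},
$$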

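where $O_s$ is the $(2m + 2k) \times m$ matrix consisting of the first $m$ columns in $O$, and $C$ is the $(m + k) \times (m + k)$ diagonal matrix with $(j, j)$th element $Z_{\theta_j}/Z_{\theta_1}$. The asymptotic variance matrix of $(Z_{\theta_2}/Z_{\theta_1}, \ldots, Z_{\theta_{m+k}}/Z_{\theta_1}, Z^h_{\theta_1}/Z_{\theta_1}, \ldots, Z^h_{\theta_{m+k}}/Z_{\theta_1})$ can be consistently estimated by formula (B1), which is denoted by $V$. The asymptotic variance matrix of $(Z^h_{\theta_1}/Z_{\theta_1}, \ldots, Z^h_{\theta_{m+k}}/Z_{\theta_{m+k}})$ can be consistently estimated by

$$
C^{-1} \bigl(-C^h,\; I_{m+k}\bigr)
\begin{pmatrix} 0 & 0_{2m+2k-1}^{\mathrm T} \\ 0_{2m+2k-1} & V \end{pmatrix}
\begin{pmatrix} -C^h \\ I_{m+k} \end{pmatrix} C^{-1},
$$

where $0_{2m+2k-1}$ is the column vector of $2m + 2k - 1$ zeros and $C^h$ is the $(m + k) \times (m + k)$ diagonal matrix with $(j, j)$th element $Z^h_{\theta_j}/Z_{\theta_j}$.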

¹A. M. Ferrenberg and R. H. Swendsen, “Optimized Monte Carlo data analysis,” Phys. Rev. Lett. 63, 1195–1198 (1989).

²E. Gallicchio, M. Andrec, A. K. Felts, and R. M. Levy, “Temperature weighted histogram analysis method, replica exchange, and transition paths,” J. Phys. Chem. B 109, 6722–6731 (2005).

³Free Energy Calculations. Theory and Applications in Chemistry and Biology, Springer Series in Chemical Physics, edited by C. Chipot and A. Pohorille (Springer, Berlin/Heidelberg, 2007).

⁴M. E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics (Oxford University Press, New York, 1999).

⁵C. Bartels and M. Karplus, “Multidimensional adaptive umbrella sampling: Application to main chain and side chain peptide conformations,” J. Comput. Chem. 18, 1450–1462 (1997).

⁶F. Zhu and G. Hummer, “Convergence and error estimation in free energy calculations using the weighted histogram analysis method,” J. Comput. Chem. (in press).

⁷C. J. Geyer, “Estimating normalizing constants and reweighting mixtures in Markov chain Monte Carlo,” Technical report, University of Minnesota, School of Statistics, 1994.

⁸X.-L. Meng and W. H. Wong, “Simulating ratios of normalizing constants via a simple identity: A theoretical explanation,” Stat. Sin. 6, 831–860 (1996).

⁹A. Kong, P. McCullagh, X.-L. Meng, D. Nicolae, and Z. Tan, “A theory of statistical models for Monte Carlo integration,” J. R. Stat. Soc. Ser. B (Stat. Methodol.) 65, 585–618 (2003).

¹⁰Z. Tan, “On a likelihood approach for Monte Carlo integration,” J. Am. Stat. Assoc. 99(468), 1027–1036 (2004).

¹¹M. R. Shirts and J. D. Chodera, “Statistically optimal analysis of samples from multiple equilibrium states,” J. Chem. Phys. 129(12), 124105 (2008).

¹²C. H. Bennett, “Efficient estimation of free energy differences from Monte Carlo data,” J. Comput. Phys. 22(2), 245–268 (1976).

¹³N. Lu, J. K. Singh, and D. A. Kofke, “Appropriate methods to combine forward and reverse free-energy perturbation averages,” J. Chem. Phys. 118(7), 2977–2984 (2003).

¹⁴P. Billingsley, Probability and Measure (Wiley, New York, 1995).

¹⁵J. Nocedal and S. J. Wright, Numerical Optimization (Springer-Verlag, New York, 1999).

¹⁶R. Gill, Y. Vardi, and J. Wellner, “Large sample theory of empirical distributions in biased sampling models,” Ann. Stat. 16, 1069–1112 (1988).

¹⁷M. R. Chernick, Bootstrap Methods: A Guide for Practitioners and Researchers, 2nd ed. (Wiley, Hoboken, NJ, 2008).

¹⁸E. Gallicchio and R. M. Levy, “Recent theoretical and computational advances for modeling protein-ligand binding affinities,” in Advances in Protein Chemistry and Structural Biology (Academic, 2011), Vol. 85, pp. 27–80.

¹⁹M. K. Gilson, J. A. Given, B. L. Bush, and J. A. McCammon, “The statistical-thermodynamic basis for computation of binding affinities: A critical review,” Biophys. J. 72, 1047–1069 (1997).

²⁰D. L. Mobley and K. A. Dill, “Binding of small-molecule ligands to proteins: ‘what you see’ is not always ‘what you get’,” Structure (London) 17(4), 489–498 (2009).

²¹Y. Deng and B. Roux, “Computations of standard binding free energies with molecular dynamics simulations,” J. Phys. Chem. B 113(8), 2234–2246 (2009).

²²J. Michel and J. W. Essex, “Prediction of protein-ligand binding affinity by free energy simulations: assumptions, pitfalls, and expectations,” J. Comput.-Aided Mol. Des. 24(8), 639–658 (2010).

²³J. D. Chodera, D. L. Mobley, M. R. Shirts, R. W. Dixon, K. Branson, and V. S. Pande, “Alchemical free energy methods for drug discovery: Progress and challenges,” Curr. Opin. Struct. Biol. 21, 150–160 (2011).

²⁴M. R. Shirts and V. S. Pande, “Comparison of efficiency and bias of free energies computed by exponential averaging, the Bennett acceptance ratio, and thermodynamic integration,” J. Chem. Phys. 122(14), 144107 (2005).

²⁵A. Pohorille, C. Jarzynski, and C. Chipot, “Good practices in free-energy calculations,” J. Phys. Chem. B 114(32), 10235–10253 (2010).

²⁶T. Steinbrecher, D. L. Mobley, and D. A. Case, “Nonlinear scaling schemes for Lennard-Jones interactions in free energy calculations,” J. Chem. Phys. 127(21), 214108 (2007).

²⁷T. Steinbrecher, I. Joung, and D. A. Case, “Soft-core potentials in thermodynamic integration: Comparing one- and two-step transformations,” J. Comput. Chem. 32(15), 3253–3263 (2011).

²⁸F. P. Buelens and H. Grubmüller, “Linear-scaling soft-core scheme for alchemical free energy calculations,” J. Comput. Chem. 33(1), 25–33 (2012).

²⁹E. Gallicchio, M. Lapelosa, and R. M. Levy, “Binding energy distribution analysis method (BEDAM) for estimation of protein-ligand binding affinities,” J. Chem. Theory Comput. 6(9), 2961–2977 (2010).

³⁰M. Lapelosa, E. Gallicchio, and R. M. Levy, “Conformational transitions and convergence of absolute binding free energy calculations,” J. Chem. Theory Comput. 8, 47–60 (2012).

³¹S. Kumar, D. Bouzida, R. H. Swendsen, P. A. Kollman, and J. M. Rosenberg, “The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method,” J. Comput. Chem. 13, 1011–1021 (1992).

³²E. Gallicchio and R. M. Levy, “Advances in all atom sampling methods for modeling protein-ligand binding affinities,” Curr. Opin. Struct. Biol. 21(2), 161–166 (2011).

³³C. D. Christ, A. E. Mark, and W. F. van Gunsteren, “Basic ingredients of free energy calculations: a review,” J. Comput. Chem. 31(8), 1569–1582 (2010).

³⁴E. Gallicchio and R. M. Levy, “Prediction of SAMPL3 host-guest affinities with the binding energy distribution analysis method (BEDAM),” J. Comput.-Aided Mol. Des. (in press).

³⁵H. Paliwal and M. R. Shirts, “A benchmark test set for alchemical free energy transformations and its use to quantify error in common free energy methods,” J. Chem. Theory Comput. 7(12), 4115–4134 (2011).

³⁶W. Janke, “Statistical analysis of simulations: Data correlations and error estimation,” in Quantum Simulations of Complex Many-Body Systems: From Theory to Algorithms (John von Neumann Institute for Computing, Jülich, Germany, 2002), pp. 423–445.


³⁷D. A. Holt, J. I. Luengo, D. S. Yamashita, H. J. Oh, A. L. Konialian, H. K. Yen, L. W. Rozamus, M. Brandt, and M. J. Bossard, “Design, synthesis, and kinetic evaluation of high-affinity FKBP ligands and the x-ray crystal structures of their complexes with FKBP12,” J. Am. Chem. Soc. 115(22), 9925–9938 (1993).

³⁸J. Wang, Y. Deng, and B. Roux, “Absolute binding free energy calculations using molecular dynamics simulations with restraining potentials,” Biophys. J. 91(8), 2798–2814 (2006).

³⁹H. Fujitani, Y. Tanida, M. Ito, J. Guha, C. D. Snow, M. R. Shirts, E. J. Sorin, and V. S. Pande, “Direct calculation of the binding free energies of FKBP ligands,” J. Chem. Phys. 123(8), 084108 (2005).

⁴⁰C. J. Geyer, Trust region optimization, R package 0.1-2, 2009, see http://www.stat.umn.edu/geyer/trust/.

⁴¹M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Oxford University Press, New York, 1993).
