Recent advances in quantum Monte Carlo
for quantum chemistry: optimization of wavefunctions and calculation of observables
Julien Toulouse (1), Cyrus J. Umrigar (2), Roland Assaraf (1)
(1) Laboratoire de Chimie Theorique, Universite Pierre et Marie Curie - CNRS, Paris, France.
(2) Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, New York, USA.
Email : [email protected]
Web page: www.lct.jussieu.fr/pagesperso/toulouse/
March 2009
1 Optimization of wave functions
2 Calculation of observables
1 Optimization of wave functions
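Trial wave function
Jastrow-Slater wave function:
\[ |\Psi(\mathbf{p})\rangle = J(\boldsymbol{\alpha}) \sum_{i=1}^{N_{\mathrm{CSF}}} c_i \, |C_i\rangle \]
• $J(\boldsymbol{\alpha})$ = Jastrow factor (with e-e, e-n, e-e-n terms)
• $|C_i\rangle$ = configuration state function (CSF) = linear combination of Slater determinants of given symmetry.
The Slater determinants are made of orbitals expanded on a Slater basis:
\[ \phi_k(\mathbf{r}) = \sum_{\mu=1}^{N_{\mathrm{basis}}} \lambda_{k\mu} \, \chi_\mu(\mathbf{r}) \]
\[ \chi(\mathbf{r}) = N(\zeta) \, r^{n-1} e^{-\zeta r} S_{l,m}(\theta,\phi) \]
Parameters to optimize: $\mathbf{p} = \{\boldsymbol{\alpha}, \mathbf{c}, \boldsymbol{\lambda}, \boldsymbol{\zeta}\}$, i.e. Jastrow parameters $\boldsymbol{\alpha}$, CSF coefficients $\mathbf{c}$, orbital coefficients $\boldsymbol{\lambda}$ and basis exponents $\boldsymbol{\zeta}$.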
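As a minimal sketch of the Slater basis expansion above, the radial part of $\chi$ and of an orbital $\phi_k$ could be evaluated as follows. The closed form $N(\zeta) = (2\zeta)^{n+1/2}/\sqrt{(2n)!}$ is the usual normalization of the radial factor and is an assumption here, since the slides do not spell out $N(\zeta)$. The function names, coefficients and exponents are hypothetical, and the angular factor $S_{l,m}$ is left out.

```python
import math
import numpy as np

def sto_norm(n, zeta):
    """Radial normalization N(zeta) such that int_0^inf r^2 chi(r)^2 dr = 1 (assumed convention)."""
    return (2.0 * zeta) ** (n + 0.5) / math.sqrt(math.factorial(2 * n))

def sto_radial(r, n, zeta):
    """Radial part N(zeta) r^(n-1) exp(-zeta r) of a Slater basis function."""
    return sto_norm(n, zeta) * r ** (n - 1) * np.exp(-zeta * r)

def orbital_radial(r, coeffs, ns, zetas):
    """Orbital as a linear combination sum_mu lambda_mu chi_mu(r) (radial parts only)."""
    return sum(c * sto_radial(r, n, z) for c, n, z in zip(coeffs, ns, zetas))

# Numerical check of the normalization on a fine radial grid (illustrative values).
r = np.linspace(0.0, 40.0, 200001)
dr = r[1] - r[0]
norm_check = float(np.sum((r * sto_radial(r, 2, 1.3)) ** 2) * dr)

# Hypothetical two-function expansion, evaluated on the same grid.
phi = orbital_radial(r, coeffs=[0.8, 0.3], ns=[1, 2], zetas=[5.7, 1.3])
```

In this picture the orbital coefficients $\lambda_{k\mu}$ enter linearly while the exponents $\zeta$ enter through both the exponential and the normalization, which is one reason the two sets are listed as separate groups of parameters.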
Wave function optimization: why and how?
Important for both VMC and DMC in order to
• reduce the systematic error
• reduce the statistical uncertainty
How to optimize?
• Until recently: minimization of the variance of the energy
    • OK for the few Jastrow parameters
    • but does not work well for the many CSF and orbital parameters
• Since recently: minimization of the energy (+ possibly a small fraction of variance)
    • works well for all the parameters
    • the energy is a better criterion
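Optimization method: principle
Expansion of the wave function around $\mathbf{p}_0$ to linear order in $\Delta\mathbf{p} = \mathbf{p} - \mathbf{p}_0$:
\[ |\Psi^{[1]}(\mathbf{p})\rangle = |\Psi_0\rangle + \sum_j \Delta p_j \, |\Psi_j\rangle \]
where $|\Psi_0\rangle = |\Psi(\mathbf{p}_0)\rangle$ and $|\Psi_j\rangle = \partial |\Psi(\mathbf{p}_0)\rangle / \partial p_j$.
The normalization of the wave function is chosen so that the derivatives $|\Psi_j\rangle$ are orthogonal to $|\Psi_0\rangle$.
Minimization of the energy ⟹ generalized eigenvalue equation:
\[ \begin{pmatrix} E_0 & \mathbf{g}^T/2 \\ \mathbf{g}/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\mathrm{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} \]
where $E_0 = \langle\Psi_0|H|\Psi_0\rangle$, $g_i = \partial E(\mathbf{p}_0)/\partial p_i$, $H_{ij} = \langle\Psi_i|H|\Psi_j\rangle$, $S_{ij} = \langle\Psi_i|\Psi_j\rangle$.
Update of the parameters: $\mathbf{p}_0 \to \mathbf{p}_0 + \Delta\mathbf{p}$.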
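The generalized eigenvalue equation above can be solved with standard dense linear algebra. The following is a minimal sketch on synthetic matrices, not the authors' implementation: the function name is hypothetical, the numbers are random, and picking the lowest eigenvalue is an assumption about how the relevant eigenvector is selected.

```python
import numpy as np

def linear_method_step(E0, g, H, S):
    """Solve the (n+1)x(n+1) generalized eigenvalue equation of the linear method.

    Returns (E_lin, delta_p) for the lowest eigenvalue, with the eigenvector
    scaled so that its first component equals 1.
    """
    n = len(g)
    A = np.empty((n + 1, n + 1))
    A[0, 0] = E0
    A[0, 1:] = g / 2.0
    A[1:, 0] = g / 2.0
    A[1:, 1:] = H
    B = np.zeros((n + 1, n + 1))
    B[0, 0] = 1.0
    B[1:, 1:] = S
    # B is invertible when S is positive definite, so B^{-1} A has the same eigenpairs.
    evals, evecs = np.linalg.eig(np.linalg.solve(B, A))
    k = int(np.argmin(evals.real))
    v = evecs[:, k].real
    return float(evals[k].real), v[1:] / v[0]

# Synthetic 3-parameter example (random numbers, not taken from any calculation).
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
S = M @ M.T + np.eye(3)          # symmetric positive definite overlap
H = rng.normal(size=(3, 3))
H = 0.5 * (H + H.T)              # symmetric Hamiltonian block
g = rng.normal(size=3)
E0 = -1.0

E_lin, dp = linear_method_step(E0, g, H, S)
# Second block row of the equation: g/2 + H dp - E_lin S dp should vanish.
residual = g / 2.0 + H @ dp - E_lin * (S @ dp)
```

With symmetric matrices the lowest eigenvalue cannot exceed $E_0$, because the vector $(1, \mathbf{0})$ already gives the Rayleigh quotient $E_0$.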
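Optimization method: robustness
The linear method is equivalent to a stabilized Newton method:
\[ \begin{pmatrix} E_0 & \mathbf{g}^T/2 \\ \mathbf{g}/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\mathrm{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} \iff \begin{cases} (\mathbf{h} + 2\Delta E \, \mathbf{S}) \cdot \Delta\mathbf{p} = -\mathbf{g} \\ 2\Delta E = -\mathbf{g}^T \cdot \Delta\mathbf{p} \end{cases} \]
where $\mathbf{h} = 2(\mathbf{H} - E_0 \mathbf{S})$ is an approximate Hessian, and $\Delta E = E_0 - E_{\mathrm{lin}} > 0$ is the energy stabilization.
⟹ more robust than the Newton method
In quantum chemistry, it is known as the super-CI method or augmented Hessian method.
Additional stabilization: $H_{ij} \to H_{ij} + a \, \delta_{ij}$ where $a \ge 0$.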
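The equivalence with a stabilized Newton method follows from the two block rows of the eigenvalue equation. As a short worked derivation, using only the symbols defined on this slide:

```latex
\begin{align}
% second block row
\frac{\mathbf{g}}{2} + \mathbf{H}\,\Delta\mathbf{p} = E_{\mathrm{lin}}\,\mathbf{S}\,\Delta\mathbf{p}
 &\;\Longrightarrow\;
 2(\mathbf{H} - E_0\mathbf{S})\,\Delta\mathbf{p}
 + 2(E_0 - E_{\mathrm{lin}})\,\mathbf{S}\,\Delta\mathbf{p} = -\mathbf{g}
 \;\Longrightarrow\;
 (\mathbf{h} + 2\Delta E\,\mathbf{S})\cdot\Delta\mathbf{p} = -\mathbf{g} \\
% first block row
E_0 + \frac{\mathbf{g}^T}{2}\cdot\Delta\mathbf{p} = E_{\mathrm{lin}}
 &\;\Longrightarrow\;
 2\Delta E = -\mathbf{g}^T\cdot\Delta\mathbf{p}
\end{align}
```

Setting $\Delta E = 0$ in the first line would give the plain Newton step $\mathbf{h}\cdot\Delta\mathbf{p} = -\mathbf{g}$, so the term $2\Delta E\,\mathbf{S}$ acts as a shift of the approximate Hessian. In the same notation, the additional stabilization $H_{ij} \to H_{ij} + a\,\delta_{ij}$ amounts to $\mathbf{h} \to \mathbf{h} + 2a\,\mathbf{1}$.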
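Optimization method: on a finite VMC sample
The generalized eigenvalue equation is estimated as
\[ \begin{pmatrix} E_0 & \mathbf{g}_R^T/2 \\ \mathbf{g}_L/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\mathrm{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} \]
with
\[ \frac{g_{L,i}}{2} = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})} \frac{H(\mathbf{R})\Psi_0(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2} \quad\text{and}\quad \frac{g_{R,j}}{2} = \left\langle \frac{\Psi_0(\mathbf{R})}{\Psi_0(\mathbf{R})} \frac{H(\mathbf{R})\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2} \]
\[ H_{ij} = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})} \frac{H(\mathbf{R})\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2} \ \text{(non-symmetric!)} \quad\text{and}\quad S_{ij} = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})} \frac{\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2} \]
⟹ Zero-variance principle of Nightingale et al. (PRL 2001): if there is some $\Delta\mathbf{p}$ so that $\Psi_0(\mathbf{R}) + \sum_j \Delta p_j \Psi_j(\mathbf{R}) = \Psi_{\mathrm{exact}}(\mathbf{R})$, then $\Delta\mathbf{p}$ is found with zero variance.
In practice, these non-symmetric estimators reduce the fluctuations on $\Delta\mathbf{p}$ by 1 or 2 orders of magnitude.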
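The zero-variance principle can be illustrated on a toy problem. The sketch below is not the authors' code: it assumes a 1D harmonic oscillator, $H = -\frac{1}{2}\,d^2/dx^2 + \frac{1}{2}x^2$, with the hypothetical one-parameter trial function $\Psi_0 = (1 + c_0 x^2)\,e^{-x^2/2}$ and derivative $\Psi_1 = x^2 e^{-x^2/2}$, so that the exact ground state $e^{-x^2/2}$ lies in the span of $\Psi_0$ and $\Psi_1$. All function names are illustrative, and the orthogonalization of $\Psi_1$ to $\Psi_0$ is done with sample averages.

```python
import numpy as np

def sample_psi0_squared(c0, n_samples, rng, step=1.0):
    """Metropolis sampling of Psi0(x)^2 with Psi0 = (1 + c0 x^2) exp(-x^2/2), c0 > 0."""
    def log_prob(x):
        return 2.0 * np.log(1.0 + c0 * x * x) - x * x
    x = 0.0
    xs = np.empty(n_samples)
    for k in range(n_samples):
        x_new = x + step * rng.uniform(-1.0, 1.0)
        if np.log(rng.uniform()) < log_prob(x_new) - log_prob(x):
            x = x_new
        xs[k] = x
    return xs

def linear_method_on_sample(xs, c0):
    """One linear-method step for the single parameter c, from sample averages only."""
    den = 1.0 + c0 * xs**2
    e_loc = (0.5 - c0 + 2.5 * c0 * xs**2) / den   # H Psi0 / Psi0
    r1 = xs**2 / den                              # Psi1 / Psi0
    h1 = (-1.0 + 2.5 * xs**2) / den               # H Psi1 / Psi0
    s = r1.mean()
    d1 = r1 - s              # derivative made orthogonal to Psi0 on the sample
    e1 = h1 - s * e_loc      # H acting on the orthogonalized derivative
    A = np.array([[e_loc.mean(), e1.mean()],                  # E0,    g_R/2
                  [np.mean(d1 * e_loc), np.mean(d1 * e1)]])   # g_L/2, H
    B = np.array([[1.0, 0.0], [0.0, np.mean(d1 * d1)]])       # 1, S
    evals, evecs = np.linalg.eig(np.linalg.solve(B, A))
    k = int(np.argmin(evals.real))
    dp = float((evecs[1, k] / evecs[0, k]).real)
    # Undo the orthogonalization to express the update in terms of c.
    c_new = c0 + dp / (1.0 - dp * s)
    return float(evals[k].real), c_new

rng = np.random.default_rng(1)
xs = sample_psi0_squared(c0=0.3, n_samples=500, rng=rng)
E_lin, c_new = linear_method_on_sample(xs, c0=0.3)
```

In this toy setting `E_lin` and `c_new` agree with the exact values 0.5 and 0 up to rounding error, whatever the sample, because the eigenvalue equation is satisfied point by point rather than only on average.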
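Optimization method: mixing a fraction of variance
How to minimize the energy variance with the linear method?
\[ V = \min_{\Delta\mathbf{p}} \left\{ V_0 + \mathbf{g}_V^T \cdot \Delta\mathbf{p} + \frac{1}{2} \Delta\mathbf{p}^T \cdot \mathbf{h}_V \cdot \Delta\mathbf{p} \right\} \]
\[ \iff V = \min_{\Delta\mathbf{p}} \frac{ \begin{pmatrix} 1 & \Delta\mathbf{p}^T \end{pmatrix} \begin{pmatrix} V_0 & \mathbf{g}_V^T/2 \\ \mathbf{g}_V/2 & \mathbf{h}_V/2 + V_0 \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} }{ \begin{pmatrix} 1 & \Delta\mathbf{p}^T \end{pmatrix} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} } \]
\[ \iff \begin{pmatrix} V_0 & \mathbf{g}_V^T/2 \\ \mathbf{g}_V/2 & \mathbf{h}_V/2 + V_0 \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = V \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} \]
The matrix on the left-hand side is the matrix to add to the energy matrix.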
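The variance matrix above can be assembled and combined with the energy matrix in a few lines. This is a sketch under stated assumptions: the helper names are hypothetical, the numbers are random, and the weighted sum in `mixed_matrix` is only one possible convention for adding a fraction of variance, since the slides do not specify the weights. The last lines check numerically that the Rayleigh quotient built from the variance matrix stays close to the quadratic model for a small step.

```python
import numpy as np

def bordered(a00, g, block):
    """Assemble the matrix [[a00, g^T/2], [g/2, block]]."""
    n = len(g)
    A = np.empty((n + 1, n + 1))
    A[0, 0] = a00
    A[0, 1:] = g / 2.0
    A[1:, 0] = g / 2.0
    A[1:, 1:] = block
    return A

def variance_matrix(V0, gV, hV, S):
    """Left-hand-side matrix of the variance eigenvalue equation."""
    return bordered(V0, gV, hV / 2.0 + V0 * S)

def rayleigh_quotient(A, S, dp):
    """(1, dp^T) A (1, dp) divided by 1 + dp^T S dp."""
    v = np.concatenate(([1.0], dp))
    return float(v @ A @ v) / (1.0 + float(dp @ S @ dp))

def mixed_matrix(A_energy, A_variance, w):
    """Hypothetical mixing convention: (1 - w) * energy matrix + w * variance matrix."""
    return (1.0 - w) * A_energy + w * A_variance

# Synthetic 3-parameter example (random numbers, not taken from any calculation).
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
S = M @ M.T + np.eye(3)
hV = M.T @ M + np.eye(3)
gV = rng.normal(size=3)
V0 = 0.7
dp = 1e-3 * rng.normal(size=3)

A_V = variance_matrix(V0, gV, hV, S)
quadratic = float(V0 + gV @ dp + 0.5 * dp @ hV @ dp)
rational = rayleigh_quotient(A_V, S, dp)
```

Algebraically the quotient equals $V_0 + (\mathbf{g}_V^T\cdot\Delta\mathbf{p} + \frac{1}{2}\Delta\mathbf{p}^T\cdot\mathbf{h}_V\cdot\Delta\mathbf{p})/(1 + \Delta\mathbf{p}^T\cdot\mathbf{S}\cdot\Delta\mathbf{p})$, so the two expressions agree through second order in $\Delta\mathbf{p}$.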
Simultaneous optimization of all parameters
Optimization of 149 parameters = 24 (Jastrow) + 49 (CSF) + 64 (orbitals) + 12 (exponents) for the C2 molecule:
[Figure: energy (Hartree) versus iterations, from 0 to 6; inset: zoom on iterations 2 to 6, energy range -75.88 to -75.855 Hartree]
⟹ Energy converges to within 1 mHartree in a few iterations
Systematic improvement in QMC
For the C2 molecule: total energies for a series of fully optimized Jastrow-Slater wave functions:
[Figure: energy (Hartree) versus wave function, for J*SD, J*CAS(8,5), J*CAS(8,7), J*CAS(8,8) and J*RAS(8,26); curves: VMC and DMC; reference values: Exact and CCSD(T)/cc-pVQZ]
⟹ Systematic improvement in VMC and DMC!
Potential energy curve of C2 molecule ($^1\Sigma_g^+$)
Jastrow × single determinant wave function:
[Figure: energy (Hartree) versus interatomic distance (Bohr), from 1 to 10 Bohr; curves: VMC J × SD, DMC J × SD, Morse potential; annotation: size-consistency error]
⟹ Single-determinant DMC is size-consistent, but with broken spin symmetry at dissociation, $\langle\Psi_{\mathrm{DMC}}|S^2|\Psi_{\mathrm{DMC}}\rangle = 2$
Potential energy curve of C2 molecule ($^1\Sigma_g^+$)
Jastrow × multideterminant wave function:
[Figure: energy (Hartree) versus interatomic distance (Bohr), from 1 to 10 Bohr; curves: VMC J × CAS(8,8), DMC J × CAS(8,8), Morse potential]
⟹ DMC gives the dissociation energy with chemical accuracy (1 kcal/mol ≈ 0.04 eV):
$D_{\mathrm{DMC}} = 6.482(3)$ eV vs $D_{\mathrm{exact}} = 6.44(2)$ eV
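The comparison with chemical accuracy is a short piece of arithmetic. The sketch below assumes the standard conversion 1 eV ≈ 23.0605 kcal/mol and combines the two quoted uncertainties in quadrature, which treats them as independent; neither choice is stated on the slide.

```python
import math

EV_TO_KCAL_PER_MOL = 23.0605       # 1 eV expressed in kcal/mol

d_dmc, err_dmc = 6.482, 0.003      # eV, DMC value quoted above
d_exact, err_exact = 6.44, 0.02    # eV, reference value quoted above

diff_ev = d_dmc - d_exact
err_ev = math.hypot(err_dmc, err_exact)   # uncertainties combined in quadrature
diff_kcal = diff_ev * EV_TO_KCAL_PER_MOL
err_kcal = err_ev * EV_TO_KCAL_PER_MOL

print(f"difference = {diff_ev:.3f} +/- {err_ev:.3f} eV "
      f"= {diff_kcal:.2f} +/- {err_kcal:.2f} kcal/mol")
```

The difference comes out just under 1 kcal/mol, with an uncertainty dominated by the reference value rather than by the DMC statistical error.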
Dissociation energies of diatomic molecules
Jastrow × multideterminant (full valence CAS) wave functions:
[Figure: errors on dissociation energy (eV), from -1.5 to 0, for the molecules Li2, Be2, B2, C2, N2, O2, F2, Ne2; curves: MCSCF CAS, VMC J × CAS, DMC J × CAS]
⟹ Near chemical accuracy in DMC
Example of application
Binding energy of 2 NO2 to a fragment of carbon nanotube:
Estimates for the full nanotube (9,0):
B3LYP calculations: no binding
QMC calculations: weak binding (≲ 10 kcal/mol)
Lawson, Bauschlicher, Toulouse, Filippi, Umrigar, Chem. Phys. Lett. 466, 170 (2008)
2 Calculation of observables
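Calculation of an observable in VMC
Energy
• Estimator: $E_L(\mathbf{R}) = \dfrac{H(\mathbf{R})\Psi(\mathbf{R})}{\Psi(\mathbf{R})}$
• Systematic error: $\delta E = O(\delta\Psi^2)$
• Variance: $\sigma^2(E_L) = O(\delta\Psi^2)$
⟹ Quadratic Zero-Variance Zero-Bias property
Arbitrary observable $O$ (which does not commute with $H$)
• Estimator: $O_L(\mathbf{R}) = \dfrac{O(\mathbf{R})\Psi(\mathbf{R})}{\Psi(\mathbf{R})}$
• Systematic error: $\delta O = O(\delta\Psi)$
• Variance: $\sigma^2(O_L) = O(1)$
⟹ No Quadratic Zero-Variance Zero-Bias property: the error is only linear in $\delta\Psi$ and the variance does not vanish as $\delta\Psi \to 0$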
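Zero-Variance Zero-Bias estimators (Assaraf & Caffarel)
Based on the Hellmann-Feynman theorem
\[ \langle O \rangle = \left( \frac{dE_\lambda}{d\lambda} \right)_{\lambda=0} \quad\text{where}\quad E_\lambda = \langle\Psi_\lambda| H + \lambda O |\Psi_\lambda\rangle \]
one can define an improved estimator:
\[ O_{\mathrm{improved}}(\mathbf{R}) = \frac{O(\mathbf{R})\Psi(\mathbf{R})}{\Psi(\mathbf{R})} + \Delta O_{\mathrm{ZV}}(\mathbf{R}) + \Delta O_{\mathrm{ZB}}(\mathbf{R}) \]
with the ZV term:
\[ \Delta O_{\mathrm{ZV}}(\mathbf{R}) = \left[ \frac{H(\mathbf{R})\Psi'(\mathbf{R})}{\Psi'(\mathbf{R})} - E_L(\mathbf{R}) \right] \frac{\Psi'(\mathbf{R})}{\Psi(\mathbf{R})} \]
and the ZB term:
\[ \Delta O_{\mathrm{ZB}}(\mathbf{R}) = 2 \left[ E_L(\mathbf{R}) - E \right] \frac{\Psi'(\mathbf{R})}{\Psi(\mathbf{R})} \]
where $\Psi' = (d\Psi_\lambda/d\lambda)_{\lambda=0}$ is the derivative of the trial wave function with respect to $\lambda$ and $E = \langle E_L \rangle$ is the VMC energy.
Quadratic Zero-Variance Zero-Bias property
• Systematic error: $\delta O_{\mathrm{improved}} = O(\delta\Psi^2 + \delta\Psi \, \delta\Psi')$
• Variance: $\sigma^2(O_{\mathrm{improved}}) = O(\delta\Psi^2 + \delta\Psi'^2 + \delta\Psi \, \delta\Psi')$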
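The effect of the ZV and ZB terms can be seen on a toy model. The sketch below is an illustration under assumptions, not the authors' implementation: it takes a 1D harmonic oscillator, $H = -\frac{1}{2}\,d^2/dx^2 + \frac{1}{2}x^2$, the observable $O = x^2$, a hypothetical Gaussian trial function $\Psi = e^{-a x^2/2}$, and the assumed form $\Psi' = -\kappa x^2 \Psi$ (with $\kappa = 1/2$ this is the exact derivative when $a = 1$). Averages over $\Psi^2$ are computed by quadrature rather than by sampling, so the comparison is deterministic.

```python
import numpy as np

def zvzb_moments(a, kappa=0.5, n_grid=20001, x_max=10.0):
    """Mean and variance of the usual and improved estimators of <x^2>.

    Returns ((usual_mean, usual_var), (improved_mean, improved_var)),
    with averages over Psi^2 = exp(-a x^2) done on a uniform grid.
    """
    x = np.linspace(-x_max, x_max, n_grid)
    w = np.exp(-a * x**2)
    w /= w.sum()                                   # normalized weights for Psi^2

    def mean(f):
        return float(np.sum(w * f))

    def mean_var(f):
        m = mean(f)
        return m, mean((f - m) ** 2)

    e_loc = a / 2.0 + 0.5 * (1.0 - a**2) * x**2    # H Psi / Psi
    dpsi = -kappa * x**2                           # Psi' / Psi
    h_dpsi = -kappa * (-1.0 + 2.5 * a * x**2
                       + 0.5 * (1.0 - a**2) * x**4)  # H Psi' / Psi

    o_usual = x**2
    zv = h_dpsi - e_loc * dpsi                     # [H Psi'/Psi' - E_L] Psi'/Psi
    zb = 2.0 * (e_loc - mean(e_loc)) * dpsi        # 2 [E_L - E] Psi'/Psi
    return mean_var(o_usual), mean_var(o_usual + zv + zb)

exact = 0.5                                        # <x^2> in the exact ground state
(usual_mean, usual_var), (impr_mean, impr_var) = zvzb_moments(a=1.2)
(_, _), (exact_mean, exact_var) = zvzb_moments(a=1.0)
```

For the approximate exponent `a=1.2` the improved estimator has both a smaller bias and a smaller variance than the usual one, and for `a=1.0` it reduces to the constant 0.5. The ZV term is coded as `h_dpsi - e_loc * dpsi` to avoid dividing by $\Psi'$, which vanishes at $x = 0$.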
Example of improved QMC estimators
Dipole moment of CH molecule ($^2\Pi$) in VMC:
[Figure: dipole moment (Debye) versus interatomic distance (Bohr), from 1.7 to 2.5 Bohr; curves: usual estimator, improved estimator]
⟹ Reduction of statistical uncertainty!
Example of improved QMC estimators
Correlation hole of C2 molecule in VMC:
[Figure: correlation hole (Bohr$^{-1}$) versus interelectronic distance (Bohr), from 0 to 5 Bohr; curves: usual histogram estimator, improved estimator]
⟹ Reduction of statistical uncertainty!
Summary and perspectives
Summary
efficient wave function optimization method by minimization of VMC energy
near chemical accuracy with compact wave functions
improved estimators for observables in QMC
Toulouse, Umrigar, JCP 126, 084102 (2007)
Umrigar, Toulouse, Filippi, Sorella, Hennig, PRL 98, 110201 (2007)
Toulouse, Assaraf, Umrigar, JCP 126, 244112 (2007)
Toulouse, Umrigar, JCP 128, 174101 (2008)
www.lct.jussieu.fr/pagesperso/toulouse/
Perspectives
optimization by minimization of DMC energy
optimization of molecular geometry
excited states