saturday, 02 may 2015saturday, 02 may 2015saturday, 02 may 2015saturday, 02 may 2015 2005 taipei...
TRANSCRIPT
Tuesday 18 April 2023Tuesday 18 April 20232005 Taipei Summer Institute2005 Taipei Summer Institute
on Strings, Particles and Fields on Strings, Particles and Fields
Simulation Algorithms Simulation Algorithms for Lattice QCDfor Lattice QCD
A D KennedyA D KennedySchool of Physics, The University of EdinburghSchool of Physics, The University of Edinburgh
Tuesday 18 April 2023 A D Kennedy 2
Contents
Monte Carlo methodsFunctional Integrals and QFTCentral Limit TheoremImportance SamplingMarkov Chains
Convergence of Markov ChainsAutocorrelations
Hybrid Monte CarloMDMCPartial Momentum RefreshmentSymplectic IntegratorsDynamical fermions
Grassmann algebras
ReversibilityPHMCRHMC
Chiral FermionsOn-shell chiral symmetryNeuberger’s OperatorInto Five Dimensions
Kernel
Schur ComplementConstraintApproximation
tanhЗолотарев
RepresentationContinued Fraction Partial FractionCayley Transform
Chiral Symmetry BreakingNumerical StudiesConclusions
Approximation TheoryPolynomial approximation
Weierstraß’ theoremБернштейне polynomialsЧебышев’s theoremЧебышев polynomials
Чебышев approximation
Elliptic functionsLiouville’s theoremWeierstraß elliptic functionsExpansion of elliptic functionsAddition formulæJacobian elliptic functions
Modular transformationsЗолотарев’s formula
Arithmetico-Geometric meanConclusionsBibliography
Hasenbusch Acceleration
Tuesday 18 April 2023 A D Kennedy 3
( )1( )Sd e
Z
The Expectation value of an operator is defined
non-perturbatively by the Functional Integral
The action is S ()
1 1Normalisation constant Z chosen such that
xx
d d d is the appropriate functional measure
Continuum limit: lattice spacing a 0Thermodynamic limit: physical volume V
Defined in Euclidean space-timeLattice regularisation
Functional Integrals and QFT
Tuesday 18 April 2023 A D Kennedy 4
Monte Carlo methods: I
Monte Carlo integration is based on the identification of probabilities with measures
There are much better methods of carrying out low dimensional quadrature
All other methods become hopelessly expensive for large dimensions
In lattice QFT there is one integration per degree of freedom
We are approximating an infinite dimensional functional integral
Tuesday 18 April 2023 A D Kennedy 5
Measure the value of on each configuration and compute the average
1
1( )
N
ttN
Monte Carlo methods: II
Generate a sequence of random field configurations chosen from the probability distribution
1 2 t N, , , , ,
( )1( ) tS
t tP d eZ
Tuesday 18 April 2023 A D Kennedy 6
Central Limit Theorem: I
Distribution of values for a single sample = ()
( )P d P
Central Limit Theorem
where the variance of the distribution of is
2CNO
22C
Law of Large Numbers limN
The Laplace–DeMoivre Central Limit theorem is an asymptotic expansion for the probability distribution of
Tuesday 18 April 2023 A D Kennedy 7
3
0 3
4 21 4 2
2
2
0
3
C C
C C C
C
The first few cumulants are
Central Limit Theorem: II
Note that this is an asymptotic expansion
Generating function for connected moments
( ) ln ( ) ikW k d P e ln lnik ikd P e e
0 !
n
nn
ikC
n
Tuesday 18 April 2023 A D Kennedy 8
Central Limit Theorem: III
Connected generating function
lnN
ik Nd P e
1 11
ln expN
N N tt
ikd d P P
N
( ) ln ( ) ikW k d P e
1
1 !
n
nn
n
ik Cn N
ln ik NN e
kNW
N
1 11
1 N
N N tt
P d d P PN
Distribution of the average of N samples
Tuesday 18 April 2023 A D Kennedy 9
3 4
3 4212 3 3 4
223! 4!2
N
C Cd dik ik CN N dk ikd de e e
Take inverse Fourier transform to obtain distribution P
12
W k ikP dk e e
23 4
23 4 22 3 3 43! 4!
2 2
C Cd d C N
N Nd d
C N
ee
Central Limit Theorem: IV
Tuesday 18 April 2023 A D Kennedy 10
Central Limit Theorem: V
where and
N
22
2 23 2
32 2
31
6 2
CC C eF
C N C
dP F
d
Re-scale to show convergence to Gaussian distribution
Tuesday 18 April 2023 A D Kennedy 11
Importance Sampling: I
( ) 1N dx x
Sample from distributionProbability
Normalisation
0 ( ) . .x a e
( )dx f x Integral
( )( )
( )f x
I x dxx
Estimator of integral
Estimator of variance2 2
2( ) ( )( )
( ) ( )f x f x
V x dx I dx Ix x
Tuesday 18 April 2023 A D Kennedy 12
Importance Sampling: II
2
2
( ) ( )0
( ) ( )V N f y
y y
Minimise varianceConstraint N=1Lagrange multiplier
( )( )
( )opt
f xx
dx f x
Optimal measure
Optimal variance 2 2
opt ( ) ( )V dx f x dx f x
Tuesday 18 April 2023 A D Kennedy 13
( ) sin( )f x x Example:
12( ) sin( )x x Optimal weight
2 22 2
opt0 0
sin( ) sin( ) 16V dx x dx x
Optimal variance
2
0sindx x
Importance Sampling: III
Tuesday 18 April 2023 A D Kennedy 14
Importance Sampling: IV2
0sindx x
1 Construct cheap approximation to |sin(x)|
2 Calculate relative area within each rectangle
5 Choose another random number uniformly
4 Select corresponding rectangle
3 Choose a random number uniformly
6 Select corresponding x value within rectangle
7 Compute |sin(x)|
Tuesday 18 April 2023 A D Kennedy 15
Importance Sampling: V
2 2
0 0
sin( ) sin( ) sin( ) sin( )I dx x x dx x x
But we can do better!
opt 0V For which
With 100 rectangles we have V = 16.02328561
With 100 rectangles we have V = 0.011642808
2
0sindx x
Tuesday 18 April 2023 A D Kennedy 16
Markov Chains: I
State space (Ergodic) stochastic transitions P’:
Distribution converges to unique fixed point
Q
Deterministic evolution of probability distribution P: Q Q
Tuesday 18 April 2023 A D Kennedy 17
The sequence Q, PQ, P²Q, P³Q,… is Cauchy
Define a metric on the space of (equivalence classes of) probability distributions
1 2 1 2,d Q Q dx Q x Q x
Prove that with > 0, so the Markov process P is a contraction mapping
1 2 1 2, 1 ,d PQ PQ d Q Q
The space of probability distributions is complete, so the sequence converges to a unique fixed pointlim n
nQ P Q
Convergence of Markov Chains: I
Tuesday 18 April 2023 A D Kennedy 18
1 2 1 2,d PQ PQ dx PQ x PQ x 1 2dx dy P x y Q y dy P x y Q y
dx dy P x y Q y Q y Q y
dx dy P x y Q y
2 mindx dy P x y Q y Q y
dx dy P x y Q y
2min ,a b a b a b
1 2Q y Q y Q y
1y y
Convergence of Markov Chains: II
Tuesday 18 April 2023 A D Kennedy 19
2 mindy Q y dx dy P x y Q y Q y
2 inf min
ydy Q y dx P x y dy Q y Q y
infy
dy Q y dx P x y dy Q y 1 2(1 ) ,d Q Q
1 2 1 1 0dy Q y dy Q y dy Q y dy Q y Q y dy Q y Q y
1dx P x y
12dy Q y Q y dy Q y
0 infy
dx P x y
1 2,d PQ PQ
Convergence of Markov Chains: III
Tuesday 18 April 2023 A D Kennedy 20
Markov Chains: II
Use Markov chains to sample from QSuppose we can construct an ergodic Markov process P which has distribution Q as its fixed point
Start with an arbitrary state (“field configuration”)
Iterate the Markov process until it has converged (“thermalized”)
Thereafter, successive configurations will be distributed according to Q
But in general they will be correlated
To construct P we only need relative probabilities of states
Do not know the normalisation of Q
Cannot use Markov chains to compute integrals directly
We can compute ratios of integrals
Tuesday 18 April 2023 A D Kennedy 21
Metropolis algorithm
min 1,Q x
P x yQ y
Markov Chains: III
How do we construct a Markov process with a specified fixed point ?
Q x dy P x y Q y Integrate w.r.t. y to obtain fixed point conditionSufficient but not necessary for fixed point
P y x Q x P x y Q y Detailed balance
Q xP x y
Q x Q y
Other choices are possible, e.g.,
Consider cases and separately to obtain detailed balance condition
( ) ( )Q x Q y ( ) ( )Q x Q y
Sufficient but not necessary for detailed balance
Tuesday 18 April 2023 A D Kennedy 22
Markov Chains: IV
Composition of Markov stepsLet P1 and P2 be two Markov steps which have the desired
fixed point distribution
They need not be ergodic
Then the composition of the two steps P2P1 will also have
the desired fixed point
And it may be ergodic
This trivially generalises to any (fixed) number of steps
For the case where P1 is not ergodic but (P1 ) n is the
terminology weakly and strongly ergodic are sometimes used
Tuesday 18 April 2023 A D Kennedy 23
Markov Chains: V
This result justifies “sweeping” through a lattice performing single site updates
Each individual single site update has the desired fixed point because it satisfies detailed balance
The entire sweep therefore has the desired fixed point, and is ergodic
But the entire sweep does not satisfy detailed balance
Of course it would satisfy detailed balance if the sites were updated in a random order
But this is not necessaryAnd it is undesirable because it puts too much randomness into the system
Tuesday 18 April 2023 A D Kennedy 24
Markov Chains: VI
Coupling from the PastPropp and Wilson (1996)Use fixed set of random numbersFlypaper principle: If states coalesce they stay together forever
– Eventually, all states coalesce to some state with probability one
– Any state from t = - will coalesce to – is a sample from the fixed point distribution
Tuesday 18 April 2023 A D Kennedy 25
Autocorrelations: I
This corresponds to the exponential autocorrelation time
expmax
10
lnN
Exponential autocorrelationsThe unique fixed point of an ergodic Markov process
corresponds to the unique eigenvector with eigenvalue
1
All its other eigenvalues must lie within the unit circlemax 1 In particular, the largest subleading eigenvalue is
Tuesday 18 April 2023 A D Kennedy 26
Autocorrelations: II
2
21 1
1 N N
t tt tN
1
2
21 1 1
12
N N N
t t tt t t tN
Integrated autocorrelationsConsider the autocorrelation of some operator Ω
Without loss of generality we may assume 0
12 2 2
1
1 2 N
N CN N
2
t tC
Define the autocorrelation function
Tuesday 18 April 2023 A D Kennedy 27
Autocorrelations: III
The autocorrelation function must fall faster that the
exponential autocorrelation exp
maxNC e
2
2 exp1 2 1N
A ON N
2
exp
1
1 2 1N
C ON N
For a sufficiently large number of samples
1
A C
Define the integrated autocorrelation function
Tuesday 18 April 2023 A D Kennedy 28
In order to carry out Monte Carlo computations including the effects of dynamical fermions we would like to find an algorithm which
Update the fields globally
Because single link updates are not cheap if the action is not local
Take large steps through configuration space
Because small-step methods carry out a random walk which leads to critical slowing down with a dynamical critical exponent z=2
Hybrid Monte Carlo: I
Does not introduce any systematic errors
z relates the autocorrelation to the correlation length of the system,
zA
Tuesday 18 April 2023 A D Kennedy 29
A useful class of algorithms with these properties is the (Generalised) Hybrid Monte Carlo (HMC) method
Introduce a “fictitious” momentum p corresponding to each dynamical degree of freedom q
Find a Markov chain with fixed point exp[-H(q,p) ] where H is the “fictitious” Hamiltonian ½p2 + S(q)
The action S of the underlying QFT plays the rôle of the potential in the “fictitious” classical mechanical system
This gives the evolution of the system in a fifth dimension, “fictitious” or computer time
This generates the desired distribution exp[-S(q) ] if we ignore the momenta p (i.e., the marginal distribution)
Hybrid Monte Carlo: II
Tuesday 18 April 2023 A D Kennedy 30
The HMC Markov chain alternates two Markov steps
Molecular Dynamics Monte Carlo (MDMC)(Partial) Momentum Refreshment
Both have the desired fixed pointTogether they are ergodic
Hybrid Monte Carlo: IIIHybrid Monte Carlo: III
Tuesday 18 April 2023 A D Kennedy 31
MDMC: I
If we could integrate Hamilton’s equations exactly we could follow a trajectory of constant fictitious energy
This corresponds to a set of equiprobable fictitious phase space configurationsLiouville’s theorem tells us that this also preserves the functional integral measure dp dq as required
Any approximate integration scheme which is reversible and area preserving may be used to suggest configurations to a Metropolis accept/reject test
With acceptance probability min[1,exp(-H)]
Tuesday 18 April 2023 A D Kennedy 32
MDMC: II
Molecular Dynamics (MD), an approximate integrator which is exactly : , ,U q p q p
We build the MDMC step out of three parts
A Metropolis accept/reject step
Area preserving, *
,det det 1
,
q pU
q p
Reversible, 1F U F U
A momentum flip :F p p
1H Hq qF U e y y e
p p
The composition of these gives
with y being a uniformly distributed random number in [0,1]
Tuesday 18 April 2023 A D Kennedy 33
The Gaussian distribution of p is invariant under F
The extra momentum flip F ensures that for small the momenta are reversed after a rejection rather than after an acceptance
For = / 2 all momentum flips are irrelevant
This mixes the Gaussian distributed momenta p with Gaussian noise
cos sin
sin cos
p pF
Partial Momentum Refreshment
Tuesday 18 April 2023 A D Kennedy 34
Special casesThe usual Hybrid Monte Carlo (HMC) algorithm is the special case where = / 2
= 0 corresponds to an exact version of the Molecular Dynamics (MD) or Microcanonical algorithm (which is in general non-ergodic)
the L2MC algorithm of Horowitz corresponds to choosing arbitrary but MDMC trajectories of a single leapfrog integration step ( = ). This method is also called Kramers algorithm.
Hybrid Monte Carlo: IV
Tuesday 18 April 2023 A D Kennedy 35
Hybrid Monte Carlo: V
Further special casesThe Langevin Monte Carlo algorithm corresponds to choosing = / 2 and MDMC trajectories of a single leapfrog integration step ( = ).
The Hybrid and Langevin algorithms are approximations where the Metropolis step is omitted
The Local Hybrid Monte Carlo (LHMC) or Overrelaxation algorithm corresponds to updating a subset of the degrees of freedom (typically those living on one site or link) at a time
Hybrid Monte Carlo: V
Tuesday 18 April 2023 A D Kennedy 36
If A and B belong to any (non-commutative)
algebra then , where constructed from commutators of A and B (i.e.,
is in the Free Lie Algebra generated by {A,B })
A B A Be e e
Symplectic Integrators: I
1 2
1 2
1 2
221
1 2, , 10
1, , , ,
1 2 ! m
m
m
nm
n n k kk km
k k n
Bc c A B c c A B
n m
More precisely, where and
1
ln A Bn
n
e e c
1c A B
Baker-Campbell-Hausdorff (BCH) formula
Tuesday 18 April 2023 A D Kennedy 37
Symplectic Integrators: IIExplicitly, the first few terms are
1 1 12 12 24
1720
ln , , , , , , , ,
, , , , 4 , , , ,
6 , , , , 4 , , , ,
2 , , , , , , , ,
A Be e A B A B A A B B A B B A A B
A A A A B B A A A B
A B A A B B B A A B
A B B A B B B B A B
In order to construct reversible integrators we use symmetric symplectic integrators
2 2 124
15760
ln , , 2 , ,
7 , , , , 28 , , , ,
12 , , , , 32 , , , ,
16 , , , , 8 , , , ,
A B Ae e e A B A A B B A B
A A A A B B A A A B
A B A A B B B A A B
A B B A B B B B A B
The following identity follows directly from the BCH formula
Tuesday 18 April 2023 A D Kennedy 38
Symplectic Integrators: III
exp expd dp dqdt dt p dt q
212,H q p T p S q p S q
We are interested in finding the classical trajectory in phase space of a system described by the Hamiltonian
The basic idea of such a symplectic integrator is to write the time evolution operator as
ˆexp HH He
q p p q
exp S q T pp q
Tuesday 18 April 2023 A D Kennedy 39
Symplectic Integrators: IV
Define and so that̂H P Q P S qp
Q T pq
Since the kinetic energy T is a function only of p and the potential energy S is a function only of q, it follows that the action of and may be evaluated triviallyPe Qe
: , ,
: , ,
Q
P
e f q p f q T p p
e f q p f q p S q
Tuesday 18 April 2023 A D Kennedy 40
Symplectic Integrators: V
From the BCH formula we find that the PQP symmetric symplectic integrator is given by
1 12 2
//
0( ) P PQU e e e
3 5124exp , , 2 , ,P Q P P Q Q P Q O
2 4124exp , , 2 , ,P Q P P Q Q P Q O
ˆ 2P QHe e O
In addition to conserving energy to O (² ) such symmetric symplectic integrators are manifestly area preserving and reversible
Tuesday 18 April 2023 A D Kennedy 41
Symplectic Integrators: VI
For each symplectic integrator there exists a Hamiltonian H’ which is exactly conserved
For the PQP integrator we haveˆ H HH
p q q p
2124
4 615760
, , 2 , ,
32 , , , , 16 , , , ,
28 , , , , 12 , , , ,
8 , , , , 7 , , , ,
Q P P P Q Q P Q
Q Q P P Q P Q Q P Q
Q P P P Q P Q P P Q O
Q Q Q P Q P P P P Q
Tuesday 18 April 2023 A D Kennedy 42
Symplectic Integrators: VII
Substituting in the exact forms for the operations P and Q we obtain the vector field
ˆ H HH
p q q p
2 2112 2p S pS SS p S
q p q p
43 2
544 61
7204 2
2
4 6 3
30 6
3 2
p S S SS pq
p S OS S SS p
pS S SS
Tuesday 18 April 2023 A D Kennedy 43
Symplectic Integrators: VIII
Solving these differential equations for H’ we find
2 2 2124
44 2 2 2 4 61720
2
6 2 3
H H p S S
p S p SS S S S O
Note that H’ cannot be written as the sum of a p-dependent kinetic term and a q-dependent potential term
Tuesday 18 April 2023 A D Kennedy 44
Fermion fields are Grassmann valued
Required to satisfy the spin-statistics theorem
Even “classical” Fermion fields obey anticommutation relationsGrassmann algebras behave like negative dimensional manifolds
Dynamical fermions: I
Tuesday 18 April 2023 A D Kennedy 45
Grassmann algebras: I
Grassmann algebrasLinear space spanned by generators {1,2,3,…} with
coefficients a, b,… in some field (usually the real or complex numbers)
Algebra structure defined by nilpotency condition for elements of the linear space ² = 0
There are many elements of the algebra which are not in the linear space (e.g., 12)
Nilpotency implies anticommutativity + = 00 = ² = ² = ( + )² = ² + + + ² = + = 0
Anticommutativity implies nilpotency, 2² = 0 Unless the coefficient field has characteristic 2, i.e., 2 = 0
Tuesday 18 April 2023 A D Kennedy 46
A natural antiderivation is defined on a Grassmann algebra
Linearity: d(a + b) = a d + b d
Anti-Leibniz rule: d() = (d) + P()(d)
Grassmann algebras have a natural grading corresponding to the number of generators in a given product
deg(1) = 0, deg(i ) = 1, deg(ij ) =
2, ...
All elements of the algebra can be decomposed into a sum of terms of definite grading
Grassmann algebras: II
The parity transform of is
deg1P
Tuesday 18 April 2023 A D Kennedy 47
Grassmann algebras: III
Gaussians over Grassmann manifoldsLike all functions, Gaussians are polynomials over Grassmann algebras
1i ij j
ij i ij j
aa
i ij jij ij
e e a
There is no reason for this function to be positive even for real coefficients
Definite integration on a Grassmann algebra is defined to be the same as derivation
Hence change of variables leads to the inverse Jacobian
1 1 det in n
j
d d d d
Tuesday 18 April 2023 A D Kennedy 48
Grassmann algebras: IV
Where Pf(a) is the Pfaffian
Pf(a)² = det(a)
11 deti ij j
ij
a
nn ijd d d d e a
If we separate the Grassmann variables into two “conjugate” sets then we find the more familiar result
Despite the notation, this is a purely algebraic identityIt does not require the matrix a > 0, unlike its bosonic analogue
Gaussian integrals over Grassmann manifolds
1 Pfi ij j
ija
n ijd d e a
Tuesday 18 April 2023 A D Kennedy 49
Dynamical fermions: II
PseudofermionsDirect simulation of Grassmann fields is not feasible
The problem is not that of manipulating anticommuting values in a computer
We therefore integrate out the fermion fields to obtain the fermion determinant detMd d e M
It is that is not positive, and thus we get poorimportance sampling
FS Me e
and always occur quadratically
The overall sign of the exponent is unimportant
Tuesday 18 April 2023 A D Kennedy 50
Dynamical fermions: III
0
, , Me
Any operator can be expressed solely in terms of the bosonic fields
1( , ) ( , )G x y x y M x y
E.g., the fermion propagator is
Tuesday 18 April 2023 A D Kennedy 51
Dynamical fermions: IV
Represent the fermion determinant as a bosonic Gaussian integral with a non-local kernel 1
det MM d d e
The fermion kernel must be positive definite (all its eigenvalues must have positive real parts) otherwise the bosonic integral will not convergeThe new bosonic fields are called pseudofermions
The determinant is extensive in the lattice volume, thus again we get poor importance sampling
Including the determinant as part of the observable to
be measured is not feasible
det
detB
B
S
S
M
M
Tuesday 18 April 2023 A D Kennedy 52
Dynamical fermions: V
It is usually convenient to introduce two flavours of fermion and to write 1†2 †det det
M MM M M d d e
The evaluation of the pseudofermion action and the corresponding force then requires us to find the solution of a (large) set of linear equations 1†M M
This not only guarantees positivity, but also allows us to generate the pseudofermions from a global heatbath by applying to a random Gaussian distributed field
†M
The equations for motion for the boson (gauge) fields are
†1 1 1† † † † †B BS S
M M M M M M M M
Tuesday 18 April 2023 A D Kennedy 53
Dynamical fermions: VI
It is not necessary to carry out the inversions required for the equations of motion exactly
There is a trade-off between the cost of computing the force and the acceptance rate of the Metropolis MDMC step
The inversions required to compute the pseudofermion action for the accept/reject step does need to be computed exactly, however
We usually take “exactly” to by synonymous with “to machine precision”
Tuesday 18 April 2023 A D Kennedy 54
Reversibility: I
Are HMC trajectories reversible and area preserving in practice?
The only fundamental source of irreversibility is the rounding error caused by using finite precision floating point arithmetic
For fermionic systems we can also introduce irreversibility by choosing the starting vector for the iterative linear equation solver time-asymmetrically
We do this if we to use a Chronological Inverter, which takes (some extrapolation of) the previous solution as the starting vector
Floating point arithmetic is not associativeIt is more natural to store compact variables as scaled integers (fixed point)
Saves memoryDoes not solve the precision problem
Tuesday 18 April 2023 A D Kennedy 55
Reversibility: II
Data for SU(3) gauge theory and QCD with heavy quarks show that rounding errors are amplified exponentially
The underlying continuous time equations of motion are chaotic
Ляпунов exponents characterise the divergence of nearby trajectories
The instability in the integrator occurs when H » 1
Zero acceptance rate anyhow
Tuesday 18 April 2023 A D Kennedy 56
Reversibility: III
In QCD the Ляпунов exponents appear to scale with as the system approaches the continuum limit
= constant
This can be interpreted as saying that the Ляпунов exponent characterises the chaotic nature of the continuum classical equations of motion, and is not a lattice artefact
Therefore we should not have to worry about reversibility breaking down as we approach the continuum limit
Caveat: data is only for small lattices, and is not conclusive
Tuesday 18 April 2023 A D Kennedy 57
Data for QCD with lighter dynamical quarks
Instability occurs close to region in where acceptance rate is near one
May be explained as a few “modes” becoming unstable because of large fermionic force
Integrator goes unstable if too poor an approximation to the fermionic force is used
Reversibility: IV
Tuesday 18 April 2023 A D Kennedy 58
PHMC
Polynomial Hybrid Monte Carlo algorithm: instead of using Чебышев polynomials in the multiboson algorithm, Frezzotti & Jansen and deForcrand suggested using them directly in HMC
Polynomial approximation to 1/x are typically of order 40 to 100 at presentNumerical stability problems with high order polynomial evaluationPolynomials must be factoredCorrect ordering of the roots is importantFrezzotti & Jansen claim there are advantages from using reweighting
Tuesday 18 April 2023 A D Kennedy 59
RHMC
Rational Hybrid Monte Carlo algorithm: the idea is similar to PHMC, but uses rational approximations instead of polynomial ones
Much lower orders required for a given accuracy
Evaluation simplified by using partial fraction expansion and multiple mass linear equation solvers
1/x is already a rational function, so RHMC reduces to HMC in this case
Can be made exact using noisy accept/reject step
Tuesday 18 April 2023 A D Kennedy 60
Chiral Fermions
Conventions
We work in Euclidean spaceWe work in Euclidean space
γγ matrices are Hermitianmatrices are Hermitian
We writeWe write
We assume all Dirac We assume all Dirac
operators are operators are γγ55 HermitianHermitian
†5 5D D
D D
Tuesday 18 April 2023 A D Kennedy 61
It is possible to have chiral symmetry on the lattice without doublers if we only insist that the symmetry holds on shell
On-shell chiral symmetry: I
Such a transformation should be of the form
(Lüscher)
is an independent field from
has the same Spin(4) transformation properties as
does not have the same chiral transformation
properties
as in Euclidean space (even in the continuum)
y†y
y
551 1;i aD i aDe e
y
†y
†y†y
Tuesday 18 April 2023 A D Kennedy 62
On-shell chiral symmetry: II
For it to be a symmetry the Dirac
operator must be invariant 5 51 1i aD i aDD e De D
5 51 1 0aD D D aD For an infinitesimal transformation this implies that
5 5 52D D aD D
Which is the Ginsparg-Wilson relation
Tuesday 18 April 2023 A D Kennedy 63
5 5 5†sgnˆ W
w
W W
D MD M
D M D M
Both of these conditions are satisfied if we define
(Neuberger)
Neuberger’s Operator: I
We can find a solution of the Ginsparg-Wilson relation as follows
† †15 5 5 5 5 52 1 ;ˆ ˆ ˆaD aD aD
Let the lattice Dirac operator to be of the form
This satisfies the GW relation iff 25 1ˆ
It must also have the correct continuum limit
Where we have defined where W WD Z O a 2WZM aZ
2 25 5 52 1 1ˆ WD
D Z aZ O a O aM
Tuesday 18 April 2023 A D Kennedy 64
Into Five Dimensions
H Neuberger hep-lat/9806025A Boriçi hep-lat/9909057, hep-lat/9912040, hep-lat/0402035A Boriçi, A D Kennedy, B Pendleton, U Wenger hep-lat/0110070R Edwards & U Heller hep-lat/0005002趙 挺 偉 (T-W Chiu) hep-lat/0209153, hep-lat/0211032, hep-lat/0303008R C Brower, H Neff, K Orginos hep-lat/0409118 Hernandez, Jansen, Lüscher hep-lat/9808010
Tuesday 18 April 2023 A D Kennedy 65
Is DN local?It is not ultralocal (Hernandez, Jansen, Lüscher)
It is local iff DW has a gap
DW has a gap if the gauge fields are smooth enough
It seems reasonable that good approximations to DN
will be local if DN is local and vice versa
Otherwise DWF with n5 → ∞ may not be local
Neuberger’s Operator: II
1, 1 1 sgn
52D H H
N 10 μ
Tuesday 18 April 2023 A D Kennedy 66
Four dimensional space of algorithms
Neuberger’s Operator: III
Representation (CF, PF, CT=DWF)
Constraint (5D, 4D)
,
( )sgn( ) ( )
( )n
n mm
P HH H
Q HApproximation
5 WH D MKernel
Tuesday 18 April 2023 A D Kennedy 67
Kernel
Shamir kernel
55
5
;2
WT T T
W
a D MH D aD
a D M
Möbius kernel
5 5
55 5
;2
WM M M
W
b c D MH D aD
b c D M
5W WH D M
Wilson (Boriçi) kernel
Tuesday 18 April 2023 A D Kennedy 68
1 0 0 111 11 0 0 1
A A B
CA D CA B
1 0 0 1
1 0 0 1
1 0
1 11 0
A B
CA D CA B
Schur Complement
It may be block diagonalised by an LDU factorisation (Gaussian elimination)
1det det
A BAD ACA B
C DIn particular
A B
C D
Consider the block matrixEquivalently a matrix over a skew field = division ring
The bottom right block is the Schur complement
Tuesday 18 April 2023 A D Kennedy 69
1
2
2
1
n
n
D
00
00
1
2
52
1
n
n
D
Constraint: I
1
2
2
1
n
n
DU
So, what can we do with the Neuberger operator represented as a Schur complement?Consider the five-dimensional system of linear
equations00
00
1L 1LL
The bottom four-dimensional component is, Nn n DD
Tuesday 18 April 2023 A D Kennedy 70
Constraint: II
Alternatively, introduce a five-dimensional pseudofermion field 1 12 n
Then the pseudofermion functional integral is
† 15†
5 ,1
det det det detn
Dj j
j
d d e D LDU D D
So we also introduce n-1 Pauli-Villars fields
†,
11 1
†,
1 1
detj j jj
n nD
j j j jj j
d d e D
and we are left with just det Dn,n = det DN
Tuesday 18 April 2023 A D Kennedy 71
Approximation: tanh
Pandey, Kenney, & Laub; Higham; NeubergerFor even n (analogous formulæ for odd n)
11 1
1,11
1tanh tanh
1
nxx
n n nxx
x n x
2
2 2
2
2 2
1
tan
1
12tan
1
n
n
kx
nk
kx
nk
xn
2
2 2 21 12 2cos sin1
2 1n
x k kk n n
xn
ωj
Tuesday 18 April 2023 A D Kennedy 72
Approximation: Золотарев
2
2/ 2
21
212
sn ;1
sn / ; sn 2 / ;1sn ; sn ;
1sn 2 / ;
n
m
z k
z M iKm n k
z k M z k
iK m n k
sn(z/M,λ)
sn(z,k)
ωj
Tuesday 18 April 2023 A D Kennedy 73
Approximation: Errors
The fermion sgn problem
Approximation over 10-2 < |x| < 1Rational functions of degree (7,8)
ε(x) – sgn(x)
log10 x
0.01
0.005
-0.01
-0.005
-2 1.5 -1 -0.5 0.50
Золотарев tanh(8 tanh-1x)
Tuesday 18 April 2023 A D Kennedy 74
Representation: Continued Fraction I
0
1
2
3
1 0 0
1 1 0
0 1 1
0 0 1
A
A
A
A
Consider a five-dimensional matrix of the form
10
01 10 11
1 121 2
1 32
1 0 0 0 1 0 00 0 01 0 0 0 1 00 0 0
0 1 0 0 0 0 0 0 10 0 0
0 0 1 0 0 0 1
SSS SS
S S SS
S
Compute its LDU decomposition
0 01
1; n n
n
S A S AS
where
3 3 3 32
2 21
10
1 1 11 1
1
S A A AS A A
S AA
then the Schur complement of the matrix is the
continued fraction
Tuesday 18 April 2023 A D Kennedy 75
Representation: Continued Fraction II
We may use this representation to linearise our rational approximations to the sgn function
1
1 21
1
1 0
1
22
, ( )
n
n
n
n
n
n
cc
H HH
HH
cc
c cc
c
0
01
0 0
0 02
01
21
21 1
22 1 2
21 2 1
0
1
10
c c cn n n
c c cn n n
c c c
c c c c
Hn
Hn
H
H
Hc
as the Schur complement of the five-dimensional matrix
Tuesday 18 April 2023 A D Kennedy 76
Representation: Partial Fraction I
2
22
1
1
1
2
2
0 0
0 0 0
0 0
0 0 0
0 0
1 1
1 1
1
1
11
x
p
p x
x
p
q
p
x
q
R
Consider a five-dimensional matrix of
the form (Neuberger & Narayanan)
Tuesday 18 April 2023 A D Kennedy 77
Compute its LDU decomposition
So its Schur complement is
12 2
1
22 2
2
p x
x q
p xR
x q
2
2 2 2
2 2
2
1
1 1 1
2
1 2
2
1
1 0 0 0 0
1 0 0 0
0 0 1 0 0
0 0 1 0
1
p
x
p q x q
x x q x q
p
xp q x q
x x q x q
12 2
1 1
2
1
1
22 2
2 2
2
2
2
2 2
2
2 2
1
(
( )
)
0 0 0 0
0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0
x
p
p x q
xq
p x
x q
x
p
p x q
R
xqp x
x q
1
2
1
1 1
2 2
1 1
2
2 2
2 2
2 2
1 0 0
0 1 0 0
0 0 1
0 0 0 1
0 0 0 0 1
p
p
x
x
p
xq x q
x q x qp
xq x q
x q x q
1 11
11 01
1
1 12
21 02
1 0
0 0
0 0
0 0
0 0
0
x
pp x
q
x
R
x
pp
q
Representation: Partial Fraction II
Tuesday 18 April 2023 A D Kennedy 78
Representation: Partial Fraction III
This allows us to represent the partial
fraction expansion of our rational function as
the Schur complement of a five-dimensional
linear system
1, 2 2
1
( )n
jn n
j j
pH H
H q
Tuesday 18 April 2023 A D Kennedy 79
1
2 1
3 2 1
4 3 2 1
1 0 0 0 1 0 0
0 1 0 0 0 1 0
0 0 1 0 0 0 1
0 0 0 0 0 0 1
A
A A
A A A
C A A A A
1
2 1
3 2 1
4 3 2 1
1 0 0
0 1 0
0 0 1
0 0 0
A
A A
A A A
C A A A A
2
3
4
1 0 0 0
1 0 0
0 1 0
0 0 1
A
A
A
Representation: Cayley Transform I
1
2
3
4
1 0 0
1 0 0
0 1 0
0 0
A
A
A
A C
Consider a five-dimensional matrix of the
formCompute its LDU decomposition
CT 1 2 1n nS C A A A A So its Schur complement is
Neither L nor U depend on C
1CT 2
1 112 51
11 1
1P P T
TS
T
If where , and , then
1 nT T T 1s sA T
1C P P
Tuesday 18 April 2023 A D Kennedy 80
Representation: Cayley Transform II
The Neuberger operator is
CTN
CT
,1
P P SD H
S
1 1
( ) ;1 1
T x xx T x
T x x
T(x) is the Euclidean Cayley transform of, ( ) sgn( )n m x x
1
x x T xT x
0 0 0 1T jj
j
xT x
x
For an odd function we have
In Minkowski space a Cayley transform maps between Hermitian (Hamiltonian) and unitary (transfer) matrices
Tuesday 18 April 2023 A D Kennedy 81
(1) (1) (1)
(2) (2) (2)
(3) (3) (3)
(4) (4) (4)
00
00
D D P D PD P D D P
D P D D PD P D P D
The Neuberger operator with a general Möbius kernel is related to the Schur complement of
D5 (μ) (1) (1) (1) (1)
(2) (2) (2) (2)
(3) (3) (3) (3)
(4) (4) (4) (4) (4)
0 0 0 00 0 0 0
0 0 0 00 0 0
D D D DD D D D
P PD D D D
D D D D D
Representation: Cayley Transform III
with and( ) ( )
5
( ) ( )5
( ) 1
( ) 1
s ss W
s ss W
D b D M
D c D M
152 1P
( ) ( ) ( ) ( )5 55 5 5 5 5 5;s s s s
s
b cb c b c b c
P-μP-
P+ μP+
Tuesday 18 April 2023 A D Kennedy 82
(1) (1) (1) (1)
(2) (2) (2) (2)
5 (3) (3) (3) (3)
(4) (4) (4) (4)
0 0 0 0
0 0 0 0( )
0 0 0 0
0 0 0 0
D D D D
D D D DD P P
D D D D
D D D D
P
1 0 0 0 0 0 0 1
0 1 0 0 1 0 1 0
0 0 1 0 0 1 0 0
0 0 0 1 0 0 1 0
P P
P
Cyclically shift the columns of the right-handed part where
5 5D D P
Representation: Cayley Transform IV
P+P- μP+μP-
Tuesday 18 April 2023 A D Kennedy 83
( ) ( ) ( )s s sQ D P D P
Representation: Cayley Transform V
1 0 0 0
0 1 0 0
0 0 1 0
10 0 0 P P
C
(1)
(2)
(3)
(4)
0 0 0
0 0 0
0 0 0
0 0 0
Q
Q
Q
Q
Q1( ) ( )s s s M
ss M
HT Q Q
H
With some simple rescaling
11
12
13
14
1 0 0
1 0 01
0 1 05
10 0
( )
T
T
T
T P P
D
Q PCx
The domain wall operator reduces to the form introduced before
Tuesday 18 April 2023 A D Kennedy 84
Representation: Cayley Transform VI
It therefore appears to have exact off-shell chiral symmetryBut this violates the Nielsen-Ninomiya theorem
q.v., Pelissetto for non-local versionRenormalisation induces unwanted ghost doublers, so we cannot use DDW for dynamical (“internal”) propagatorsWe must use DN in the quantum action instead
We can us DDW for valence (“external”) propagators, and thus use off-shell (continuum) chiral symmetry to manipulate matrix elements
5 5( ) (1)D D We solve the equation
Note that satisfies1 1DW ND D a
1 1 1 15 5 5 5 5, , 2 , 2 0DW N N N ND D a D D D a
Tuesday 18 April 2023 A D Kennedy 85
Chiral Symmetry Breaking
Ginsparg-Wilson defect5 5 5 52 LD D aD D Using the approximate Neuberger operator 1
52 1aD H
L measures chiral symmetry breaking 212 1La H
The quantity is essentially the usual
domain wall residual mass (Brower et al.)
†
res †
tr
trLG G
mG G
mres is just one moment of L
G is the quark propagator
Tuesday 18 April 2023 A D Kennedy 86
Numerical Studies
Used 15 configurations from the RBRC dynamical DWF dataset
316 32 12 2
0.8 1.8 0.02
V n L ns fM
Matched π mass for Wilson and Möbius kernelsAll operators are even-odd preconditionedDid not project eigenvectors of HW
Tuesday 18 April 2023 A D Kennedy 90
mres per Configuration
mres is not sensitive to this small eigenvalue
But mres is sensitive to this one
ε
Tuesday 18 April 2023 A D Kennedy 91
Cost versus mres
Tuesday 18 April 2023 A D Kennedy 92
Conclusions: I
Relatively goodZolotarev Continued FractionRescaled Shamir DWF via Möbius (tanh)
Relatively poor (so far…)Standard Shamir DWFZolotarev DWF (Chiu)
Can its condition number be improved?
Still to doProjection of small eigenvalues
HMC5 dimensional versus 4 dimensional dynamicsHasenbusch acceleration
5 dimensional multishift?Possible advantage of 4 dimensional nested Krylov solvers
Tunnelling between different topological sectorsAlgorithmic or physical problem (at μ=0)Reflection/refraction
Tuesday 18 April 2023 A D Kennedy 93
Approximation Theory
We review the theory of optimal polynomial and rational Чебышев approximations, and the theory of elliptic functions leading to Золотарев’s formula for the sign function over the range R={z : ε ≤|z|≤1}. We show how Gauß’ arithmetico-geometric mean allows us to compute the Золотарев coefficients on the fly as a function of ε. This allows us to calculate sgn(H) quickly and accurately for a Hermitian matrix H whose spectrum lies in R.
Tuesday 18 April 2023 A D Kennedy 94
What is the best polynomial approximation p(x) to a continuous function f(x) for x in [0,1] ?
Polynomial approximation
Best with respect to the appropriate norm
where n 1
1/1
0
( ) ( )n
np f dx p x f xn
Tuesday 18 April 2023 A D Kennedy 95
Weierstraß’ theorem
Weierstraß: Any continuous function can be arbitrarily well approximated by a polynomial
0 1minmax ( ) ( )
p xp f p x f x
Taking n →∞ this is the Taking n →∞ this is the minimax norm norm
Tuesday 18 April 2023 A D Kennedy 96
Бернштейне polynomials
0
(1 )n
n n kn
k
nkf x x
n kp x
The explicit solution is provided by Бернштейне polynomials
Tuesday 18 April 2023 A D Kennedy 97
Чебышев’s theorem
The error |p(x) - f(x)| reaches its maximum at exactly d+2 points on the unit interval
0 1max
xp f p x f x
ЧебышевЧебышев: There is always : There is always a unique polynomial of any a unique polynomial of any degree d which minimises degree d which minimises
Tuesday 18 April 2023 A D Kennedy 98
Чебышев’s theorem: NecessitySuppose p-f has less than d+2 extrema of equal magnitudeThen at most d+1 maxima exceed some magnitude
And whose magnitude is smaller than the “gap”The polynomial p+q is then a better approximation than p to f
This defines a “gap”We can construct a polynomial q of degree d which has the opposite sign to p-f at each of these maxima (Lagrange interpolation)Lagrange
was born in Turin!
Tuesday 18 April 2023 A D Kennedy 99
Чебышев’s theorem: Sufficiency
Then at each of the d+2 extrema i i i ip x f x p x f x
Thus p’ – p = 0 as it is a polynomial of degree d
Therefore p’ - p must have d+1 zeros on the unit interval
Suppose there is a polynomialp f p f
Tuesday 18 April 2023 A D Kennedy 100
The notation is an old transliteration of Чебышев !
Чебышев polynomials
Convergence is often exponential in d The best approximation of degree d-1 over [-
1,1] to xd is 1
112
dd
d dp x x T x
1cos cosdT x d x
Where the Чебышев polynomials are
1 ln21 2
2
dd
d ddx p x T x e
The error is
Tuesday 18 April 2023 A D Kennedy 101
Чебышев rational functions
Чебышев’s theorem is easily extended to rational approximations
Rational functions with nearly equal degree numerator and denominator are usually best
Convergence is still often exponential
Rational functions usually give a much better approximation
A simple (but somewhat slow) numerical algorithm for finding the optimal Чебышев rational approximation was given by Ремез
Tuesday 18 April 2023 A D Kennedy 102
Чебышев rationals: ExampleA realistic example of a rational approximation is
x 2.3475661045 x 0.1048344600 x 0.00730638141
0.3904603901x 0.4105999719 x 0.0286165446 x 0.0012779193x
Using a partial fraction expansion of such rational functions allows us to use a multishift linear equation solver, thus reducing the cost significantly.
1 0.0511093775 0.1408286237 0.59648450330.3904603901
x 0.0012779193 x 0.0286165446 x 0.4105999719x
The partial fraction expansion of the rational function above is
This is accurate to within almost 0.1% over the range [0.003,1]
This appears to be numerically stable.
Tuesday 18 April 2023 A D Kennedy 103
Polynomials versus rationals
Optimal L2 approximation with weight is
2
1
1 x2 1
0
( ) 4( )
(2 1)
jn
jj
T xj
Optimal L∞ approximation cannot be too much better (or it would lead to a better L2 approximation)
lnn
e Золотарев’s formula has L∞ error
This has L2 error of O(1/n)
Tuesday 18 April 2023 A D Kennedy 104
Elliptic functions
Elliptic function are doubly periodic analytic functionsJacobi showed that if f is not a constant it can have at most two primitive periods
Tuesday 18 April 2023 A D Kennedy 105
Liouville’s theorem
( )P
dzf z
0
0
( )dzf z
( )dzf z
( )dzf z
0
( )dzf z
0
( ) ( )dz f z f z
0
( ) ( )dz f z f z
ω
ω’ ω+ω’
0
Res ( ) 0f
Tuesday 18 April 2023 A D Kennedy 106
Liouville’s theorem: Corollary I
( )( ) ln ( )
( )f z
g z f zf z
Apply this to the logarithmic derivative
1( ) ( ) ( )j jr rj jf z c z O z
Laurent expansion about each pole ζj
( ) 2 0jjP
dzg z i r
Hence number of poles must equal the number of zeros (with multiplicity)
Tuesday 18 April 2023 A D Kennedy 107
Liouville’s theorem: Corollary II
It follows that there are no non-constant holomorphic elliptic functions
If f is an analytic elliptic function with no poles then f(z)-a could have no zeros either
The more common form of
Liouville’s theorem was
actually due to Cauchy
Tuesday 18 April 2023 A D Kennedy 108
Liouville’s theorem: Corollary III
( ) 2P
dz h z i n n
Apply Liouville’s theorem to h(z) = z g(z)
Where n and n’ and the number of times f(z) winds around the origin as z is taken along a straight line from 0 to ω or ω’ respectively
1
m
k kk
n n
If αk and βk are the locations of the poles and zeros of f, then the sum of the locations of the poles minus the location of the zeros vanishes modulo the periods
Tuesday 18 April 2023 A D Kennedy 109
Weierstraß elliptic functions: I
A simple way to construct a doubly periodic analytic function is as the convergent sum
3,
1( ) 2
m m
Q zz m m
Note that Q is an odd function
Tuesday 18 April 2023 A D Kennedy 110
Weierstraß elliptic functions: II
The derivative of an elliptic function is clearly also an elliptic function, but in general its integral is not
Q z z c
1 12 2
But since Q is odd its integral is even, and thus
2 30
1 2zz d Q
z
So the Weierstraß function is elliptic
The only singularity is a double pole at the origin and its periodic images
The name of the function is , and the function is called the Weierstraß
function, but what is its name called?
(with apologies to the White Knight)
Tuesday 18 April 2023 A D Kennedy 111
Weierstraß elliptic functions: III
satisfies the differential equation
where
2 3
2 34z z g z g
3
6,
0
1140 m m
m m
g
m m
24
,
0
160 m m
m m
g
m m
Thus we may express the functional inverse of as the elliptic integral
32 34Z
dwz
w g w g
Tuesday 18 April 2023 A D Kennedy 112
Weierstraß elliptic functions: IV
The integral of - is not an elliptic function
20
1 1zz du u
z u
1 12 2z z
But it is an odd function, and so satisfies
2 i And Liouville’s theorem gives the useful identity
Simple pole at the origin with unit residue
Tuesday 18 April 2023 A D Kennedy 113
Weierstraß elliptic functions: V
That was so much fun we’ll do it again.
0
1exp
zz z u
u
Integrating gives the odd function ln
12z
z e z
With the periodic behaviourNo poles, and simple zero at the origin
Tuesday 18 April 2023 A D Kennedy 114
Expansion of elliptic functions: I
Let f be an elliptic function with periods ω and ω’ whose poles and zeros are at αj and βj respectively
Recall that ; w.l.g. set this to 0 1
m
k kk
n n
1 2
1 2
n
n
z b z b z bg z
z a z a z a
The function g then has the same poles and zeros as f
And is periodic, and hence elliptic
1
expn
j jj
g z g z g z
Tuesday 18 April 2023 A D Kennedy 115
Expansion of elliptic functions: II
By Liouville’s theorem f/g is an elliptic function with no poles, and is thus a constant C
1 2
1 2
n
n
z b z b z bf z C
z a z a z a
Every elliptic function may thus be written in the “rational” form
Tuesday 18 April 2023 A D Kennedy 116
Addition formulæ: I
Similarly, any elliptic function f can be expanded in “partial fraction” form in terms of Weierstraß ζ functions.
12
u vu v u v
u v
By considering this allows us to show
that the ζ function satisfies the addition formula
uf u
u v
Tuesday 18 April 2023 A D Kennedy 117
Addition formulæ: II
2
14
u vu v u v
u v
And by differentiation we obtain an addition formula for too
The principal conclusion is that any elliptic function f can be represented in terms of rational functions of Weierstraß functions of the same argument with the same periods
e of z R z R z z
Tuesday 18 April 2023 A D Kennedy 118
Jacobian elliptic functions: I
Although Weierstraß elliptic functions are most elegant for theoretical studies, in practice it is useful to follow Jacobi and introduce the function sn(z;k) through the inversion of the elliptic integral
sn
2 2 20 1 1
z dtz
t k t
Tuesday 18 April 2023 A D Kennedy 119
Jacobian elliptic functions: II
12
2 3
sn ;z
z kz e z e
Of course, this cannot be anything new – it must be expressible in terms of the Weierstraß functions with the same periods
Where the “roots” of the cubic form in Weierstraß differential equation are
2 2 2
1 2 3
1 6 1 6 1, ,
6 12 12k k k k k
e e e
Tuesday 18 April 2023 A D Kennedy 120
Jacobian elliptic functions: III
We are hiding the fact that the Weierstraß functions depend not only on z but also on periods ω and ω’
1
2 2 20
( )1 1
dtK k
t k t
It is not too hard to show that the periods of the Jacobian elliptic functions are where and the complete elliptic integral is
4 ( ), 2 ( )K k iK k 2 2 1k k
Tuesday 18 April 2023 A D Kennedy 121
Jacobian elliptic functions: IV
2
213
1sn ;
1z k
z k
For later purposes we note the relation between the even Jacobi and Weierstraß elliptic functions
A corollary of this is that any even elliptic function with periods may be expressed as a rational function of sn(z;k)2
4 ( ), 2 ( )K k iK k
Tuesday 18 April 2023 A D Kennedy 122
Modular transformations: I
We observed earlier that there are many choices of periods ω and ω’ which generate the same period lattice: if we choose
with this will be the case.
,
det 1
Since the period lattices are the same, elliptic
functions with these periods are rationally expressible in terms of each other
This is called a first degree transformation
Tuesday 18 April 2023 A D Kennedy 123
Modular transformations: II
We can also choose periods whose lattice has the original one as a sublattice, e.g., , /n
Elliptic functions with these periods must also be rationally expressible in terms of ones with the original periods, although the inverse may not be true
This is called a transformation of degree n
Tuesday 18 April 2023 A D Kennedy 124
Let us construct an elliptic function with periods
4 ( ), 2 ( ) /K k iK k n
Modular transformations: III
This is easily done by taking sn(z;k) and scaling z by some factor 1/M and choosing a new parameter λ
We thus consider sn(z/M; λ) with periods for z/M of
4 4 , 2 2L K iL iK
The periods for z are thus 4LM and 2iL’M, so we must have LM=K and L’M=K’/n
Tuesday 18 April 2023 A D Kennedy 125
Modular transformations: IV
The even elliptic function must therefore
be expressible as a rational function of
sn / ;
sn ;
z M
z k
2sn ;z k
Matching the poles and zeros, and fixing the overall normalisation from the behaviour near z=0 we find
2
2/ 2
21
212
sn ;1
sn / ; sn 2 / ;1sn ; sn ;
1sn 2 / ;
n
m
z k
z M iKm n k
z k M z k
iK m n k
Tuesday 18 April 2023 A D Kennedy 126
Modular transformations: V
The quantity M is determined by evaluating this identity at the half period K where sn ; sn / ; sn ; 1K k K M L
Likewise, the parameter λ is found by evaluating the identity at z = K + iK’/n
/ 2 2
21
1sn ;
1
nm
m m
czM M c
It is useful to write the identity in parametric form
Tuesday 18 April 2023 A D Kennedy 127
Im z
Re z0 2K
K-2K
-K
iK’
-iK’
Золотарев’s formula: I
Real period 4K, K=K(k)Imaginary period 2iK’, K’=K(k’) , k2+k’2=1
Fundamental region
∞1/k10-1
-1/k
Tuesday 18 April 2023 A D Kennedy 128
Золотарев’s formula: IIsn(z/M,λ)
sn(z,k)
Im z
Re z0 2K
K-2K
-K
iK’
-iK’
Tuesday 18 April 2023 A D Kennedy 129
Arithmetico-Geometric mean: I
Long before the work of Jacobi and Weierstraß
Gauß considered the following sequence 1
1 12 ,n n n n n na a b b a b
Since these are means
1 1;n n n n n na a b a b b
Furthermore 22 2 11 1 4 0n n n na b a b
1 1n n n na a b b
So it converges to the AG mean
a b
Tuesday 18 April 2023 A D Kennedy 130
Gauß transformation
2
1 sn ;2sn 1 ;
1 1 sn ;
k u kkk u
k k u k
Gauß discovered the second degree modular transformation (but not in this notation, of course)
Let 1
2, sn 1 ; , sn ;
1n n
n nn n
a b kk s k u s u k
a b k
The the Gauß transformation tells us that
1 1
2 21 1
1 21
n n nn
n n n n n n
k s a ss
ks a b a b s
This is just a special case of the degree n modular transformation we used
before
Tuesday 18 April 2023 A D Kennedy 131
Arithmetico-Geometric mean: II
On the other hand
1
2 2 21 1 42 20 0 1 121
11
n ns s
dt dt
t k t kt t
k
uk
So the quantity
is a constant (the RHS is independent of n)
1
2 2 2 2 2 2 21 1
2 21 1 1 11 0 0
n n
n n n n
s s
dt dt
t a t b t t a t b tn
ua
Tuesday 18 April 2023 A D Kennedy 132
Arithmetico-Geometric mean: III
Therefore it must have the value
2 2 2 2
1
21 11 0 0
sin1
21
s s
t a t b tn
su dtdta a at
We may thus compute
if we start with , then also
21 sn ; ,1, 1ns u k f u k
2
sin if
, , 2with , , if
2
au a b
f u a b a a bf u ab a b
a b a b
2
K ka
0 0
11,
1k
a bk
Tuesday 18 April 2023 A D Kennedy 133
Arithmetico-Geometric mean: IV
We have a rapidly converging recursive function to compute sn(u;k) and K(k) for real u and 0<k<1
Values for arbitrary complex u and k can be computed from this using simple modular transformations
These are best done analytically and not numerically
With a little care about signs we can also compute cn(u;k) and
dn(u;k) which are then also needed
Testing for a=b in floating point arithmetic is sinful: it is better to continue until a ceases to decrease
Tuesday 18 April 2023 A D Kennedy 134
Conclusions: II
One can compute matrix functions exactly using fairly low-order rational approximations
“Exactly” means to within a relative error of 1 u.l.p.Which is what “exactly” normally means for scalar computationsOne can also use an absolute error, but it makes no difference for the sign function!
For the case of the sign function we can evaluate the Золотарев coefficients needed to obtain this precision on the fly just to cover the spectrum of the matrix under consideration
There seem to be no numerical stability problems
Tuesday 18 April 2023 A D Kennedy 135
Bibliography: I
Elliptic functionsНаум Ильич Ахиезер (Naum Il’ich Akhiezer), Elements of the Theory of Elliptic Functions, Translations of Mathematical Monographs, 79, AMS (1990). ISBN 0065-9282 QA343.A3813.
Tuesday 18 April 2023 A D Kennedy 136
Bibliography: II
Approximation theoryНаум Ильич Ахиезер (Naum Il’ich Akhiezer), Theory of Approximation, Constable (1956). ISBN 0094542007.
Theodore J. Rivlin, An Introduction to the Approximation of Functions, Dover (1969). ISBN 0-486-64069-8.
Tuesday 18 April 2023 A D Kennedy 137
Hasenbusch Acceleration
Hasenbusch introduced a simple method that significantly improves the performance of dynamical fermion computations. We shall consider a variant of it that accelerates fermionic molecular dynamics algorithms by introducing n pseudofermion fields coupled with the nth root of the fermionic kernel.
Tuesday 18 April 2023 A D Kennedy 138
Non-linearity of CG solver
Suppose we want to solve A2x=b for Hermitian A by CG
It is better to solve Ax=y, Ay=b successivelyCondition number (A2) = (A)2
Cost is thus 2(A) < (A2) in general
Suppose we want to solve Ax=bWhy don’t we solve A1/2x=y, A1/2y=b successively?
The square root of A is uniquely defined if A>0This is the case for fermion kernels
All this generalises trivially to nth rootsNo tuning needed to split condition number evenly
How do we apply the square root of a matrix?
Tuesday 18 April 2023 A D Kennedy 139
Rational matrix approximation
Functions on matricesDefined for a Hermitian matrix by diagonalisationH = UDU-1
f(H) = f(UDU-1) = U f(D) U-1
Rational functions do not require diagonalisation
Hm + Hn = U( Dm + Dn) U-1
H-1 = UD-1U-1
Rational functions have nice propertiesCheap (relatively) Accurate
Tuesday 18 April 2023 A D Kennedy 140
No Free Lunch Theorem
We must apply the rational approximation with each CG iteration
M1/n r(M)The condition number for each term in the partial fraction expansion is approximately (M)So the cost of applying M1/n is proportional to (M)Even though the condition number (M1/n)=(M)1/n
And even though (r(M))=(M)1/n
So we don’t win this way…
Tuesday 18 April 2023 A D Kennedy 141
Pseudofermions
We want to evaluate a functional integral including the fermionic determinant det M
1
det MM d d e
We write this as a bosonic functional integral over a pseudofermion field with kernel M -1
Tuesday 18 April 2023 A D Kennedy 142
Multipseudofermions
We are introducing extra noise into the system by using a single pseudofermion field to sample this functional integral
This noise manifests itself as fluctuations in the force exerted by the pseudofermions on the gauge fieldsThis increases the maximum fermion forceThis triggers the integrator instabilityThis requires decreasing the integration step size
11
detnMnM d d e
A better estimate is det M = [det M1/n]n
Tuesday 18 April 2023 A D Kennedy 143
Hasenbusch’s method
Clever idea due to HasenbuschStart with the Wilson fermion action M=1-HIntroduce the quantity M’=1-’HUse the identity M = M’(M’-1M)Write the fermion determinant as det M = det M’ det (M’-
1M)Introduce separate pseudofermions for each determinantAdjust ’ to minimise the cost
Easily generalisesMore than two pseudofermionsWilson-clover action
Tuesday 18 April 2023 A D Kennedy 144
Violation of NFL Theorem
So let’s try using our nth root trick to implement multipseudofermions
Condition number (r(M))=(M)1/n
So maximum force is reduced by a factor of n(M)(1/n)-1
This is a good approximation if the condition number is dominated by a few isolated tiny eigenvaluesThis is so in the case of interest
Cost reduced by a factor of n(M)(1/n)-1
Optimal value nopt ln (M)
So optimal cost reduction is (e ln) /
This works!
Tuesday 18 April 2023 A D Kennedy 145
RHMC technicalities
In order to keep the algorithm exact to machine precision (as a Markov process)
Use a good (machine precision) rational approximation for the pseudofermion heatbathUse a good rational approximation for the HMC acceptance test at the end of each trajectoryUse as poor a rational approximation as we can get away with to compute the MD force