
Cite this: Phys. Chem. Chem. Phys., 2019, 21, 6506

Neural network force fields for simple metals and semiconductors: construction and application to the calculation of phonons and melting temperatures†

Mário R. G. Marques, Jakob Wolff, Conrad Steigemann and Miguel A. L. Marques*

Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, D-06120 Halle (Saale), Germany. E-mail: [email protected]

† Electronic supplementary information (ESI) available. See DOI: 10.1039/c8cp05771k

Received 12th September 2018, Accepted 25th February 2019

DOI: 10.1039/c8cp05771k

rsc.li/pccp

We present a practical procedure to obtain reliable and unbiased neural network based force fields for solids. Training and test sets are efficiently generated from global structural prediction runs, at the same time assuring the structural variety and importance of sampling the relevant regions of phase space. The neural networks are trained to yield not only good formation energies, but also accurate forces and stresses, which are the quantities of interest for molecular dynamics simulations. Finally, we construct, as an example, several force fields for both semiconducting and metallic elements, and prove their accuracy for a variety of structural and dynamical properties. These are then used to study the melting of bulk copper and gold.

1 Introduction

In the past decades we have witnessed the development of several electronic structure methods1 for the study of the properties of materials. Among these, density functional theory2–4 occupies a prominent place due to its unparalleled combination of accuracy and computational efficiency. With the continuous development of new approximations for the elusive exchange–correlation functional,5 the improvement of computer codes, and the availability of ever faster supercomputers, density-functional theory has been reliably applied to the study of millions of compounds and forms the backbone of all current high-throughput and accelerated material design efforts.6–8 Examples of these methodologies for battery materials,9 synthetic fuels,10 perovskites,11,12 and hard magnets,13 among others, can readily be found in the literature.

Unfortunately, and in spite of this numerical efficiency, density functional calculations have a number of limitations. A single evaluation of the total energy is usually limited to systems with ∼10 000 atoms. For molecular dynamics simulations, or other approaches that involve sampling the potential energy surface, this number is reduced to a few hundred. For the complicated problem of global structural prediction,14 we are currently limited to a few tens of atoms at most.15,16

To overcome this problem, several faster approaches were developed. These include, for example, density-functional-based tight binding17–19 and charge equilibration methods, such as ReaxFF20 and the charge-optimized many-body potential.21 At an even higher level we find classical force fields, such as the Tersoff22 and the Stillinger–Weber potentials.23 Unfortunately, all these approaches turn out to be considerably less precise than straightforward density-functional theory.

Meanwhile, interest in machine learning methods24,25 has been growing, fueled by a series of outstanding results in face recognition,26–29 image classification,30 self-driving cars,31 playing Atari games,32 Go,33 etc. In the fields of solid-state physics and theoretical chemistry, these methods can already predict the properties of molecules,34,35 optimize transition states,36 determine band gaps,37 and predict the stability of different materials.38,39

Machine learning methods have also prospered in the creation of interatomic potentials.40–50 Here we will be concerned with neural network51 force fields that rely on the Behler–Parrinello method.52 This is a rather successful approach53 that can yield errors in the energy as low as a few meV per atom and that scales linearly with the number of atoms N in the unit cell. Furthermore, the forces exerted on the atoms, required for most simulations, can be obtained analytically through differentiation. Finally, it also has several open-source implementations, such as the ÆNET,54 Amp55 and TensorMol-0.156 packages.

Initially, the cost function used to train Behler–Parrinello force fields only took into account the error in the energy.52 In fact, for training sets of moderate size, this error can be rather small. However, the error obtained for the forces can remain reasonably high52,54 (above 100 meV Å⁻¹), and reducing it requires considerably larger training sets.

An alternative is the inclusion of the errors in the forces in the cost function.42,57–60 Additionally, the errors in the stress can be included in a similar fashion. For each crystal structure there are 3N forces and 6 components of the stress tensor, while there is only one energy. In a way, using forces and stresses is equivalent to increasing the training set by a rather large factor. Furthermore, these gradients are readily available in any density-functional calculation through the use of the Hellmann–Feynman and generalized virial theorems.1 Forces and stresses are derivatives of the energy, therefore their inclusion in the cost function requires an extension of the back-propagation algorithm.58

The other key factor in developing neural network based force fields is the construction of adequate training and test sets. These should be varied, which in this context means that they should include crystal structures that span all possible bonding patterns and bond lengths relevant for the simulations. They should also be economical, in the sense that the construction of the training and test sets should require the least amount of computer time possible. These are, of course, contradictory requirements, so a good compromise has to be found.

In this article, we develop a methodology that is capable of providing reliable neural network force fields for a large range of physical situations. Training and test sets are constructed from global structural prediction calculations, ensuring an optimal coverage of all possible bonding environments. Furthermore, training is performed by optimizing energies, forces, and stresses in a balanced manner. As a positive side effect, this considerably increases the reliability and transferability of our networks.

The rest of this article is organized as follows. In Section 2, we present the Behler–Parrinello method as implemented in the ÆNET package. In Section 3, we discuss how to perform back propagation for objective functions involving the error in the forces and the stress tensor. In Section 4, we propose atomic interaction potentials for several semiconducting and metallic elements and one binary. Finally, in Section 5, we present some applications of our neural network force fields, namely the calculation of phonon frequencies and the study of the melting of simple metals. The article ends with our conclusions.

2 Artificial neural networks

The first step in describing a potential energy surface using an artificial neural network (ANN) pertains to the representation of the structure, i.e., how to describe a crystal in a way that can be fed as input into a machine learning method. Several approaches have been proposed, such as the Coulomb matrix,61 the Ewald sum matrix,62 partial radial distribution functions,63 root-mean-square distances,64 and the smooth overlap of atomic positions kernel,65 among others. In their influential paper,52 Behler and Parrinello introduced a set of atom-centered symmetry functions that have enjoyed considerable success in the past years.66,67 These symmetry functions depend on two or three atomic coordinates and map the distribution of distances between atoms (radial functions) or the distribution of bond angles (angular functions), respectively. Furthermore, they possess certain important properties,42 such as rotational and atomic permutation invariance.

In the Behler–Parrinello method, each atom of a system is therefore described by a set of symmetry functions that serve as input $o^0_j$ to the neural network of that element. Every element in the periodic table is characterized by a different network. An example of such a network can be found in Fig. 1.

Eqn (1) shows how the nodes $o^n_z$ of a layer are obtained from the nodes of the previous layer,
$$o^n_z = \varphi\!\left(h^n_z\right) = \varphi\!\left(\sum_j w^n_{jz}\, o^{n-1}_j\right), \qquad (1)$$
where $\varphi$ is an activation function, $h^n_z$ its argument, and $w^n_{jz}$ is the weight between node $j$ in layer $n-1$ and node $z$ in layer $n$. Finally, to compute the total energy of the system, a summation over the outputs ($E_i$) of each atomic neural network is required.

At this point, to optimize the weights we face a high-dimensional problem, where the objective function is usually defined as
$$\varepsilon = \frac{1}{2}\sum_s \left(\sum_i E_i - E^{\mathrm{ref}}\right)^2_s. \qquad (2)$$
Here the sum over $i$ runs over all the atoms in the considered structure $s$, and the difference is between the energy computed by the neural network and that of some reference method. Currently, the back propagation algorithm24,68 is the standard algorithm to find the set of optimal weight parameters by minimization of the error function.
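To make eqn (1) and (2) concrete, the following minimal sketch (our own illustration in Python/numpy, not the ÆNET implementation; the layer sizes, the tanh activation and the data layout are assumptions) evaluates a small per-element feed-forward network for every atom, sums the atomic outputs to obtain the total energy, and accumulates the energy-only cost of eqn (2) over a set of structures.

```python
import numpy as np

def init_network(n_in, layers=(5, 5, 1), seed=0):
    """Random weights (including a bias row) for a small per-element network."""
    rng = np.random.default_rng(seed)
    sizes = (n_in,) + layers
    return [rng.normal(scale=0.1, size=(sizes[k] + 1, sizes[k + 1]))
            for k in range(len(layers))]

def atomic_energy(weights, g):
    """Eqn (1): o^n = phi(w^n o^{n-1}), with a bias node of value 1 (cf. Fig. 1)."""
    o = np.asarray(g, dtype=float)
    for n, w in enumerate(weights):
        h = np.append(o, 1.0) @ w                       # bias node appended
        o = h if n == len(weights) - 1 else np.tanh(h)  # linear output layer
    return float(np.squeeze(o))                         # E_i, atomic contribution

def total_energy(weights, sym_funcs):
    """Sum of atomic network outputs for one structure (rows = atoms)."""
    return sum(atomic_energy(weights, g) for g in sym_funcs)

def energy_cost(weights, structures):
    """Eqn (2): 0.5 * sum_s (sum_i E_i - E_ref)^2."""
    return 0.5 * sum((total_energy(weights, s["G"]) - s["E_ref"]) ** 2
                     for s in structures)

# toy usage: two structures, 4 atoms each, 8 symmetry functions per atom
rng = np.random.default_rng(1)
net = init_network(n_in=8)
data = [{"G": rng.normal(size=(4, 8)), "E_ref": -3.2},
        {"G": rng.normal(size=(4, 8)), "E_ref": -2.9}]
print(energy_cost(net, data))
```

In a real training run the weights would then be updated by back propagation of this cost, as discussed below.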

Fig. 1 Example of a multilayer perceptron feedforward neural network as used in this work. The bias nodes are shown with a dashed contour. Their standard value is 1.


To compute forces and the stress tensor, we have to calculate the derivatives of the energy with respect to the atomic positions and to the components of the strain tensor ($\epsilon_{\alpha\beta}$):
$$F_{\gamma\alpha} = -\frac{\partial E}{\partial R_{\gamma\alpha}} = -\sum_i^{N}\sum_l^{M_i}\frac{\partial E_i}{\partial o^0_{il}}\,\frac{\partial o^0_{il}}{\partial R_{\gamma\alpha}} \qquad (3)$$
$$\sigma_{\alpha\beta} = -\frac{1}{\Omega}\frac{\partial E}{\partial \epsilon_{\alpha\beta}} = -\frac{1}{\Omega}\sum_i^{N}\sum_l^{M_i}\sum_\gamma^{N}\frac{\partial E_i}{\partial o^0_{il}}\,\frac{\partial o^0_{il}}{\partial R_{\gamma\alpha}}\,\frac{\partial R_{\gamma\alpha}}{\partial \epsilon_{\alpha\beta}}. \qquad (4)$$
Here $N$ is the number of atoms in the system, $M_i$ is the number of symmetry functions for atom $i$ (or the number of nodes in the input layer), and $\Omega$ is the volume of the system.
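Eqn (3) and (4) are chain rules through the input (symmetry-function) layer. The short sketch below (our own illustration; the array names are placeholders, and we assume a homogeneous strain so that ∂R_{γα}/∂ε_{αβ} = R_{γβ}) shows how forces and the stress tensor can be assembled once ∂E_i/∂o⁰_{il} and ∂o⁰_{il}/∂R_{γα} are available.

```python
import numpy as np

def forces_and_stress(dE_dG, dG_dR, R, volume):
    """
    dE_dG  : (N, M)        derivative of each atomic energy w.r.t. its own inputs
    dG_dR  : (N, M, N, 3)  derivative of symmetry function l of atom i w.r.t. R_{gamma,alpha}
    R      : (N, 3)        Cartesian positions
    volume : float         cell volume Omega
    """
    # eqn (3): F_{gamma,alpha} = - sum_{i,l} dE_i/dG_{il} * dG_{il}/dR_{gamma,alpha}
    dE_dR = np.einsum('il,ilga->ga', dE_dG, dG_dR)
    forces = -dE_dR
    # eqn (4), using dR_{gamma,alpha}/d eps_{alpha,beta} = R_{gamma,beta}
    stress = -np.einsum('ga,gb->ab', dE_dR, R) / volume
    return forces, stress

# toy shapes: 3 atoms, 8 symmetry functions per atom
rng = np.random.default_rng(0)
F, S = forces_and_stress(rng.normal(size=(3, 8)), rng.normal(size=(3, 8, 3, 3)),
                         rng.normal(size=(3, 3)), volume=100.0)
```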

All the magic behind neural networks can be explained by the hidden layers and the nonlinearity of the activation functions. Examples of activation functions are the linear function, the hyperbolic tangent, the leaky rectified linear unit (leaky ReLU), and the softplus:
$$\varphi_{\mathrm{linear}}(x) = x \qquad (5)$$
$$\varphi_{\tanh}(x) = \frac{1-e^{-2x}}{1+e^{-2x}} \qquad (6)$$
$$\varphi_{\mathrm{leaky\ ReLU}}(x) = \max(0.01x,\, x) \qquad (7)$$
$$\varphi_{\mathrm{softplus}}(x) = \log(1+e^{x}). \qquad (8)$$

In recent years, the leaky ReLU has been gaining popularity,69 for example in applications to image processing. In particular, this activation function leads to networks that are faster and easier to train. Unfortunately, the leaky ReLU has a discontinuous derivative at x = 0, which makes the forces and stresses discontinuous as a function of the atomic positions. In the next section we will see an example of this problem.
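A quick way to see the problem with the leaky ReLU of eqn (7) is to evaluate its derivative on both sides of the origin and compare with the smooth softplus derivative (a small sketch of our own, not part of the paper's workflow):

```python
import numpy as np

def leaky_relu(x):            # eqn (7)
    return np.maximum(0.01 * x, x)

def softplus(x):              # eqn (8)
    return np.log1p(np.exp(x))

def d_leaky_relu(x):          # derivative jumps from 0.01 to 1 at x = 0
    return np.where(x < 0.0, 0.01, 1.0)

def d_softplus(x):            # logistic function, smooth everywhere
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-1e-6, 1e-6])
print(d_leaky_relu(x))   # [0.01 1.  ]  -> discontinuous forces and stresses
print(d_softplus(x))     # ~[0.5 0.5]   -> smooth
```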

3 Backpropagation

When trained only for energies, neural network potentials achieve remarkable accuracy for the energies, yet the forces (and stresses) may be less precise, depending on the diversity and size of the training set. As these quantities are obtained from the analytic derivative of the neural network function, this may imply that the neural networks do not describe the potential energy surface correctly. Furthermore, forces (and stresses) are readily available from any electronic structure method, so why not use all the available information to train the networks? In fact, the inclusion of forces in the cost function has been standard practice for several years,42,55,58,70 and stresses can also be readily incorporated. Other machine learning potentials can also use this information.49,50,71,72

The inclusion of the forces and stresses in the training of the neural networks can be done by generalizing the cost function to
$$\varepsilon = \sum_s\left[\alpha\left(\sum_i^{N} E_i - E^{\mathrm{ref}}\right)^2 + \frac{\gamma}{6}\sum_k^{6}\left(S_k - S^{\mathrm{ref}}_k\right)^2 + \frac{\beta}{3N}\sum_j^{N}\sum_\zeta^{3}\left(F_{j\zeta} - F^{\mathrm{ref}}_{j\zeta}\right)^2\right]_s \qquad (9)$$
where $F_{j\zeta}$ is the component $\zeta$ of the force acting on atom $j$, $S_k$ stands for the components of the stress tensor in Voigt notation, and $\alpha$, $\beta$ and $\gamma$ are parameters that scale units and can be changed to adjust the relative importance of the different terms.
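Written out in code, the generalized cost of eqn (9) is straightforward; the sketch below is our own transcription (the data layout and the default values of α, β and γ are assumptions, not the ÆNET input format):

```python
import numpy as np

def joint_cost(pred, ref, alpha=1.0, beta=1.0, gamma=1.0):
    """
    pred/ref: lists of dicts with keys
      'E'  total energy of the structure (float)
      'F'  forces, shape (N, 3)
      'S'  stress in Voigt notation, shape (6,)
    Implements eqn (9), with the per-structure normalisation of the force
    (3N terms) and stress (6 terms) contributions.
    """
    cost = 0.0
    for p, r in zip(pred, ref):
        n_atoms = p["F"].shape[0]
        cost += alpha * (p["E"] - r["E"]) ** 2
        cost += gamma / 6.0 * np.sum((p["S"] - r["S"]) ** 2)
        cost += beta / (3.0 * n_atoms) * np.sum((p["F"] - r["F"]) ** 2)
    return cost
```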

To use this objective function we have the option of either expanding the network or generalizing the back propagation algorithm. The first option involves augmenting the output of the neural network to include not only the energy, but also the forces and stresses. As this option does not take into account the relationship between the energy and its gradients (forces and stresses), inconsistencies between them are prone to appear. The second option is much more flexible and is the one used here. The back propagation algorithm relies on the derivative of the objective function with respect to the weights of the neural network. Normally, this entails the derivative of the output of the neural network ($E_i$). Defining $o^n$ as a vector containing the nodes of layer $n$, $w^n$ as a matrix containing the weights that connect the nodes in layers $n-1$ and $n$, and $\dot{\varphi}^n$ as a diagonal matrix that stores the derivatives of the nodes of layer $n$, the derivative of the output of the neural network takes the form
$$\frac{\partial E_i}{\partial w^n} = \left(\left[\prod_{m>n}^{N_h+1}\dot{\varphi}^m w^m\right]\dot{\varphi}^n \otimes o^{n-1}\right)_i \qquad (10)$$
where $N_h$ is the number of hidden layers, and the product inside square brackets is ordered in decreasing order of layers, i.e.,
$$\prod_{m>1}^{3}\dot{\varphi}^m w^m = \dot{\varphi}^3 w^3\,\dot{\varphi}^2 w^2.$$

In addition to this contribution, the back propagation algorithm requires the second derivatives of the neural network output with respect to the weights. Defining $\ddot{\varphi}^n$ as a diagonal matrix that stores the second derivatives of the nodes of layer $n$, we obtain
$$\frac{\partial F_{j\alpha}}{\partial w^n} = \sum_i^{N}\left(\left[\prod_{m>n}^{N_h+1}\dot{\varphi}^m w^m\right]\dot{\varphi}^n \otimes x^{n-1} + \sum_{p=n}^{N_h+1}\left[\prod_{m>n}^{N_h+1}\dot{\varphi}^m w^m\right]\lambda^p\left[\prod_{q>1}^{p} w^q\,\dot{\varphi}^{q-1}\right]\otimes o^{n-1}\right)_i, \qquad (11)$$
for the derivative of the force components with respect to the weights. Here the products inside brackets are ordered in decreasing order of layers, and $x^k$ is defined through the recursion relation
$$x^k = \dot{\varphi}^k w^k x^{k-1}, \qquad (12)$$
$$x^0 = \frac{\partial o}{\partial R_{j\alpha}}. \qquad (13)$$
The quantity $\lambda^p$ in eqn (11) is a diagonal matrix defined by
$$\lambda^k_{ij} = \left(\ddot{\varphi}^k w^k x^k\right)_i\,\delta_{ij}. \qquad (14)$$

We can also take the derivative of the components of the stress tensor ($S_k$) with respect to the weights. This can be done by replacing $\partial o/\partial R_{j\alpha}$ by
$$\frac{1}{\Omega}\sum_j^{N}\frac{\partial o}{\partial R_{j\alpha}}\,\frac{\partial R_{j\alpha}}{\partial \epsilon_{\alpha\beta}} \qquad (15)$$
in eqn (11)–(13), according to eqn (3) and (4).
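As a sanity check on expressions such as eqn (10), the following self-contained sketch (a toy network of our own, with softplus activations and a linear output, not the ÆNET code) evaluates ∂E/∂w^n as a product of layer Jacobians and compares it with a finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)

# tiny network: 4 inputs, two hidden layers of 3 nodes, scalar output (0-based layer index)
sizes = [4, 3, 3, 1]
W = [rng.normal(size=(sizes[k], sizes[k + 1])) for k in range(3)]

phi  = lambda x: np.log1p(np.exp(x))          # softplus
dphi = lambda x: 1.0 / (1.0 + np.exp(-x))     # its derivative

def forward(W, g):
    o, pre = g, []
    for n, w in enumerate(W):
        h = o @ w
        pre.append(h)
        o = h if n == len(W) - 1 else phi(h)  # linear output layer
    return o.item(), pre

def dE_dw(W, g, n):
    """Analogue of eqn (10): [prod of diag(phi') w over later layers] outer the layer input."""
    _, pre = forward(W, g)
    o = [g]                                   # activations o^0 ... o^{n-1}
    for h in pre[:-1]:
        o.append(phi(h))
    left = np.ones(1)                         # derivative of E w.r.t. the output pre-activation
    for m in range(len(W) - 1, n, -1):        # back-propagate through the later layers
        left = left @ W[m].T * dphi(pre[m - 1])
    return np.outer(o[n], left)               # dE/dW[n]_{ij} = o^{n}_i * left_j

# finite-difference check for one weight of layer n = 1
g = rng.normal(size=4)
n, i, j, eps = 1, 2, 1, 1e-6
Wp = [w.copy() for w in W]; Wp[n][i, j] += eps
Wm = [w.copy() for w in W]; Wm[n][i, j] -= eps
num = (forward(Wp, g)[0] - forward(Wm, g)[0]) / (2 * eps)
print(np.isclose(dE_dw(W, g, n)[i, j], num))  # True
```

The same strategy (analytic expression vs. numerical derivative) can be used to verify the force derivatives of eqn (11).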


These expressions were coded in the open-source package ÆNET. Our implementation allows the optimization of the energy, forces, or stresses separately, or of any combination of them.

We would like to point out that a back propagation method using eqn (11)–(13) will not always provide an update for every weight of the neural network. Careful analysis of these equations reveals that the bias neurons will not be updated for activation functions with vanishing second derivatives. Therefore, for these cases, the training for energy, forces and stresses should occur either by joint training or by using a double-loop optimization. In fact, eqn (11) simplifies considerably for activation functions with vanishing second derivatives (such as the ReLU), as only the first term remains. This can, in principle, lead to a decrease in the time necessary to train the neural network.

4 Examples of force fields

To test our methodology, we selected standard semiconducting elements (Si and Ge) and simple metals (Cu and Au). All of these are well studied in the literature, where one can find several examples of classical force fields,22,23 tight-binding parameterizations,73,74 and even neural network based force fields.44,66,75 We will also look at a binary system, namely Si–Ge.

The creation of an artificial neural network, or of any other machine learning force field, requires three steps: the construction of a data set containing the structures to be fitted and their properties, the training of the model, which normally requires the minimization of an objective function, and the validation of the force field to ascertain its predictive power. These three steps will be discussed in detail in the following.

4.1 Construction of the data sets

The structure optimizations and the force and energy calculations presented in this work were performed within density functional theory with the VASP code.76,77 We used the PAW setups from version 5.2 of VASP and approximated the exchange–correlation functional with the Perdew–Burke–Ernzerhof approximation.78

Generation of the reference datasets was carried out in the same spirit as in the work of Huran et al.79 and proceeds in two separate steps. The first step of the process uses global structural prediction,14 namely the minima-hopping method (MHM),80,81 to explore the possible bonding patterns and crystal structures allowed by the chemical composition. The output of our runs includes not only a set of structures corresponding to local minima, but also another set of intermediate structures resulting from molecular dynamics runs. Note that the majority of these structures are far from dynamical equilibrium. To complement our datasets, and similarly to the strategy of Artrith et al.,54 we then apply a series of geometrical distortions to the local minima. Finally, we sometimes add a set of two-dimensional minima structures obtained with the procedure of Borlido et al.,82,83 in order to increase the reliability of the networks for low-dimensional phases.

We note that in the work of Huran et al., the tight-binding parameterization only required around 500 structures per element for training. This number had to be increased substantially for our neural networks. As an example, our set for silicon contains 131 minima, 54 2D structures, 7142 distortions and 19 033 structures obtained from molecular dynamics simulations. Of these, 70% were used for training and 30% for testing. The sizes of the sets for the other systems are similar to those for silicon, and those used for training are displayed in Table 1. These numbers show that we have a rather small number of minima and 2D structures, which led us to increase the weight of these structures.

We expect the neural network potentials produced from these datasets to be able to describe the local minima of the potential energy surface of each element/binary and the surrounding regions. Different temperatures, structures under stress and structures with relatively large forces are also taken into account (through the MD and distorted structures) and should be properly described. On the other hand, we do not believe that our potentials will be able to describe molecules or clusters, as these kinds of environments are not present in the training sets. Finally, supercells that locally resemble the structures present in our training set should be properly described by the neural network potentials, as most of the relevant contributions to the energy reside within the cut-off radius defined for each atom.

4.2 Training of the artificial neural networks

We expect that regions of phase space close to the dynamical minima are sampled most often in simulations. Therefore, to obtain a better description of these regions, we decided to include in our objectives a dimensionless weight factor $w_i$ that is given in terms of $\bar{F}_i$, the average norm of the forces acting on the atoms of the $i$th structure. When $\bar{F}_i$ is measured in eV Å⁻¹, it reads
$$w_i = \frac{0.2}{0.2 + \bar{F}_i^2}. \qquad (16)$$
This function attains its maximum value of one for $\bar{F}_i = 0$ and decreases monotonically for larger forces.
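In code, the weighting of eqn (16) amounts to one line applied to the average force norm of each structure (a sketch; the example forces are invented):

```python
import numpy as np

def structure_weight(forces):
    """Eqn (16): w_i = 0.2 / (0.2 + Fbar_i**2), with Fbar_i in eV/Angstrom."""
    f_bar = np.mean(np.linalg.norm(forces, axis=1))   # average force norm over atoms
    return 0.2 / (0.2 + f_bar ** 2)

print(structure_weight(np.zeros((8, 3))))      # 1.0  (relaxed structure)
print(structure_weight(np.full((8, 3), 1.0)))  # well below 1 (strongly distorted)
```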

Furthermore, we are usually interested in energy differences and not in the absolute value of the total energy (which is anyway arbitrary for an infinite solid). For example, the formation energy uses as a reference the ground state of the elementary substances. To decrease the error in the evaluation of these energies, we found it useful to increase the weight of the reference structures in the training set, so that these are well reproduced by the neural networks.

Table 1 Number of each kind of structure in our training sets. We note that the binary training dataset also included all the elemental minima structures

        Minima   Distorted       MD    2D    Total
Si          92        4999   13 323    37   18 451
Ge          94        3967     6435    37   10 533
SiGe       671      15 540   13 561     0   29 772
Cu          20         485   13 191     0   13 696
Au          27        1082   12 259     0   13 368

As inputs to the neural networks, we used 8 radial symmetry functions of type $G_2$ and 18 angular functions of type $G_4$ for each elemental interaction, leading to 16 radial and 54 angular functions for the binary systems. We present their definitions in eqn (17), resorting to the notation of Behler:42
$$G^i_2 = \sum_{j\neq i}^{\text{neighbours}} f_c(R_{ij})\, e^{-\eta (R_{ij}-R_s)^2}$$
$$G^i_4 = 2^{1-\zeta}\sum_{\substack{j,k\neq i\\ j\neq k}}^{\text{neighbours}} \left(1+\lambda\cos\theta_{ijk}\right)^{\zeta} e^{-\eta\left(R_{ij}^2+R_{ik}^2+R_{jk}^2\right)}\, f_c(R_{ij})\, f_c(R_{ik})\, f_c(R_{jk}), \qquad (17)$$
where $\eta$, $\lambda$, $R_s$ and $\zeta$ are parameters related to the properties of the Gaussians, and $f_c$ cuts off the interactions between atoms beyond a certain radius $R_c$. The values for these parameters can be found in ref. 54 for titanium oxide. We tried to further optimize these values using pattern search techniques, but could not achieve any improvement of the results.
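For illustration, the radial and angular functions of eqn (17) can be evaluated as follows (a naive sketch without periodic boundary conditions; the cosine cutoff f_c and all parameter values are assumptions on our part, since the paper takes the actual parameters from ref. 54):

```python
import numpy as np

def f_cut(r, r_c):
    """Cosine cutoff commonly used with Behler-Parrinello symmetry functions (assumed here)."""
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def g2(R, i, eta, r_s, r_c):
    """Radial function G2 of eqn (17) for atom i; R is an (N, 3) array of positions."""
    r_ij = np.delete(np.linalg.norm(R - R[i], axis=1), i)   # exclude j = i
    return np.sum(f_cut(r_ij, r_c) * np.exp(-eta * (r_ij - r_s) ** 2))

def g4(R, i, eta, lam, zeta, r_c):
    """Angular function G4 of eqn (17) for atom i (naive double loop over j, k)."""
    total = 0.0
    for j in range(len(R)):
        for k in range(len(R)):
            if j == i or k == i or j == k:
                continue
            rij = np.linalg.norm(R[j] - R[i])
            rik = np.linalg.norm(R[k] - R[i])
            rjk = np.linalg.norm(R[k] - R[j])
            cos_t = np.dot(R[j] - R[i], R[k] - R[i]) / (rij * rik)
            total += ((1.0 + lam * cos_t) ** zeta
                      * np.exp(-eta * (rij**2 + rik**2 + rjk**2))
                      * f_cut(rij, r_c) * f_cut(rik, r_c) * f_cut(rjk, r_c))
    return 2.0 ** (1.0 - zeta) * total

# toy usage on a random 5-atom cluster (parameter values are placeholders)
R = np.random.default_rng(0).normal(scale=1.5, size=(5, 3))
print(g2(R, 0, eta=0.5, r_s=0.0, r_c=6.0),
      g4(R, 0, eta=0.01, lam=1.0, zeta=2.0, r_c=6.0))
```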

We trained several neural networks with different architectures, i.e., different numbers of hidden layers and different numbers of nodes, using the Levenberg–Marquardt84,85 and the Broyden–Fletcher–Goldfarb–Shanno methods.86

As we mentioned before, the leaky ReLU has been gaining popularity in many machine learning applications. We found that this activation function does allow for a much faster training of the neural network force fields. However, as shown in Fig. 2, physical quantities are noticeably discontinuous when plotted as a function of the geometrical parameters. This means that the leaky ReLU can only be used in applications that require solely the energy, and not for, e.g., molecular dynamics simulations. As such, in the remainder of this article we resort to the softplus activation function, which does not exhibit this abnormal behavior.

4.3 Analysis of the errors

To investigate the predictive power of our force fields, we computed, for the structures in our test sets, the (weighted) mean absolute error and root-mean-square error for the formation energies, forces and components of the stress tensor. These values can be found in Table 2, together with the results obtained with the classical Stillinger–Weber23,87 and Tersoff potentials88,89 (for Si and Ge), and with density-functional tight binding (Slater–Koster files for Si from the parameter sets pbc74 and matsci73). For the classical force fields we used the LAMMPS code,90 while for the DFTB calculations we used DFTB+.91 Additionally, we also present the values obtained for networks trained only for energies.

It is clear from the table that the neural networks perform consistently better than the other methods, and by a large margin, in all cases studied. Particularly impressive are the very small errors obtained for the simple metals. In fact, the magnitude of the errors for these metals reveals that, if the training set is large enough, joint training is not necessary to obtain accurate results.

We should recall that neural network potentials do require a larger computational effort than traditional classical force fields, but they scale better and are substantially faster than tight-binding approaches, so these are remarkable results.

In Fig. 3 we compare the formation energies calculated with our potential for Ge to the DFT reference. A perfect fit would be given by the y = x straight line. In the inset we show the region with formation energies between 0 and 1 eV, where most of our structures lie. We can see that the neural network predicts the energy consistently over the whole range. Most structures exhibit errors smaller than 25 meV, but in some cases these errors can be as large as 100 meV. These are nevertheless the structures that contain forces of high magnitude and that therefore had a small weight during the training.

Fig. 2 Change in pressure with the volume for the cubic silicon structure. The neural networks used to make this figure were only trained for energies and forces. The inset plots the leaky ReLU and the softplus activation functions.

Table 2 Weighted mean absolute errors (M) and root mean square errors (R) for the formation energy (FE), forces and stresses calculated with different methods. All networks had 3 hidden layers with 5 neurons each

                     FE (meV per atom)   Forces (meV Å⁻¹)   Stress (meV Å⁻³)
                        M       R           M       R          M       R
Si    This work         54      65          76     119          8      13
      e-training        29      38         162     298         11      16
      S.-Weber         784    1331         925    1594        231     490
      Tersoff          194     228         362     568         29      48
      pbc              299     386         190     282         57     106
      matsci           504     577         504     754        123     172
Ge    This work         34      40          46      75          5       9
      e-training        14      22          77     155          5       8
      S.-Weber         276     388         321     558         84     127
      Tersoff          434     559         448     759         81     122
SiGe  This work         78      89          87     127          6      10
      e-training        64      84         186     279         12      18
Cu    This work          4       6          13      18          5       8
      e-training         3       4          17      26          7      10
      matsci           228     343         656     937        133     228
Au    This work          8      11          25      37          2       3
      e-training         9      11          36      55          2       3

In the top panel of Fig. 4, we compare the forces calculated with a neural network for Ge trained only for energies with the DFT reference quantities. The bottom panel shows the same comparison for our network trained for energies, forces, and stresses. The improvement of the results depicted in the lower panel is evident. Not only is the error much reduced on average, but the maximum deviation from the diagonal is also considerably smaller. Furthermore, the distribution is now symmetric, correcting the systematic error that can be seen in the upper panel. In our opinion, these results clearly demonstrate the importance of including the error in the forces (and stresses) in the cost function (Fig. 4).

In general, training neural networks for additional properties reduces the number of training structures required to achieve a desired accuracy. Nevertheless, a multi-objective optimization will only return optimized values within a Pareto curve. In other words, the training has to balance the prediction of the energy with the prediction of the other properties. As seen from Table 2 and the figures above, our networks obtain average errors for the formation energy below 50 meV per atom, forces below 100 meV Å⁻¹, and stresses below 15 meV Å⁻³. By increasing the number of layers, or the number of neurons in each layer, we did not succeed in further improving the errors. We believe, instead, that this could be achieved by using a substantially larger and more diverse training set and better input features, or by increasing the complexity of the topology of the neural network (e.g., by including convolution layers).

Are these values acceptable? In his Nobel prize lecture,92 John Pople argues that a global accuracy of 1 kcal mol⁻¹ (roughly 43 meV per atom) for energies with respect to experimental values would be appropriate (the so-called chemical accuracy). Our networks essentially reach this accuracy (with respect to the training data) and yield at the same time high-quality forces and stresses for a rather modest amount of training data.

5 Examples of applications

To exemplify the usefulness of our force fields, and to further assess their quality, we computed a few properties that are not directly related to the quantities we trained for. Specifically, we used our networks to obtain the phonon dispersions of Si and Cu, and then the melting temperatures of Cu and Au.

Fig. 3 Comparison between formation energies calculated with DFT and with our force field for Ge. The neural network used to model our potential had 3 hidden layers with 50 nodes each.

Fig. 4 Comparison between properties calculated with DFT and with our force field for Ge. The neural network used to model our potential had 3 hidden layers with 50 nodes each. The probability density function (PDF), shown with the color gradient, displays the number of structures that can be found at each point (based on a smooth kernel density estimate).


5.1 Phonons of Si and Cu

Calculations were performed with the phonopy package93 using the frozen-phonon technique. Fig. 5 and 6 depict the comparison between the phonon frequencies obtained using the PBE functional and using the neural networks for cubic silicon and copper, respectively. The upper panel of Fig. 5 is obtained with a network trained only for energies and the lower panel with one for which energies, forces, and stresses were optimized. Training the networks only for energies yields a very good speed of sound, but the accuracy of the acoustic branches deteriorates when one approaches the boundaries of the Brillouin zone. Optical phonons are also considerably overestimated. The improvement in the lower panel is remarkable, with the PBE phonons accurately reproduced in the whole Brillouin zone. In Fig. 6 we show the phonon dispersion of copper calculated with the neural networks (trained for energies, forces, and stresses), compared to the PBE reference. As expected from the excellent fit (see Table 2), the phonon frequencies of copper are almost perfectly replicated.

Clearly, for copper, the network trained only for energies already yields very good forces and consequently good phonon frequencies. This is also the case for some neural networks found in the literature, e.g., for sodium in ref. 47, for calcium fluoride in ref. 45, or for copper, gold, and palladium in ref. 94.
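The frozen-phonon idea behind these calculations can be sketched in a few lines. The version below is a generic Γ-point finite-displacement implementation of our own (calc_forces is a placeholder for any force field; no symmetry reduction, supercell handling or non-analytic corrections are included, unlike in phonopy):

```python
import numpy as np

def gamma_point_frequencies(R0, masses, calc_forces, delta=0.01):
    """
    Frozen-phonon force constants by central finite differences at the Gamma point.
    R0          : (N, 3) equilibrium positions
    masses      : (N,)   atomic masses
    calc_forces : callable R -> (N, 3) forces (placeholder for the force field)
    Returns the eigenvalues of the mass-weighted dynamical matrix (squared
    frequencies, in the units implied by the force and mass units).
    """
    R0 = np.asarray(R0, dtype=float)
    masses = np.asarray(masses, dtype=float)
    n = len(R0)
    fc = np.zeros((3 * n, 3 * n))
    for a in range(n):
        for alpha in range(3):
            Rp, Rm = R0.copy(), R0.copy()
            Rp[a, alpha] += delta
            Rm[a, alpha] -= delta
            dF = (calc_forces(Rp) - calc_forces(Rm)) / (2.0 * delta)
            fc[3 * a + alpha, :] = -dF.reshape(-1)     # Phi = -dF/du
    fc = 0.5 * (fc + fc.T)                             # symmetrize
    m = np.repeat(masses, 3)
    dyn = fc / np.sqrt(np.outer(m, m))                 # mass-weighted dynamical matrix
    return np.linalg.eigvalsh(dyn)                     # omega^2; three ~0 acoustic modes
```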

5.2 Melting of Cu and Au

We computed the melting temperatures of Cu and Au in their ground-state, face-centered cubic crystal lattice. This was done by performing molecular dynamics simulations with the NPT Berendsen thermostat and barostat95 implemented in the ASE package.96 We used a time step of 10 fs and took for the coupling constants of the thermostat and barostat the values of 500 fs and 1000 fs (for Cu) and 400 fs and 500 fs (for Au). After thermalizing the system at 100 K and 1 bar, we increased the temperature of the thermostat linearly. The melting temperature for a given heating rate was obtained from two distinct criteria: (A) from the maximum of the heat capacity, following the method of Qi et al.;97 (B) using a Lindemann-type criterion,98 where the melting temperature is obtained from the maximum of the second derivative of $\sqrt{\langle u^2\rangle}/d$ as a function of the thermostat temperature. In this formula, $\sqrt{\langle u^2\rangle}$ is the root of the mean square atomic displacement and $d$ is the inter-atomic distance. Fig. 7 displays our results for Cu.
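Both melting criteria can be evaluated directly from the heating trajectory. The sketch below (our own illustration with hypothetical input arrays, not the analysis script used in the paper) locates the peak of the heat capacity, estimated as dE/dT along the ramp, and the peak of the second derivative of √⟨u²⟩/d:

```python
import numpy as np

def melting_estimates(T, E_tot, msd, d):
    """
    T     : (n,) thermostat temperatures along the linear heating ramp
    E_tot : (n,) total energies (averaged over short windows)
    msd   : (n,) mean square atomic displacements <u^2>
    d     : nearest-neighbour distance
    Returns (T_melt from the heat-capacity peak, T_melt from the Lindemann-type criterion).
    """
    cv = np.gradient(E_tot, T)                       # criterion (A): peak of dE/dT
    t_cv = T[np.argmax(cv)]
    lindemann = np.sqrt(msd) / d                     # criterion (B): sqrt(<u^2>)/d
    t_lin = T[np.argmax(np.gradient(np.gradient(lindemann, T), T))]
    return t_cv, t_lin
```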

As expected, there is good agreement between the two criteria, and the melting temperature converges as the heating rate decreases. Furthermore, the melting temperature seems to be rather well converged with respect to the size of the supercell. We fitted a straight line to the points pertaining to heating rates below 0.2 K per step in order to predict the melting temperature for an infinitesimal heating rate. This leads to a value of around 1510 K, to be compared to an experimental melting temperature of 1358 K.99
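The extrapolation to an infinitesimal heating rate is an ordinary linear fit; for example (the numbers below are invented placeholders, not the data of Fig. 7):

```python
import numpy as np

# heating rates (K per MD step) and the melting temperatures estimated for each run
rate  = np.array([0.05, 0.10, 0.15, 0.20])        # placeholder values
tmelt = np.array([1540., 1565., 1600., 1620.])    # placeholder values

b, a = np.polyfit(rate, tmelt, 1)                 # T_melt(rate) = a + b * rate
print("extrapolated melting temperature:", a, "K")
```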

We could not find DFT simulations of the melting of Cu in the literature, certainly due to the overwhelming computational effort required. However, there are a few works using embedded-atom method (EAM) potentials fitted to DFT energies, with subsequent corrections using DFT quantities. In this way, Vočadlo et al.100 predicted a value of 1176 ± 100 K (with the PW91 functional101,102) and Zhu et al.103 obtained 1251 ± 15 K (with the PBE functional). Again using EAM potentials, Wang et al.104 calculated a melting temperature of 1780 K using the heat-capacity method for a supercell of 500 atoms and a heating rate of 4 K ps⁻¹. They also computed the melting temperature of copper clusters of different sizes. Extrapolation to a cluster of infinite size then led to a value of 1360 K.

Fig. 5 Phonon dispersion of cubic silicon calculated with the PBE functional and with a force field only trained for energies (top panel) and another where we performed a joint training of energies, forces, and stresses (bottom panel).

Fig. 6 Phonon dispersion of cubic copper calculated with the PBE functional and with a force field trained for energies, forces, and stresses.

We also investigated the melting temperature of the face-centered cubic structure of gold for a supercell containing 500 atoms. We show our results in Fig. 8. Following the same methodology as for copper, we arrived at a melting temperature of 1048 ± 12 K, a value in good agreement with previous results obtained with EAM potentials (1090 K),105 but quite different from the value of (2125 ± 25) K found using the ReaxFF force field.106 We recall that the experimental melting temperature of Au is 1338 K.

We note that our simulations are numerically quite efficient, and most of our runs were performed on standard desktop computers. Furthermore, it is relatively simple to study even larger supercells due to the very favorable order-N scaling of the neural networks with the size of the system. Finally, the melting temperatures obtained with our neural networks are in quite good agreement with experimental values. In view of the excellent fit of the DFT energies and forces that we obtained for Cu and Au, we believe that the remaining error is mainly due to the shortcomings of the PBE approximation used to produce the training sets, and not to the neural network.

We do not expect neural network force fields to replace DFT or other electronic structure methods. However, they can be extremely useful for sampling regions of the potential energy surface that are not accessible with current electronic structure methods, and they are far more accurate than simpler fitting methods, such as classical force fields. Moreover, they (and also other machine learning methods) can be used to speed up searches, like global structure optimization, provided that their validity is tested.107,108

6 Summary and conclusions

We developed a methodology to generate neural network force fields for solids with relatively small training sets. This is achieved by (i) using well-balanced and unbiased training and test sets that sample all possible bonding configurations and bond lengths, and (ii) training the network to reproduce not only the energy but also the forces on the atoms and the stresses on the lattice. This joint training of the three objectives required an extension of the back propagation algorithm, which was implemented in the open-source ÆNET package.54

We then constructed interatomic neural network potentials for Si, Ge, SiGe, Cu, and Au. The average errors in the formation energies for these networks are usually below 1 kcal mol⁻¹, and the forces and stresses are of high quality. This quality is also reflected in other quantities. For example, phonon band structures are remarkably close to the density-functional theory reference, and the melting temperatures of simple metals come out in good agreement with experiment.

Finally, our methodology opens the way to improving the quality of the networks while keeping the training sets small. This could be done, specifically, by tweaking the architecture of the neural networks through, e.g., the introduction of convolution layers.

Conflicts of interest

There are no conflicts to declare.

Fig. 7 Variation of the melting temperature of Cu with the heating rate for a system with 500 atoms (top panel) and 5234 atoms (bottom panel). The straight lines are linear fits (a + bx) to the points with heating rates below 0.4 K per step. The coefficients are a = 1511.1 ± 5.14 and b = 481.0 ± 34.8 (500 atoms) and a = 1508.7 ± 7.2 and b = 505.3 ± 48.4 (5324 atoms). The errors only concern the fit.

Fig. 8 Variation of the melting temperature of Au with the heating rate for a system with 500 atoms. The straight line (a + bx with a = 1047.8 ± 12.3 and b = 557.1 ± 89.4) was fitted for heating rates below 0.4 K per step. The errors only concern the fit.


Acknowledgements

We thank Carlos Benavides and Dominik Schulze for helpful discussions. We acknowledge partial support from the DFG collaborative research center SFB TRR 227 ‘‘Ultrafast spin dynamics’’.

References

1 R. M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, 2008.
2 W. Kohn, Rev. Mod. Phys., 1999, 71, 1253–1266.
3 P. Hohenberg and W. Kohn, Phys. Rev. B: Condens. Matter Mater. Phys., 1964, 136, B864–B871.
4 W. Kohn and L. J. Sham, Phys. Rev. B: Condens. Matter Mater. Phys., 1965, 140, A1133–A1138.
5 S. Lehtola, C. Steigemann, M. J. Oliveira and M. A. Marques, SoftwareX, 2018, 7, 1–5.
6 S. Curtarolo, W. Setyawan, G. L. Hart, M. Jahnatek, R. V. Chepulskii, R. H. Taylor, S. Wang, J. Xue, K. Yang, O. Levy, M. J. Mehl, H. T. Stokes, D. O. Demchenko and D. Morgan, Comput. Mater. Sci., 2012, 58, 218–226.
7 A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder and K. A. Persson, APL Mater., 2013, 1, 011002.
8 J. E. Saal, S. Kirklin, M. Aykol, B. Meredig and C. Wolverton, JOM, 2013, 65, 1501–1509.
9 G. Ceder, MRS Bull., 2010, 35, 693–701.
10 T. J. Gonçalves, U. Arnold, P. N. Plessow and F. Studt, ACS Catal., 2017, 7, 3615–3621.
11 R. Sarmiento-Pérez, T. F. T. Cerqueira, S. Körbel, S. Botti and M. A. L. Marques, Chem. Mater., 2015, 27, 5957–5963.
12 S. Körbel, M. A. L. Marques and S. Botti, J. Mater. Chem. A, 2018, 6, 6463–6475.
13 N. Drebov, A. Martinez-Limia, L. Kunz, A. Gola, T. Shigematsu, T. Eckl, P. Gumbsch and C. Elsässer, New J. Phys., 2013, 15, 125023.
14 Modern Methods of Crystal Structure Prediction, ed. A. R. Oganov, Wiley-VCH Verlag GmbH & Co. KGaA, 2010.
15 T. D. Huan, M. Amsler, M. A. L. Marques, S. Botti, A. Willand and S. Goedecker, Phys. Rev. Lett., 2013, 110, 135502.
16 H. D. Tran, M. Amsler, S. Botti, M. A. L. Marques and S. Goedecker, J. Chem. Phys., 2014, 140, 124708.
17 D. Porezag, T. Frauenheim, T. Köhler, G. Seifert and R. Kaschner, Phys. Rev. B: Condens. Matter Mater. Phys., 1995, 51, 12947–12957.
18 G. Seifert and J.-O. Joswig, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2012, 2, 456–465.
19 P. Koskinen and V. Mäkinen, Comput. Mater. Sci., 2009, 47, 237–253.
20 A. C. T. van Duin, S. Dasgupta, F. Lorant and W. A. Goddard, J. Phys. Chem. A, 2001, 105, 9396–9409.
21 J. Yu, S. B. Sinnott and S. R. Phillpot, Phys. Rev. B: Condens. Matter Mater. Phys., 2007, 75, 085311.
22 J. Tersoff, Phys. Rev. Lett., 1986, 56, 632–635.
23 F. H. Stillinger and T. A. Weber, Phys. Rev. B: Condens. Matter Mater. Phys., 1985, 31, 5262–5271.
24 S. Marsland, Machine Learning, CRC Press, Taylor & Francis Inc., 2014.
25 I. H. Witten, E. Frank and M. A. Hall, Data Mining: Practical Machine Learning Tools and Techniques (The Morgan Kaufmann Series in Data Management Systems), Morgan Kaufmann, 2011.
26 Y. Sun, X. Wang and X. Tang, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
27 F. Schroff, D. Kalenichenko and J. Philbin, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
28 A. L. Maas, A. Y. Hannun and A. Y. Ng, Proceedings of the 30th International Conference on Machine Learning (ICML), Deep Learning for Audio, Speech and Language Processing, 2013.
29 K. He, X. Zhang, S. Ren and J. Sun, 2015 IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1026–1034.
30 A. Krizhevsky, I. Sutskever and G. E. Hinton, Commun. ACM, 2017, 60, 84–90.
31 M. Bojarski, D. D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao and K. Zieba, arXiv preprint, arXiv:1604.07316, 2016.
32 V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg and D. Hassabis, Nature, 2015, 518, 529–533.
33 D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel and D. Hassabis, Nature, 2016, 529, 484–489.
34 M. Rupp, A. Tkatchenko, K.-R. Müller and O. A. von Lilienfeld, Phys. Rev. Lett., 2012, 108, 058301.
35 T. D. Huan, A. Mannodi-Kanakkithodi and R. Ramprasad, Phys. Rev. B: Condens. Matter Mater. Phys., 2015, 92, 014106.
36 Z. D. Pozun, K. Hansen, D. Sheppard, M. Rupp, K.-R. Müller and G. Henkelman, J. Chem. Phys., 2012, 136, 174101.
37 P. Dey, J. Bible, S. Datta, S. Broderick, J. Jasinski, M. Sunkara, M. Menon and K. Rajan, Comput. Mater. Sci., 2014, 83, 185–195.
38 J. Schmidt, J. Shi, P. Borlido, L. Chen, S. Botti and M. A. L. Marques, Chem. Mater., 2017, 29, 5090–5103.
39 F. A. Faber, A. Lindmaa, O. A. von Lilienfeld and R. Armiento, Phys. Rev. Lett., 2016, 117, 135502.
40 T. B. Blank, S. D. Brown, A. W. Calhoun and D. J. Doren, J. Chem. Phys., 1995, 103, 4129–4137.
41 J. Behler, J. Chem. Phys., 2016, 145, 170901.
42 J. Behler, J. Chem. Phys., 2011, 134, 074106.
43 C. M. Handley and P. L. A. Popelier, J. Phys. Chem. A, 2010, 114, 3371–3383.
44 G. C. Sosso, G. Miceli, S. Caravati, J. Behler and M. Bernasconi, Phys. Rev. B: Condens. Matter Mater. Phys., 2012, 85, 174103.
45 S. Faraji, S. A. Ghasemi, S. Rostami, R. Rasoulkhani, B. Schaefer, S. Goedecker and M. Amsler, Phys. Rev. B, 2017, 95, 104105.


46 N. Artrith and A. M. Kolpak, Nano Lett., 2014, 14, 2670–2676.
47 H. Eshet, R. Z. Khaliullin, T. D. Kühne, J. Behler and M. Parrinello, Phys. Rev. B: Condens. Matter Mater. Phys., 2010, 81, 184107.
48 J. S. Elias, N. Artrith, M. Bugnet, L. Giordano, G. A. Botton, A. M. Kolpak and Y. Shao-Horn, ACS Catal., 2016, 6, 1675–1679.
49 A. Thompson, L. Swiler, C. Trott, S. Foiles and G. Tucker, J. Comput. Phys., 2015, 285, 316–330.
50 A. P. Bartók, M. C. Payne, R. Kondor and G. Csányi, Phys. Rev. Lett., 2010, 104, 136403.
51 J. P. M. de Sá, Pattern Recognition: Concepts, Methods and Applications, Springer Berlin Heidelberg, 2001.
52 J. Behler and M. Parrinello, Phys. Rev. Lett., 2007, 98, 146401.
53 J. Behler, Angew. Chem., Int. Ed., 2017, 56, 12828–12840.
54 N. Artrith and A. Urban, Comput. Mater. Sci., 2016, 114, 135–150.
55 A. Khorshidi and A. A. Peterson, Comput. Phys. Commun., 2016, 207, 310–324.
56 K. Yao, J. E. Herr, D. W. Toth, R. Mckintyre and J. Parkhill, Chem. Sci., 2018, 9, 2261–2269.
57 J. B. Witkoskie and D. J. Doren, J. Chem. Theory Comput., 2005, 1, 14–23.
58 A. Pukrittayakamee, M. Hagan, L. Raff, S. Bukkapatnam and R. Komanduri, Intelligent Engineering Systems Through Artificial Neural Networks: Smart Systems Engineering Computational Intelligence in Architecting Complex Engineering Systems, ASME Press, 2007, vol. 17, pp. 469–474.
59 A. Pukrittayakamee, M. Malshe, M. Hagan, L. M. Raff, R. Narulkar, S. Bukkapatnum and R. Komanduri, J. Chem. Phys., 2009, 130, 134101.
60 N. Artrith, T. Morawietz and J. Behler, Phys. Rev. B: Condens. Matter Mater. Phys., 2011, 83, 153101.
61 M. Rupp, A. Tkatchenko, K.-R. Müller and O. A. von Lilienfeld, Phys. Rev. Lett., 2012, 108, 058301.
62 F. Faber, A. Lindmaa, O. A. von Lilienfeld and R. Armiento, Int. J. Quantum Chem., 2015, 115, 1094–1101.
63 K. T. Schütt, H. Glawe, F. Brockherde, A. Sanna, K. R. Müller and E. K. U. Gross, Phys. Rev. B: Condens. Matter Mater. Phys., 2014, 89, 205118.
64 A. Sadeghi, S. A. Ghasemi, B. Schaefer, S. Mohr, M. A. Lill and S. Goedecker, J. Chem. Phys., 2013, 139, 184118.
65 A. P. Bartók, R. Kondor and G. Csányi, Phys. Rev. B: Condens. Matter Mater. Phys., 2013, 87, 184115.
66 J. Behler, R. Martoňák, D. Donadio and M. Parrinello, Phys. Status Solidi B, 2008, 245, 2618–2629.
67 N. Artrith and A. M. Kolpak, Comput. Mater. Sci., 2015, 110, 20–28.
68 R. Rojas, Neural Networks, Springer Berlin Heidelberg, 1996.
69 X. Glorot, A. Bordes and Y. Bengio, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, pp. 315–323.
70 K. T. Schütt, H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko and K.-R. Müller, J. Chem. Phys., 2018, 148, 241722.
71 P. Rowe, G. Csányi, D. Alfè and A. Michaelides, Phys. Rev. B, 2018, 97, 054303.
72 V. L. Deringer, N. Bernstein, A. P. Bartók, M. J. Cliffe, R. N. Kerber, L. E. Marbella, C. P. Grey, S. R. Elliott and G. Csányi, J. Phys. Chem. Lett., 2018, 9, 2879–2885.
73 L. Guimarães, A. N. Enyashin, J. Frenzel, T. Heine, H. A. Duarte and G. Seifert, ACS Nano, 2007, 1, 362–368.
74 A. Sieck, T. Frauenheim and K. A. Jackson, Phys. Status Solidi B, 2003, 240, 537–548.
75 N. Artrith and J. Behler, Phys. Rev. B: Condens. Matter Mater. Phys., 2012, 85, 045439.
76 G. Kresse and J. Furthmüller, Comput. Mater. Sci., 1996, 6, 15–50.
77 G. Kresse and J. Furthmüller, Phys. Rev. B: Condens. Matter Mater. Phys., 1996, 54, 11169–11186.
78 J. P. Perdew, K. Burke and M. Ernzerhof, Phys. Rev. Lett., 1996, 77, 3865–3868.
79 A. W. Huran, C. Steigemann, T. Frauenheim, B. Aradi and M. A. L. Marques, J. Chem. Theory Comput., 2018, 14, 2947–2954.
80 S. Goedecker, J. Chem. Phys., 2004, 120, 9911–9917.
81 M. Amsler and S. Goedecker, J. Chem. Phys., 2010, 133, 224104.
82 P. Borlido, C. Steigemann, N. N. Lathiotakis, M. A. L. Marques and S. Botti, 2D Mater., 2017, 4, 045009.
83 P. Borlido, C. Rödl, M. A. L. Marques and S. Botti, 2D Mater., 2018, 5, 035010.
84 K. Levenberg, Q. Appl. Math., 1944, 2, 164–168.
85 D. W. Marquardt, SIAM J. Appl. Math., 1963, 11, 431–441.
86 R. H. Byrd et al., SIAM J. Sci. Stat. Comput., 1995, 16, 1190–1208.
87 Z. Jian, Z. Kaiming and X. Xide, Phys. Rev. B: Condens. Matter Mater. Phys., 1990, 41, 12915–12918.
88 J. Tersoff, Phys. Rev. B: Condens. Matter Mater. Phys., 1988, 37, 6991–7000.
89 J. Tersoff, Phys. Rev. B: Condens. Matter Mater. Phys., 1989, 39, 5566–5568.
90 S. Plimpton, J. Comput. Phys., 1995, 117, 1–19.
91 B. Aradi, B. Hourahine and T. Frauenheim, J. Phys. Chem. A, 2007, 111, 5678–5684.
92 J. A. Pople, Rev. Mod. Phys., 1999, 71, 1267–1274.
93 A. Togo and I. Tanaka, Scr. Mater., 2015, 108, 1–5.
94 S. Hajinazar, J. Shao and A. N. Kolmogorov, Phys. Rev. B, 2017, 95, 014114.
95 H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola and J. R. Haak, J. Chem. Phys., 1984, 81, 3684–3690.
96 A. H. Larsen, J. J. Mortensen, J. Blomqvist, I. E. Castelli, R. Christensen, M. Dułak, J. Friis, M. N. Groves, B. Hammer, C. Hargus, E. D. Hermes, P. C. Jennings, P. B. Jensen, J. Kermode, J. R. Kitchin, E. L. Kolsbjerg, J. Kubal, K. Kaasbjerg, S. Lysgaard, J. B. Maronsson, T. Maxson, T. Olsen, L. Pastewka, A. Peterson, C. Rostgaard, J. Schiøtz, O. Schütt, M. Strange, K. S. Thygesen, T. Vegge, L. Vilhelmsen, M. Walter, Z. Zeng and K. W. Jacobsen, J. Phys.: Condens. Matter, 2017, 29, 273002.


97 Y. Qi, T. Çağın, W. L. Johnson and W. A. Goddard, J. Chem. Phys., 2001, 115, 385–394.
98 J. M. Ziman, Principles of the Theory of Solids, Cambridge University Press, 2nd edn, 1972.
99 V. Y. Chekhovskoi, V. D. Tarasov and Y. V. Gusev, High Temp., 2000, 38, 394–399.
100 L. Vočadlo, D. Alfè, G. D. Price and M. J. Gillan, J. Chem. Phys., 2004, 120, 2872–2878.
101 Y. Wang and J. P. Perdew, Phys. Rev. B: Condens. Matter Mater. Phys., 1991, 44, 13298–13307.
102 J. P. Perdew, J. A. Chevary, S. H. Vosko, K. A. Jackson, M. R. Pederson, D. J. Singh and C. Fiolhais, Phys. Rev. B: Condens. Matter Mater. Phys., 1992, 46, 6671–6687.
103 L.-F. Zhu, B. Grabowski and J. Neugebauer, Phys. Rev. B, 2017, 96, 224202.
104 L. Wang, Y. Zhang, X. Bian and Y. Chen, Phys. Lett. A, 2003, 310, 197–202.
105 L. J. Lewis, P. Jensen and J.-L. Barrat, Phys. Rev. B: Condens. Matter Mater. Phys., 1997, 56, 2248–2257.
106 T. T. Järvi, A. Kuronen, M. Hakala, K. Nordlund, A. C. van Duin, W. A. Goddard and T. Jacob, Eur. Phys. J. B, 2008, 66, 75–79.
107 N. Artrith, A. Urban and G. Ceder, J. Chem. Phys., 2018, 148, 241711.
108 T. Jacobsen, M. Jørgensen and B. Hammer, Phys. Rev. Lett., 2018, 120, 026102.
