
Rapid Algorithm for Computing the Electron Repulsion Integral over Higher Order Gaussian-Type Orbitals: Accompanying Coordinate Expansion Method

KAZUHIRO ISHIDA
Department of Chemistry, Faculty of Science, Science University of Tokyo, Shinjuku-ku, Tokyo 162, Japan

Received 3 July 1997; accepted 16 January 1998

ABSTRACT: A general algorithm for rapidly computing the electron repulsion integral (ERI) is derived for the ACE-b3k3 formula, which has been derived previously [K. Ishida, Int. J. Quantum Chem., 59, 209 (1996)]. A computer program code that is universal for all types of Gaussian-type orbitals (GTOs) up to h-type can be constructed by the use of this general algorithm. It is confirmed that the ACE-b3k3 algorithm is numerically very stable even for higher order GTOs. It is found that, in a floating-point-operation (FLOP) count assessment, the ACE-b3k3 algorithm is the fastest among all methods available in the literature for (dd|dd), (ff|ff), (gg|gg), and (hh|hh) ERIs when the degree of contraction of the GTO is high. © 1998 John Wiley & Sons, Inc. J Comput Chem 19: 923-934, 1998

Keywords: molecular integral; electron repulsion integral; rapid algorithm; accompanying coordinate expansion method

Introduction

The rapid and rigorous calculation of the electron repulsion integral (ERI) is desired in both SCF and post-SCF calculations. Recently, several investigators [1] have tried to evaluate ERI approximately for its rapid computation. However, these approximations are suitable only for crude calculations, because the precision of the approximation is not sufficient [1]. For example, in the resolution of the identity (RI) approximation [1], the product of Gaussian-type orbitals (GTOs) is expanded in terms of basis set. Thus, the error of the RI approximation is similar to the basis-set truncation error. As a result, a high-quality basis set, like the near-Hartree-Fock method, is necessary to change the RI approximation into a reliable one.

Correspondence to: K. Ishida

Journal of Computational Chemistry, Vol. 19, No. 8, 923-934 (1998). © 1998 John Wiley & Sons, Inc. CCC 0192-8651/98/080923-12

In a previous study [2], we derived the accompanying coordinate expansion (ACE) formula, which is a series of general ERI formulas. These ACE formulas can calculate ERI to any desired precision, because they are rigorous general formulas. The ACE algorithm obtained from ACE formulas is the fastest for computing ERI of s- and p-type GTOs in the FLOP count assessment, as shown in a previous study [2]. Among ACE formulas, the ACE-b3k3 formula is used for the case when the degree of contraction of GTOs, say K, is larger. For the case when K is smaller, it is proper to use the other ACE formulas; for example, b2k3, b1k1, and so on [2]. A general ACE-b3k3 algorithm was derived by expanding the ACE-b3k3 general formula for the individual cases of the (pp|pp) and (dd|dd) classes of ERIs. However, this expansion needs a large amount of tedious manipulation; therefore, it was not done for GTOs of f-type or higher order.

In this article, the extension is carried out for (ff|ff), (gg|gg), and (hh|hh) ERIs, with the result that the previous general algorithm must be revised for extension to GTOs of up to h-type. The old algorithm is still valid for GTOs of up to d-type. The new algorithm is then shown. A computer program code, which is universal for all types of GTOs up to h-type, is made by the use of this new algorithm. Numerical examples for the use of this program code are shown for (LL|LL) ERIs (L = 2-5) in the final section.

ACE-b3k3 General Algorithm

When the Cartesian GTOs in an ERI are centered at A, B, C, and D and have the exponents $\alpha_A$, $\alpha_B$, $\alpha_C$, and $\alpha_D$ and the so-called quantum numbers $(l_A m_A n_A)$ and so on, the ACE-b3k3 general formula of the ERI can be written as [2]:

$$\mathrm{ERI} = A_0 \sum_{N_3} C_4\{N_3\}\, H_4\{N_3\} \tag{1}$$

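where:

$$A_0 = \frac{2\pi^{5/2}}{\gamma_1 \gamma_2 \sqrt{\gamma_1 + \gamma_2}} \exp\left(-\frac{\alpha_A \alpha_B\, AB^2}{\gamma_1} - \frac{\alpha_C \alpha_D\, CD^2}{\gamma_2}\right) \tag{2}$$

in which $\gamma_1 = \alpha_A + \alpha_B$ and $\gamma_2 = \alpha_C + \alpha_D$. The accompanying coordinate part is given by:

$$C_4\{N_3\} = \sum_{\{M\}} D^{A3}_{i'i''}\, D^{B3}_{j'j''}\, D^{C3}_{k'k''}\, D^{D3}_{h'h''} \tag{3}$$

where $\{M\} = \{M_A M_B M_C M_D\}$, $M_A = (M_{Ax} M_{Ay} M_{Az})$, $0 \le M_{Ax} \le l_A$, $0 \le M_{Ay} \le m_A$, and $0 \le M_{Az} \le n_A$: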

$$D^{A3}_{i'i''} = \sum_{\{i'\}} \sum_{\{i''\}} D^{A3x}_{i'_x i''_x}\, D^{A3y}_{i'_y i''_y}\, D^{A3z}_{i'_z i''_z} \tag{4}$$

and:

$$D^{A3x}_{i'_x i''_x} = \binom{l_A}{M_{Ax}} \binom{l_A - M_{Ax}}{i'_x} \binom{i'_x}{i''_x}\, AB_x^{\,l_A - M_{Ax} - i'_x}\, CD_x^{\,i'_x - i''_x}\, AC_x^{\,i''_x} \tag{5}$$

in which $\{i''\} = \{i''_x i''_y i''_z\}$, $\{i'\} = \{i'_x i'_y i'_z\}$, $0 \le i'_x \le l_A - M_{Ax}$, and $0 \le i''_x \le i'_x$. The assembly of indices $\{N_3\}$ is $\{M_A M_B M_C M_D\, i'j'k'h'\, i''j''k''h''\}$, where $0 \le M_A \le L_A$, $0 \le M_B \le L_B$, $0 \le M_C \le L_C$, $0 \le M_D \le L_D$, $0 \le i' \le L_A - M_A$, $0 \le j' \le L_B - M_B$, $0 \le k' \le L_C - M_C$, $0 \le h' \le L_D - M_D$, $0 \le i'' \le i'$, $0 \le j'' \le j'$, $0 \le k'' \le k'$, $0 \le h'' \le h'$, $M_A = M_{Ax} + M_{Ay} + M_{Az}$, $i' = i'_x + i'_y + i'_z$, $i'' = i''_x + i''_y + i''_z$, and so on. The $(L_A, L_B, L_C, L_D)$ is the angular momentum quartet of the ERI requested.

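The core part is given by:

$$
\begin{aligned}
H_4\{N_3\} ={}& \sum_{i_1} \sum_{i_2} G_{i_1 i_2}\, \sigma_B^{\,a_B + c_B}\, \sigma_A^{\,b_A + d_A}\, \sigma_D^{\,c_D + a_D}\, \sigma_C^{\,d_C + b_C} \\
&\times \frac{s_1^{\,M_A + M_B - i_1}\, s_2^{\,M_C + M_D - i_2}}{(s_1 + s_2)^{M - i_1 - i_2}} \\
&\times \sum_{\sigma_1} \sum_{\sigma_2} \binom{a_B + b_A}{\sigma_1} \binom{c_D + d_C}{\sigma_2} (-1)^{\sigma_1 + \sigma_2} \\
&\times \left(\frac{s_1}{s_1 + s_2}\right)^{i' + j'} \left(\frac{s_2}{s_1 + s_2}\right)^{k' + h'} F_{\nu + \sigma_1 + \sigma_2}(z)
\end{aligned} \tag{6}
$$

where $\sigma_B = \alpha_B/\gamma_1$, $\sigma_A = \alpha_A/\gamma_1$, $\sigma_D = \alpha_D/\gamma_2$, $\sigma_C = \alpha_C/\gamma_2$, $a_B = L_A - M_A - i'$, $b_A = L_B - M_B - j'$, $c_D = L_C - M_C - k'$, $d_C = L_D - M_D - h'$, $a_D = i' - i''$, $b_C = j' - j''$, $c_B = k' - k''$, $d_A = h' - h''$, $s_1 = 1/(2\gamma_1)$, $s_2 = 1/(2\gamma_2)$, $0 \le i_1 \le \mathrm{int}[(M_A + M_B)/2]$, $0 \le i_2 \le \mathrm{int}[(M_C + M_D)/2]$, $M = (M_A + M_B + M_C + M_D)/2$, $0 \le \sigma_1 \le a_B + b_A$, $0 \le \sigma_2 \le c_D + d_C$, $\nu = i' + j' + k' + h' + M - i_1 - i_2$, and: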

$$G_{i_1 i_2} = \sum_{\{I\}} (-1)^{M_A + M_B + M - (i_1 + i_2)}\, g_x\, g_y\, g_z \tag{7}$$

in which $\{I\} = \{i_{1x} i_{1y} i_{1z} i_{2x} i_{2y} i_{2z}\}$, $0 \le i_{1x} \le \mathrm{int}[(M_{Ax} + M_{Bx})/2]$, $0 \le i_{2x} \le \mathrm{int}[(M_{Cx} + M_{Dx})/2]$, $i_1 = i_{1x} + i_{1y} + i_{1z}$, $i_2 = i_{2x} + i_{2y} + i_{2z}$, and:

$$g_x = \binom{M_{Ax} + M_{Bx}}{2 i_{1x}} \binom{M_{Cx} + M_{Dx}}{2 i_{2x}} (2 i_{1x} - 1)!!\, (2 i_{2x} - 1)!!\, (2 M_x - 2 i_{1x} - 2 i_{2x} - 1)!! \tag{8}$$
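The combinatorial factor of eq. (8) is simple enough to evaluate directly. The following is a minimal sketch, not the author's program code. It assumes that $M_x = (M_{Ax} + M_{Bx} + M_{Cx} + M_{Dx})/2$, by analogy with the definition of $M$ in the text, and that $(-1)!! = 1$; the function names `double_factorial` and `g_component` are hypothetical.

```python
from math import comb


def double_factorial(n: int) -> int:
    """Return n!! for n >= -1, with the usual convention (-1)!! = 0!! = 1."""
    if n < -1:
        raise ValueError("double factorial is only needed for n >= -1 here")
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def g_component(m_ab: int, m_cd: int, i1: int, i2: int) -> int:
    """One Cartesian factor of eq. (8).

    m_ab stands for M_Ax + M_Bx and m_cd for M_Cx + M_Dx.
    ASSUMPTION: M_x = (m_ab + m_cd) / 2, which must be an integer.
    """
    if (m_ab + m_cd) % 2:
        raise ValueError("m_ab + m_cd must be even so that M_x is an integer")
    m_x = (m_ab + m_cd) // 2
    return (
        comb(m_ab, 2 * i1)
        * comb(m_cd, 2 * i2)
        * double_factorial(2 * i1 - 1)
        * double_factorial(2 * i2 - 1)
        * double_factorial(2 * m_x - 2 * i1 - 2 * i2 - 1)
    )
```

Because every factor is an integer, such values could be tabulated once outside all contraction loops.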

The molecular incomplete gamma function can be defined as:

$$F_m(z) = \int_0^1 t^{2m} \exp(-z t^2)\, dt \tag{9}$$

in which $z = PQ^2/(4\delta)$, $\delta = 1/(4\gamma_1) + 1/(4\gamma_2)$, $PQ = Q - P$, $P = (\alpha_A A + \alpha_B B)/\gamma_1$, and $Q = (\alpha_C C + \alpha_D D)/\gamma_2$.
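As an illustration of eqs. (2) and (9), the sketch below evaluates $F_m(z)$ with the standard all-positive series $F_m(z) = e^{-z} \sum_{i \ge 0} (2z)^i / [(2m+1)(2m+3)\cdots(2m+2i+1)]$ and then forms the simplest case, an (ss|ss) integral over unnormalized primitives, as $A_0 F_0(z)$. That closed form is the textbook result and is assumed here to be what eq. (1) reduces to when all quantum numbers are zero. The names `boys` and `ssss_eri` are hypothetical, and the series is chosen for clarity, not for the speed a production code would need.

```python
import math


def boys(m: int, z: float, tol: float = 1e-15, max_terms: int = 2000) -> float:
    """F_m(z) of eq. (9) from a series with positive terms only (no cancellation)."""
    term = 1.0 / (2 * m + 1)
    total = term
    for i in range(1, max_terms):
        term *= 2.0 * z / (2 * m + 2 * i + 1)
        total += term
        if term < tol * total:
            break
    return math.exp(-z) * total


def ssss_eri(aA, A, aB, B, aC, C, aD, D) -> float:
    """(ss|ss) over unnormalized primitive Gaussians, taken as A_0 * F_0(z)."""
    g1, g2 = aA + aB, aC + aD
    P = [(aA * a + aB * b) / g1 for a, b in zip(A, B)]
    Q = [(aC * c + aD * d) / g2 for c, d in zip(C, D)]
    AB2 = sum((b - a) ** 2 for a, b in zip(A, B))
    CD2 = sum((d - c) ** 2 for c, d in zip(C, D))
    PQ2 = sum((q - p) ** 2 for p, q in zip(P, Q))
    delta = 1.0 / (4.0 * g1) + 1.0 / (4.0 * g2)
    z = PQ2 / (4.0 * delta)
    A0 = (2.0 * math.pi ** 2.5 / (g1 * g2 * math.sqrt(g1 + g2))
          * math.exp(-aA * aB * AB2 / g1 - aC * aD * CD2 / g2))
    return A0 * boys(0, z)
```

For $m = 0$ the series can be checked against the closed form $F_0(z) = \tfrac{1}{2}\sqrt{\pi/z}\,\mathrm{erf}(\sqrt{z})$.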

In eq. (3), the accompanying coordinate part, $C_4\{N_3\}$, can be rewritten as:

$$C_4\{N_3\} = \sum_{ijkl} D^{A3}_i\, D^{B3}_j\, D^{C3}_k\, D^{D3}_l \tag{10}$$

where $i = i'(i' + 1)/2 + i''$ and so on. The corresponding core part $H_4\{N_3\}$ can be rewritten as $H_{ijkl}(i_A, i_B, i_C, i_D)$, where $i_A = L_A - M_A$, and so on. Then, the ACE-b3k3 general formula, eq. (1), can be rewritten as:

$$\mathrm{ERI} = \sum_{ijkl} D^{A3}_i\, D^{B3}_j\, D^{C3}_k\, D^{D3}_l\, H_{ijkl}(i_A, i_B, i_C, i_D) \tag{11}$$

where the factor $A_0$ is included in the core part $H_{ijkl}$.
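To make the accompanying coordinate factors concrete, the sketch below evaluates one Cartesian factor in the form read from eq. (5), $\binom{l_A}{M_{Ax}}\binom{l_A - M_{Ax}}{i'_x}\binom{i'_x}{i''_x} AB_x^{\,l_A - M_{Ax} - i'_x} CD_x^{\,i'_x - i''_x} AC_x^{\,i''_x}$, together with the compound index $i = i'(i'+1)/2 + i''$. The function names are hypothetical and the numerical values of the displacement components are arbitrary test inputs. For a p-type function ($l_A = 1$, $M_{Ax} = 0$) the three factors come out as $AB_x$, $CD_x$, and $AC_x$, which agrees with the correspondence between $(AB_i, CD_i, AC_i)$ and $D^{A3}_i(1)$ stated later in the text.

```python
from math import comb


def d_a3x(l: int, m: int, i1: int, i2: int, ab: float, cd: float, ac: float) -> float:
    """One Cartesian factor as read from eq. (5).

    l  : quantum number l_A         m  : M_Ax, 0 <= m <= l
    i1 : i'_x, 0 <= i1 <= l - m     i2 : i''_x, 0 <= i2 <= i1
    ab, cd, ac : x components of AB, CD, and AC (arbitrary inputs here).
    """
    return (comb(l, m) * comb(l - m, i1) * comb(i1, i2)
            * ab ** (l - m - i1) * cd ** (i1 - i2) * ac ** i2)


def compound_index(i1: int, i2: int) -> int:
    """Compound index i = i'(i' + 1)/2 + i'' used in eqs. (10) and (11)."""
    return i1 * (i1 + 1) // 2 + i2


# p-type example: enumerate (i', i'') with 0 <= i'' <= i' <= 1.
p_terms = {
    compound_index(i1, i2): d_a3x(1, 0, i1, i2, ab=2.0, cd=3.0, ac=5.0)
    for i1 in range(2)
    for i2 in range(i1 + 1)
}
```

With these inputs `p_terms` maps the compound indices 0, 1, 2 to the values standing in for $AB_x$, $CD_x$, and $AC_x$.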

In a previous study [2], we obtained a general b3k3 algorithm inductively by expanding eq. (11) for individual cases as (pp|pp) and (dd|dd). However, by expanding eq. (11) for more general cases such as (ff|ff), (gg|gg), and (hh|hh), it is found that we must revise the previous algorithm. Hereafter, we use the phrase "it is found that ..." for the findings obtained inductively from the individual expansion. This manipulation is so large that it cannot be shown here. The new ACE-b3k3 general algorithm can be described as follows:

$$\mathrm{ERI} = H_{ABCD}(L_A, L_B, L_C, L_D) \tag{12}$$

The 0th order $H_{ABCD}$ can be defined as:

$$
\begin{aligned}
H_{ABCD} ={}& \sum_i D^{A3}_i(L_A)\, H_{iBCD}(L_A, L_B, L_C, L_D) \\
&+ \delta_{AA}\, H^{AA}_{ABCD}(L_A - 2, L_B, L_C, L_D) \\
&+ \delta_{AB}\, H^{AB}_{ABCD}(L_A - 1, L_B - 1, L_C, L_D) \\
&+ \delta_{AC}\, H^{AC}_{ABCD}(L_A - 1, L_B, L_C - 1, L_D) \\
&+ \delta_{AD}\, H^{AD}_{ABCD}(L_A - 1, L_B, L_C, L_D - 1)
\end{aligned} \tag{13}
$$

where:

$$\delta_{AA} = \delta_{AAx} + \delta_{AAy} + \delta_{AAz} \tag{14a}$$

$$\delta_{AAx} = (l_A - M_{Ax})(l_A - M_{Ax} - 1)/2 \tag{14b}$$

$$\delta_{AB} = \delta_{ABx} + \delta_{ABy} + \delta_{ABz} \tag{14c}$$

$$\delta_{ABx} = (l_A - M_{Ax})(l_B - M_{Bx}) \tag{14d}$$

and so on. Eq. (13) is the same as in the previous study [2], except for eq. (14b). When $l_A \le 2$, eq. (14b) is identical to what was shown in the previous work [2]. The first order $H_{ABCD}$ can be defined as:

$$
\begin{aligned}
H^{AA}_{ABCD} ={}& \sum_i D^{A3}_i(L_A - 2)\, H^{AA}_{iBCD}(L_A - 2, L_B, L_C, L_D) \\
&+ (1/2)\, \delta_{AA}\, H^{AAAA}_{ABCD}(L_A - 4, L_B, L_C, L_D) \\
&+ (1/2)\, \delta_{AB}\, H^{AAAB}_{ABCD}(L_A - 3, L_B - 1, L_C, L_D) \\
&+ (1/2)\, \delta_{AC}\, H^{AAAC}_{ABCD}(L_A - 3, L_B, L_C - 1, L_D) \\
&+ (1/2)\, \delta_{AD}\, H^{AAAD}_{ABCD}(L_A - 3, L_B, L_C, L_D - 1)
\end{aligned} \tag{15}
$$

and so on. The second order $H_{ABCD}$ can be defined as:

$$
\begin{aligned}
H^{AAAA}_{ABCD} ={}& \sum_i D^{A3}_i(L_A - 4)\, H^{AAAA}_{iBCD}(L_A - 4, L_B, L_C, L_D) \\
&+ (1/3)\, \delta_{AA}\, H^{AAAAAA}_{ABCD}(L_A - 6, L_B, L_C, L_D) \\
&+ (1/3)\, \delta_{AB}\, H^{AAAAAB}_{ABCD}(L_A - 5, L_B - 1, L_C, L_D) \\
&+ (1/3)\, \delta_{AC}\, H^{AAAAAC}_{ABCD}(L_A - 5, L_B, L_C - 1, L_D) \\
&+ (1/3)\, \delta_{AD}\, H^{AAAAAD}_{ABCD}(L_A - 5, L_B, L_C, L_D - 1)
\end{aligned} \tag{16}
$$

and so on. Eqs. (15) and (16) are new. It is found that $H^{AAAB}_{ABCD} = H^{ABAA}_{ABCD}$, $H^{ABAC}_{ABCD} = H^{ACAB}_{ABCD}$, and so on. Thus, the second order $H_{ABCD}(i_A, i_B, i_C, i_D)$ depends on the combination of these pairs of AA, AB, AC, and AD. The combination of these pairs can be sufficiently determined by the four indices $(i_A, i_B, i_C, i_D)$ (where $i_A = L_A - M_A$, and so on); therefore, it is redundant at this stage to use the pair notation of the superscript with $(i_A, i_B, i_C, i_D)$. The reason for using the pair notation is clarified later. Hereafter, we use the dictionary order for representing these pairs; that is, we use ABAC and not ACAB, and so on, at the superscript of $H_{ABCD}$. The higher order $H_{ABCD}$ can be defined in the same manner as done previously. In the definition of the $i$th order $H_{ABCD}$, the factor before $\delta_{AA}$ ($\delta_{AB}$, $\delta_{AC}$, or $\delta_{AD}$) is $1/(i + 1)$. The chain definition of $H_{ABCD}$, eqs. (13)-(16), is closed because $H^{*}_{ABCD}(i_A, i_B, i_C, i_D) = 0$ when there is a negative value among $i_A$, $i_B$, $i_C$, and $i_D$. Hereafter, superscript * denotes any set of pairs.

The 0th order $H_{iBCD}$ can be defined as:

$$
\begin{aligned}
H_{iBCD}(L_A, L_B, L_C, L_D) ={}& \sum_j D^{B3}_j(L_B)\, H_{ijCD}(L_A, L_B, L_C, L_D) \\
&+ \delta_{BB}\, H^{BB}_{iBCD}(L_A, L_B - 2, L_C, L_D) \\
&+ \delta_{BC}\, H^{BC}_{iBCD}(L_A, L_B - 1, L_C - 1, L_D) \\
&+ \delta_{BD}\, H^{BD}_{iBCD}(L_A, L_B - 1, L_C, L_D - 1)
\end{aligned} \tag{17}
$$

where $\delta_{BB}$, $\delta_{BC}$, and $\delta_{BD}$ can be expressed as in eq. (14). The other type of the 0th order $H_{iBCD}$ can be defined as:

$$
\begin{aligned}
H^{AA}_{iBCD}(L_A - 2, L_B, L_C, L_D) ={}& \sum_j D^{B3}_j(L_B)\, H^{AA}_{ijCD}(L_A - 2, L_B, L_C, L_D) \\
&+ \delta_{BB}\, H^{AABB}_{iBCD}(L_A - 2, L_B - 2, L_C, L_D) \\
&+ \delta_{BC}\, H^{AABC}_{iBCD}(L_A - 2, L_B - 1, L_C - 1, L_D) \\
&+ \delta_{BD}\, H^{AABD}_{iBCD}(L_A - 2, L_B - 1, L_C, L_D - 1)
\end{aligned} \tag{18}
$$

and so on. Each 0th order $H^{*}_{iBCD}$ corresponds to each order of $H^{*}_{ABCD}$ given in eqs. (13)-(16). Eqs. (17) and (18) are the same as in the previous work [2] except for the value of $\delta_{BB}$. The first order $H_{iBCD}$ can be defined as:

$$
\begin{aligned}
H^{BB}_{iBCD}(L_A, L_B - 2, L_C, L_D) ={}& \sum_j D^{B3}_j(L_B - 2)\, H^{BB}_{ijCD}(L_A, L_B - 2, L_C, L_D) \\
&+ (1/2)\, \delta_{BB}\, H^{BBBB}_{iBCD}(L_A, L_B - 4, L_C, L_D) \\
&+ (1/2)\, \delta_{BC}\, H^{BBBC}_{iBCD}(L_A, L_B - 3, L_C - 1, L_D) \\
&+ (1/2)\, \delta_{BD}\, H^{BBBD}_{iBCD}(L_A, L_B - 3, L_C, L_D - 1)
\end{aligned} \tag{19}
$$

$$
\begin{aligned}
H^{AABB}_{iBCD}(L_A - 2, L_B - 2, L_C, L_D) ={}& \sum_j D^{B3}_j(L_B - 2)\, H^{AABB}_{ijCD}(L_A - 2, L_B - 2, L_C, L_D) \\
&+ (1/2)\, \delta_{BB}\, H^{AABBBB}_{iBCD}(L_A - 2, L_B - 4, L_C, L_D) \\
&+ (1/2)\, \delta_{BC}\, H^{AABBBC}_{iBCD}(L_A - 2, L_B - 3, L_C - 1, L_D) \\
&+ (1/2)\, \delta_{BD}\, H^{AABBBD}_{iBCD}(L_A - 2, L_B - 3, L_C, L_D - 1)
\end{aligned} \tag{20}
$$

and so on. It is found that $H^{BCBD}_{iBCD} = H^{BDBC}_{iBCD}$, and so on. Thus, $H^{*}_{iBCD}(i_A, i_B, i_C, i_D)$ depends on the combination of pairs of AA, AB, AC, AD, BB, BC, and BD. Hereafter, we use the dictionary order for representing these pairs; that is, we use ABBBBC and not ABBCBB, and so on. It is also found that $H^{ABAB}_{iBCD} \ne H^{AABB}_{iBCD}$ and $H^{ABAC}_{iBCD} \ne H^{AABC}_{iBCD}$, because $H^{ABAB}_{iBCD}$ (or $H^{ABAC}_{iBCD}$) is the 0th order one, whereas $H^{AABB}_{iBCD}$ (or $H^{AABC}_{iBCD}$) is first order. This is the reason to use pair notation at the superscript. When $L_B \le 2$, however, it is found that $H^{ABAB}_{iBCD} = H^{AABB}_{iBCD}$ and $H^{ABAC}_{iBCD} = H^{AABC}_{iBCD}$. For this case, the pair notation is not necessary, because $(i_A, i_B, i_C, i_D)$ of $H^{*}_{iBCD}$ can sufficiently describe the term. This is the situation for description of the ACE-b3k3 algorithm in the previous work [2]. For $L_B \ge 3$, we must use the pair notation. The pair notation can completely determine the values of $i_A$, $i_B$, $i_C$, and $i_D$. Therefore, the argument $(i_A, i_B, i_C, i_D)$ is not necessary in the expression. It is mainly for readers' convenience. Higher order $H^{*}_{iBCD}$ can be defined in the same way as in eqs. (19) and (20). The chain definition of $H^{*}_{iBCD}$ is closed because $H^{*}_{iBCD}(i_A, i_B, i_C, i_D) = 0$ when there is a negative value among $i_A$, $i_B$, $i_C$, and $i_D$.

Throughout this paper, we use the term "$i$th order" for $H^{*}_{ABCD}$ and $H^{*}_{iBCD}$ ($H^{*}_{ijCD}$ and $H^{*}_{ijkD}$ also appear later) when these have the factor of $1/(i + 1)$ before $\delta_{AA}$ or $\delta_{BB}$, and so on. This factor is obtained inductively from expanding eq. (11) for individual cases.
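The bookkeeping behind the pair notation can be illustrated with a few lines of code. The sketch below is only an illustration with hypothetical function names: it puts a superscript into dictionary order and counts how far each of the four indices is lowered, using the rule visible in eqs. (13) and (17) that a pair XY lowers the index of X and the index of Y by one each (so XX lowers the index of X by two).

```python
from collections import Counter


def split_pairs(label: str) -> list:
    """Split a superscript such as 'ABBCBB' into its pairs ['AB', 'BC', 'BB']."""
    if len(label) % 2:
        raise ValueError("a pair label must have an even number of letters")
    return [label[i:i + 2] for i in range(0, len(label), 2)]


def canonical(label: str) -> str:
    """Rewrite a superscript in dictionary order of its pairs."""
    return "".join(sorted(split_pairs(label)))


def index_reduction(label: str) -> tuple:
    """How much (i_A, i_B, i_C, i_D) is lowered by the pairs in the label."""
    counts = Counter(label)
    return tuple(counts.get(center, 0) for center in "ABCD")
```

For instance, `canonical("ABBCBB")` gives `"ABBBBC"`, the ordering used in the text. The labels ABAB and AABB lower the four indices by the same amounts, which is why the indices alone cannot tell the two terms apart once they differ, and why the pair notation is kept.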

The 0th order $H^{*}_{ijCD}$ can be defined as:

$$
\begin{aligned}
H^{*}_{ijCD}(i_A, i_B, i_C, i_D) ={}& \sum_k D^{C3}_k(i_C)\, H^{*}_{ijkD}(i_A, i_B, i_C, i_D) \\
&+ \delta_{CC}\, H^{*CC}_{ijCD}(i_A, i_B, i_C - 2, i_D) \\
&+ \delta_{CD}\, H^{*CD}_{ijCD}(i_A, i_B, i_C - 1, i_D - 1)
\end{aligned} \tag{21}
$$

Higher order $H^{*}_{ijCD}$ can be defined in a similar way as in $H^{*}_{iBCD}$. It is found that $H^{ABAB}_{ijCD} = H^{AABB}_{ijCD}$ and $H^{ABAC}_{ijCD} = H^{AABC}_{ijCD}$ in all cases. Then the total number of 0th order $H^{*}_{ijCD}$ is less than the total number of all orders of $H^{*}_{iBCD}$. For $L_C \le 2$, it is found that the four values of $(i_A, i_B, i_C, i_D)$ can sufficiently describe the term of all orders of $H^{*}_{ijCD}$. This is the case in the previous work [2].

The 0th order $H^{*}_{ijkD}$ can be defined as:

$$
\begin{aligned}
H^{*}_{ijkD}(i_A, i_B, i_C, i_D) ={}& \sum_l D^{D3}_l(i_D)\, H_{ijkl}(i_A, i_B, i_C, i_D) \\
&+ \delta_{DD}\, H^{*DD}_{ijkD}(i_A, i_B, i_C, i_D - 2)
\end{aligned} \tag{22}
$$

It is found that $H_{ijkl}$ can be described sufficiently by the four indices $(i_A, i_B, i_C, i_D)$; therefore, the pair notation at the superscript is always omitted for $H_{ijkl}$. Higher order $H^{*}_{ijkD}$ can be also defined in a similar way as in $H^{*}_{iBCD}$. It is found that $H^{ABAB}_{ijkD} = H^{AABB}_{ijkD}$ and $H^{ABAC}_{ijkD} = H^{AABC}_{ijkD}$ in all cases. The total number of 0th order $H^{*}_{ijkD}$ is equal to the total number of all orders of $H^{*}_{ijCD}$. It is found that $H^{ADCD}_{ijkD} \ne H^{ACDD}_{ijkD}$ and $H^{CDCD}_{ijkD} \ne H^{CCDD}_{ijkD}$, because $H^{ADCD}_{ijkD}$ (or $H^{CDCD}_{ijkD}$) is 0th order, whereas $H^{ACDD}_{ijkD}$ (or $H^{CCDD}_{ijkD}$) is first order.

For $L_D \le 2$, however, $H^{ADCD}_{ijkD} = H^{ACDD}_{ijkD}$ and $H^{CDCD}_{ijkD} = H^{CCDD}_{ijkD}$. This was the case in the previous study [2]. All calculations of $H_{ijkl}$ are also exactly the same as in the previous study [2]. Thus, with no loss of generality, we can assume that the innermost contraction loop (so-called $K^4$ loop) is the $K_{\mathrm{bra}}$ loop (when the innermost is $K_{\mathrm{ket}}$, then the following procedure should be read after exchanging $K_{\mathrm{bra}}$ for $K_{\mathrm{ket}}$).

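Let us define $F_{mn}$ and $G^{pq}_{mn}$ by:

$$F_{mn} = \left(\frac{s_1}{s_1 + s_2}\right)^{m} \left(\frac{s_2}{s_1 + s_2}\right)^{n} F_{m+n}(z)$$

where $0 \le m \le L_A + L_B$, $0 \le n \le L_C + L_D$, and:

$$G^{pq}_{mn} = \sigma_A^{\,p}\, \sigma_B^{\,q} \sum_{t=0}^{\bar{m}} \binom{\bar{m}}{t} (-1)^t\, F_{m+t,\,n}$$

where $\bar{m} = L_A + L_B - m$ and $0 \le p, q \le n + \bar{m}$ with $p + q = n + \bar{m}$.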

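The computation of $F_{mn}$ and $G^{pq}_{mn}$ must be performed at the $K^4$ loop. The other necessary term at the $K^4$ loop is:

$$g^{pq}_{m l_1} = \sigma_A^{\,p}\, \sigma_B^{\,q}\, s_1^{\,l_1} \sum_{t=0}^{\bar{m}} \binom{\bar{m}}{t} (-1)^t\, F_{m - l_1 + t,\,0}$$

where $1 \le l_1 \le \min\{m, \bar{m}, \mathrm{int}[(L_A + L_B)/2]\}$ and $0 \le p, q \le \bar{m} - l_1$ with $p + q = \bar{m} - l_1$. Let us define $H^{pqrs}_{mn}$ by:

$$H^{pqrs}_{mn} = \sigma_C^{\,r}\, \sigma_D^{\,s} \sum_{t=0}^{\bar{n}} \binom{\bar{n}}{t} (-1)^t\, G^{pq}_{m,\,n+t}$$

where $\bar{n} = L_C + L_D - n$, $\bar{m} \le p + q \le n + \bar{m}$, and $\bar{n} \le r + s \le m + \bar{n}$.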

The computation of $H^{pqrs}_{mn}$ can be performed at the $K^2$ loop (so-called $K_{\mathrm{ket}}$ loop) for $p + q = n + \bar{m}$ and $r + s = m + \bar{n}$, and can be done outside of the contraction loops (so-called $K^0$ step) for $\bar{m} \le p + q < n + \bar{m}$ and $\bar{n} \le r + s < m + \bar{n}$ by the use of the following relations:

$$H^{pqrs}_{mn} = H^{p+1\,q\,r\,s}_{mn} + H^{p\,q+1\,r\,s}_{mn} \qquad (\bar{m} \le p + q < n + \bar{m})$$

and:

$$H^{pqrs}_{mn} = H^{p\,q\,r+1\,s}_{mn} + H^{p\,q\,r\,s+1}_{mn} \qquad (\bar{n} \le r + s < m + \bar{n})$$

These relations are based on $\sigma_A + \sigma_B = 1$ and $\sigma_C + \sigma_D = 1$. The other necessary term at the $K^2$ loop is:

$$h^{pqrs}_{mn l_2} = \sigma_C^{\,r}\, \sigma_D^{\,s}\, s_2^{\,l_2} \sum_{t=0}^{\bar{n}} \binom{\bar{n}}{t} (-1)^t\, G^{pq}_{m,\,n - l_2 + t}$$

where $1 \le l_2 = (M_A + M_B + M_C + M_D)/2 \le \min\{m + n,\; n + \bar{m},\; m + \bar{n},\; \mathrm{int}[(L_A + L_B + L_C + L_D)/2]\}$, $\max\{\bar{m} - l_2, 0\} \le p + q \le n + \bar{m} - l_2$, and $\max\{\bar{n} - l_2, 0\} \le r + s \le m + \bar{n} - l_2$. We must use $g^{pq}_{m l_1}$ instead of $s_2^{\,l_2} G^{pq}_{m,\,n - l_2 + t}$ when $n - l_2 + t$ is negative in the above equation (with $l_1 = -(n - l_2 + t)$). The computation of $h^{pqrs}_{mn l_2}$ can be performed at the $K^2$ loop for $p + q = n + \bar{m} - l_2$ and $r + s = m + \bar{n} - l_2$, and can be done at the $K^0$ step for $p + q < n + \bar{m} - l_2$ and $r + s < m + \bar{n} - l_2$ with the use of the following relations:

$$h^{pqrs}_{mn l_2} = h^{p+1\,q\,r\,s}_{mn l_2} + h^{p\,q+1\,r\,s}_{mn l_2} \qquad (\max\{\bar{m} - l_2, 0\} \le p + q < n + \bar{m} - l_2)$$

$$h^{pqrs}_{mn l_2} = h^{p\,q\,r+1\,s}_{mn l_2} + h^{p\,q\,r\,s+1}_{mn l_2} \qquad (\max\{\bar{n} - l_2, 0\} \le r + s < m + \bar{n} - l_2)$$
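A small numerical sketch may help to see why terms of lower total degree can be filled in outside the contraction loops. It assumes only that a quantity depends on $p$ and $q$ through a factor $\sigma_A^{\,p} \sigma_B^{\,q}$ with $\sigma_A + \sigma_B = 1$, so that $X^{pq} = X^{p+1\,q} + X^{p\,q+1}$. The function `fill_down` and the toy values are hypothetical and stand in for any of the quantities above; the same argument applies to $r$, $s$ with $\sigma_C + \sigma_D = 1$.

```python
def fill_down(top: dict, n_top: int, n_min: int) -> dict:
    """Fill X[(p, q)] for n_min <= p + q < n_top from the level p + q == n_top.

    `top` must hold every (p, q) with p + q == n_top. Each lower level is
    obtained with one addition per entry: X[p, q] = X[p+1, q] + X[p, q+1].
    """
    table = dict(top)
    for total in range(n_top - 1, n_min - 1, -1):
        for p in range(total + 1):
            q = total - p
            table[(p, q)] = table[(p + 1, q)] + table[(p, q + 1)]
    return table


# Toy data: sigma_A + sigma_B = 1 and an arbitrary common factor c.
sigma_a, sigma_b, c = 0.3, 0.7, 2.5
top_level = {(p, 4 - p): sigma_a ** p * sigma_b ** (4 - p) * c for p in range(5)}
table = fill_down(top_level, n_top=4, n_min=1)
```

After the call, `table[(1, 0)]` agrees with `sigma_a * c` to rounding error, although only the top level was computed from powers of the exponents.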

These relations are based on $\sigma_A + \sigma_B = 1$ and $\sigma_C + \sigma_D = 1$. All necessary $H_{ijkl}$ can be computed from the above terms by the use of eq. (6).

Finally, the abstract of the new ACE-b3k3 general algorithm is shown in Figure 1. In Figure 1, the four loop indices $(i_A, i_2, i_3, i_4)$ describe the "A pairs"; for example, AXAX, where X = A, B, C, and D, and so on ($i_A = 2$, $i_2 = 1$, $i_3 = 1$, $i_4 = 0$ for ABAC). The three loop indices $(j_1, j_2, j_3)$ describe the "B pairs"; for example, BXBX, where X = B, C, and D, and so on ($j_1 = 4$, $j_2 = 1$, $j_3 = 1$ for BBBCBD). The two loop indices $(k_1, k_2)$ describe the "C pairs"; for example, CXCXCX, where X = C and D, and so on ($k_1 = 3$, $k_2 = 1$ for CCCD). The loop index $m_1$ describes the total number of the DD pairs.

FIGURE 1. Abstract of the ACE-b3k3 general algorithm.

For the reader's convenience, in order to observe the present algorithm, the expanded formula of eq. (11) for the (pp|pp) class of ERIs is shown. The expanded formula of those of higher order GTOs is too extensive to be shown here. Thus:

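$$
\begin{aligned}
(p_i p_j | p_k p_l)
={}& AB_i \Bigl[ BA_j \bigl\{ CD_k ( DC_l H^{1111}_{00} + BA_l H^{2101}_{01} + DB_l H^{1101}_{01} ) \\
&\qquad + AB_k ( DC_l H^{1210}_{01} + BA_l H^{2200}_{02} + DB_l H^{1200}_{02} ) \\
&\qquad + CA_k ( DC_l H^{1110}_{01} + BA_l H^{2100}_{02} + DB_l H^{1100}_{02} ) + \delta_{kl} h^{1100}_{011} \bigr\} \\
&\quad + DC_j \bigl\{ CD_k ( DC_l H^{0121}_{10} + BA_l H^{1111}_{11} + DB_l H^{0111}_{11} ) \\
&\qquad + \cdots \\
&\qquad + CA_k ( DC_l H^{0120}_{11} + BA_l H^{1110}_{12} + DB_l H^{0110}_{12} ) + \delta_{kl} h^{0110}_{111} \bigr\} \\
&\quad + BD_j \bigl\{ CD_k ( DC_l H^{0111}_{10} + BA_l H^{1101}_{11} + DB_l H^{0101}_{11} ) \\
&\qquad + \cdots \\
&\qquad + CA_k ( DC_l H^{0110}_{11} + BA_l H^{1100}_{12} + DB_l H^{0100}_{12} ) + \delta_{kl} h^{0100}_{111} \bigr\} \\
&\quad + \delta_{jk} ( DC_l h^{0110}_{111} + BA_l h^{1100}_{121} + DB_l h^{0100}_{121} ) \\
&\quad + \delta_{jl} ( CD_k h^{0101}_{111} + AB_k h^{0200}_{121} + CA_k h^{0100}_{121} ) \Bigr] \\
&+ CD_i \Bigl[ BA_j \bigl\{ CD_k ( DC_l H^{1012}_{10} + BA_l H^{2002}_{11} + DB_l H^{1002}_{11} ) \\
&\qquad + AB_k ( DC_l H^{1111}_{11} + BA_l H^{2101}_{12} + DB_l H^{1101}_{12} ) \\
&\qquad + CA_k ( DC_l H^{1011}_{11} + BA_l H^{2001}_{12} + DB_l H^{1001}_{12} ) + \delta_{kl} h^{1001}_{111} \bigr\} \\
&\quad + DC_j \bigl\{ CD_k ( DC_l H^{0022}_{20} + BA_l H^{1012}_{21} + DB_l H^{0012}_{21} ) \\
&\qquad + \cdots \\
&\qquad + CA_k ( DC_l H^{0021}_{21} + BA_l H^{1011}_{22} + DB_l H^{0011}_{22} ) + \delta_{kl} h^{0011}_{211} \bigr\} \\
&\quad + BD_j \bigl\{ CD_k ( \cdots + DB_l H^{0002}_{21} ) \\
&\qquad + \cdots \\
&\qquad + CA_k ( DC_l H^{0011}_{21} + BA_l H^{1001}_{22} + DB_l H^{0001}_{22} ) + \delta_{kl} h^{0001}_{211} \bigr\} \\
&\quad + \cdots \Bigr] \\
&+ AC_i \Bigl[ BA_j \bigl\{ CD_k ( DC_l H^{1011}_{10} + BA_l H^{2001}_{11} + DB_l H^{1001}_{11} ) \\
&\qquad + \cdots \\
&\qquad + CA_k ( DC_l H^{1010}_{11} + BA_l H^{2000}_{12} + DB_l H^{1000}_{12} ) + \delta_{kl} h^{1000}_{111} \bigr\} \\
&\quad + DC_j \bigl\{ CD_k ( DC_l H^{0021}_{20} + BA_l H^{1011}_{21} + DB_l H^{0011}_{21} ) \\
&\qquad + \cdots \\
&\qquad + CA_k ( DC_l H^{0020}_{21} + BA_l H^{1010}_{22} + DB_l H^{0010}_{22} ) + \delta_{kl} h^{0010}_{211} \bigr\} \\
&\quad + BD_j \bigl\{ CD_k ( \cdots + DB_l H^{0001}_{21} ) \\
&\qquad + \cdots \\
&\qquad + CA_k ( DC_l H^{0010}_{21} + BA_l H^{1000}_{22} + DB_l H^{0000}_{22} ) + \delta_{kl} h^{0000}_{211} \bigr\} \\
&\quad + \cdots \Bigr] \\
&+ \delta_{ij} \bigl[ CD_k ( DC_l h^{0011}_{101} + BA_l h^{1001}_{111} + DB_l h^{0001}_{111} ) \\
&\qquad + AB_k ( DC_l h^{0110}_{111} + BA_l h^{1100}_{121} + DB_l h^{0100}_{121} ) \\
&\qquad + CA_k ( DC_l h^{0010}_{111} + BA_l h^{1000}_{121} + DB_l h^{0000}_{121} ) + \delta_{kl} h^{0000}_{112} \bigr] \\
&+ \delta_{ik} \bigl[ BA_j ( DC_l h^{1010}_{111} + BA_l h^{2000}_{121} + DB_l h^{1000}_{121} ) \\
&\qquad + DC_j ( DC_l h^{0020}_{211} + BA_l h^{1010}_{221} + DB_l h^{0010}_{221} ) \\
&\qquad + BD_j ( DC_l h^{0010}_{211} + BA_l h^{1000}_{221} + DB_l h^{0000}_{221} ) + \delta_{jl} h^{0000}_{222} \bigr] \\
&+ \delta_{il} \bigl[ BA_j ( CD_k h^{1001}_{111} + AB_k h^{1100}_{121} + CA_k h^{1000}_{121} ) \\
&\qquad + DC_j ( CD_k h^{0011}_{211} + AB_k h^{0110}_{221} + CA_k h^{0010}_{221} ) \\
&\qquad + BD_j ( CD_k h^{0001}_{211} + AB_k h^{0100}_{221} + CA_k h^{0000}_{221} ) + \delta_{jk} h^{0000}_{222} \bigr]
\end{aligned}
$$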

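where $i, j, k, l = x, y, z$; $\delta_{ij}$ is the Kronecker delta; $AB_i = B_i - A_i$, and so on. In the above formula, the correspondence between terms is as follows:

$$(AB_i, CD_i, AC_i) \leftrightarrow D^{A3}_i(1)$$

$$(BA_j, DC_j, BD_j) \leftrightarrow D^{B3}_j(1)$$

$$(CD_k, AB_k, CA_k) \leftrightarrow D^{C3}_k(1)$$

$$(DC_l, BA_l, DB_l) \leftrightarrow D^{D3}_l(1)$$

$$\delta_{ij}, \delta_{ik}, \delta_{il} \leftrightarrow \delta_{AB}, \delta_{AC}, \delta_{AD}$$

$$\delta_{jk}, \delta_{jl}, \delta_{kl} \leftrightarrow \delta_{BC}, \delta_{BD}, \delta_{CD}$$

and:

$$H^{pqrs}_{mn},\; h^{pqrs}_{mn l_2} \leftrightarrow H_{ijkl}(i_A, i_B, i_C, i_D)$$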

The present ACE-b3k3 algorithm is especially suitable for general contraction of the basis set; that is, for the atomic natural orbital (ANO) basis of Almlöf and Taylor [12, 13]. They used K = 6 for the d-type ANO, K = 4 for f-type in the Ne or N atom, and K = 10 for the d-type in the S atom [12]. The ANO is a high-quality basis set that allows for essentially no loss of correlation energy [13]. By the ACE-b3k3 algorithm, we can use the ANO basis of larger contraction for higher GTOs. For example, we may use K = 6 for f-type ANO bases, because the FLOP count of K = 6 is comparable to that of K = 3-5 (see Table VI). A special code for the ANO basis can be made, which is a project currently in progress.

The present algorithm can be also applied to the derivative of ERI. Such a project is also in progress.

Numerical Examples

A computer program code that is universal for all types of GTOs can be constructed by the use of the present ACE-b3k3 general algorithm described in the preceding section. Table I shows the FLOP count parameters of the present version of the general program code for (LL|LL) ERIs (L = 1-5). The value of the x parameter is the FLOP count of the so-called $K^4$ step. The value of the y parameter is that of the $K^2$ step. The value of the z parameter is that of the $K^0$ step. The total FLOP count is then given by:

$$\text{FLOP count} = xK^4 + yK^2 + z \tag{23}$$

where K is the degree of contraction of GTO. In eq. (23), it is assumed that all GTOs in the ERI in question have the same K value. The FLOP count to make the molecular incomplete gamma function $F_m(z)$ is omitted in this article, but has been investigated in the previous work [2]. The FLOP count in Table I is not optimal for two reasons: (1) the computer program code is universal for all types of GTOs; and (2) this code is not yet fully optimized in its present version. For example, for (pp|pp) ERIs, the parameter values are x = 75, y = 205, z = 2318, whereas the optimum values are x = 61, y = 166, z = 1920, as shown in the previous article [2]. However, the present version of the program code is practically sufficient for d or higher-order GTOs as discussed in what follows.

TABLE I. The FLOP Count Parameters of ACE-b3k3 General Algorithm.^{a,b}

ERI class     x          y            z
(pp|pp)^c     75 (61)    205 (166)    2318 (1920)
(dd|dd)       327        2281         163000
(ff|ff)       861        11237        4146000
(gg|gg)       1781       37128        58022000
(hh|hh)       3191       96749        538709000

a Except for the calculation of F_m(z); see ref. 2 for F_m(z).
b FLOP count = xK^4 + yK^2 + z, where K is the degree of contraction.
c Parameters of the computer program code, specially optimized for the (pp|pp) ERI class, are shown in parentheses (see ref. 2).
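Eq. (23) is easy to evaluate for any degree of contraction. The sketch below does so with the ACE-b3k3 parameters of Table I; the dictionary and function names are hypothetical, and the rounding to the nearest thousand is only an assumption about how the tabulated totals were presented.

```python
# (x, y, z) parameters of the general ACE-b3k3 code, copied from Table I.
TABLE_I = {
    "pp|pp": (75, 205, 2318),
    "dd|dd": (327, 2281, 163000),
    "ff|ff": (861, 11237, 4146000),
    "gg|gg": (1781, 37128, 58022000),
    "hh|hh": (3191, 96749, 538709000),
}


def flop_count(x: int, y: int, z: int, K: int) -> int:
    """Total FLOP count of eq. (23): x*K^4 + y*K^2 + z."""
    return x * K ** 4 + y * K ** 2 + z


def rounded_to_thousand(value: int) -> int:
    """Round to the nearest thousand (ASSUMED presentation of the totals)."""
    return int(round(value, -3))


dd_k3 = flop_count(*TABLE_I["dd|dd"], K=3)
hh_k6 = flop_count(*TABLE_I["hh|hh"], K=6)
```

For (dd|dd) with K = 3 this gives 210016 operations, of which the K-independent z term contributes 163000; this illustrates how strongly the z parameter dominates at moderate contraction.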

Tables II-IV show comparisons to other methods in FLOP count parameters. The methods of Ishida [3], Head-Gordon and Pople [4], Lindh et al. [5], and Hamilton and Schaefer [6] are comparable in regard to FLOP count parameters, as seen in Tables II and III.

For a vector computer, Ishida's method II can be strongly recommended among these comparable methods, because of the strongly vectorizable feature in the $K^4$ step [3]. The methods of McMurchie and Davidson [7] and Dupuis et al. [8] have very large x parameters, which means that these methods must not be used for larger K. The PRISM algorithm of Gill and Pople [9] and the present ACE-b3k3 have small x parameters, which means that they are adequate for larger K. As seen in Tables II-IV, the x parameter of the ACE-b3k3 is the lowest among all methods in the literature.

TABLE II. Comparison of FLOP Count Parameters with Those of Other Methods for (dd|dd) ERI Class.^{a,b}

Method                            x        y        z
ACE-b3k3                          327      2281     163000
Gill and Pople (PRISM-CCTTT)^c    575      5506     159624
Tenno's estimate^d                330      1800     200000
Ishida's method II^e              13209    30       11256
Head-Gordon and Pople^c           13466    0        10295
Lindh et al.^f                    10255    30       11256
Hamilton and Schaefer^g           13900    30       11256
McMurchie and Davidson^g          27300    24000    0
Dupuis et al.^g                   30900    220      0

a Except for the calculation of F_m(z); see ref. 2 for F_m(z).
b FLOP count = xK^4 + yK^2 + z, where K is the degree of contraction.
c Ref. 9. d Ref. 10. e Ref. 3. f Ref. 5. g Ref. 6.

TABLE III. Comparison of FLOP Count Parameters with Those of Other Methods for (ff|ff) ERI Class.^{a,b}

Method                       x         y         z
ACE-b3k3                     861       11237     4146000
Gill et al.^c                11000     600000    600000
Ishida's method II^d         77817     30        135024
Head-Gordon and Pople^e      108000    30        135024
Lindh et al.^f               76901     30        135024
Hamilton and Schaefer^g      87800     30        135024
McMurchie and Davidson^g     342000    383000    0
Dupuis et al.^g              276000    600       0

a Except for the calculation of F_m(z); see ref. 2 for F_m(z).
b FLOP count = xK^4 + yK^2 + z, where K is the degree of contraction.
c Ref. 11. d Ref. 3. e Ref. 4. f Ref. 5. g Ref. 6.

TABLE IV. Comparison of FLOP Count Parameters with Those of Ishida's Method II (Ref. 3) for (gg|gg) and (hh|hh) ERI Classes.^{a,b}

ERI class    Method                x         y        z
(gg|gg)      ACE-b3k3              1781      37128    58022000
(gg|gg)      Ishida's method II    308615    30       939060
(hh|hh)      ACE-b3k3              3191      96749    538709000
(hh|hh)      Ishida's method II    951864    30       4651624

a Except for the calculation of F_m(z); see ref. 2 for F_m(z).
b FLOP count = xK^4 + yK^2 + z, where K is the degree of contraction.

The total FLOP count of the present ACE-b3k3 is compared with the other methods in Tables V-VII for (dd|dd), (ff|ff), (gg|gg), and (hh|hh) ERIs. The methods of McMurchie and Davidson [7] and Dupuis et al. [8] must not be used when K ≥ 3, as seen in Tables V and VI. Ishida's method II [3] and comparable methods can be used when K ≤ 3 for (ff|ff), K ≤ 4 for (gg|gg), and K ≤ 5 for (hh|hh). The present ACE-b3k3 is fastest when K ≥ 3 for (dd|dd) and for (ff|ff), K ≥ 4 for (gg|gg), and K ≥ 5 for (hh|hh), as seen in Tables V-VII. Ishida's method II is fastest when K = 3 for (gg|gg) and 3 ≤ K ≤ 4 for (hh|hh).

TABLE V. Comparison of Total FLOP Count with that of Other Methods for (dd|dd) ERI Class.^{a,b}

Method                       K = 3      K = 4      K = 5       K = 6
ACE-b3k3                     210000     283000     424000      669000
Gill and Pople (PRISM)^c     256000     395000     657000      1103000
Tenno's estimate^d           243000     313000     451000      692000
Ishida's method II^e         1081000    3393000    8268000     17131000
Head-Gordon and Pople^c      1101000    3458000    8427000     17462000
Lindh et al.^f               842000     2637000    6421000     13303000
Hamilton and Schaefer^g      1137000    3570000    8700000     18027000
McMurchie and Davidson^g     2427000    7373000    17663000    36245000
Dupuis et al.^g              2505000    7914000    19318000    40054000

a Except for the calculation of F_m(z); see ref. 2 for F_m(z).
b K is the degree of contraction.
c Ref. 9. d Ref. 10. e Ref. 3. f Ref. 5. g Ref. 6.

TABLE VI. Comparison of Total FLOP Count with that of Other Methods for (ff|ff) ERI Class.^{a,b}

Method                       K = 3       K = 4       K = 5        K = 6
ACE-b3k3                     4317000     4547000     4966000      5666000
Gill et al.^c                6891000     13016000    22475000     36456000
Ishida's method II^d         6438000     20057000    48771000     100987000
Head-Gordon and Pople^e      8883000     27783000    67636000     140104000
Lindh et al.^f               6372000     19822000    48199000     99800000
Hamilton and Schaefer^g      7247000     22612000    55011000     113925000
McMurchie and Davidson^g     31149000    93680000    223325000    457020000
Dupuis et al.^g              22356000    70666000    172515000    357718000

a Except for the calculation of F_m(z); see ref. 2 for F_m(z).
b K is the degree of contraction.
c Ref. 11. d Ref. 3. e Ref. 4. f Ref. 5. g Ref. 6.

TABLE VII. Comparison of Total FLOP Count with that of Ishida's Method II (Ref. 3) for (gg|gg) and (hh|hh) ERI Classes.^{a,b}

ERI class    Method                K = 3        K = 4        K = 5        K = 6
(gg|gg)      ACE-b3k3              58501000     59073000     60064000     61667000
(gg|gg)      Ishida's method II    25937000     79945000     193824000    400905000
(hh|hh)      ACE-b3k3              539838000    541073000    543122000    546198000
(hh|hh)      Ishida's method II    81753000     248329000    599567000    1238268000

a Except for the calculation of F_m(z); see ref. 2 for F_m(z).
b K is the degree of contraction.

Table VIII shows the total computation time of ERI by the present ACE-b3k3 general code compared with Ishida's method II [3] for (LL|LL) ERIs (L = 2-5) measured with a HITAC MP5800 scalar computer. For (dd|dd) and (ff|ff), the ACE-b3k3 is 2-20 times faster than Ishida's method II, as seen in Table VIII. For (gg|gg), the ACE-b3k3 is 2-10 times faster for K ≥ 4. For (hh|hh), it is 1.5-3 times faster for K ≥ 5. For a smaller value of K, formulas other than ACE-b3k3 must be used (e.g., b2k3 or b1k1). Construction of the computer program code for b2k3 and b1k1, which is universal for all types of GTOs, is in progress.

TABLE VIII. Total Computation Time of Class of ERIs by ACE-b3k3 Algorithm Compared with that Using Ishida's Method II (Ref. 3) (in Seconds).^{a,b}

ERI class    Method                K = 3     K = 4     K = 5     K = 6
(dd|dd)      ACE-b3k3              0.0096    0.0137    0.0217    0.0356
(dd|dd)      Ishida's method II    0.0436    0.138     0.338     0.705
(ff|ff)      ACE-b3k3              0.132     0.143     0.162     0.196
(ff|ff)      Ishida's method II    0.266     0.835     2.04      4.27
(gg|gg)      ACE-b3k3              1.70      1.73      1.77      1.84
(gg|gg)      Ishida's method II    1.14      3.52      8.80      18.1
(hh|hh)      ACE-b3k3              16.3      16.3      16.4      16.5
(hh|hh)      Ishida's method II    3.76      11.6      28.3      58.3

a All times were measured with a HITAC MP5800 scalar computer.
b K is the degree of contraction.
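The crossover behavior described above can be recomputed from the (x, y, z) parameters alone. The sketch below finds, for each ERI class, the smallest degree of contraction at which the ACE-b3k3 FLOP count falls below that of Ishida's method II. The parameter values are copied from Tables I-IV; the function and variable names are hypothetical, and the search limit `k_max` is an arbitrary choice.

```python
# (x, y, z) parameters copied from Tables I-IV of the text.
ACE_B3K3 = {
    "dd|dd": (327, 2281, 163000),
    "ff|ff": (861, 11237, 4146000),
    "gg|gg": (1781, 37128, 58022000),
    "hh|hh": (3191, 96749, 538709000),
}
ISHIDA_II = {
    "dd|dd": (13209, 30, 11256),
    "ff|ff": (77817, 30, 135024),
    "gg|gg": (308615, 30, 939060),
    "hh|hh": (951864, 30, 4651624),
}


def flop_count(params, K: int) -> int:
    """FLOP count x*K^4 + y*K^2 + z for one (x, y, z) triple."""
    x, y, z = params
    return x * K ** 4 + y * K ** 2 + z


def crossover(params_a, params_b, k_max: int = 20):
    """Smallest K in 1..k_max at which method a needs fewer FLOPs than b."""
    for K in range(1, k_max + 1):
        if flop_count(params_a, K) < flop_count(params_b, K):
            return K
    return None


crossovers = {cls: crossover(ACE_B3K3[cls], ISHIDA_II[cls]) for cls in ACE_B3K3}
```

The resulting crossover values are K = 2, 3, 4, and 5 for the (dd|dd), (ff|ff), (gg|gg), and (hh|hh) classes, in line with the ranges quoted in the text for K ≥ 3.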

It is confirmed with the present general program code that the ACE-b3k3 algorithm is numerically very stable even for higher order GTOs; that is, there is no loss of significant figures other than the round-off error. For (gg|gg) and (hh|hh), the total FLOP count of the ACE-b3k3 varies only very slightly with increasing K value, as seen in Table VII. This is because the value of the z parameter is very large. For the higher order GTOs d-h, it is more efficient to use the solid harmonic (SH) GTOs, given by:

$$\text{SH-GTO} = S_{lm} \exp(-\alpha r^2) \tag{24}$$

where the solid harmonics $S_{lm}$ of a polar coordinate $(r, \theta, \phi)$ can be defined as:

$$S_{l0} = r^l P_l(\cos\theta) \tag{25a}$$

$$S_{lm} = r^l P_{lm}(\cos\theta) \cos(m\phi) \quad (m > 0) \tag{25b}$$

and:

$$S_{l,-m} = r^l P_{lm}(\cos\theta) \sin(m\phi) \quad (m > 0) \tag{25c}$$

For SH-GTOs, the value of the z parameter ($K^0$ FLOP count) will be drastically smaller than that for the usual Cartesian GTO. This will permit drastically faster computation of ERI for higher order GTOs. Such a project is currently in progress.
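One factor that plausibly contributes to this saving is simply the number of functions per shell: a Cartesian shell of angular momentum $l$ has $(l+1)(l+2)/2$ components, whereas eqs. (25a)-(25c) give $2l + 1$ solid harmonics. The sketch below counts the integrals in one (LL|LL) class for both choices. Linking these counts to the size of the z parameter is an assumption made here for illustration, not a statement taken from the text, and the function names are hypothetical.

```python
def n_cartesian(l: int) -> int:
    """Number of Cartesian components x^a y^b z^c with a + b + c = l."""
    return (l + 1) * (l + 2) // 2


def n_solid_harmonic(l: int) -> int:
    """Number of solid harmonics S_lm with -l <= m <= l."""
    return 2 * l + 1


def class_sizes(l: int) -> tuple:
    """Integrals in one (LL|LL) class: (Cartesian count, solid harmonic count)."""
    return n_cartesian(l) ** 4, n_solid_harmonic(l) ** 4


sizes = {name: class_sizes(l) for name, l in [("d", 2), ("f", 3), ("g", 4), ("h", 5)]}
```

For h-type functions the class shrinks from 21^4 = 194481 Cartesian integrals to 11^4 = 14641 solid harmonic integrals, roughly a thirteenfold reduction in the number of final quantities to assemble.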

Acknowledgments

All computations were carried out using a HITAC MP5800 computer at the Computer Centre of the University of Tokyo. The author thanks his wife for her assistance in typing his program code into the computer. The author thanks the referee for many helpful comments.

References

1. For example, see S. Ten-no and S. Iwata, J. Chem. Phys., 105, 3604 (1996) for an RI approximation, and M. Challacombe and E. Schwegler, J. Chem. Phys., 106, 5526 (1997) for a multipole expansion approximation.
2. K. Ishida, Int. J. Quantum Chem., 59, 209 (1996).
3. K. Ishida, J. Chem. Phys., 98, 2176 (1993).
4. M. Head-Gordon and J. A. Pople, J. Chem. Phys., 89, 5777 (1988).
5. R. Lindh, U. Ryu, and B. Liu, J. Chem. Phys., 95, 5889 (1991).
6. T. P. Hamilton and H. F. Schaefer III, Chem. Phys., 150, 163 (1991).
7. L. E. McMurchie and E. R. Davidson, J. Comput. Phys., 26, 218 (1978).
8. M. Dupuis, J. Rys, and H. F. King, J. Chem. Phys., 65, 111 (1976).
9. P. M. W. Gill and J. A. Pople, Int. J. Quantum Chem., 40, 753 (1991).
10. S. Ten-no, Chem. Phys. Lett., 211, 259 (1993).
11. P. M. W. Gill, M. Head-Gordon, and J. A. Pople, Int. J. Quantum Chem. Symp., 23, 269 (1989).
12. J. Almlöf and P. R. Taylor, J. Chem. Phys., 86, 4070 (1987).
13. J. Almlöf and P. R. Taylor, J. Chem. Phys., 92, 551 (1990).
