The Distributed Data Interface in GAMESS

Brett M. Bode, Michael W. Schmidt, Graham D. Fletcher, and Mark S. Gordon
Ames Laboratory-USDOE, Iowa State University
10/7/99



Page 2: The Distributed Data Interface in GAMESS

What is GAMESS?

General Atomic and Molecular Electronic Structure System

• First principles: fully quantum mechanical

• Created from other programs in ~1980

• Developed by Dr. Mark Gordon's research group since 1982, with Dr. Michael Schmidt as the principal developer

• Parallelization began in 1991
  • Emphasis on distributed-memory systems

• Currently includes methods for treating systems from 1 atom to several hundred atoms

Page 3: The Distributed Data Interface in GAMESS

Partial list of capabilities

SCF type       RHF   ROHF  UHF   GVB   MCSCF
Energy         CDP   CDP   CDP   CDP   CDP
Gradient       CDP   CDP   CDP   CDP   CDP
Hessian        CDP   CDP   -     CDP   -
MP2 Energy     CDP   CDP   CDP   -     C
MP2 Gradient   CDP   -     -     -     -
CI Energy      CDP   CDP   -     CDP   CDP

C = Uses disk storage
D = Minimal disk usage
P = Parallel execution

Page 4: The Distributed Data Interface in GAMESS

First Generation Parallel Code

Parallel communications were performed using either:
• TCGMSG
• Vendor-supplied MPI

The parallel version was usually a slightly modified version of the sequential code.
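A minimal sketch of the replicated-data pattern that a lightly modified sequential code typically follows, simulated in plain Python rather than with TCGMSG or MPI. Each simulated rank keeps a full copy of the result array, computes only its share of the work, and a global sum (the role `MPI_Allreduce` plays in MPI) combines the partial results. The function names, the round-robin work split, and the toy work item are assumptions for illustration, not GAMESS code.

```python
def partial_result(rank, nranks, nitems, size):
    """Work done by one rank: a full-size array, but only its share of the items."""
    result = [0.0] * size                       # replicated: every rank holds the whole array
    for item in range(rank, nitems, nranks):    # static round-robin split of the work
        result[item % size] += float(item)      # stand-in for the real computation
    return result


def global_sum(partials):
    """Element-wise sum across ranks, the role of a global-sum collective."""
    return [sum(values) for values in zip(*partials)]


def run(nranks, nitems=100, size=8):
    """Simulate nranks processes and return the combined result array."""
    partials = [partial_result(r, nranks, nitems, size) for r in range(nranks)]
    return global_sum(partials)


serial = run(nranks=1)
parallel = run(nranks=4)
print(serial == parallel)
```

The work per rank shrinks as ranks are added, but the memory per rank does not: every rank still allocates the full array.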

Page 5: The Distributed Data Interface in GAMESS

IBM-SUR cluster

22 IBM RS/6000 43P-260:
– Dual 200 MHz Power3 CPUs
– 4 MB of Level 2 cache
– 1 GByte of RAM
– 18 GBytes fast local disks
– Jumbo Frames Gigabit Ethernet
– Integrated Fast Ethernet

Fast Ethernet switch to all nodes

3 x 9-port Gigabit switches

Page 6: The Distributed Data Interface in GAMESS

Gigabit Performance on the IBM 43P-260 Cluster

[Figure: Network Performance Comparison on the IBM RS/6000 43P-260 running AIX v4.3.2. X axis: message size, 1x10^0 to 1x10^8; Y axis: 0 to 900. Series: Fast Ethernet, MPI; Fast Ethernet, TCP; Gigabit Ethernet, MPI, Jumbo; Gigabit Ethernet, TCP, Jumbo; Gigabit Ethernet, TCP, Normal; Gigabit Ethernet, MPI, Normal. Additional label: Alteon 180.]

Page 7: The Distributed Data Interface in GAMESS

Test Molecule

Ti(C5H5)2 C2H4SiHCl3

Basis Set

• 6-31G(d,p) on C and H.

• SBKJC ECP on Si, Ti, and Cl extended with 1 d-type polarization function on Si and Cl.

• 345 total basis functions
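The total of 345 can be checked with a short count. This is a sketch under stated assumptions: the atom counts (1 Ti, 12 C, 15 H, 1 Si, 3 Cl) are read off the formula above, Cartesian (6-component) d shells are assumed, and the SBKJC valence basis sizes used for Si, Cl, and Ti are assumptions rather than values given in the slides.

```python
# Basis functions per atom; Cartesian d shells (6 functions each) are assumed.
FUNCS_PER_ATOM = {
    "C": 15,   # 6-31G(d,p): 3 s functions + 2 p shells (6) + 1 d shell (6)
    "H": 5,    # 6-31G(d,p): 2 s functions + 1 p shell (3)
    "Si": 14,  # assumed SBKJC valence: 2 sp shells (8) + 1 added d shell (6)
    "Cl": 14,  # assumed SBKJC valence: 2 sp shells (8) + 1 added d shell (6)
    "Ti": 34,  # assumed SBKJC: 4 sp shells (16) + 3 d shells (18)
}

# Atom counts read off Ti(C5H5)2, C2H4, and SiHCl3.
ATOMS = {"Ti": 1, "C": 12, "H": 15, "Si": 1, "Cl": 3}

total = sum(FUNCS_PER_ATOM[atom] * count for atom, count in ATOMS.items())
print(total)  # 345 under these assumptions
```

With spherical (5-component) d shells the count would be lower, so the agreement with 345 suggests the Cartesian convention was used here.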

Page 8: The Distributed Data Interface in GAMESS

Parallel SCF

Very good scaling, dependent on the size of the molecule.

Large systems show nearly linear scaling through 256 nodes.

[Figure: SCF Energy Speedup Curve. X axis: number of CPUs (1 to 16); Y axis: speedup over 1-CPU wall timing (1 to 16). Series: Ideal; Direct SCF, Gig. Ethernet; Direct SCF, Gig. Ethernet, 1 CPU/box; Conv. SCF, Gig. Ethernet; Conv. SCF, Gig. Ethernet, 1 CPU/box; Direct SCF, Fast Ethernet; Direct SCF, Fast Ethernet, 1 CPU/box; Conv. SCF, Fast Ethernet; Conv. SCF, Fast Ethernet, 1 CPU/box.]
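A speedup curve of this kind plots the 1-CPU wall time divided by the wall time on p CPUs, with the ideal line at speedup = p. A minimal sketch of that calculation follows; the timings are hypothetical placeholders, not measurements from these slides.

```python
def speedup(t1, tp):
    """Speedup over the 1-CPU wall time."""
    return t1 / tp


def efficiency(t1, tp, ncpus):
    """Fraction of ideal (linear) speedup achieved on ncpus CPUs."""
    return speedup(t1, tp) / ncpus


# Hypothetical wall-clock timings in seconds, keyed by CPU count.
timings = {1: 1600.0, 2: 820.0, 4: 430.0, 8: 240.0, 16: 150.0}

t1 = timings[1]
for ncpus, tp in sorted(timings.items()):
    print(f"{ncpus:2d} CPUs: speedup {speedup(t1, tp):5.2f}, "
          f"efficiency {efficiency(t1, tp, ncpus):4.0%}")
```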

Page 9: The Distributed Data Interface in GAMESS

Successes and Limitations

Successes:
• SCF methods scale very well
• Most methods run in parallel
• Good use is made of aggregate CPU and disk resources

Limitations:
• MP2 and MCSCF methods scale to only a few (8-32) nodes
• The aggregate memory is not utilized, so jobs are still limited by the memory size of one node
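The memory limitation can be made concrete with the cluster figures given earlier (22 nodes with 1 GByte of RAM each). The sketch below compares the largest square double-precision matrix that fits when the data must sit in one node's memory with what fits when it is spread over all nodes. It assumes 8-byte reals and 1 GByte = 2**30 bytes, and ignores all other memory the program needs.

```python
import math

BYTES_PER_REAL = 8      # double precision (assumption)
NODE_RAM = 2**30        # 1 GByte per node, taken as 2**30 bytes (assumption)
NNODES = 22             # nodes in the cluster described earlier


def largest_square_matrix(nbytes):
    """Largest N such that an N x N matrix of reals fits in nbytes."""
    return math.isqrt(nbytes // BYTES_PER_REAL)


replicated_n = largest_square_matrix(NODE_RAM)            # limited by one node
distributed_n = largest_square_matrix(NODE_RAM * NNODES)  # uses aggregate memory

print(replicated_n, distributed_n)
```

The distributed dimension grows with the square root of the node count, so 22 nodes allow a matrix a little under 5 times larger on each side.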

Page 10: The Distributed Data Interface in GAMESS

Second Generation Methods

New methods should take advantage of the aggregate memory of a parallel system
• Implies higher communication demands
• Many-to-many messaging profile

Methods should scale to hundreds of nodes (at least)

Demanding local storage needs

Page 11: The Distributed Data Interface in GAMESS

The Distributed Data Interface (DDI)

[Diagram: node 0 (process 0) and node 1 (process 1), each holding GAMESS replicated data and DDI distributed data. Labels: "Interrupt", "patch".]

Figure 1. Memory model if using a full-function one-sided messaging library. DDI_GET's interrupt of process 0 results in the data transfer to the requesting node.

DDI provides the core functions needed to treat a portion of the memory on each node as part of a global shared array.
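The idea of a global array assembled from per-node memory can be sketched as a toy, single-process simulation. The `get` method loosely echoes the DDI_GET call named in Figure 1, and `acc` stands for an accumulate operation; the class, its method signatures, and the column-wise layout are hypothetical and are not the DDI API.

```python
class DistributedArray:
    """Toy global 2-D array whose columns are split across simulated nodes."""

    def __init__(self, nrows, ncols, nnodes):
        self.nrows, self.ncols, self.nnodes = nrows, ncols, nnodes
        # Node k owns columns bounds[k] .. bounds[k + 1] - 1.
        base, extra = divmod(ncols, nnodes)
        self.bounds = [0]
        for k in range(nnodes):
            self.bounds.append(self.bounds[-1] + base + (1 if k < extra else 0))
        # Each node's local memory: a list of its columns, each of length nrows.
        self.local = [
            [[0.0] * nrows for _ in range(self.bounds[k + 1] - self.bounds[k])]
            for k in range(nnodes)
        ]

    def owner(self, col):
        """Return the node that stores global column col."""
        for k in range(self.nnodes):
            if self.bounds[k] <= col < self.bounds[k + 1]:
                return k
        raise IndexError(col)

    def _column(self, col):
        k = self.owner(col)
        return self.local[k][col - self.bounds[k]]

    def get(self, r0, r1, c0, c1):
        """Copy the patch rows r0..r1-1, columns c0..c1-1 (as a list of columns)."""
        return [self._column(c)[r0:r1] for c in range(c0, c1)]

    def acc(self, r0, r1, c0, c1, patch):
        """Add patch (a list of columns) into the owning nodes' memory."""
        for j, c in enumerate(range(c0, c1)):
            column = self._column(c)
            for i, r in enumerate(range(r0, r1)):
                column[r] += patch[j][i]


ga = DistributedArray(nrows=4, ncols=6, nnodes=2)
ga.acc(0, 2, 2, 4, [[1.0, 2.0], [3.0, 4.0]])   # this patch spans both nodes
patch = ga.get(0, 4, 2, 4)
print(ga.owner(2), ga.owner(3), patch)
```

The caller addresses the array by global indices only; which node actually holds a column is resolved inside the class, which is the property that lets a patch straddle node boundaries.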

Page 12: The Distributed Data Interface in GAMESS

DDI

Runs on top of:
• MPI (MPI-2 preferred)
• TCP/IP sockets

Lightweight - provides only the functionality needed by GAMESS

Is not intended as a general-purpose library

Does optimize for mixed SMP and distributed-memory systems

Page 13: The Distributed Data Interface in GAMESS

New MP2 implementation

Uses DDI to utilize the aggregate memory of the parallel machine at the expense of communications

Trades some symmetry in the MP2 equations for better parallel scalability
• Requires more memory than the sequential version
• Is slower than the sequential version on 1 CPU

Page 14: The Distributed Data Interface in GAMESS

MP2 Scalability

[Figure: MP2 Energy and Gradient Speedup Curves. X axis: number of CPUs (1 to 16); Y axis: speedup over a scaled 1-node timing (1 to 16). Series: Ideal; Gigabit Ethernet; Gigabit Ethernet, 1 CPU/box; Fast Ethernet; Fast Ethernet, 1 CPU/box.]

Page 15: The Distributed Data Interface in GAMESS

Conclusions

DDI provides a scalable way of taking advantage of the global memory of a parallel system.

The new MP2 code demonstrates that code can be written specifically for parallel execution without replacing the sequential version.

Page 16: The Distributed Data Interface in GAMESS

Future Work

DDI needs further work to enhance its features and increase robustness, or possibly needs to be replaced with a more general library such as the GA tools from PNNL.

The global shared memory approach is being applied to many other parts of GAMESS to increase scalability.

Page 17: The Distributed Data Interface in GAMESS

Thanks!

David Halstead
Guy Helmer

For $:
• IBM Corp. for an SUR grant (of 15 workstations)
• DOE MICS program (interconnects and 7 workstations)
• Air Force OSR (long-term development funding)
• DOD CHSSI program (improved parallelization)