Post on 08-Jan-2016

Page 1: Multiresolution Adaptive Numerical Scientific Simulation
Page 2

Multiresolution Adaptive Numerical Scientific Simulation

Ariana Beste (1), George I. Fann (1), Robert J. Harrison (1,2), Rebecca Hartman-Baker (1), Shinichiro Sugiki (1)

(1) Oak Ridge National Laboratory
(2) University of Tennessee, Knoxville

In collaboration with

Gregory Beylkin (4), Fernando Perez (4), Lucas Monzon (4), Martin Mohlenkamp (5) and others

(4) University of Colorado
(5) Ohio University

[email protected]

Page 3

The DOE funding

• This work is funded by the U.S. Department of Energy, the division of Basic Energy Science, Office of Science, under contract DE-AC05-00OR22725 with Oak Ridge National Laboratory. This research was performed in part using
– resources of the National Energy Scientific Computing Center, which is supported by the Office of Energy Research of the U.S. Department of Energy under contract DE-AC03-76SF0098,
– and the Center for Computational Sciences at Oak Ridge National Laboratory under contract DE-AC05-00OR22725.

Page 4

Outline

• Multiresolution basics

• Parallel decomposition and tools

• Underlying representation

• Application characteristics

• Current storage strategy

Page 5

Molecular Science Software Project

EMSL / PNNL

Gary Black, Brett Didier, Todd Elsenthagen, Sue Havre, Carina Lansing, Bruce Palmer, Karen Schuchardt, Lisong Sun, Erich Vorpagel

PNNL: Yuri Alexeev, Eric Bylaska, Bert deJong, Mahin Hackler, Karol Kowalski, Lisa Pollack, Tjerk Straatsma, Marat Valiev

ORNL: Edo Apra, Robert Harrison, Vincent Meunier

Ames: Ricky Kendall, TL Windus

Manoj Krishnan, Jarek Nieplocha, Bruce Palmer, Vinod Tipparaju

http://www.emsl.pnl.gov/docs/nwchem/nwchem.html

Page 6

Computational Chemistry Endstation

International collaboration spanning 8 universities and 5 national labs

• Led out of UT/ORNL

• Focus

– Actinides, Aerosols, Catalysis

• ORNL Cray XT3, ANL BG/L

[Plot: time (s), 0.001 to 10, versus number of processors P, 10 to 10000; curves: compress, reconstruct, mult, square, diff]

Capabilities:

• Chemically accurate thermochemistry

• Many-body methods required

• Mixed QM/QM/MM dynamics

• Accurate free-energy integration

• Simulation of extended interfaces

• Families of relativistic methods

Scaling of MADNESS 64-4096 cpu on XT3

NWChem: Largest CCSD(T) calculation
- Pollack, EMSL, 2005
- 1960 processor Itanium2 cluster
- 1468 basis functions (aug-cc-pVQZ)
- Perturbative triples (T)
- 23 hours on 1400 processors
- 75% of peak = 6.3 TFlops

Page 7

Multiresolution chemistry objectives

• Complete elimination of the basis error
– One-electron models (e.g., HF, DFT)
– Pair models (e.g., MP2, CCSD, …)

• Correct scaling of cost with system size

• General approach
– Readily accessible by students and researchers
– Higher level of composition
– Direct computation of chemical energy differences

• New computational approaches

• Fast algorithms with guaranteed precision

Page 8

How to “think” multiresolution

• Consider a ladder of function spaces

– E.g., increasing quality atomic basis sets, or finer resolution grids, …

• Telescoping series

– Instead of using the most accurate representation, use the difference between successive approximations

– Representation on V0 small/dense; differences sparse

– Computationally efficient; many possible insights

$$V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_n$$

$$V_n = V_0 + (V_1 - V_0) + (V_2 - V_1) + \cdots + (V_n - V_{n-1})$$
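The telescoping idea can be checked numerically. The following is a minimal sketch, assuming piecewise-constant approximations on dyadic grids over [0, 1] play the role of the ladder V_0, V_1, …, V_n (the slides do not fix a basis at this point; `project`, `evaluate` and the test function `f` are illustrative names only).

```python
import math

def project(f, level, samples=64):
    """Piecewise-constant approximation of f on [0, 1] with 2**level boxes.
    Each box value is the mean of f over midpoint samples inside the box."""
    boxes = 2 ** level
    width = 1.0 / boxes
    values = []
    for b in range(boxes):
        total = sum(f((b + (s + 0.5) / samples) * width) for s in range(samples))
        values.append(total / samples)
    return values

def evaluate(values, x):
    """Evaluate a piecewise-constant approximation at x in [0, 1]."""
    return values[min(int(x * len(values)), len(values) - 1)]

def f(x):
    # Illustrative test function: a narrow Gaussian bump.
    return math.exp(-30.0 * (x - 0.3) ** 2)

n = 6
ladder = [project(f, level) for level in range(n + 1)]  # approximations in V_0 ... V_n
x = 0.37
coarse = evaluate(ladder[0], x)
# Differences between successive approximations, evaluated at x.
differences = [evaluate(ladder[k], x) - evaluate(ladder[k - 1], x)
               for k in range(1, n + 1)]
telescoped = coarse + sum(differences)
finest = evaluate(ladder[n], x)
```

Here `telescoped` reproduces `finest` up to rounding, and the later entries of `differences` are small where the function is smooth, which is what makes storing differences attractive.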

Page 9
Page 10

High-level composition using functions and operators

• Conventional quant. chem. uses explicitly indexed sparse arrays of matrix elements
– Complex, tedious and error prone

• Python classes for Function and Operator
– in 1,2,3,6 and general dimensions
– wide range of operations

Hpsi = -0.5*Delsq*psi+ V*psi

J = Coulomb.apply(rho)

• All with guaranteed speed and precision

$$H\psi = -\tfrac{1}{2}\nabla^2\psi + V\psi$$

$$J(r) = G * \rho = \int \frac{\rho(s)}{|r - s|}\,ds$$
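The composition style of `Hpsi = -0.5*Delsq*psi + V*psi` can be imitated with operator overloading. The sketch below is a toy stand-in on a uniform periodic 1-D grid, not the MADNESS Python API: the classes `GridFunction` and `Laplacian` are hypothetical, and a finite-difference stencil replaces the multiresolution machinery.

```python
import math

class GridFunction:
    """Toy stand-in for a Function: samples on a uniform periodic 1-D grid."""
    def __init__(self, values, h):
        self.values = list(values)
        self.h = h

    def __add__(self, other):
        return GridFunction([a + b for a, b in zip(self.values, other.values)], self.h)

    def __mul__(self, other):
        if isinstance(other, GridFunction):   # pointwise product, e.g. V*psi
            return GridFunction([a * b for a, b in zip(self.values, other.values)], self.h)
        return GridFunction([other * a for a in self.values], self.h)  # function * scalar

    __rmul__ = __mul__                        # scalar * function

    def inner(self, other):
        """Grid approximation of the integral of self * other."""
        return self.h * sum(a * b for a, b in zip(self.values, other.values))

class Laplacian:
    """Toy stand-in for an Operator: second-difference stencil times a scale factor."""
    def __init__(self, scale=1.0):
        self.scale = scale

    def __rmul__(self, scalar):               # scalar * operator
        return Laplacian(self.scale * scalar)

    def __mul__(self, f):                     # operator * function applies the operator
        v, n, h = f.values, len(f.values), f.h
        return GridFunction(
            [self.scale * (v[(i + 1) % n] - 2.0 * v[i] + v[i - 1]) / h ** 2 for i in range(n)],
            h)

n = 256
h = 2.0 * math.pi / n
psi = GridFunction([math.sin(i * h) for i in range(n)], h)
V = GridFunction([0.5] * n, h)                # constant potential, chosen for illustration
Delsq = Laplacian()

Hpsi = -0.5 * Delsq * psi + V * psi           # same form as the line on the slide
energy = psi.inner(Hpsi) / psi.inner(psi)
```

For psi = sin(x) and V = 0.5 the exact expectation value is 1, and `energy` agrees with it to within the finite-difference error of the stencil.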

Page 11

New MADNESS solver

• Total rewrite in C++
– Three levels of parallelism targeting massively parallel computers using multi-processor nodes
– In anticipation of highly-threaded processors
– Ideally targets low latency AM+MPI+threads
– Portable implementation polling+MPI+threads

• Core math functionality is now running
– 3D functions, real and complex (1-6D functions will be added this FY)
– Scaling demonstrated up to 4096 processors; designed for 100+K

Page 12

1-D Example Sub-Tree Parallelism

[Figure: 1-D tree diagram, labeled 0 to 6]

Both sub-trees can be done in parallel. In 3-D, nodes split into 8 children … in 6-D there are 64 children.
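The child counts quoted above follow from halving a box along each of its d axes. A minimal sketch, assuming boxes are addressed by a hypothetical key `(level, index)` with `index` a d-tuple of integer translations; the thread pool only illustrates that sibling sub-trees are independent and is not meant as a performance model.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def children(level, index):
    """Keys of the 2**d children of box (level, index); index is a d-tuple."""
    return [(level + 1, tuple(2 * l + b for l, b in zip(index, bits)))
            for bits in product((0, 1), repeat=len(index))]

def count_boxes(key, max_level):
    """Number of boxes in the uniformly refined sub-tree rooted at key."""
    level, index = key
    if level == max_level:
        return 1
    return 1 + sum(count_boxes(child, max_level) for child in children(level, index))

def count_boxes_parallel(root, max_level):
    """Same count, with each child sub-tree of the root handled as a separate job."""
    level, index = root
    if level == max_level:
        return 1
    with ThreadPoolExecutor() as pool:
        partial = pool.map(lambda child: count_boxes(child, max_level),
                           children(level, index))
        return 1 + sum(partial)

n_children_1d = len(children(0, (0,)))
n_children_3d = len(children(0, (0, 0, 0)))
n_children_6d = len(children(0, (0,) * 6))
boxes_1d = count_boxes_parallel((0, (0,)), 6)
```

In 1-D the root has two sub-trees, which is the case drawn on the slide; the same function yields 8 children in 3-D and 64 in 6-D.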

Page 13

Distributed-memory Cilk-like model

Parameter: MPI rank, probe(), set(), get()

Compress(tree, result):
    Parameter left, right
    if (tree.left) Compress(tree.left, left)
    if (tree.right) Compress(tree.right, right)
    AddTask(Op, left, right, result)

WaitTasks()

Task: input parameters, output parameters, probe(), run()

Benefits:
• Most receives pre-posted, greatly increasing scalability
• Communication latency & transfer time largely hidden
• Much simpler composition than explicit message passing
• Positions code to use “intelligent” runtimes with work stealing
• Positions code for efficient use of multi-core chips

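Page 14

Essential techniques for fast computation

• Multiresolution

$$V_0 \subset V_1 \subset \cdots \subset V_n$$

$$V_n = V_0 + (V_1 - V_0) + \cdots + (V_n - V_{n-1})$$

• Low-separation rank

$$f(x_1, \ldots, x_d) = \sum_{l=1}^{M} \sigma_l \prod_{i=1}^{d} f_i^{(l)}(x_i) + O(\epsilon)$$

$$\| f_i^{(l)} \|_2 = 1, \qquad \sigma_l > 0$$

• Low-operator rank

$$A = \sum_{\mu=1}^{r} u_\mu \sigma_\mu v_\mu^T + O(\epsilon)$$

$$\sigma_\mu > 0, \qquad v_\mu^T v_\lambda = u_\mu^T u_\lambda = \delta_{\mu\lambda}$$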

Page 15

Separated representations

• Key to computing in higher dimensions
– Analogs of SVD exploit low operator rank
– Generalized form exploits other operator properties
– E.g., these all have full operator rank but low-separation rank constructions exist
  • Identity operator
  • Green’s functions of many PDEs (Poisson, Helmholtz)
  • All-electron Schrödinger Hamiltonian
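The identity operator is the simplest case of full operator rank with low separation rank. In a minimal sketch, assuming small dense matrices stored as nested lists (the helper names `kron` and `identity` are illustrative), the identity on a 2 x 2 x 2 product grid is a single Kronecker product of three 2 x 2 identities, so one separated term reproduces an operator of matrix rank 8.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]

# One separated term: a product of three one-dimensional operators.
one_term = kron(kron(identity(2), identity(2)), identity(2))

# The full three-dimensional identity on the 2 x 2 x 2 product grid.
full = identity(8)

separation_rank = 1                  # number of product terms used above
matrix_rank = len(full)              # the identity has full rank, here 8
```

Storing the single term needs three 2 x 2 factors instead of one 8 x 8 matrix, and the gap widens quickly with dimension and grid size.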

Page 16

[Figure: the (x, y) plane partitioned into boxes labeled |x-y|, x-y, or y-x]

In 3D, ideally must be one box removed from the diagonal

Diagonal box has full rank

Boxes touching diagonal (face, edge, or corner) have increasingly low rank

Away from diagonal r = O(-log ε)

$$|x - y| \approx \sum_{\mu=1}^{r} f_\mu(x)\, g_\mu(y)$$

r = separation rank
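One well-known way to obtain a separated form for a Green's function such as the Poisson kernel 1/r is to write it as a sum of Gaussians, since each Gaussian in r factorizes over x, y and z. The sketch below is an illustration under stated assumptions, not a statement of what the deck's code does: it applies the trapezoidal rule to the integral identity 1/r = (2/√π) ∫ exp(-r² e^{2s} + s) ds, and the limits `s_lo`, `s_hi` and step `h` are hand-picked for roughly 0.1 ≤ r ≤ 10.

```python
import math

def gaussian_sum_for_inverse_r(s_lo=-22.0, s_hi=6.0, h=0.2):
    """Return (weights, exponents) with 1/r ~= sum_k w_k * exp(-t_k * r**2).

    Obtained from the trapezoidal rule in s applied to
        1/r = (2 / sqrt(pi)) * integral of exp(-r**2 * exp(2*s) + s) ds.
    """
    count = int(round((s_hi - s_lo) / h)) + 1
    points = [s_lo + k * h for k in range(count)]
    weights = [2.0 / math.sqrt(math.pi) * h * math.exp(s) for s in points]
    exponents = [math.exp(2.0 * s) for s in points]
    return weights, exponents

def inverse_r(x, y, z, weights, exponents):
    """Evaluate the separated form: every term is a product of three 1-D Gaussians."""
    return sum(w * math.exp(-t * x * x) * math.exp(-t * y * y) * math.exp(-t * z * z)
               for w, t in zip(weights, exponents))

weights, exponents = gaussian_sum_for_inverse_r()
approx = inverse_r(1.0, 2.0, 2.0, weights, exponents)   # r = 3
relative_error = abs(approx * 3.0 - 1.0)
n_terms = len(weights)
```

The number of terms plays the role of the separation rank; tightening the target precision or widening the range of r requires more terms, with only slow growth.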

Page 17

Molecular electronic Schrödinger equation

• A 3-N dimensional, non-separable, second-order differential equation

$$H\,\Psi(r_1, r_2, \ldots, r_n) = E\,\Psi(r_1, r_2, \ldots, r_n)$$

$$H = -\tfrac{1}{2}\sum_{i=1,n} \nabla_i^2 \;-\; \sum_{i=1,n}\,\sum_{\mu=1,N} \frac{Z_\mu}{|r_i - r_\mu|} \;+\; \sum_{i=1,n}\,\sum_{j=1,i-1} \frac{1}{|r_i - r_j|}$$

Page 18

Dynamics of fundamental few electron systems (Krstic and Harrison)

• Electron+atom/molecule scattering

• Molecules in intense radiation field

• Challenges
– Scattering – highly oscillatory states
– Dissociation – continuum states
– Quantum treatment of light nuclei
– Rydberg states – very large volumes

• In principle, adaptive multiresolution techniques are ideal
– Single basis treats bound and continuum states on equal footing
– Long time steps possible via integral operator for time evolution
– Separated representations provide path to higher dimensions

• Waiting for new production code before we can efficiently apply the free-particle propagator for an implicit scheme (integral kernel is exp(-ix²/2t))

• Need a more strongly band limited basis?

• Want to do this in at least 5-9D, 12D being considered

Page 19

“Independent” particle models

• Atomic and molecular orbitals
– Each electron feels the mean field of all other electrons (self-consistent field, Hartree-Fock)
– Replaces linear 3N-D Schrödinger with non-linear 3-D eigen-problem
– Provides the structure of the periodic table and the chemical bond
– Linear combination of atomic orbitals - LCAO
– E.g., molecular orbitals for water, H2O

[Figure: molecular orbitals of water, labeled with the values -20.44, -1.31, -0.67, -0.53, -0.48]

Page 20

Density functional theory (DFT)

• Hohenberg-Kohn theorem
– The energy is a functional of the density (3D)

• Kohn-Sham
– Practical approach to DFT, parameterizing the density with orbitals (easier treatment of kinetic energy)
– Very similar computationally to Hartree-Fock, but potentially exact

$$\left(-\tfrac{1}{2}\nabla^2 + V_{coul}(r;\rho) + V_{xc}(r;\rho) + V_{ext}(r)\right)\phi_i(r) = \epsilon_i\,\phi_i(r)$$

$$\rho(r) = \sum_i \phi_i^2(r)$$

Page 21

Reduced scaling method

• Eigen-functions (canonical orbitals) can be delocalized
– Limits to O(VN) data and O(VN²) compute

• Solve instead for localized orbitals that span the same space
– Limits to O(N ln V) data and compute
– Multiresolution representation makes this easy
– Remaining linear algebra has small pre-factor and is sparse

Page 22

Current I/O Strategy

• Looked seriously at HDF and Phil’s API
– Substantial effort for adoption; HDF perf. questions
– Substantial benefits from interoperability
– Short-term driver is checkpoint/restart

• Tunable subset of nodes doing I/O
– Currently nodes at a level in tree (in 3D: 1, 8, 64, …)
– Collect data from other nodes
– Serialize to disk in either binary or text (XML)

• Already want interfaces to viz. tools

• Starting to consider interface to external solvers
– Sundance, PETSc, …
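The "nodes at a level in tree" strategy can be sketched as a mapping from every box to its ancestor at a chosen I/O level, so that level 0, 1, 2 in 3D gives at most 1, 8, 64 writers. Everything below is hypothetical: the key layout `(level, (lx, ly, lz))`, the function names, the XML element names, and the rule that boxes coarser than the I/O level go to their lowest-corner descendant are assumptions made for illustration, not the MADNESS file format.

```python
import xml.etree.ElementTree as ET

def io_owner(key, io_level):
    """Map a tree box (level, (lx, ly, lz)) to the box at io_level that writes it."""
    level, index = key
    if level >= io_level:
        shift = level - io_level
        return (io_level, tuple(l >> shift for l in index))   # ancestor at io_level
    shift = io_level - level
    # Assumption: a box coarser than io_level is written by its lowest-corner descendant.
    return (io_level, tuple(l << shift for l in index))

def group_by_writer(coeffs, io_level):
    """Collect the data of all boxes under the writer responsible for them."""
    groups = {}
    for key, data in coeffs.items():
        groups.setdefault(io_owner(key, io_level), {})[key] = data
    return groups

def to_xml(owner, boxes):
    """Serialize one writer's boxes as text (XML)."""
    root = ET.Element("writer", level=str(owner[0]),
                      index=" ".join(str(l) for l in owner[1]))
    for (level, index), data in sorted(boxes.items()):
        box = ET.SubElement(root, "box", level=str(level),
                            index=" ".join(str(l) for l in index))
        box.text = " ".join(repr(x) for x in data)
    return ET.tostring(root, encoding="unicode")

# Made-up coefficients for three boxes of a 3-D tree.
coeffs = {
    (0, (0, 0, 0)): [1.0],
    (2, (3, 0, 1)): [2.0, 3.0],
    (2, (0, 0, 0)): [4.0],
}
groups = group_by_writer(coeffs, io_level=1)
documents = {owner: to_xml(owner, boxes) for owner, boxes in groups.items()}
```

Raising `io_level` spreads the data over more writers, which is the tunable knob described on the slide.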

Page 23

Summary of MADNESS data

• Discontinuous spectral element
– Legendre polynomials, or
– Approximate prolate spheroidal functions

• Structured, deeply-refined, adaptive mesh

• In higher dimensions
– Separated representations in most elements

• Mix of data types
– Float, double, float-complex, double-complex

• 100s to 10Ks of distinct functions in 3D
– 10s of GB to 10s of TB of data

• Few functions in 6+D
– 100s of GB to 10s of TB