Page 1: The Cactus Code:  A Problem Solving Environment for the Grid

The Cactus Code: A Problem Solving Environment for the Grid

Gabrielle Allen, Gerd Lanfermann

Max Planck Institute for Gravitational Physics

Page 2: The Cactus Code:  A Problem Solving Environment for the Grid

What is Cactus?

Cactus is a freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multidimensional simulations

Page 3: The Cactus Code:  A Problem Solving Environment for the Grid

Cactus

Core “Flesh” (ANSI C): parameters, grid variables, error handling, scheduling, extensible APIs, make system

Plug-In “Thorns” (modules; Fortran/C/C++, Java/Perl/Python): driver, input/output, interpolation, SOR solver, coordinates, boundary conditions, black holes, equations of state, remote steering, wave evolvers, multigrid

Page 4: The Cactus Code:  A Problem Solving Environment for the Grid

Cactus Architecture

[Diagram labels: Cactus; Flesh; Thorns; Computational Toolkit, Toolkit, Toolkit; Configure, CST, Make; Operating Systems: AIX, NT, Linux, Unicos, Solaris, HP-UX, SuperUX, Irix, OSF]

Page 5: The Cactus Code:  A Problem Solving Environment for the Grid

Thorn Architecture

[Diagram labels: Thorn; Configuration Files; Source Code (Fortran Routines, C Routines, C++ Routines); Make Information; Parameter Files and Testsuites; Documentation!]

Page 6: The Cactus Code:  A Problem Solving Environment for the Grid

State of the Art

Numerical Relativity Simulations

Albert Einstein Institute, Washington University

Viz: Werner Benger

Page 7: The Cactus Code:  A Problem Solving Environment for the Grid

Current Version Cactus 4.0

– Cactus 4.0 beta 1 released September 1999
– Flesh and many thorns distributed under GNU GPL
– Currently: Cactus 4.0 beta 8
– Supported Architectures: SGI Origin, SGI 32/64, Cray T3E, Dec Alpha, Intel Linux IA32/IA64, Windows NT, HP Exemplar, IBM SP2, Sun Solaris, Hitachi SR8000-F, NEC SX-5, Mac Linux

Page 8: The Cactus Code:  A Problem Solving Environment for the Grid

Flesh API

Abstract Flesh API for
– Driver functions (storage, communication)
– Interpolation
– Reduction
– IO, checkpointing
– Coordinates
– etc, etc

In general, thorns overload or register their capabilities with the Flesh, agreeing to provide a function with the correct interface

e.g. CCTK_SyncGroup (overloaded)

e.g. CCTK_OutputVar(“variable”,“IOASCII”) (registered)
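
The difference between overloading and registering can be sketched in a few lines of Python. This is only an illustration of the pattern, not the Cactus C API: the `Flesh` class and its method names are hypothetical, and only the two capability names echo the slide (`CCTK_SyncGroup`, `CCTK_OutputVar` with the `IOASCII` method).

```python
from typing import Callable, Dict


class Flesh:
    """Toy stand-in for the Flesh: it owns the API names, thorns supply the code."""

    def __init__(self) -> None:
        self._overloaded: Dict[str, Callable[..., int]] = {}
        self._io_methods: Dict[str, Callable[[str], int]] = {}

    def overload(self, name: str, func: Callable[..., int]) -> None:
        # Overloading: exactly one thorn supplies the function behind a name.
        if name in self._overloaded:
            raise RuntimeError(f"{name} is already overloaded")
        self._overloaded[name] = func

    def register_io_method(self, method: str, func: Callable[[str], int]) -> None:
        # Registering: several thorns may each add their own named capability.
        self._io_methods[method] = func

    def sync_group(self, group: str) -> int:
        # The caller never sees which thorn (driver) does the work.
        return self._overloaded["SyncGroup"](group)

    def output_var(self, variable: str, method: str) -> int:
        if method not in self._io_methods:
            return -1  # no thorn registered this IO method
        return self._io_methods[method](variable)


log = []
flesh = Flesh()
# A driver thorn overloads synchronisation; an IO thorn registers "IOASCII".
flesh.overload("SyncGroup", lambda group: log.append(("sync", group)) or 0)
flesh.register_io_method("IOASCII", lambda var: log.append(("ascii", var)) or 0)

flesh.sync_group("scalarevolve")
flesh.output_var("phi", "IOASCII")
```

Application code calls only `sync_group` and `output_var`; swapping the driver or adding another IO method changes nothing on the calling side, which is the point of the abstract API.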

Page 9: The Cactus Code:  A Problem Solving Environment for the Grid

Application View

[Diagram labels: Flesh; CCTK_(…); CST; Application Toolkit; Computational Toolkit]

Page 10: The Cactus Code:  A Problem Solving Environment for the Grid

Parallelism in Cactus

Cactus is designed around a distributed memory model. Each thorn is passed a section of the global grid.

The actual parallel driver (implemented in a thorn) can use whatever method it likes to decompose the grid across processors and exchange ghost zone information - each thorn is presented with a standard interface, independent of the driver.

Standard driver distributed with Cactus (PUGH) is for a parallel unigrid and uses MPI for the communication layer

PUGH can do custom processor decomposition and static load balancing
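
As a rough illustration of what a unigrid driver has to do, the sketch below splits a 1D grid across processors and works out which points each one stores once ghost zones are included. It is a simplified assumption, not PUGH's actual algorithm: the function names are hypothetical, the grid is 1D, and the "exchange" is a plain copy standing in for MPI communication.

```python
from typing import List, Tuple

Block = Tuple[int, int, int, int]


def decompose(n_points: int, n_procs: int, ghost: int = 1) -> List[Block]:
    """Split a 1D grid of n_points into n_procs contiguous blocks.

    Returns, per processor, (own_start, own_end, local_start, local_end) as
    half-open index ranges: the points the processor owns, and the larger
    range it stores once ghost zones from its neighbours are added.
    """
    base, extra = divmod(n_points, n_procs)
    blocks = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)   # spread the remainder evenly
        end = start + size
        local_start = max(0, start - ghost)         # no ghost zones beyond the
        local_end = min(n_points, end + ghost)      # physical boundary
        blocks.append((start, end, local_start, local_end))
        start = end
    return blocks


def exchange_ghosts(global_grid: List[float], block: Block) -> List[float]:
    """Stand-in for an MPI ghost-zone exchange: copy the stored range."""
    _, _, local_start, local_end = block
    return global_grid[local_start:local_end]


blocks = decompose(n_points=10, n_procs=3, ghost=1)
local_data = exchange_ghosts([float(i) for i in range(10)], blocks[1])
```

A thorn working on `local_data` only needs the local array and its ranges; how the blocks were chosen and how the ghost points arrived stays inside the driver.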

Page 11: The Cactus Code:  A Problem Solving Environment for the Grid

Configuration files

Each thorn provides 3 configuration files, detailing its interface with the Flesh and with other thorns

CCL: Cactus Configuration Language

interface.ccl: implementation, this thorn’s variables and variables used from other thorns

param.ccl: this thorn’s parameters, parameters used and extended from other thorns

schedule.ccl: when and how this thorn’s routines should be executed, optionally with respect to routines from other thorns
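
The ordering side of scheduling can be pictured as a dependency sort. The sketch below is an analogy in Python, not CCL syntax and not how the Flesh scheduler is implemented; the routine names are hypothetical and each entry simply lists what must run before it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical routines from several thorns. Each key maps to the set of
# routines that must have run before it; none of these names come from a
# real schedule.ccl file.
must_run_after = {
    "set_initial_data": set(),
    "evolve_wave": {"set_initial_data"},
    "apply_boundaries": {"evolve_wave"},
    "output_fields": {"apply_boundaries"},
}

# static_order() yields every routine after all of its prerequisites.
order = list(TopologicalSorter(must_run_after).static_order())
```

Each thorn states only its own constraints; the combined execution order falls out of the sort, so thorns can be added or removed without editing one another's files.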

Page 12: The Cactus Code:  A Problem Solving Environment for the Grid

Cactus Computational Toolkit

CactusBase: Boundary, IOUtil, IOBasic, CartGrid3D, IOASCII, Time

CactusBench: BenchADM

CactusExample: WaveToy1DF77, WaveToy2DF77

CactusElliptic: EllBase, EllPETSc, EllSOR, EllTest

CactusPUGH: Interp, PUGH, PUGHSlab

CactusPUGHIO: IOFlexIO, IOHDF5, IsoSurfacer

CactusTest: TestArrays, TestCoordinates, TestInclude1, TestInclude2, TestComplex, TestInterp

CactusWave: IDScalarWave, IDScalarWaveC, IDScalarWaveCXX, WaveBinarySource, WaveToyC, WaveToyCXX, WaveToyF77, WaveToyF90, WaveToyFreeF90

external: IEEEIO, RemoteIO, TCPXX

Page 13: The Cactus Code:  A Problem Solving Environment for the Grid

Cactus can make use of ...

Autopilot

FlexIO (IEEEIO/HDF5)

Globus

GrACE

HDF5

MPI

Panda IO

PAPI

PETSc

Page 14: The Cactus Code:  A Problem Solving Environment for the Grid

AutoPilot

Dynamic performance instrumentation, on-the-fly performance data reduction, resource management algorithms, real-time adaptive control mechanism

Cactus provides a mechanism to register timers, and Autopilot is currently being integrated.

http://www-pablo.cs.uiuc.edu/Project/Autopilot/AutopilotOverview.htm

http://www.cactuscode.org/Documentation/HOWTO/Performance-HOWTO

http://www.cactuscode.org/Projects.html

Page 15: The Cactus Code:  A Problem Solving Environment for the Grid

FlexIO (IEEEIO)

FlexIO is a compact multi-platform API for storing multidimensional scientific data. It hides the differences between underlying file formats including HDF5 and IEEEIO.

IEEEIO is a compact library for storing multidimensional scientific data in a binary format that can be transported between different computer systems.

IEEEIO readers for: Amira, AVS, IDL, LCA Vision, NAG Explorer

Cactus thorn CactusPUGHIO/IOFlexIO outputs multidimensional data using the IEEEIO library.

http://zeus.ncsa.uiuc.edu/~jshalf/FlexIO/

http://zeus.ncsa.uiuc.edu/~jshalf/FlexIO/IEEEIO.html

http://www.cactuscode.org/Documentation/HOWTO/Visualization-HOWTO

Documentation in thorns CactusBase/IOUtil and CactusPUGHIO/IEEEIO

Page 16: The Cactus Code:  A Problem Solving Environment for the Grid

Globus Toolkit

Globus Toolkit: Enables application of Grid concepts to scientific and engineering computing

Cactus (with the default MPI driver) compiles with Globus (1.0/1.1), using MPICH-G.

Cactus can then be run using RSL scripts as usual with Globus

http://www.globus.org/

http://www.cactuscode.org/Documentation/HOWTO/Globus-HOWTO

http://jean-luc.aei-potsdam.mpg.de/SC98/

The Grid: Dependable, consistent, pervasive access to [high-end] resources
– Collaborative engineering
– Browsing of remote datasets
– Use of remote software
– Data-intensive computing
– Very large-scale simulation
– Large-scale parameter studies

Page 17: The Cactus Code:  A Problem Solving Environment for the Grid

HDF5

Hierarchical data format for scientific data management (I/O libraries and tools). Future standard, overcomes limitations of HDF4. Simple but powerful model, includes hyperslabs, datatype conversion, parallel IO.

Used for 2D/3D output in Computational Toolkit (CactusPUGHIO/IOHDF5)

Much development in (remote) visualization and steering with Cactus uses HDF5

Readers for Amira, OpenDX, (LCA Vision).

http://hdf.ncsa.uiuc.edu/HDF5/

http://www.CactusCode.org/Documentation/UsersGuide_html/node15.html

http://www.cactuscode.org/Documentation/HOWTO/Visualization-HOWTO

Documentation in thorns CactusBase/IOUtil and CactusPUGHIO/IOHDF5

Page 18: The Cactus Code:  A Problem Solving Environment for the Grid

Panda IO

Data management techniques for I/O intensive applications in high-performance scientific computing.

Simpler, more abstract interfaces, efficient layout alternatives for multidimensional arrays, high performance array I/O operations.

Thorn IOPanda

http://cdr.cs.uiuc.edu/panda/

http://www.cactuscode.org/Workshops/NCSA99/talk13/sld003.htm

Page 19: The Cactus Code:  A Problem Solving Environment for the Grid

PAPI

Standard API for accessing the hardware performance counters on most microprocessors.

Useful for tuning, optimisation, debugging, benchmarking, etc.

Java GUI available for monitoring the metrics

Cactus thorn CactusPerformance/PAPI

http://icl.cs.utk.edu/projects/papi/

http://www.cactuscode.org/Documentation/HOWTO/Performance-HOWTO

http://www.cactuscode.org/Projects.html

Page 20: The Cactus Code:  A Problem Solving Environment for the Grid

Grid-Enabled Cactus

Cactus and its ancestor codes have been using Grid infrastructure since 1993

Support for Grid computing was part of the design requirements for Cactus 4.0 (experiences with Cactus 3)

Cactus compiles “out-of-the-box” with Globus [using globus device of MPICH-G(2)]

Design of Cactus means that applications are unaware of the underlying machine/s that the simulation is running on … applications become trivially Grid-enabled

Infrastructure thorns (I/O, driver layers) can be enhanced to make most effective use of the underlying Grid architecture

Page 21: The Cactus Code:  A Problem Solving Environment for the Grid

Grid Experiments

SC93: remote CM-5 simulation with live viz in CAVE

SC95: Heroic I-Way experiments lead to development of Globus. Cornell SP-2, Power Challenge, with live viz in San Diego CAVE

SC97: Garching 512 node T3E, launched, controlled, visualized in San Jose

SC98: HPC Challenge. SDSC, ZIB, and Garching T3E compute collision of 2 Neutron Stars, controlled from Orlando

SC99: Colliding Black Holes using Garching, ZIB T3E’s, with remote collaborative interaction and viz at ANL and NCSA booths

2000: Single simulation LANL, NCSA, NERSC, SDSC, ZIB, Garching, … Dynamic distributed computing … spawning new simulations

Page 22: The Cactus Code:  A Problem Solving Environment for the Grid

Cactus + Globus

Cactus Application Thorns: Distribution information hidden from programmer
– Initial data, Evolution, Analysis, etc

Grid Aware Application Thorns: Drivers for parallelism, IO, communication, data mapping
– PUGH: parallelism via MPI (MPICH-G2, grid enabled message passing library)

Grid Enabled Communication Library
– MPICH-G2 implementation of MPI, can run MPI programs across heterogeneous computing resources

Standard MPI

SingleProc

Page 23: The Cactus Code:  A Problem Solving Environment for the Grid

Grand Picture

– Remote steering and monitoring from airport
– Origin: NCSA
– Remote Viz in St Louis
– T3E: Garching
– Simulations launched from Cactus Portal
– Grid enabled Cactus runs on distributed machines
– Remote Viz and steering from Berlin
– Viz of data from previous simulations in SF café
– DataGrid/DPSS, Downsampling
– Globus
– http
– HDF5
– IsoSurfaces

Page 24: The Cactus Code:  A Problem Solving Environment for the Grid

Grid Related Projects

ASC: Astrophysics Simulation Collaboratory
– NSF Funded (WashU, Rutgers, Argonne, U. Chicago, NCSA)
– Collaboratory tools, Cactus Portal
– Currently setting up testbed (Globus, Cactus, Portal at NCSA, ZIB, AEI)

E-Grid: European Grid Forum
– Members from academic and government institutions, computer centers and industry
– Test application: Cactus+Globus
– Currently working towards distributed computing project for SC2000 (spawning Cactus jobs to new machines)

GrADs: Grid Application Development Software
– NSF Funded (Rice, NCSA, U. Illinois, UCSD, U. Chicago, U. Indiana...)
– Application driver for grid software

Page 25: The Cactus Code:  A Problem Solving Environment for the Grid

Grid Related Projects (2)

Grid Forum
– Experiments Transparency appearances

Distributed Runs
– AEI, Argonne, U. Chicago
– Working towards running on several computers, 1000’s of processors (different processors, memories, OSs, resource management, varied networks, bandwidths and latencies)

TIKSL
– German DFN funded: AEI, ZIB, Garching
– Remote online and offline visualization, remote steering/monitoring

Cactus Team
– Dynamic distributed computing …
– Testing of alternative communication protocols … MPI, PVM, SHMEM, pthreads, OpenMP, Corba, RDMA, ...

Page 26: The Cactus Code:  A Problem Solving Environment for the Grid

Dynamic Distributed Computing

Make use of
– Running with management tools such as Condor, Entropia, etc.
– Scripting thorns (management, launching new jobs, etc)
– Dynamic use of MDS for finding available resources

Applications
– Portal for simulation launching and management
– Intelligent parameter surveys (Cactus control thorn)
– Spawning off independent jobs to new machines e.g. analysis tasks
– Dynamic staging … seeking out and moving to faster/larger/cheaper machines as they become available (Cactus worm)
– Dynamic load balancing (e.g. inhomogeneous loads, multiple grids)

Page 27: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Visualization

[Diagram labels: IsoSurfaces and Geodesics; Contour plots (download); Grid Functions; Streaming HDF5; Amira; LCA Vision; OpenDX]

Page 28: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Visualization

Streaming data from Cactus simulation to viz client

Clients: OpenDX, Amira, LCA Vision, ...

Protocols

Proprietary:
– Isosurfaces, geodesics

HTTP:
– Parameters, xgraph data, JPEGs

Streaming HDF5:
– HDF5 provides downsampling and hyperslabbing
– all above data, and all possible HDF5 data (e.g. 2D/3D)
– two different technologies
• Streaming Virtual File Driver (I/O rerouted over network stream)
• XML-wrapper (HDF5 calls wrapped and translated into XML)
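
Downsampling and hyperslabbing both reduce how much data has to cross the network. The sketch below shows the idea on a small 2D array in plain Python. It does not call the HDF5 library; the function names are hypothetical, and the `start`/`stride`/`count` arguments are chosen to resemble the way HDF5 describes a hyperslab selection.

```python
from typing import List, Sequence


def hyperslab(data: List[List[float]], start: Sequence[int],
              stride: Sequence[int], count: Sequence[int]) -> List[List[float]]:
    """Select count[d] points along each axis d, beginning at start[d]
    and stepping by stride[d]."""
    rows = range(start[0], start[0] + stride[0] * count[0], stride[0])
    cols = range(start[1], start[1] + stride[1] * count[1], stride[1])
    return [[data[i][j] for j in cols] for i in rows]


def downsample(data: List[List[float]], factor: int) -> List[List[float]]:
    """Keep every factor-th point along both axes."""
    return [row[::factor] for row in data[::factor]]


# A 6x6 grid function whose value encodes its own indices (10*i + j).
grid = [[10 * i + j for j in range(6)] for i in range(6)]

slab = hyperslab(grid, start=(1, 2), stride=(2, 1), count=(2, 3))
coarse = downsample(grid, 2)
```

Here `slab` holds 6 of the 36 values and `coarse` holds 9, so a client that needs only a region or a coarse preview receives a fraction of the full dataset.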

Page 29: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Visualization (2)

Clients

Proprietary:
– Amira

HTTP:
– Any browser (+ xgraph helper application)

HDF5:
– Any HDF5 aware application
• h5dump
• Amira
• OpenDX
• LCA Vision (soon)

XML:
– Any XML aware application
• Perl/Tk GUI
• Future browsers (need XSL-Stylesheets)

Page 30: The Cactus Code:  A Problem Solving Environment for the Grid

OpenDX

Open source, (free), multiplatform, large active development community, easy to program

Reads HDF5 (Cactus) data from file or remotely streamed from Cactus

Simple GUI, select different hyperslabs from 3D data

Also support for streamed ASCII data from Cactus

Page 31: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Visualization - Issues

Parallel streaming
– Cactus can do this, but readers not yet available on the client side

Handling of port numbers
– clients currently have no method for finding the port number that Cactus is using for streaming
– development of external meta-data server needed (ASC/TIKSL)

Generic protocols

Data server
– Cactus should pass data to a separate server that will handle multiple clients without interfering with simulation
– TIKSL provides middleware (streaming HDF5) to implement this
– Output parameters for each client

Page 32: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Steering

[Diagram labels: Remote Viz data; XML; HTTP; HDF5; Amira; Any Viz Client]

Page 33: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Steering

Stream parameters from Cactus simulation to remote client, which changes parameters (GUI, command line, viz tool), and streams them back to Cactus where they change the state of the simulation.

Cactus has a special STEERABLE tag for parameters, indicating it makes sense to change them during a simulation, and there is support for them to be changed.

Example: IO parameters, frequency, fields

Current protocols:
– XML (HDF5) to standalone GUI
– HDF5 to viz tools (Amira)
– HTTP to Web browser (HTML forms)
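
The role of the STEERABLE tag can be sketched as a guard on parameter updates. The following Python is an assumption-laden illustration, not the Cactus parameter API: the class names, method names and parameter names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union

Value = Union[int, float, str]


@dataclass
class Parameter:
    value: Value
    steerable: bool = False  # mirrors the STEERABLE tag described above


class ParameterTable:
    def __init__(self, params: Dict[str, Parameter]) -> None:
        self._params = params

    def get(self, name: str) -> Value:
        return self._params[name].value

    def steer(self, name: str, new_value: Value) -> bool:
        """Apply a change requested by a remote client.

        Unknown or non-steerable parameters are refused and left untouched.
        """
        param = self._params.get(name)
        if param is None or not param.steerable:
            return False
        param.value = new_value
        return True


# Hypothetical parameter names, chosen only for illustration.
table = ParameterTable({
    "out_every": Parameter(10, steerable=True),   # IO frequency
    "grid_points": Parameter(128),                # fixed once the run starts
})

accepted = table.steer("out_every", 2)       # a client asks for more output
rejected = table.steer("grid_points", 256)   # refused: not steerable
```

Whatever protocol carries the request (XML, HDF5 or HTTP), the check happens in one place, so a remote client cannot change a parameter the simulation is not prepared to have changed.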

Page 34: The Cactus Code:  A Problem Solving Environment for the Grid

Thorn http

Thorn which allows simulation to act as a web server

Connect to simulation from any browser

Monitor run: parameters, basic visualization, ...

Change steerable parameters

See running example at www.CactusCode.org

Wireless remote viz, monitoring and steering
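
To make the idea concrete, here is a minimal sketch of a running loop that also answers HTTP requests about its own state. It uses only the Python standard library and is not the code of thorn http: the state dictionary, handler class and function names are hypothetical, and a real simulation would serve far more than two values.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Shared state that a running simulation would update; names are illustrative.
state = {"iteration": 0, "out_every": 10}
state_lock = threading.Lock()


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Report the current state as JSON to any client that connects.
        with state_lock:
            body = json.dumps(state).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        pass  # keep the simulation's own output free of access logs


def start_status_server(port: int = 0) -> HTTPServer:
    """Serve the state from a background thread; port 0 lets the OS pick one."""
    server = HTTPServer(("127.0.0.1", port), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_status_server()
for _ in range(3):                      # stand-in for the evolution loop
    with state_lock:
        state["iteration"] += 1

# Act as the remote client: connect and read the reported state.
conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
conn.request("GET", "/")
reported = json.loads(conn.getresponse().read())
conn.close()
server.shutdown()
server.server_close()
```

Letting the operating system choose the port avoids clashes between runs, but it also means a client has to learn the port some other way before it can connect.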

Page 35: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Steering - Issues

Same kinds of problems as remote visualization
– generic protocols
– handling of port numbers
– broadcasting of active Cactus simulations

Security
– Logins
– Who can change parameters?

Lots of issues still to resolve ...

Page 36: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Offline Visualization

[Diagram labels: Viz Client (Amira); HDF5 VFD; DataGrid (Globus); DPSS, FTP, HTTP; Visualization Client; DPSS Server; FTP Server; Web Server; Remote Data Server; Downsampling, hyperslabs; Viz in Berlin; 4TB at NCSA; Only what is needed]

Page 37: The Cactus Code:  A Problem Solving Environment for the Grid

Remote Offline Visualization

Accessing remote data for local visualization
– Should allow downsampling, hyperslabbing, etc.
– Access via DPSS is working (TIKSL)
– Waiting for DataGrid support for HTTP and FTP to remove dependency on the DPSS file systems.

Page 38: The Cactus Code:  A Problem Solving Environment for the Grid

More Information ...

Web site: www.CactusCode.org
– Users Guide
– Development projects
– HOWTOs: References for common questions, includes HOWTO-QuickStart
– Tutorials

Help desk
– [email protected]
– Bug tracking
– Mailing lists