The Cactus Code: A Problem Solving Environment for the Grid
Gabrielle Allen, Gerd Lanfermann
Max Planck Institute for Gravitational Physics
What is Cactus?

Cactus is a freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multidimensional simulations.
Cactus

Core "Flesh" (ANSI C):
- parameters
- grid variables
- error handling
- scheduling
- extensible APIs
- make system

Plug-In "Thorns" (modules; Fortran/C/C++, Java/Perl/Python):
- driver
- input/output
- interpolation
- SOR solver
- coordinates
- boundary conditions
- black holes
- equations of state
- remote steering
- wave evolvers
- multigrid
Cactus Architecture

[Diagram: Thorns and Computational Toolkits sit on top of the Cactus Flesh, assembled via the CST (configure) and Make; the whole runs on many operating systems: AIX, NT, Linux, Unicos, Solaris, HP-UX, SuperUX, Irix, OSF]
Thorn Architecture

A thorn consists of:
- Make information
- Source code (Fortran, C, and C++ routines)
- Documentation
- Configuration files
- Parameter files and testsuites
State of the Art

Numerical Relativity Simulations
Albert Einstein Institute / Washington University
Viz: Werner Benger
Current Version: Cactus 4.0

- Cactus 4.0 beta 1 released September 1999
- Flesh and many thorns distributed under GNU GPL
- Currently: Cactus 4.0 beta 8
- Supported architectures: SGI Origin, SGI 32/64, Cray T3E, Dec Alpha, Intel Linux IA32/IA64, Windows NT, HP Exemplar, IBM SP2, Sun Solaris, Hitachi SR8000-F, NEC SX-5, Mac Linux
Flesh API

- Abstract Flesh API for:
  - driver functions (storage, communication)
  - interpolation
  - reduction
  - IO, checkpointing
  - coordinates, etc.
- In general, thorns overload or register their capabilities with the Flesh, agreeing to provide a function with the correct interface
  - e.g. CCTK_SyncGroup (overloaded)
  - e.g. CCTK_OutputVar("variable", "IOASCII") (registered)
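The overload mechanism can be pictured as a function pointer held by the Flesh that a capability thorn (such as a driver) fills in at startup. The following standalone C sketch is illustrative only: the names SyncGroup, OverloadSyncGroup, and PUGH_SyncGroup are invented here and are not the actual Cactus source.

```c
#include <assert.h>
#include <stdio.h>

/* Signature every provider of the capability must match */
typedef int (*SyncGroupFn)(const char *group);

/* The Flesh's slot for the capability, initially empty */
static SyncGroupFn sync_group_impl = NULL;

/* Default used when no driver has overloaded the function */
static int NoDriverSync(const char *group)
{
    (void)group;
    return -1;   /* error: no driver provides synchronisation */
}

/* Called by a driver thorn at startup to overload the capability */
int OverloadSyncGroup(SyncGroupFn fn)
{
    sync_group_impl = fn;
    return 0;
}

/* The Flesh-level entry point every thorn calls; it dispatches to
 * whichever implementation was installed, or the default */
int SyncGroup(const char *group)
{
    return (sync_group_impl ? sync_group_impl : NoDriverSync)(group);
}

/* A toy driver's implementation of the capability */
static int PUGH_SyncGroup(const char *group)
{
    printf("exchanging ghost zones for group %s\n", group);
    return 0;
}
```

Because thorns call only the Flesh-level entry point, they stay independent of which driver actually provides the capability.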
Application View

[Diagram: application toolkits and computational toolkits, assembled by the CST, all calling into the Flesh through the CCTK_(...) API]
Parallelism in Cactus

- Cactus is designed around a distributed-memory model: each thorn is passed a section of the global grid.
- The actual parallel driver (implemented in a thorn) can use whatever method it likes to decompose the grid across processors and exchange ghost-zone information; each thorn is presented with a standard interface, independent of the driver.
- The standard driver distributed with Cactus (PUGH) implements a parallel unigrid and uses MPI as its communication layer.
- PUGH can do custom processor decomposition and static load balancing.
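To make the decomposition step concrete, here is a standalone C sketch of the arithmetic a unigrid driver performs along one axis: split N points over P processors so that no processor owns more than one point more than any other. This is illustrative only (decompose and LocalPatch are names invented for this sketch), not actual PUGH source.

```c
#include <assert.h>

/* One processor's portion of the global grid along one axis */
typedef struct {
    int lower;     /* first owned global index           */
    int upper;     /* one past the last owned index      */
    int nghosts;   /* ghost cells copied from neighbours */
} LocalPatch;

/* Split n points over nprocs ranks, giving the remainder to the
 * first few ranks: static load balancing to within one point. */
LocalPatch decompose(int n, int nprocs, int rank, int nghosts)
{
    int base = n / nprocs;
    int rem  = n % nprocs;
    LocalPatch p;
    p.lower   = rank * base + (rank < rem ? rank : rem);
    p.upper   = p.lower + base + (rank < rem ? 1 : 0);
    p.nghosts = nghosts;
    return p;
}
```

In a real driver the same split is applied along each dimension, and the ghost-zone width is chosen to match the stencil of the evolution scheme.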
Configuration files

Each thorn provides 3 configuration files, written in CCL (the Cactus Configuration Language), detailing its interface with the Flesh and with other thorns:
- interface.ccl: the implementation, this thorn's variables, and variables used from other thorns
- param.ccl: this thorn's parameters, and parameters used and extended from other thorns
- schedule.ccl: when and how this thorn's routines should be executed, optionally with respect to routines from other thorns
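As an illustration, a minimal set of CCL fragments for a hypothetical wave-equation thorn might look like the following. The thorn, variable, and routine names are invented for this sketch and the syntax is only indicative; the Users Guide has the authoritative grammar.

```
# interface.ccl -- what the thorn implements and which variables it owns
implements: wavetoy
inherits: grid

real scalarfield type=GF
{
  phi
} "The evolved scalar field"

# param.ccl -- parameters, with allowed ranges and a default
private:
REAL amplitude "Amplitude of the initial data"
{
  0.0:* :: "Any non-negative value"
} 1.0

# schedule.ccl -- when the thorn's routines run
schedule WaveToy_Evolve at EVOL
{
  LANG: C
} "Evolve the scalar field one time step"
```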
Cactus Computational Toolkit

- CactusBase: Boundary, IOUtil, IOBasic, CartGrid3D, IOASCII, Time
- CactusBench: BenchADM
- CactusExample: WaveToy1DF77, WaveToy2DF77
- CactusElliptic: EllBase, EllPETSc, EllSOR, EllTest
- CactusPUGH: Interp, PUGH, PUGHSlab
- CactusPUGHIO: IOFlexIO, IOHDF5, IsoSurfacer
- CactusTest: TestArrays, TestCoordinates, TestInclude1, TestInclude2, TestComplex, TestInterp
- CactusWave: IDScalarWave, IDScalarWaveC, IDScalarWaveCXX, WaveBinarySource, WaveToyC, WaveToyCXX, WaveToyF77, WaveToyF90, WaveToyFreeF90
- external: IEEEIO, RemoteIO, TCPXX
Cactus can make use of ...
Autopilot
FlexIO (IEEEIO/HDF5)
Globus
GrACE
HDF5
MPI
Panda IO
PAPI
PETSc
AutoPilot

- Dynamic performance instrumentation, on-the-fly performance data reduction, resource management algorithms, real-time adaptive control mechanisms
- Cactus provides a mechanism to register timers, and AutoPilot is currently being integrated.

http://www-pablo.cs.uiuc.edu/Project/Autopilot/AutopilotOverview.htm
http://www.cactuscode.org/Documentation/HOWTO/Performance-HOWTO
http://www.cactuscode.org/Projects.html
FlexIO (IEEEIO)

- FlexIO is a compact multi-platform API for storing multidimensional scientific data. It hides the differences between underlying file formats, including HDF5 and IEEEIO.
- IEEEIO is a compact library for storing multidimensional scientific data in a binary format that can be transported between different computer systems.
- IEEEIO readers for: Amira, AVS, IDL, LCA Vision, NAG Explorer
- Cactus thorn CactusPUGHIO/IOFlexIO outputs multidimensional data using the IEEEIO library.

http://zeus.ncsa.uiuc.edu/~jshalf/FlexIO/
http://zeus.ncsa.uiuc.edu/~jshalf/FlexIO/IEEEIO.html
http://www.cactuscode.org/Documentation/HOWTO/Visualization-HOWTO
Documentation in thorns CactusBase/IOUtil and CactusPUGHIO/IEEEIO
Globus Toolkit

- Globus Toolkit: enables application of Grid concepts to scientific and engineering computing
- Cactus (with the default MPI driver) compiles with Globus (1.0/1.1), using MPICH-G.
- Cactus can then be run using RSL scripts as usual with Globus

http://www.globus.org/
http://www.cactuscode.org/Documentation/HOWTO/Globus-HOWTO
http://jean-luc.aei-potsdam.mpg.de/SC98/

The Grid: dependable, consistent, pervasive access to [high-end] resources
- Collaborative engineering
- Browsing of remote datasets
- Use of remote software
- Data-intensive computing
- Very large-scale simulation
- Large-scale parameter studies
HDF5

- Hierarchical data format for scientific data management (I/O libraries and tools)
- Future standard; overcomes limitations of HDF4
- Simple but powerful model; includes hyperslabs, datatype conversion, parallel IO
- Used for 2D/3D output in the Computational Toolkit (CactusPUGHIO/IOHDF5)
- Much development in (remote) visualization and steering with Cactus uses HDF5
- Readers for Amira, OpenDX, (LCA Vision)

http://hdf.ncsa.uiuc.edu/HDF5/
http://www.CactusCode.org/Documentation/UsersGuide_html/node15.html
http://www.cactuscode.org/Documentation/HOWTO/Visualization-HOWTO
Documentation in thorns CactusBase/IOUtil and CactusPUGHIO/IOHDF5
Panda IO

- Data management techniques for I/O-intensive applications in high-performance scientific computing
- Simpler, more abstract interfaces; efficient layout alternatives for multidimensional arrays; high-performance array I/O operations
- Thorn IOPanda

http://cdr.cs.uiuc.edu/panda/
http://www.cactuscode.org/Workshops/NCSA99/talk13/sld003.htm
PAPI

- Standard API for accessing the hardware performance counters on most microprocessors
- Useful for tuning, optimisation, debugging, benchmarking, etc.
- Java GUI available for monitoring the metrics
- Cactus thorn CactusPerformance/PAPI

http://icl.cs.utk.edu/projects/papi/
http://www.cactuscode.org/Documentation/HOWTO/Performance-HOWTO
http://www.cactuscode.org/Projects.html
Grid-Enabled Cactus

- Cactus and its ancestor codes have been using Grid infrastructure since 1993
- Support for Grid computing was part of the design requirements for Cactus 4.0 (experiences with Cactus 3)
- Cactus compiles "out-of-the-box" with Globus [using the globus device of MPICH-G(2)]
- The design of Cactus means that applications are unaware of the underlying machine(s) that the simulation is running on ... applications become trivially Grid-enabled
- Infrastructure thorns (I/O, driver layers) can be enhanced to make most effective use of the underlying Grid architecture
Grid Experiments

- SC93: remote CM-5 simulation with live viz in a CAVE
- SC95: heroic I-Way experiments lead to the development of Globus; Cornell SP-2, Power Challenge, with live viz in the San Diego CAVE
- SC97: Garching 512-node T3E launched, controlled, and visualized in San Jose
- SC98: HPC Challenge; SDSC, ZIB, and Garching T3Es compute the collision of 2 neutron stars, controlled from Orlando
- SC99: colliding black holes using the Garching and ZIB T3Es, with remote collaborative interaction and viz at the ANL and NCSA booths
- 2000: single simulation across LANL, NCSA, NERSC, SDSC, ZIB, Garching, ... Dynamic distributed computing ... spawning new simulations
Cactus + Globus

[Diagram: layered architecture]
- Cactus application thorns (initial data, evolution, analysis, etc.): distribution information hidden from the programmer
- Grid-aware application thorns: drivers for parallelism, IO, communication, data mapping; PUGH provides parallelism via MPI
- Grid-enabled communication library: MPICH-G2, an implementation of MPI (a grid-enabled message-passing library) that can run MPI programs across heterogeneous computing resources
- Standard MPI, or SingleProc
Grand Picture

[Diagram: remote steering and monitoring from an airport; simulations launched from the Cactus Portal; Grid-enabled Cactus runs on distributed machines (an Origin at NCSA, a T3E at Garching); remote viz in St Louis; remote viz and steering from Berlin; viz of data from previous simulations in an SF café; connected via DataGrid/DPSS with downsampling, Globus, http, HDF5, and isosurfaces]
Grid Related Projects

- ASC: Astrophysics Simulation Collaboratory
  - NSF funded (WashU, Rutgers, Argonne, U. Chicago, NCSA)
  - Collaboratory tools, Cactus Portal
  - Currently setting up testbed (Globus, Cactus, Portal at NCSA, ZIB, AEI)
- E-Grid: European Grid Forum
  - Members from academic and government institutions, computer centers, and industry
  - Test application: Cactus+Globus
  - Currently working towards a distributed computing project for SC2000 (spawning Cactus jobs to new machines)
- GrADs: Grid Application Development Software
  - NSF funded (Rice, NCSA, U. Illinois, UCSD, U. Chicago, U. Indiana, ...)
  - Application driver for grid software
Grid Related Projects (2)

- Grid Forum
  - Experiments, transparency, appearances
- Distributed runs: AEI, Argonne, U. Chicago
  - Working towards running on several computers with 1000's of processors (different processors, memories, OSs, resource management, varied networks, bandwidths and latencies)
- TIKSL
  - German DFN funded: AEI, ZIB, Garching
  - Remote online and offline visualization, remote steering/monitoring
- Cactus Team
  - Dynamic distributed computing
  - Testing of alternative communication protocols: MPI, PVM, SHMEM, pthreads, OpenMP, Corba, RDMA, ...
Dynamic Distributed Computing

Make use of:
- Running with management tools such as Condor, Entropia, etc.
- Scripting thorns (management, launching new jobs, etc.)
- Dynamic use of MDS for finding available resources

Applications:
- Portal for simulation launching and management
- Intelligent parameter surveys (Cactus control thorn)
- Spawning off independent jobs to new machines, e.g. analysis tasks
- Dynamic staging ... seeking out and moving to faster/larger/cheaper machines as they become available (Cactus worm)
- Dynamic load balancing (e.g. inhomogeneous loads, multiple grids)
Remote Visualization

[Diagram: grid functions streamed as HDF5 from the simulation to viz clients (Amira, LCA Vision, OpenDX); isosurfaces and geodesics; contour plots (download)]
Remote Visualization

- Streaming data from the Cactus simulation to a viz client
- Clients: OpenDX, Amira, LCA Vision, ...
- Protocols:
  - Proprietary: isosurfaces, geodesics
  - HTTP: parameters, xgraph data, JPEGs
  - Streaming HDF5:
    - HDF5 provides downsampling and hyperslabbing
    - all of the above data, and all possible HDF5 data (e.g. 2D/3D)
    - two different technologies: a streaming Virtual File Driver (I/O rerouted over a network stream) and an XML wrapper (HDF5 calls wrapped and translated into XML)
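To illustrate what downsampling means in this pipeline: before streaming, the server can keep only every stride-th grid point. The standalone C sketch below shows the 1-D case; the function name downsample is invented for this illustration, and HDF5's hyperslab selections generalise the same idea to N dimensions with an offset, stride, and count per axis.

```c
#include <assert.h>
#include <stddef.h>

/* Keep every `stride`-th point of a 1-D array: writes the reduced
 * data into `out` (which must have enough room) and returns the
 * number of points kept.  A conceptual stand-in for the stride
 * part of an HDF5 hyperslab selection. */
size_t downsample(const double *in, size_t n, size_t stride, double *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i += stride)
        out[kept++] = in[i];
    return kept;
}
```

Downsampling on the server side is what makes remote viz of large 3D grid functions feasible over limited bandwidth: the client only ever receives the reduced dataset.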
Remote Visualization (2)

Clients:
- Proprietary: Amira
- HTTP: any browser (+ xgraph helper application)
- HDF5: any HDF5-aware application (h5dump, Amira, OpenDX, LCA Vision (soon))
- XML: any XML-aware application (Perl/Tk GUI, future browsers (need XSL stylesheets))
OpenDX

- Open source (free), multiplatform, large active development community, easy to program
- Reads HDF5 (Cactus) data from file or remotely streamed from Cactus
- Simple GUI; select different hyperslabs from 3D data
- Also support for streamed ASCII data from Cactus
Remote Visualization - Issues

- Parallel streaming: Cactus can do this, but readers are not yet available on the client side
- Handling of port numbers: clients currently have no method for finding the port number that Cactus is using for streaming; development of an external meta-data server is needed (ASC/TIKSL)
- Generic protocols
- Data server: Cactus should pass data to a separate server that will handle multiple clients without interfering with the simulation; TIKSL provides middleware (streaming HDF5) to implement this
- Output parameters for each client
Remote Steering

[Diagram: remote viz data flows from the simulation over XML, HTTP, and HDF5 to Amira or any viz client]
Remote Steering

- Stream parameters from the Cactus simulation to a remote client, which changes parameters (GUI, command line, viz tool) and streams them back to Cactus, where they change the state of the simulation.
- Cactus has a special STEERABLE tag for parameters, indicating that it makes sense to change them during a simulation and that there is support for them to be changed.
- Example: IO parameters, frequency, fields
- Current protocols:
  - XML (HDF5) to standalone GUI
  - HDF5 to viz tools (Amira)
  - HTTP to web browser (HTML forms)
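The STEERABLE idea can be sketched as a parameter table carrying a runtime-changeable flag that is checked before a remote request is honoured. The following standalone C code is a conceptual sketch only: Parameter, SteerParameter, and the parameter names are invented here, and the real mechanism lives in the Flesh's parameter handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of a toy parameter table */
typedef struct {
    const char *name;
    double      value;
    int         steerable;   /* 1: may be changed during the run */
} Parameter;

static Parameter params[] = {
    { "out_every",  10.0, 1 },  /* IO frequency: steerable       */
    { "grid_size", 128.0, 0 },  /* fixed once the run has begun  */
};

/* Apply a remote steering request.
 * Returns 0 on success, -1 if the parameter is unknown or not
 * tagged as steerable. */
int SteerParameter(const char *name, double new_value)
{
    for (size_t i = 0; i < sizeof params / sizeof params[0]; i++) {
        if (strcmp(params[i].name, name) == 0) {
            if (!params[i].steerable)
                return -1;        /* reject: not steerable */
            params[i].value = new_value;
            return 0;
        }
    }
    return -1;                    /* reject: unknown parameter */
}
```

Whatever the transport (XML, HDF5, or an HTML form over HTTP), the request ultimately reduces to a name/value pair checked against such a flag.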
Thorn http

- Thorn which allows the simulation to act as a web server
- Connect to the simulation from any browser
- Monitor the run: parameters, basic visualization, ...
- Change steerable parameters
- See running example at www.CactusCode.org
- Wireless remote viz, monitoring and steering
Remote Steering - Issues

- Same kinds of problems as remote visualization: generic protocols, handling of port numbers, broadcasting of active Cactus simulations
- Security: logins; who can change parameters?
- Lots of issues still to resolve ...
Remote Offline Visualization

[Diagram: a viz client (Amira) in Berlin reads through the HDF5 VFD and DataGrid (Globus), via DPSS, FTP, or HTTP, from DPSS, FTP, and web servers in front of a remote data server holding 4 TB at NCSA; downsampling and hyperslabs mean only what is needed is transferred]
Remote Offline Visualization

- Accessing remote data for local visualization
- Should allow downsampling, hyperslabbing, etc.
- Access via DPSS is working (TIKSL)
- Waiting for DataGrid support for HTTP and FTP to remove the dependency on the DPSS file systems
More Information ...

Web site: www.CactusCode.org
- Users Guide
- Development projects
- HOWTOs: references for common questions, including HOWTO-QuickStart
- Tutorials
- Help desk
- Bug tracking
- Mailing lists