© 2006 Open Grid Forum
Computational Steering: A SAGA View
Shantenu Jha and Andre Merzky
Outline
• SAGA in a Nutshell
• Important Issues as viewed by SAGA
  • Usability, Visualization, Advanced Reservation
• SAGA Use Case(s)
  • Cactus & RealityGrid
• SAGA Computational Steering Model
• Questions from/for SAGA
SAGA in a Nutshell
• Why are there so few grid applications out there?
• Is it the lack of a simple, stable, integrated and uniform high-level programming interface that provides the most common grid programming abstractions?
• Need to hide underlying complexities, varying semantics, heterogeneities and changes from the application programmer
• Measure(s) of success:
  – Does SAGA enable quick development of “new” grid applications?
  – Does it enable greater functionality using less code?
Copy a File: Globus GASS

    int copy_file (char const* source_URL, char const* target)
    {
      globus_url_t                        source_url;
      globus_io_handle_t                  dest_io_handle;
      globus_ftp_client_operationattr_t   source_ftp_attr;
      globus_result_t                     result;
      globus_gass_transfer_requestattr_t  source_gass_attr;
      globus_gass_copy_attr_t             source_gass_copy_attr;
      globus_gass_copy_handle_t           gass_copy_handle;
      globus_gass_copy_handleattr_t       gass_copy_handleattr;
      globus_ftp_client_handleattr_t      ftp_handleattr;
      globus_io_attr_t                    io_attr;
      int                                 output_file = -1;

      /* parse the source URL and reject unsupported schemes */
      if ( globus_url_parse (source_URL, &source_url) != GLOBUS_SUCCESS ) {
        printf ("can not parse source_URL \"%s\"\n", source_URL);
        return (-1);
      }

      if ( source_url.scheme_type != GLOBUS_URL_SCHEME_GSIFTP &&
           source_url.scheme_type != GLOBUS_URL_SCHEME_FTP    &&
           source_url.scheme_type != GLOBUS_URL_SCHEME_HTTP   &&
           source_url.scheme_type != GLOBUS_URL_SCHEME_HTTPS  ) {
        printf ("can not copy from %s - wrong prot\n", source_URL);
        return (-1);
      }

      /* initialise the copy, FTP and I/O attributes and the copy handle */
      globus_gass_copy_handleattr_init  (&gass_copy_handleattr);
      globus_gass_copy_attr_init        (&source_gass_copy_attr);
      globus_ftp_client_handleattr_init (&ftp_handleattr);
      globus_io_fileattr_init           (&io_attr);

      globus_gass_copy_attr_set_io (&source_gass_copy_attr, &io_attr);
      globus_gass_copy_handleattr_set_ftp_attr (&gass_copy_handleattr, &ftp_handleattr);
      globus_gass_copy_handle_init (&gass_copy_handle, &gass_copy_handleattr);

      /* pick the protocol-specific attributes for the source */
      if ( source_url.scheme_type == GLOBUS_URL_SCHEME_GSIFTP ||
           source_url.scheme_type == GLOBUS_URL_SCHEME_FTP ) {
        globus_ftp_client_operationattr_init (&source_ftp_attr);
        globus_gass_copy_attr_set_ftp (&source_gass_copy_attr, &source_ftp_attr);
      }
      else {
        globus_gass_transfer_requestattr_init (&source_gass_attr, source_url.scheme);
        globus_gass_copy_attr_set_gass (&source_gass_copy_attr, &source_gass_attr);
      }

      output_file = globus_libc_open ((char*) target, O_WRONLY | O_TRUNC | O_CREAT,
                                      S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP);
      if ( output_file == -1 ) {
        printf ("could not open the file \"%s\"\n", target);
        return (-1);
      }

      /* convert the output file descriptor to a globus_io_handle */
      if ( globus_io_file_posix_convert (output_file, 0, &dest_io_handle) != GLOBUS_SUCCESS ) {
        printf ("Error converting the file handle\n");
        return (-1);
      }

      result = globus_gass_copy_register_url_to_handle (
                 &gass_copy_handle, (char*) source_URL, &source_gass_copy_attr,
                 &dest_io_handle, my_callback, NULL);
      if ( result != GLOBUS_SUCCESS ) {
        printf ("error: %s\n", globus_object_printable_to_string (globus_error_get (result)));
        return (-1);
      }

      globus_url_destroy (&source_url);
      return (0);
    }
Copy a File: SAGA

    #include <iostream>
    #include <string>
    #include <saga/saga.hpp>

    void copy_file(std::string source_url, std::string target_url)
    {
      try {
        saga::file f(source_url);
        f.copy(target_url);
      }
      catch (saga::exception const &e) {
        // report the failure; the middleware details stay hidden
        std::cerr << e.what() << std::endl;
      }
    }

• Provides the high-level abstraction layer that application programmers need
  • Like MapReduce: leaves details of distribution etc. out
• Shields details of the lower-level middleware (m/w) system
SAGA: In action
• File management, job management, remote procedure calls, replica management, data streaming
• SAGA adaptor development is now an OMII-UK project
Important Issues (1)
• Usability for developers and application scientists
• The “S” in SAGA stands for Simple for the end user, though not necessarily for the SAGA implementation
• Steering model (to be discussed): completeness?
• Issues of data model and types, implementation, and backend remain in spite of API standardization
• Are SAGA event-handling mechanisms complete?
  – Callbacks and async notifications exist in SAGA
Important Issues (2)
• Visualization on the Grid
  • Several use cases; MUST be supported by SAGA!
• SAGA mechanisms
  – Job management
  – (dynamic, real-time) Streams, message bus, async and notification
  – (static) Remote file access, replica management
• Data model and format
  – SAGA does not prescribe a model or format
  – Either RPC to a resource that is capable, or do low-level parsing using SAGA (ugly); not in scope
Important Issues (3)
• Advanced Reservation (AR)
  • Not in SAGA's scope
  • Not sure whether AR needs to be supported directly in the API
  • Maybe support the attributes required to utilize AR
    – job_description.add_attrib ("StartupDeadline", "6:00pm")
• Is AR a user-level or a resource-level issue?
• Any other “Important Points”?
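The attribute-based route to advance reservation mentioned on this slide can be sketched as follows. This is only an illustration under assumptions, not the SAGA API: the `JobDescription` class, its `add_attrib`/`get_attrib` methods, the helper `make_deadline_job` and the executable path are hypothetical, modelled on the one-line example above.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a job description; not the real SAGA class.
class JobDescription {
public:
    // Store a free-form key/value attribute, overwriting any earlier value.
    void add_attrib(const std::string& key, const std::string& value) {
        attribs_[key] = value;
    }

    // Return the attribute value, or `fallback` if the key was never set.
    std::string get_attrib(const std::string& key,
                           const std::string& fallback = "") const {
        std::map<std::string, std::string>::const_iterator it = attribs_.find(key);
        return it == attribs_.end() ? fallback : it->second;
    }

private:
    std::map<std::string, std::string> attribs_;
};

// Build a description that carries a reservation-related hint.
// In this sketch the API only transports the attribute; whether and how
// it is honoured would be left to the resource manager behind it.
JobDescription make_deadline_job(const std::string& deadline) {
    JobDescription jd;
    jd.add_attrib("Executable", "/bin/simulation");  // illustrative path
    jd.add_attrib("StartupDeadline", deadline);
    return jd;
}
```

Keeping the hint as a plain attribute means the API itself needs no reservation calls, which matches the "not in scope" position on the slide.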
SAGA: Use Cases
• 20 use cases
  • Approximately 12 unique use cases
• Steering appeared in only 5
  – Coupled with Viz: DiVA, VISIT, Medical Viz
  – SCOOP, Cactus, RealityGrid, Ground Water
• Still a candidate system for developing an API!
  • But always a tier-II package
• Your Use Case Please!
• Cactus:
  • Example of job-level steering
• RealityGrid:
  • Application + job-level steering
Dynamic Grid Computing: example of job-level steering
[Figure: job-level steering scenario. Labels: Physicist has new idea!; Brill Wave; Find best resources; Free CPUs!!; Add more resources; Look for horizon; Found a horizon, try out excision; Clone job with steered parameter; Calculate/Output Grav. Waves; Calculate/Output Invariants; Queue time over, find new machine; Archive data; Archive to LIGO public database. Sites: NCSA, SDSC, RZG, LRZ.]
RealityGrid: Checkpoint trees and parameter space exploration
Initial condition: Random water/surfactant mixture.
Self-assembly starts.
Rewind and restart from checkpoint.
Lamellar phase: surfactant bilayers between water layers.
Cubic micellar phase, low surfactant density gradient.
Cubic micellar phase, high surfactant density gradient.
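A checkpoint tree of the kind described on this slide can be sketched as a simple data structure: each node stores the parameters in effect and a link to the checkpoint it was restarted from. This is a sketch under assumptions, not RealityGrid code; the names `Checkpoint`, `CheckpointTree`, `add_root`, `branch` and `depth` are hypothetical, as are any parameter names passed in.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical checkpoint record: steered parameters plus a parent link.
struct Checkpoint {
    int parent;                            // index of the parent, -1 for the root
    std::map<std::string, double> params;  // parameters in effect at this point
};

class CheckpointTree {
public:
    // Record the initial condition as the root and return its id.
    int add_root(const std::map<std::string, double>& params) {
        Checkpoint c;
        c.parent = -1;
        c.params = params;
        nodes_.push_back(c);
        return static_cast<int>(nodes_.size()) - 1;
    }

    // "Rewind and restart": copy the parameters of checkpoint `from`,
    // change one of them, and record the result as a new branch.
    int branch(int from, const std::string& name, double value) {
        Checkpoint c;
        c.parent = from;
        c.params = nodes_.at(static_cast<std::size_t>(from)).params;
        c.params[name] = value;
        nodes_.push_back(c);
        return static_cast<int>(nodes_.size()) - 1;
    }

    // Number of restarts between checkpoint `id` and the initial condition.
    int depth(int id) const {
        int d = 0;
        while (get(id).parent != -1) {
            id = get(id).parent;
            ++d;
        }
        return d;
    }

    const Checkpoint& get(int id) const {
        return nodes_.at(static_cast<std::size_t>(id));
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<Checkpoint> nodes_;
};
```

Branching twice from the same checkpoint with different values of one parameter gives the sibling runs that make up a parameter space exploration.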
RealityGrid: The Architecture of Steering
[Figure: architecture diagram. Components: steering client (client plus steering library), simulation and visualization processes (each linked with the steering library), registry, two Steering GS instances, displays. Labelled interactions: publish, find, bind, connect, data transfer (Globus-IO). OGSI middle tier.]
• Components start independently and attach/detach dynamically
• Multiple clients: Qt/C++, .NET on PocketPC, GridSphere Portlet (Java)
• Remote visualization through SGI VizServer, Chromium, and/or streamed to Access Grid
SAGA Steering
• Job-level steering (JLS):
  • Steering and management of a job as an entity
  • suspend(), migrate(), resume(), checkpoint(), spawn(), kill() etc. provide the mechanism for JLS
  • Anything that is “done” on a job object
  • Cactus example
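Job-level steering as listed above can be pictured as a small state machine driven entirely through calls on a job object. The sketch below is illustrative only and does not reproduce the SAGA job API: the `Job` class, the `JobState` values and the way `migrate()` is composed from the other calls are assumptions, and `spawn()` is left out for brevity.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical job states; the real SAGA job state model may differ.
enum class JobState { New, Running, Suspended, Done };

// Hypothetical job object: every steering action is a method on the job.
class Job {
public:
    explicit Job(const std::string& host)
        : host_(host), state_(JobState::New), checkpoints_(0) {}

    void run()        { require(JobState::New, "run");             state_ = JobState::Running; }
    void suspend()    { require(JobState::Running, "suspend");     state_ = JobState::Suspended; }
    void resume()     { require(JobState::Suspended, "resume");    state_ = JobState::Running; }
    void checkpoint() { require(JobState::Running, "checkpoint");  ++checkpoints_; }
    void kill()       { state_ = JobState::Done; }

    // Assumption: migration is modelled as checkpoint, suspend,
    // change of host, and resume on the new host.
    void migrate(const std::string& new_host) {
        checkpoint();
        suspend();
        host_ = new_host;
        resume();
    }

    const std::string& host() const { return host_; }
    JobState state() const { return state_; }
    int checkpoints() const { return checkpoints_; }

private:
    // Reject operations that make no sense in the current state.
    void require(JobState expected, const char* op) const {
        if (state_ != expected)
            throw std::logic_error(std::string(op) + ": invalid job state");
    }

    std::string host_;
    JobState state_;
    int checkpoints_;
};
```

Nothing here touches the internal state of the application, which is the difference from application-level steering.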
SAGA Steering
• Application-level steering (ALS):
  • Internal state, data and control of the application are modified (e.g., via callback)
  • The notion of a “metric” provides the mechanism for application-level steering
  • RealityGrid as a (partial) example
SAGA: Steering Model (1)
• “Metric”: a high-level SAGA object
  • Represents internal state that can be monitored or steered
  • Can be mutable (steerable) or non-mutable (monitorable only)
  • Can be a simple data structure
    • simple or complex, structured or unstructured
SAGA: Steering Model (2)
• “Metric”: a high-level SAGA object
• Operations:
  • Inspection, usual queries, notification, callbacks, async operations
  • fire():
    – Updates internal state (metric variable)
    – A mechanism to insulate program detail
    – Implies a push model
  • Meta-operations
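The metric idea from the two slides above can be sketched roughly as below. This is a minimal sketch under stated assumptions rather than the SAGA monitoring API: the `Metric` class holds a single `double`, the names `add_callback`, `set` and `update` are hypothetical, and `fire()` is modelled simply as pushing the current value to every registered callback.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical metric wrapping one double; real metrics may hold richer data.
class Metric {
public:
    typedef std::function<void(const Metric&)> Callback;

    Metric(const std::string& name, double value, bool steerable)
        : name_(name), value_(value), steerable_(steerable) {}

    // Inspection and the usual queries.
    const std::string& name() const { return name_; }
    double value() const { return value_; }
    bool steerable() const { return steerable_; }

    // Register an observer that is notified on every fire().
    void add_callback(Callback cb) { callbacks_.push_back(std::move(cb)); }

    // Steering request from a client: allowed only on mutable metrics.
    void set(double v) {
        if (!steerable_)
            throw std::logic_error(name_ + " is monitorable only");
        value_ = v;
        fire();
    }

    // Application side: publish a new value of the internal state.
    void update(double v) {
        value_ = v;
        fire();
    }

    // Push model: hand the current value to every registered callback.
    void fire() const {
        for (std::vector<Callback>::const_iterator it = callbacks_.begin();
             it != callbacks_.end(); ++it) {
            (*it)(*this);
        }
    }

private:
    std::string name_;
    double value_;
    bool steerable_;
    std::vector<Callback> callbacks_;
};
```

Observers only ever see the metric, not the variables behind it, which is one reading of "a mechanism to insulate program detail".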
Question(s) for/from SAGA
• JLS can be implemented using existing API calls.
  • Is this good enough? Should we have higher-level abstractions?
• “Metrics” can be used to provide application-level steering (ALS)
  • But what is the best way to implement ALS at the API level?
• SAGA “Steering Model”? Gap analysis? Are we ready?
  • Standardising the API does not guarantee it will be usable by all application use cases!
• Interoperability: a good thing, but at what level?
  • Does it make sense to have a standard API for steering?
• Implementation redux: are there efforts to standardize at the “other levels” (e.g., data models)?