ESO Reflex: A Graphical Workflow Engine for Data Reduction
Richard Hook
Euro VO Data Centres Alliance Theory & Grid Workshop,
Garching, April 2008
11th April 2008 Euro VO DCA Workshop 2
Overview
The Sampo Project
The ESO data reduction context
ESO Reflex
Examples
Future plans
Apology: this talk is not really about grid activities… (and definitely not about theory)
Sampo overview:
As part of Finland’s joining fee for ESO, a contribution “in kind” of computer scientist staff was made available and named “Sampo”.
Sampo started in January 2005 and ended in January 2008
The aim of the project was to assess the requirements for ESO data reduction and analysis software infrastructure in the medium term and perform a series of pilot projects to assess different options and produce useful tools
The project was managed at ESO Garching with the team based in Helsinki, Finland
The Sampo priorities
Enabling and facilitating science-grade reduction of ESO data from the La Silla Paranal Observatory was identified as the primary goal of the Sampo project
  Long-standing request from the community
  Input from Instrument Scientists in Garching and Chile
  Sampo SAC
Sampo has concentrated on developing ESO Reflex, a graphical user interface to run ESO data reduction recipes
Other sub-projects conducted by Sampo
  PyMidas: a Python interface to Midas
  VODA: a pilot project addressing the integration of data analysis environments and the Virtual Observatory (cancelled)
The data reduction challenge
ESO cannot reduce all of the data its telescopes and instruments produce to a level where their full scientific potential will be exploited
The responsibility for the quality of the scientific reduction of the data can only rest with the individual users
The users are, then, faced with the challenge of a timely and accurate data reduction
  Proprietary and/or archival data from several instruments often need to be combined
  As volume and complexity of data increase, a detailed knowledge of the instrument, data format and header content is essential to fully exploit the data
  Science-quality pipelines are increasingly needed also for Quality Control
  General-purpose tools like IRAF and ESO-MIDAS are inadequate for the task, and instrument-specific software implementing carefully tuned algorithms is essential
Data reduction by individual users: the context
ESO provides pipeline “recipes” for all VLT/VLTI instruments
  They remove the instrumental signature and are used for quality control at ESO and distributed to the community
  In some cases the data products are adequate for scientific analysis, but this is generally not yet the case
Offline tools (Gasgano and EsoRex) are available to call the pipelines, but lack some of the functionalities needed by the community
AND
Many older general purpose reduction and analysis systems remain in wide use (MIDAS/IRAF etc.) as they contain valuable algorithms
Many instrument-specific packages have been developed in the community (e.g., Euro3D tools, VIPGI, etc.)
Greater use will be made of remote data resources and the Virtual Observatory (VO)
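The offline route mentioned above (calling a pipeline recipe through EsoRex) can be sketched in Python. This is a minimal sketch, not ESO's own tooling: it assumes the usual EsoRex calling pattern of a recipe name followed by a "set of frames" text file listing one file and its tag per line. The file names, tags and recipe name are hypothetical, and the command is only built, not executed.

```python
from pathlib import Path
import tempfile


def write_sof(frames, sof_path):
    """Write a set-of-frames file: one 'path TAG' pair per line."""
    lines = [f"{path} {tag}" for path, tag in frames]
    Path(sof_path).write_text("\n".join(lines) + "\n")
    return Path(sof_path)


def esorex_command(recipe, sof_path, output_dir="."):
    """Build (but do not run) the argument list for one recipe invocation."""
    return ["esorex", f"--output-dir={output_dir}", recipe, str(sof_path)]


# Hypothetical file names, tags and recipe name, for illustration only.
frames = [("bias_0001.fits", "BIAS"), ("bias_0002.fits", "BIAS")]
sof = write_sof(frames, Path(tempfile.mkdtemp()) / "bias.sof")
cmd = esorex_command("example_bias_recipe", sof, output_dir="products")
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run` would execute the recipe on a machine where EsoRex and the relevant pipeline are installed.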
Introducing ESO Reflex: a graphical data reduction environment
A data reduction system for the end user requires:
  Modular recipes to provide access to intermediate products
  Interactive tools, defined or customized by the user, to analyze intermediate and final data products
  A user-friendly, intuitive and flexible interface
The ESO Reflex tool addresses the interface issue, with a focus on the use case of ESO data:
  Dedicated invoker for CPL-based recipes
  General invoker for Python scripts (hence PyRAF & PyMidas)
  General invoker for IDL scripts
  Invoker for the command line
  Many other Taverna features for free
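The idea of an "invoker" (a workflow step that runs an external script or command and hands its output to the next step) can be illustrated with a small Python sketch. This is not how ESO Reflex or Taverna implement invokers; the helper names `invoke` and `python_step` are hypothetical, and the two chained steps are toy computations.

```python
import subprocess
import sys


def invoke(command):
    """Run one external step and return (exit_code, stdout)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout.strip()


def python_step(code):
    """Command line that runs a Python snippet with the current interpreter."""
    return [sys.executable, "-c", code]


# A two-step chain: the output of the first step is the input of the second.
code1, out1 = invoke(python_step("print(6 * 7)"))
code2, out2 = invoke(python_step(f"print(int({out1!r}) + 1)"))
print(out1, out2)
```

The same `invoke` helper accepts any argument list, so a command-line tool can be chained in exactly the same way as a script.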
ESO Reflex: look and feel
ESO Reflex is based on Taverna, a popular open source Java workflow engine
Main features of ESO Reflex (over and above what Taverna offers)
In interactive mode the user can make changes to input data and parameters during execution
Errors during recipe execution are detected by ESO Reflex and appropriate action can be taken
(Some) flow control: looping, skipping, conditional statements, etc.
Parallel execution: takes full advantage of multi-processor or multi-core machines
Customisability: workflows are easily modified, Python and IDL interfaces, system commands can be invoked, easy access to VO and other web services
FITS file handling (using Gasgano code): data organization, tagging, selection
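Two of the features listed above, parallel execution of independent steps and detection of errors during execution, can be sketched with the Python standard library. This is a toy illustration under stated assumptions, not the ESO Reflex engine: the step names are hypothetical, the "recipes" are plain Python functions, and the second step is given empty input so that it fails on purpose.

```python
from concurrent.futures import ThreadPoolExecutor


def run_step(name, func, *args):
    """Run one step and report (name, ok, value_or_error) instead of raising."""
    try:
        return name, True, func(*args)
    except Exception as exc:
        return name, False, str(exc)


def mean(values):
    """Stand-in for a real recipe: average a list of numbers."""
    return sum(values) / len(values)


# Hypothetical, independent steps; the second fails on purpose (empty input).
steps = [("master_bias", mean, [100, 102, 101]),
         ("master_flat", mean, [])]

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(run_step, name, func, data)
               for name, func, data in steps]
    results = [f.result() for f in futures]

failed = [name for name, ok, _ in results if not ok]
print(results[0], failed)
```

Threads are adequate here when each step mostly waits on an external process; for CPU-bound Python functions a process pool would be the more natural choice.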
ESO Reflex in action: FORS2 MXU and a Python-based GUI tool
ESO Reflex in action: a FORS calibration workflow
ESO Reflex and VO services
Reflex/Taverna works very well with distributed resources and web services
Reflex also supports the PLASTIC protocol for passing information between client-side VO tools
As a small test project a workflow has been developed that finds VO image data, passes it to a remote processing server and uses PLASTIC tools running locally
Reduction of local ESO data and access to remote VO facilities within the same environment
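The first step of such a test workflow, finding VO image data, can be sketched as the construction of a query URL. The talk does not say which VO protocol was used; this sketch assumes the IVOA Simple Image Access protocol (version 1) with its POS, SIZE and FORMAT parameters. The service URL and the sky position are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def sia_query_url(base_url, ra_deg, dec_deg, size_deg, image_format="image/fits"):
    """Build a Simple Image Access (v1) query URL for a sky position."""
    params = {
        "POS": f"{ra_deg},{dec_deg}",   # position in decimal degrees
        "SIZE": f"{size_deg}",          # search region size in degrees
        "FORMAT": image_format,         # requested image format
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical service address and target position, for illustration only.
url = sia_query_url("https://vo.example.org/sia", 83.633, 22.014, 0.1)
print(url)
```

A real service would answer such a query with a VOTable listing matching images, which later workflow steps could download and process.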
Potential shortcomings of ESO Reflex
Yet another tool to learn, support etc.
Not as powerful as a script
Flow control limited (loops not easy)
Not suitable for very complex workflows
…
The bottom line on ESO Reflex
The Sampo project has enabled ESO to explore options for future data reduction systems
The ESO Reflex model is cost effective because it capitalises on ESO’s investment in CPL-based algorithms
  Offering ESO pipelines to the community in a more flexible way
  Stimulating more scientific feedback from the community and, therefore, steering algorithm development towards more science-grade applications
It exploits existing systems to provide those facilities that CPL is not designed to offer (e.g., graphics, interactivity, etc.)
It avoids the need to develop from scratch a multi-purpose, and expensive, full-fledged data reduction environment
The future of ESO Reflex and pipelines
The Sampo project itself ended in January 2008
ESO has examined the outcome of Sampo and concluded that
  ESO Reflex is a viable way of addressing some of the mid-term needs of the user community
  The continued development of science-grade CPL-based data reduction recipes is vital
  The development of interactive tools is also needed
Status and plans
ESO Reflex V1.1 is currently available as a beta-test version on request
Current version is built on Taverna 1.5 - conversion to Taverna 2.0 will be needed, and involves significant changes:
  API changes
  Use of Maven
  Some UI refactoring
A public release is planned for late-2008/early-2009