Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

Upload: joanna-holmes

Post on 25-Dec-2015


TRANSCRIPT

Page 1: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

Upcoming Enhancements to the HST Archive

Mark Kyprianou

Operations and Engineering Division

Data System Branch

Page 2: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

HST DMS Enhancements

Enhancements in DMS

JWST requirements
- Define a better archive

HST Mission Office
- Identify weaknesses/areas that could be improved
- Allocate resources to implement enhancements in a few areas

A “win” for all missions
- Common code to support
- Common system to operate
- Better services to our customers

Target areas for HST enhancements:
- Workflow Manager
- Reprocessing
- Online Cache
- Distribution/UI

01/19/2012 2

Page 3: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

Workflow Manager

Page 4: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

HST Workflow Manager – today it’s OPUS

OPUS Workflow Manager in use since 1995

All current HST level-0 -> level-2 data processing is performed using OPUS pipelines.

Why change? There are significant risks with using OPUS throughout the remaining HST lifetime.

Reliability – OPUS GUIs (OMG, PMG) are “fragile”
- Susceptible to failure with additional network security
- Use Java “thick-client” application technology (NOT web-friendly)
- Have not been rebuilt for Windows platform in many years

Page 5: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

HST Workflow Manager - Alternatives

A JWST Workflow Manager trade study has just been completed and recommends a new workflow technology for JWST: Condor/Open Workflow Layer (OWL)

Transition HST from OPUS to Condor/OWL?
- Provides a “technology refresh” which should serve HST throughout its remaining lifetime
- Condor has a huge world-wide user base and undergoes continuous development and improvement
- Allows STScI pipeline operations team to focus on a single workflow manager system, rather than learn to operate more than one
- Provides significant “upside” with flexibility for taking advantage of distributed computing resources, both on-site and off

Workflow Framework Trade Study (FOO - Future of OPUS): https://trac.stsci.edu/trac/DMS/wiki/FutureOfOpus

Page 6: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

Reprocessing

Page 7: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


Rationale for Reprocessing

Calibration improves over time
- Science instrument performance better understood
  - Reflected in improved calibration algorithms and reference files

Data better understood over time
- Additional keywords and improved data formats

Pipeline software error corrections

Page 8: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


On the Fly Recalibration: OTFR

Advantages
- The user gets the benefit of the very latest data processing and calibration enhancements at the time of their archive retrieval.
- Less archive storage since calibrated data products not on disk
- Unpopular data do not get reprocessed

Disadvantages
- Delay in retrieval while reprocessing, could be substantial if there is a large retrieval queue
- No direct access to data
- All data not accessible through VO protocols
- Popular data get identically reprocessed many times


Page 9: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


Reprocess on Change

Advantages
- Rapid data retrieval through direct synchronous access or batch request
- Data accessible through VO protocols
  - Allows for data mining

Disadvantages
- Requires development of more complex reprocessing software system
  - Logic needed for when to initiate reprocessing and where to start in the pipeline


Page 10: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


Reprocessing Concept (1/3)

The Reprocessing System will automatically recalibrate affected observations when updates to calibration reference files or the calibration software are approved and released.
- Other improvements to the quality of data products may trigger reprocessing.

The Reprocessing System will monitor changes in calibration reference files and software.
- The Calibration Reference Data System will track changes to the calibration reference files.
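
The "recalibrate affected observations" step can be illustrated with a minimal sketch. It assumes each observation records the set of reference files used to calibrate it; the `Observation` class, the `affected_observations` helper, and the file names are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical record: an observation and the reference files used to calibrate it."""
    obs_id: str
    reference_files: frozenset


def affected_observations(observations, changed_files):
    """Return the IDs of observations calibrated with any changed reference file."""
    changed = set(changed_files)
    return [o.obs_id for o in observations if o.reference_files & changed]


# Illustrative data only; real file names and IDs will differ.
observations = [
    Observation("obs_a", frozenset({"dark_v1.fits", "flat_v1.fits"})),
    Observation("obs_b", frozenset({"flat_v1.fits"})),
    Observation("obs_c", frozenset({"bias_v1.fits"})),
]

# An update to dark_v1.fits is released, so only obs_a needs reprocessing.
to_reprocess = affected_observations(observations, ["dark_v1.fits"])
print(to_reprocess)
```

A real system would also need to react to calibration software releases and decide where in the pipeline to restart, which this sketch does not attempt.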


Page 11: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


Reprocessing Concept (2/3)

The latest version of all data is stored in the archive.
- Reprocessed data products replace their previous version in the primary archive.

If the Archive User Interface indicates that the data being requested do not have the best calibration, archive users will be notified prior to retrieval.
- Archive users accept the existing calibration or wait until the calibration is updated.
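
The accept-or-wait choice described above can be written as a small decision function. This is a sketch only; the function name and the returned action labels are hypothetical.

```python
def retrieval_action(has_best_calibration, user_accepts_existing):
    """Decide what happens to a retrieval request (hypothetical helper)."""
    if has_best_calibration:
        # Nothing pending: deliver the archived products as they are.
        return "deliver"
    # At this point the user has been notified that better calibration is pending.
    if user_accepts_existing:
        return "deliver_existing"
    return "wait_for_reprocessing"


print(retrieval_action(False, False))
```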


Page 12: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


Reprocessing Concept (3/3)

The order of data processing will take into account items such as:
- Data designated to be processed immediately.
- Processing on initial receipt of data from the telescope.
- Reprocessing of an observation less than one year from execution requested by an archive user.
- Reprocessing of an observation more than one year from execution requested by an archive user.
- Reprocessing of data less than one year from execution.
- Reprocessing of data more than one year from execution.
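
One way to honor such an ordering is a priority queue. The sketch below assumes the items are listed from highest to lowest priority, which the slide does not state explicitly; the `Priority` names and the `ProcessingQueue` class are hypothetical.

```python
import heapq
from enum import IntEnum
from itertools import count


class Priority(IntEnum):
    """Assumed ranking: lower value is processed first."""
    IMMEDIATE = 0
    INITIAL_RECEIPT = 1
    USER_REQUEST_RECENT = 2   # user-requested, less than one year from execution
    USER_REQUEST_OLDER = 3    # user-requested, more than one year from execution
    REPROCESS_RECENT = 4      # less than one year from execution
    REPROCESS_OLDER = 5       # more than one year from execution


class ProcessingQueue:
    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: first in, first out within a priority

    def submit(self, priority, dataset_id):
        heapq.heappush(self._heap, (priority, next(self._seq), dataset_id))

    def next_dataset(self):
        """Pop and return the dataset ID with the highest priority."""
        return heapq.heappop(self._heap)[2]


queue = ProcessingQueue()
queue.submit(Priority.REPROCESS_OLDER, "old_dataset")
queue.submit(Priority.INITIAL_RECEIPT, "new_from_telescope")
queue.submit(Priority.USER_REQUEST_RECENT, "user_request")
queue.submit(Priority.IMMEDIATE, "urgent")
order = [queue.next_dataset() for _ in range(4)]
print(order)
```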


Page 13: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


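Archive User Decision Tree for Direct Download

[Figure: decision tree for direct download. The diagram is not preserved in this transcript; the only surviving node label is “User waits”.]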

Page 14: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

Data Storage: Online Cache

Page 15: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


Storage Broker Concept

Optimizes management of large scale distributed data storage resources

Provides a uniform interface to heterogeneous data storage resources over a network
- ingest (adding files to the system)
- accessing files
- security

Uses common metadata for file storage and location
- Utilizes a database schema for mapping of the logical file layer to the physical disk locations on storage media.

Provides independence from the hardware platforms (mainframes, intermediate systems, servers, PCs).

Provides transparent use of public network protocols (SFTP, HTTP, etc.)

Simplifies file exchange between applications and mirror sites
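
The mapping from the logical file layer to physical locations can be illustrated with a small table. This sketch uses an in-memory SQLite database as a stand-in; the table name, column names, store labels, and paths are hypothetical and do not describe the actual Storage Broker schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE file_location (
           logical_name  TEXT NOT NULL,  -- name that applications ask for
           store         TEXT NOT NULL,  -- hypothetical label, e.g. 'primary' or 'online'
           physical_path TEXT NOT NULL,  -- where the bytes actually live
           PRIMARY KEY (logical_name, store)
       )"""
)
conn.executemany(
    "INSERT INTO file_location VALUES (?, ?, ?)",
    [
        ("dataset1_flt.fits", "primary", "/raid/vol3/d1/dataset1_flt.fits"),
        ("dataset1_flt.fits", "online", "/cache/a7/dataset1_flt.fits"),
    ],
)


def locate(logical_name, store="online"):
    """Resolve a logical file name to its physical path in the given store, or None."""
    row = conn.execute(
        "SELECT physical_path FROM file_location WHERE logical_name = ? AND store = ?",
        (logical_name, store),
    ).fetchone()
    return row[0] if row else None


print(locate("dataset1_flt.fits"))
```

Because callers only ever use the logical name, files can be moved between storage media by updating rows rather than changing applications.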


Page 16: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch


Data Storage Key Features

The Storage Broker (SB) supports:
- Internal archive RAID-based disk storage for long-term data preservation (Primary Data Store).
- Online file storage for fast, immediate access.
- An offline, offsite data backup of the file storage (Safestore).

The SB provides online access to the latest version of the processed data.


Page 17: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

Distribution and Archive User Interface

Page 18: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

Data Distribution Concept

There are two complementary concepts for data distribution.
- Batch distribution
- Direct distribution

Batch distribution
- XML request generated by Archive User Interface and passed to Distribution.
- No further user interaction is needed once the request is submitted.

Direct distribution
- User has direct access to files through URL.
- Supports VO services.
- Necessary for data mining.
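
As an illustration of the batch path, the sketch below builds a small XML request of the kind the Archive User Interface might hand to Distribution. The slides do not define the request format, so the element and attribute names (`distributionRequest`, `dataset`, `user`) are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_batch_request(user_id, dataset_ids):
    """Serialize a batch retrieval request as XML (hypothetical layout)."""
    root = ET.Element("distributionRequest", {"user": user_id})
    for dataset_id in dataset_ids:
        ET.SubElement(root, "dataset").text = dataset_id
    return ET.tostring(root, encoding="unicode")


request_xml = build_batch_request("jdoe", ["dataset1", "dataset2"])
print(request_xml)
```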


Page 19: Upcoming Enhancements to the HST Archive Mark Kyprianou Operations and Engineering Division Data System Branch

Archive User Interface Concept

The Archive User Interface (AUI) will provide means to search for data, including Program/PI searches and spatial, time, and wavelength searches.

After users identify data of interest, the AUI will offer a choice of download method and prompt for authentication/authorization information for use with proprietary data.

The AUI will provide the status of the requested data (e.g. best calibration available, or data are in the reprocessing queue) and let users choose whether to wait for new data.

Distribution shall record metrics for user transactions, such as IP address, user ID, files selected, distribution mode and format, and download size and time.
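
The metrics listed above map naturally onto a single record per transaction. The following is a sketch only; the class and field names are hypothetical, and the sample values are made up for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class TransactionMetric:
    """One user transaction, with the fields named on the slide (hypothetical names)."""
    ip_address: str
    user_id: str
    files_selected: list
    distribution_mode: str   # e.g. "batch" or "direct"
    data_format: str
    download_bytes: int
    download_seconds: float


metric = TransactionMetric(
    ip_address="192.0.2.10",          # documentation-range address, not real data
    user_id="jdoe",
    files_selected=["dataset1_flt.fits"],
    distribution_mode="direct",
    data_format="fits",
    download_bytes=1_000_000,
    download_seconds=4.0,
)
record = asdict(metric)  # plain dict, ready to be logged or stored
print(record)
```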
