AHM04: Sep 2004 Nottingham CCLRC e-Science Centre

eMinerals: Environment from the Molecular Level
Managing simulation data
Lisa Blanshard
e-Science Data Management Group
CCLRC Daresbury Laboratory, UK

Royal Institution
University of Reading

- Radioactive waste disposal
- Crystal growth and scale inhibition
- Pollution: molecules and atoms on mineral surfaces
- Crystal dissolution and weathering

[Workflow diagram: each step with its description]
- Discover: search for a crystal structure
- Retrieve: download crystal structure data
- Transform: convert crystal data into format suitable for application
- Transfer: transfer crystal structure to compute node
- Analysis: run calculation to perform some analysis on crystal
- Transform Results: convert results into format suitable for storage
- Results Storage: transfer results to permanent archive
- Catalogue: catalogue the results
- Publish: make results available online

CCLRC Data Portal
[Diagram components: User; CCLRC DataPortal Server; Facility 1 and Facility 2, each with an XML wrapper, local data and local metadata]
- search many data resources simultaneously
- uses CCLRC standard for scientific metadata
- XML on the wire
- download scientific datasets directly to own machine for preparation
- transfer datasets to compute node

- data not necessarily in correct format for application
- cut & paste conversion code
- common format, e.g. CML (Chemical Markup Language)
- some codes on the project now produce CML

Storage Resource Broker
- each institution wants to manage its own files; however, shared access is desirable within the project
- deployed SRB vaults at several locations
- coordinating SRB and database at CCLRC
- provides a virtual file system
- each user has a home directory
- many different interfaces and APIs
- provides a personal workspace, independent of the computational grid, where users can upload their input files
- SRB vaults are professionally managed and backed up, preventing loss of data
- SRB provides a sophisticated access control system

- designated job submission nodes
- allows users to create simple scripts to download input from SRB, run job on minigrid and transfer results back to SRB
- uses Condor-G as client to Globus running on compute clusters
- uses SRB S-commands to download and upload files, so results are automatically in the permanent archive
- however, results are stored as generated

Metadata Editor
- forms-based web application to manually create annotation for groups of files
- files are grouped into datasets and datasets into studies
- each study holds details of investigators, description of study, dates, key words or topics
- datasets hold the location of a directory of files in SRB or on another file system
- once entered, metadata and files are available via Data Portal

[Workflow diagram: each step annotated with its current status]
- Discover: Data Portal search across data resources simultaneously
- Retrieve: download data from Data Portal or Storage Resource Broker
- Transform: probably have to change format of file manually
- Transfer: transfer file to SRB
- Analysis: script downloads input from SRB, runs job on grid using Condor-G
- Transform Results: not yet tackled
- Results Storage: script transfers results to SRB
- Catalogue: Metadata Editor, catalogue the results via web form
- Publish: results then available online via Data Portal
Further annotations on the diagram:
- but have to link up more resources
- use CML for input / output of some codes to address this
- manually via SRB tools
- some output in CML though
- results stored as they are, either in text files or CML
- need to generate metadata automatically

Particular successes
- deployment of distributed data resources via SRB
- set up project RDBMS for metadata/catalogue info, with interfaces to add/edit metadata and to search
- used CCLRC Multi-disciplinary Scientific Metadata Format for transport of metadata
- use of CML to format input/output to some codes
- integration with data and computation: dedicated nodes to submit jobs via Condor-G, scripts to download input and upload results to SRB

Issues to overcome
- many codes still input/output proprietary text format
- auto-generate metadata
- results stored as generated; need to consider more sophisticated data storage
- further use of CML
- integrated portals for compute and data to manage whole workflow
- integrate more data resources for discovery

Further Information
- Environment from the Molecular Level
- UK CCLRC e-Science Centre
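
The grouping described for the Metadata Editor (files into datasets, datasets into studies, with each study carrying investigators, a description, dates and keywords) can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class names, field names, the `find_studies` helper and the sample values are all hypothetical and are not the CCLRC metadata format or the project's actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """A group of files, identified by the directory that holds them."""
    name: str
    location: str  # hypothetical path: a directory in SRB or on another file system


@dataclass
class Study:
    """Top-level record: investigators, description, date, keywords, datasets."""
    title: str
    investigators: List[str]
    description: str
    start_date: str  # kept as an ISO date string for simplicity
    keywords: List[str] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)


def find_studies(studies: List[Study], keyword: str) -> List[Study]:
    """Return the studies whose keywords include `keyword`, ignoring case."""
    wanted = keyword.lower()
    return [s for s in studies if wanted in (k.lower() for k in s.keywords)]


def dataset_locations(study: Study) -> List[str]:
    """List the directory locations of every dataset in a study."""
    return [d.location for d in study.datasets]


# Made-up sample record, purely for illustration.
example = Study(
    title="Example dissolution study",
    investigators=["A. Researcher"],
    description="Illustrative record only",
    start_date="2004-09-01",
    keywords=["Dissolution", "weathering"],
    datasets=[Dataset(name="run-1", location="/example/srb/home/user/run-1")],
)
```

A keyword search such as `find_studies([example], "dissolution")` returns the matching study, and `dataset_locations(example)` gives the directories a user would then fetch from SRB or another file system. Holding only a directory location per dataset mirrors the slide's point that files stay where each institution manages them.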