The Quantum Chromodynamics Grid

James Perry, Andrew Jackson, Matthew Egbert, Stephen Booth, Lorna Smith

EPCC, The University Of Edinburgh

Overview

Overview

The data grid

The metadata catalogue and browser

Conclusions

References

Overview

Aim

– To implement a 'QCDGrid' to become a production environment for UKQCD, a collaboration of UK scientists carrying out Quantum Chromodynamics (QCD) simulations

The Grid

– This multi-terabyte storage system will support distributed data management across four UK sites: Edinburgh, Glasgow, Liverpool and Swansea

Funding

– QCDGrid is part of the GridPP project, a PPARC-funded initiative

Why build a QCD Grid?

QCD currently generates terabytes to petabytes of data

– Especially when UKQCD's purpose-built HPC system, QCDOC, comes online

– Post-processing is highly diverse and distributed

– Involves multinational collaborations

The challenge is to store and access this data

– A secure, reliable and expandable distributed storage system is required

Initially, the QCDGrid project aims to address this issue

– Develop a multi-terabyte storage system, supporting distributed data management across different UK sites

The QCDGrid

Stage 1: Implement a multi-site data storage Grid

– Globus Toolkit for basic grid operations, e.g. data transfer, security

– Globus replica catalogue to maintain a directory of files on the Grid

– Intend to use EDG software in the future, e.g. for file replication

Stage 2: Develop structured data which describes the characteristics of the raw data (metadata)

– Develop an XML schema for lattice QCD calculations

– Implement a metadata catalogue

– Develop a metadata catalogue browser

The QCDGrid Structure

Basic DataGrid Requirements

The data grid must distribute data across the four sites

Robustly

– Each file must be replicated at two or more sites

Efficiently

– Where possible, files should be stored close to where they are needed most often

Transparently

– End users should not need to be concerned with how the data grid is implemented
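To illustrate how the robustness and efficiency requirements could interact, here is a minimal Python sketch of a placement policy. It is not taken from the QCDGrid software: the `choose_sites` helper, the access counts and the idea of ranking sites by recent requests are all assumptions made for the example.

```python
# Hypothetical placement policy; names and numbers are illustrative only.
SITES = ["Edinburgh", "Glasgow", "Liverpool", "Swansea"]
MIN_REPLICAS = 2  # robustness: every file lives at two or more sites


def choose_sites(access_counts, n_replicas=MIN_REPLICAS):
    """Return the n_replicas sites that request the file most often.

    access_counts maps site name -> number of recent requests. This is an
    assumed input; the slides do not say how usage is measured.
    """
    if n_replicas < MIN_REPLICAS:
        raise ValueError("each file needs at least two replicas")
    # Sort by descending demand; sorted() is stable, so ties keep SITES order.
    ranked = sorted(SITES, key=lambda s: -access_counts.get(s, 0))
    return ranked[:n_replicas]


# Example: a file requested mostly from Liverpool and Edinburgh.
print(choose_sites({"Liverpool": 12, "Swansea": 3, "Edinburgh": 7}))
```

Transparency would then mean that end users never call anything like `choose_sites` themselves; the grid software applies the policy on their behalf.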

DataGrid Implementation

Hardware

– Storage elements are PCs

– Data stored in RAID arrays, which are cheap and offer built-in redundancy

Software

– Red Hat Linux 7.2 OS

– Globus Toolkit 2.0 used for low-level grid services

– European DataGrid software intended to be used in the next phase for data replication and job submission

– Custom-written QCDGrid software builds on Globus to implement the QCDGrid client tools and control thread

Data Grid Structure

Simple Use Case – Adding a File

The user issues a ‘put’ command

The software chooses a suitable storage element and copies the file to its ‘NEW’ directory

On its next scan, the control thread finds the new file and moves it to its actual home, registering it with the replica catalogue

On its next scan, the control thread finds there is only one copy of the file and makes another one at a suitable site, registering it with the replica catalogue
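The steps above can be sketched as a small in-memory model in Python. This is only an illustration of the described workflow: the `StorageElement` and `DataGrid` classes, the "least loaded element" choice and the node names are assumptions, and no real Globus calls are made.

```python
# In-memory model of the 'put' workflow; all class and method names are
# hypothetical and nothing here talks to a real grid.
class StorageElement:
    def __init__(self, name):
        self.name = name
        self.new = {}   # the 'NEW' directory: files awaiting registration
        self.home = {}  # files in their permanent location


class DataGrid:
    def __init__(self, element_names):
        self.elements = {n: StorageElement(n) for n in element_names}
        self.catalogue = {}  # replica catalogue: file name -> set of element names

    def put(self, filename, data):
        """Client side: copy the file into one element's NEW directory."""
        # Assumed policy: pick the element currently holding the fewest files.
        target = min(self.elements.values(),
                     key=lambda e: len(e.home) + len(e.new))
        target.new[filename] = data
        return target.name

    def scan(self):
        """One pass of the control thread."""
        # First, replicate files registered on an earlier scan that still
        # have only one copy. Doing this before handling NEW files means a
        # fresh file is moved on one scan and replicated on the next.
        for filename, holders in self.catalogue.items():
            if len(holders) == 1:
                source = self.elements[next(iter(holders))]
                spare = [e for e in self.elements.values()
                         if e.name not in holders]
                if spare:
                    dest = min(spare, key=lambda e: len(e.home))
                    dest.home[filename] = source.home[filename]
                    holders.add(dest.name)
        # Then move newly arrived files to their home and register them.
        for element in self.elements.values():
            for filename in list(element.new):
                element.home[filename] = element.new.pop(filename)
                self.catalogue.setdefault(filename, set()).add(element.name)


grid = DataGrid(["edinburgh-1", "liverpool-1"])
grid.put("config_0001.dat", b"lattice data")
grid.scan()  # first scan: file moved home and registered (one copy)
grid.scan()  # second scan: second copy made and registered
print(sorted(grid.catalogue["config_0001.dat"]))
```

Keeping the client's `put` separate from the control thread's `scan` mirrors the split in the slides: the client only drops the file off, and registration and replication happen asynchronously.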

Simple Use Case – Getting a File

The user issues a ‘get’ command on a client machine

The software looks up the replica catalogue to find the nearest copy of the file

The file is transferred from that copy

If the transfer fails, the software looks up the replica catalogue again to find the next nearest copy, and tries to transfer that instead
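A minimal Python sketch of this retry behaviour follows. The `get_file` function, the `distance` metric and the caller-supplied `transfer` callable are assumptions standing in for the real client tools and grid transfer mechanism.

```python
# Hypothetical sketch of 'get' with fallback; transfer() stands in for a
# real grid file transfer and is supplied by the caller.
def get_file(filename, catalogue, distance, transfer):
    """Fetch a file, trying replicas from nearest to farthest.

    catalogue: file name -> iterable of site names holding a copy
    distance:  site name -> cost from this client (an assumed metric)
    transfer:  callable(site, filename) -> bytes, raising OSError on failure
    """
    tried = set()
    while True:
        # Look the file up again on every attempt, as the slide describes.
        candidates = [s for s in catalogue.get(filename, ()) if s not in tried]
        if not candidates:
            raise OSError(f"no working replica of {filename}")
        site = min(candidates, key=lambda s: distance[s])
        try:
            return transfer(site, filename)
        except OSError:
            tried.add(site)  # remember the failure and try the next nearest


def flaky_transfer(site, filename):
    """Toy transfer in which the Edinburgh copy is unreachable."""
    if site == "Edinburgh":
        raise OSError("connection refused")
    return f"{filename} from {site}".encode()


catalogue = {"config_0001.dat": ["Edinburgh", "Liverpool"]}
distance = {"Edinburgh": 1, "Liverpool": 5}
print(get_file("config_0001.dat", catalogue, distance, flaky_transfer))
```

In this toy run the nearest copy fails, so the file is returned from the second nearest site without the user having to intervene.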

Fault Tolerance

Probably the most important requirement of QCDGrid

Central control thread

– Constantly monitoring nodes to make sure they are still working

Node fails without warning

– E-mail sent to the system administrator

– Control thread begins to replicate the files that were on the node elsewhere

Nodes can be temporarily disabled if they have to be shut down or rebooted

– Prevents the grid moving data around unnecessarily

A secondary node is constantly monitoring the central node

– Backing up the replica catalogue and configuration files

– Grid can still be accessed (albeit read-only) if the central node goes down
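One monitoring pass of the control thread might look roughly like the Python sketch below. It is a simplified model, not the QCDGrid implementation: `monitor_pass`, `is_alive` and `notify` are hypothetical names, e-mail is replaced by a callback, and the secondary node is not modelled.

```python
# Hypothetical model of one monitoring pass; is_alive and notify are
# supplied by the caller, and no real e-mail or grid calls are made.
def monitor_pass(nodes, catalogue, is_alive, notify, disabled=frozenset()):
    """Check every node once and re-replicate files from failed nodes.

    nodes:     list of node names
    catalogue: file name -> set of node names holding a copy (updated in place)
    is_alive:  callable(node) -> bool
    notify:    callable(message), e.g. something that e-mails the administrator
    disabled:  nodes taken down deliberately; their files are left alone
    """
    failed = [n for n in nodes if n not in disabled and not is_alive(n)]
    healthy = [n for n in nodes if n not in disabled and n not in failed]
    for node in failed:
        notify(f"node {node} is not responding")
    for filename, holders in catalogue.items():
        # Copies on failed nodes no longer count; copies on deliberately
        # disabled nodes still do, so no data is moved for them.
        holders.difference_update(failed)
        live_sources = [n for n in holders if n in healthy]
        spare = [n for n in healthy if n not in holders]
        # Restore two copies, provided a live copy exists to replicate from.
        while len(holders) < 2 and live_sources and spare:
            holders.add(spare.pop(0))
    return failed


catalogue = {"config_0001.dat": {"edinburgh-1", "liverpool-1"}}
failed = monitor_pass(
    ["edinburgh-1", "edinburgh-2", "liverpool-1"],
    catalogue,
    is_alive=lambda n: n != "liverpool-1",
    notify=print,
)
print(failed, sorted(catalogue["config_0001.dat"]))
```

Passing a node in `disabled` models a planned shutdown or reboot: the node is skipped by the health check, so its files are not copied elsewhere unnecessarily.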

Current Progress

Data grid software has been implemented and is undergoing testing

A 4-node test grid has been set up across two of the sites (Edinburgh and Liverpool)

A web-based status monitor exists, allowing users to check the state of the data grid

Metadata

Storing metadata which describes the actual data

– This allows users to see what is on the grid and find what they want more easily

Data described by XML metadata files

– A schema is being developed for the QCD metadata

The XML files are stored centrally in an XML database, the QCDGrid metadata catalogue

– Using Apache Xindice

The XML files will also be submitted to the data grid itself

– Ensures there is a backup copy of the metadata

– Metadata catalogue can be reconstructed from the data grid in the event that it is lost

Implementation of Metadata

Data submitted to the grid must be accompanied by a valid metadata file

This can be enforced by checking it against the schema

A submission tool (graphical or command line) takes care of sending the data and metadata to the right places

The Xindice XML database is accessed as a grid service

The API for this is being developed by the OGSA DAI project

A graphical metadata browser will allow easy access to data stored on the grid, based on meaningful characteristics
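The "metadata must be valid before submission" rule can be illustrated with a short Python sketch. Full validation against the schema would need an XML Schema validator, which the Python standard library does not include, so this example substitutes a much simpler check: the file must be well-formed and contain a few required elements. The element names, the sample values and the `submit` helper are all made up for the example; the real QCD schema is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical required fields; the real QCD schema is not shown in the slides.
REQUIRED = ("collaboration", "lattice_size", "data_file")


def check_metadata(xml_text):
    """Return a list of problems; an empty list means the file passes."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    problems = []
    for tag in REQUIRED:
        node = root.find(tag)
        if node is None or not (node.text or "").strip():
            problems.append(f"missing or empty <{tag}>")
    return problems


def submit(data_name, xml_text, send):
    """Refuse the submission unless the metadata passes the check."""
    problems = check_metadata(xml_text)
    if problems:
        raise ValueError("; ".join(problems))
    send(data_name, xml_text)  # hand data name and metadata to the grid


# Sample metadata with invented values, for illustration only.
good = """<qcd_metadata>
  <collaboration>UKQCD</collaboration>
  <lattice_size>16x16x16x32</lattice_size>
  <data_file>config_0001.dat</data_file>
</qcd_metadata>"""
print(check_metadata(good))
print(check_metadata("<qcd_metadata/>"))
```

A submission tool built this way rejects incomplete metadata at the client, before anything reaches the data grid or the catalogue.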

Current Progress

XML schema development is well advanced

– Prototype available

Metadata browser applet exists

– May require modification due to changes in APIs used

Metadata catalogue

– OGSA DAI project are providing grid service software to QCDGrid

Conclusions

Aim

– To implement a 'QCDGrid' to become a production environment for UKQCD

Developed a prototype distributed data grid

– Adding ‘real’ data to the grid this month

Developed a prototype XML schema and browser

Utilising the OGSA DAI grid service software for the XML metadata catalogue

References

QCDGrid

– Software mailing list: [email protected]

– Project information e-mail: [email protected]

– Or see: http://www.epcc.ed.ac.uk/computing/research_activities/grid/qcdgrid/

– Example schema, see: http://www.ph.ed.ac.uk/ukqcd/community/the_grid/xml_schema/xml_schema.html

GridPP

– http://www.gridpp.ac.uk
