The NERC DataGrid


Upload: arne

Post on 13-Jan-2016


TRANSCRIPT

Page 1: The NERC DataGrid

NOCS, PML, STFC, BODC, BADC

The NERC DataGrid

Bryan Lawrence

Director of the STFC Centre for Environmental Data Archival

(BADC, NEODC, IPCC-DDC etc)

Representing the NDG and a dose of the “Portals Project”

Page 2: The NERC DataGrid

Royal Society, April, 2008

http://ndg.nerc.ac.uk

British Atmospheric Data Centre

British Oceanographic Data Centre

Simulations

Assimilation

Complexity + Volume + Remote Access = Grid Challenge

Page 3: The NERC DataGrid

Data Sets

• Ground-based observation networks: Met Office surface stations

• Model output: NWP, ECMWF reanalyses & climate models

• Satellite data: TOMS, Envisat & MSG

• NERC programmes data: UTLS, CWVC & URGENT

“A collection of files with a common theme and administration”

Dataset statistics:
• 147 datasets
• ~130 TB (unique)
• ~80 million files

Page 4: The NERC DataGrid

The NDG Use Cases

1. Find data
2. Find out about data
3. Subset and difference data
4. Visualise data
5. All using SECURE interdisciplinary technology.

Sounds easy, doesn’t it?

Want interdisciplinary semantic access to information, not abstract data:
– getData(potential temperature from ERA-40 dataset in North Atlantic from 1990 to 2000)
– not: getData(“era40.nc”, ‘PTMP’, 20:50, 300:340, 190:200)
– or even worse:
    for j=1990:2000
        getData(“era40_”+j+“.nc”, ‘PTMP’, 20:50, 300:340)
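
The contrast between the semantic request and the file-level calls can be sketched in Python. This is only an illustration, not NDG code: the lookup tables, the `FileRequest` type and the `plan_requests` function are hypothetical, and reading 20:50 and 300:340 as latitude and longitude bounds is an assumption. The idea is that the user states what they want, and a planning layer expands that into the per-file calls.

```python
from dataclasses import dataclass

# Hypothetical lookup tables. A real service would consult metadata
# catalogues and a vocabulary server instead of hard-coded dictionaries.
PARAMETER_CODES = {"potential temperature": "PTMP"}
REGIONS = {"North Atlantic": {"lat": (20, 50), "lon": (300, 340)}}  # assumed bounds


@dataclass(frozen=True)
class FileRequest:
    """One low-level request of the kind the user should not have to write."""
    filename: str
    variable: str
    lat: tuple
    lon: tuple


def plan_requests(parameter, dataset, region, start_year, end_year):
    """Expand one semantic request into per-year file-level requests."""
    code = PARAMETER_CODES[parameter]
    box = REGIONS[region]
    return [
        FileRequest(f"{dataset}_{year}.nc", code, box["lat"], box["lon"])
        for year in range(start_year, end_year + 1)
    ]


requests = plan_requests(
    "potential temperature", "era40", "North Atlantic", 1990, 2000
)
print(len(requests), requests[0].filename)  # 11 era40_1990.nc
```

The user-facing call names a phenomenon, a dataset, a region and a period; file names, variable codes and index ranges stay inside the planning layer.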

Page 5: The NERC DataGrid

The big: Climate in 2010 – A graphic Illustration

Figures from Gary Strand, NCAR, ESG website

Page 6: The NERC DataGrid

… and the little … land use!

Page 7: The NERC DataGrid

ISO19109 in a nutshell

Page 8: The NERC DataGrid

Dataset Metadata

Page 9: The NERC DataGrid

MOVIE(s)

Page 10: The NERC DataGrid

NDG Deployment

Vocab Server:

• 120,000+ concepts
• 100 lists
• 75,000 relationships (SKOS)

Page 11: The NERC DataGrid

Dataset Metadata

Page 12: The NERC DataGrid

MOLES: Metadata Objects for Linking Environmental Science

A data production tool deployed at an observation station on behalf of an activity produces a dataset.

Datasets have provenance.
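
The relationships named on this slide can be sketched as linked Python objects. This is a loose illustration and not the MOLES schema itself: the class names mirror the slide's wording, while the `Deployment` and `MolesDataset` wrappers, the `provenance` method and all values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProductionTool:
    name: str


@dataclass(frozen=True)
class ObservationStation:
    name: str


@dataclass(frozen=True)
class Activity:
    name: str


@dataclass(frozen=True)
class Deployment:
    """A tool deployed at a station on behalf of an activity."""
    tool: DataProductionTool
    station: ObservationStation
    activity: Activity


@dataclass(frozen=True)
class MolesDataset:
    title: str
    produced_by: Deployment

    def provenance(self) -> str:
        """Describe where the dataset came from by following its links."""
        d = self.produced_by
        return (
            f"{self.title}: produced by {d.tool.name} "
            f"at {d.station.name} for {d.activity.name}"
        )


# Placeholder values, not real NDG records.
deployment = Deployment(
    DataProductionTool("example instrument"),
    ObservationStation("example station"),
    Activity("example campaign"),
)
dataset = MolesDataset("example dataset", deployment)
print(dataset.provenance())
```

Because the dataset holds a link to its deployment rather than a free-text note, provenance can be derived by walking the links.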

Page 13: The NERC DataGrid

Getting down to the data itself

A-Metadata

• We choose to deploy an “Application Schema” of the “Geography Markup Language” (GML).

• This allows us to define our data to support RE-USE by other communities – interdisciplinarity – and to be INSPIRE compliant!

Page 14: The NERC DataGrid

CSML Dataset

Dictionaries:
• coordinate reference systems
• phenomena (parameters)
• units of measure

Feature Collections:
• CSML Features are SPECIALISATIONS of GML Features!

Storage Descriptor:
• Re-usable description of storage

(NB: CSML Software Stack, WMS, WCS etc)
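
The three parts of a CSML dataset listed above can be pictured with a small Python sketch. It is not the CSML schema or the CSML software stack: every class, field and value here is hypothetical. It only shows why a re-usable storage descriptor is convenient, since several features can point at the same description of storage.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StorageDescriptor:
    """Re-usable description of where the values are stored (illustrative)."""
    descriptor_id: str
    filename: str


@dataclass(frozen=True)
class Feature:
    """A feature instance; feature_type is one of the CSML feature types."""
    feature_id: str
    feature_type: str
    phenomenon: str   # key into the dataset's dictionaries
    storage_ref: str  # id of a StorageDescriptor


@dataclass
class CSMLDataset:
    phenomena: dict = field(default_factory=dict)  # dictionary of parameters
    units: dict = field(default_factory=dict)      # units of measure per phenomenon
    features: list = field(default_factory=list)   # the feature collection
    storage: dict = field(default_factory=dict)    # descriptors keyed by id

    def resolve(self, feature_id):
        """Return (feature, storage descriptor, unit) for one feature."""
        feature = next(f for f in self.features if f.feature_id == feature_id)
        return feature, self.storage[feature.storage_ref], self.units[feature.phenomenon]


# Placeholder content: two features share one storage descriptor.
shared = StorageDescriptor("sd1", "example.nc")
csml = CSMLDataset(
    phenomena={"air_temperature": "example parameter"},
    units={"air_temperature": "K"},
    features=[
        Feature("f1", "gridseries", "air_temperature", "sd1"),
        Feature("f2", "profile series", "air_temperature", "sd1"),
    ],
    storage={"sd1": shared},
)
_, descriptor, unit = csml.resolve("f2")
print(descriptor.filename, unit)  # example.nc K
```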

Page 15: The NERC DataGrid

CSML Feature Types

Abstract

Specific: point data, time-series of point data, trajectory data, point collections, profiles, profile series, ragged profile series, section, ragged section, scanning radar, grid, gridseries and swath

Page 16: The NERC DataGrid

Beyond measuring things at points

CF concepts:
• Cell-Methods (statistical, e.g. means etc.)
• Cell-Bounds (intensive quantities, extensive quantities, relationship to coverage domain)

New work:
• Exploiting Sensor Web Enablement

Page 17: The NERC DataGrid

Storage Descriptor

Page 18: The NERC DataGrid

Data Modelling Workshop

• April 2007, at The Cosener's House, Abingdon
• Brought NDG, Unidata, OGC and the Italian NRC together to get a shared vision for scientific data types.
• Roadmap for bringing our activities together over the next few years.
• Harmonising:
– NDG CSML
– OGC Observations and Measurements (with CSML and MOLES)
– UNIDATA Common Data Model
– INRC NCML-G

Page 19: The NERC DataGrid

Data Model Relationships

Route to harmonising GeoSciML and CSML

Page 20: The NERC DataGrid

The Original Vision

[Architecture diagram. Labels: Wider Internet; NERC Grid; tape robot; XML database; online data; BADC NDG Wrapper; BODC NDG Wrapper; Group NDG Wrapper; research group data sources; software agent; grid user; satellite; supercomputer; Internet link; Internet user; ESG (& other) applications; NDG web portal]

Page 21: The NERC DataGrid

What we’ve done (are doing)

[Architecture diagram repeated from the previous slide, annotated with the items below]

• Schema: MOLES, CSML, DIF + lots of population issues
• Protocols: OGC + OAI
• Pylons-based portal (Python)
• WCS/WMS clients, CSML Toolbox
• Vocab Server
• Pylons interface & portals

Page 22: The NERC DataGrid

Now and the Future

More protocol improvements

(= wider community of providers and users)

More deployment.

More population.

Evolution WITH the web.