Presented by The Earth System Grid: Turning Climate Datasets into Community Resources David E. Bernholdt, ORNL on behalf of the Earth System Grid team at Argonne National Laboratory Lawrence Berkeley National Laboratory Lawrence Livermore National Laboratory Los Alamos National Laboratory National Center for Atmospheric Research National Oceanic and Atmospheric Administration Oak Ridge National Laboratory University of Southern California www.earthsystemgrid.org


Page 1

Presented by

The Earth System Grid: Turning Climate Datasets into Community Resources

David E. Bernholdt, ORNL
on behalf of the Earth System Grid team at

Argonne National Laboratory
Lawrence Berkeley National Laboratory
Lawrence Livermore National Laboratory
Los Alamos National Laboratory
National Center for Atmospheric Research
National Oceanic and Atmospheric Administration
Oak Ridge National Laboratory
University of Southern California

www.earthsystemgrid.org

Page 2

2 Bernholdt_ESG_SC07

The growing importance of climate simulation data

DOE invests broadly in climate change research:

Development of climate models

Climate change simulation

Model intercomparisons

Observational programs

Climate change research is increasingly data-intensive:

Analysis and intercomparison of simulation and observations from many sources

Data used by model developers, impacts analysts, policymakers


Results from the Parallel Climate Model (PCM) depicting wind vectors, surface pressure, sea surface temperature, and sea ice concentration. Prepared from data published in the ESG using the FERRET analysis tool by Gary Strand, NCAR.

Page 3

Earth System Grid objectives

To support the infrastructural needs of the national and international climate community, ESG is providing crucial technology to securely access, monitor, catalog, transport, and distribute data in today’s grid computing environment.

[Figure labels: HPC hardware running climate models; ESG Sites; ESG Portal]

Page 4

ESG facts and figures

Main ESG Portal
• 146 TB of data at four locations
• 1,059 datasets
• 958,072 files
• Includes the past 6 years of joint DOE/NSF climate modeling experiments
• 4,910 registered users
• Downloads to date: 30 TB, 106,572 files

CMIP3 (IPCC AR4) ESG Portal
• 35 TB of data at one location
• 77,400 files
• Generated by a modeling campaign coordinated by the Intergovernmental Panel on Climate Change
• Model data from 13 countries
• 1,314 registered users
• Downloads to date: 245 TB, 914,400 files, 500 GB/day (average)

More than 300 scientific papers published to date based on analysis of CMIP3 (IPCC AR4) data

[Figures: Worldwide ESG user base; CMIP3 (IPCC AR4) Daily Downloads (through 7/2/07)]
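
The portal figures invite a quick comparison: the CMIP3 portal has served about seven times the volume it holds (245 TB downloaded against a 35 TB archive), while the main portal has served roughly a fifth of its 146 TB. The short sketch below works through that arithmetic. It assumes decimal units (1 TB = 1,000,000 MB), which the slide does not state, and the dictionary layout and function name are illustrative only.

```python
# Figures quoted on the slide; decimal units are an assumption (1 TB = 1,000,000 MB).
portals = {
    "Main ESG Portal": {
        "archive_tb": 146, "archive_files": 958_072,
        "downloaded_tb": 30, "downloaded_files": 106_572,
    },
    "CMIP3 (IPCC AR4) ESG Portal": {
        "archive_tb": 35, "archive_files": 77_400,
        "downloaded_tb": 245, "downloaded_files": 914_400,
    },
}

MB_PER_TB = 1_000_000


def summarize(stats):
    """Derive simple ratios from one portal's archive and download totals."""
    return {
        # How many times the archive volume has been downloaded in total.
        "download_to_archive_ratio": stats["downloaded_tb"] / stats["archive_tb"],
        # Mean size of a file held in the archive.
        "mean_archive_file_mb": stats["archive_tb"] * MB_PER_TB / stats["archive_files"],
        # Mean size of a file that was downloaded.
        "mean_downloaded_file_mb": stats["downloaded_tb"] * MB_PER_TB / stats["downloaded_files"],
    }


summaries = {name: summarize(stats) for name, stats in portals.items()}

for name, summary in summaries.items():
    print(name)
    for key, value in summary.items():
        print(f"  {key}: {value:.2f}")
```

The ratios are only as precise as the rounded totals on the slide, so they are best read as orders of magnitude.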

Page 5

ESG architecture and underlying technologies

Climate data tools
• Metadata catalog
• NcML (metadata schema)
• OPeNDAP-G (aggregation and subsetting)

Data management
• Data Mover Lite
• Storage Resource Manager

Globus toolkit
• Globus Security Infrastructure
• GridFTP
• Monitoring and Discovery Services
• Replica Location Service

Security
• Access control
• MyProxy
• User registration

[Figure: First Generation ESG Architecture. Labels shown: data users and data providers connecting through web browsers and DML; the ESG Web Portal with search, browse, download, and publish functions; portal services for data subsetting, access control, user registration, monitoring, data publishing, climate metadata catalogs, browsing, usage metrics, data download, and data search; OPeNDAP-G, MyProxy, and an SRM disk cache; storage sites NCAR (cache, MSS), NERSC, LANL (cache), LLNL (cache), and ORNL (HPSS), with RLS and SRM instances alongside the sites. MSS, HPSS: tertiary data storage systems.]
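
The architecture pairs a Replica Location Service (RLS) with the storage at each site, which suggests a lookup from a logical file name to one or more physical copies. The sketch below illustrates that idea only. It is not the Globus RLS API: the `ReplicaCatalog` class, the `choose_replica` helper, the file names, and the `example.org` hosts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class ReplicaCatalog:
    """Toy replica index (hypothetical): logical file name -> physical copies."""

    entries: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, logical_name: str, physical_url: str) -> None:
        """Record one more physical copy of a logical file."""
        self.entries.setdefault(logical_name, []).append(physical_url)

    def lookup(self, logical_name: str) -> List[str]:
        """Return all known copies, or an empty list if the name is unknown."""
        return list(self.entries.get(logical_name, []))


def choose_replica(urls: Sequence[str], preferred_hosts: Sequence[str]) -> Optional[str]:
    """Pick a copy at the most preferred host, else fall back to the first copy."""
    for host in preferred_hosts:
        for url in urls:
            if host in url:
                return url
    return urls[0] if urls else None


# Hypothetical logical name and GridFTP-style URLs on made-up hosts.
catalog = ReplicaCatalog()
catalog.register("pcm/run1/sst.nc", "gsiftp://cache.ncar.example.org/pcm/run1/sst.nc")
catalog.register("pcm/run1/sst.nc", "gsiftp://hpss.ornl.example.org/pcm/run1/sst.nc")

urls = catalog.lookup("pcm/run1/sst.nc")
best = choose_replica(urls, preferred_hosts=["ornl.example.org", "ncar.example.org"])
print(best)
```

Separating the logical name from its physical locations is what lets a portal hand the user one download link while the data itself lives at several sites.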

Page 6

Evolving ESG to petascale

ESG Data System Evolution

2006: Central database
• Centralized curated data archive
• Time aggregation
• Distribution by file transport
• No ESG responsibility for analysis
• Shopping-cart-oriented web portal

Early 2009: Testbed data sharing
• Federated metadata
• Federated portals
• Unified user interface
• Selected server-side analysis
• Location independence
• Distributed aggregation
• Manual data sharing
• Manual publishing

2011: Full data sharing (add to testbed…)
• Synchronized federation: metadata, data
• Full suite of server-side analysis
• Model/observation integration
• ESG embedded into desktop productivity tools
• GIS integration
• Model intercomparison metrics
• User support, life cycle maintenance

ESG Data Archive: terabytes to petabytes
Data holdings shown: CCSM, IPCC; later CCSM, IPCC, satellite, in situ, biogeochemistry, ecosystems

Page 7

Architecture of the next-generation ESG

• Petascale data archives
• Broader geographical distribution of archives across the United States and around the world
• Easy federation of sites
• Increased flexibility and robustness

[Figure: Second Generation ESG Architecture, Federated ESG Deployment. Labels shown: multiple ESG Nodes; three ESG Gateways (CCES, IPCC, CCSM), each with a web portal, interfaces, applications, and data & metadata holdings; distribution online data, deep archives, and CPU. Layered view: browser clients and web portals; remote application clients (CDAT, NCL, Ferret, GIS, Publishing, OPeNDAP, DML, Modeling, etc.); local, remote, and web services interfaces; application components (data transfer, data publishing, search, analysis, visualization, post-processing, computation); cross-cutting concerns (security, logging, monitoring); workflow & orchestration.]
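
The federated deployment shows several gateways (CCES, IPCC, CCSM), each with its own data and metadata holdings. One way to picture a federated search is a single query fanned out to every gateway, with the hits merged and tagged by origin. The sketch below is purely illustrative: the record layout, dataset identifiers, and `federated_search` function are hypothetical and do not describe the actual ESG interfaces.

```python
from typing import Dict, List

# Hypothetical metadata holdings per gateway; identifiers and fields are made up.
GATEWAY_HOLDINGS: Dict[str, List[Dict[str, str]]] = {
    "CCSM": [
        {"id": "ccsm.example.run-a.tas", "variable": "tas"},
        {"id": "ccsm.example.run-a.sst", "variable": "sst"},
    ],
    "IPCC": [
        {"id": "ipcc.example.model-x.tas", "variable": "tas"},
    ],
    "CCES": [
        {"id": "cces.example.obs-1.sst", "variable": "sst"},
    ],
}


def federated_search(holdings: Dict[str, List[Dict[str, str]]], variable: str) -> List[Dict[str, str]]:
    """Query every gateway for one variable and merge the hits.

    Each hit is tagged with the gateway it came from, so the caller can
    tell where a dataset is held without knowing that in advance.
    """
    hits: List[Dict[str, str]] = []
    for gateway in sorted(holdings):  # sorted for a stable result order
        for record in holdings[gateway]:
            if record["variable"] == variable:
                hits.append({"gateway": gateway, **record})
    return hits


results = federated_search(GATEWAY_HOLDINGS, "tas")
for hit in results:
    print(hit["gateway"], hit["id"])
```

In a real federation each gateway would answer over the network; the in-memory dictionary here stands in for those remote catalogs.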

Page 8

The team and sponsors

• National Center for Atmospheric Research
• Los Alamos National Laboratory
• Argonne National Laboratory
• Oak Ridge National Laboratory
• USC Information Science Institute
• Lawrence Livermore National Laboratory/PCMDI
• Lawrence Berkeley National Laboratory
• National Oceanic & Atmospheric Administration/PMEL

[Map legend: Climate Data Repository and ESG participant; ESG participant]

Page 9

For more information…

ORNL booth at SC2007

• David Bernholdt

Other booths at SC2007

• ANL/Global Grid Forum (Booth 551) Ann Chervenak

• LBNL (351) Arie Shoshani, Alex Sim

• NCAR (361) Don Middleton

Internet

• http://www.earthsystemgrid.org

[email protected]
