1 cyberinfrastructure for the 21 st century (cif21): data mri and stci earthcube casc sept 9, 2011...

29
1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National Science Foundation [email protected] 1

Upload: gloria-gordon

Post on 12-Jan-2016

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

1

Cyberinfrastructure for the 21st Century (CIF21): Data

MRI and STCI

EarthCube

CASCSept 9, 2011

Rob PenningtonOffice of Cyberinfrastructure (OCI)

National Science [email protected]

1

Page 2: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Framing the Challenge:Science and Society Transformed by

Data Modern science

Data- and compute-intensive

Integrative, multiscale Multi-disciplinary

Collaborations for Complexity Individuals, groups,

teams, communities Sea of Data

Age of Observation Distributed, central

repositories, sensor- driven, diverse, etc 2

Page 3: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Advisory Committee for Cyberinfrastructure Task Force

Reports

GrandChallenges

CampusBridgingData and Viz

Cyberlearning

HPC

HIGH P ERFORMANCE COMPUTING

Software

More than 25 workshops and Birds of a Feather sessions and more than 1300 people involved

Final recommendations presented to the NSF Advisory Committee on Cyberinfrastructure (ACCI) Dec 2010

Final reports on-line at: http://www.nsf.gov/od/oci/taskforces/

3

Page 4: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Data Task Force Recommendations

Infrastructure: Recognize data infrastructure and services (including visualization) as essential long term research assets fundamental to today’s science

Economic sustainability: Develop realistic cost models to underpin institutional/national business plans for research repositories/data services

Culture Change: Emphasize expectations for data sharing; support the establishment of new citation models in which data and software tool providers and developers are recognized and credited with their contributions

Data Management Guidelines: Identify and share best-practices for the critical areas of data management

Ethics and IP: Train researchers in privacy-preserving data access

4

Page 5: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Evolution of Cyberinfrastructure for the 21st Century (CIF21) and

Data

5

ACCI Data Task Force

National Science Board (NSB)

DataNet Program

Community Input

NSFCIF21 Data

Programs

On-going input

Science &Engineering Research

+ Cyberinfrastructure

Page 6: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

DiscoveryCollaboration

Education

Maintainability, sustainability, and extensibility

Cyberinfrastructure Ecosystem (CIF21)

Organizations Universities, schools Government labs, agencies Research and Medical Centers Libraries, Museums Virtual Organizations Communities

Expertise Research and Scholarship Education Learning and Workforce Development Interoperability and operations Cyberscience

Networking Campus, national, international networks Research and experimental networks End-to-end throughput Cybersecurity

Computational Resources Supercomputers Clouds, Grids, Clusters Visualization Compute services Data Centers

Data Databases, Data repositories Collections and Libraries Data Access; storage, navigation management, mining tools, curation, privacy

Scientific Instruments Large Facilities, MREFCs,,telescopes Colliders, shake Tables Sensor Arrays - Ocean, environment, weather, buildings, climate. etc

Software Applications, middleware Software development and supportCybersecurity: access, authorization, authentication

Page 7: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

DiscoveryCollaboration

Education

CIF21: Four Major Thrust Areas

Organizations Universities, schools Government labs, agencies Research and Medical Centers Libraries, Museums Virtual Organizations Communities

Expertise Research and Scholarship Education Learning and Workforce Development Interoperability and operations Cyberscience

Networking Campus, national, international networks Research and experimental networks End-to-end throughput Cybersecurity

Computational Resources Supercomputers Clouds, Grids, Clusters Visualization Compute services Data Centers

Data Databases, Data repositories Collections and Libraries Data Access; storage, navigation management, mining tools, curation, privacy

Scientific Instruments Large Facilities, MREFCs,,telescopes Colliders, shake Tables Sensor Arrays - Ocean, environment, weather, buildings, climate. etc

Software Applications, middleware Software development and supportCybersecurity: access, authorization, authentication

Data-Enabled Science

New ComputationalResources

Community ResearchNetworks

Access andConnections toCI Resources

Education: integral and embedded

Page 8: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Scientific Data Challenges

8

Byt

es

pe

r d

ay

2012 2020

Genomics

LHC

TeraGrid, BlueWaters

SquareKilometer

Array

Genomics

LHC

Climate, Environment

LSST

ExaBytes

PetaBytes

TeraBytes

GigaBytes

Climate, Environment

Volume

Useful

Lifetim

e

Distribution

Data Access

Many smaller datasets…

DataNet

Page 9: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Support data intensive and multi-disciplinary science

Provide reliable digital access, integration, management and preservation capabilities for science and engineering data over a decades-long timeline

Develop innovative data analysis and mining tools to support data manipulation, modeling, and discovery

Engage at the frontiers of technological innovation and transformative science to drive the leading edge forward

9

CIF21 Data Goals

Page 10: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

DataNet is a strategic part of Foundation-wide investments in data in CIF21• Focus on center–scale awards

DataNet efforts effectively balance:• Production infrastructure to provide operational

services• Research to create next generation infrastructure

DataNet awards are partnerships• Responsive to user communities to define their

meaningful and useful scope• Form a coordinated network to provide national,

interdisciplinary data models and infrastructure

DataNet Role in CIF21

Page 11: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

DataNet: A Multi-tiered and Multi-Disciplinary Landscape

11

GenomicsCommunities

Modeling and Simulation Communities

Population, Climate, Environment Communities

Data Curation

Data Storage

Data-enabled Science

DataNet supported

Page 12: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Data Storage

National storage infrastructure for scientific data Accommodate scale and heterogeneity of scientific

data through robust, open, and broadly accepted standards

Sustainable cost model that can be implemented with governmental, academic, non profit, and commercial stakeholders such that it is sustainable.

Make strategic investments that: Leverage existing resources in TeraGrid, commercial

clouds, federal data centers Meet growing capacity needs at optimum cost Provide coordinating and integrative functions for

integrity, access control, availability, persistence Catalyze a national data infrastructure in a

similar role that NSFNet played in Internet12

Page 13: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Data Curation

Sustainable, community-based networks for management of critical scientific data resources in a life-cycle context.

Overcome challenges of culture change, policy development and implementation, sustainable operations, quality and usability control.

Strategic awards that address heterogeneity in formats, complexity, semantics of data collections that are valued by science communities of significant breadth.

Operate as a network of data services that promote interoperability, multidisciplinarity, and scalability. 13

Page 14: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Data Enabled Science

Provide critical tools and services for data mining, integration, analysis, modeling and visualization.

Overcome barriers to scaling, synthesis, and interoperability to promote effective use of large scale, shared data resources.

Strategic investments that concentrate tools, resources and expertise in support of compelling grand challenge science questions.

14

Page 15: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Cross Cutting Challenges

Balancing research into next generations of infrastructure with operation & maintenance of current capacity. Stimulate innovation and manage transitions

Sustainable, long term programs Technical design, development of business models,

and integration with the research cycle. Integration

Vertical – Linking low-level bit storage infrastructure to data collections, and finally to applications

Horizontal– Achieving connectivity and interoperability between activities that vary in scale, disciplinarity, and funding source.

15

Page 16: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Life cycle perspective covering the use of the data Research, development, implementation, operations,

sustainability, close-out

Apply project management methods WBS, risk management, change control, schedule, milestones,

deliverables

Standardized process: Evaluate science merit, conceptual design Develop draft PEP, design and reporting metrics. Critical review – prototype, finalize baseline (approval/mid-

course correction/off-ramp) Implementation & operations – subject to change control,

oversight based on milestones & metrics Final operational review – informs decision for renewal,

termination.

DataNet Program Management

16

Page 17: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

DataNet Federation ConsortiumData Driven Science

Implement national data grid Federate existing discipline-specific data management

systems to enable national research collaborations

Enable collaborative research on shared data collections Manage collection life cycle as the user community

broadens

Integrate “live” research data into education initiatives Enable student research participation through control

policies

Project

Shared Collection

Processing Pipeline

Digital Library

Reference Collection

Federation

Collection Life Cycle

Cyber-infrastructure Partners: Univ. of North Carolina, Chapel HillUniv. of California, San DiegoArizona State UniversityDrexel UniversityDuke UniversityUniversity of ArizonaUniversity of South Carolina

Science and Engineering Initiatives: Ocean Observatories Initiativethe iPlant CollaborativeCUAHSICIBER-UOdum Social Science InstituteTemporal Dynamics of Learning Center

National Science Foundation Cooperative Agreement: OCI-0940841Policy-based

data management

Page 18: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

CUNY SI: Instrumentation for Enabling Data Analysis, Sharing, Storage, and Preservation

UC Boulder: Acquisition of a Scalable Petascale Storage Infrastructure for Data-Collections and Data-Intensive Discovery

RPI: Acquisition of a Balanced Environment for Simulation

NCA&T: Acquisition of a Complete High-Performance Modeling and Visualization System for Research in Mathematical Biology and Mathematical Geosciences

OSU: Acquisition of a High Performance Compute Cluster for Multidisciplinary Research

MRI 2011

18

Page 19: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

WHAT IS EARTHCUBE?

Page 20: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

A Call to Action

Over the next decade, the geosciences community commits to developing a

framework to understand and predict responses of the Earth as a

system—from the space-atmosphere boundary to the Earth’s core,

including the influences of humans and ecosystems

Transitions and Tipping Points in Complex Environmental Systems, NSF AC for Environmental Research and Education, 2009

Earth Science and Applications from Space: National Imperatives for the Next Decade and Beyond, 2007

High-Performance Computing Requirements for the Computational Solid Earth Sciences, 2005

Page 21: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Goal of EarthCube

To transform the conduct of research in geosciences by supporting community-based cyberinfrastructure to integrate data and information for knowledge management across the Geosciences.

Page 22: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

What Needs To Be Done?

Integrate data, tools and communities through cyberinfrastructure

Establish a governance mechanism that is inclusive and adopted by the community

Utilize current and emerging technologies to create transparent infrastructure for the geosciences community

Page 23: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Modes of Support

Convergence to a Unifying Architecture

Page 24: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

EARTHCUBE ASSUMPTIONS

The geosciences community is ready to take on the EarthCube challenge

Community will start self-organizing prior to EarthCube activities, like the Nov 1-4 Charrette

Current and emerging technology will help achieve the convergence envisioned for EarthCube

A broad range of expertise and resources must be engaged to shape EarthCube

Page 25: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Jun 2011Jul-Sept 2011

Nov 1-4 2011Nov/11-Apr/12

May 2012

DCLReleased

TwoWebExevents

Charrette Proposed

FrameworkApproaches

Developed through EAGERs

Sandpit/IdeasLabto determine18 mo. prototype award(s)

Page 26: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

EARTHCUBE TIMELINE

On-line Community Information: August to November, 2011

EarthCube Charrette: Early November, 2011

EarthCube Ideas/Lab: Tentatively Early May, 2012

Prototype Development: May to December 2013

Fully integrated geosciences infrastructure: 2014-2022

Page 27: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Pre-Charrette Organization(August – September)

Second WebEx on Aug. 22 NSF seeks input from wide range of sources

Individuals, inst./org., representatives of scientific groups or communities

Facilities and managers of CI endeavors Industry, Federal Labs., Federal Agencies, and

International Partners NSF will establish on-line resources and forums to

Gather community inputs/requirements Facilitate partnerships and collaborations Encourage submission of approaches to the EarthCube

design

Page 28: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

Charrette Process

Stakeholders focus EarthCube Ideas and Activities Plenary Sessions to

discuss user requirements refine approaches and designs for EarthCube develop partnerships and new collaborations

Remote participation and real-time comments system will be available

Summary Session Comments from NSF, facilitators, and participants on

process NSF provides guidance on post-Charrette activities

Page 29: 1 Cyberinfrastructure for the 21 st Century (CIF21): Data MRI and STCI EarthCube CASC Sept 9, 2011 Rob Pennington Office of Cyberinfrastructure (OCI) National

29

Questions?