Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways [email protected] SAB Meeting, January 14-15, 2008


Page 1: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Science Gateways

and their tremendous potential for science

and engineering

Nancy Wilkins-Diehr
TeraGrid Area Director for Science Gateways

[email protected]

SAB Meeting, January 14-15, 2008

Page 2: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

•Gateway motivation
•Current examples
•Experience and activities to date
•Review information
– March, 2007 program review
– June, 2007 TeraGrid Futures gateway workshop
•Future directions, request for feedback


Page 3: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Phenomenal Impact of the Internet on Worldwide Communication and Information Retrieval

•Implications for the conduct of science are still evolving
– 1980s: early gateways; the National Center for Biotechnology Information BLAST server sent search results by email and is still a working portal today
– 1992: Mosaic web browser developed
– 1995: “International Protein Data Bank Enhanced by Computer Browser”
– 2004: TeraGrid project director Rick Stevens recognized growth in scientific portal development and proposed the Science Gateway Program
•Simultaneous explosion of digital information
– Analysis needs in a variety of scientific areas
– Sensors, telescopes, satellites, digital images and video
– #1 machine on Top500 today is 300x more powerful than all combined entries on the first list in 1993

Only 15 years since the release of Mosaic!

Page 4: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways


Gateway Timeline
•October, 2004: “TeraGrid Science Gateway” term originates
– We will help them build gateway portals that leverage TeraGrid capabilities and provide web-based interfaces to community tools. Typical services provided will include access to the following:
  •Data: metadata catalogs for the community data resources, the user’s experiments, and remote files, with access via browsable directories, query interfaces, or indexes.
  •Analysis: hyperlinked visualization and other data analysis and grid-enabled desktop tools.
  •Applications: applications encapsulated as web services and given a user interface in the portal. The portal manages back-end job management and, based on the user’s authorization capabilities, the level of resources applied to the user’s request.
  •Collaboration: newsgroups, shared data spaces, “publication” mechanisms.
  •Workflow: tools that enable the user to compose TeraGrid and application services to create new applications to be “published” for others to use.

Page 5: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Science Gateways are a Natural Extension of Internet Developments

•3 common types of gateway
– Web portal with users in front and services in back
– Client-server model in which application programs run on users' machines (i.e. workstations and desktops) and access services
– Bridges across multiple grids, allowing communities to utilize both community-developed grids and shared grids
•Continued rapid changes ahead, must be adaptable, gateways can provide some nimbleness


Page 6: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Gateway Idea Resonates with Scientists
•Capabilities provided by the Web are easy to envision because we use them in everyday life
•Researchers can imagine scientific capabilities provided through a familiar interface
•Groups resonate with the fact that gateways are designed by communities and provide interfaces understood by those communities
– But also provide access to greater capabilities on the back end without the user needing to understand the details of those capabilities
– Scientists know they can undertake more complex analyses and that’s all they want to focus on
•But this seamless access doesn’t come for free. It all hinges on very capable developers


Page 7: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Highlights: LEAD Inspires Students
Advanced capabilities regardless of location

•A student gets excited about what he was able to do with LEAD
•“Dr. Sikora: Attached is a display of 2-m T and wind depicting the WRF's interpretation of the coastal front on 14 February 2007. It's interesting that I found an example using IDV that parallels our discussion of mesoscale boundaries in class. It illustrates very nicely the transition to a coastal low and the strong baroclinic zone with a location very similar to Markowski's depiction. I created this image in IDV after running a 5-km WRF run (initialized with NAM output) via the LEAD Portal. This simple 1-level plot is just a precursor of the many capabilities IDV will eventually offer to visualize high-res WRF output. Enjoy! Eric” (email, March 2007)


Page 8: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways


Highlights: GridChem’s Client-Server Approach Provides Power and a Rich Feature Set

Source: Sudhakar Pamidighantam, NCSA

Page 9: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Biomedical Informatics Research Network (BIRN)

BIRN is a National Center for Research Resources (NCRR) initiative aimed at creating a testbed to address biomedical researchers

Source: Anthony Kolasny, Johns Hopkins


Page 10: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Shape Analysis - A Morphometry BIRN Project

[Figure: shape-analysis workflow diagram with steps numbered 1 to 5. Diagram labels: MGH Segmentation; Data Donor Sites; De-identification and upload; JHU CIS-KKI; Shape Analysis of Segmented Structures; Storage; BWH Visualization; TeraGrid Supercomputing]

Goal: comparison and quantification of structures’ shape and volumetric differences across patient populations

Source: Anthony Kolasny, Johns Hopkins


Page 11: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

BIRN uses SSHFS to mount TeraGrid filesystems locally

220TB through CIS portal using autofs, samba, smbwebclient.

CIS has 87TB of local storage.

/cis/net lists network drives.

Source: Anthony Kolasny, Johns Hopkins University

Page 12: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

CReSIS (Center for Remote Sensing of Ice Sheets)

•Awarded CI-TEAM funding to build a Polar Gateway
– International Polar Year 2007-2008
– Led by Geoffrey Fox, IU and Linda Hayden, Elizabeth City State
•CReSISGrid
– Build a TeraGrid Science Gateway
– Provide broad-based educational and training activity in Cyberinfrastructure for remote sensing and ice sheet dynamics
– Lessons learned in remote data gathering can be applied to fields


Page 13: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Easy TeraGrid Gateway True and False Test

Answers Provided
•TeraGrid selects all gateways (F)
•TeraGrid designs all gateways (F)
•TeraGrid limits the number of gateways (F)
•All gateways need TeraGrid funding to exist (F)
•Any PI can request an allocation and use it to develop a gateway (T)
•Gateway design is community-developed and that is the core strength of the program (T)
•TeraGrid staff are alerted to gateway work when a proposal is reviewed or when a community account is requested (T)
•Limited TeraGrid support can be provided for targeted assistance to integrate an existing gateway with TeraGrid (T)


Page 14: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

When is a gateway appropriate?
•Researchers using defined sets of tools in different ways
– Same executables, different input
  •GridChem, CHARMM
– Creating multi-scale workflows
– Datasets
•Common data formats
– National Virtual Observatory
– Earth System Grid
– Some groups have invested significant efforts here
  •caBIG, extensive discussions to develop common terminology and formats
  •BIRN, extensive data sharing agreements
•Difficult to access data/advanced workflows
– Sensor/radar input
  •LEAD, GEON


Page 15: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

TeraGrid RATs (Requirements Analysis Teams)

•Spring, 2005 Science Gateway Requirements Analysis Team (RAT)
– Origin of the RAT
– October, 2005 most gateways begin charging in earnest due to funding delays
•Identification of common needs across the gateways

•Goal is production use of TG resources in the gateway as well as development of process and policy within TG for scalable gateway program and services

•Tremendous sharing of experiences amongst talented developers


Page 16: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

2006 – Implementing Common Gateway Requirements

•Web Services
– GT4 deployment, identification of remaining capabilities
– Information services, WebMDS
•Auditing
– Need to retrieve job usage info on production resources
– GRAM audit deployed in test mode in September, inclusion in CTSSv4
•Community Accounts
– Policy finalized, security approaches being tested by RPs
– Attribute-based authentication testing
•Allocations
– Changes in allocation procedures, the mechanisms used to evaluate science impact, and models for identity management, authentication and authorization that are more tuned to virtual organizations.
•Scheduling
– Metascheduling RAT
– On-demand via SPRUCE framework
•Outreach
– Talks, Schools/workshops (NVO, GISolve), major project demonstrations (LEAD)
– SURA, HASTAC, GEON, CI-Channel, SC, Grace Hopper, MSI-CI2, Lariat, Science Workflows and On Demand Computing for Geosciences Workshop
•Primer
– Living document in wiki, provides up-to-date overview and instructions for new gateway developers (“how to make your portal a TeraGrid science gateway”)


Page 17: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

2007 – Gateways move into production
•Web Services
– Development of common services
  •Steve Mock begins December, 06
  •QBETS “where can I run soonest” service
•Auditing
– Provide capability to regularly report number of gateway users
  •GridShib
•Community Accounts
– Finalize community account implementation policy
– Provide web interface to account details for TG security staff
•Allocations
– Collaboration with xRAC reviewers to develop instructions for gateway PIs
•Scheduling
– Metascheduling working group
– Urgent computing workshop
•Outreach
– “Build a Gateway” tutorial at TG07
  •Downloadable code, documentation
– Gateways featured in student competition at TG07
– Cross directorate presentation at NSF
– LEAD collegiate forecast competition, April
– GISolve used in 2 classes
– NVO announcement of production TG capabilities at conference in China
– EOT supplement evaluation of gateway use in education
  •Interest from Navajo Tech
•Primer
– TG documentation staff identified to move the Primer into fully functional documentation
•Addressing issues that prevent current gateways from using TG in production
•Stu Martin begins January, 2007


Page 18: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

March, 2007 review summary
•The panel was impressed by the potential for Science Gateways to enable new communities and to provide new integrated technologies to a broader scientific audience.
•The development and definition of the Science Gateways remains a central activity in the TG and is reviewed with particular attention for the technological and methodological developments that benefit the TG as a whole vs. just the individual SGs.


Page 19: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Gateway-related recommendations
•While the panel agreed that these Science Gateways offer tremendous potential to the general scientific community, we encourage the project to clarify its position as to exactly which of these are officially called “TeraGrid Science Gateways.” (R1)
•The panel noted that TeraGrid now has a mechanism to limit the privileges of these accounts. Nevertheless, it is unclear what, if any, authentication and validation procedures are performed prior to making a user-submitted gateway program available for community use (R9).
•Diversity of research domains represented by the SGs is reportedly “user-driven”. The panel asks the TG to clarify how such information is solicited from current and potential user communities (R10).
•Documentation should be maintained on the actual use of each SG, including (user and discipline based) demographics (R11).
•Efforts should be made to provide better user documentation for each SG, online and on-site training for the SGs, and more widespread promotion of the SGs to potential user communities (R12).


Page 20: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Community Account Portal Under Development

Designed for security-wg
•Password-protected internal site
•Type in community account username and bring up associated info
•Write permission for security staffers to add info about how and when these accounts have been secured at each site
•Repository for Gateway logs
– Automatic mechanism and location for gateways to send logs.
– Notification when logs are stale
•Perhaps recent usage data
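
The "notification when logs are stale" item could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the seven-day threshold, and the gateway names are hypothetical and do not describe the portal's actual implementation.

```python
from datetime import datetime, timedelta


def find_stale_gateways(last_log_times, now, max_age=timedelta(days=7)):
    """Return the gateways whose newest log is older than max_age.

    last_log_times maps a gateway name to the datetime of its most recent
    log delivery. The 7-day default is an assumed threshold, not a
    documented TeraGrid policy.
    """
    return sorted(
        name
        for name, last_seen in last_log_times.items()
        if now - last_seen > max_age
    )


# Hypothetical data, for illustration only.
now = datetime(2008, 1, 14)
last_logs = {
    "gateway-a": datetime(2008, 1, 13),  # delivered one day ago
    "gateway-b": datetime(2007, 12, 1),  # delivered 44 days ago
}
stale = find_stale_gateways(last_logs, now)
print(stale)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.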


Page 21: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

TeraGrid Futures June, 2007 Gateway Developer Workshop

•Community-driven, participatory planning process for the future of TeraGrid
•“What would TeraGrid be if it met the needs of your science gateway perfectly?”
•Many themes relevant for all users, not just gateways
•Gateway developers value a venue for interacting with other developers
– TeraGrid has been good at providing this
•17 Gateways at workshop
– 7 domains represented
– Most funded in part by the NSF
•Report available at www.teragridfuture.org/events


Page 22: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Top Solutions
•Develop gateway framework templates built upon toolkits which may already exist (8 votes across all criteria)
•Peering with NLR (National Lambda Rail), Internet2, etc. (8)
•Have common scheduling of jobs across different TeraGrid sites (7)
•Take meta-scheduling seriously, not as a future dream: allocate funding for development (11)
•Training, education, workshops, generalized & standardized basic services, documentation (23 for 7 items)
•Do not invest $200M into a single machine (15)
– $100M in a capacity machine
– $100M in “content”: middleware, interfaces, and end-user applications
•Reliably performing global file system with a fast local I/O (7)
•Standardize certificate based authentication/authorization (10)
•End-to-end support for Virtual Organizations (9)

Source: Ann Zimmerman

Page 23: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Key Issues
•Support interaction and cross-fertilization among Science Gateway development communities
– Sharing code and successful solutions
– Financial and professional support for developing gateways
•Reduce hurdles that make using and building on TeraGrid difficult
– Reliability and tracking of upgrades
– Length of development cycle
– Bureaucracy

Source: Ann Zimmerman

Page 24: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Prominent Needs
•Basic services that gateways can use instead of creating their own.
•Templates and standardized systems to save developers the time of recreating things that others have already built.
•Standardization that would make TeraGrid a real grid that could support the effective use of allocations and meta-scheduling.
•Operating more effectively as a community in order to better support the education and development needs of gateway developers.

Source: Ann Zimmerman

Page 25: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Science Gateways Objectives aka Nancy’s Brave New Gateway World


Page 26: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Gateway Objectives for PY4 and 5

•TeraGrid integration will be straightforward for new and existing gateway developers
•There will be a set of easy to discover general services provided by and for Gateways
•The targeted support program will be well-organized
•We will be able to routinely count end gateway users, who will total 25% of total TeraGrid users
•There will be a funded cross-directorate gateway program at the NSF


Page 27: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

TeraGrid integration will be straightforward for new and existing gateway developers

•Clear documentation, including step by step integration instructions for gateways
– These instructions will include all gateway requirements.
•Deployment of build your own gateway capability
•Tools necessary for tasks like accounting and authentication


Page 28: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

There will be a set of easy to discover general services provided by and for Gateways

•There will be a published list of general web services for Gateways
•Gateways will be able to easily make their own services available to others


Page 29: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

The targeted support program will be well-organized

•The request process will be clear
– PIs will understand the status of their requests
•It will be clear to staff members what they are working on and when
– We’ll be able to more easily transition amongst projects
•Lessons learned will be included in general gateway documentation/case studies
•We will work with at least 10 new projects in PYs 4 and 5
•We will seek out projects benefiting underrepresented groups


Page 30: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

We will be able to routinely count end gateway users, who will total 25% of total TeraGrid users

•A unique identifier for each end gateway user per community account must exist in TGCDB
•Gateways will need to transmit and TGCDB will need to receive this additional identifier through any job submission mechanism
•Attribute-based authentication in production and easy to use
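
As a rough illustration of why the per-user identifier matters, the sketch below counts distinct end users behind each community account and computes the gateway share of all users. The record layout, the field names `community_account` and `gateway_user`, and all numbers are assumptions made for this example; they are not the TGCDB schema.

```python
from collections import defaultdict


def count_end_users(job_records):
    """Count distinct end gateway users per community account.

    Each record is a dict with the hypothetical keys 'community_account'
    and 'gateway_user' (the additional identifier sent with each job).
    """
    users = defaultdict(set)
    for record in job_records:
        users[record["community_account"]].add(record["gateway_user"])
    return {account: len(ids) for account, ids in users.items()}


def gateway_share(gateway_users, other_users):
    """Fraction of all users who arrive through gateways."""
    total = gateway_users + other_users
    return gateway_users / total if total else 0.0


# Hypothetical job records: three jobs, two distinct users on one account.
jobs = [
    {"community_account": "gw_chem", "gateway_user": "alice"},
    {"community_account": "gw_chem", "gateway_user": "bob"},
    {"community_account": "gw_chem", "gateway_user": "alice"},
    {"community_account": "gw_weather", "gateway_user": "carol"},
]
per_account = count_end_users(jobs)
share = gateway_share(sum(per_account.values()), other_users=9)
print(per_account, share)
```

Without the extra identifier, every job submitted under a community account would appear to come from a single user, so a target such as 25% could not be measured.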


Page 31: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

There will be a funded cross-directorate gateway program at the NSF

•Recognizes the importance of gateways as infrastructure necessary to tackle the most compelling science problems
•Funds them in a sustainable, metrics-driven way


Page 32: Science Gateways and their tremendous potential for science and engineering Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways

Issues I would particularly like SAB feedback on

•Gateway hosting
•HPC resource capabilities
– Interactive pool of nodes where jobs run immediately?
•Very few gateways requesting targeted support through xRAC process
– How to get the word out?
– How to increase involvement of underrepresented communities?
•Balance between targeted support and general services
– Importance of application hosting
•Feedback for streamlining access
– Many hoops for gateways right now
•EOT supplement
– Study of gateway use by educators
•International usage
– Ideas on what other projects have done here, TeraGrid is to develop a plan to send to NSF
•Cross-directorate gateway program
