Science Gateways Update Nancy Wilkins-Diehr Science Gateways Area Director Quarterly Meeting, September 6-7, 2007

Science Gateways Update

Nancy Wilkins-Diehr
Science Gateways Area Director

Quarterly Meeting, September 6-7, 2007

Gateway Activities and Futures

• Recent/Current Activities
  – “How to Write a Winning Gateway Proposal”
  – “Building Blocks for Science Gateways”, TG07
  – Community account portal
  – Return of the original gateway RAT
    • Gateway Survey II
  – Gateway transitions
    • New projects, new services
• Futures
  – Gateway hosting
  – Interactive access
  – Application hosting
  – Web 2.0 capabilities
  – U Mich Gateway workshop results

How to Write a Winning Gateway Proposal
Completed June 2007

• Community usage is an approved usage model on NSF-funded resources
  – Want successful Gateway PIs
• Current proposal instructions not tailored to gateway usage
• Developed augmented proposal instructions in conjunction with xRAC reviewers
  – Describe algorithms and codes to be run, but also
    • Classes of activities that will be enabled
    • Target audience
    • Typical use cases
    • Criteria for success
    • Gateway management, user tracking

Building Blocks for Science Gateways
Captured for asynchronous learning

• Build a Gateway in an afternoon presented at TG07
• Two components
  – Accounts on web server at U Iowa
  – TG accounts
• Very simple framework to do the basics
  – Compute jobs, data transfer, visualization
• Complete instructions provided on how to recreate this at home
• If and how to provide continued access
  – DAC requests to get accounts necessary to work through asynchronous training materials?
  – Continued access to a web server?

Community Account Portal Under Development

Designed for security-wg
• Password-protected internal site
• Type in community account username and bring up associated info
• Write permission for security staffers to add info about how and when these accounts have been secured at each site
• Repository for Gateway logs
  – Automatic mechanism and location for gateways to send logs
  – Notification when logs are stale
• Perhaps recent usage data
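
The "notification when logs are stale" item above could be sketched roughly as follows. This is only an illustration, not the portal's actual design: the function name `find_stale_gateways`, the seven-day threshold, and the gateway names are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def find_stale_gateways(last_log_received, now, max_age=timedelta(days=7)):
    """Return the sorted names of gateways whose newest log is older than max_age."""
    return sorted(
        name
        for name, received in last_log_received.items()
        if now - received > max_age
    )


# Hypothetical data: community account name -> time the most recent log arrived.
last_log_received = {
    "gateway_a": datetime(2007, 9, 5, 12, 0),
    "gateway_b": datetime(2007, 8, 1, 9, 30),
}

stale = find_stale_gateways(last_log_received, now=datetime(2007, 9, 6))
print(stale)  # gateways that would trigger a notification
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check easy to test; a real portal would also need to decide who gets notified and how often.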

Return of the Gateway RAT

• Gateway Survey II planned for Fall 2007
• Original gateway RAT determined priorities for the last 2+ years
  – Time to check in again and resynch
• Draft developed with Doru Marcusiu
• Content will be developed with input across TG

Gateways Transition
Most “original gateways” into production by 1/31/08

• New projects
  – Take incubation, matchmaking
    • CReSIS, SIDgrid, Navajo Tech
  – Requests through peer review
    • CIG, June 2006
• Gateway use in education
  – TG07 student competition
  – EOT supplement, participation in SC07 education program
• TeraGrid Gateways as a community builder
  – Not just a group of funded projects
  – Provides a unique forum for developers to share experiences
    • U Mich Gateway workshop reinforces the need for this forum
  – Interest in calls and email archives from outside the project
    • Detailed gateway analysis by caBIG

Increase in General Services

• Now need to clearly organize what’s being offered
  – Looking at skeleton gateway (SimpleGrid) and web service registry as organizing principles
    • General web services
      – Continually watch new technologies
        » REST
        » Web 2.0
    • GRAM audit components
      – System for capturing per-job usage and attributing to individual gateways
    • Attribute-based authentication
    • Credential management
  – Takes coordination, group input
• Increased RP gateway activity
  – IU team led by Marlon Pierce
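
The GRAM audit item above describes capturing per-job usage and attributing it to individual gateways. A minimal sketch of the attribution step, assuming each captured job record already carries a gateway name and a service-unit charge, might look like this. The record fields (`job_id`, `gateway`, `service_units`) are hypothetical and are not taken from the GRAM audit schema.

```python
from collections import defaultdict


def usage_by_gateway(job_records):
    """Sum the service units charged by each gateway across all job records."""
    totals = defaultdict(float)
    for record in job_records:
        totals[record["gateway"]] += record["service_units"]
    return dict(totals)


# Hypothetical per-job records, one per job submitted through a gateway.
job_records = [
    {"job_id": "1001", "gateway": "gateway_a", "service_units": 12.5},
    {"job_id": "1002", "gateway": "gateway_b", "service_units": 3.0},
    {"job_id": "1003", "gateway": "gateway_a", "service_units": 7.5},
]

totals = usage_by_gateway(job_records)
print(totals)
```

The point of the sketch is only that attribution becomes a simple aggregation once every job is tagged with its originating gateway; the hard part in practice is making sure that tag is recorded reliably at submission time.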

Source: Von Welch

Attribute-based Authorization

• Allow Science Gateways to push attributes to TeraGrid RP sites with credential
  – E.g. client identifier, VO role, client IP address
• Could be passed from user’s IdP or generated locally
• Development on attribute push code nearing completion
  – GridShib for GT 0.6.0 Tech Preview and GridShib-SAML-Tools have been released
• RP can authorize/de-authorize based on attributes
• Demonstrate integration with GISolve-based Science Gateway
• Deploying attribute auditing/authorization code onto Mercury @ NCSA
• Met with Security and Science Gateways WGs and documented initial use cases
  • http://www.teragridforum.org/mediawiki/index.php?title=AAA_Testbed:Simple_Science_Gateway_with_Attributes
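
To illustrate what "RP can authorize/de-authorize based on attributes" could mean in practice, here is a small sketch of a site-side policy check over the three example attributes named above (client identifier, VO role, client IP address). It does not use or imitate the GridShib API; the function `is_authorized`, the attribute keys, and the policy layout are all assumptions made for this example.

```python
import ipaddress


def is_authorized(attributes, policy):
    """Return True if the attributes pushed by a gateway satisfy the site policy."""
    # De-authorize specific end users without disabling the whole gateway.
    if attributes.get("client_id") in policy["denied_clients"]:
        return False
    # Only accept roles the site has agreed to serve.
    if attributes.get("vo_role") not in policy["allowed_roles"]:
        return False
    # Reject requests with no client address or an address in a blocked network.
    client_ip = attributes.get("client_ip")
    if client_ip is None:
        return False
    address = ipaddress.ip_address(client_ip)
    return not any(
        address in ipaddress.ip_network(network)
        for network in policy["blocked_networks"]
    )


# Hypothetical site policy and attribute sets.
policy = {
    "denied_clients": {"user-42"},
    "allowed_roles": {"gateway-user"},
    "blocked_networks": ["203.0.113.0/24"],
}
good = {"client_id": "user-7", "vo_role": "gateway-user", "client_ip": "198.51.100.10"}
banned = {"client_id": "user-42", "vo_role": "gateway-user", "client_ip": "198.51.100.10"}

print(is_authorized(good, policy), is_authorized(banned, policy))
```

The benefit this sketch tries to show is granularity: with per-request attributes, a site can block one misbehaving end user or address range while the shared community account keeps working for everyone else.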

Forward-looking capabilities

• Interactive Access
  – Queue waits are a priority for all users, especially gateways
  – Metascheduling to identify free cycles
    • Neutron Science, GISolve, nanoHUB participating in metascheduling testbed
  – On demand capabilities
    • SPRUCE into production in the spring
  – Virtual machines?
• Application hosting
  – Ability to provide direct access to applications
  – RENCI interested in providing bio app access
  – PGRADE/GEMLCA, AHE?
• REST and Web 2.0
  – User-generated content and user/third-party integration of services/capabilities
    • See some early science applications like openwetware.org, myexperiment.org
  – JP and Lee already looking at publishing TG information services with REST front-end
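
The "metascheduling to identify free cycles" item above can be pictured with a toy selection rule: among sites that currently have enough free cores, pick the one with the shortest predicted queue wait. This is a sketch under stated assumptions, not the testbed's algorithm; the site names, the `free_cores` and `predicted_wait_s` fields, and the function `pick_site` are hypothetical.

```python
def pick_site(sites, cores_needed):
    """Pick the eligible site with the shortest predicted queue wait.

    Returns the site name, or None when no site has enough free cores.
    """
    eligible = [site for site in sites if site["free_cores"] >= cores_needed]
    if not eligible:
        return None
    best = min(eligible, key=lambda site: site["predicted_wait_s"])
    return best["name"]


# Hypothetical snapshot of resource state as a metascheduler might see it.
sites = [
    {"name": "site_a", "free_cores": 64, "predicted_wait_s": 1800},
    {"name": "site_b", "free_cores": 256, "predicted_wait_s": 300},
    {"name": "site_c", "free_cores": 16, "predicted_wait_s": 60},
]

choice = pick_site(sites, cores_needed=32)
print(choice)
```

A production metascheduler would also have to account for data location, allocations, and stale state information, which is part of why the slide treats this as a forward-looking capability.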

Build on Results from Ann Zimmerman’s Gateway Developer Workshop

• Community-driven, participatory planning process for the future of TeraGrid
• “What would TeraGrid be if it met the needs of your science gateway perfectly?”
• Many themes relevant for all users, not just gateways
• Gateway developers value a venue for interacting with other developers
  – TeraGrid has been good at providing this
• 17 Gateways at workshop
  – 7 domains represented
  – Most funded in part by the NSF

Top Solutions

• Develop gateway framework templates built upon toolkits which may already exist (8 votes across all criteria)
• Peering with NLR (National Lambda Rail), Internet2, etc. (8)
• Have common scheduling of jobs across different TeraGrid sites (7)
• Take meta-scheduling seriously, not as a future dream: allocate funding for development (11)
• Training, education, workshops, generalized & standardized basic services, documentation (23 for 7 items)
• Do not invest $200M into a single machine (15)
  – $100M in a capacity machine
  – $100M in “content”: middleware, interfaces, and end-user applications
• Reliably performing global file system with a fast local I/O (7)
• Standardize certificate based authentication/authorization (10)
• End-to-end support for Virtual Organizations (9)

Source: Ann Zimmerman

Key Issues

• Support interaction and cross-fertilization among Science Gateway development communities
  – Sharing code and successful solutions
  – Financial and professional support for developing gateways
• Reduce hurdles that make using and building on TeraGrid difficult
  – Reliability and tracking of upgrades
  – Length of development cycle
  – Bureaucracy

Source: Ann Zimmerman

Prominent Needs

• Basic services that gateways can use instead of creating their own.
• Templates and standardized systems to save developers the time of recreating things that others have already built.
• Standardization that would make TeraGrid a real grid that could support the effective use of allocations and meta-scheduling.
• Operating more effectively as a community in order to better support the education and development needs of gateway developers.

Source: Ann Zimmerman

Thank You

SPRUCE Jul-Sep 07 Accomplishments

• Accomplished Milestones
  – Ported software to work with SGE on SDSC OnDemand cluster
  – Integration of SPRUCE with Condor in progress
  – Testbed WS-GRAM support complete at UC/ANL
  – Implementing Q/A backend using Inca
  – Prototype system for overall job turnaround time predictor nearing completion
  – Participating in scheduling-wg to evaluate the software
• Science Impact
  – Most of the work this quarter was to port and extend the system to work with different capabilities such as Condor, WS-GRAM, SGE, etc. We hope to finish this work by next quarter.

SPRUCE Oct-Dec 07 Plans

• Road Map
  – Finish all the tasks from previous quarter
  – Collaborate with HARC to integrate advance reservations
  – Work on policy encoding lookup table for various projects per resource
  – Upgrade the documentation to provide starter guides at each site
  – Work with other applications to add SPRUCE support
  – Prepare for SC07
• Science Impact
  – The focus will be on finishing the current collaboration projects and putting them under test. We also want to concentrate on further documentation and on consolidating the SPRUCE user and resource provider base. SC07 demos and talks will concentrate on this aspect too.

Caltech TeraGrid Science Gateways

• Status July-September 2007
• Plans October-December 2007

Julian Bunn, Matthew Graham, Conrad Steenberg, Roy Williams

Caltech TeraGrid Science Gateways

• Accomplishments
  – Regression test for NESSSI mosaicking mashup services
    • Test each backend survey, fix details of XML and protocol
  – Experimenting with NESSSI services on Amazon.com “elastic computing cloud”
    • Tradeoffs between TeraGrid and commercial resource provider
  – Rebuilt Clarens server on a /gpfs-wan headnode
    • Getting the code verified and authorized with SDSC/TeraGrid admins
  – New Clarens release
    • Improved speed and reliability. New database implementation
    • NESSSI addons packaged as an RPM
  – Two-day visit to Caltech by Rick Wagner for collaboration on ENZO cosmology portal
  – ROOTlet client code completely rewritten: now uses same protocol as for NaradaBrokering (NB)
    • Improves stability, allows extensions more easily, reduces dependencies, uses https protocol
  – ROOTlet server now issues publish/subscribe status messages to NB
  – New leveraged funding for ROOTlets from DOE’s STTR program
  – First version of Grid-enabled StatPatternRecognition application, for deployment as a TeraGrid/Clarens service
    • Over 20 powerful classifiers, including bagged decision trees, neural net, etc.

Caltech TeraGrid Science Gateways

• Plans
  – Add NB module to ROOTlet client
  – Tests of ROOTlets with TeraGrid batch queues
  – Present this work, and the role of TeraGrid, at “Computing in High Energy Physics” conference, Victoria, BC in September
  – Develop demonstration ROOTlet analysis for SC07 in Reno: will use TeraGrid
  – Develop polished interface to SPR as a new Clarens service

GISolve Q4FY07 Accomplishments Highlights

• Background
  – Geographic Information Science (GIScience) is an interdisciplinary field involving geography and other social sciences, computer science, geodesy, information sciences, and statistics that studies generic issues in the development and use of geographic information systems (GIS) technologies.
• Milestones
  – Regular number of users: approximately 50
  – Developed and released the SimpleGrid toolkit and its preliminary documentation by leveraging GISolve experience
    • Support the education and training of TeraGrid Science Gateway technologies
    • Help bring up new TeraGrid Science Gateways
    • Help test and integrate attribute-based authorization solutions for TeraGrid Science Gateways
  – Education
    • The following course (24 students) at the University of Illinois at Urbana-Champaign is using GISolve
      – Advanced Geographic Information Systems (undergraduate and graduate)
• Impact on science
  – Produced the following research publications
    • Wang, S., and Armstrong, M. P. 2007. “A Spatial Computational Domain Theory.” International Journal of Geographical Information Science, under revision

GISolve Q1FY08 Plans

• Milestones
  – Deploy a new visualization service to support analysis steering
  – Produce a paper based on SimpleGrid experience
  – Explore new applications
    • Land use and management for environmental sustainability
    • Anthropology
  – Continue to develop and support the SimpleGrid toolkit
• Describe impact on science
  – GISolve enables computationally intensive geographic analyses and supports collaborative scientific investigations that rely on geospatial information.
  – GISolve represents next-generation advanced Web-GIS that allows a large number of users to collaborate and share the workflows of geographic analyses.

A public health study using a spatial interpolation method in GISolve

IU GIG Quarterly Gateway Highlights

• Q3FY07 Accomplishments
  – LEAD & NOAA scientists used LEAD Gateway to run on-demand forecasts as part of the Storm Prediction Center’s Spring Experiments
  – Continued working with Globus developers & TG RP staff in testing and debugging latest version of GRAM and GridFTP, which will be rolled out as part of CTSS V4
  – Interactions with SIDGrid Project & potential Genomics Gateway
  – Designing generic Resource Selection & Data Movement Web Services useful for multiple Gateways
• Q4FY07 Plans
  – Working with Weather Challenge on using LEAD Portal to enable participation from undergraduate students from 67 universities
  – Continue to work with TG RP team to ensure stability of GRAM and GridFTP with intended LEAD Gateway usage of half a million SUs in Fall 07
  – Help new Gateways transition to using high end TG Data and Compute Resources

TeraGrid Visualization Gateway Highlights

• Q4FY07 Accomplishments
  – Dynamic Accounts for community users
    • Hardening of these capabilities, hope to move to production yet this quarter, depending on hardware/software availability
  – Visibility on main TeraGrid Website
    • Rolled out first round of Visualization documentation on main TG website, including pointer to the Visualization Gateway
• Q1FY08 Plans
  – Additional visualization services
    • TeraDRE portlet - Collaborating with Purdue to get this capability into production on the Visualization Gateway during Q1FY08
    • Exploring possible VMD service to visualize NAMD simulations (collaborating with Indiana)
  – SC07 Demonstration
• ~40 users have logged into the Visualization Gateway
  – 25 TeraGrid Users
  – 10 Training accounts during TG07
  – 5 community users

Quarterly RENCI Gateway Highlights

• FY07 Q3 Accomplishments
  – Continued support of the production Bioportal
  – Outreach: three community presentations about the RENCI Gateway and the general Gateway program
  – Continued integration of the Web Service hosting infrastructure at RENCI for BioScientists in support of the workflow infrastructure, which uses the TeraGrid on the back-end
  – Continued discussions with a potential new Science Gateway: iFold
  – Discussions with consumers of the TeraGrid back-ended BioScience web services
  – Began work on log file parsing for submission to TeraGrid
  – Began examination of myCluster for large number of serial jobs on TG resources

Quarterly RENCI Gateway Highlights

• FY07 Q4 Plans
  – Complete log file submission mechanism
  – Prototype sending all necessary info with job to eliminate need for log file submissions
  – Complete myCluster integration for large numbers of serial jobs
  – Develop and implement API for gateway user/job investigation
  – Maintain the workflow infrastructure integration and continue to develop specific workflows by engaging directly with key BioScientists
  – Work directly with one community to assist them in becoming a self-sufficient TeraGrid Science Gateway; identify and develop relationships for a second community to begin assisting them in becoming a Gateway during FY07