
Page 1

TeraGrid Annual Review:

Science Gateways

Nancy Wilkins-Diehr
TeraGrid Area Director for Science Gateways

wilkinsn@sdsc.edu

TeraGrid Annual Review, April 15-16, 2008

Page 2

Today’s Topics

•Background
  – Importance of gateways and potential for impact
•Development since inception
  – Program description
  – Who can be a gateway?
•2007 progress
•2008 plans
  – Objectives
  – What’s needed for success

Page 3

Phenomenal Impact of the Internet on Worldwide Communication and Information Retrieval

•Implications for the conduct of science are still evolving
  – 1980s: early gateways, e.g. the National Center for Biotechnology Information BLAST server (search results sent by email), still a working portal today
  – 1992: Mosaic web browser developed
  – 1995: “International Protein Data Bank Enhanced by Computer Browser”
  – 2004: TeraGrid project director Rick Stevens recognized growth in scientific portal development and proposed the Science Gateway Program
•Simultaneous explosion of digital information
  – Analysis needs in a variety of scientific areas
  – Sensors, telescopes, satellites, digital images and video
  – #1 machine on Top500 today is 300x more powerful than all combined entries on the first list in 1993

Only 16 years since the release of Mosaic!

Page 4

Gateways Greatly Expand Access

•Almost anyone can investigate scientific questions using high end resources
  – Not just those in the research groups of those who request allocations
•Fosters new ideas, cross-disciplinary approaches
•Encourages students to experiment
•But used in production too
  – Significant number of papers resulting from gateways including GridChem, nanoHUB
  – Scientists can focus on challenging science problems rather than challenging infrastructure problems

Page 5

Highlights: LEAD Inspires Students
Advanced capabilities regardless of location

•A student gets excited about what he was able to do with LEAD
•“Dr. Sikora: Attached is a display of 2-m T and wind depicting the WRF's interpretation of the coastal front on 14 February 2007. It's interesting that I found an example using IDV that parallels our discussion of mesoscale boundaries in class. It illustrates very nicely the transition to a coastal low and the strong baroclinic zone with a location very similar to Markowski's depiction. I created this image in IDV after running a 5-km WRF run (initialized with NAM output) via the LEAD Portal. This simple 1-level plot is just a precursor of the many capabilities IDV will eventually offer to visualize high-res WRF output. Enjoy!”
  – Eric (email, March 2007)

Page 6

GridChem Employs a Client-Server Approach

Page 7

GridChem Used for Production Science

•Chemical Reactivity of the Biradicaloid (HO...ONO) Singlet States of Peroxynitrous Acid. The Oxidation of Hydrocarbons, Sulfides, and Selenides. Bach, R. D. et al. J. Am. Chem. Soc. 2005, 127, 3140-3155.
•The "Somersault" Mechanism for the P-450 Hydroxylation of Hydrocarbons. The Intervention of Transient Inverted Metastable Hydroperoxides. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128(5), 1474-1488.
•The Effect of Carbonyl Substitution on the Strain Energy of Small Ring Compounds and their Six-member Ring Reference Compounds. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128(14), 4598.
•Azide Reactions for Controlling Clean Silicon Surface Chemistry: Benzylazide on Si(100)-2 × 1. Semyon Bocharov et al. J. Am. Chem. Soc. 2006, 128(29), 9300-9301.
•Chemistry of Diffusion Barrier Film Formation: Adsorption and Dissociation of Tetrakis(dimethylamino)titanium on Si(100)-2 × 1. Rodriguez-Reyes, J. C. F.; Teplyakov, A. V. J. Phys. Chem. C 2007, 111(12), 4800-4808.
•Computational Studies of [2+2] and [4+2] Pericyclic Reactions between Phosphinoboranes and Alkenes. Steric and Electronic Effects in Identifying a Reactive Phosphinoborane that Should Avoid Dimerization. Thomas M. Gilbert and Steven M. Bachrach. Organometallics 2007, 26(10), 2672-2678.

Page 8

cancer Biomedical Informatics Grid (caBIG)

Addressing today’s challenges in cancer research and treatment

•The mission of caBIG™ is to develop a truly collaborative information network that accelerates the discovery of new approaches for the detection, diagnosis, treatment, and prevention of cancer, ultimately improving patient outcomes.
•The goals of caBIG™ are to:
  – Connect scientists and practitioners through a shareable and interoperable infrastructure
  – Develop standard rules and a common language to more easily share information
  – Build or adapt tools for collecting, analyzing, integrating, and disseminating information associated with cancer research and care

Source: cabig.cancer.gov

Page 9

caBIG and TeraGrid

•caBIG conducted a study of all Gateways
  – Pleased to discover that community accounts and web services will exactly meet their requirements
•TeraGrid resources incorporated into geWorkbench
  – An open source platform for integrated genomics used to
    •Load data from local or remote data sources.
    •Visualize gene expression and sequence data in a variety of ways.
    •Provide access to client- and server-side computational analysis tools such as t-test analysis, hierarchical clustering, self organizing maps, regulatory networks reconstruction, BLAST searches, pattern/motif discovery, etc.
      – Clustering is used to build groups of genes with related expression patterns which may contain functionally related proteins, such as enzymes for a specific pathway
    •Validate computational hypotheses through the integration of gene and pathway annotation information from curated sources as well as through Gene Ontology enrichment analysis.
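The clustering step described for geWorkbench can be illustrated with a minimal sketch. This is not geWorkbench code: the gene names, expression values, and the choice of single-linkage merging with Euclidean distance are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def hierarchical_clusters(profiles, k):
    """Agglomerative single-linkage clustering of expression profiles into k groups."""
    # Start with every gene in its own cluster.
    clusters = [[name] for name in profiles]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(profiles[x], profiles[y]) for x in a for y in b)

    # Repeatedly merge the two closest clusters until k remain.
    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Hypothetical expression profiles (one value per experimental condition).
expression = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [1.1, 2.1, 2.9],
    "geneC": [9.0, 8.0, 7.0],
    "geneD": [8.8, 8.1, 7.2],
}
groups = hierarchical_clusters(expression, 2)
print(groups)
```

Genes whose profiles rise and fall together end up in the same group, which is the sense in which clusters "may contain functionally related proteins". The pairwise-distance work grows quickly with the number of genes, which is one reason such an analysis might be sent to a larger resource.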

Page 10

geWorkbench Integrates TeraGrid Resources

“Although the new service is TeraGrid-aware, the perspective from geWorkbench does not change. As far as geWorkbench is concerned, it is still connecting to a Hierarchical Clustering caGrid service. The difference is now the caGrid service is a gateway service that submits a TeraGrid job on behalf of geWorkbench. geWorkbench, however, does not notice this difference.”

Source: http://wiki.c2b2.columbia.edu/informatics/index.php/GeWorkbench_Example
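The arrangement in the quote, where the client keeps calling the same service interface while the implementation behind it changes, might be sketched as follows. Every class, method, and account name here is hypothetical; none of it is caGrid, geWorkbench, or TeraGrid API, and the "job" is only a recorded dictionary rather than a real submission.

```python
from dataclasses import dataclass, field


class LocalClusteringService:
    """Hypothetical service that performs the analysis itself."""

    def cluster(self, dataset):
        return f"clustered:{dataset}"


@dataclass
class GatewayClusteringService:
    """Hypothetical gateway service: same interface, but it submits a job
    under a community account on behalf of the caller."""

    community_account: str
    submitted_jobs: list = field(default_factory=list)

    def cluster(self, dataset):
        # Record the job that would be sent to the remote resource.
        job = {
            "account": self.community_account,
            "task": "hierarchical_clustering",
            "input": dataset,
        }
        self.submitted_jobs.append(job)
        # The caller receives a result in the same shape as before.
        return f"clustered:{dataset}"


def run_analysis(service, dataset):
    """Client code: it only knows that `service` has a cluster() method."""
    return service.cluster(dataset)


local_result = run_analysis(LocalClusteringService(), "expression_set_1")
gateway = GatewayClusteringService(community_account="gateway_community")
gateway_result = run_analysis(gateway, "expression_set_1")
print(local_result == gateway_result, len(gateway.submitted_jobs))
```

`run_analysis` is identical in both calls; only the object passed in differs, which is the sense in which the client "does not notice this difference".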

Page 11

Evolution of the Gateway Program

•2004 “TeraGrid Science Gateway” term originates
  – We will help them build gateway portals that leverage TeraGrid capabilities and provide web-based interfaces to community tools
•2005 Initial Gateway requirements analysis team
  – Areas of identified commonality include:
    •Web services, auditing, community accounts, flexible allocations, scheduling, outreach
•2006
  – GT4 with web services
  – GRAM audit, community account policies
  – Presentations to allocations committees
  – SPRUCE, metascheduling RAT
  – Primer
  – Outreach (NVO and GISolve schools, LEAD demonstrations, presentations to SURA, HASTAC, GEON, CI-Channel, SC, Grace Hopper, MSI-CI2, Lariat, Science Workflows and On Demand Computing for Geosciences Workshop)

Page 12

Easy Gateway True and False Test
Answers Provided

•TeraGrid selects all gateways (F)
•TeraGrid designs all gateways (F)
•TeraGrid limits the number of gateways (F)
•All gateways need TeraGrid funding to exist (F)

•Any PI can request an allocation and use it to develop a gateway (T)
•Gateway design is community-developed and that is the core strength of the program (T)
•TeraGrid staff are alerted to gateway work when a proposal is reviewed or when a community account is requested (T)
•Limited TeraGrid support can be provided for targeted assistance to integrate an existing gateway with TeraGrid (T)

Page 13

2007 – Gateways move into production

•Web Services
  – Development of common services
    •Steve Mock begins December 2006
    •QBETS “where can I run soonest” service
•Auditing
  – Provide capability to regularly report number of gateway users
    •GridShib
•Community Accounts
  – Finalize community account implementation policy
  – Provide web interface to account details for TG security staff
•Allocations
  – Collaboration with xRAC reviewers to develop instructions for gateway PIs
•Scheduling
  – Scheduling working group
  – Urgent computing workshop
•Gateway Hosting
  – Available at IU through peer review
•Outreach
  – “Build a Gateway” tutorial at TG07
    •Downloadable code, documentation
  – Gateways featured in student competition at TG07
  – Cross directorate presentation at NSF, May 2007
  – LEAD collegiate forecast competition, April 2007
  – GISolve, nanoHUB used in classes
  – NVO announcement of production TG capabilities at conference in China
  – Pathways supplement includes evaluation of gateway use by educators
    •Interest from Navajo Tech
•Primer
  – TG documentation staff identified to move the Primer into fully functional documentation
•Addressing issues that prevent current gateways from using TG in production
    •Stu Martin begins January 2007
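A gateway could consume a “where can I run soonest” answer along the lines of the sketch below. It assumes a prediction service has already returned an upper bound on queue wait for each resource; the resource names, the numbers, and the function itself are hypothetical and do not reflect the actual QBETS interface.

```python
def soonest_resource(predicted_wait_minutes):
    """Return the resource whose predicted queue wait is smallest.

    `predicted_wait_minutes` maps a resource name to a predicted upper
    bound on queue wait, in minutes (assumed to come from a prediction
    service; the values below are made up).
    """
    if not predicted_wait_minutes:
        raise ValueError("no predictions available")
    return min(predicted_wait_minutes, key=predicted_wait_minutes.get)


# Hypothetical predictions for three hypothetical resources.
predictions = {"site_a": 240.0, "site_b": 35.0, "site_c": 90.0}
choice = soonest_resource(predictions)
print(choice)
```

A production decision would likely also weigh expected run time and data location, which this sketch leaves out.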

Page 14

TeraGrid Futures Workshop, June 2007
Prominent Gateway Needs Identified

•Basic services that gateways can use instead of creating their own.
•Templates and standardized systems to save developers the time of recreating things that others have already built.
•Standardization that would make TeraGrid a real grid that could support the effective use of allocations and meta-scheduling.
•Operating more effectively as a community in order to better support the education and development needs of gateway developers.

Source: Ann Zimmerman

Page 15

What are we working toward in 2008?
Gateway Objectives for PY4 and 5

•TeraGrid integration will be straightforward for new and existing gateway developers
•There will be a set of easy to discover general services provided by and for Gateways
•The targeted support program will adapt to changing needs
•We will be able to routinely count end gateway users
  – Who will constitute a significant fraction of all TeraGrid users
•There will be avenues for sustained Gateway funding

Page 16

Gateways and GIG Targeted Support
29 Production Gateways Total

25% of GIG gateway staffing in 2008 for helpdesk, training and general services

[Figure: gateways receiving GIG targeted support; surviving status labels: Discontinued, Production, RP funding]

Page 17

Biggest challenges in the next 2 years

•TeraGrid processes must work smoothly
  – Allocations, community account requests, security requirements, community software areas, data collection hosting, accounting
  – TeraGrid must remain a service organization
    •Gateway developers invest considerable time in TeraGrid integration; this must be worth their while or they won’t remain interested
•Reliability at scale
  – Intensive work over many months to improve stability under increased load for LEAD’s WxChallenge
•Counting end users
  – Per user accounting for gateways
•Sustainable gateway funding
  – TeraGrid support is for integration only
•Adaptability to changing technologies
  – Application hosting
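Per-user accounting is a challenge because many people run jobs through one shared community account. A minimal sketch of the counting step follows, assuming each job's audit record carries an end-user identifier supplied by the gateway. The record layout, field names, and values are hypothetical, not an actual TeraGrid accounting format.

```python
from collections import defaultdict


def count_end_users(audit_records):
    """Count distinct end users per gateway from job audit records.

    Each record is assumed to be a dict with "gateway" and "end_user"
    keys; one end user may appear in many records.
    """
    users_by_gateway = defaultdict(set)
    for record in audit_records:
        users_by_gateway[record["gateway"]].add(record["end_user"])
    return {gateway: len(users) for gateway, users in users_by_gateway.items()}


# Hypothetical audit records: all jobs for a gateway run under one
# community account, so only the end_user field tells people apart.
records = [
    {"gateway": "gateway_x", "account": "community_x", "end_user": "alice"},
    {"gateway": "gateway_x", "account": "community_x", "end_user": "bob"},
    {"gateway": "gateway_x", "account": "community_x", "end_user": "alice"},
    {"gateway": "gateway_y", "account": "community_y", "end_user": "carol"},
]
totals = count_end_users(records)
print(totals)
```

Without the end-user field, the three `gateway_x` jobs would be indistinguishable from a single user's activity.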

Page 18

When is a gateway appropriate?

•Researchers using defined sets of tools in different ways
  – Same executables, different input
    •GridChem, CHARMM
  – Creating multi-scale or complex workflows
  – Datasets
•Common data formats
  – National Virtual Observatory
  – Earth System Grid
  – Some groups have invested significant efforts here
    •caBIG, extensive discussions to develop common terminology and formats
    •BIRN, extensive data sharing agreements
•Difficult to access data/advanced workflows
  – Sensor/radar input
    •LEAD, GEON

Page 19

Tremendous Potential for Gateways

•In only 16 years, the Web has fundamentally changed human communication
•Science Gateways can leverage this amazingly powerful tool to:
  – Transform the way scientists collaborate
  – Streamline conduct of science
  – Influence the public’s perception of science
•Reliability, trust, continuity are fundamental to truly change the conduct of science through the use of gateways
  – High end resources can have a profound impact
•The future is very exciting!