1 Science Gateways Workshop GGF14 Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways GGF14, Chicago June 28, 2005


Page 1

Science Gateways Workshop
GGF14

Nancy Wilkins-Diehr
TeraGrid Area Director for Science Gateways
GGF14, Chicago
June 28, 2005

Page 2

Program Committee
• Nancy Wilkins-Diehr (SDSC, USA) (co-chair)
• Sebastien Goasguen (Purdue University, USA) (co-chair)
• Ariel Oleksiak (Poznan Supercomputing and Networking Center, Poland)
• Jarek Nabrzyski (Poznan Supercomputing and Networking Center, Poland)
• Charlie Catlett (University of Chicago and Argonne National Laboratory, USA)
• Ian Foster (University of Chicago and Argonne National Laboratory, USA)
• Dennis Gannon (Indiana University, USA)
• Satoshi Sekiguchi (AIST, Japan)
• Sang Beom Lim (KISTI, Korea)
• Konstantinos Dolkas (National Technical University of Athens (NTUA))

Page 3

Welcome and Thank You
• Many fine talks today from researchers or resource providers who are bringing Grid capabilities to a particular science community (atmospheric scientists, chemists, bioinformaticists, etc.)
• Explore and summarize commonalities and differences: system, security, accounting, authentication/authorization and other policies and capabilities needed for production grid support
• Presentations will cover:
  • Services provided and technologies/software used to provide them
  • Configuration or policy issues encountered during deployment and maintenance
  • Authentication and authorization approaches to support a variety of user “types”
  • Practical issues related to supporting workflows
  • Approaches to providing secure web services

Page 4

Your participation will make the workshop a success
• Five 90 minute sessions
  • Four presentations followed by a discussion
  • Interactive discussions encouraged!
  • Questions from moderators to initiate dialogue
• Detailed notes will be taken
• Workshop proceedings will be available as a GGF informational document
• Peer-reviewed papers to be published in a special issue of Concurrency and Computation: Practice and Experience in early fall

Page 5

GCE-RG at GGF
• Grid Computing Environments Research Group
• Co-chaired by Geoffrey Fox, Dennis Gannon (IU), Mary Thomas (SDSU)
• Addresses many of the issues presented in this workshop
• Marlon Pierce, IU here to discuss current activities
• Meeting 6/29, 7:30-9am
• Next steps from this workshop will be part of ongoing GCE-RG activities

Page 6

Why a workshop on Science Gateways?
• My day job – TeraGrid Area Director for Science Gateways
• 10 Science Gateway projects in TeraGrid
  • I need to make these successful
• New activity, funding begins this summer
• Interviews conducted with all 10 teams, findings summarized
• Interest in what others are doing in this area

Page 7

The TeraGrid Strategy
• Building a distributed system of unprecedented scale
  • 40+ teraflops compute
  • 1+ petabyte storage
  • 10-40 Gb/s networking
• Creating a unified user environment across heterogeneous resources
  • User software environment, user support resources
  • Created an initial community of over 500 users, 80 PIs
• Integrating new partners to introduce new capabilities
  • Additional computing, visualization capabilities
  • New types of resources: data collections, instruments
• Make it extensible!

Page 8

TeraGrid Resource Partners

Page 9

TeraGrid Resources

Sites (table columns, left to right): ANL/UC, Caltech, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC

Compute Resources (in column order): Itanium2 (0.5 TF); IA-32 (0.5 TF); Itanium2 (0.8 TF); Itanium2 (0.2 TF); IA-32 (2.0 TF); Itanium2 (10 TF); SGI SMP (6.5 TF); IA-32 (0.3 TF); XT3 (10 TF); TCS (6 TF); Marvel (0.3 TF); Hetero (1.7 TF); Itanium2 (4.4 TF); Power4+ (1.1 TF); IA-32 (6.3 TF); Sun (Vis)

Online Storage (in column order): 20 TB; 155 TB; 32 TB; 600 TB; 1 TB; 150 TB; 540 TB; 50 TB

Mass Storage (in column order): 1.2 PB; 3 PB; 2.4 PB; 6 PB; 2 PB

Data Collections: Yes; Yes; Yes; Yes; Yes

Visualization: Yes; Yes; Yes; Yes; Yes

Instruments: Yes; Yes; Yes

Network (Gb/s, Hub) (in column order): 30 CHI; 30 LA; 10 CHI; 30 CHI; 10 ATL; 30 CHI; 10 CHI; 30 LA; 10 CHI

Partners will add resources and TeraGrid will add partners!
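
As a rough check on the aggregate, the compute figures listed on this slide can be totaled. The sketch below is illustrative only: the variable and function names are hypothetical, the values are simply the TF figures as they appear on the slide, and "Sun (Vis)" is left out because no TF figure is given for it.

```python
# Peak compute figures (teraflops) as listed on the TeraGrid Resources slide,
# in the order they appear. "Sun (Vis)" is omitted: the slide gives no TF value.
compute_tf = [
    ("Itanium2", 0.5), ("IA-32", 0.5), ("Itanium2", 0.8), ("Itanium2", 0.2),
    ("IA-32", 2.0), ("Itanium2", 10.0), ("SGI SMP", 6.5), ("IA-32", 0.3),
    ("XT3", 10.0), ("TCS", 6.0), ("Marvel", 0.3), ("Hetero", 1.7),
    ("Itanium2", 4.4), ("Power4+", 1.1), ("IA-32", 6.3),
]


def total_tf(entries):
    """Sum the teraflop figures across all listed systems."""
    return sum(tf for _, tf in entries)


def by_architecture(entries):
    """Group the teraflop figures by architecture label."""
    totals = {}
    for arch, tf in entries:
        totals[arch] = totals.get(arch, 0.0) + tf
    return totals


if __name__ == "__main__":
    print(f"Total listed compute: {total_tf(compute_tf):.1f} TF")
    for arch, tf in sorted(by_architecture(compute_tf).items()):
        print(f"  {arch}: {tf:.1f} TF")
```

The listed systems add up to roughly 50.6 TF, which is consistent with the "40+ teraflops" figure on the strategy slide.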

Page 10

Science Gateways
A new initiative for the TeraGrid
• Increasing investment by communities to build their own cyberinfrastructure.
• Heterogeneity
  • Resources: different architectures at local, national and international levels
  • Users: from HPC expert to K-12 student… they should all benefit from CI
  • Software stacks, policies
• How can “centers/institutions” provide, operate, maintain in this heterogeneous world?
• Working with Gateways, TeraGrid will start to answer that question by providing generic CI services to communities.
• Integration and interoperability

Page 11

What are Gateways?
• Gateways will
  • engage communities that are not traditional users of the supercomputing centers
  • by providing community-tailored access to TeraGrid services and capabilities
• Three examples:
  • Web-based Portals that front-end Grid Services that provide TeraGrid-deployed applications used by a community.
  • Coordinated access points enabling users to move seamlessly between TeraGrid and other grids.
  • Application programs running on users' machines but accessing services in TeraGrid (and elsewhere)
• All take advantage of existing community investment in software, services, education, and other components of Cyberinfrastructure.

Page 12

Grid Portal Gateways
• The Portal accessed through a browser or desktop tools
  • Provides Grid authentication and access to services
  • Provides direct access to TeraGrid hosted applications as services
• The Required Support Services
  • Searchable Metadata catalogs
  • Information Space Management
  • Workflow managers
  • Resource brokers
  • Application deployment services
  • Authorization services
• Builds on NSF & DOE software
  • Use NMI Portal Framework, GridPort
  • NMI Grid Tools: Condor, Globus, etc.
  • OSG, HEP tools: Clarens, MonaLisa

Technical Approach (Biomedical and Biology, Building Biomedical Communities)

[Figure: OGCE Science Portal. Labels: OGCE Portlets with Container; Apache Jetspeed Internal Services; Service API; Grid Service Stubs; Remote Content Services; Local Portal Services; Java CoG Kit; Grid Protocols; Grid Services; Grid Resources; HTTP; Remote Content Servers; Open Source Tools; Workflow Composer]

• Build standard portals to meet the domain requirements of the biology communities
• Develop federated databases to be replicated and shared across TeraGrid

Page 13

Gateways that Bridge to Community Grids
• Many Community Grids already exist or are being built
  • NEESGrid, LIGO, Earth Systems Grid, NVO, Open Science Grid, etc.
• TeraGrid will provide a service framework to enable access in ways that are transparent to their users.
  • The community maintains and controls the Gateway
• Different Communities have different requirements.
  • NEES and LEAD will use TeraGrid to provision compute services
  • LIGO and NVO have substantial data distribution problems.
  • All of them require remote execution of complex workflows.

Technical Approach
• Develop web services interfaces (wrappers) for existing and emerging bioinformatics tools
• Integrate collections of tools into Life Science service bundles that can be deployed as persistent services on TeraGrid resources
• Integrate TG hosted Life Science services with existing end-user tools to provide scalable analysis capabilities

[Figure: Existing User Tools (e.g. GenDB); Life Science Gateway Service Dispatcher; Web Services Interfaces for Backend Computing; Life Science Services Bundles; TeraGrid Resource Partners]

[Figure: On-Demand Grid Computing. Labels: Streaming Observations; Forecast Model; Data Mining; Storms Forming]


Page 14

Initial Focus on 10 Gateways

Science Gateway Prototype | Discipline | Science Partner(s) | TeraGrid Liaison
Linked Environments for Atmospheric Discovery (LEAD) | Atmospheric | Droegemeier (OU) | Gannon (IU), Pennington (NCSA)
National Virtual Observatory (NVO) | Astronomy | Szalay (Johns Hopkins) | Williams (Caltech)
Network for Computational Nanotechnology (NCN) and “nanoHUB” | Nanotechnology | Lundstrom (PU) | Goasguen (PU)
National Microbial Pathogen Data Resource Center (NMPDR) | Biomedicine and Biology | Schneewind (UC), Osterman (Burnham/UCSD), DeLong (MIT), Dusko (INRA) | Stevens (UC/Argonne)
NSF National Evolutionary Synthesis Center (NESC), NIH Carolina Center for Exploratory Genetic Analysis, State of North Carolina Bioinformatics Portal project | Biomedicine and Biology | Cunningham (Duke), Magnuson (UNC) | Reed (UNC), Blatecky (UNC)
Neutron Science Instrument Gateway | Physics | Dunning (ORNL) | Cobb (ORNL)
Grid Analysis Environment | High-Energy Physics | Newman (Caltech) | Bunn (Caltech)
Transportation System Decision Support | Homeland Security | Stephen Eubanks (LANL) | Beckman (Argonne)
Groundwater/Flood Modeling | Environmental | Wells (UT-Austin), Engel (ORNL) | Boisseau (TACC)
Science Grid [GriPhyN/iVDGL/Grid3] | Multiple | Pordes (FNAL), Huth (Harvard), Avery (UFlorida) | Foster (UC/Argonne), Kesselman (USC-ISI), Livny (UW)

Page 15

Expanding User Base

[Chart: projected number of users (axis 0 to 6000) by year, 2005 through 2009. Series: OSG, Flood, HEP, SNS, NESC/CCEGA, OLSG, NCN, NVO, LEAD]

A new generation of “users” that access TeraGrid via Science Gateways, scaling well beyond the traditional “user” with a shell login account.

Projected user community size by each science gateway project.

Impact on society from gateways enabling decision support is much larger!

Page 16

LEAD (Linked Environments for Atmospheric Discovery)
• Storm forecasting
• Modeling
• Connection to sensor networks
• LEAD testbed
• Workflows
• Student usage

Page 17

Harnessing TeraGrid for Education
• Example: nanoHUB is used to complete coursework by undergraduate and graduate students in dozens of courses at 10 universities.

Page 18

Biomedical and Biology
Building Biomedical Communities – Dan Reed (UNC)
• National Evolutionary Synthesis Center
• Carolina Center for Exploratory Genetic Analysis
• Portals and federated databases for the Biomed research community

[Figure: Genetics and Disease Susceptibility. Labels: Identify Genes; Phenotype 1, Phenotype 2, Phenotype 3, Phenotype 4; Predictive Disease Susceptibility; Physiology; Metabolism; Endocrine; Proteome; Immune; Transcriptome; Biomarker Signatures; Morphometrics; Pharmacokinetics; Ethnicity; Environment; Age; Gender. Source: Terry Magnuson, UNC]

Science Communities and Outreach
• Communities
  • Students and educators
  • Phylogeneticists
  • Evolutionary biologists
  • Biomedical researchers
  • Biostatisticians
  • Computer scientists
  • Medical clinicians
• Partners
  • University of North Carolina
  • Duke University
  • North Carolina State University
  • NSF National Evolutionary Synthesis Center (NESC)
  • NIH Carolina Center for Exploratory Genetic Analysis (CCEGA)


Page 19

Neutron Science Gateway
• 17 instruments
• Users worldwide get “beam time”
• Need access to their data and post processing capabilities

Page 20

Flood Modeling/Homeland Security

Gordon Wells, UT; David Maidment, UT; Budhu Bhaduri, ORNL

Large-scale flooding along Brays Bayou in central Houston triggered by heavy rainfall during Tropical Storm Allison (June 9, 2001) caused more than $2 billion of damage.

Page 21

OSG / one VO: CMS…

Science Communities and Outreach
• Communities
  • CERN’s Large Hadron Collider experiments
  • Physicists working in HEP and similarly data intensive scientific disciplines
  • National collaborators and those across the digital divide in disadvantaged countries
• Scope
  • Interoperation between LHC Data Grid Hierarchy and ETF
  • Create and Deploy Scientific Data and Services Grid Portals
  • Bring the Power of ETF to bear on LHC Physics Analysis: Help discover the Higgs Boson!
• Partners
  • Caltech
  • University of Florida
  • Open Science Grid and Grid3
  • Fermilab
  • DOE PPDG
  • CERN
  • NSF GriPhyN and iVDGL
  • EU LCG and EGEE
  • Brazil (UERJ,…)
  • Pakistan (NUST,…)
  • Korea (KAIST,…)

[Figure: LHC Data Distribution Model]

Page 22

So how will we meet all these needs?
• With RATS! (Requirements Analysis Teams)
• Organized RATS
  • Collection, analysis and consolidation of requirements to jumpstart the work
  • And milestones

Page 23

Gateways RAT concludes after 2 months
• RAT team conducted interviews with all 10 Gateways
• Summarized requirements for each TeraGrid working group
• Draft a primer outline for new Gateways
• Organize this workshop to hear from others

Page 24

RAT summary
• Community allocations
• Group accounts / limited privileges
• Need for portal accounting capabilities, but little development
• On-demand scheduling
• Classifications (3 types)
  • Portals, desktop apps, access point to other grids
• User model (3 modes)
  • Standard, portal, community

Page 25

Actions for wg’s
• tg-acctmgmt
  • Support for accounts with differing capabilities
  • Ability to associate a compute job with an individual portal user
  • Scheme for portal registration and usage tracking
  • Support for OSG’s Grid User Management System (GUMS)
  • Dynamic accounts?
• security-wg
  • Define open port ranges
  • Firewalls
  • Community account privileges
  • Need to identify the human responsible for a job for incident response
  • Acceptance of other grid certificates
  • TG-hosted web servers, cgi-bin code
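
To make the account-management and security items concrete, here is a minimal sketch of how a gateway might keep a record tying each job submitted under a shared community account back to the individual portal user. It is illustrative only: the class, method, account and user names are all hypothetical and do not correspond to any TeraGrid, GUMS or portal API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class JobRecord:
    """One compute job submitted through the gateway (hypothetical record)."""
    job_id: str
    portal_user: str
    cpu_hours: float = 0.0


@dataclass
class CommunityAccountLedger:
    """Maps jobs run under one shared community account to portal users."""
    community_account: str
    jobs: Dict[str, JobRecord] = field(default_factory=dict)

    def record_submission(self, job_id: str, portal_user: str) -> None:
        # Refuse duplicates so a job always has exactly one responsible user.
        if job_id in self.jobs:
            raise ValueError(f"job {job_id} already recorded")
        self.jobs[job_id] = JobRecord(job_id, portal_user)

    def record_usage(self, job_id: str, cpu_hours: float) -> None:
        # Accumulate usage reported for a previously recorded job.
        self.jobs[job_id].cpu_hours += cpu_hours

    def responsible_user(self, job_id: str) -> Optional[str]:
        # Incident response: who submitted this job? None if unknown.
        record = self.jobs.get(job_id)
        return record.portal_user if record else None

    def usage_by_user(self) -> Dict[str, float]:
        # Usage tracking: total CPU hours per portal user.
        totals: Dict[str, float] = {}
        for record in self.jobs.values():
            totals[record.portal_user] = (
                totals.get(record.portal_user, 0.0) + record.cpu_hours
            )
        return totals


if __name__ == "__main__":
    ledger = CommunityAccountLedger("example_community_account")
    ledger.record_submission("job-001", "alice")
    ledger.record_submission("job-002", "bob")
    ledger.record_usage("job-001", 12.5)
    ledger.record_usage("job-002", 3.0)
    print(ledger.responsible_user("job-001"))
    print(ledger.usage_by_user())
```

The point of the sketch is that the resource provider sees only the community account, so the gateway itself has to hold the per-user mapping if usage tracking and incident response are to work.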

Page 26

Actions for wg’s (2)
• Web Services (currently no wg for this)
  • Needs further study
  • Some Gateways (LEAD, NMBR) have immediate needs
  • Many will build on capabilities offered by GT4, but interoperability could be an issue
  • Web Service security
  • Interfaces to scheduling and account management are common requirements
• software-wg
  • Interoperability of CTSS and VDT for OSG
  • Software installations across all TG sites
  • Community software areas
• portals-wg
  • Variety of approaches needs further analysis
  • OGCE, In-VIGO, Clarens, Neutron Science Tomcat+Apache
  • TG User Portal

Page 27

Follow on RATs formed
• Web services RAT – Ivan Judson
  • GT4
• Portal technology RAT – John Cobb
  • OGCE
  • Clarens
  • In-VIGO
  • …
• OSG RAT – Stuart Martin
  • OSG/CMS DAC, porting CMS apps to TG resources
  • Job forwarding between gatekeepers
  • Exposing TG resources to OSG
  • …and vice versa !!

Page 28

Gateways Primer Outline
1. Introduction
2. Science Gateway in Context
   a. Science Gateway (SGW) Definition(s)
   b. Science Gateway user modes
   c. Distinction between SGW and other TeraGrid user modes
3. Components of a Science Gateway
   a. User Model
   b. Gateway targeted community
   c. Gateway Services
   d. Integration with TeraGrid external resources (data collections, services, …)
   e. Organizational and administrative structure
4. TeraGrid services and policies available for Science Gateways
   a. Portal middleware tools (user portal and other portal tools)
   b. Account Management (user models, community accounts)
   c. Security environment (security models)
   d. Web Services
   e. Scheduling services (and meta-scheduling)
   f. Community accounts and allocations
   g. Community Software Areas
   h. All traditional TeraGrid services and resources
   i. Ability to propose additional services and how that would interact with TeraGrid operations
5. Responsibilities and Requirements for Science Gateways
   a. Interaction with and compatibility with TeraGrid communities
   b. Control procedures
      i. Community user identification and tracking (map TeraGrid usage to Portal user)
      ii. Use monitoring and reporting
      iii. Security and trust
      iv. Appropriate use
6. How to get started
   a. Existing resources
      i. Publication references
      ii. Web areas with more details
      iii. Online tutorials
      iv. Upcoming presentations and tutorials
   b. Who to contact for initial discussions
   c. How to propose a new Gateway
   d. How to integrate with TeraGrid Gateways efforts
   e. How to obtain a resource allocation

Page 29

Timelines - Fall, 2005

Deploy 3 prototype portals (LEAD, Bioinformatics, Evolutionary Biology)

Define work plan and application characteristics (NVO, nanoHub, Neutron Science)

Port/install software (Homeland Security, Flood Analysis, OSG)

Analyze Gateway needs, plans for OSG integration (TG)

Page 30

Spring, 2006
• Explore authentication methods (NVO)
• Integrate TG compute resources, incl. support for large scale computing (LEAD, nanoHUB, Bioinf., Evo. Bio., HEP, OSG)
• Run a workshop (nanoHUB)
• Prototypes
  • Web/grid services (Bioinformatics)
  • Data archive hosting (Neutron Science)
  • Data federation models with compute support (Evo. Bio.)
• Application hosting services, initial compute resource brokering and data federation. Test for security, scalability (TG)
• Code porting and verification (Homeland, Flood Modeling, OSG)
• TG/OSG security and accounting mechanisms (TG)

Page 31

Today’s Schedule
• 10-11:30 Session 1 - Science portals (1) - Wilkins-Diehr/Goasguen
  • TG Science Gateways (20 min.) CCLRC Bioscience LEAD
• <coffee>
• 12-1:30 Session 2 - Science portals (2) - Oleksiak
  • Rick Stevens talk (45 min.) TG vis portal nanoHUB
• <lunch>
• 2:30-4 Session 3 - Science portals (3) - Foster
  • Telescience NAREGI GridSAT ORNL
• <break>
• 4:30-6 Session 4 - Job submission portals - Lim/Sekiguchi
  • GENIUS PROGRESS DEISA HPC-Europa
• <break>
• 6:30-8 Session 5 - Enabling technologies - Dolkas
  • GRIA GridASP AAAA InVIGO OGCE
• <collapse>