science gateways on the teragrid

21
Science Gateways on the TeraGrid Von Welch, NCSA [email protected] (with thanks to Nancy Wilkins-Diehr, SDSC for many slides)

Upload: altessa

Post on 11-Jan-2016

27 views

Category:

Documents


2 download

DESCRIPTION

Science Gateways on the TeraGrid. Von Welch, NCSA [email protected] (with thanks to Nancy Wilkins-Diehr, SDSC for many slides). Building a distributed system of unprecedented scale 40+ teraflops compute 1+ petabyte storage 10-40Gb/s networking - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Science Gateways on the TeraGrid

Science Gateways on the TeraGrid

Von Welch, [email protected]

(with thanks to Nancy Wilkins-Diehr, SDSC for many slides)

Page 2: Science Gateways on the TeraGrid

The TeraGrid Strategy• Building a distributed

system of unprecedented scale– 40+ teraflops compute – 1+ petabyte storage– 10-40Gb/s networking

• Creating a unified user environment across heterogeneous resources– User software environment,

User support resources.– Created an initial community

of over 500 users, 80 PI’s.

• Integrating new partners to introduce new capabilities– Additional computing,

visualization capabilities– New types of resources-

data collections, instruments

Make it extensible!

Page 3: Science Gateways on the TeraGrid

The TeraGrid Team• Two major components:

– 9 Resource Providers (RPs) who provide resources and expertise

• Seven universities• Two government laboratories• Expected to grow

– The Grid Integration Group (GIG) who provides leadership in grid integration among the RPs

• Led by Director, who is assisted by Executive Steering Committee, Area Directors, Project Manager

• Includes participation by staff at each RP

• Funding now provided for people, not just networks and hardware!

Page 4: Science Gateways on the TeraGrid

TeraGrid Resource Partners

Page 5: Science Gateways on the TeraGrid

TeraGrid ResourcesANL/UC Caltech IU NCSA ORNL PSC Purdue SDSC TACC

Compute

Resources

Itanium2(0.5 TF)

IA-32(0.5 TF)

Itanium2(0.8 TF)

Itanium2(0.2 TF)

IA-32(2.0 TF)

Itanium2 (10 TF)

SGI SMP(6.5 TF)

IA-32(0.3 TF)

XT3(10 TF)TCS (6 TF)Marvel(0.3 TF)

Hetero (1.7 TF)

Itanium2(4.4 TF)

Power4+(1.1 TF)

IA-32(6.3 TF)

Sun (Vis)

Online Storage

20 TB 155 TB 32 TB 600 TB 1 TB 150 TB

540 TB 50 TB

Mass

Storage

1.2 PB 3 PB 2.4 PB 6 PB 2 PB

Data Collections

Yes Yes Yes Yes Yes

Visualization

Yes Yes Yes Yes Yes

Instruments Yes Yes Yes

Network

(Gb/s,Hub)

30

CHI

30

LA

10

CHI

30

CHI

10

ATL

30

CHI

10

CHI

30

LA

10

CHI

Partners will add resources and TeraGrid will add partners!

Page 6: Science Gateways on the TeraGrid

Science GatewaysA new initiative for the TeraGrid

• Increasing investment by communities to build their own cyberinfrastructure.

• Heterogeneity– Resources - different architectures at local, national and

international levels– Users- from HPC expert to K-12 student…they should all

benefit from CI.– Software stacks, policies.

• How can “centers/Institutions” provide, operate, maintain in this heterogeneous world ?

• Working with Gateways, TeraGrid will start to answer that question by providing generic CI services to communities.

• Integration and interoperability.

Page 7: Science Gateways on the TeraGrid

What are Gateways?• Gateways will

– engage communities that are not traditional users of the supercomputing centers

• by

– providing community-tailored access to TeraGrid services and capabilities

• Three examples:– Web-based Portals that front-end Grid Services that provide teragrid-deployed

applications used by a community.– Coordinated access points enabling users to move seamlessly between

TeraGrid and other grids.– Application programs running on users' machines but accessing services in

TeraGrid (and elsewhere)

• All take advantage of existing community investment in software, services, education, and other components of Cyberinfrastructure.

Page 8: Science Gateways on the TeraGrid

Grid Portal Gateways

• The Portal accessed through a browser or desktop tools

– Provides Grid authentication and access to services

– Provide direct access to TeraGrid hosted applications as services

• The Required Support Services– Searchable Metadata catalogs– Information Space Management.– Workflow managers– Resource brokers– Application deployment services – Authorization services.

• Builds on NSF & DOE software– Use NMI Portal Framework, GridPort– NMI Grid Tools: Condor, Globus, etc.– OSG, HEP tools: Clarens, MonaLisa

Technical Approach

Biomedical and Biology, Building Biomedical Communities

OGCE Portletswith ContainerOGCE Portletswith Container

Apache JetspeedInternal ServicesApache JetspeedInternal Services

ServiceAPI

ServiceAPI

GridProtocols

GridServiceStubs

GridServiceStubs

RemoteContentServices

RemoteContentServices

RemoteContentServersHTTP

GridService

sLocalPortal

Services

LocalPortal

Services

Grid Resources

Open Source Tools

Build standard portals to meet the domain requirements of the biology communitiesDevelop federated databases to be replicated and shared across TeraGrid

Workflow Composer

Page 9: Science Gateways on the TeraGrid

Initial Focus on 10 GatewaysScience Gateway Prototype Discipline Science Partner(s) TeraGrid Liaison

Linked Environments for Atmospheric Discovery (LEAD)

Atmospheric Droegemeier (OU) Gannon (IU), Pennington (NCSA)

National Virtual Observatory (NVO)

Astronomy Szalay (Johns Hopkins) Williams (Caltech)

Network for Computational Nanotechnology (NCN) and “nanoHUB”

Nanotechnology Lundstrum (PU) Goasguen (PU)

National Microbial Pathogen Data Resource Center (NMPDR)

Biomedicine and Biology Schneewind (UC), Osterman (Burnham/UCSD), DeLong (MIT), Dusko (INRA)

Stevens (UC/Argonne)

NSF National Evolutionary Biology Center (NESC), NIH Carolina Center for Exploratory Genetic Analysis, State of North Carolina Bioinformatics Portal project

Biomedicine and Biology Cunningham (Duke), Magnuson (UNC)

Reed (UNC), Blatecky (UNC)

Neutron Science Instrument Gateway

Physics Dunning (ORNL) Cobb (ORNL)

Grid Analysis Environment High -Energy Physics Newman (Caltech) Bunn (Caltech)

Transportation System Decision Support

Homeland Security Stephen Eubanks (LAN L) Beckman (Argonne)

Groundwater/Flood Modeling Environmental Wells (UT -Austin), Engel (ORNL) Boisseau (TACC)

Science Grid [GrPhyN/ivDGL/Grid3]

Multiple Pordes (FNAL), Huth (Harvard), Avery (Uflorida)

Foster (UC/Argonne), Kesselman (USC -ISI), Livny (UW)

Page 10: Science Gateways on the TeraGrid

Expanding User Base

0

1000

2000

3000

4000

5000

6000

1 2 3 4 5

OSG

Flood

HEP

SNS

NESC/CCEGA

OLSG

NCN

NVO

LEAD

0

1000

2000

3000

4000

5000

6000

2005 2006 2007 2008 2009

OSG

Flood

HEP

SNS

NESC/CCEGA

OLSG

NCN

NVO

LEAD

A new generation of “users” that access TeraGrid via Science Gateways, scaling well beyond the traditional “user” with a shell login account.

Projected user community size by each science gateway project.

Impact on society from gateways enabling decision support is much larger!

Page 11: Science Gateways on the TeraGrid

So how will we meet all these needs?• With RATS!

(Requirements Analysis Teams)

• Organized RATS• Collection, analysis

and consolidation of requirements to jumpstart the work

• And milestones

Page 12: Science Gateways on the TeraGrid

Rats de Paris

Page 13: Science Gateways on the TeraGrid

Traditional HPC Model• All user have accounts at each

site/resource– NxN matrix of users and sites

• Users access resources through low-level interfaces– E.g. Unix Shells, FTP session

• Resource takes care of all the security– AAAA: Authentication, Authorization,

Auditing, Accounting

Page 14: Science Gateways on the TeraGrid

Traditional HPC Usage

% ls% foo

AUTHn

OS(Authz)

AuditAccounting

Page 15: Science Gateways on the TeraGrid

Science Gateway Motivation• Shell-level access to resources is great for power

users, but has steep learning curve– Many SG users just need domain-specific interface, e.g.

they are not developing or deploying application codes

• Each resource/site has to maintain state about every user– Scalability problems for large/dynamic user communities

• No abstraction - users must adapt to all changes in resources

Page 16: Science Gateways on the TeraGrid

SG Security Model• SG acts as a interface between the

community and its resources• Much like a traditional ‘Grid Portal’, it provides

a domain-specific interface• However, unlike portals, it exists as a trusted

entity in its own right, allowing the resource to “outsource” AAAA functionality to the SG

• Resources runs all commands in a community account, which constrains what community can do - account can be constrained to a few community applications

Page 17: Science Gateways on the TeraGrid

Conceptual Model

% ls% foo

% ls% foo

% ls% foo

Page 18: Science Gateways on the TeraGrid

SG AAAA Model

% ls% foo

• Security functions held by the resource are now split between resource and Science Gateway

• However there is a strong need to communicate between the two• Resource will want full audit information and user information to

investigate suspicious activity• SG needs accounting information to do allocations and reporting

(e.g. who is using the SG)

Accounting

Authn

User-level Authz

Community-levelAuthz

User-LevelAudit

Job-LevelAudit

Page 19: Science Gateways on the TeraGrid

Outstanding Challenges• How to identify a job between SG and resource?

– “/bin/foo run at 15:38:13 (my time)” not very accurate

• Standard template for resource/SG agreement– Akin to certificate policy

• Acceptance of group accounts– Convince folks its ok to outsource

• Restricted accounts– Cookbook to restrict account to certain

applications• Sandboxing of users from each others• Community administrators

– Those who set up group account

Page 20: Science Gateways on the TeraGrid

Outstanding Challenges (cont)• Each SG forms its own VO

– TeraGrid provides resources– SG provides the user

• I’ve mostly talked about SG/TeraGrid relationship• But how SGs will manage their users is open

– Authentication, Authorization, Contact information… (the whole list Jill just gave)

– Users distributed over multiple domains– Wanting to get into the 1000’s of users– Different communities for each SG

• TeraGrid would like to help as much as possible here as well

Page 21: Science Gateways on the TeraGrid

Questions?