GRIDS Center

Grid Research Integration Development & Support

http://www.grids-center.org

Copyright Thomas Garritano, 2002. This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.

Chicago - NCSA – SDSC - USC/ISI - Wisconsin

GRIDS, part of the NSF Middleware Initiative (NMI)

• The Information Sciences Institute (ISI) at the University of Southern California (Carl Kesselman)

• The University of Chicago (Ian Foster)

• The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (Randy Butler) 

• The University of California at San Diego (Phil Papadopoulos)

• The University of Wisconsin at Madison (Miron Livny) 

Enabling Seamless Collaboration

GRIDS will help distributed communities pursue common goals

• Scientific research
• Engineering design
• Education
• Artistic creation

Focus is on the enabling mechanisms required for collaboration

Resource sharing as a fundamental concept

Grid Computing Rationale

The need for flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources

See “The Anatomy of the Grid: Enabling Scalable Virtual Organizations” by Foster, Kesselman, Tuecke at http://www.globus.org (in the “Publications” section)

The need for communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals while assuming the absence of:

• central location
• central control
• omniscience
• existing trust relationships

Elements of Grid Computing

Resource sharing
• Computers, storage, sensors, networks
• Sharing is always conditional, based on issues of trust, policy, negotiation, payment, etc.

Coordinated problem solving
• Beyond client-server: distributed data analysis, computation, collaboration, etc.

Dynamic, multi-institutional virtual organizations
• Community overlays on classic org structures
• Large or small, static or dynamic

Resource-Sharing Mechanisms

• Should address security and policy concerns of resource owners and users

• Should be flexible and interoperable enough to deal with many resource types and sharing modes

• Should scale to large numbers of resources, participants, and/or program components

• Should operate efficiently when dealing with large amounts of data & computational power

Grid Applications

Science portals
• Help scientists overcome steep learning curves of installing and using new software
• Solve advanced problems by invoking sophisticated packages remotely from Web browsers or “thin clients”
• Portals are currently being developed in biology, fusion, computational chemistry, and other disciplines

Distributed computing
• High-speed workstations and networks can yoke together an organization's PCs to form a substantial computational resource
• E.g., U.S. and Italian mathematicians pooled resources for one week, aggregating 42,000 CPU-days to solve "Nug30"

Grid Portals

Mathematicians Solve NUG30

Looking for the solution to the NUG30 quadratic assignment problem

An informal collaboration of mathematicians and computer scientists

Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)

14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23

MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
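
As a rough illustration of what a quadratic assignment problem asks, the sketch below scores an assignment of facilities to locations and brute-forces a tiny instance. The 3x3 `flow` and `dist` matrices are made-up values, not NUG30 data, and brute force is only feasible at this toy size; NUG30 itself needed the distributed search described on the slide. The sketch also checks that the 30-number sequence shown on the slide is a permutation of 1..30 (it is presumably the reported assignment) and converts the quoted 3.46E8 CPU seconds into CPU-days.

```python
from itertools import permutations

def qap_cost(flow, dist, assignment):
    """Cost of placing facility i at location assignment[i]."""
    n = len(assignment)
    return sum(
        flow[i][j] * dist[assignment[i]][assignment[j]]
        for i in range(n)
        for j in range(n)
    )

def brute_force_qap(flow, dist):
    """Try every assignment; only practical for very small n."""
    n = len(flow)
    return min(permutations(range(n)), key=lambda p: qap_cost(flow, dist, p))

# Hypothetical toy instance (not NUG30 data).
flow = [[0, 5, 2],
        [5, 0, 3],
        [2, 3, 0]]
dist = [[0, 1, 4],
        [1, 0, 2],
        [4, 2, 0]]

best = brute_force_qap(flow, dist)
best_cost = qap_cost(flow, dist, best)

# The sequence printed on the slide.
slide_sequence = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25,
                  22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]
is_permutation = sorted(slide_sequence) == list(range(1, 31))

# Unit conversion for the figures quoted on the slide.
cpu_seconds = 3.46e8
cpu_days = cpu_seconds / 86_400      # about 4,005 CPU-days
avg_processors = cpu_days / 7        # about 572 busy on average over 7 days

print(best, best_cost, is_permutation, round(cpu_days), round(avg_processors))
```

The average of roughly 572 processors sits below the quoted peak of 1009, which is what one would expect from a pool of machines that come and go.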

Home Computers Evaluate AIDS Drugs

Community =
• 1000s of home computer users
• Philanthropic computing vendor (Entropia)
• Research group (Scripps)

Common goal = advance AIDS research

More Grid Applications

Large-scale data analysis
• Science increasingly relies on large datasets that benefit from distributed computing and storage
• E.g., the Large Hadron Collider at CERN will generate many petabytes of data from high-energy physics experiments, with single-site storage impractical for technical and political reasons

Computer-in-the-loop instrumentation
• Data from telescopes, synchrotrons, and electron microscopes are traditionally archived for batch processing
• Grids are permitting quasi-real-time analysis that enhances the instruments’ capabilities
• E.g., with sophisticated “on-demand” software, astronomers may be able to use automated detection techniques to zoom in on solar flares as they occur

Data Grids for High Energy Physics

[Figure: tiered data grid for high energy physics, labeled Tier 0, Tier 1, Tier 2, and Tier 4. Image courtesy Harvey Newman, Caltech.]

Sites and capacities labeled in the figure:
• Online System
• Offline Processor Farm, ~20 TIPS
• CERN Computer Centre
• Regional centres: FermiLab (~4 TIPS), France, Italy, Germany
• Tier2 Centres, ~1 TIPS each (including Caltech, ~1 TIPS)
• Institutes, ~0.25 TIPS, each with a physics data cache
• Physicist workstations

Link rates labeled in the figure: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or air freight, deprecated), ~1 MBytes/sec

Notes:
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second
• Each triggered event is ~1 MByte in size
• Physicists work on analysis “channels.”
• Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
• 1 TIPS is approximately 25,000 SpecInt95 equivalents
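
The slide's figures can be checked with a little arithmetic. The sketch below is a back-of-the-envelope calculation, not part of the original talk: it derives the triggered data rate from the trigger rate and event size, compares it with a 622 Mbit/s link, and estimates a yearly volume under the assumption (mine, not the slide's) that the detector runs continuously all year.

```python
# Figures taken from the slide.
BUNCH_CROSSING_INTERVAL_S = 25e-9   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100
EVENT_SIZE_MB = 1.0                 # ~1 MByte per triggered event
LINK_MBITS_PER_S = 622

# Assumption for illustration only: continuous running for a full year.
SECONDS_PER_YEAR = 365 * 24 * 3600

# Triggered data rate: 100 events/s * 1 MB = 100 MB/s,
# matching the ~100 MBytes/sec label in the figure.
event_rate_mb_s = TRIGGERS_PER_SECOND * EVENT_SIZE_MB

# A 622 Mbit/s link expressed in MBytes/sec.
link_mb_s = LINK_MBITS_PER_S / 8

# Fraction of the triggered stream one such link could carry.
link_fraction = link_mb_s / event_rate_mb_s

# Yearly volume in petabytes (decimal units: 1 PB = 1e9 MB).
yearly_pb = event_rate_mb_s * SECONDS_PER_YEAR / 1e9

# How selective the trigger is: events kept per bunch crossing.
crossings_per_second = 1 / BUNCH_CROSSING_INTERVAL_S
kept_fraction = TRIGGERS_PER_SECOND / crossings_per_second

print(f"triggered rate: {event_rate_mb_s:.0f} MB/s")
print(f"622 Mbit/s link: {link_mb_s:.2f} MB/s ({link_fraction:.0%} of the stream)")
print(f"yearly volume if always on: {yearly_pb:.2f} PB")
print(f"crossings kept: about 1 in {1 / kept_fraction:,.0f}")
```

Under these assumptions a single 622 Mbit/s link carries less than the full triggered stream, which is consistent with the slide's emphasis on tiered distribution and institute-level caching.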

Online Access to Scientific Instruments

[Figure: Advanced Photon Source workflow, with stages labeled real-time collection, tomographic reconstruction, archival storage, wide-area dissemination, and desktop & VR clients with shared controls]

DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago

Still More Grid Applications

Collaborative work
• Researchers often want to aggregate not only data and computing power, but also human expertise
• Grids enable collaborative problem formulation and data analysis
• E.g., an astrophysicist who has performed a large, multi-terabyte simulation could let colleagues around the world simultaneously visualize the results, permitting real-time group discussion
• E.g., civil engineers collaborate to design, execute, & analyze shake table experiments

iVDGL: International Virtual Data Grid Laboratory

[Figure: map of iVDGL sites. Legend: Tier0/1 facility, Tier2 facility, Tier3 facility; 10 Gbps link, 2.5 Gbps link, 622 Mbps link, other link]

U.S. PIs: Avery, Foster, Gardner, Newman, Szalay www.ivdgl.org

Network for Earthquake Engineering Simulation

NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, and each other

On-demand access to experiments, data streams, computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

The 13.6 TF TeraGrid: Computing at 40 Gb/s

[Figure: TeraGrid site diagram showing four sites (NCSA/PACI: 8 TF, 240 TB; SDSC: 4.1 TF, 225 TB; Caltech; Argonne), each with site resources, archival storage (HPSS or UniTree), and connections to external networks]

TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org

Grids and Industry

Grid computing has much in common with major industrial thrusts

Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, Distributed Computing, Internet Computing, etc.

Outsourcing increases decentralization of resources

Sharing issues are not adequately addressed by existing technologies

Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”

Companies like IBM, Platform Computing and Microsoft are getting substantively involved with the open-source Grid community (e.g., web services and Grid services)
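
To make the “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q” requirement concrete, here is a minimal sketch in which each policy is a function and a request is allowed only if every applicable policy approves it. All names (`Request`, `community_policy`, `data_policy`, the user and dataset values) are hypothetical and chosen for illustration; this is not how any particular Grid toolkit expresses policy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    user: str
    program: str   # "program X"
    site: str      # "site Y"
    dataset: str   # data held at "Z"

def community_policy(req: Request) -> bool:
    """Hypothetical policy P: the community only runs approved programs."""
    return req.program in {"simulate", "analyze"}

def data_policy(req: Request) -> bool:
    """Hypothetical policy Q: the data owner lists who may read each dataset."""
    readers = {"run42": {"alice"}}
    return req.user in readers.get(req.dataset, set())

def authorize(req: Request, policies) -> bool:
    """Allow the request only if every policy approves it."""
    return all(policy(req) for policy in policies)

policies = [community_policy, data_policy]
allowed = authorize(Request("alice", "simulate", "siteY", "run42"), policies)
denied = authorize(Request("bob", "simulate", "siteY", "run42"), policies)
print(allowed, denied)
```

Keeping each stakeholder's policy as a separate function reflects the slide's point that the resource owner, the community, and the data owner may each impose independent conditions.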

eBusiness Grids

• Engineers at a multinational company collaborate on the design of a new product

• A multidisciplinary analysis in aerospace couples code and data in four companies

• An insurance company mines data from partner hospitals for fraud detection

• An application service provider offloads excess load to a compute cycle provider

• An enterprise configures internal & external resources to support eBusiness workload

Grid Computing: Why Now?

• Moore’s law improvements in computing produce highly functional end systems

• The Internet and burgeoning wired and wireless networks provide universal connectivity

• Changing modes of problem solving emphasize teamwork, computation

• Network exponentials produce dramatic changes in geometry and geography

Network Exponentials

Network vs. computer performance
• Computer speed doubles every 18 months
• Network speed doubles every 9 months
• Difference = order of magnitude per 5 years

1986 to 2000
• Computers: x 500
• Networks: x 340,000

2001 to 2010
• Computers: x 60
• Networks: x 4000

Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan-2001) by Cleo Vilett, source Vinod Khosla, Kleiner Perkins Caufield & Byers.
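
The doubling periods on this slide imply the quoted growth factors. The short sketch below assumes smooth exponential growth, 2^(elapsed months / doubling period), and reproduces the “order of magnitude per 5 years” gap and the 2001 to 2010 multipliers; the slide's numbers are rounded, so the match is approximate.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplier after `months` of growth that doubles every given period."""
    return 2 ** (months / doubling_period_months)

COMPUTER_DOUBLING = 18   # months, from the slide
NETWORK_DOUBLING = 9     # months, from the slide

# Gap that opens up over five years.
computers_5y = growth_factor(60, COMPUTER_DOUBLING)   # about 10x
networks_5y = growth_factor(60, NETWORK_DOUBLING)     # about 100x
gap_5y = networks_5y / computers_5y                   # about 10x

# 2001 to 2010 is nine years, i.e. 108 months.
computers_2001_2010 = growth_factor(108, COMPUTER_DOUBLING)  # 2**6  = 64
networks_2001_2010 = growth_factor(108, NETWORK_DOUBLING)    # 2**12 = 4096

print(f"5-year gap: {gap_5y:.1f}x")
print(f"2001-2010: computers x{computers_2001_2010:.0f}, "
      f"networks x{networks_2001_2010:.0f}")
```

The computed x64 and x4096 line up with the slide's rounded “x 60” and “x 4000”.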

GRIDS and the NSF Middleware Initiative

GRIDS is one of two NMI teams; the other is EDIT

NMI seeks standard components and mechanisms
• Authentication, authorization, policy
• Resource discovery and directory
• Remote access of computers, data, instruments

Also seeks:
• Integration with end-user tools (conferencing, data analysis, data sharing, distributed computing, etc.)
• Integration with campus infrastructures
• Integration with commercial technologies

GRIDS Deliverables for NMI Release 1.0

On May 7, NMI Release 1.0 will be issued (see www.nsf-middleware.org), including deliverables from the GRIDS and EDIT teams

GRIDS software in NMI-R1 will include new versions of:

• Globus Toolkit™
• Condor-G
• Network Weather Service

The package also includes KX.509

The Globus Toolkit™

The de facto standard for Grid computing
• A modular “bag of technologies” addressing key technical problems facing Grid tools, services and applications
• Made available under liberal open source license
• Simplifies collaboration across virtual organizations

Authentication
• Grid Security Infrastructure (GSI)

Scheduling
• Globus Resource Allocation Manager (GRAM)
• Dynamically Updated Request Online Coallocator (DUROC)

File transfer
• Global Access to Secondary Storage (GASS)
• GridFTP

Resource description
• Monitoring and Discovery Service (MDS)

Condor-G

High performance computing (HPC) is often measured in operations per second; with high throughput computing (HTC), Condor permits increased processing capacity over longer periods of time

CPU cycles/day (week, month, year?) under non-ideal circumstances

“How many times can I run simulation X in a month using all available machines?”

The Condor Project develops, deploys, and evaluates mechanisms and policies for HTC on large collections of distributed systems

NMI-R1 will include Condor-G, an enhanced version of the core Condor software optimized to work with Globus Toolkit™ for managing Grid jobs
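
The high-throughput question on this slide (“How many times can I run simulation X in a month using all available machines?”) can be estimated with simple arithmetic. The sketch below is not Condor code: the machine pool, relative speeds, and availability fractions are hypothetical, and it assumes a run must finish on the machine where it started, so each machine contributes only whole runs.

```python
def runs_per_month(machines, hours_per_run, days=30):
    """Estimate completed runs over `days`.

    machines: list of (relative_speed, fraction_of_time_available) pairs,
              where speed 1.0 means a run takes `hours_per_run` hours.
    """
    total = 0
    for speed, availability in machines:
        usable_hours = days * 24 * availability
        # Only whole runs count; partial work on a machine is discarded.
        total += int(usable_hours * speed // hours_per_run)
    return total

# Hypothetical pool: ten desktops idle half the time, plus four
# machines that are twice as fast but free only a quarter of the time.
pool = [(1.0, 0.5)] * 10 + [(2.0, 0.25)] * 4

total_runs = runs_per_month(pool, hours_per_run=6)
print(total_runs)
```

The point of the HTC framing is visible in the inputs: availability over the month matters as much as peak speed.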

Network Weather Service

From UC Santa Barbara, NWS monitors and dynamically forecasts performance of network and computational resources

Uses a distributed set of performance sensors (network monitors, CPU monitors, etc.) for instantaneous readings

Numerical models’ ability to predict conditions is analogous to weather forecasting – hence the name

For use with the Globus Toolkit and Condor, allowing dynamic schedulers to provide statistical Quality-of-Service readings

NWS forecasts end-to-end TCP/IP performance (bandwidth and latency), available CPU percentage and available non-paged memory

NWS automatically identifies the best forecasting technique for any given resource
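
One way to picture “automatically identifies the best forecasting technique” is to run several simple predictors over the measurement history and keep whichever has made the smallest errors so far. The sketch below illustrates that idea only; it is not the NWS implementation, and the three predictors and the sample bandwidth series are assumptions chosen for brevity.

```python
from statistics import mean, median

def last_value(history):
    return history[-1]

def running_mean(history):
    return mean(history)

def sliding_median(history, window=5):
    return median(history[-window:])

# Candidate forecasters (illustrative choices, not the NWS set).
FORECASTERS = {
    "last_value": last_value,
    "running_mean": running_mean,
    "sliding_median": sliding_median,
}

def best_forecaster(measurements):
    """Replay the series and return the name with the lowest total absolute error."""
    errors = {name: 0.0 for name in FORECASTERS}
    for t in range(1, len(measurements)):
        history, actual = measurements[:t], measurements[t]
        for name, predict in FORECASTERS.items():
            errors[name] += abs(predict(history) - actual)
    return min(errors, key=errors.get)

def forecast(measurements):
    """Predict the next reading using the historically most accurate method."""
    name = best_forecaster(measurements)
    return name, FORECASTERS[name](measurements)

# Hypothetical bandwidth readings (Mbit/s) that trend steadily upward.
bandwidth = [10, 20, 30, 40, 50, 60]
chosen, prediction = forecast(bandwidth)
print(chosen, prediction)
```

On a steadily rising series the most recent reading tracks the trend best, so `last_value` is selected; a noisy but stable series would tend to favor the mean or median instead.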

KX.509 for Converting Kerberos Certificates to PKI

Stand-alone client program from the University of Michigan

For a Kerberos-authenticated user, KX.509 acquires a short-term X.509 certificate that can be used by PKI applications

Stores the certificate in the local user's Kerberos ticket file

Systems that already have a mechanism for removing unused Kerberos credentials may also automatically remove the X.509 credentials

Web browser may then load a library (PKCS11) to use these credentials for https

The client reads X.509 credentials from the user’s Kerberos cache and converts them to PEM, the format used by the Globus Toolkit
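
The final conversion step can be sketched with Python's standard library. This is not the KX.509 client: it assumes the credential is available as DER-encoded bytes (the usual binary encoding for X.509, but an assumption here) and uses placeholder bytes rather than a real certificate, so the output only shows the PEM wrapping, not a usable credential.

```python
import ssl

def der_to_pem(der_bytes: bytes) -> str:
    """Wrap DER-encoded certificate bytes in PEM (base64 plus header/footer)."""
    return ssl.DER_cert_to_PEM_cert(der_bytes)

# Placeholder bytes standing in for a DER certificate (NOT a valid certificate).
placeholder_der = bytes(range(48))

pem_text = der_to_pem(placeholder_der)

# Converting back recovers the original bytes, confirming PEM is only a re-encoding.
round_trip = ssl.PEM_cert_to_DER_cert(pem_text)

print(pem_text)
```

PEM is a text wrapper around the same certificate data, which is why a short-lived certificate held in one store can be handed to tools that expect PEM files.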

GRIDS Integration Issues

Ten NMI testbed sites will be early adopters, seeking integration of enterprise and Grid computing

• Eight sites to be announced soon by SURA
• Two further sites: Caltech and USC

Via NMI partnerships, GRIDS will help identify points of intersection and divergence between Grid and enterprise computing

• Directory services
• Authorization, authentication and security

Emphasis is on open standards and architectures as the route to successful collaboration