introduction to grid computing and the globus toolkit™ the globus project™ argonne national...
Post on 21-Dec-2015
229 views
TRANSCRIPT
Introduction to Grid Computing and
the Globus Toolkit™
The Globus Project™Argonne National Laboratory
USC Information Sciences Institute
http://www.globus.org/
Copyright (c) 2002 University of Chicago and The University of Southern California. All Rights Reserved. This presentation is licensed for use under the terms of the Globus Toolkit Public License.See http://www.globus.org/toolkit/download/license.html for the full text of this license.
2Introduction to the Globus Toolkit™
Outline Introduction to Grid Computing Grid Problem The Globus Toolkit™
– Introduction, Security, Resource Management, Information Services, Data Management
Future and Conclusions
3Introduction to the Globus Toolkit™
The Grid Problem Flexible, secure, coordinated resource sharing among
dynamic collections of individuals, institutions, and resource
From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”
Enable communities (“virtual organizations”) to share geographically distributed resource, assuming the absence of…– central location,
– central control,
– existing trust relationships.
4Introduction to the Globus Toolkit™
What is a Virtual Organization?
• Facilitates the workflow of a group of users across multiple domains who share (some of) their resources to solve particular classes of problems
5Introduction to the Globus Toolkit™
Elements of the Problem
Resource sharing– Computers, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, negotiation, payment, …
Coordinated problem solving– Beyond client-server: distributed data analysis,
computation, collaboration, … Dynamic, multi-institutional virtual orgs
– Community overlays on classic org structures
– Large or small, static or dynamic
6Introduction to the Globus Toolkit™
tomographic reconstruction
real-timecollection
wide-areadissemination
desktop & VR clients with shared controls
Advanced Photon Source
Online Access to Scientific Instruments
archival storage
7Introduction to the Globus Toolkit™
The Globus Project™Making Grid computing a reality
Close collaboration with real Grid projects in science and industry
Development and promotion of standard Grid protocols
Development and promotion of standard Grid software APIs and SDKs
The Globus Toolkit™: Open source, reference software base for building grid infrastructure and applications
8Introduction to the Globus Toolkit™
One View of Requirements Identity & authentication Authorization & policy Resource discovery Resource characterization Resource allocation (Co-)reservation, workflow Distributed algorithms Remote data access High-speed data transfer Performance guarantees Monitoring
Adaptation Intrusion detection Resource management Accounting & payment Fault management System evolution Etc. Etc. …
9Introduction to the Globus Toolkit™
Where Are We With Architecture?
No “official” standards exist But:
– Globus Toolkit™ has emerged as the popular standard for several important Connectivity, Resource, and Collective protocols
– GGF has an architecture working group– Technical specifications are being developed
for architecture elements: e.g., security, data, resource management, information
– Internet drafts submitted in security area
10Introduction to the Globus Toolkit™
Globus Toolkit™
A software toolkit addressing key technical problems in the development of Grid enabled tools, services, and applications– Offer a modular “bag of technologies”
– Enable incremental development of grid-enabled tools and applications
– Implement standard Grid protocols and APIs
– Make available under liberal open source license
11Introduction to the Globus Toolkit™
General Approach
Define Grid protocols & APIs– Protocol-mediated access to remote resources
– Integrate and extend existing standards
– “On the Grid” = speak “Intergrid” protocols Develop a reference implementation
– Open source Globus Toolkit
– Client and server SDKs, services, tools, etc. Grid-enable wide variety of tools
– Globus Toolkit, FTP, SSH, Condor, SRB, MPI, … Learn through deployment and applications
12Introduction to the Globus Toolkit™
Key Protocols
The Globus Toolkit™ centers around four key protocols– Connectivity layer:
> Security: Grid Security Infrastructure (GSI)
– Resource layer:> Resource Management: Grid Resource Allocation Management
(GRAM)
> Information Services: Grid Resource Information Protocol (GRIP)
> Data Transfer: Grid File Transfer Protocol (GridFTP)
Also key collective layer protocols– Info Services, Replica Management, etc.
The Globus Toolkit™:
Security
14Introduction to the Globus Toolkit™
Security Terminology
Authentication: Establishing identity Authorization: Establishing rights Message protection
– Message integrity
– Message confidentiality Digital signature Accounting Certificate Authority (CA)
15Introduction to the Globus Toolkit™
Why Grid Security is Hard Resources being used may be valuable & the problems
being solved sensitive Resources are often located in distinct administrative
domains– Each resource has own policies & procedures
Set of resources used by a single computation may be large, dynamic, and unpredictable– Not just client/server, requires delegation
It must be broadly available & applicable– Standard, well-tested, well-understood protocols;
integrated with wide variety of tools
16Introduction to the Globus Toolkit™
1) Easy to use
2) Single sign-on
3) Run applicationsftp,ssh,MPI,Condor,Web,…
4) User based trust model
5) Proxies/agents (delegation)
User View
1) Specify local access control
2) Auditing, accounting, etc.
3) Integration w/ local systemKerberos, AFS, license mgr.
4) Protection from compromisedresources
Resource Owner View
API/SDK with authentication, flexible message protection,
flexible communication, delegation, ...Direct calls to various security functions (e.g. GSS-API)Or security integrated into higher-level SDKs:
E.g. GlobusIO, Condor-G, MPICH-G2, etc.
Developer View
Grid Security Requirements
17Introduction to the Globus Toolkit™
Site A(Kerberos)
Site B (Unix)
Site C(Kerberos)
Computer
User
Single sign-on via “grid-id”& generation of proxy cred.
Or: retrieval of proxy cred.from online repository
User ProxyProxy
credential
Computer
Storagesystem
Communication*
GSI-enabledFTP server
AuthorizeMap to local idAccess file
Remote fileaccess request*
GSI-enabledGRAM server
GSI-enabledGRAM server
Remote processcreation requests*
* With mutual authentication
Process
Kerberosticket
Restrictedproxy
Process
Restrictedproxy
Local id Local id
AuthorizeMap to local idCreate processGenerate credentials
Ditto
GSI in Action“Create Processes at A and B
that Communicate & Access Files at C”
18Introduction to the Globus Toolkit™
Community Authorization Service
Question: How does a large community grant its users access to a large set of resources?– Should minimize burden on both the users and
resource providers Community Authorization Service (CAS)
– Community negotiates access to resources– Resource outsources authorization to CAS– Resource only knows about “CAS user” credential
> CAS handles user registration, group membership…
– User who wants access to resource asks CAS for a capability credential
> Restricted proxy of the “CAS user” cred., checked by resource
19Introduction to the Globus Toolkit™
CAS1. CAS request, with resource names and operations
Community Authorization(Prototype shown August 2001)
Does the collective policy authorize this
request for this user?
user/group membership
resource/collective membership
collective policy information
Resource
Is this request authorized for
the CAS?
Is this request authorized by
the capability? local policy
information
4. Resource reply
User 3. Resource request, authenticated with
capability
2. CAS reply, with and resource CA info
capability
20Introduction to the Globus Toolkit™
Community Authorization Service
CAS provides user community with information needed to authenticate resources– Sent with capability credential, used on
connection with resource
– Resource identity (DN), CA This allows new resources/users (and their
CAs) to be made available to a community through the CAS without action on the other user’s/resource’s part
21Introduction to the Globus Toolkit™
Passport Online CA & MyProxy
Requiring users to manage their own certs and keys is annoying and error prone
A solution: Leverage Passport global authentication to obtain a proxy credential– Passport provides
> Globally unique user name (email address)
> Method of verifying ownership of the name (authentication)
> Re-issuance (e.g. forgotten password)
– Passport credentials can be presented to an online CA or credential repository
> Creates and issues new (restricted) proxy certificate to the user on demand
22Introduction to the Globus Toolkit™
Security Summary
GSI successfully addresses wide variety of Grid security issues
Broad acceptance, deployment, integration with tools
Standardization on-going in IETF & GGF Ongoing R&D to address next set of issues For more information:
– www.globus.org/research/papers.html> “A Security Architecture for Computational Grids”> “Design and Deployment of a National-Scale Authentication
Infrastructure”
– www.gridforum.org/security
The Globus Toolkit™:
Resource Management
24Introduction to the Globus Toolkit™
The Challenge Enabling secure, controlled remote access to
heterogeneous computational resources and management of remote computation– Authentication and authorization
– Resource discovery & characterization
– Reservation and allocation
– Computation monitoring and control Addressed by new protocols & services
– GRAM protocol as a basic building block
– Resource brokering & co-allocation services
– GSI for security, MDS for discovery
25Introduction to the Globus Toolkit™
Resource Management
The Grid Resource Allocation Management (GRAM) protocol and client API allows programs to be started on remote resources, despite local heterogeneity
Resource Specification Language (RSL) is used to communicate requirements
A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services– Integrated with Condor, PBS, MPICH-G2, …
26Introduction to the Globus Toolkit™
GRAM GRAM GRAM
LSF Condor NQE
Application
RSL
Simple ground RSL
Information Service
Localresourcemanagers
RSLspecialization
Broker
Ground RSL
Co-allocator
Queries& Info
Resource Management Architecture
27Introduction to the Globus Toolkit™
Resource Specification Language
Common notation for exchange of information between components– Syntax similar to MDS/LDAP filters
RSL provides two types of information:– Resource requirements: Machine type,
number of nodes, memory, etc.– Job configuration: Directory, executable,
args, environment Globus Toolkit provides an API/SDK for
manipulating RSL
28Introduction to the Globus Toolkit™
RSL Syntax
Elementary form: parenthesis clauses– (attribute op value [ value … ] )
Operators Supported:– <, <=, =, >=, > , !=
Some supported attributes:– executable, arguments, environment, stdin, stdout,
stderr, resourceManagerContact,resourceManagerName
Unknown attributes are passed through – May be handled by subsequent tools
29Introduction to the Globus Toolkit™
Constraints: “&”
For example:
& (count>=5) (count<=10)
(max_time=240) (memory>=64)
(executable=myprog) “Create 5-10 instances of myprog, each
on a machine with at least 64 MB memory that is available to me for 4 hours”
30Introduction to the Globus Toolkit™
Disjunction: “|”
For example:
& (executable=myprog)
( | (&(count=5)(memory>=64))
(&(count=10)(memory>=32))) Create 5 instances of myprog on a
machine that has at least 64MB of memory, or 10 instances on a machine with at least 32MB of memory
31Introduction to the Globus Toolkit™
Globus Toolkit Implementation
Gatekeeper– Single point of entry– Authenticates user, maps to local security
environment, runs service– In essence, a “secure inetd”
Job manager– A gatekeeper service– Layers on top of local resource management
system (e.g., PBS, LSF, etc.)– Handles remote interaction with the job
32Introduction to the Globus Toolkit™
GRAM Components
Grid SecurityInfrastructure
Job Manager
GRAM client API calls to request resource allocation
and process creation.
MDS client API callsto locate resources
Query current statusof resource
Create
RSL Library
Parse
RequestAllocate &
create processes
Process
Process
Process
Monitor &control
Site boundary
Client MDS: Grid Index Info Server
Gatekeeper
MDS: Grid Resource Info Server
Local Resource Manager
MDS client API callsto get resource info
GRAM client API statechange callbacks
The Globus Toolkit™:
Information Services
34Introduction to the Globus Toolkit™
Grid Information Services
System information is critical to operation of the grid and construction of applications– What resources are available?
> Resource discovery
– What is the “state” of the grid?> Resource selection
– How to optimize resource use> Application configuration and adaptation?
We need a general information infrastructure to answer these questions
35Introduction to the Globus Toolkit™
Examples of Useful Information
Characteristics of a compute resource– IP address, software available, system
administrator, networks connected to, OS version, load
Characteristics of a network– Bandwidth and latency, protocols, logical
topology Characteristics of the Globus infrastructure
– Hosts, resource managers
36Introduction to the Globus Toolkit™
Grid Information Service
Provide access to static and dynamic information regarding system components
A basis for configuration and adaptation in heterogeneous, dynamic environments
Requirements and characteristics– Uniform, flexible access to information
– Scalable, efficient access to dynamic data
– Access to multiple information sources
– Decentralized maintenance
37Introduction to the Globus Toolkit™
The GIS Problem: Many Information Sources, Many Views
?RR
R
RR
?
R
R
RR
R?
R
R
R
RR
?
RR
VO A
VO B
VO C
38Introduction to the Globus Toolkit™
Two Classes Of Metacomputing Directory Service (MDS) Servers
Grid Resource Information Service (GRIS)– Supplies information about a specific resource
– Configurable to support multiple information providers
– LDAP as inquiry protocol Grid Index Information Service (GIIS)
– Supplies collection of information which was gathered from multiple GRIS servers
– Supports efficient queries against information which is spread across multiple GRIS server
– LDAP as inquiry protocol
39Introduction to the Globus Toolkit™
Grid Resource Information Service Server which runs on each resource
– Given the resource DNS name, you can find the GRIS server (well known port = 2135)
Provides resource specific information– Much of this information may be dynamic
> Load, process information, storage information, etc.
> GRIS gathers this information on demand
“White pages” lookup of resource information– Ex: How much memory does machine have?
“Yellow pages” lookup of resource options– Ex: Which queues on machine allows large jobs?
40Introduction to the Globus Toolkit™
Grid Index Information Service GIIS describes a class of servers
– Gathers information from multiple GRIS servers– Each GIIS is optimized for particular queries
> Ex1: Which Alliance machines are >16 process SGIs?> Ex2: Which Alliance storage servers have >100Mbps bandwidth to
host X?
– Akin to web search engines Organization GIIS
– The Globus Toolkit ships with one GIIS– Caches GRIS info with long update frequency
> Useful for queries across an organization that rely on relatively static information (Ex1 above)
Can be merged into GRIS
41Introduction to the Globus Toolkit™
Logical MDS Deployment
ISI
GRISes
GIIS
Grads Gusto
42Introduction to the Globus Toolkit™
Example: Discovering CPU Load
Retrieve CPU load fields of compute resources% grid-info-search -L “(objectclass=GlobusComputeResource)” \
dn cpuload1 cpuload5 cpuload15 dn: hn=lemon.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory, o=Globus, c=UScpuload1: 0.48cpuload5: 0.20cpuload15: 0.03 dn: hn=tuva.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory, o=Globus, c=UScpuload1: 3.11cpuload5: 2.64cpuload15: 2.57
The Globus Toolkit™:
Data Management
44Introduction to the Globus Toolkit™
Data Grid Problem
“ Enable a geographically distributed community [of thousands] to pool their resources in order to perform sophisticated, computationally intensive analyses on Petabytes of data”
Note that this problem:– Is common to many areas of science– Overlaps strongly with other Grid problems
45Introduction to the Globus Toolkit™
Examples ofDesired Data Grid Functionality
High-speed, reliable access to remote data Automated discovery of “best” copy of data Manage replication to improve performance Co-schedule compute, storage, network “Transparency” wrt delivered performance Enforce access control on data Allow representation of “global” resource
allocation policies
46Introduction to the Globus Toolkit™
A Model Architecture for Data Grids
Metadata Catalog
Replica Catalog
Tape Library
Disk Cache
Attribute Specification
Logical Collection and Logical File Name
Disk Array Disk Cache
Application
Replica Selection
Multiple LocationsSelectedReplica
GridFTP Control Channel
Replica Location 1 Replica Location 2 Replica Location 3
MDS
GridFTPDataChannel
47Introduction to the Globus Toolkit™
Globus Toolkit ComponentsTwo major Data Grid components:
1. Data Transport and Access Common protocol
Secure, efficient, flexible, extensible data movement
Family of tools supporting this protocol
2. Replica Management Architecture Simple scheme for managing:
multiple copies of files collections of files
48Introduction to the Globus Toolkit™
Data Transport and Access -- GridFTP
Why FTP?– Ubiquity enables interoperation with many commodity
tools– Already supports many desired features, easily
extended to support others– Well understood and supported
We use the term GridFTP to refer to– Transfer protocol which meets requirements– Family of tools which implement the protocol
Note GridFTP > FTP Note that despite name, GridFTP is not restricted to
file transfer!
49Introduction to the Globus Toolkit™
GridFTP: Basic Approach
FTP protocol is defined by several IETF RFCs Start with most commonly used subset
– Standard FTP: get/put etc., 3rd-party transfer Implement standard but often unused features
– GSS binding, extended directory listing, simple restart
Extend in various ways, while preserving interoperability with existing servers– Striped/parallel data channels, partial file, automatic
& manual TCP buffer setting, progress monitoring, extended restart
50Introduction to the Globus Toolkit™
(Prototype)Striped GridFTP Server
Parallel File System (e.g. PVFS, PFS, etc.)
MPI-IO
…
Plug-in
Control
GridFTP Server Parallel BackendGridFTPservermaster
mpirun
GridFTPclient
Plug-in
Control
Plug-in
Control
Plug-in
Control…MPI (Comm_World)
MPI (Sub-Comm)
To Client or Another Striped GridFTP Server
Controlsocket
GridFTP Control Channel GridFTP Data Channels
51Introduction to the Globus Toolkit™
Future Directions
Continued enhancement & standardization of protocol– Globus Toolkit libraries provide reference
implementation Continue building on libraries
– Striped server w/ server side processing
– Reliable replica/copy management service
– Proxies for firewalls & load balancing Work with more application communities
The Globus Toolkit™:
Futures & Conclusions
53Introduction to the Globus Toolkit™
The Future:All Software is Network-Centric
We don’t build or buy “computers” anymore, we borrow or lease required resources– When I walk into a room, need to solve a
problem, need to communicate A “computer” is a dynamically, often
collaboratively constructed collection of processors, data sources, sensors, networks
54Introduction to the Globus Toolkit™
And Thus …
Reduced barriers to access mean that we do much more computing, and more interesting computing, than today => Many more components (& services); massive parallelism
All resources are owned by others => Sharing (for fun or profit) is fundamental; trust, policy, negotiation, payment
All computing is performed on unfamiliar systems => Dynamic behaviors, discovery, adaptivity, failure
55Introduction to the Globus Toolkit™
Summary
The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
Grid architecture emphasizes systems problem– Protocols & services, to facilitate interoperability
and shared infrastructure services Globus Toolkit™: APIs, SDKs, and tools which
implement Grid protocols & services– Provides basic software infrastructure for suite of
tools addressing the programming problem