high level grid services
December 8 & 9, 2005, Austin, TX
SURA Cyberinfrastructure Workshop Series: Grid Technology: The Rough Guide
High Level Grid Services
Warren Smith
Texas Advanced Computing Center
University of Texas
Outline
• Grid Monitoring
  – Ganglia
  – MonALISA
  – Nagios
  – Others
• Workflow
  – Condor DAGMan (and Condor-G)
  – Pegasus
• Data
  – Storage Resource Broker
  – Replica Location Service
  – Distributed file systems
Other High Level Services (Not Covered)
• Resource Brokering
• Metascheduling
  – GRMS, MARS
• Credential issuance
  – PURSE, GAMA
• Authorization
  – Shibboleth
  – VOMS
  – CAS
Grid Monitoring
• Ganglia
• MonALISA
• Nagios
• Others
Ganglia
http://ganglia.sourceforge.net
• Monitors clusters and aggregations of clusters
• Collects system status information
  – Provided in XML documents
  – Provides it graphically via a web interface
• Can be subscribed to and aggregated across multiple clusters
• Focus on simplicity and performance
  – Can monitor 1000s of systems
• MDS and MonALISA can consume information provided by Ganglia
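Because Ganglia publishes status as XML, a consumer only needs an XML parser. The sketch below is a minimal illustration, not Ganglia's own tooling: the sample document, the host names, and the element and attribute names (CLUSTER, HOST, METRIC, NAME, VAL) are assumptions modeled on gmond output and should be checked against a real daemon.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element and attribute names are assumptions
# modeled on the XML served by gmond.
SAMPLE_XML = """\
<GANGLIA_XML VERSION="3.0" SOURCE="gmond">
  <CLUSTER NAME="example-cluster">
    <HOST NAME="node1.example.org">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="4" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
    <HOST NAME="node2.example.org">
      <METRIC NAME="load_one" VAL="1.75" TYPE="float" UNITS=""/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""


def metrics_by_host(xml_text):
    """Return {host name: {metric name: value string}} from a status document."""
    root = ET.fromstring(xml_text)
    return {
        host.get("NAME"): {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        for host in root.iter("HOST")
    }


for host, metrics in metrics_by_host(SAMPLE_XML).items():
    print(host, metrics)
```

Values are kept as strings here; a real consumer would convert them using each metric's TYPE attribute.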
gmond
• Ganglia Monitoring Daemon
• Runs on each resource being monitored
• Collects a standard set of information
• Configuration file specifies
  – When to collect information
  – When to send
    • Based on time and/or change
  – Who to send to
  – Who to allow to request
• Supports UDP unicast, UDP multicast, TCP
Information collected by gmond
Platform columns on the slide: Linux, FreeBSD, Solaris, AIX, MacOS X, IRIX, HPUX, Tru64
Name | Description | Platforms marked (of 8)
boottime | System boot timestamp | 8
bread_sec | Buffer reads per second | 1
bwrite_sec | Buffer writes per second | 1
bytes_in | Number of bytes in per second | 3
bytes_out | Number of bytes out per second | 3
cpu_aidle | Percent of time since boot idle CPU | 2
cpu_idle | Percent CPU idle | 8
cpu_intr | Time spent processing interrupts | 1
cpu_nice | Percent CPU nice | 4
cpu_num | Number of CPUs | 8
cpu_speed | Speed in MHz of CPU | 8
cpu_ssys | Time in kernel mode | 1
cpu_system | Percent CPU system | 7
cpu_user | Percent CPU user | 7
cpu_wait | Time spent waiting | 1
cpu_wio | Time spent in i/o wait | 1
disk_free | Total free disk space | 2
disk_total | Total available disk space | 2
load_fifteen | Fifteen minute load average | 8
load_five | Five minute load average | 8
load_one | One minute load average | 8
location | GPS coordinates for host | 8
lread_sec | Linear reads per second | 1
lwrite_sec | Linear writes per second | 1
machine_type | Machine hardware (uname -m) | 8
mem_arm | Available real memory | 1
mem_avm | Available virtual memory | 1
mem_buffers | Amount of buffered memory | 4
mem_cached | Amount of cached memory | 4
mem_free | Amount of available memory | 8
mem_rm | Total real memory | 1
mem_shared | Amount of shared memory | 4
mem_total | Total memory | 8
mem_vm | Total virtual memory | 1
mtu | Network maximum transmission unit | 8
os_name | Operating system name | 8
os_release | Operating system release (version) | 8
part_max_used | Maximum percent used for all partitions | 2
phread_sec | Physical reads per second | 1
phwrite_sec | Physical writes per second | 1
pkts_in | Packets in per second | 3
pkts_out | Packets out per second | 3
proc_run | Total number of running processes | 5
proc_total | Total number of processes | 6
rcache | Read cache hit ratio | 1
swap_free | Amount of available swap memory | 7
swap_total | Total amount of swap memory | 7
sys_clock | Current time on host | 8
wcache | Write cache hit ratio | 1
gmetric
• Program to provide custom information to Ganglia
  – e.g. CPU temperature, batch queue length
• Uses the gmond configuration file to determine who to send to
• Executed as a cron job
  – Execute command(s) to gather the data
  – Execute gmetric to send data
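The two cron-job steps (gather the data, then call gmetric) could look roughly like the sketch below. The metric name batch_queue_length and the read_queue_length helper are hypothetical; the gmetric options used (--name, --value, --type, --units) are the long forms of its standard flags, but check them against the installed version.

```python
import subprocess


def read_queue_length():
    """Hypothetical stand-in for the command(s) that gather the data."""
    return 17


def gmetric_command(name, value, value_type, units):
    """Build the argument list for one gmetric invocation."""
    return ["gmetric", "--name", name, "--value", str(value),
            "--type", value_type, "--units", units]


def send(command):
    """Run gmetric; only works on a host where gmetric and gmond are set up."""
    subprocess.run(command, check=True)


command = gmetric_command("batch_queue_length", read_queue_length(), "uint32", "jobs")
print(" ".join(command))
# send(command)  # enable on a monitored host, then schedule the script with cron
```

Keeping command construction separate from execution makes the script easy to dry-run before it is placed in a crontab.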
gmetad
• Aggregates information from gmonds
• Configuration file specifies which gmonds to get data from
  – Connects to gmonds using TCP
• Stores information in Round Robin Database (RRD)
  – Small database where data for each attribute is stored in time order
  – Maximum size
  – Oldest data is forgotten
• PHP scripts to display RRD data as web pages
  – Graphs over time
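The round-robin idea (fixed maximum size, time order, oldest data forgotten) can be illustrated with a bounded queue. This is a toy model under simplifying assumptions, not the RRD file format: a real RRD also consolidates samples into coarser averages, which is omitted here. The class name RoundRobinSeries is hypothetical.

```python
from collections import deque


class RoundRobinSeries:
    """Fixed-size, time-ordered store for one attribute."""

    def __init__(self, max_samples):
        # A deque with maxlen silently drops the oldest item when full.
        self._samples = deque(maxlen=max_samples)

    def add(self, timestamp, value):
        self._samples.append((timestamp, value))

    def samples(self):
        return list(self._samples)


series = RoundRobinSeries(max_samples=3)
for timestamp, load in enumerate([0.1, 0.4, 0.9, 1.3, 0.7]):
    series.add(timestamp, load)

print(series.samples())  # only the three newest samples remain
```

Storage stays constant no matter how long monitoring runs, which is why this layout suits long-lived graphs.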
Who’s Using Ganglia?
• Planet Lab
• Lots of clusters
  – SDSC
  – NASA Goddard
  – Naval Research Lab
  – …
MonALISA
http://monalisa.cacr.caltech.edu
• Distributed monitoring system
• Agent-based design
• Written in Java
• Uses JINI & SOAP/WSDL
  – Locating services & communicating
• Gathers information using other systems
  – SNMP, Ganglia, MRTG, Hawkeye, custom
• Clients
  – Locate and subscribe to services that provide monitoring information
  – GUI client, web client, administrative client
Monitoring I2 Network Traffic, Grid03 Farms and Jobs
MonALISA Services
• Autonomous, self-describing services
  – Built on a generic Dynamic Distributed Services Architecture
• Each monitoring service stores data in a relational database
• Automatic update of monitoring services
• Lookup discovery service
Who’s using MonALISA?
• Open Science Grid
  – Included in the Virtual Data Toolkit
• Internet2
• ABILENE
• Compact Muon Solenoid
• Many others
Nagios Overview
• A monitoring framework
  – Configurable
  – Extensible
• Provides a relatively comprehensive set of functionality
• Supports distributed monitoring
• Supports taking actions in addition to monitoring
• Large community using and extending
• Doesn’t store historical data in a true database
• Quality of add-ons varies
Architecture
[Figure: two remote systems, each with Nagios, Nagios plugins, Nagios configuration files, and send_nsca, and a central collector with Nagios, NSCA, httpd, Nagios CGIs, Nagios plugins, Nagios configuration files, and Nagios log files.]
Nagios Features I
• Web interface
  – Current status, graphs
• Monitoring
  – Monitoring of a number of properties included
  – People provide plugins to monitor other properties, we can do the same
  – Periodic monitoring w/ user-defined periods
• Thresholds to indicate problems
• Actions when problems occur
  – Notification
    • Email, page, extensible
  – Actions to attempt to fix problem (e.g. restart a daemon)
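A plugin with thresholds can be quite small. The sketch below assumes the usual Nagios plugin convention of one status line on standard output plus an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The evaluate function and the threshold values are hypothetical.

```python
# Exit codes by the common Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(load, warn_threshold, crit_threshold):
    """Map a measured load to (exit code, status line)."""
    if load >= crit_threshold:
        return CRITICAL, f"CRITICAL - load {load:.2f}"
    if load >= warn_threshold:
        return WARNING, f"WARNING - load {load:.2f}"
    return OK, f"OK - load {load:.2f}"


# Hypothetical measurement and thresholds, for illustration only.
status, message = evaluate(5.2, warn_threshold=4.0, crit_threshold=8.0)
print(message)
# A real plugin would measure the value itself and finish with sys.exit(status).
```

Nagios would run such a script at the user-defined period and trigger notifications or fix-up actions based on the returned state.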
Nagios Features II
• Escalations
  – If a problem occurs n times do x
    • Attempt to fix automatically
  – If a problem occurs more than n times do y
    • Open a ticket in the trouble ticket system
  – …
• Distributed monitoring
  – A Nagios daemon can test things all over
  – Can also have Nagios daemons on multiple systems
    • Certain daemons can act as central collection points
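The escalation rule on this slide amounts to choosing an action from the number of consecutive failures. A minimal sketch, assuming a single threshold n; the function name and the action strings are hypothetical, and this is not Nagios configuration syntax.

```python
def escalation_action(consecutive_failures, n):
    """Pick an action: up to n failures try a fix, beyond n open a ticket."""
    if consecutive_failures <= 0:
        return "none"
    if consecutive_failures <= n:
        return "attempt automatic fix"
    return "open trouble ticket"


for failures in range(5):
    print(failures, escalation_action(failures, n=2))
```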
Who’s Using Nagios?
• It’s included in a number of Unix distros
  – Debian
  – SUSE
  – Gentoo
  – OpenBSD
• Nagios users can register with the site
  – 986 sites have registered
  – ~200,000 hosts monitored
  – ~720,000 services monitored
TeraGrid’s Inca
• Hierarchical Status Monitoring
  – Groups tests into logical sets
  – Supports many levels of detail and summarization
• Flexible, scalable architecture
  – Very simple reporter API
  – Can use existing test scripts (unit tests, status tools)
  – Hierarchical controllers
  – Several query/display tools
And Many Others…
• SNMP
  – OpenNMS
  – HP OpenView
• Big Brother / Big Sister
• Globus MDS
• ACDC (U Buffalo)
• GridCat
• GPIR (TACC)
• …
Workflow
• Condor DAGMan
  – Starting with Condor-G
• Pegasus
Workflow Definition
• Set of tasks with dependencies
• Tasks can be anything, but in grids:
  – Execute programs
  – Move data
• Dependencies can be
  – Control - “do T2 after T1 finishes”
  – Data - “T2 input 1 comes from T1 output 1”
• Can be acyclic or have cycles/iterations
• Can have conditional execution
• A large variety of types of workflows
Condor-G: Condor + Globus
http://www.cs.wisc.edu/condor
• Submit your jobs to Condor
  – Jobs say they want to run via Globus
• Condor manages your jobs
  – Queuing, fault tolerance
• Submits jobs to resources via Globus
Globus Universe
• Condor has a number of universes
  – Standard - to take advantage of features like checkpointing and redirecting file I/O
  – Vanilla - to run jobs without the frills
  – Java - to run java codes
• Globus universe to run jobs via Globus
  – Universe = Globus
  – Which Globus Gatekeeper to use
  – Optional: Location of file containing your Globus certificate

universe = globus
globusscheduler = beak.cs.wisc.edu/jobmanager
executable = progname
queue
How Condor-G Works
[Figure, built up over five slides: a Personal Condor side with the Schedd, to which 600 Globus jobs are submitted, and later the GridManager; a Globus Resource side with the JobManager, LSF, and finally the User Job.]
• Personal Condor queues, submits, and manages jobs
• Available commands:
  – condor_submit, condor_rm, condor_q, condor_hold, …
• LSF manages cluster resources
Globus Universe Fault Tolerance
• Submit side failure:
  – All relevant state for each submitted job is stored persistently in the Condor job queue.
  – This persistent information allows the Condor GridManager upon restart to read the state information and reconnect to JobManagers that were running at the time of the crash.
• Execute side:
  – Condor worked with Globus to improve fault tolerance
• X.509 proxy expiration
  – Condor can put jobs on hold and email user to refresh proxy
Condor DAGMan
• Directed Acyclic Graph Manager
• DAGMan allows you to specify the dependencies between your Condor jobs, so it can manage them automatically for you.
• (e.g., “Don’t run job “B” until job “A” has completed successfully.”)
What is a DAG?
• A DAG is the data structure used by DAGMan to represent these dependencies.
• Each job is a “node” in the DAG.
• Each node can have any number of “parent” or “children” nodes – as long as there are no loops!
[Figure: example DAG with nodes Job A, Job B, Job C, and Job D.]
Defining a DAG
• A DAG is defined by a .dag file, listing each of its nodes and their dependencies:

# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D

• Each node will run the Condor job specified by its accompanying Condor submit file
• Each node can have a pre and post step
[Figure: diamond-shaped DAG with Job A at the top, Job B and Job C in the middle, and Job D at the bottom.]
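To make the dependency semantics of diamond.dag concrete, the sketch below parses the Job and Parent/Child lines and computes one valid run order. It illustrates the data structure only and is not how DAGMan is implemented; it handles just these two keywords, the parse_dag helper is hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

DIAMOND_DAG = """\
# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D
"""


def parse_dag(text):
    """Return ({node: submit file}, {node: set of parent nodes})."""
    submit_files, parents = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        words = line.split()
        keyword = words[0].lower()
        if keyword == "job":
            submit_files[words[1]] = words[2]
            parents.setdefault(words[1], set())
        elif keyword == "parent":
            split_at = [w.lower() for w in words].index("child")
            for child in words[split_at + 1:]:
                parents.setdefault(child, set()).update(words[1:split_at])
    return submit_files, parents


submit_files, parents = parse_dag(DIAMOND_DAG)
# static_order raises graphlib.CycleError if the graph has a loop.
order = list(TopologicalSorter(parents).static_order())
print(order)
```

B and C have no dependency on each other, so either may come first in the computed order, and a scheduler is free to run them in parallel.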
Submitting a DAG
• To start your DAG, just run condor_submit_dag with your .dag file, and Condor will start a personal DAGMan daemon to begin running your jobs:
% condor_submit_dag diamond.dag
• condor_submit_dag submits a Scheduler Universe Job with DAGMan as the executable.
• Thus the DAGMan daemon itself runs as a Condor job, so you don’t have to baby-sit it.
Running a DAG
• DAGMan manages the submission of your jobs to Condor based on the DAG dependencies.
  – Can configure throttling of job submission
• In case of a failure, DAGMan creates a “rescue” file with the current state of the DAG.
  – Failures can be retried a configurable number of times
  – The rescue file can be used to restore the prior state of the DAG when restarting
• Once the DAG is complete, the DAGMan job itself is finished, and exits
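The retry and rescue behavior can be mimicked in a few lines. This is a simplified simulation under stated assumptions, not DAGMan's logic or its rescue file format: jobs run one at a time in a precomputed dependency order, the run stops at the first job that exhausts its retries, and the "rescue" state is just the set of completed jobs. The run_dag and flaky names are hypothetical.

```python
def run_dag(order, run_job, max_retries=2, done=None):
    """Run jobs in dependency order.

    Returns (completed jobs, failed job or None). Passing a previous
    'completed' set as 'done' resumes without rerunning finished jobs.
    """
    completed = set(done or ())
    for job in order:
        if job in completed:
            continue
        for _attempt in range(1 + max_retries):
            if run_job(job):
                completed.add(job)
                break
        else:
            return completed, job  # retries exhausted: hand back rescue state
    return completed, None


attempts = {}


def flaky(job):
    """Hypothetical job runner: job C succeeds only on its fifth attempt."""
    attempts[job] = attempts.get(job, 0) + 1
    return not (job == "C" and attempts[job] < 5)


rescue, failed = run_dag(["A", "B", "C", "D"], flaky, max_retries=2)
print("first run:", sorted(rescue), "failed:", failed)
final, failed_again = run_dag(["A", "B", "C", "D"], flaky, max_retries=2, done=rescue)
print("resumed run:", sorted(final), "failed:", failed_again)
```

The resumed run skips A and B, which is the point of saving state at the moment of failure.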
Who’s Using Condor-G & DAGMan?
• Pegasus
• LIGO, Atlas, CMS, …
• gLite
• TACC
• DAGMan available on every Condor pool
Pegasus
http://pegasus.isi.edu
• Pegasus - Planning for Execution on Grids
  – Intelligently decide how to run a workflow on a grid
• Take as input an abstract workflow
  – Abstract DAG in XML (DAX)
• Generates concrete workflow
  – Select computer systems (MDS)
  – Select file replicas (RLS)
• Executes the workflow (Condor DAGMan)
Scientific Analysis
[Figure: workflow evolution. Labels: Construct the Analysis, Select the Input Data, Map the Workflow onto Available Resources, Execute the Workflow; Workflow Template, Abstract Workflow, Concrete Workflow, Tasks to be executed; Science Gateway, Pegasus, Condor, Grid Resources.]
Pegasus Workflows
• Abstract workflow
  – Edges are data dependencies
    • Implicit data movement
  – Processing on the data
• Concrete workflow
  – Edges are control flow
    • Explicit data movement as tasks
• Acyclic
• Supports parallelism
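The difference between the two kinds of workflow can be sketched as follows: in the abstract form, edges are derived from which task produces a file that another consumes; in the concrete form, data movement becomes an explicit task. This is an illustration of the idea only, not the Pegasus planner; the task names, file names, and the stage_in_ prefix are hypothetical.

```python
def data_edges(tasks):
    """Abstract workflow: an edge (producer, consumer) per shared file."""
    producer = {f: name for name, t in tasks.items() for f in t["outputs"]}
    return {
        (producer[f], name)
        for name, t in tasks.items()
        for f in t["inputs"]
        if f in producer
    }


def concretize(tasks):
    """Concrete workflow: add explicit stage-in tasks for external inputs."""
    produced = {f for t in tasks.values() for f in t["outputs"]}
    edges = set(data_edges(tasks))
    for name, t in tasks.items():
        for f in t["inputs"]:
            if f not in produced:
                edges.add((f"stage_in_{f}", name))
    return edges


tasks = {
    "extract": {"inputs": ["raw.dat"], "outputs": ["clean.dat"]},
    "analyze": {"inputs": ["clean.dat"], "outputs": ["result.dat"]},
}
print(sorted(data_edges(tasks)))
print(sorted(concretize(tasks)))
```

A real planner would also pick a site for each task and choose among file replicas, which this sketch leaves out.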
Who’s Using Pegasus?
• LIGO
• Atlas High energy physics application
• Southern California Earthquake Center (SCEC)
• Astronomy: Montage and Galaxy Morphology applications
• Bioinformatics
• Tomography
Data
• Storage Resource Broker
• Replica Location Service
Storage Resource Broker (SRB)
http://www.sdsc.edu/srb
• Manages collections of data
  – In many cases, the data are files
• Provides a logical namespace
• Maps logical names to physical instances
• Associates metadata with logical names
  – Metadata Catalog (MCat)
• Interfaces to variety of storage
  – Local disk
  – Parallel file systems
  – Archives
  – Databases
SRB Client Implementations
• A set of Basic APIs
  – Over 160 APIs
  – Used by all clients to make requests to servers
• Scommands
  – Unix-like command line utilities for UNIX and Windows platforms
  – Over 60 - Sls, Scp, Sput, Sget …
SRB Client Implementations
• inQ – Windows GUI browser
• Jargon – Java SRB client classes
  – Pure Java implementation
• mySRB – Web based GUI
  – run using web browser
• Java Admin Tool
  – GUI for User and Resource management
• Matrix – Web service for SRB work flow
Example Read
[Figure: an SRB read in seven numbered steps. Labels: Read Application, Logical Name, two SRB servers each with an SRB agent, MCAT, Peer-to-peer Brokering, Data Access, replicas R1 and R2.]
MCAT functions listed on the slide:
1. Logical-to-Physical mapping
2. Identification of Replicas
3. Access & Audit Control
Authentication
• Grid Security Infrastructure
  – PKI certificates
• Challenge-response mechanism
  – No passwords sent over network
• Ticket
  – Valid for specified time period or number of accesses
• Generic Security Service API
  – Authentication of server to remote storage
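A challenge-response exchange keeps the password off the network: the server sends a random challenge, and the client returns a keyed hash of it. The sketch below shows the general technique with HMAC-SHA256 and is an assumption for illustration; it is not the protocol SRB actually uses, and the secret shown is a placeholder.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: a fresh random challenge for each login attempt."""
    return secrets.token_bytes(16)


def respond(shared_secret, challenge):
    """Client side: prove knowledge of the secret without sending it."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).digest()


def verify(shared_secret, challenge, response):
    """Server side: recompute the expected answer and compare in constant time."""
    return hmac.compare_digest(respond(shared_secret, challenge), response)


secret = b"placeholder-shared-secret"  # hypothetical; known to both sides
challenge = make_challenge()
print(verify(secret, challenge, respond(secret, challenge)))
print(verify(secret, challenge, respond(b"wrong-guess", challenge)))
```

Because each challenge is random, a captured response cannot be replayed against a later challenge.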
Authorization
• Collection-owned data
  – At each remote storage system, an account ID is created under which the data grid stores files
• User authenticates to SRB
• SRB checks access controls
• SRB server authenticates to a remote SRB server
• Remote SRB server authenticates to the remote storage repository
Metadata in SRB
• SRB System Metadata
• Free-form Metadata (User-defined)
  – Attribute-Value-Unit Triplets…
• Extensible Schema Metadata
  – User Defined
  – Tables integrated into MCAT Core Schema
• External Database
• Metadata operations
  – Metadata Insertion through User Interfaces
  – Bulk Metadata Insertion
  – Template based Metadata Extraction
  – Query Metadata through well defined Interfaces
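Free-form metadata as attribute-value-unit triplets attached to logical names can be modeled with a small in-memory structure. This is a sketch of the data model only; it does not reflect the MCAT schema or any SRB API, and the logical names and attributes are hypothetical.

```python
from typing import NamedTuple


class AVU(NamedTuple):
    """One free-form metadata item: attribute, value, unit."""
    attribute: str
    value: str
    unit: str


def add_metadata(catalog, logical_name, attribute, value, unit=""):
    """Attach a triplet to a logical name."""
    catalog.setdefault(logical_name, []).append(AVU(attribute, value, unit))


def find(catalog, attribute, value):
    """Return the logical names carrying the given attribute/value pair."""
    return sorted(
        name
        for name, avus in catalog.items()
        if any(a.attribute == attribute and a.value == value for a in avus)
    )


catalog = {}
add_metadata(catalog, "/home/demo/scan1.img", "resolution", "0.5", "mm")
add_metadata(catalog, "/home/demo/scan2.img", "resolution", "1.0", "mm")
add_metadata(catalog, "/home/demo/scan2.img", "subject", "phantom")
print(find(catalog, "resolution", "0.5"))
```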
Who’s Using SRB?
• Very large number of users
• A sample:
  – National Virtual Observatory
  – Large Hadron Collider
  – NASA
  – NCAR
  – BIRN
Replica Location Service (RLS)
http://www.globus.org/toolkit/data/rls/
• Maintains a mapping from logical file names to physical file names
  – 1 logical file to 1+ physical files
• Improves performance and fault tolerance when accessing data
• Supports user-defined attributes of logical files
• Component of Globus toolkit
  – WS-RF service
• RLS was designed and implemented in a collaboration between the Globus project and the EU DataGrid project
Replica Location Service In Context
[Figure: data management architecture with the components Replica Location Service, Reliable Data Transfer Service, GridFTP, Reliable Replication Service, Replica Consistency Management Services, and Metadata Service.]
• RLS is one component in a data management architecture
• Provides a simple, distributed registry of mappings
• Consistency management provided by higher-level services
RLS Features
[Figure: five Local Replica Catalogs (LRC) and two Replica Location Indexes (RLI).]
• Local Replica Catalogs (LRCs) contain consistent information about logical-to-target mappings
• Replica Location Index (RLI) nodes aggregate information about one or more LRCs
• LRCs use soft state update mechanisms to inform RLIs about their state: relaxed consistency of index
• Optional compression of state updates reduces communication, CPU and storage overheads
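The LRC/RLI split and the soft-state updates can be illustrated with two small classes. This is a toy model under stated assumptions, not the Globus RLS API: an index entry simply expires after a fixed time-to-live unless the catalog refreshes it, time is passed in explicitly to keep the example deterministic, and the site name and URLs are hypothetical.

```python
class LocalReplicaCatalog:
    """Authoritative logical-to-physical mappings for one site."""

    def __init__(self, name):
        self.name = name
        self.mappings = {}  # logical name -> set of physical names

    def add(self, logical, physical):
        self.mappings.setdefault(logical, set()).add(physical)

    def lookup(self, logical):
        return set(self.mappings.get(logical, ()))


class ReplicaLocationIndex:
    """Records which catalogs know a logical name; entries expire."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._entries = {}  # (logical name, catalog name) -> expiry time

    def soft_state_update(self, catalog, now):
        """Refresh every entry reported by the catalog."""
        for logical in catalog.mappings:
            self._entries[(logical, catalog.name)] = now + self.ttl

    def lookup(self, logical, now):
        """Catalog names whose entry for this logical name is still fresh."""
        return sorted(
            name
            for (lfn, name), expiry in self._entries.items()
            if lfn == logical and expiry > now
        )


site_a = LocalReplicaCatalog("site-a")
site_a.add("lfn://run42", "gsiftp://a.example.org/data/run42")
index = ReplicaLocationIndex(ttl=60)
index.soft_state_update(site_a, now=0)
print(index.lookup("lfn://run42", now=30))  # entry still fresh
print(index.lookup("lfn://run42", now=61))  # expired without a refresh
```

A client asks an index which catalogs to contact and then queries those catalogs for the physical names; since the index may be stale, the catalogs remain the authority.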
Who’s Using RLS?
• Used with Pegasus and Chimera:
  – LIGO
  – Atlas High energy physics application
  – Southern California Earthquake Center (SCEC)
  – Astronomy: Montage and Galaxy Morphology applications
  – Bioinformatics
  – Tomography
• Other RLS Users
  – QCD Grid, US CMS experiment (integrated with POOL)
Distributed File Systems
• What everyone would like
• Hard to implement
• Features that are needed
  – Performance
  – Fault tolerance
  – Security
  – Fine-grained authorization
  – Access via Unix file system libraries and programs
  – User-defined metadata
    • Some would like this
Example Distributed File Systems
• AFS & DFS
  – Kerberos for security
  – Performance and fault tolerance problems
• NFS
  – Performance, security, and fault tolerance problems
• NFSv4
  – Tries to improve performance and security
• GridNFS
  – Univ of Michigan
  – Extend NFSv4
  – Add grid security and improve performance
• IBM GPFS
  – Originally designed as a cluster parallel file system
  – Being used in distributed environments
  – Relatively large hardware requirements
Summary
• Grid Monitoring
  – Ganglia
  – MonALISA
  – Nagios
  – Others
• Workflow
  – Condor DAGMan (and Condor-G)
  – Pegasus
• Data
  – Storage Resource Broker
  – Replica Location Service
  – Distributed file systems