The Information service Alessandro Costa INAF Catania Corso di Calcolo Parallelo Grid Computing Catania - ITALY 25-29 September 2006


The Information service

Alessandro Costa INAF Catania

Corso di Calcolo Parallelo Grid Computing, Catania - ITALY

25-29 September 2006

Introduction

• The Information Service (IS) provides information about the Grid resources and their status. This information is essential for the operation of the whole Grid, because resources are discovered through the IS. The published information is also used for monitoring and accounting purposes.

• Two IS systems are used in gLite 3.0:

– the Globus Monitoring and Discovery Service (MDS), used for resource discovery and to publish the resource status;

– the Relational Grid Monitoring Architecture (R-GMA), used for accounting, monitoring and publication of user-level information.

MDS

• The Globus MDS implements the GLUE Schema using OpenLDAP, an open source implementation of the Lightweight Directory Access Protocol (LDAP). An LDAP directory is a specialised database optimised for reading, browsing and searching information.

• No authentication is required to read information through the LDAP protocol, e.g. with the ldapsearch command or with a graphical interface (LDAP Browser/Editor):

http://www-unix.mcs.anl.gov/~gawcr/ldap
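As a minimal sketch of such an anonymous read, the Python snippet below only assembles an ldapsearch command line and prints it; it does not contact any server. The host and port are the BDII shown later in these slides. The search base `mds-vo-name=local,o=grid` and the helper name `build_ldapsearch` are assumptions that do not appear in the slides.

```python
import shlex


def build_ldapsearch(host, port=2170, base="mds-vo-name=local,o=grid",
                     ldap_filter="(objectClass=GlueCEInfo)", attributes=()):
    """Return the argument list for an anonymous ldapsearch query (hypothetical helper)."""
    return [
        "ldapsearch",
        "-x",                            # simple bind without credentials
        "-H", f"ldap://{host}:{port}",   # LDAP URI of the server to query
        "-b", base,                      # search base (assumed value, see above)
        ldap_filter,                     # which entries to return
        *attributes,                     # which attributes to print
    ]


args = build_ldapsearch(
    "grid004.ct.infn.it",
    attributes=["GlueCEInfoHostName", "GlueCEInfoTotalCPUs"],
)
# Print a shell-quoted command line that could be pasted into a terminal.
print(shlex.join(args))
```

Building the argument list separately from running it keeps the sketch safe to execute anywhere; passing `args` to `subprocess.run` would perform the actual query on a machine with OpenLDAP client tools installed.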

MDS System Overview

• The Information System (IS) provides information about grid resources and their current status.

• Computing and storage resources at a site report their static and dynamic status information via GRISes (Grid Resource Information Servers), one per grid element.

• At each site, an element called the GIIS (Grid Index Information Server) collects information from all the site GRISes, one GIIS per grid site. (From LCG 2.3.0 the site GIIS has been replaced by a “local” BDII.)

• The BDII (Berkeley Database Information Index) queries the GIISes of the different sites and acts as a cache, storing information about the Grid status in its database.
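The three-level hierarchy can be illustrated with a toy Python model. This is only a sketch of the data flow (GRIS reports, GIIS collects, BDII caches), not of the real servers; all class names, host names and values are made up for illustration.

```python
class GRIS:
    """Publishes the information of a single grid element (a CE or an SE)."""

    def __init__(self, element, info):
        self.element = element
        self.info = info

    def report(self):
        # Return a copy so that callers hold a snapshot, not live data.
        return {self.element: dict(self.info)}


class SiteGIIS:
    """Collects the reports of all GRISes at one site."""

    def __init__(self, site, grises):
        self.site = site
        self.grises = grises

    def collect(self):
        merged = {}
        for gris in self.grises:
            merged.update(gris.report())
        return merged


class BDII:
    """Queries the site-level servers and answers lookups from its cache."""

    def __init__(self, giises):
        self.giises = giises
        self.cache = {}

    def refresh(self):
        self.cache = {giis.site: giis.collect() for giis in self.giises}

    def lookup(self, site, element):
        return self.cache[site][element]


ce = GRIS("ce.example.org", {"FreeCPUs": 8})
se = GRIS("se.example.org", {"FreeSpaceGB": 500})
bdii = BDII([SiteGIIS("EXAMPLE-SITE", [ce, se])])

bdii.refresh()
ce.info["FreeCPUs"] = 2   # the resource status changes after the refresh
stale = bdii.lookup("EXAMPLE-SITE", "ce.example.org")["FreeCPUs"]
bdii.refresh()
fresh = bdii.lookup("EXAMPLE-SITE", "ce.example.org")["FreeCPUs"]
print(stale, fresh)
```

The gap between `stale` and `fresh` shows the consequence of the cache: a client sees the status as of the last refresh, not the instantaneous status of the resource.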

IS Components: GRISes, GIISes and BDII

• Each site can run a BDII. It collects the information coming from the GIISes.

• At each site, a site GIIS collects the information given by the GRISes.

• Local GRISes run on CEs and SEs at each site and report dynamic and static information.

Abbreviations:

BDII: Berkeley Database Information Index

GIIS: Grid Index Information Server

GRIS: Grid Resource Information Server

From LCG 2.3.0 the site GIIS has been replaced by a “local” BDII.

GLUE Schema

• The GLUE Schema (namely the LDAP implementation of the GLUE Schema) describes the Grid resource information that is stored by the Information System.

• There is an object class hierarchy.

• Each class contains a set of attributes.

GLUE Schema

e.g. the GlueCEInfo objectClass

• General information for the queue associated with the CE (objectClass GlueCEInfo):

– GlueCEInfoLRMSType: name of the local batch system
– GlueCEInfoLRMSVersion: version of the local batch system
– GlueCEInfoGRAMVersion: version of GRAM
– GlueCEInfoHostName: fully qualified name of the host where the gatekeeper runs
– GlueCEInfoGateKeeperPort: port number for the gatekeeper
– GlueCEInfoTotalCPUs: number of CPUs in the cluster associated with the CE
– ...
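To make the "class with a set of attributes" idea concrete, here is a small Python sketch that reads an entry written as `attribute: value` lines, the general shape of LDAP search output, into a dictionary. The sample values are invented, and the parser deliberately ignores LDIF details such as continuation lines and base64-encoded values.

```python
# Invented sample entry using attribute names from the slide.
SAMPLE_ENTRY = """\
objectClass: GlueCEInfo
GlueCEInfoLRMSType: pbs
GlueCEInfoHostName: ce.example.org
GlueCEInfoGateKeeperPort: 2119
GlueCEInfoTotalCPUs: 16
"""


def parse_entry(text):
    """Turn 'attribute: value' lines into a dict of attribute -> list of values."""
    entry = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        entry.setdefault(name.strip(), []).append(value.strip())
    return entry


ce_info = parse_entry(SAMPLE_ENTRY)
total_cpus = int(ce_info["GlueCEInfoTotalCPUs"][0])
print(ce_info["GlueCEInfoHostName"][0], total_cpus)
```

Values are kept in lists because an LDAP attribute may carry several values in the same entry.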

How to query the IS?

• To query the IS elements directly, two high-level tools are provided:

lcg-infosites

lcg-info

These tools should be enough for the most common user needs and usually avoid the need for raw LDAP queries.

lcg-infosites

• The lcg-infosites command is an easy way to retrieve information on Grid resources for the most common use cases.

USAGE: lcg-infosites --vo <vo name> options -v <verbose level> --is <BDII to query>

lcg-infosites options

lcg-infosites

• The "lcg-infosites" command is just a Perl script wrapping a series of LDAP commands. It was developed to let the user retrieve information on Grid resources for the most common cases.

• "lcg-infosites" does not use your VOMS proxy certificate, hence every command must name the VO explicitly, e.g. with the option "--vo gilda".

• The BDII defined in the LCG_GFAL_INFOSYS environment variable is queried, e.g.

$ echo $LCG_GFAL_INFOSYS

grid004.ct.infn.it:2170

• The --is option selects the BDII the user wishes to query, e.g.

$ lcg-infosites --vo atlas ce --is prod-bdii.cern.ch
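The choice of BDII described above can be sketched in Python. The snippet assumes that an explicit --is value takes precedence over LCG_GFAL_INFOSYS, which the slides imply but do not state outright; the function names are hypothetical, and the command is only assembled, never run.

```python
import os


def resolve_bdii(is_option=None, environ=os.environ):
    """Pick the BDII to query (assumption: an explicit --is value wins)."""
    if is_option:
        return is_option
    try:
        return environ["LCG_GFAL_INFOSYS"]
    except KeyError:
        raise RuntimeError(
            "no BDII given: set LCG_GFAL_INFOSYS or pass --is"
        ) from None


def infosites_command(vo, selection, is_option=None):
    """Build an lcg-infosites argument list in the form shown on the slide."""
    args = ["lcg-infosites", "--vo", vo, selection]
    if is_option:
        args += ["--is", is_option]
    return args


env = {"LCG_GFAL_INFOSYS": "grid004.ct.infn.it:2170"}
default_bdii = resolve_bdii(environ=env)
explicit_bdii = resolve_bdii("prod-bdii.cern.ch", environ=env)
command = infosites_command("atlas", "ce", "prod-bdii.cern.ch")
print(default_bdii, explicit_bdii, command)
```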

lcg-info

• Query and list of attributes: the "lcg-info" command is similar to "lcg-infosites", except that it lists either the CEs or the SEs that satisfy a given set of conditions on their attributes and prints, for each of them, the values of a given set of attributes.

• Helpful for the "Requirements" tag: this is very similar to using the "Requirements" tag in a JDL file together with the command "glite-job-list-match". The "lcg-info" command can therefore be useful when constructing the "Requirements" tag in a JDL file.

lcg-info usage

lcg-info --list-ce [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list]

lcg-info --list-se [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list]

lcg-info --list-attrs

lcg-info --help

lcg-info options

lcg-info: useful query I

lcg-info --vo gilda --list-ce --query 'Tag=MPICH' --attrs 'FreeCPUs'

• --vo gilda: filters out information that is not available to the gilda VO.

• --list-ce: lists the CEs (that satisfy the following query).

• --query 'Tag=MPICH': we wish to find all sites that support the MPICH package.

• --attrs 'FreeCPUs': we wish to display how many CPUs are available.
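The roles of the four parts of this command can be mimicked on in-memory data. The Python sketch below is an emulation for illustration only: the CE records are invented and `list_ce` is a hypothetical helper, not part of lcg-info.

```python
# Invented CE records; real values would come from the BDII.
CES = [
    {"CE": "ce1.example.org", "VOs": {"gilda"}, "Tag": {"MPICH"}, "FreeCPUs": 12},
    {"CE": "ce2.example.org", "VOs": {"gilda"}, "Tag": set(), "FreeCPUs": 30},
    {"CE": "ce3.example.org", "VOs": {"atlas"}, "Tag": {"MPICH"}, "FreeCPUs": 4},
]


def list_ce(records, vo, tag, attrs):
    """Emulate: lcg-info --vo VO --list-ce --query 'Tag=TAG' --attrs ATTRS."""
    result = []
    for rec in records:
        visible = vo in rec["VOs"]    # --vo: keep only what the VO can see
        matches = tag in rec["Tag"]   # --query: condition on an attribute
        if visible and matches:
            # --attrs: report only the requested attributes for each CE
            result.append({"CE": rec["CE"], **{a: rec[a] for a in attrs}})
    return result


selected = list_ce(CES, vo="gilda", tag="MPICH", attrs=["FreeCPUs"])
print(selected)
```

Only `ce1.example.org` survives both conditions: `ce2` lacks the MPICH tag and `ce3` is not visible to the gilda VO.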

lcg-info: useful query II

lcg-info --vo gilda --list-se --query 'SE=aliserv6.ct.infn.it' --attrs CloseCE

• --vo gilda: filters out information that is not available to the gilda VO.

• --list-se: lists the SEs (that satisfy the following query).

• --query 'SE=aliserv6.ct.infn.it': we wish to find all entries containing ‘aliserv6.ct.infn.it’.

• --attrs CloseCE: we wish to display the CloseCE attribute.

Introduction to R-GMA

• Relational Grid Monitoring Architecture (R-GMA)

– Provides information (which resources are available on the Grid) and monitoring services.
– Developed as part of the European DataGrid Project (EDG), now part of the EGEE project.
– An implementation of the Grid Monitoring Architecture (GMA) from the Global Grid Forum (GGF).

• Uses a relational data model.

– Data are viewed as tables.
– The data structure is defined by the columns.
– Each entry is a row (tuple).
– Queried using Structured Query Language (SQL).

Introduction to R-GMA II

Data are organised in relational tables, and inserted and queried with SQL-style INSERT and SELECT statements (the allowed syntax is a subset of SQL, but reasonably complete for most purposes).

There are some differences to bear in mind. The most basic is that a standard relational database can only have one row (tuple) with a given primary key value, whereas R-GMA usually has more than one.

Introduction to R-GMA: three different query types

• Latest query: each tuple has a timestamp, and for a given primary key value you can query the most recent tuple.

• History query: a history of all tuples within some defined retention period.

• Continuous query: streaming.
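The latest and history queries can be approximated with ordinary SQL on a table that allows several timestamped tuples per key. The sketch below uses Python's built-in sqlite3 module as a stand-in; it is not R-GMA's own API, the table and column names are invented, and the continuous (streaming) query is not modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No PRIMARY KEY constraint: several tuples may share the same key value
# and are distinguished only by their timestamp.
conn.execute("CREATE TABLE cpu_load (host TEXT, load_avg REAL, ts INTEGER)")
conn.executemany(
    "INSERT INTO cpu_load VALUES (?, ?, ?)",
    [("ce1", 0.5, 100), ("ce1", 0.9, 160), ("ce2", 0.2, 110), ("ce1", 0.7, 220)],
)


def latest(conn):
    """Most recent tuple for each key value."""
    return conn.execute(
        "SELECT c.host, c.load_avg, c.ts FROM cpu_load AS c "
        "JOIN (SELECT host, MAX(ts) AS ts FROM cpu_load GROUP BY host) AS m "
        "ON c.host = m.host AND c.ts = m.ts ORDER BY c.host"
    ).fetchall()


def history(conn, now, retention):
    """All tuples whose timestamp falls within the retention period."""
    return conn.execute(
        "SELECT host, load_avg, ts FROM cpu_load WHERE ts >= ? ORDER BY ts",
        (now - retention,),
    ).fetchall()


latest_rows = latest(conn)
history_rows = history(conn, now=250, retention=100)
print(latest_rows)
print(history_rows)
```

Here `ce1` has three tuples with the same key, which a table with a conventional primary key on `host` would reject.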

Relational GMA

• The data model is relational.

• The table definition is globally unique and is stored in the Schema.

• The Registry stores the Producer's table name as well as its URL.

• The data is inserted in the form of a tuple.

• The Consumer gets the tuple from the Producer.

• Producers publish: SQL “INSERT”.

• Consumers collect: SQL “SELECT”.

[Figure: Relational GMA. The Producer stores its location in the Registry; the Consumer looks up the location in the Registry, then executes a query on, or streams data from, the Producer. Table definitions are held in the Schema.]
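The interaction between the three roles can be sketched as a toy in-process model in Python. All names are hypothetical, and a plain dictionary stands in for contacting a Producer at its URL; nothing here reflects the real R-GMA client libraries.

```python
class Registry:
    """Maps a table name to the URLs of the Producers that publish it."""

    def __init__(self):
        self._producers = {}

    def register(self, table, url):
        self._producers.setdefault(table, []).append(url)

    def lookup(self, table):
        return list(self._producers.get(table, []))


class Producer:
    """Publishes tuples for one table and stores its location in the Registry."""

    def __init__(self, url, table, registry):
        self.url = url
        self.table = table
        self.tuples = []
        registry.register(table, url)   # "store location"

    def insert(self, row):
        self.tuples.append(row)         # plays the role of SQL INSERT


class Consumer:
    """Finds Producers through the Registry, then reads tuples from them."""

    def __init__(self, registry, network):
        self.registry = registry
        self.network = network          # url -> Producer, stands in for the network

    def select(self, table):
        rows = []
        for url in self.registry.lookup(table):   # "look up location"
            rows.extend(self.network[url].tuples)  # plays the role of SQL SELECT
        return rows


registry = Registry()
p1 = Producer("http://producer1.example.org", "cpu_load", registry)
p2 = Producer("http://producer2.example.org", "cpu_load", registry)
p1.insert(("ce1", 0.7))
p2.insert(("ce2", 0.2))

consumer = Consumer(registry, {p.url: p for p in (p1, p2)})
rows = consumer.select("cpu_load")
print(rows)
```

The Consumer never needs to know in advance where the data lives: it asks the Registry by table name and merges what each registered Producer returns.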


Questions…
