INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
GridICE: a monitoring service for Grid Systems
Sergio Andreozzi
INFN (Italy)
[email protected]
TERENA Networking Conference 2005, 8 June
TNC2005, Poznan, 8 June 2005
OUTLINE
• What is a Grid
• What is Monitoring
– terms and concepts
– the process of monitoring
• Use cases for a Grid Monitoring System
• GridICE
– Architecture
– Screenshots
What is a Grid
• Virtualization of users and resources
[Figure: a Grid system spanning Site A and Site B]
• Mapping virtual resources to physical resources
• Mapping virtual users to physical users
What is a Grid
Grid Systems:
• enable the secure sharing of resources (e.g., execution environments for jobs, storage areas, databases)
• Shared resources
– Can be geographically dispersed
– Can be heterogeneous
– Can belong to different administrative domains
– Their composition can change dynamically
– Can be accessed by remote users
What is Monitoring
Terms and Concepts
• Grid Monitoring
– the activity of measuring significant parameters related to grid resources
– in order to:
  analyze usage, behavior and performance of the grid
  detect and notify:
    • fault situations
    • contract violations (SLA)
    • user-defined events
What is Monitoring
Terms and Concepts
• Measurement: the process by which numbers or symbols are assigned to features of an entity in order to describe them according to clearly defined rules
• Event: a collection of timestamped data associated with an attribute of an entity [2]
• Event schema (or simply schema): defines the typed structure and semantics of all events so that, given an event type, one can find the structure and interpret the semantics of the corresponding event [2]
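The definitions of event and event schema above can be illustrated with a minimal sketch. The event type `cpu.load`, the field names and the `conforms` helper are all hypothetical; they are not taken from the GLUE Schema or from GridICE.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Hypothetical schema registry: event type -> {field name: expected Python type}.
SCHEMA: Dict[str, Dict[str, type]] = {
    "cpu.load": {"host": str, "load_1min": float},
}


@dataclass(frozen=True)
class Event:
    event_type: str       # key used to look up the structure in the schema
    timestamp: float      # seconds since the epoch at measurement time
    data: Dict[str, Any]  # measured attributes of the entity


def conforms(event: Event, schema: Dict[str, Dict[str, type]] = SCHEMA) -> bool:
    """Return True if the event carries exactly the typed fields its schema declares."""
    fields = schema.get(event.event_type)
    if fields is None:
        return False  # unknown event type: structure cannot be interpreted
    if set(event.data) != set(fields):
        return False  # missing or unexpected fields
    return all(isinstance(event.data[name], t) for name, t in fields.items())
```

Given the event type, a consumer can look up the structure in the registry and check or interpret the payload without any prior knowledge of the producer.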
The four main phases of monitoring
• Generation: sensors enquiring entities and encoding the measurements according to a schema (active/passive, intrusive/non-intrusive)
• Distributing: transmission of the events from the source to any interested parties (data delivery model: push vs. pull; periodic vs. aperiodic; unicast vs. 1-to-N)
• Processing: e.g., filtering according to some predefined criteria, or summarising a group of events
• Presenting: processing and abstracting the received events in order to enable the consumer to draw conclusions about the operation of the monitored system
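The four phases can be sketched as a small pipeline. This is only an illustration under assumptions: the event layout (`timestamp`, `cpu_load`), the threshold filter and all function names are hypothetical, and distribution is shown in its simplest push, 1-to-N form.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, List

Event = Dict[str, float]  # hypothetical event: {"timestamp": ..., "cpu_load": ...}


def generate(sensor: Callable[[], float], timestamps: Iterable[float]) -> List[Event]:
    """Generation: query the entity through a sensor and encode each measurement."""
    return [{"timestamp": t, "cpu_load": sensor()} for t in timestamps]


def distribute(events: List[Event], consumers: List[Callable[[List[Event]], None]]) -> None:
    """Distribution (push, 1-to-N): the source hands the events to every consumer."""
    for consume in consumers:
        consume(events)


def process(events: List[Event], threshold: float) -> Dict[str, float]:
    """Processing: filter by a predefined criterion, then summarise the group."""
    high = [e["cpu_load"] for e in events if e["cpu_load"] >= threshold]
    return {"count": float(len(high)), "mean_load": mean(high) if high else 0.0}


def present(summary: Dict[str, float]) -> str:
    """Presentation: render the abstraction so a consumer can draw conclusions."""
    return f"{int(summary['count'])} high-load samples, mean {summary['mean_load']:.2f}"
```

A consumer could chain the calls as `present(process(received_events, threshold=0.8))` once the events have been distributed to it.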
Use cases for Grid monitoring
• Virtual Organization:
1. Visualize at various aggregation levels the actual set of resources accessible to its members
2. Assess how Grid mapping functionalities from virtual to physical resources and users meet the members’ demands
3. Analyze data retrospectively to understand how to improve the effectiveness of VO applications running in a Grid, as the target machine for different executions of the same application can vary over time
Use cases for Grid monitoring
• Site Administrator:
– Visualize the managed Grid services in order to see how they are being used and how they perform (possibly divided by VO)
• User:
– Is my job “working” (e.g., consuming CPU)?
• Grid Operation Center:
– Status of Grid services (e.g., WMS, Service Discovery, CE, SE)
– Free/busy resources per site/per VO at a given time
– Timely notification about fault situations
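Several of these use cases reduce to viewing the same measurements at different aggregation levels (per site, per VO, or both). A minimal sketch follows, assuming a hypothetical `JobCount` record with running and waiting job counters; the site and VO names in the usage note are placeholders.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, NamedTuple


class JobCount(NamedTuple):
    site: str      # hypothetical site name
    vo: str        # virtual organization the jobs belong to
    running: int   # jobs currently running
    waiting: int   # jobs queued


def aggregate(records: Iterable[JobCount],
              key: Callable[[JobCount], Hashable]) -> Dict[Hashable, Dict[str, int]]:
    """Sum running/waiting jobs over all records that share the same key."""
    totals: Dict[Hashable, Dict[str, int]] = defaultdict(lambda: {"running": 0, "waiting": 0})
    for rec in records:
        bucket = totals[key(rec)]
        bucket["running"] += rec.running
        bucket["waiting"] += rec.waiting
    return dict(totals)
```

Passing `key=lambda r: r.vo` gives the VO-wide view, while `key=lambda r: (r.site, r.vo)` gives the per-site, per-VO breakdown a Grid Operation Center would look at.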
GridICE: architectural insight
Monitoring: generating events
• Generation of events:
– Sensors: typically Perl scripts or C programs
– Schema: GLUE Schema v.1.1 + GridICE extension
  • System related (e.g., CPU load, CPU type, memory size)
  • Grid service related (e.g., CE ID, queued jobs)
  • Network related (e.g., packet loss) [5]
  • Job usage (e.g., CPU time, wall time)
– All sensors are executed periodically
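Periodic execution of sensors can be sketched as follows. This is not how the GridICE sensors (Perl scripts or C programs) are scheduled; the `PeriodicSensor` class, the `run_due` helper and the metric names are hypothetical, and the current time is passed in explicitly so the logic can be followed without waiting.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PeriodicSensor:
    name: str                     # hypothetical metric name, e.g. "cpu_load"
    period: float                 # seconds between two measurements
    measure: Callable[[], float]  # probe returning the current value
    next_due: float = 0.0         # time of the next scheduled run


def run_due(sensors: List[PeriodicSensor], now: float) -> List[Dict[str, object]]:
    """Run every sensor whose period has elapsed and return the generated events."""
    events: List[Dict[str, object]] = []
    for sensor in sensors:
        if now >= sensor.next_due:
            events.append({"metric": sensor.name, "timestamp": now, "value": sensor.measure()})
            sensor.next_due = now + sensor.period  # schedule the next measurement
    return events
```

Giving each metric its own period lets cheap measurements run often while more intrusive ones run rarely.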
Monitoring: distributing
• Distribution of events:
– Hierarchical model
  Intra-site: by means of the local monitoring service
    • default choice: LEMON (http://www.cern.ch/lemon)
  Inter-site: by offering data through the Grid Information Service
  Final consumer: depending on the client application
– Mixed data delivery model
  Intra-site: depending on the local monitoring service (push for LEMON)
  Inter-site: depending on the GIS (current choice: MDS 2.x, pull)
  Final consumer: pull (browser/application), push (publish/subscribe notification service)
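The difference between the push and pull delivery models can be sketched with a single producer that supports both. The `Producer` class and its method names are hypothetical and do not correspond to the LEMON, MDS or GridICE interfaces.

```python
from typing import Callable, Dict, List


class Producer:
    """Holds the latest value of each metric and supports both delivery models."""

    def __init__(self) -> None:
        self._latest: Dict[str, float] = {}
        self._subscribers: List[Callable[[str, float], None]] = []

    def subscribe(self, callback: Callable[[str, float], None]) -> None:
        """Push model: the consumer registers once and is notified on every update."""
        self._subscribers.append(callback)

    def query(self, metric: str) -> float:
        """Pull model: the consumer asks for the current value when it needs it."""
        return self._latest[metric]

    def publish(self, metric: str, value: float) -> None:
        """Store a new measurement and push it to all subscribers."""
        self._latest[metric] = value
        for notify in self._subscribers:
            notify(metric, value)
```

With push, the producer decides when data moves; with pull, the consumer does, which is why a browser polling a page and a publish/subscribe notification service can coexist on the consumer side.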
Example deployment in LCG2
GridICE Architecture
[Figure: GridICE on LCG 2. Logical components: Resource, Site Publisher, GridICE Server, Consumer. Roles: sensor, publisher, event collector, event provider, consumer. Software shown: Lemon agent, Lemon server, scripts, MDS GRIS, LDAP client, browser, application. Networks: LAN, WAN. Data delivery models: push, periodic, unicast; pull, periodic, unicast; over HTTP, HTML/XML: pull, aperiodic, unicast and NS: push, aperiodic, unicast.]
GridICE on gLite
[Figure: GridICE on gLite. Logical components: Resource, Site Publisher, GridICE Server, Consumer. Roles: sensor, publisher, event collector, event provider, consumer. Software shown: Lemon agent, Lemon server, scripts, CEMon, RGMA, MDS2, browser, application. Networks: LAN, WAN. Data delivery models: push, periodic, unicast; pull, periodic, unicast; over HTTP, HTML/XML: pull, aperiodic, unicast and NS: push, aperiodic, unicast.]
GridICE Server
[Figure: GridICE Server components: Persistent storage; Discovery, Consumers, Scheduler; XML abstraction; XSLT->HTML, XML, Notification S., Charts [6]. The figure marks the components that need to be revised when migrating to gLite.]
GridICE >> Site View >> General
GridICE >> Site View >> Host Summary
Running/waiting jobs for a VO
Current release status
– Integrated and deployed with LCG 2.x
– Last server release: v.1.8.0.rc1
– Last sensor release: v.1.5.1.pl2
– Installed servers are monitoring Grid resources in the scope of:
  Grid.it
  GILDA
  LCG
  LCG South-West federation
  LCG South-East federation
  SEE-GRID
  E-grid project
  CMS experiment
  ATLAS experiment
Next steps
• Avoid proliferation of sensors (especially intrusive ones)
– Planned integration with the metering system of DGAS (Grid accounting)
• Deal with site policies regarding the privacy of log files
– Sensors should provide mechanisms to allow sites to push the data of interest
• Enable security of monitoring data
– in particular, VO- and role-based authorization at the producer level (e.g., the VO manager of CMS can read all information about the jobs run by CMS users)
• Deal with heterogeneous producer interfaces
• Provide a more flexible job monitoring mechanism
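The VO- and role-based authorization mentioned above could look like the following sketch. It is an assumption-laden illustration, not a GridICE design: the `Credential` type, the role names `manager` and `member`, the job record fields, and the rule that ordinary members see only their own jobs are all hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple


class Credential(NamedTuple):
    user: str
    vo: str
    role: str  # hypothetical role names: "manager" or "member"


def visible_jobs(jobs: Iterable[Dict[str, str]], cred: Credential) -> List[Dict[str, str]]:
    """Return the job records this credential is allowed to read.

    Assumed policy: a VO manager sees every job of that VO,
    any other member sees only the jobs they own.
    """
    if cred.role == "manager":
        return [j for j in jobs if j["vo"] == cred.vo]
    return [j for j in jobs if j["vo"] == cred.vo and j["owner"] == cred.user]
```

Applying the filter at the producer means that data a consumer is not entitled to never leaves the site.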
CONCLUSION
• Monitoring of Grid systems is a complex activity that spans metering, distributing and presenting
• GridICE has been designed on the basis of requirements collected from a wide variety of users
• The experience in production environments is positive
• New challenges for its evolution in the context of gLite are the harmonization of metering, security, and dealing with multiple producers
References
[1] S. Andreozzi, N. De Bortoli, S. Fantinel, A. Ghiselli, G. L. Rubini, G. Tortone, M. C. Vistoli, GridICE: a monitoring service for Grid systems, Future Generation Computer Systems 21 (2005) 559–571
[2] B. Tierney, R. Aydt, D. Gunter, W. Smith, M. Swany, V. Taylor, R. Wolski, A Grid Monitoring Architecture, GFD-I.7
[3] S. Zanikolas, R. Sakellariou, A taxonomy of grid monitoring systems, Future Generation Computer Systems 21 (2005) 163–188
[4] M. Franklin, S. Zdonik, “Data In Your Face”: Push Technology in Perspective, ACM SIGMOD ’98, Seattle, WA, USA
[5] S. Andreozzi, A. Ciuffoletti, A. Ghiselli, C. Vistoli. Monitoring the connectivity of a Grid. Proceedings of the 2nd International Workshop on Middleware for Grid Computing (MGC 2004) in conjunction with the 5th ACM/IFIP/USENIX International Middleware Conference, Toronto, Canada, October 2004.
[6] S. Andreozzi, N. De Bortoli, S. Fantinel, G.L. Rubini, G. Tortone. Design and Implementation of a Notification Model for Grid Monitoring Events. CHEP04, Interlaken (CH), Sep 2004
Dissemination: http://grid.infn.it/gridice