Applications and System Monitoring with Opsview/Nagios
RAL Retreat 2011
Copyright © 2011, University Corporation for Atmospheric Research (UCAR). All rights reserved.
What is Opsview
Opsview vs. Nagios
Nagios Opsview
Opsview vs SNMP
SNMP Protocols: SNMP, SNMP Trap
Nagios Proprietary Protocols: NRPE, NSCA
Opsview Server
Opsview vs. RAL Tools
Opsview
Opsview Concepts
Host
Services: Fan Speed, Temperature, Ping, Memory/RAM, Clock Synchronization, CPU, Disk Space, RAID
Service Checks
How does it work?
Opsview Server
Nagios Remote Plugin Executor (NRPE)
Scheduled Execution
Allowed Commands Only
What about firewalls?
Opsview Server
Nagios Service Check Acceptor (NSCA)
No response?
Opsview Web Interface
Comment
Scheduled Downtime
Graph Available
Unhandled Colors
Handled Colors
Automatic Notification Management
Notification only after multiple failures
Ping
CPU
Disk
Memory
NRPE Dependencies
OK, Critical, OK, Warning, Critical = Flap Detection
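The flap-detection idea on this slide can be sketched as a simple calculation. This is a simplified illustration in Python, not the Nagios implementation: Nagios weights recent state changes more heavily and compares the result against configurable thresholds. The function names and the 50% threshold below are assumptions made for the example.

```python
def percent_state_change(states):
    """Return the percentage of consecutive check results that differ."""
    if len(states) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
    return 100.0 * changes / (len(states) - 1)


def is_flapping(states, threshold=50.0):
    # threshold is a hypothetical value, not a Nagios or Opsview default
    return percent_state_change(states) >= threshold


# The sequence shown on the slide: every result differs from the previous one.
history = ["OK", "CRITICAL", "OK", "WARNING", "CRITICAL"]
print(percent_state_change(history), is_flapping(history))
```

A service that changes state on nearly every check is treated as flapping, so that it does not generate a notification for each individual change.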
Manual Notification Management
Accessing Opsview
• Multiple instances at RAL
• E-mail helpdesk@rap.ucar.edu to request access.
• SNAT instance: https://opsview.rap.ucar.edu
• Just type "opsview" in your browser.
Custom Nagios Plug-in Development
A Nagios plug-in is any executable that:
– Prints a one-line status to stdout; and
– Has an exit code to indicate status:
  • 0 – OK
  • 1 – Warning
  • 2 – Critical
  • 3 – Unknown; and
– Optionally appends performance data to the one-line status:
  • |'Graph Label'=value;warning threshold;error threshold;min y-axis value;max y-axis value
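As a minimal sketch of the plug-in contract above (one status line plus an exit code from 0 to 3), here is a hypothetical disk-usage check written in Python. The check name, the thresholds, and the wording of the status line are assumptions made for illustration; any language that can print a line and set an exit code would do.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_disk_used(percent_used, warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for a hypothetical disk-usage check."""
    if percent_used is None or not 0.0 <= percent_used <= 100.0:
        return UNKNOWN, "DISK UNKNOWN - no usable measurement"
    if percent_used >= crit:
        code = CRITICAL
    elif percent_used >= warn:
        code = WARNING
    else:
        code = OK
    return code, "DISK %s - %.1f%% used" % (STATUS_NAMES[code], percent_used)


# Placeholder measurement; a real plug-in would measure something here.
code, line = check_disk_used(42.0)
print(line)
# A real plug-in script would finish with sys.exit(code) so that the
# exit status carries the result back to Nagios.
```

Returning UNKNOWN when no measurement is available keeps a broken check from being reported as either healthy or failed.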
Performance Data
|'Graph Label'=value;warning threshold;error threshold;min y-axis value;max y-axis value
|'Age of madis decoded'=4389s;6400;7200;0;7500
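To make the field order concrete, the following Python sketch builds and parses a performance-data string of the form shown above. The helper names are hypothetical, and the parser handles only a single label/value pair like the one on this slide.

```python
import re


def format_perfdata(label, value, unit="", warn="", crit="", minimum="", maximum=""):
    """Build "|'label'=value[unit];warn;crit;min;max" from its parts."""
    return "|'%s'=%s%s;%s;%s;%s;%s" % (label, value, unit, warn, crit, minimum, maximum)


def parse_perfdata(text):
    """Split one performance-data string into a dict of its fields."""
    text = text.strip().lstrip("|")
    label, _, rest = text.rpartition("=")
    fields = rest.split(";")
    match = re.match(r"^(-?\d+(?:\.\d+)?)(.*)$", fields[0])
    if match is None:
        raise ValueError("no numeric value in %r" % fields[0])
    result = {
        "label": label.strip("'"),
        "value": float(match.group(1)),
        "unit": match.group(2),
    }
    # Missing or empty trailing fields are reported as None.
    for name, raw in zip(["warn", "crit", "min", "max"], fields[1:] + [""] * 4):
        result[name] = float(raw) if raw else None
    return result


line = format_perfdata("Age of madis decoded", 4389, "s", 6400, 7200, 0, 7500)
print(line)
print(parse_perfdata(line))
```

In the slide's example, the data is 4389 seconds old, which is below both the warning threshold (6400) and the error threshold (7200), and the graph's y-axis runs from 0 to 7500.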
Passive Checks
• Means your software is sending a message to a Nagios server
• Formatted string provided at stdin for send_nsca executable
• Perl API developed (cvs/apps/nagios/src/passive)
  – Subroutine takes parameters and re-formats as necessary
• Service check must still be configured in Opsview
Active Check Example
Many re-usable plug-ins in cvs/apps/nagios/src/plugins
Usage: ./check_mdv_data_time.pl
  -u <mdvUrl>       full URL to the MDV data set
  -l <maxDataAge>   maximum age of the latest data before being considered late (seconds)
  -m <maxDataAge>   maximum age of the latest data before being considered missing (seconds)
  -n <dataSetName>  name of the data set -- used in an alert message
  -h                show this message
Example: ./check_mdv_data_time.pl -u mdvp:://<host>::<path> -l 1200 -m 3600 -n MyData
Passive Check Example
Passive check is a call from application code to the send_nsca binary
<hostname>[tab]<descrip>[tab]<return_code>[tab]<plugin_output>[newline]
printf "magen-c1-int1\tdata archive\t0\tdata archive was successful\n" | send_nsca -H magen-dev-admin -c nagios/etc/send_nsca.cfg
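The same call can be made from application code without a shell. The Python sketch below formats the tab-separated message and pipes it to send_nsca on stdin. The function names are hypothetical; `-H` (Nagios host) and `-c` (configuration file) are send_nsca's own options. The sketch assumes send_nsca is on the PATH, and the sending function raises FileNotFoundError if it is not.

```python
import subprocess


def format_passive_result(host, service, return_code, output):
    """Build '<hostname>\\t<descrip>\\t<return_code>\\t<plugin_output>\\n'."""
    for field in (host, service, output):
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return "%s\t%s\t%d\t%s\n" % (host, service, return_code, output)


def send_passive_result(nagios_host, config_path, message, send_nsca="send_nsca"):
    """Pipe one formatted message to send_nsca; return True on exit status 0."""
    completed = subprocess.run(
        [send_nsca, "-H", nagios_host, "-c", config_path],
        input=message.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return completed.returncode == 0


message = format_passive_result(
    "magen-c1-int1", "data archive", 0, "data archive was successful")
print(repr(message))
# To actually send it (requires send_nsca and a reachable NSCA daemon):
# send_passive_result("magen-dev-admin", "nagios/etc/send_nsca.cfg", message)
```

Rejecting tabs and newlines inside fields matters because the tab is the field delimiter and the newline ends the record.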
Passive Check Example
There is a Perl API available to make this call easier:
require "cvs/apps/nagios/src/passive/perl/sendNSCA.pm";
… …
($success, $errorMsgs) = &sendNSCA(
    "magen-dev-admin",                      # nagios host
    "magen-c1-int1",                        # host where service is checked
    "data archive",                         # service check name
    0, "data archive was successful" );     # status and message
Installing Custom Plug-ins
• Copy plug-in files to: nagios/libexec
• Edit nagios/etc/nrpe.cfg to allow plug-ins to be used
• Restart the opsview-agent process
• Configure active check using Opsview
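For the nrpe.cfg step, NRPE allows a command by defining it with a `command[<name>]=<command line>` entry. The fragment below is a hypothetical example: the command name and arguments are borrowed from the active check example for illustration, and the relative plug-in path should be replaced with the full path on the monitored host.

```
# nagios/etc/nrpe.cfg
# Hypothetical entry: lets the server run this plug-in through NRPE
# under the command name "check_mdv_data_time".
command[check_mdv_data_time]=nagios/libexec/check_mdv_data_time.pl -u mdvp:://<host>::<path> -l 1200 -m 3600 -n MyData
```

Only commands defined this way can be executed remotely, which is what "Allowed Commands Only" refers to.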