Writing Nagios Plugins in Python

Enhancing Nagios with Python Plugins Maurice Maneschi Associate Director, Risk Management Systems Oakvale Capital Limited

Post on 03-Jul-2015

DESCRIPTION

I introduced Nagios to an organisation in 2004 to track the availability of various servers and network resources. It has since grown into a system validity tool that takes the stress out of the help desk. Using Python as a scripting language, I have created a suite of additional Nagios plugins that cover:

* real-time entry of market rates
* end of day rate integrity
* common errors in manual spreadsheets
* success of backup processes
* validity conditions in MS SQL databases
* routine tracking of known chronic errors

TRANSCRIPT

Page 1: Writing Nagios Plugins in Python

Enhancing Nagios with Python Plugins

Maurice Maneschi
Associate Director, Risk Management Systems

Oakvale Capital Limited

Page 2: Writing Nagios Plugins in Python

Presentation Outline

● Risk Management Systems

● What is Nagios

● Why Python

● What is a plugin

● Specific risks being monitored

● Analysing reports and logs

● Where to next

Page 3: Writing Nagios Plugins in Python

Risk Management Systems

● A division of five staff

● Supporting three key applications

● Running on eight servers

● Depending on 15+ other boxes spread over 3 LANs

● Five key vendors

Page 4: Writing Nagios Plugins in Python

Risk Management Systems

● Divisional goals

– Key goal is application management

– Some customer support

– Product innovation

– Project management

– No time for nasty surprises

Page 6: Writing Nagios Plugins in Python

What is Nagios

● Host, service, network monitoring program

● Open source

● Written in C

● Runs on Linux and Apache

Page 7: Writing Nagios Plugins in Python

What is Nagios

● Configured with the hosts of a network

– How the hosts are networked

– What key services are on the hosts

● “PING”, SMTP, HTTP etc.

● Application polls these at specified intervals

– From the results of the polls, determines the state of hosts, services and networks

– Alerts sent by email

– Escalation, reporting, statistics and more

Page 8: Writing Nagios Plugins in Python

Why Python

● Flexible

● Efficient

● Manageable

● Numerous, diverse libraries

● Cross-platform

● Huge number of code samples across the network

Page 9: Writing Nagios Plugins in Python

What is a plugin

● Executable file

– Takes parameters (preferably)

– Prints a short status message

● Returns an exit status of

– 0 – all OK

– 1 – warning

– 2 – critical

● Stateless

Page 10: Writing Nagios Plugins in Python

What is a plugin

● Executable Python script

● Code the test

● Print the status line

● Return a status

● Easy!
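The three steps above (code the test, print the status line, return a status) can be sketched as a small script. This is only an illustrative skeleton, not the author's code: the service name `EXAMPLE`, the command-line arguments and the threshold logic are all hypothetical. Exit status 3 (UNKNOWN) is part of the usual Nagios plugin convention even though the slides list only 0 to 2.

```python
#!/usr/bin/env python3
"""Minimal Nagios plugin skeleton (illustrative sketch only)."""
import sys

# Exit statuses Nagios understands. 3 (UNKNOWN) is not on the slide but is
# part of the standard plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Code the test: map a measured value onto a status using two thresholds."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def status_line(service, status, detail):
    """Build the single short line that Nagios displays for the service."""
    return "%s %s - %s" % (service, LABELS[status], detail)


def main(argv):
    # Hypothetical usage: check_example.py VALUE WARN CRIT
    try:
        value, warn, crit = (float(arg) for arg in argv[1:4])
    except ValueError:
        # Raised for non-numeric input and for too few arguments.
        print(status_line("EXAMPLE", UNKNOWN, "usage: VALUE WARN CRIT"))
        return UNKNOWN
    status = evaluate(value, warn, crit)
    print(status_line("EXAMPLE", status, "value is %g" % value))
    return status


# As a real plugin, the script would end by returning the status to Nagios:
#     sys.exit(main(sys.argv))
```

The plugin keeps no state between runs; every invocation measures, prints one line and exits, which is what lets Nagios schedule it freely.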

Page 11: Writing Nagios Plugins in Python

Specific risks being monitored

● Customer email to the help desk system has stopped

– Users email issues directly into our help desk system for prioritisation, action and eventually billing

– Spam periodically breaks the import agent

– It's proprietary, so no fix in sight

– Nagios watches the queue using POP3
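A POP3 queue check along these lines can be written with the standard `poplib` module. The sketch below assumes that the import agent normally drains the mailbox, so a growing message count suggests it has stopped; that assumption, the `HELPDESK MAIL` label and the thresholds of 5 and 20 messages are illustrative and not taken from the talk.

```python
import poplib

OK, WARNING, CRITICAL = 0, 1, 2


def classify_queue(message_count, warn=5, crit=20):
    """Turn the number of waiting messages into (status, status line)."""
    if message_count >= crit:
        status, label = CRITICAL, "CRITICAL"
    elif message_count >= warn:
        status, label = WARNING, "WARNING"
    else:
        status, label = OK, "OK"
    line = "HELPDESK MAIL %s - %d messages waiting" % (label, message_count)
    return status, line


def count_waiting(host, user, password, timeout=10):
    """Ask the POP3 server how many messages are still in the mailbox."""
    server = poplib.POP3(host, 110, timeout)
    try:
        server.user(user)
        server.pass_(password)
        # stat() returns (message count, mailbox size in bytes).
        message_count, _mailbox_size = server.stat()
    finally:
        server.quit()
    return message_count
```

A plugin built on this would call `count_waiting`, pass the result to `classify_queue`, print the line and exit with the status. Nothing is deleted or marked as read, so the check does not interfere with the import agent.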

Page 14: Writing Nagios Plugins in Python

Specific risks being monitored

● Ratefeed is missing some rates

– Rates feed into our system from Reuters via MS Excel

– Some rates are critical, and human intervention is required if they are missing

– Other rates are important, but are just tracked when missing

– Nagios watches the MS Excel worksheet listing the “unreliable rates”
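The split between critical and merely important rates maps naturally onto the CRITICAL and WARNING statuses. The sketch below shows only that classification step and assumes the worksheet has already been read into a dictionary; reading the Excel file itself (the talk mentions pyexcelerator) is left out, and the rate names in the usage example are hypothetical.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def check_rates(rates, critical_names, important_names):
    """Classify missing rates.

    `rates` maps a rate name to its value; None, "" or an absent key
    all count as missing.
    """
    def missing(names):
        return sorted(name for name in names if rates.get(name) in (None, ""))

    missing_critical = missing(critical_names)
    missing_important = missing(important_names)
    if missing_critical:
        return CRITICAL, "RATEFEED CRITICAL - missing: " + ", ".join(missing_critical)
    if missing_important:
        return WARNING, "RATEFEED WARNING - missing: " + ", ".join(missing_important)
    tracked = len(critical_names) + len(important_names)
    return OK, "RATEFEED OK - all %d tracked rates present" % tracked


# Hypothetical usage: one critical rate present, one important rate missing.
status, line = check_rates({"AUDUSD": 0.77, "BBSW3M": None}, ["AUDUSD"], ["BBSW3M"])
print(line)
```

Naming the missing rates in the status line means the alert email already says which rate needs attention.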

Page 17: Writing Nagios Plugins in Python

Specific risks being monitored

● Rates must be inserted regularly

– Insertion process has numerous dependencies

– Moving target – causes of failure change over time

– Focus on the end point – are the rates in the database?

– Nagios checks the databases and alerts on old or missing rates
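Checking the end point rather than each dependency comes down to one query: how old is the newest rate? The sketch below uses the standard `sqlite3` module as a stand-in so that it runs anywhere; the talk's databases are MS SQL (reached via pymssql), and the `rates` table, its `inserted_at` column and the 15 minute and 1 hour thresholds are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta

OK, WARNING, CRITICAL = 0, 1, 2


def check_rate_age(conn, now, warn=timedelta(minutes=15), crit=timedelta(hours=1)):
    """Alert when the newest row in the (hypothetical) `rates` table is old.

    `inserted_at` is assumed to hold ISO 8601 text such as 2024-01-01T12:00:00.
    """
    row = conn.execute("SELECT MAX(inserted_at) FROM rates").fetchone()
    if row[0] is None:
        return CRITICAL, "RATES CRITICAL - no rates in the database"
    age = now - datetime.fromisoformat(row[0])
    minutes = int(age.total_seconds() // 60)
    if age >= crit:
        return CRITICAL, "RATES CRITICAL - newest rate is %d minutes old" % minutes
    if age >= warn:
        return WARNING, "RATES WARNING - newest rate is %d minutes old" % minutes
    return OK, "RATES OK - newest rate is %d minutes old" % minutes


# Stand-in database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rates (name TEXT, value REAL, inserted_at TEXT)")
```

Because the check looks only at the result, it keeps working when the causes of failure upstream change.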

Page 20: Writing Nagios Plugins in Python

Specific risks being monitored

● External source of dealing information

– Fed in through the FIX protocol

– Numerous failure points being monitored on a (Windows) server

– Monitor process must check in with Nagios every 10 minutes

– Using passive and active checks
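A passive check result reaches Nagios as a `PROCESS_SERVICE_CHECK_RESULT` external command written to its command file. The sketch below formats such a line; the host and service names are hypothetical, and the talk does not say how results travel from the Windows server to the Nagios host, so a relay such as NSCA may be needed in between. The "check in every 10 minutes" rule could then be enforced on the Nagios side with the `check_freshness` and `freshness_threshold` service directives, which is presumably where the active check comes in.

```python
import time


def passive_result(host, service, return_code, output, timestamp=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if timestamp is None:
        timestamp = int(time.time())
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit(command_file, line):
    """Write one command to the Nagios external command file.

    The file is a named pipe whose location is set by `command_file`
    in nagios.cfg, so opening it for writing does not truncate anything.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Hypothetical heartbeat from the monitor process.
print(passive_result("fixserver", "FIX monitor", 0, "heartbeat OK", timestamp=1200000000), end="")
```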

Page 23: Writing Nagios Plugins in Python

Specific risks being monitored

● Quick passive check

Page 24: Writing Nagios Plugins in Python

Specific risks being monitored

● Successful backups

● Successful scheduled tasks

● Database comparisons

● Common errors

– Password server on web site

– Known failure point on an MS Excel worksheet
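One simple way to check that a backup succeeded is to confirm that the backup file exists, is not empty and was written recently. The talk does not describe how its backup checks work, so the sketch below is only one possible approach; the 26 hour limit (a daily backup plus some slack) and the size floor are hypothetical.

```python
import os
import time

OK, WARNING, CRITICAL = 0, 1, 2


def check_backup(path, max_age_hours=26, min_bytes=1, now=None):
    """Check that a backup file exists, is not too small and is recent."""
    now = time.time() if now is None else now
    try:
        info = os.stat(path)
    except OSError as err:
        return CRITICAL, "BACKUP CRITICAL - cannot read %s (%s)" % (path, err.strerror)
    if info.st_size < min_bytes:
        return CRITICAL, "BACKUP CRITICAL - %s is only %d bytes" % (path, info.st_size)
    age_hours = (now - info.st_mtime) / 3600.0
    if age_hours > max_age_hours:
        return WARNING, "BACKUP WARNING - %s is %.1f hours old" % (path, age_hours)
    return OK, "BACKUP OK - %s is %.1f hours old" % (path, age_hours)
```

The same pattern fits scheduled tasks that leave a log or output file behind: the file's modification time is the evidence that the task ran.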

Page 25: Writing Nagios Plugins in Python

Extra enhancements to Nagios

● High level view to systems health

● Audio alerts and SMSes from UTbox.net

● Status screen on monitor PC

● Syslogd for firewall

● Script reuse for rate checks

● Ad hoc system problems

– Currently tracking WAN failures

Page 26: Writing Nagios Plugins in Python

Analysing reports and logs

● Screen saver often sufficient

● Summary views

Page 34: Writing Nagios Plugins in Python

Where to next

● Low-spec PC

● Nagios is in several distro repositories

– I compile from the source

● Allow a day at least to configure Nagios

– Don't expect to install and switch it on

● Tuning Nagios is an ongoing job

Page 35: Writing Nagios Plugins in Python

Further information

● Nagios: http://www.nagios.org

● Python: http://www.python.org

– pyexcelerator, pymssql, freetds from Sourceforge

● Oakvale Capital: http://www.oakvale.com

● Code samples: http://www.redwaratah.com/wiki/index.php?title=Nagios_and_Python

● Maurice Maneschi: [email protected]