![Page 1: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/1.jpg)
Open Source Enterprise Monitoring with Zabbix
Alexei Vladishev, Founder of Zabbix
www.zabbix.com
![Page 2: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/2.jpg)
Plan
What is Zabbix:
• Zabbix overview
• Highlights of Zabbix features
• Monitoring of large distributed environments
Future:
• Zabbix Roadmap
![Page 3: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/3.jpg)
Zabbix overview
![Page 4: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/4.jpg)
Why should we use monitoring?
Most important reasons:
• To warn and act in case of any problems
• Downtime is very expensive!
• To identify and fix problems ASAP, before customers start calling
• To make IT staff more productive
• To automate routine tasks, such as checking the availability of resources
• To plan hardware resources: capacity planning and trends
• To measure and analyse the quality of provided and used services (SLA)
A good monitoring system makes us confident that our business is running!
![Page 5: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/5.jpg)
History
Zabbix is celebrating its 8th anniversary!
• 1998: the choice was HP OpenView, IBM, BMC, all expensive to buy and maintain
• How to name it? ABCDE...Zabbix!
• April 2001: the first public release, Zabbix 1.0alpha1
• April 2004: the first stable release, Zabbix 1.0
• April 2005: the company Zabbix SIA was established, offering commercial support
Zabbix today. We have made good progress!
Zabbix 1.6.4, 500 downloads per day, 15,000 forum users
The Zabbix company is growing, 20 Zabbix partners (Europe, Japan, the US)
![Page 6: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/6.jpg)
What is Zabbix?
Zabbix is an Open Source distributed monitoring system capable of monitoring the availability and performance of servers, network devices and applications.
Zabbix functionality:
• Agent-less and agent-based monitoring
• Auto-discovery
• Escalations and repeated notifications
• Pro-active monitoring, remote actions
• WEB monitoring
• Graphs, maps, screens
• IT Services (SLA), reports
• Distributed monitoring, IPv6 and more!
![Page 7: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/7.jpg)
Zabbix: main components
Server:
• Zabbix core, system logic
• Data processing, escalations
WEB front-end:
• Access to historical data
• Configuration
Agent:
• Server data collection, actions
Proxy:
• Remote data collection
![Page 8: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/8.jpg)
Technical details
Important technical decisions:
• WEB front-end for data visualisation and configuration
• Written in the C language, with a PHP front-end. No Java/Python/Perl/Ruby on the server and agent side! No fork(), native system calls are used instead.
• Support of virtually all platforms (Linux, *BSD, Solaris, AIX, HP-UX, Windows, ...)
• Choice of database engines: MySQL, PostgreSQL, Oracle, SQLite
• We do not reuse Nagios, RRD, Cacti
Key principles of Zabbix development:
• Keep things simple (KISS), yet be very flexible
• Maintain low hardware requirements; monitoring should not affect production
![Page 9: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/9.jpg)
Why would we choose Zabbix?
What makes Zabbix so special?
• An all-in-one solution, but only when it comes to monitoring!
• All historical data, trends and configuration are stored in a database
• Ready for monitoring of small and LARGE distributed environments
• True Open Source (GPLv2) solution, no commercial versions
• All logic is on the server side, agents are for data collection only
• Extremely flexible! Triggers, escalations, new checks, screens, and more
• Designed to deal with unstable communications
• Full support of IPv6
![Page 10: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/10.jpg)
How to monitor
Service checks:
• FTP, SSH, HTTP, SMTP, DNS ...
Zabbix Agent:
• Active and passive checks
• Monitoring of logs, event logs
• Easy to extend
• Remote command execution
• Extremely efficient!
Other: WMI, JMX, Nagios plugins
SNMP v1, v2, v3:
• Network devices
• Normally NET-SNMP for servers
• Monitoring of applications (Oracle, Weblogic, Websphere, PostgreSQL, MySQL, ...)
• SNMP traps
IPMI:
• Monitoring of hardware
• Remote management (reboot, reset, halt)
![Page 11: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/11.jpg)
Use of Zabbix agent
Active checks:
• Highly efficient
• Buffering of collected data
Passive checks:
• Require polling on the Zabbix server side
• Additional performance hit because of polling and network bandwidth
![Page 12: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/12.jpg)
Zabbix Highlights
![Page 13: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/13.jpg)
Mmm... Triggers!
A trigger is a flexible logical expression used to define a problem condition.
• The status (value) of a trigger represents the system state
• A change of trigger value generates events
• It is one of the ways to deal with flapping
CPU load is too high: {host:cpuload.last(0)}>5
CPU load is too high: {host:cpuload.min(300)}>2
CPU load is too high: {host:cpuload.min(300)}>2 & {host:cpuuser.min(300)}>50
CPU load is too high: {host:cpuload.min(300)}>2 & {host2:backup.last(0)}=0
We decide how to define «CPU load is too high», not Zabbix itself!
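The difference between `last(0)` and `min(300)` in the expressions above can be illustrated with a small sketch. This is not Zabbix code: the helper names `last` and `min_over` and the sample data are hypothetical, and the sketch only mimics the idea that a condition on the minimum over 300 seconds ignores short spikes.

```python
from typing import List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)

def last(history: List[Sample]) -> float:
    """Most recent value, similar in spirit to last(0)."""
    return history[-1][1]

def min_over(history: List[Sample], seconds: int, now: int) -> float:
    """Smallest value seen during the last `seconds`, similar in spirit to min(300)."""
    window = [v for (t, v) in history if now - seconds < t <= now]
    return min(window)

# Hypothetical CPU load samples, one per minute, with a single spike at the end.
cpuload = [(0, 0.5), (60, 0.7), (120, 0.6), (180, 0.4), (240, 6.2)]
now = 240

# Fires on the single spike.
spike_trigger = last(cpuload) > 5
# Stays quiet: the load was not above 2 for the whole window.
sustained_trigger = min_over(cpuload, 300, now) > 2

print(spike_trigger, sustained_trigger)
```

With these samples the `last` variant reports a problem while the `min_over` variant does not, which is why the windowed form helps against flapping.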
![Page 14: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/14.jpg)
Dependencies
They are used to:
• Avoid notifications about dependent problems
• Define dependencies between different problems (related to networks, applications, anything). No host dependencies!
Server is down → Switch1 is down → Switch2 is down
WEB App is down → MySQL is not responsive → No free disk space on /tmp
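The two chains above can be sketched as a lookup table plus a suppression check. This is an illustration only, not the Zabbix implementation: the names `DEPENDS_ON`, `is_suppressed` and `to_notify` are hypothetical, and following dependencies transitively is an assumption of this sketch.

```python
from typing import Dict, List, Set

# Each problem maps to the problems it depends on (the arrows on the slide).
DEPENDS_ON: Dict[str, List[str]] = {
    "Server is down": ["Switch1 is down"],
    "Switch1 is down": ["Switch2 is down"],
    "WEB App is down": ["MySQL is not responsive"],
    "MySQL is not responsive": ["No free disk space on /tmp"],
}

def is_suppressed(problem: str, active: Set[str]) -> bool:
    """True if any problem this one depends on, directly or indirectly, is active."""
    stack = list(DEPENDS_ON.get(problem, []))
    seen: Set[str] = set()
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        if dep in active:
            return True
        stack.extend(DEPENDS_ON.get(dep, []))
    return False

def to_notify(active: Set[str]) -> List[str]:
    """Active problems that should still generate a notification."""
    return sorted(p for p in active if not is_suppressed(p, active))

active = {"Server is down", "Switch1 is down", "Switch2 is down"}
print(to_notify(active))  # only the root cause remains
```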
![Page 15: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/15.jpg)
Escalations
Different scenarios:
• Delayed notifications
• Repeated notifications
• Execution of commands
• Escalation to other users
• Recovery messages
• Different actions for acknowledged and unacknowledged events
Example (reaction to a failed WEB check):
Increase step every 5 minutes
Step 1-3: Send message to Unix Admins
Step 3-5: Send message to Boss if not ACK
Step 6: Restart Apache if not ACK
Step 7: Reboot server if not ACK
Step 10: Send message to all if not ACK
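The example above can be modelled as a table of operations keyed by step range. This is a minimal sketch under stated assumptions: the names `Operation`, `current_step` and `actions_for` are hypothetical, and the sketch assumes that step 1 starts as soon as the problem is detected.

```python
from typing import List, NamedTuple

class Operation(NamedTuple):
    first_step: int
    last_step: int
    action: str
    only_if_not_acked: bool

STEP_MINUTES = 5  # "Increase step every 5 minutes"

# The operations from the example above.
OPERATIONS = [
    Operation(1, 3, "Send message to Unix Admins", False),
    Operation(3, 5, "Send message to Boss", True),
    Operation(6, 6, "Restart Apache", True),
    Operation(7, 7, "Reboot server", True),
    Operation(10, 10, "Send message to all", True),
]

def current_step(minutes_since_problem: int) -> int:
    """Assumption: step 1 starts immediately, each further step 5 minutes later."""
    return minutes_since_problem // STEP_MINUTES + 1

def actions_for(step: int, acknowledged: bool) -> List[str]:
    """Actions to run at a given escalation step."""
    return [
        op.action
        for op in OPERATIONS
        if op.first_step <= step <= op.last_step
        and not (op.only_if_not_acked and acknowledged)
    ]

print(actions_for(current_step(10), acknowledged=False))
print(actions_for(current_step(25), acknowledged=True))
```

Ten minutes in (step 3) both the admins and the boss are notified; at 25 minutes (step 6) an acknowledged problem no longer triggers the Apache restart.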
![Page 16: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/16.jpg)
Visualisation: Dashboard
Favourite resources:
• Maps
• Graphs
• Screens
High-level view:
• Problems by host group
• Zabbix statistics
• List of the latest issues
• WEB monitoring info
• Auto-discovery
![Page 17: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/17.jpg)
![Page 18: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/18.jpg)
Visualisation: Graphs
Immediate access:
• Any period of time
• Easy time-navigation
• Two mouse-click zooming
• Problem conditions displayed
• Non-working time is marked
• Not generated in advance!
Graph types:
• Standard (dots, lines, colors)
• Stacked
• Pie
![Page 19: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/19.jpg)
![Page 20: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/20.jpg)
Visualisation: Screens
Different blocks:
• Graphs
• Maps
• Plain text data
• List of problems
• High level stats
Slide shows:
• Combination of screens
• Displayed one after another
![Page 21: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/21.jpg)
![Page 22: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/22.jpg)
WEB monitoring
Goals:
• Monitoring of user experience
• Support of complex scenarios
• Performance monitoring
• Availability monitoring
Example:
Step 1 Access home page
Step 2 Login (POST, GET)
Step 3 Run report
Step 4 Logout
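The four-step example above can be sketched as a list of steps that are checked in order until one fails. This is an illustration, not how Zabbix implements WEB monitoring: the URLs, the `required_text` check and the injected `fetch` function are all assumptions, chosen so that the sketch runs without network access.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple

class Step(NamedTuple):
    name: str
    url: str            # hypothetical URLs, for illustration only
    required_text: str  # text that must appear in the response body

# fetch(url) -> (HTTP status code, response body); injected so the sketch runs offline.
Fetch = Callable[[str], Tuple[int, str]]

def run_scenario(steps: List[Step], fetch: Fetch) -> Optional[str]:
    """Run steps in order; return the name of the first failed step, or None."""
    for step in steps:
        status, body = fetch(step.url)
        if status != 200 or step.required_text not in body:
            return step.name
    return None

SCENARIO = [
    Step("Access home page", "http://example.com/", "Welcome"),
    Step("Login", "http://example.com/login", "Logged in"),
    Step("Run report", "http://example.com/report", "Report"),
    Step("Logout", "http://example.com/logout", "Goodbye"),
]

def fake_fetch(url: str) -> Tuple[int, str]:
    """Stand-in for a real HTTP client: the report page is broken."""
    pages = {
        "http://example.com/": (200, "Welcome"),
        "http://example.com/login": (200, "Logged in as admin"),
        "http://example.com/report": (500, "Internal error"),
        "http://example.com/logout": (200, "Goodbye"),
    }
    return pages[url]

print(run_scenario(SCENARIO, fake_fetch))
```

Reporting the first failed step, rather than a plain up/down flag, is what makes a multi-step scenario useful for locating the broken part of an application.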
![Page 23: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/23.jpg)
![Page 24: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/24.jpg)
IT Services
Goals:
• Business level monitoring
• SLA monitoring
• We care about services
• Escalation of problems
• Root cause of the problem
Tree structure based on:
• Dependencies
• Physical location
• Type of service, etc.
![Page 25: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/25.jpg)
![Page 26: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/26.jpg)
User management
Authentication:
• Standard: Zabbix database
• LDAP (Active Directory)
• Apache (Kerberos, Unix, etc.)
Permissions:
• Depend on user type
• User group level permissions
Also:
• Notifications-only user groups
![Page 27: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/27.jpg)
Extending Zabbix
New Zabbix agent-side checks:
UserParameter=mysql.qps,mysqladmin -uroot status|cut -f9 -d":"
UserParameter=sum[*],echo "$1+$2"|bc
Examples: mysql.qps = 456, sum[4,5] = 9
New notification methods:
• Just a matter of writing a shell script (voice generation, Skype call, anything)
New server-side checks:
• Just a matter of writing a shell script
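How a flexible key such as `sum[*]` turns `sum[4,5]` into a shell command can be sketched as follows. This is a simplified illustration, not the agent's actual parser: the helpers `parse_key` and `expand` are hypothetical, and quoted or nested parameters are deliberately not handled.

```python
import re
from typing import List, Tuple

def parse_key(key: str) -> Tuple[str, List[str]]:
    """Split an item key such as 'sum[4,5]' into ('sum', ['4', '5'])."""
    match = re.fullmatch(r"([^\[\]]+)(?:\[(.*)\])?", key)
    if match is None:
        raise ValueError(f"malformed key: {key!r}")
    name, raw_params = match.group(1), match.group(2)
    params = raw_params.split(",") if raw_params else []
    return name, params

def expand(command: str, params: List[str]) -> str:
    """Replace $1..$9 in the command template with the positional parameters."""
    def substitute(m: "re.Match[str]") -> str:
        index = int(m.group(1)) - 1
        return params[index] if index < len(params) else ""
    return re.sub(r"\$([1-9])", substitute, command)

name, params = parse_key("sum[4,5]")
print(name, params)
print(expand('echo "$1+$2"|bc', params))
```

The expanded command is what would be handed to the shell; running it through `bc` gives the value 9 shown in the example above.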
![Page 28: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/28.jpg)
Monitoring of large environments
![Page 29: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/29.jpg)
Our environment
Situation:
• Several thousand servers and network devices
• Distributed across 2-100 data centers or branches
• Centralised monitoring is required
![Page 30: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/30.jpg)
Zabbix: several approaches
1 Server:
• One Zabbix server does everything
1 Server, Many Proxies:
• One Zabbix server
• One Proxy per data center or company branch
Distributed:
• One Zabbix server per data center
• More effort to maintain
• Can be used with Proxies
![Page 31: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/31.jpg)
What is Proxy?
A Proxy is a data collector. It is also used for auto-discovery.
Advantages:
• Simplifies the architecture
• Does not require significant resources
• Offloads the Zabbix server
![Page 32: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/32.jpg)
Proxy: how does it work?
Connection loss processing:
• Data is buffered in the Proxy database
• Will be sent on connection recovery
• No notifications about local problems!
Management:
• Data collection only
• Fully managed via WEB front-end
• Configuration is stored on the Zabbix server side
• All connections are initiated by Proxy
• Collection of thousands of values per second
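The buffer-and-resend behaviour described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Proxy implementation: the class `BufferingCollector` and the `send` callback are hypothetical, and an in-memory queue stands in for the Proxy database.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Value = Tuple[str, float]  # (item key, value)

class BufferingCollector:
    """Keeps collected values locally until they can be delivered upstream."""

    def __init__(self, send: Callable[[Value], bool]) -> None:
        self._send = send  # returns True when the server accepted the value
        self._buffer: Deque[Value] = deque()

    def collect(self, value: Value) -> None:
        """Store a new value, then try to deliver everything that is pending."""
        self._buffer.append(value)
        self.flush()

    def flush(self) -> None:
        """Send buffered values oldest first; stop at the first failure."""
        while self._buffer:
            if not self._send(self._buffer[0]):
                return  # connection is down, keep the data
            self._buffer.popleft()

    @property
    def pending(self) -> int:
        return len(self._buffer)

# Simulated connection state and server-side storage.
link: Dict[str, bool] = {"online": False}
received: List[Value] = []

def send(value: Value) -> bool:
    if link["online"]:
        received.append(value)
    return link["online"]

proxy = BufferingCollector(send)
proxy.collect(("cpuload", 0.5))
proxy.collect(("cpuload", 0.7))
print(proxy.pending, received)  # values wait while the link is down

link["online"] = True
proxy.flush()
print(proxy.pending, received)  # everything delivered in order
```

A value is removed from the buffer only after the send succeeds, so a connection that drops mid-flush loses nothing.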
![Page 33: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/33.jpg)
Distributed monitoring
Basic attributes:
• Tree-like structure
• A node is a Zabbix server
• Nodes are platform independent
Management:
• Two-way replication of configuration
• Parent node controls child nodes
![Page 34: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/34.jpg)
Processing of connection loss
What will stop working?
• Data sending to parent node
• Synchronisation of configuration
Everything else will keep working!
![Page 35: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/35.jpg)
Thousands of devices: solutions
Problems and solutions:
• Huge data volume: use database partitions for historical data
• Integration with existing systems: LDAP authentication, notification methods to open tickets, XML import/export for configuration management and inventory
• Maintenance: templates, mass updates
• Upgrades: all Zabbix components are compatible within one major release (1.6.x)
![Page 36: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/36.jpg)
Choice of the best schema
1 Server: getting used to Zabbix, adopting Open Source
1 Server, Many Proxies: adding Proxies
Distributed: distributed monitoring
Depends on the requirements:
• Local administration
• Full-featured monitoring when there is no connection between data centers (branches)
![Page 37: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/37.jpg)
Zabbix Roadmap
![Page 38: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/38.jpg)
General directions
General:
• Better integration
• REST API/RPC
• Better scalability
GUI:
• Flexible Dashboard
• Personalization (widgets)
Other:
• Infrastructure for widgets
• Business level monitoring
![Page 39: Alexei vladishev - Open Source Monitoring With Zabbix](https://reader031.vdocuments.mx/reader031/viewer/2022012308/547e047eb4af9fb9158b55b2/html5/thumbnails/39.jpg)
Questions?
Today and tomorrow I am around!