  • Grafana for Messaging Monitoring

    Lionel Cons – CERN IT/CM/MM

    17 Mar 2016 Grafana for Messaging Monitoring 1

  • History

    EGEE then EGI messaging brokers monitoring

    • multi-site effort: AUTH, CERN and SRCE

    • Nagios based

    Then added metrics via PNP4Nagios (so RRD).

    Then switched to Graphite.

    Then used only for the CERN brokers.

    Then removed Nagios but kept Graphite.

    Then added Grafana.


    https://www.nagios.org/

    https://docs.pnp4nagios.org/start

    http://oss.oetiker.ch/rrdtool/

    http://graphite.readthedocs.org/en/latest/

    http://grafana.org/

  • Requirements

    • fine-grained metrics (e.g. number of messages stored for client X on broker Y on host Z)

    • medium refresh rate: 1 update / minute

    • multi-year retention

    • live/streaming access

    • advanced metric-based alerts

    • user controlled graphs

    • easy to build dashboards


  • Architecture

    • Production

    • Transport: ActiveMQ

    • Storage: Graphite

    • Visualization: Grafana

    • Analysis: Esper

    ☞ IT/TF: Advanced monitoring with complex stream processing

    https://indico.cern.ch/event/382420/

  • Numbers

    • 37 brokers on 16+10 hosts

    • ~100 metrics, ~8200 metric instances

    • ~140 metric updates per second

    • ~300 IOPS for Graphite data

    • m1.large VM with Ceph io1 volume

    • ~3 GB in total for 5 years worth of data

    • ~3 messages per second
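
The update rate above can be cross-checked against the other figures. The following is a minimal sketch, assuming every metric instance is updated once per minute (the refresh rate listed in the requirements); the variable names are illustrative only.

```python
# Figures quoted on the slides.
metric_instances = 8200          # "~8200 metric instances"
updates_per_minute_each = 1      # assumption: "1 update / minute" applies to every instance

# Expected aggregate update rate, in updates per second.
updates_per_second = metric_instances * updates_per_minute_each / 60

print(f"expected updates per second: {updates_per_second:.0f}")
```

Under that assumption the result is close to the ~140 updates per second quoted on the slide.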


  • Graphite

    • widely used and available in EPEL

    • slowly evolving project

    • version 0.9.15 released on 27/11/2015

    • excellent web API with many functions

    • time series stored using whisper

    • next generation storage started: ceres

    • very slow progress there…
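
To get a feel for what whisper storage implies, here is a rough sketch of a per-metric storage budget. It assumes the ~3 GB and ~8200 metric instances quoted on the Numbers slide, decimal gigabytes, that the 3 GB is essentially all whisper data, and whisper's 12 bytes per stored data point (a 4-byte timestamp plus an 8-byte value). The actual retention configuration is not given in these slides, so this is an estimate, not a description of the real setup.

```python
WHISPER_POINT_BYTES = 12         # 4-byte timestamp + 8-byte double per data point

total_bytes = 3e9                # "~3 GB", assuming decimal gigabytes
metric_instances = 8200          # "~8200 metric instances"

# Average storage and data points available per metric instance.
bytes_per_instance = total_bytes / metric_instances
points_per_instance = bytes_per_instance / WHISPER_POINT_BYTES

# Points needed to keep 5 years at one point per minute (leap days ignored).
minutes_in_5_years = 5 * 365 * 24 * 60

print(f"bytes per metric instance:  {bytes_per_instance:,.0f}")
print(f"points per metric instance: {points_per_instance:,.0f}")
print(f"minutes in 5 years:         {minutes_in_5_years:,}")
print(f"ratio:                      {minutes_in_5_years / points_per_instance:.0f}x")
```

If these assumptions hold, the budget is far smaller than five years of one-minute points, which would suggest that older data is kept at a coarser resolution, as whisper's multi-archive retention allows.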


    http://graphite.readthedocs.org/en/latest/functions.html

    https://github.com/graphite-project/whisper

    https://github.com/graphite-project/ceres

  • Graphite Web API

    https://mig-graphite.cern.ch/render?target=msgbrk.received_messages.*.atlas

    https://mig-graphite.cern.ch/render?target=scaleToSeconds(nonNegativeDerivative(msgbrk.received_messages.*.atlas),1)

    https://mig-graphite.cern.ch/render?target=highestAverage(scaleToSeconds(nonNegativeDerivative(msgbrk.received_messages.*.atlas),1),5)

    https://mig-graphite.cern.ch/render?target=highestAverage(scaleToSeconds(nonNegativeDerivative(msgbrk.received_messages.*.atlas),1),5)&width=800&height=400&from=-3month&title=Received%20Messages%20(Hz)&bgcolor=white&fgcolor=black&fontSize=10

    https://mig-graphite.cern.ch/render?target=highestAverage(scaleToSeconds(nonNegativeDerivative(msgbrk.received_messages.*.atlas),1),5)&width=800&height=400&from=-3month&title=Received%20Messages%20(Hz)&bgcolor=white&fgcolor=black&fontSize=10&format=json
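
The URLs on this slide are built up step by step: a raw series, then nested functions, then rendering options, then JSON output. The sketch below shows one way to assemble the same URL programmatically and to read a JSON response. The helper names (`call`, `render_url`, `mean_per_target`) and the sample payload are hypothetical and made up for illustration; only the host, metric path, function names and parameters come from the slide. No request is actually sent.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "https://mig-graphite.cern.ch/render"  # host as shown on the slide


def call(function, series, *args):
    """Wrap a series expression in a Graphite function call (hypothetical helper)."""
    return f"{function}({','.join([series, *map(str, args)])})"


def render_url(target, options=None):
    """Build a render URL; '(', ')', '*' and ',' stay unescaped for readability."""
    params = {"target": target, **(options or {})}
    return BASE_URL + "?" + urlencode(params, safe="()*,", quote_via=quote)


def mean_per_target(payload):
    """Average each series of a format=json response, skipping null values."""
    result = {}
    for series in json.loads(payload):
        values = [v for v, _timestamp in series["datapoints"] if v is not None]
        result[series["target"]] = sum(values) / len(values) if values else None
    return result


# Same nesting as on the slide: counter -> per-second rate -> top 5 by average.
target = call("nonNegativeDerivative", "msgbrk.received_messages.*.atlas")
target = call("scaleToSeconds", target, 1)
target = call("highestAverage", target, 5)

url = render_url(target, {
    "width": 800, "height": 400, "from": "-3month",
    "title": "Received Messages (Hz)",
    "bgcolor": "white", "fgcolor": "black", "fontSize": 10,
    "format": "json",
})
print(url)

# Made-up sample in the shape Graphite returns for format=json:
# a list of series, each with [value, timestamp] pairs.
sample = ('[{"target": "example", "datapoints": '
          '[[1.0, 1458200000], [null, 1458200060], [3.0, 1458200120]]}]')
print(mean_per_target(sample))
```

Passing the options as a dictionary rather than keyword arguments is deliberate, since `from` is a reserved word in Python.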


  • Grafana

    • started as better Graphite GUI

    • Grafana was to Graphite what Kibana was to ES

    • very active development

    • now supports many time series databases

    • production: Graphite, InfluxDB, OpenTSDB

    • experimental: KairosDB, Prometheus

    • ... and also ElasticSearch

    • fits our requirements very well


  • Grafana Dashboards


  • Grafana Comments

    • we only use a small subset of Grafana

    • some features are missing (e.g. template options like stacking) but will probably come

    • the dashboard editor is very good for prototyping (especially the query builder)

    • JSON editing is however often needed

    • using a database to store configuration information (e.g. available data sources) unnecessarily complicates the deployment


  • Summary

    • for the metrics to monitor our brokers we use:

    • Graphite to store and visualize (low level)

    • Grafana to visualize (high-level)

    • this combination fulfills all our requirements

    • a single VM (with Ceph) is enough at our scale

    • we are happy with this simple solution
