Scaling monitoring with Datadog
![Page 1: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/1.jpg)
Scaling monitoring with
Datadog
![Page 2: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/2.jpg)
Agenda
. What is Datadog?
. How does it help me scale monitoring?
![Page 3: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/3.jpg)
Thank you #$%! Peter and Denise...
![Page 4: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/4.jpg)
Thank you Dear Sponsor !
![Page 5: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/5.jpg)
Thank you Dear Host !
![Page 6: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/6.jpg)
whoami
Alexis Midon
Backend Engineer
a year in DevOps, by accident
![Page 7: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/7.jpg)
environment
![Page 8: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/8.jpg)
What is Datadog?
![Page 9: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/9.jpg)
Datadog
. Monitoring service
. agent based
. integrated with AWS
. resource tagging
![Page 10: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/10.jpg)
Datadog - cont’d
. Metrics and Alerts
. Correlation features
. Collaboration features
. Custom Dashboards
![Page 11: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/11.jpg)
Event Stream
![Page 12: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/12.jpg)
Default Instance Dashboard
![Page 13: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/13.jpg)
![Page 14: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/14.jpg)
nice, but...
![Page 15: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/15.jpg)
How do you help me deal with:
N components: mongo, redis, nodejs, ...
x P environments: prod-1, prod-2, staging, …
x Q versions: app-blue, app-green, etc
x R users
![Page 16: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/16.jpg)
help me scale!
![Page 17: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/17.jpg)
How does Datadog help?
![Page 18: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/18.jpg)
#1 pre-canned tools
pre-canned integrations
pre-canned dashboards
![Page 19: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/19.jpg)
#2 Templated Dashboards
A dashboard can have multiple variables.
Edit once, and re-use.
$environment $zone $tier $asg ...
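A minimal sketch of the "edit once, re-use" idea: the graph queries are scoped by `$variables` rather than hard-coded tag values. The helper name `templated_dashboard` is hypothetical, and the field names (`template_variables`, `prefix`, `default`, `graphs`) are assumptions modeled on Datadog's timeboard JSON; check them against the API reference before relying on them.

```ruby
require 'json'

# Hypothetical helper: builds a dashboard definition whose queries are
# scoped by template variables. Field names are assumptions, not a
# confirmed Datadog schema.
def templated_dashboard(title, variables, metrics)
  # "$environment,$tier" style scope shared by every graph
  scope = variables.map { |name| "$#{name}" }.join(',')
  {
    'title' => title,
    'template_variables' => variables.map do |name|
      { 'name' => name, 'prefix' => name, 'default' => '*' }
    end,
    'graphs' => metrics.map do |metric|
      {
        'title' => metric,
        'definition' => { 'requests' => [{ 'q' => "avg:#{metric}{#{scope}}" }] }
      }
    end
  }
end

dashboard = templated_dashboard('App tier', %w[environment tier],
                                %w[system.cpu.user system.load.1])
puts JSON.pretty_generate(dashboard)
```

The same definition can then serve prod-1, prod-2 and staging; only the variable values change.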
![Page 20: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/20.jpg)
a template example
![Page 21: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/21.jpg)
not bad but
![Page 22: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/22.jpg)
Gimme API !!!
We can code:
. instance configuration
. infrastructure
Why not monitoring?
![Page 23: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/23.jpg)
#3 Datadog has a great API.
. events, metrics, tags, dashboards, alerts, …
. bindings for python, ruby, node.js, etc
. command-line
![Page 24: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/24.jpg)
plain json + curl
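The slide's screenshot is not in the transcript, so here is a rough equivalent of "plain JSON over HTTP" using only Ruby's standard library instead of curl. It builds the request without sending it. The endpoint path, the `DD-API-KEY` header, and the event fields are assumptions based on Datadog's v1 events API; the key is a placeholder.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds (but does not send) a POST request that would create an event.
# Endpoint and header name are assumptions; verify against the API docs.
def build_event_request(api_key, title, text, tags)
  uri = URI('https://api.datadoghq.com/api/v1/events')
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request['DD-API-KEY'] = api_key
  request.body = JSON.generate({ 'title' => title, 'text' => text, 'tags' => tags })
  request
end

request = build_event_request('<your-api-key>', 'Deploy finished',
                              'app-blue is live', ['environment:staging'])
puts request.body

# To actually send it (needs network access and a valid key):
# Net::HTTP.start(request.uri.host, request.uri.port, use_ssl: true) do |http|
#   http.request(request)
# end
```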
![Page 25: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/25.jpg)
using the ruby gem
![Page 26: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/26.jpg)
![Page 27: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/27.jpg)
Now, I can:
. version control my dashboards, alerts
. code my monitoring resources
. integrate with my provisioning tool
![Page 28: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/28.jpg)
Integration example:
CloudFormation++
![Page 29: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/29.jpg)
CloudFormation++
. a CFN template usually has related dashboards and alerts
. same life-cycle
e.g. app tier:
. dashboards for ELB and front-end instances
. alerts on HTTP errors, etc.
![Page 30: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/30.jpg)
CloudFormation++
stack = CloudFormation + Datadog
$ rake stack:app:create
executing stack:app:cloudformation:create
executing stack:app:datadog:create
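One way such a task tree could be wired up with Rake namespaces and prerequisites. This is a sketch, not the speaker's Rakefile: the task bodies are placeholders and the `EXECUTED` log exists only to make the ordering visible.

```ruby
require 'rake'

# Records what ran, so the ordering is easy to inspect.
EXECUTED = []

namespace :stack do
  namespace :app do
    namespace :cloudformation do
      task :create do |t|
        EXECUTED << t.name
        puts "executing #{t.name}"
        # placeholder: create the CloudFormation stack here
      end
    end

    namespace :datadog do
      task :create do |t|
        EXECUTED << t.name
        puts "executing #{t.name}"
        # placeholder: create the dashboards and alerts here
      end
    end

    # Umbrella task: prerequisites run in the order listed.
    task create: ['cloudformation:create', 'datadog:create']
  end
end

# In a real Rakefile, drop this line and run `rake stack:app:create`.
Rake::Task['stack:app:create'].invoke
```

Because the two parts are prerequisites of a single task, the monitoring resources share the life-cycle of the stack they describe.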
![Page 31: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/31.jpg)
in git
/stacks
  /app
    app_cfn_template.json
    app_datadog.rb
    app_http_alerts.json
![Page 32: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/32.jpg)
be creative
datadog.rb is evaluated in a rich context. It has access to everything.
very flexible.
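The slides do not show how the script is evaluated. One plausible mechanism is `instance_eval` against a context object, sketched below. `StackContext`, its attributes, the `dashboard` verb, and the query string are all hypothetical names for illustration.

```ruby
# Hypothetical context object: everything the monitoring script may need.
class StackContext
  attr_reader :stack_name, :environment, :outputs, :dashboards

  def initialize(stack_name, environment, outputs)
    @stack_name = stack_name
    @environment = environment
    @outputs = outputs # e.g. values produced by the CloudFormation stack
    @dashboards = []
  end

  # Small DSL verb available inside the evaluated script.
  def dashboard(title, queries)
    @dashboards << { 'title' => title, 'queries' => queries }
  end
end

# Stand-in for the contents of a file such as app_datadog.rb.
SCRIPT = <<~'RUBY'
  dashboard "#{stack_name} (#{environment}) ELB",
            ["sum:aws.elb.request_count{name:#{outputs['elb_name']}}"]
RUBY

context = StackContext.new('app', 'staging', { 'elb_name' => 'app-staging-elb' })
# The script runs with the context as `self`, so it sees every attribute.
context.instance_eval(SCRIPT, 'app_datadog.rb')
p context.dashboards
```

The script stays short because names, environments and stack outputs come from the context rather than being repeated in each file.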
![Page 33: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/33.jpg)
Cons / Pain points :-(
. still have to deal with some json
. room for drift - if users manually edit resources
. resource tracking can be tricky
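Drift can at least be detected by comparing the definition kept in git with the one currently live. A minimal sketch, assuming both are available as JSON; the alert fields shown are made up for illustration, and fetching the live definition from the API is left out.

```ruby
require 'json'

# Returns the keys whose values differ between the version-controlled
# definition and the live one. Both arguments are plain hashes.
def drifted_keys(versioned, live)
  (versioned.keys | live.keys).reject { |key| versioned[key] == live[key] }.sort
end

# Illustrative alert definitions; the field names are hypothetical.
versioned = JSON.parse('{"name": "HTTP 5xx", "threshold": 10, "notify": "@ops"}')
live      = JSON.parse('{"name": "HTTP 5xx", "threshold": 50, "notify": "@ops"}')

p drifted_keys(versioned, live) # prints ["threshold"]
```

A non-empty result means someone edited the resource by hand; the stack task could then warn, or overwrite the live copy from git.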
![Page 34: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/34.jpg)
Pros
. monitoring as code
. all the benefits of using code: tests, versioning, tracking, DRY, bugs, ...
![Page 35: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/35.jpg)
Summary
Go code your Monitoring,
with the awesome Datadog API.
![Page 36: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/36.jpg)
impressed ?
![Page 37: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/37.jpg)
Thank you!
![Page 38: Scaling monitoring with Datadog](https://reader030.vdocuments.mx/reader030/viewer/2022020105/5583ec37d8b42a2a4d8b4d21/html5/thumbnails/38.jpg)
Scaling monitoring with
Datadog