Enterprise Drupal Application & Hosting Infrastructure Level Monitoring
Daniel Kanchev
Senior Site Reliability Engineer
@dvkanchev
Enterprise Drupal Hosting Characteristics
○ Consists of multiple servers
○ Provides high availability
○ Offers auto scalability
○ Requires multiple services to work as expected
○ Really expensive
○ Nobody wants to manage this sh*t :)
Hosting Types Complexity
○ Shared Hosting Service
○ Single Virtual Server
○ Single Dedicated Server
○ PaaS
○ Custom Private/Public Clouds
○ ElasticSearch/Solr
○ Redis/Memcached
○ GraphQL
○ MongoDB
○ Node.js
○ Gearman
○ CI systems
One Monitoring To Rule Them All
• Website Monitoring
• Hosting Infrastructure Monitoring
Website Monitoring Architecture
Website
London Amsterdam Munich
503 Service Unavailable
Incidents
○ Critical Incident - website is down from all locations
○ Major Incident - website is down from a single location; MySQL replication is broken; PHP fatal errors recorded in the logs; read-only file system issue
○ Minor Incident - Memcached/Redis on a single server is down
○ Notice Incident - web node X is running out of space; PHP warnings recorded in the logs
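The website-level part of this severity scale can be sketched in a few lines of Python. This is only an illustration: the `Severity` enum, the `classify_website_check` function, and the idea of passing results as a location-to-boolean mapping are assumptions, not part of the talk. The slide only defines "down from a single location" as major; treating "down from several but not all locations" as major as well is also an assumption.

```python
from enum import Enum


class Severity(Enum):
    OK = "ok"
    MAJOR = "major"
    CRITICAL = "critical"


def classify_website_check(results: dict[str, bool]) -> Severity:
    """Map per-location probe results (True = website reachable) to a severity."""
    if not results:
        raise ValueError("at least one probe location is required")
    failed = [location for location, ok in results.items() if not ok]
    if not failed:
        return Severity.OK
    if len(failed) == len(results):
        # Down from every probe location: critical incident.
        return Severity.CRITICAL
    # Down from some locations only (assumption: same level as "a single location").
    return Severity.MAJOR


# Hypothetical probe results for the three locations named on the slides.
print(classify_website_check({"London": True, "Amsterdam": False, "Munich": True}))
```

Keeping the classification separate from the probing itself makes it easy to add locations without touching the alerting rules.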
Core Principles
○ Log all events and archive them. Write postmortem reports
○ Check every single incident - even minor ones and notices
○ Define performance limits and regularly check reports
○ Beware of cascade failures
○ Always strive to go back to pre-incident state
○ Check one thing at a time and return “OK” or “Failure”
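As an illustration of the last principle, here is a minimal sketch of a check that looks at exactly one thing and answers only "OK" or "Failure". The function name `check_disk_space` and the 10% free-space threshold are assumptions chosen for the example, not values from the talk.

```python
import shutil


def check_disk_space(path: str = "/", min_free_ratio: float = 0.10) -> str:
    """Check one thing only: is enough of the filesystem at `path` still free?"""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if usage.total == 0:
        return "Failure"
    free_ratio = usage.free / usage.total
    return "OK" if free_ratio >= min_free_ratio else "Failure"


print(check_disk_space("/"))
```

A check this small is easy to schedule, easy to reason about during an incident, and does not hide one problem behind another.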
Examples
Example 1: app server failure
○ 1 of 5 app servers goes down
○ Its 20% share of the traffic shifts to the other 4, so with even load balancing the load on each rises by about 25%
○ Redis caches are invalidated, causing overload
○ Varnish is restarted by a system administrator to apply a configuration change
○ App servers start to return 503 errors
Example 2: MySQL failover
○ MySQL master goes down
○ MySQL slave 1 takes over; at this moment there is no downtime
○ MySQL slave 2 is behind the new master
○ The new MySQL master goes down too
○ The result is a broken or outdated database
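The load shift in the app server example can be checked with a little arithmetic. Assuming traffic is spread evenly, losing one of five servers moves its 20% share of the total onto the remaining four, which is a 25% relative increase for each of them. The function name below is hypothetical and the even-balancing assumption is a simplification.

```python
def load_increase_after_failures(total_servers: int, failed_servers: int) -> float:
    """Relative load increase per surviving server, assuming evenly balanced traffic."""
    surviving = total_servers - failed_servers
    if total_servers <= 0 or failed_servers < 0 or surviving <= 0:
        raise ValueError("need at least one surviving server")
    # Each survivor goes from 1/total of the traffic to 1/surviving of it.
    return total_servers / surviving - 1.0


increase = load_increase_after_failures(total_servers=5, failed_servers=1)
print(f"{increase:.0%}")  # prints 25%
```

The same function shows why a second failure hurts more than the first: with two of five servers down, each survivor carries about 67% more load than before.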
KEY TAKEAWAYS
1. Embrace Failure and Design for Failure
2. Automate Recovery
3. Log all incidents and analyse them
4. Measure and graph the performance of all components
5. Regularly break things on purpose in order to test
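Takeaways 2 and 3 can be combined in a small sketch: run a check, try an automated recovery action when it fails, and log every step so the incident can be analysed later. The names `check_and_recover`, `fake_check`, and `fake_restart`, and the limit of three attempts, are assumptions made for this example; a real recovery action would restart a service or fail over to a replica.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("recovery")


def check_and_recover(
    name: str,
    check: Callable[[], bool],
    recover: Callable[[], None],
    max_attempts: int = 3,
) -> bool:
    """Run `check`; on failure call `recover` and re-check, up to `max_attempts` times."""
    if check():
        return True
    for attempt in range(1, max_attempts + 1):
        logger.warning("%s failed; recovery attempt %d of %d", name, attempt, max_attempts)
        recover()
        if check():
            logger.info("%s recovered after %d attempt(s)", name, attempt)
            return True
    logger.error("%s still failing after %d attempts; escalating", name, max_attempts)
    return False


# Stand-in service state for demonstration purposes only.
state = {"up": False}


def fake_check() -> bool:
    return state["up"]


def fake_restart() -> None:
    state["up"] = True


print(check_and_recover("memcached", fake_check, fake_restart))
```

Bounding the number of attempts matters: a recovery loop that never gives up can mask a real outage instead of escalating it.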
RESOURCES
Injecting Failure at Netflix - goo.gl/YE1sEY
What is SRE - goo.gl/2lI8E0
SRE book - goo.gl/bfL2At
Netflix Open Source Software - https://netflix.github.io/
Etsy “Measure Everything” - goo.gl/CPVUT5
JOIN US FOR CONTRIBUTION SPRINTS
First Time Sprinter Workshop - 9:00-12:00 - Room Wicklow 2A
Mentored Core Sprint - 9:00-18:00 - Wicklow Hall 2B
General Sprints - 9:00-18:00 - Wicklow Hall 2A
Evaluate This Session
THANK YOU!
events.drupal.org/dublin2016/schedule
WHAT DID YOU THINK?