Metrics Driven DevOps
Delivering High Quality Software Faster!
Andreas Grabner (@grabnerandi) – [email protected]
Dynatrace Personal License: http://bit.ly/dtpersonal
700 deployments / year
10 + deployments / day
50 – 60 deployments / day
Every 11.6 seconds
Not only delivered fast but also delivering fast!
60% rate performance/response time as the #1 mobile app expectation, ahead of features and functionality
Source: Forrester Research 2014
It's not about blind automation of pushing more bad code on new stacks through a pipeline
It's not about blindly giving everyone Ops power to deploy changes only tested locally
Technical Debt
Business Debt
Organizational Rust
Status Quo
NEXT EXIT
"All-In Agile: Across the Pipeline"
"We don't log bugs! We fix them!"
"All Manual Testers Automate!"
Adam Auerbach (@bugman31)
All-In Agile: Use application metrics as Feedback Loop & Quality Gates
Dev & Test: Personal License to Stop Bad Code when it gets created!
Tip: Don't leave your IDE!
Continuous Integration: Auto-Stop Bad Builds based on App Metrics from Unit, Integration, and Perf Tests
Tip: Integrate with Jenkins, Bamboo, ...
Prod: Monitor Usage and Runtime Behavior per Service, User Action, Feature ...
Tip: Stream to ELK, Splunk and Co ...
Automated Tests: Identify Non-Functional Problems by looking at App Metrics
Tip: Feed data back into your test tool!
What you currently measure
What you should measure
Quality Metrics in your pipeline:
# Test Failures
Execution Time per test
# Calls to API
# Executed SQL statements
# Web Service Calls
# JMS Messages
# Objects Allocated
# Exceptions
# Log Messages
# HTTP 4xx/5xx
Request/Response Size
Page Load/Rendering Time
…
WHO is using it?
Did we build the RIGHT thing?
Online Sportsclub Search Service
[Chart: Response Time and Users over time (20xx, 2014, 2015, 2016+)]
1) Started as a small project
2) Slowly growing user base
3) Expanding to new markets – 1st performance degradation!
4) Adding more markets – performance becomes a business impact
5) Potentially start losing users
Early 2015: Monolithic App
Can't scale vertically endlessly!
2.68s Load Time
94.09% CPU Bound
Proposal: Service approach!
Front End to Cloud
Scale Backend in Containers!
7:00 a.m.: Low load and service running on minimum redundancy
12:00 p.m.: Scaled up service during peak load with failover of problematic node
7:00 p.m.: Scaled down again to lower load and move to different geo location
Testing the Backend Service alone scales well …
Go live – 7:00 a.m.
Go live – 12:00 p.m.
What Went Wrong?
26.7s Load Time
5kB Payload
33! Service Calls
99kB - 3kB for each call!
171! Total SQL Count
Architecture Violation: Direct access to DB from frontend service
Key Metrics: Top Search Query end-to-end
The fixed end-to-end use case
“Re-architect” vs. “Migrate” to Service-Orientation
2.5s (vs 26.7s) Load Time
5kB Payload
1! (vs 33!) Service Call
5kB (vs 99kB) Payload!
3! (vs 171) Total SQL Count
With App Metrics from & for Dev(to)Ops
Scenario: Monolithic App with 2 Key Features
Re-architecture -> Performance Fixes

Columns: Use Case Tests and Monitors (Build, Use Case, Stat), Service & App Metrics (# API Calls, # SQL, Payload, CPU), Ops (# ServInst, Usage, RT)

Build  Use Case       Stat  # API Calls  # SQL  Payload  CPU    # ServInst  Usage  RT
17     testNewsAlert  OK    1            5      2kb      70ms   1           0.5%   7.2s
17     testSearch     OK    1            3      5kb      120ms  1           63%    5.2s
25     testNewsAlert  OK    1            4      1kb      60ms
25     testSearch     OK    34           171    104kb    550ms
26     testNewsAlert  OK    1            4      1kb      60ms   1           0.6%   4.2s
26     testSearch     OK    2            3      10kb     150ms  5           75%    2.5s
35     testNewsAlert  -     -            -      -        -      -           -      -
35     testSearch     OK    2            3      10kb     150ms  8           80%    2.0s
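The build table lends itself to an automated comparison. The following is a minimal sketch under stated assumptions, not the tooling shown in the talk: it compares the testSearch metrics of a build against Build 17 as the baseline and reports every metric that grew by more than a tolerance factor. The names `UseCaseMetrics` and `find_regressions` and the factor of 1.5 are hypothetical choices for illustration; the numbers are taken from the table.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class UseCaseMetrics:
    """Architectural metrics captured for one use case test (hypothetical record)."""
    api_calls: int
    sql_count: int
    payload_kb: float
    cpu_ms: float


def find_regressions(baseline, current, max_factor=1.5):
    """Return {metric: (old, new)} for every metric that grew beyond max_factor."""
    regressions = {}
    for field in fields(UseCaseMetrics):
        old = getattr(baseline, field.name)
        new = getattr(current, field.name)
        if new > old * max_factor:
            regressions[field.name] = (old, new)
    return regressions


# testSearch rows from the table above
build_17 = UseCaseMetrics(api_calls=1, sql_count=3, payload_kb=5, cpu_ms=120)
build_25 = UseCaseMetrics(api_calls=34, sql_count=171, payload_kb=104, cpu_ms=550)
build_26 = UseCaseMetrics(api_calls=2, sql_count=3, payload_kb=10, cpu_ms=150)

print(find_regressions(build_17, build_25))  # all four metrics are flagged
print(find_regressions(build_17, build_26))  # only api_calls and payload_kb remain
```

With a strict factor, Build 26 is still flagged for the second API call and the doubled payload. Whether that is acceptable is a decision for the team; the tolerance is the knob to tune.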
How Dynatrace gets you there!
I am a Developer ...
... and want to only check-in good code!
#1: Don't Check In Bad Code
Step #1: Execute your tests just as you always do ...
Step #2: ... but do it with Dynatrace!
Step #3: Verify that the code works as intended
I am a Functional Tester ...
... and want to find non-functional problems
#2: Identify Non-Functional Problems
1: Loading of Homepage
2: Click on Search
WPO: Analyze W3C Timings for any obvious mistakes, e.g. too many images, ...
Architecture: Identify common problem patterns, e.g. N+1 Query Problem
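The N+1 Query Problem can be spotted by counting how often the same statement shape runs within a single request. This is a minimal sketch, assuming the SQL statements of one request are available as a list of strings; the function names, the threshold of 10, and the sample trace are hypothetical and only illustrate the pattern.

```python
import re
from collections import Counter


def normalize(sql):
    """Replace literals so statements that differ only in values compare equal."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # numeric literals
    return " ".join(sql.lower().split())


def find_n_plus_one(statements, threshold=10):
    """Return {statement shape: count} for shapes executed at least threshold times."""
    counts = Counter(normalize(s) for s in statements)
    return {shape: n for shape, n in counts.items() if n >= threshold}


# Hypothetical trace of one search request: one list query, then one query per result.
trace = ["SELECT id FROM clubs WHERE city = 'Linz'"]
trace += [f"SELECT * FROM club_details WHERE club_id = {i}" for i in range(1, 34)]

print(find_n_plus_one(trace))
```

The single list query stays below the threshold, while the per-result detail query is reported once with its repeat count.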
I am a Test Automation Engineer ...
... and want to integrate App Metrics into my tests
#3: Integrate with your Testing Tools
HTTP-based Testing: Leverage the X-Dynatrace HTTP Header Integration
HTTP-based Testing: Makes analyzing these requests much easier in Dynatrace
Pull your test tool results into Dynatrace
Compare with Dynatrace data
App Metrics (DB, Exceptions, CPU, ...) for each test step
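Tagging a test request with an extra HTTP header might look like the sketch below. The header name is taken from the slide as written; the value layout `TE=<test>;NA=<step>`, the function name `tagging_headers`, and the URL are assumptions for illustration only, so check the product documentation for the exact header name and fields before relying on it.

```python
import urllib.request

HEADER_NAME = "X-Dynatrace"  # name as written on the slide; verify for your version


def tagging_headers(test_name, step_name):
    """Return HTTP headers that label a request with its test and test step.

    Assumption: the "TE=<test>;NA=<step>" layout is illustrative only.
    """
    return {HEADER_NAME: f"TE={test_name};NA={step_name}"}


headers = tagging_headers("testSearch", "Click on Search")

# The request is built but not sent; the URL is a placeholder.
request = urllib.request.Request("http://localhost:8080/search", headers=headers)
print(headers)
```

Most HTTP-based test tools allow custom headers per request, so the same idea carries over to them.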
I am a Build Engineer ...
... and want to integrate App Metrics into my Pipeline to stop bad builds!
#1: Analyzing every Unit, Integration & REST API test
#2: Key Architectural Metrics for each test
#3: Detecting regression based on measure per Checkin
#4: Automatically Stop Bad Builds in CI
#4: Integration Powered by REST API
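A CI server usually stops a build when a step returns a non-zero exit code. The sketch below shows that last step only, assuming the per-test results were already fetched over a REST API as JSON. The JSON shape and the function name `exit_code_for` are hypothetical; a real tool's API will use its own schema.

```python
import json


def exit_code_for(report_json):
    """Return 0 if no test has metric violations, otherwise 1."""
    report = json.loads(report_json)
    failed = [t["name"] for t in report["tests"] if t["violations"]]
    for name in failed:
        print(f"Stopping build {report['build']}: {name} violates metric limits")
    return 1 if failed else 0


# Hypothetical response body, loosely modelled on Build 25 from the talk.
report_json = json.dumps({
    "build": 25,
    "tests": [
        {"name": "testNewsAlert", "violations": []},
        {"name": "testSearch", "violations": ["# API Calls", "# SQL"]},
    ],
})

print(exit_code_for(report_json))
```

In a Jenkins or Bamboo job the return value would be passed to `sys.exit()` so that the pipeline marks the build as failed.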
I am a (Dev/App)Ops ...
... and want to identify good/bad features & deployments based on Usage and App Metrics!
#5: Monitor your Services/Users in Prod
#1: Usage
Tip: UEM Conversion!
#2: Load vs Response
Tip: See unusual spikes
#3: Architectural Metrics: DB, Exceptions, Web Service Calls
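One simple way to read "Load vs Response" is to look for response time spikes that are not explained by load. The following sketch is an assumption about how that could be checked, not how any monitoring product does it: it flags samples whose response time exceeds a multiple of the median while load stays within the same multiple of its median. The sample data and the factor of 2.0 are made up for illustration.

```python
from statistics import median


def unusual_spikes(samples, factor=2.0):
    """Return timestamps where response time spikes although load does not.

    samples: list of (timestamp, requests_per_min, response_time_s) tuples.
    """
    load_median = median(load for _, load, _ in samples)
    rt_median = median(rt for _, _, rt in samples)
    return [
        ts for ts, load, rt in samples
        if rt > factor * rt_median and load <= factor * load_median
    ]


samples = [
    ("07:00", 100, 2.0),
    ("09:00", 400, 2.2),
    ("12:00", 900, 2.6),   # peak load, response time still fine
    ("15:00", 380, 6.5),   # response time jumps without extra load
    ("19:00", 120, 2.1),
]

print(unusual_spikes(samples))
```

A spike like the one at 15:00 points to a deployment or dependency problem rather than to traffic, which is where the architectural metrics (DB, exceptions, web service calls) help narrow down the cause.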
I am a (Biz)Ops ...
... and want to improve user experience & behavior
#6: UX Analysis based on UEM Data
We are DevOps
... and only deploy software that is approved by Dynatrace!
Confidential, Dynatrace, LLC
"Level up, my friends! Work on your own bright future! Don't fight the change! #devops"
Andreas Grabner (@grabnerandi)
Use application metrics as additional Quality Gates
Dev & Test: Personal License to Stop Bad Code when it gets created!
Tip: Don't leave your IDE!
Automated Tests: Identify Non-Functional Problems by looking at App Metrics
Tip: Feed data back into your test tool!
Continuous Integration: Auto-Stop Bad Builds based on App Metrics from Unit, Integration, and Perf Tests
Tip: Integrate with Jenkins, Bamboo, ...
Prod: Monitor Usage and Runtime Behavior per Service, User Action, Feature ...
Tip: Stream to ELK, Splunk and Co ...
Questions
Slides: slideshare.net/grabnerandi
Get Tools: bit.ly/dtpersonal
YouTube Tutorials: bit.ly/dttutorials
Contact Me: [email protected]
Follow Me: @grabnerandi
Read More: blog.dynatrace.com
Andreas Grabner
Dynatrace Developer Advocate
@grabnerandi
http://blog.dynatrace.com