performance monitoring and call tracing in microservice environments

Performance Analysis and Call Tracing in Microservice Environments, Martin Gutenbrunner, Dynatrace Innovation Lab, @MartinGoodwell, Microservice Meetup Berlin – 2016-06-30

Upload: martin-goodwell

Post on 23-Feb-2017

TRANSCRIPT

Page 1: Performance monitoring and call tracing in microservice environments

Performance Analysis and Call Tracing in Microservice Environments

Martin Gutenbrunner

Dynatrace Innovation Lab

@MartinGoodwell

Microservice Meetup Berlin – 2016-06-30

Page 2

About me

Started with Commodore 8-bit (VC-20 and C-64)

Built Null-Modem connections for playing Doom and WarCraft I

Went on to IPX/SPX networks between MS-DOS 6.22 and WfW 3.11

Did DevOps before it was a thing (mainly Java and Web) for ~10 years

Now at Dynatrace Innovation Lab

Tech Lead for Azure and Microservices

Find me on Twitter: @MartinGoodwell

Passionate about life, technology and the people behind both of them.

Page 3

Agenda

Traditional monitoring

What's wrong with it?

Performance in your code

The dramatic dilemma

Happy end

Page 4

Questions

Please, ask and interrupt anytime!

What's your occupation?

Dev, Ops, BinExec?

What's your technology stack?

Java, .net

Node.js

Who of you knows what APM is/does?

Page 5

A lil' bit o' history

Traditional monitoring was for Ops only

APM (incl. Call Tracing) is also for devs, debugging, pre-prod

Page 6

Monitoring

Page 7

Host performance

CPU-usage

Memory-usage

Disk IO

Network performance

Nagios

Page 8

What's wrong with it?

Nothing is wrong

Some things might just be out of scope

No insight into your application's performance

Page 9

Performance in your code

a.k.a. Application Performance Management

Page 10

Add monitoring code

Page 11

Use statsd

Page 12

statsd real quick

http://www.slideshare.net/DatadogSlides/dev-opsdays-tokyo2013effectivestatsdmonitoring
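To make the statsd idea concrete, here is a minimal Python sketch (the slides themselves show no code). statsd accepts plain-text UDP datagrams of the form `name:value|type`, by default on port 8125. The helper names `format_metric`, `send_metric` and `timed`, as well as the metric name `checkout.latency`, are made up for illustration.

```python
import socket
import time
from contextlib import contextmanager


def format_metric(name, value, metric_type):
    """Build one statsd datagram, e.g. 'checkout.latency:12|ms'."""
    return f"{name}:{value}|{metric_type}"


def send_metric(payload, host="127.0.0.1", port=8125):
    """Send one datagram over UDP (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))


@contextmanager
def timed(name, send=send_metric):
    """Measure the wrapped block and report it as a statsd timer ('ms')."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = int((time.perf_counter() - start) * 1000)
        send(format_metric(name, elapsed_ms, "ms"))


if __name__ == "__main__":
    with timed("checkout.latency"):
        time.sleep(0.05)  # stand-in for real work
```

The `send` parameter exists only so the timer can be exercised without a running statsd daemon. Note that the timing code still sits inside the application code, which is the "polluting your code" downside the talk raises later.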

Page 13

Use JMX

Page 14

Page 15

Aspect oriented programming

http://veerasundar.com/blog/2010/01/spring-aop-example-profiling-method-execution-time-tutorial/
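The linked tutorial uses Spring AOP in Java. As a rough analogue only, the same idea of keeping timing code out of the business logic can be sketched with a Python decorator. This is not the tutorial's code; `profiled`, `record` and `load_order` are hypothetical names.

```python
import functools
import time


def profiled(record):
    """Wrap a function so its execution time is passed to record(name, seconds)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on exceptions, like an "around" advice.
                record(func.__name__, time.perf_counter() - start)
        return wrapper
    return decorator


if __name__ == "__main__":
    @profiled(lambda name, seconds: print(f"{name} took {seconds:.6f}s"))
    def load_order(order_id):
        return {"id": order_id}

    load_order(42)
```

The function body stays free of monitoring code; only the decorator line reveals that it is measured.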

Page 16

Graphite Visualization

Page 17

Any downsides here?

Basic approaches tend to pollute your code

AOP is the better choice, but requires advanced skills

If you're not using something like statsd, it's hard to have a central spot for all the performance data of your different components

Great for performance insights of single components

What about 3rd parties?

Or distributed systems?

Like, microservices, maybe

Page 18

What about components which we can't modify?

like databases, message queues, ...

Page 19

Best case: use readily available APIs or integrations (statsd, JMX, etc)

For open-source: apply same technique as to your own code

Keeping in sync with original code can become tedious

try to make your changes part of the original project

Use dedicated monitoring tools

Very common for databases

BUT even the best tool is an additional tool

How long does it take to get a new team member up-to-speed?

Page 20

Microservices

Page 21

Microservices vs SOA

Microservices

fit the scope of a single application

Service Oriented Architecture

is scoped to fit enterprises / environments / infrastructures

Page 22

For a dev, microservices hardly pose any downsides

On the upside, the code size and the scope of the domain become smaller

Any best practices for analyzing the performance of a single microservice are still valid

The real challenge of microservices is proper operation

Page 23

What's the challenge in monitoring microservices?

The big challenge of well-performing microservices is the communication between the microservices

Not the high performance of a single microservice

Tracing calls between services is very difficult

Page 24

Source: http://theburningmonk.com/2015/05/a-consistent-approach-to-track-correlation-ids-through-microservices/

Page 25

Call Tracing

Page 26

Source: http://theburningmonk.com/2015/05/a-consistent-approach-to-track-correlation-ids-through-microservices/
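The correlation-ID approach from the linked article can be sketched in a few lines of Python: reuse the ID that arrived with the request, create one if none arrived, and pass it on with every outgoing call and log line. The header name `X-Correlation-ID` and all function names here are assumptions for illustration, not taken from the article.

```python
import uuid

# Assumed header name; any name works as long as all services agree on it.
CORRELATION_HEADER = "X-Correlation-ID"


def ensure_correlation_id(incoming_headers):
    """Reuse the caller's correlation ID, or start a new one at the edge."""
    return incoming_headers.get(CORRELATION_HEADER) or str(uuid.uuid4())


def outgoing_headers(correlation_id, extra=None):
    """Headers for a downstream call, always carrying the correlation ID."""
    headers = dict(extra or {})
    headers[CORRELATION_HEADER] = correlation_id
    return headers


def log_line(correlation_id, service, message):
    """Prefix every log line with the ID so logs can be joined later."""
    return f"[{correlation_id}] {service}: {message}"


if __name__ == "__main__":
    cid = ensure_correlation_id({})  # first service: no ID yet
    print(log_line(cid, "frontend", "received request"))
    downstream = outgoing_headers(cid, {"Accept": "application/json"})
    # The next service sees the same ID in its incoming headers.
    print(log_line(ensure_correlation_id(downstream), "orders", "loading order"))
```

Every service in the chain has to cooperate, which is why the talk calls tracing calls between services difficult.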

Page 27

Source: http://theburningmonk.com/2015/05/a-consistent-approach-to-track-correlation-ids-through-microservices/

Page 28

In Java

https://taidevcouk.wordpress.com/category/experiments/

Page 29

C#

http://theburningmonk.com/2015/05/a-consistent-approach-to-track-correlation-ids-through-microservices/

Page 30

Leverage existing tools

https://github.com/ordina-jworks/microservices-dashboard

Page 31

Spring Cloud Sleuth

Sleuth: https://github.com/spring-cloud/spring-cloud-sleuth

Spring Cloud Sleuth implements a distributed tracing solution for Spring Cloud.
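Distributed tracing tools of this kind group timed units of work ("spans") under a shared trace ID, with each span pointing at its parent. The Python sketch below only illustrates that data model. It is not Sleuth's API, and `Span`, `new_trace` and the field names are hypothetical.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One timed unit of work inside a trace (illustrative model only)."""
    name: str
    trace_id: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    start: float = field(default_factory=time.perf_counter)
    duration: Optional[float] = None

    def child(self, name):
        """A child span shares the trace ID and points at this span."""
        return Span(name=name, trace_id=self.trace_id, parent_id=self.span_id)

    def finish(self):
        """Record how long the span took and return it."""
        self.duration = time.perf_counter() - self.start
        return self


def new_trace(name):
    """Start a root span with a fresh trace ID."""
    return Span(name=name, trace_id=uuid.uuid4().hex)


if __name__ == "__main__":
    root = new_trace("GET /orders/42")
    db = root.child("SELECT order").finish()
    root.finish()
    print(root.trace_id == db.trace_id, db.parent_id == root.span_id)
```

A collector can rebuild the call tree of one request from the trace ID and the parent links, even when the spans come from different services.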

Page 32

Zipkin

Page 33

Trace

https://trace.risingstack.com/

Page 34

So, do we have everything we need now?

Usually, one tracing solution only covers a single technology

Besides visualization, you'll also want log analysis

The ELK stack does this really well, especially in connection with correlation IDs

But the ELK stack does no visualization

And your visualization does no log analysis

Yet another tool

Don't get me started about integrating all this with host monitoring...

The trace ends where your code ends

No correlation IDs for database calls
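Log analysis with correlation IDs works best when every log line carries the ID as a separate field. A minimal Python sketch, assuming JSON log lines are what the log pipeline ingests; `JsonFormatter`, `build_logger` and the field names are illustrative choices, not something the talk prescribes.

```python
import contextvars
import io
import json
import logging

# Holds the ID of the request currently being handled ("-" if none).
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })


def build_logger(name, stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger("orders", buffer)
    correlation_id.set("abc-123")
    log.info("order %s accepted", 42)
    print(buffer.getvalue().strip())
```

Searching the log store for one `correlation_id` value then returns the lines of all services that took part in a request, but only for code that sets and forwards the ID.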

Page 35

What's next?

Page 36

Considerations for custom implementations

Multitude of languages

Open-source tools can get expensive

Manual configuration

Often only applicable to a single technology

Keeping pace with new technology

Serverless code (e.g. AWS Lambda, Azure Functions)

Page 37

http://de.slideshare.net/InfoQ/netflix-built-its-own-monitoring-system-and-why-you-probably-shouldnt

Page 38

The Ops' dilemma

how to handle all this in production

how to identify production issues

how to tell the devs what they should look into, without tearing down everything

Page 39

All fine?

While the Dev can leverage a huge number of tools, libs and frameworks, it's still up to the Ops to integrate them into a single, unified solution that allows them to draw the right conclusions

Page 40

From Dev to Prod

Dev

Single transaction

Deal with a specific problem

No impact on real users and business

Can concentrate on single component

"perfect world"

A dev's deadline is made of Sprints

A couple of weeks, usually

Ops

100s or 1000s of transactions

No idea what the problem is

Slow or bad requests impact real users and business

Lots of components that might not be under your control

An Op's deadline is made of SLAs

Hours, maybe just minutes

Page 41

The Dev-Ops-Dev-Ops-Dev-Ops dilemma

Dev: Sprint (days / weeks)

Ops: SLA (hours / minutes)

Page 42

From Prod to Dev

Dev

Single transaction

Deal with a specific problem

No impact on real users and business

Can concentrate on single component

"perfect world"

Ops

100s or 1000s of transactions

No idea what the problem is

Slow or bad requests impact real users and business

Lots of components that might not be under your control

Diagram annotations: Which? Which? Time! Reproduce?

Page 43

Commercial solutions

Dynatrace Ruxit

Page 44

Page 45

Dynatrace Ruxit

Page 46

Set-up in 5 minutes

Install a single monitoring agent per host

Everything is auto-detected

No changes to your source-code

No changes to runtime configuration

Supports a wide array of technologies

http://www.dynatrace.com/en/ruxit/technologies/

Page 47

Traditional metrics

Page 48

Service metrics

Page 49

Does not end at your custom components

Page 50

Baselining

Automatically detects and correlates problems without setting thresholds
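The general idea of baselining, comparing a new measurement against what was observed before instead of against a fixed threshold, can be illustrated with a deliberately simple Python sketch. This is not how Dynatrace Ruxit works internally (the slides do not say). The mean-plus-k-standard-deviations rule, the function name `is_anomalous` and the default `k=3.0` are assumptions chosen for illustration.

```python
import statistics


def is_anomalous(history, value, k=3.0):
    """Flag a value that deviates more than k standard deviations
    from the mean of previously observed values."""
    if len(history) < 2:
        return False  # not enough data to form a baseline yet
    mean = statistics.mean(history)
    # Guard against a zero spread when all past values are identical.
    spread = max(statistics.stdev(history), 1e-9)
    return abs(value - mean) > k * spread


if __name__ == "__main__":
    response_times_ms = [100, 102, 98, 101, 99]
    print(is_anomalous(response_times_ms, 100))  # within the usual range
    print(is_anomalous(response_times_ms, 250))  # far outside the usual range
```

Nobody had to decide up front that 250 ms is "too slow"; the limit follows from the service's own history.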

Page 51

Includes the Client-side

Browser auto-injection

Includes client-side JavaScript in traces and problem-correlation

Page 52

Visualization

Page 53

Call Tracing

Page 54

Solving a dilemma

Include this URL in a trouble ticket and the Dev can jump in right away

Page 55

Supporting most popular technologies

• Java

• .NET

• Node.js

• PHP

• Databases via JDBC, ADO.NET, PDO

• Message Queues

• Caches

• Cloud Infrastructure Metrics

• See more at http://www.dynatrace.com/en/ruxit/technologies/

Page 56

Dynatrace Ruxit

2016 hours for free

http://bit.ly/monitoring-2016

Page 57

References

https://www.nagios.org

https://github.com/etsy/statsd/wiki

http://veerasundar.com/blog/2010/01/spring-aop-example-profiling-method-execution-time-tutorial/

http://theburningmonk.com/2015/05/a-consistent-approach-to-track-correlation-ids-through-microservices/

http://apmblog.dynatrace.com/2014/06/17/software-quality-metrics-for-your-continuous-delivery-pipeline-part-iii-logging/

https://blog.buoyant.io/2016/05/17/distributed-tracing-for-polyglot-microservices/

https://blog.init.ai/distributed-tracing-the-most-wanted-and-missed-tool-in-the-micro-service-world-c2f3d7549c47#.93r1dj6ah
