SRV210: Improving Microservice and Serverless Observability with Monitoring Data

Post on 21-Jan-2018

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Improving Microservice and Serverless Observability

CLAY SMITH, NEW RELIC

@SMITHCLAY

What’s Observability?

A measure of how well we can understand a system from the work it does.

“I know how long all the methods in this service take to execute.”

What’s Instrumentation?

“This method took 25ms to execute”

Instrumentation: Measuring events in software using code.

(a type of white-box monitoring)
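As a minimal sketch of what "measuring events in software using code" can look like, the decorator below times each call to a function and reports it in the style of the quote above. The names `timed` and `lookup_order` are hypothetical and exist only for illustration; a real agent would ship the timing to a monitoring backend instead of printing it.

```python
import functools
import time


def timed(func):
    """Record how long each call to func takes, in milliseconds."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the call returned or raised, so failures are timed too.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            wrapper.timings_ms.append(elapsed_ms)
            print(f"{func.__name__} took {elapsed_ms:.1f}ms to execute")

    wrapper.timings_ms = []
    return wrapper


@timed
def lookup_order(order_id):
    # Hypothetical workload standing in for real service code.
    time.sleep(0.025)
    return {"order_id": order_id, "status": "shipped"}


result = lookup_order(42)
```

Because the measurement lives inside the process and knows the method name, this is white-box monitoring: it reports on the internals of the program, not only on its externally visible behaviour.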

Agenda

1. System architectures of past, present and future

2. Collecting the right data to understand modern architectures

3. Observability requirements for modern architectures

4. Case study: AWS Lambda Observability

5. Q&A with New Relic Customer

How Did You Monitor Apps in 1967?

Attribution: Bundesarchiv, B 145 Bild-F038812-0014 / Schaack, Lothar / CC-BY-SA 3.0

1. People in lab coats looking at blinking lights.

2. ‘Autotest’ (IBM System/360)

• Status print-outs at different points during program execution

• Main storage print-out in the event of failure (!)

• ‘Automatic patch card inclusion’ (?)

Source: IBM System/360 Programmer’s Basic Operating System Programmer’s Guide (September 1967)

Good News: We Don’t Have to Wear Lab Coats Anymore

Attribution: Flickr / Heisenberg Media/8408215473 / CC-BY-SA 3.0

1. People in jeans and hoodies looking at screens

2. Various types of machine data from different sources

• Infrastructure

• Backend Apps and Services

• … Mobile, Browser, IoT, Edge, etc.

Software Architecture Continues to Change

It’s Globally Distributed in Multiple Regions

And Compute Is Getting Physically Closer with Edge Computing and IoT

The Architecture Is Also Extremely Dynamic

Docker container lifespan in minutes (1-100), New Relic April 2017

More New Relic Customers Run Complex, Distributed Systems

New Relic Service Map of Reference Telco Architecture

Good Data Can Help with the Technical Shift to New Systems

• Improved debugging and troubleshooting

• Designs validated with data

• Reduced defects, more issues caught proactively

• Improved feature velocity

Technical

Good Data Can Help with the Cultural Shift to New Systems

• Builds transparency across teams

• Shared understanding of complex components

• Decisions not (entirely) driven or explained by ‘gut-feelings’ or guessing

• Freedom to experiment

• Blameless culture

• ‘Context not control’

Cultural

Instrumentation Increases Observability

But... How Do We Make Microservices and Serverless Functions Observable?

#1: Observable Systems Should Emit Events: Metrics, Logs, and Traces

Logs: “The database won’t start after the update.”

Metrics: “Our application is 35% slower than last week after this configuration change.”

Traces: “What are the dependencies for this service?”

New Relic Provides (*via Partner Integrations)
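To make the three event types concrete, here is a rough sketch in which one unit of work emits a log line, a metric, and a trace span that share a trace ID. The event schema and the names `make_event` and `handle_request` are assumptions made for illustration; they are not New Relic's or AWS's data model.

```python
import json
import time
import uuid


def make_event(kind, name, **attributes):
    """Build one telemetry event as a plain dict (hypothetical schema)."""
    return {"kind": kind, "name": name, "timestamp": time.time(), **attributes}


def handle_request(events, trace_id=None):
    """Do one unit of work and append a log, a metric, and a trace span."""
    trace_id = trace_id or uuid.uuid4().hex
    start = time.perf_counter()

    # Log: a discrete, human-readable record of something that happened.
    events.append(make_event("log", "request.started",
                             message="handling request", trace_id=trace_id))

    # ... the real work of the service would happen here ...

    duration_ms = (time.perf_counter() - start) * 1000.0

    # Metric: a number that can be aggregated and compared over time.
    events.append(make_event("metric", "request.duration_ms", value=duration_ms))

    # Trace span: timing plus identifiers that link this work to its callers.
    events.append(make_event("trace", "handle_request", trace_id=trace_id,
                             span_id=uuid.uuid4().hex[:16], parent_id=None,
                             duration_ms=duration_ms))
    return trace_id


events = []
trace_id = handle_request(events)
for event in events:
    print(json.dumps(event))
```

Passing an incoming `trace_id` from an upstream caller, instead of generating a new one, is what would let a trace answer the dependency question across services.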

#2: All Components (Not Just Critical Services!) Should Be Instrumented

• Browser

• Mobile

• Server (Virtual) Hardware and Managed Services

• Host Operating System and Containers

• Application

• Amazon EC2 Instance

#3: Instrumentation Should Not Be Opt-in, Manual, or ‘Hard to Do’

[Diagram labels: Customers; Synthetic customers; Browser; Apps; Mobile; API; Public Cloud; Micro Services; NoSQL Data Store; On-Premises Web Server; On-Premises Relational Data]

AWS Lambda Case Study

Which Monitoring Batteries Are Included?

Amazon CloudWatch Metrics

Amazon CloudWatch Logs

AWS Lambda: Key Metrics

1. Invocations

2. Errors

3. Dead Letter Errors

4. Duration

5. Throttles

6. Iterator Age (stream-based invocations only)

http://docs.aws.amazon.com/lambda/latest/dg/monitoring-functions-metrics.html
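One way to pull these metrics programmatically is the CloudWatch `GetMetricStatistics` API. The sketch below only builds the request parameters, so it runs without AWS credentials. The helper `metric_query`, the function name `my-function`, and the choice of statistics per metric are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch metric names in the AWS/Lambda namespace for the list above.
LAMBDA_METRICS = [
    "Invocations",
    "Errors",
    "DeadLetterErrors",
    "Duration",
    "Throttles",
    "IteratorAge",
]


def metric_query(function_name, metric_name, hours=24, period_seconds=3600):
    """Build GetMetricStatistics parameters for one Lambda function metric."""
    if metric_name not in LAMBDA_METRICS:
        raise ValueError(f"unknown Lambda metric: {metric_name}")
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    # Duration and IteratorAge are measurements; the other metrics are counts.
    if metric_name in ("Duration", "IteratorAge"):
        statistics = ["Average", "Maximum"]
    else:
        statistics = ["Sum"]
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": start,
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": statistics,
    }


params = metric_query("my-function", "Duration")
# With boto3 installed and AWS credentials configured, this could be sent as:
#   boto3.client("cloudwatch").get_metric_statistics(**params)
print(params["MetricName"], params["Statistics"])
```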

What Else Provides AWS Lambda Observability?

AWS X-Ray

Request tracing for many AWS-managed services.

AWS X-Ray Trace: Example

A “cold start” trace in AWS X-Ray. Annotations in red.

Warm Start in an X-Ray Trace

Note the function executes almost immediately after the service receives the request.
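Cold and warm starts can also be flagged from inside the function itself. The sketch below relies on the fact that module-level code in a Python Lambda function runs once per execution environment, while the handler runs on every invocation. The returned field names are hypothetical, and this is not how X-Ray itself detects cold starts.

```python
import time

# Module scope runs once, when a new execution environment is initialized.
_cold_start = True
_environment_started = time.time()


def handler(event, context):
    """Lambda-style handler that reports whether this invocation was cold."""
    global _cold_start
    was_cold = _cold_start
    _cold_start = False  # every later invocation in this environment is warm
    return {
        "cold_start": was_cold,
        "environment_age_s": time.time() - _environment_started,
    }


# Simulate two invocations landing on the same execution environment.
first = handler({}, None)
second = handler({}, None)
print(first["cold_start"], second["cold_start"])
```

Attaching a flag like this to each trace or log event makes it possible to split latency by cold versus warm invocations during later analysis.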

Traces In Aggregate Show Interesting Trends

Serverless Architecture for Aggregating Traces

What Does the Data Show in Insights?

A-ha Moment: It Was Under-provisioned with Memory!

Memory: 768 MB

Memory: 1152 MB
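Memory pressure of this kind can often be spotted in the `REPORT` line that Lambda writes to CloudWatch Logs after each invocation. The sketch below parses that line and flags invocations running close to their configured memory. The sample lines, request IDs, numbers, and the 90% threshold are invented for illustration and are not data from the talk.

```python
import re

# Matches the fields of a Lambda REPORT log line (whitespace-separated).
REPORT_PATTERN = re.compile(
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>\d+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<max_used_mb>\d+) MB"
)


def parse_report(line):
    """Return the numeric fields of a REPORT line, or None if it is not one."""
    match = REPORT_PATTERN.search(line)
    if match is None:
        return None
    return {
        "duration_ms": float(match["duration_ms"]),
        "billed_ms": int(match["billed_ms"]),
        "memory_mb": int(match["memory_mb"]),
        "max_used_mb": int(match["max_used_mb"]),
    }


def memory_utilization(report):
    """Fraction of configured memory that the invocation actually used."""
    return report["max_used_mb"] / report["memory_mb"]


# Invented sample lines in the REPORT layout; not measurements from the talk.
sample_lines = [
    "REPORT RequestId: example-1\tDuration: 812.40 ms\tBilled Duration: 900 ms \tMemory Size: 768 MB\tMax Memory Used: 761 MB",
    "REPORT RequestId: example-2\tDuration: 301.75 ms\tBilled Duration: 400 ms \tMemory Size: 1152 MB\tMax Memory Used: 790 MB",
]

reports = [parse_report(line) for line in sample_lines]
flagged = [memory_utilization(r) > 0.9 for r in reports]
for report, near_limit in zip(reports, flagged):
    print(report["memory_mb"], "MB configured, near limit:", near_limit)
```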

Lessons Learned

• Instrument for observability: “What are the internal Lambda service latencies for my function?”

• Find the right balance of metrics, logs, and traces for a given system: “Over 24 hours, what’s the distribution of function duration for my function?”

• Use analytics to diagnose: “Are cold starts significant? What other factors are at play?”
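The second and third questions come down to simple analytics over per-invocation events. As a sketch, assuming each invocation has already been recorded as a dict with `duration_ms` and `cold_start` fields (a hypothetical schema, with made-up sample data), the distribution and the share of cold starts could be computed like this:

```python
import statistics


def duration_summary(durations_ms):
    """Summarize a list of durations with a few percentiles."""
    # quantiles(n=100) returns the 99 cut points between percentiles 1..99.
    cuts = statistics.quantiles(durations_ms, n=100, method="inclusive")
    return {
        "count": len(durations_ms),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(durations_ms),
    }


def cold_start_share(invocations):
    """Fraction of invocations that were cold starts."""
    cold = sum(1 for inv in invocations if inv["cold_start"])
    return cold / len(invocations)


# Made-up sample: 95 warm invocations and 5 much slower cold starts.
invocations = (
    [{"duration_ms": 40.0 + i, "cold_start": False} for i in range(95)]
    + [{"duration_ms": 900.0 + i, "cold_start": True} for i in range(5)]
)

summary = duration_summary([inv["duration_ms"] for inv in invocations])
share = cold_start_share(invocations)
print(summary)
print(f"cold starts: {share:.0%} of invocations")
```

Comparing the median with the high percentiles shows whether a small number of slow invocations, such as cold starts, dominate the tail even when the typical request is fast.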

Q&A with Marcus Irven, Scripps Network

Serverless Architectures in Production

THANK YOU!

CLAY SMITH, NEW RELIC

TWITTER: @SMITHCLAY
