aws re:invent 2016: cloud monitoring - understanding, preparing, and troubleshooting dynamic apps on...

Lee Atchison, Principal Cloud Architect, New Relic November 2016 Cloud Monitoring Understanding, Preparing, and Troubleshooting Dynamic Apps in AWS ARC303


Post on 06-Jan-2017


© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Lee Atchison, Principal Cloud Architect, New Relic

November 2016

Cloud Monitoring: Understanding, Preparing, and Troubleshooting Dynamic Apps in AWS

ARC303

Safe Harbor

This document and the information herein (including any information that may be incorporated by reference) is provided for informational purposes only and should not be construed as an offer, commitment, promise or obligation on behalf of New Relic, Inc. (“New Relic”) to sell securities or deliver any product, material, code, functionality, or other feature. Any information provided hereby is proprietary to New Relic and may not be replicated or disclosed without New Relic’s express written permission.

Such information may contain forward-looking statements within the meaning of federal securities laws. Any statement that is not a historical fact or refers to expectations, projections, future plans, objectives, estimates, goals, or other characterizations of future events is a forward-looking statement. These forward-looking statements can often be identified as such because the context of the statement will include words such as “believes,” “anticipates,” “expects” or words of similar import.

Actual results may differ materially from those expressed in these forward-looking statements, which speak only as of the date hereof, and are subject to change at any time without notice. Existing and prospective investors, customers and other third parties transacting business with New Relic are cautioned not to place undue reliance on this forward-looking information. The achievement or success of the matters covered by such forward-looking statements are based on New Relic’s current assumptions, expectations, and beliefs and are subject to substantial risks, uncertainties, assumptions, and changes in circumstances that may cause the actual results, performance, or achievements to differ materially from those expressed or implied in any forward-looking statement. Further information on factors that could affect such forward-looking statements is included in the filings we make with the SEC from time to time. Copies of these documents may be obtained by visiting New Relic’s Investor Relations website at http://ir.newrelic.com or the SEC’s website at www.sec.gov.

New Relic assumes no obligation and does not intend to update these forward-looking statements, except as required by law. New Relic makes no warranties, expressed or implied, in this document or otherwise, with respect to the information provided.

Who am I?

29 years in industry

4+ in New Relic

(Architecture Lead, Cloud, Service Migration)

7 in Amazon Retail & AWS

(Built SW/VG AppStore, AWS Elastic Beanstalk)

Specializes in:

Cloud computing

Services & Microservices

Scalability, Availability

@leeatchison leeatchison

Principal Cloud Architect

We want better apps faster

Better

data center

Dynamic

environment

How do we use the cloud to accomplish this?

Better Data Center


Cloud as a “Better Data Center”

• Resources are allocated to uses, just like in a data center
• Provisioning process is faster
• Lifetime of components is relatively long
• Capacity planning is still important and still applies

Why use a “Better Data Center”?

• Add new capacity (faster)
• Improve application availability (redundancy)
• Compliance

Who is impacted?

Operations Better data center Development

Can I scale my server fleet?

Can apps run anywhere?

How do they perform in the cloud?

A data center is a data center…

Better data center = Faster application launch / deploy

How do I monitor it?

Similar to monitoring any other data center…

Monitoring an application

• Application & Application Microservices

• Server OS

• Hardware (virtual)

Typical Server / Amazon EC2 Instance

[Diagram: Amazon EC2 instance stack: Browser / Mobile, Application & Application Microservices, Server OS, Server (Virtual) Hardware]

AWS Monitoring

Amazon CloudWatch monitors:
• EC2 instance
• Virtualization
• Hardware
• [CPU / Disk / Networking]

Amazon CloudWatch doesn’t know about:
• Server OS
• Memory / Filesystem
• Processes
• Configuration
• Application
- Latency
- Error rates

[Diagram: the EC2 instance stack with Amazon CloudWatch, the AWS Console, and dashboards]

New Relic Monitoring

New Relic monitors (server):
• How O.S. is performing
• Configuration
• Processes

New Relic monitors (application):
• App health
• App performance
• Microservices

New Relic doesn’t know about:
• Virtualization
• Hardware

[Diagram: the EC2 instance stack with Amazon CloudWatch, the AWS Console, dashboards, New Relic Application Monitoring, and New Relic Infrastructure Monitoring]

AWS and New Relic Monitoring

AWS / CloudWatch monitors:
• Visibility into virtualization
• CPU / Disk / Networking

New Relic monitors:
• CPU / Disk / Networking
• Memory / Filesystem
• Processes
- Infrastructure components
• Application / Microservices:
- Latency
- Error rates
- App insights
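One way to reason about this split is as a coverage check: list the layers you need visibility into, list what each tool reports, and look for gaps. The sketch below does that with plain sets. The layer names and the `COVERAGE` map are illustrative labels transcribed from the slide, not identifiers from any AWS or New Relic API.

```python
# Hypothetical coverage map based on the slide; names are illustrative only.
COVERAGE = {
    "cloudwatch": {"virtualization", "cpu_disk_network"},
    "new_relic": {
        "cpu_disk_network",
        "memory_filesystem",
        "processes",
        "app_latency",
        "app_error_rates",
    },
}

# Every layer we want some tool to report on.
REQUIRED = {
    "virtualization",
    "cpu_disk_network",
    "memory_filesystem",
    "processes",
    "app_latency",
    "app_error_rates",
}


def uncovered(required, tools):
    """Return the required layers that none of the given tools report on."""
    covered = set()
    for tool in tools:
        covered |= COVERAGE[tool]
    return required - covered
```

With only one of the two tools the result is non-empty; with both, under these assumed labels, nothing is left uncovered.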

Dynamic Cloud


Cloud as a “Dynamic Tool for Dynamic Apps”

• Use only the resources you need
• Allocate / de-allocate resources on the fly
• Resource allocation is an integral part of your application architecture

Dynamic Cloud

Application in charge: Application is aware of and is controlling traditional Ops resources

Resources are:
• Allocated
• Consumed
• De-allocated
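When the application itself decides how many resources to hold, that decision becomes ordinary code. The sketch below is a minimal proportional rule, assuming a per-instance load metric (for example average CPU percent) and a target value for it. The function name, parameters, and limits are hypothetical, and this is not the algorithm used by any AWS scaling service.

```python
import math


def desired_capacity(current_instances, observed_load, target_load,
                     min_instances=1, max_instances=20):
    """Suggest an instance count that moves per-instance load toward the target.

    observed_load and target_load are in the same unit (e.g. CPU percent).
    """
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    # Scale the fleet in proportion to how far load is from the target.
    raw = math.ceil(current_instances * observed_load / target_load)
    # Never go below the floor or above the ceiling.
    return max(min_instances, min(max_instances, raw))
```

A fleet of 4 instances at 90% load with a 60% target would grow to 6; the same fleet at 15% would shrink to 1, which is the "use only the resources you need" idea in code form.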

Dynamic Usage Example…

[Chart: Docker container age (count vs. hours); labels: 1 hour, 200 days, 833 days]

[Chart: Docker container age (by minute and hour); x-axis: container age (minutes); labels: 1,200,000; 11% under one minute]

Dynamic Cloud Technologies

Dynamic Cloud is about scaling

• Auto Scaling
• Mobile / IoT
• Dynamic routing
• Load balancing
• Queues and notifications
• Docker

How do I monitor the Dynamic Cloud?

Dynamic Cloud has unique monitoring requirements…

What is a Dynamic Cloud Application?

Responsible for the parts you care about:
• Application & Application Microservices

Let cloud manage rest:
• Infrastructure
• Allocation/Provisioning
• Scaling

[Diagram: dynamic application stack: Browser / Mobile, multiple Application & Application Microservices instances, Provisioning, Server OS, Server (Virtual) Hardware]

Monitoring Dynamic Cloud Applications

[Diagram: the dynamic application stack with CloudWatch, the AWS Console, and dashboards]

AWS Infrastructure and New Relic work together

[Diagram: the dynamic application stack with CloudWatch, the AWS Console, New Relic Application Monitoring, New Relic Infrastructure Monitoring, and dashboards; regions labeled “New Relic monitors” and “CloudWatch & AWS monitors”]

How do you monitor this?


Where did it go? It was just here!!

The thing you monitored 10 minutes ago…

...doesn’t exist anymore!?

Remember This?

[Chart: Docker container age (by minute and hour); x-axis: container age (minutes); labels: 1,200,000; 11% under one minute]

Monitoring the Dynamic Cloud

• Monitor the lifecycle of the cloud components
• Monitor the cloud components themselves

Very different from monitoring traditional data center components
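Monitoring the lifecycle means tracking when components start and stop, not just how they behave while running. The sketch below is a minimal illustration, assuming a hypothetical stream of start/stop events with timestamps in seconds. It derives each component's lifetime and the share of components that lived less than a given limit, the kind of figure shown on the container age chart.

```python
from collections import namedtuple

# Hypothetical event record; kind is either "start" or "stop".
Event = namedtuple("Event", "component_id kind timestamp")


def lifetimes(events):
    """Map component id -> lifetime in seconds for components that have stopped."""
    started = {}
    finished = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind == "start":
            started[ev.component_id] = ev.timestamp
        elif ev.kind == "stop" and ev.component_id in started:
            finished[ev.component_id] = ev.timestamp - started.pop(ev.component_id)
    return finished


def share_shorter_than(durations, limit_seconds):
    """Fraction of finished components whose lifetime was under the limit."""
    if not durations:
        return 0.0
    short = sum(1 for d in durations.values() if d < limit_seconds)
    return short / len(durations)
```

Components that are still running never appear in the result, so a real monitor would also need to report on them separately.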

Who is impacted?

Operations Better data center Development

Can I scale my server fleet?

Can apps run anywhere?

How do they perform in the cloud?

A data center is a data center…

Who is impacted?

Operations Dynamic Cloud Development

What is a container?

Why do I care??

It was just here, where did it go??

Cloud architecture is integral to the application architecture

Developers deeply involved in cloud activities


Changing World

Previous - STATIC World

Ops

Dev

Now - DYNAMIC World

Ops

Change is speeding up

• Traditional data center: Good
• Cloud data center: Better
• Dynamic Cloud: Best

Dynamic Cloud enables better applications faster.

The way you’ve done things in the past won’t work in the future.

Dynamic Cloud

Things happen faster because of…
• Amazon EC2: Server running application / processes
• Docker container: Process running a command
• AWS Lambda: Function performing a task or operation

This is HARD

The Future with Lambda

Microcomputing & AWS Lambda

• Newest entrant to the “dynamic cloud”

• Provides event-driven compute capabilities

• No infrastructure to provision

• Massively shared infrastructure

Why use Lambda?

• Run in response to a state change or action in the cloud
• Perform quick actions
• Stateless, “filters”
• Virtually no startup/shutdown cost

Lambda scripts

AWS Lambda

• Takes an event from an AWS resource (a trigger)
• Creates an instance to execute
• Can impact original or different AWS resource
• Any number of instances can run at a time

[Diagram: some resources (S3 bucket, DynamoDB, API Gateway, SQS) trigger a Lambda script, which runs as Lambda instances that can act on some resources (S3 bucket, DynamoDB, API Gateway, SQS)]
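The model above (an event arrives, an instance runs a stateless function on it, many instances run at once) can be imitated locally to get a feel for it. The sketch below is only a local simulation using a thread pool; `handler`, `dispatch`, and the event fields are hypothetical names, and it says nothing about how AWS Lambda is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor


def handler(event):
    """Stateless function: the result depends only on the incoming event."""
    return {"source": event["source"], "size": len(event["payload"])}


def dispatch(events, max_concurrency=8):
    """Run one handler invocation per event, several at a time.

    Results come back in the same order as the input events.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(handler, events))
```

Because the handler keeps no state between calls, the number of concurrent invocations can change freely without affecting the results.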

Lambda example #1

Photo Management App

Photo management application

• Photos uploaded to S3
• Lambda script creates thumbnails
• Lambda script updates metadata in database
• Application only has to deal with metadata editing, not photo / file management

[Diagram: User uploads a file to the image import S3 bucket; Lambda scripts feed the image thumbnails S3 bucket and the image database; the application handles app interactions and thumbnail viewing]
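The thumbnail step can be sketched as a Python Lambda handler that reads the bucket and object key from an S3 notification event. This is a minimal sketch, not the presenter's code: the bucket name `example-image-thumbnails` and the `thumbnails/` key prefix are assumptions, and the actual download, resize, and upload are left as a comment so the example runs without AWS access.

```python
from urllib.parse import unquote_plus

THUMBNAIL_BUCKET = "example-image-thumbnails"  # hypothetical bucket name


def thumbnail_key(source_key):
    """Derive the key under which the thumbnail would be stored (assumed layout)."""
    return "thumbnails/" + source_key


def handler(event, context):
    """Plan one thumbnail per uploaded object listed in the S3 event."""
    planned = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # A real function would download the image, resize it, and upload
        # the result to THUMBNAIL_BUCKET here.
        planned.append({
            "source_bucket": bucket,
            "source_key": key,
            "target_bucket": THUMBNAIL_BUCKET,
            "target_key": thumbnail_key(key),
        })
    return planned
```

Keeping the key derivation in its own function makes it easy to test locally with a hand-written event before deploying anything.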

Lambda example #2

Mobile Game App

Mobile game platform

• Cloud platform hosts an API for mobile app
- API Gateway
• Lambda scripts implement the API
• Lambda scripts manipulate database
• Extremely high scale possible
- No infrastructure

[Diagram: mobile phone application users call API Gateway, which invokes Lambda scripts that read and write the database]
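A single API operation in this design could look like the sketch below: a Python handler that accepts an API Gateway proxy-style request and returns a `statusCode` / `body` response. The high-score endpoint, the `player` path parameter, and the in-memory `SCORES` dictionary (standing in for the database) are all assumptions made for illustration.

```python
import json

# Stand-in for the database. Real Lambda instances do not share memory,
# so a deployed function would use an external store instead.
SCORES = {}


def _response(status, payload):
    """Build a proxy-style response with a JSON body."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Record (POST) or read (any other method) a player's high score."""
    player = (event.get("pathParameters") or {}).get("player")
    if not player:
        return _response(400, {"error": "player is required"})
    if event.get("httpMethod") == "POST":
        body = json.loads(event.get("body") or "{}")
        score = int(body.get("score", 0))
        # Keep only the best score seen so far.
        SCORES[player] = max(SCORES.get(player, 0), score)
    return _response(200, {"player": player, "high_score": SCORES.get(player, 0)})
```

Each request is handled independently, which is what lets the platform run as many copies as traffic requires.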

Monitoring Lambda Scripts

Less like infrastructure monitoring /

More like web application monitoring

We care about:
• Run time (average, extremes – TP90/TP99)
• Statistical metrics
• Error rates and other deviations from norm
• “Drill down” into individual “runs”

We don’t care about:
• Details about all “runs”
• Server / infrastructure metrics
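The metrics listed above can be computed from a plain list of run records. The sketch below assumes each run is a dictionary with hypothetical `duration_ms` and `error` fields, and uses the nearest-rank method for TP90/TP99; other percentile definitions exist and would give slightly different values.

```python
def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding issues.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(runs):
    """Aggregate run time and error statistics over a non-empty list of runs."""
    durations = [r["duration_ms"] for r in runs]
    errors = sum(1 for r in runs if r["error"])
    return {
        "count": len(runs),
        "average_ms": sum(durations) / len(durations),
        "tp90_ms": percentile(durations, 90),
        "tp99_ms": percentile(durations, 99),
        "error_rate": errors / len(runs),
    }


def slowest_runs(runs, limit=3):
    """Pick the runs worth drilling into: the slowest ones first."""
    return sorted(runs, key=lambda r: r["duration_ms"], reverse=True)[:limit]
```

Only the aggregates and a handful of outliers are kept, which matches the slide's point that details about every run are not the goal.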

Monitoring Lambda

More like application performance monitoring than infrastructure monitoring


Change is speeding up

• Traditional data center: Good
• Cloud data center: Better
• Dynamic Cloud: Best

Dynamic Cloud enables better applications faster.

The way you’ve done things in the past won’t work in the future.

Monitoring just the server

Worked when the rate of change was low…

[Diagram: EC2 instance stack (Application & Application Microservices, Server OS, Server (Virtual) Hardware) with CloudWatch and the AWS Console]

Monitoring just the server

Insufficient in the cloud:
• Rate of change is faster
• Problems come up quicker
• “Server” isn’t a server anymore
• “Provisioning” isn’t provisioning anymore

You need:
• Top to bottom monitoring…
• Full stack accountability...
• Dynamic infrastructure control...


New Relic enables accountability

between your code & AWS

[Diagram: Data Driven Digital Business, spanning Customer Experience Mgmt, Application Performance Mgmt, and Dynamic Infrastructure Mgmt; labels: Customers, Synthetic Customers, Browser / Mobile / Apps, Service API, On-Premises, Relational Data, NoSQL, EC2, RDS, S3]

Thank you!

Architecting for Scale

By: Lee Atchison

Published by: O’Reilly Media

www.ArchitectingForScale.com

@leeatchison leeatchison

Stop by the New Relic Booth!

Booth #610

Thursday – 11:20am

• Book signing

• Free copies of the book