Netflix Edge Engineering Open House Presentations - June 9, 2016


Upload: daniel-jacobson

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Netflix Edge Engineering Open House Presentations - June 9, 2016

Daniel Jacobson@daniel_jacobson

Satish Gudiboina@sgudiboina

Suudhan Rangarajan@suudhan

Vasanth Asokan@vasanthasokan

Edge Engineering Open House - June 9, 2016

Page 2: Netflix Edge Engineering Open House Presentations - June 9, 2016

190 Countries (not China and a few others)

81+ Million Subscribers

Page 3: Netflix Edge Engineering Open House Presentations - June 9, 2016

1000+ Different Device Types

Page 4: Netflix Edge Engineering Open House Presentations - June 9, 2016

Over 42 Billion Hours Streamed in 2015

Page 5: Netflix Edge Engineering Open House Presentations - June 9, 2016

Streaming Hours Per Year in Billions

Page 6: Netflix Edge Engineering Open House Presentations - June 9, 2016

Streaming Hours Per Year in Billions

Page 7: Netflix Edge Engineering Open House Presentations - June 9, 2016

Over 42 Billion Hours Streamed in 2015

Page 8: Netflix Edge Engineering Open House Presentations - June 9, 2016

Over 42 Billion Successes!

Page 9: Netflix Edge Engineering Open House Presentations - June 9, 2016

Of Course, There Are Failures Too…

Page 10: Netflix Edge Engineering Open House Presentations - June 9, 2016

Two Primary Drivers Behind Our Successes

Page 11: Netflix Edge Engineering Open House Presentations - June 9, 2016

People Desire to Watch Netflix

Two Primary Drivers Behind Our Successes

Page 12: Netflix Edge Engineering Open House Presentations - June 9, 2016

People Desire to Watch Netflix

Systems Scale to Meet Desires

Two Primary Drivers Behind Our Successes

Page 13: Netflix Edge Engineering Open House Presentations - June 9, 2016
Page 14: Netflix Edge Engineering Open House Presentations - June 9, 2016

Sign-Up

Page 15: Netflix Edge Engineering Open House Presentations - June 9, 2016

Sign-Up

Discovery / Browse

Page 16: Netflix Edge Engineering Open House Presentations - June 9, 2016

Sign-Up

Discovery / Browse

Playback

Page 17: Netflix Edge Engineering Open House Presentations - June 9, 2016

Edge Engineering provides data and

functionality to support these

three experiences

Page 18: Netflix Edge Engineering Open House Presentations - June 9, 2016

Designing APIs

Enabling Playback

Scaling

Routing

Insights

DX

Resiliency

Tools

Edge Engineering provides data and

functionality to support these

three experiences

Page 19: Netflix Edge Engineering Open House Presentations - June 9, 2016
Page 20: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

Page 21: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

Page 22: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

Page 23: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

API

API API API API API API

Page 24: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

API

API API API API API API

SERVICES

Recs, Member

Ratings, Playback Lifecycle, Authn/z

A/B, Search

Identity, Metadata, Playback Data, DRM

Page 25: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

API

API API API API API API

SERVICES

Recs, Member

Ratings, Authn/z

A/B, Search

Identity, Metadata

Playback Data, DRM

Owned by Edge Engineering

Playback Lifecycle

Page 26: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

API

API API API API API API

SERVICES

Recs, Member

Ratings, Authn/z

A/B, Search

Identity, Metadata, Playback Data, DRM

Playback Lifecycle

Page 27: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

API

API API API API API API

SERVICES

Recs, Member

Ratings, Authn/z

A/B, Search

Identity, Metadata, Playback Data, DRM

Playback Lifecycle

Page 28: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

API

API API API API API API

SERVICES

Recs, Member

Ratings, Authn/z

A/B, Search

Identity, Metadata, Playback Data, DRM

Playback Lifecycle

Page 29: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVICES

ROUTING

API

API API API API API API

SERVICES

Recs, Member

Ratings, Authn/z

A/B, Search

Identity, Metadata, Playback Data, DRM

Playback Lifecycle

Page 30: Netflix Edge Engineering Open House Presentations - June 9, 2016

API API API API API API

Authn/z

Playback Data, DRM

INSIGHTS

TOOLS

DX

Playback Lifecycle

Page 31: Netflix Edge Engineering Open House Presentations - June 9, 2016

2015: 42 Billion Hours

Page 32: Netflix Edge Engineering Open House Presentations - June 9, 2016

2015: 42 Billion Hours

Future: 200 Billion Hours

Page 33: Netflix Edge Engineering Open House Presentations - June 9, 2016

The rest of

Netflix’s AWS Cloud Footprint by %

Page 34: Netflix Edge Engineering Open House Presentations - June 9, 2016

Talking About the Future of Edge Engineering

Satish Gudiboina: API and Upcoming Re-Architecture

Suudhan Rangarajan: Playback Experience

Vasanth Asokan: Developer Tools, Velocity and Experience

Page 35: Netflix Edge Engineering Open House Presentations - June 9, 2016

The Netflix API Platform for Server-Side Scripting

Current and The Future

Satish Gudiboina

Page 36: Netflix Edge Engineering Open House Presentations - June 9, 2016

The Netflix API

Page 37: Netflix Edge Engineering Open House Presentations - June 9, 2016

Streaming Hours Per Year in Billions

Page 38: Netflix Edge Engineering Open House Presentations - June 9, 2016

Scale is multi-faceted

Growing number of users ( → RPS)

Growing number of device types

Growing number of A/B tests

Growing number of languages

Growing number of countries

Page 39: Netflix Edge Engineering Open House Presentations - June 9, 2016

What we need to build for

Velocity

Resiliency

Other requirements:

Performance

Great developer experience

Operational insights

Tooling

Page 40: Netflix Edge Engineering Open House Presentations - June 9, 2016

Today’s architecture

[Diagram labels: API Server JVM; script (many); JS (mostly); Java; SERVICE LAYER; Client A, Client B, Client C, ... Client Y, Client Z; Network boundary; Netflix Microservices]

Resiliency with Hystrix
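
The slide credits Hystrix for resiliency. Hystrix is a Java library, and the sketch below is not its API; it only illustrates, in Python, the circuit-breaker-with-fallback pattern that Hystrix is known for. The class name, thresholds, and the injected clock are all hypothetical choices made for this illustration.

```python
import time


class CircuitBreaker:
    """Tiny circuit breaker: open after N consecutive failures, retry after a cooldown."""

    def __init__(self, max_failures=3, reset_after_s=5.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock          # injectable so the cooldown can be tested
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, fn, fallback):
        # While open, skip the dependency entirely until the cooldown has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                return fallback()
            self.opened_at = None   # half-open: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0           # any success closes the circuit again
        return result
```

A caller would wrap each dependency call, for example `breaker.call(fetch_recs, lambda: [])`, so that a failing service degrades to a fallback instead of stalling the request.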

Page 41: Netflix Edge Engineering Open House Presentations - June 9, 2016

Developer Velocity: Decoupled deployments of versions

[Timeline diagram: rows for API, device 1, device 2, device 3, and device 4 across Day 1 to Day 5; version labels shown are n to n+3, i to i+4, k to k+1, j to j+1, and l]

Page 42: Netflix Edge Engineering Open House Presentations - June 9, 2016

Changing risk profile

Growing number of users ( → RPS)

Growing number of devices

Growing number of A/B tests

Growing number of languages

Growing number of countries

Growing number and complexity of scripts (scripts → apps)

Page 43: Netflix Edge Engineering Open House Presentations - June 9, 2016

Today’s system (T-3yrs)

[Diagram labels: API Server JVM; script (several); JS (mostly); Java; SERVICE LAYER; Client A, Client B, Client C, ... Client Y, Client Z; Network boundary; Netflix Microservices]

few, small scripts

fewer uploads

Page 44: Netflix Edge Engineering Open House Presentations - June 9, 2016

Today’s system (T)

[Diagram labels: API Server JVM; script (many); JS (mostly); Java; SERVICE LAYER; Client A, Client B, Client C, ... Client Y, Client Z; Network boundary; Netflix Microservices]

hundreds of more complex scripts

10-50 uploads per day

Page 45: Netflix Edge Engineering Open House Presentations - June 9, 2016

What we need

Velocity

Resiliency?

Page 46: Netflix Edge Engineering Open House Presentations - June 9, 2016

Lack of process isolation is a growing risk.

Page 47: Netflix Edge Engineering Open House Presentations - June 9, 2016

Moving toward our ideal API: What will change

Scripts will run in containers

Scripts will call API remotely

Page 48: Netflix Edge Engineering Open House Presentations - June 9, 2016

The (near) future

[Diagram labels: node script (several); node.js; process isolation; node for device teams; Network boundary; API Server JVM; JS (mostly); Java; SERVICE LAYER; Client A, Client B, Client C, ... Client Y, Client Z; Netflix Microservices]

Page 49: Netflix Edge Engineering Open House Presentations - June 9, 2016

Why containers?

Process isolation

Fast startup

Consistent developer experience across environments

Page 50: Netflix Edge Engineering Open House Presentations - June 9, 2016

Isolated failures: scripts don’t affect each other

API

device 1 device 2 device 3 device 4

Temporarily unavailable!

Page 51: Netflix Edge Engineering Open House Presentations - June 9, 2016

Independent autoscaling

API

device 1 device 2 device 3 device 4

Page 52: Netflix Edge Engineering Open House Presentations - June 9, 2016

Fast startup

New API server: minutes

New container: seconds

Fast rollout, fast rollback, fast MTTR

Page 53: Netflix Edge Engineering Open House Presentations - June 9, 2016

The Netflix API

Page 54: Netflix Edge Engineering Open House Presentations - June 9, 2016

Edge Developer Experience

Translating developer productivity to Netflix customer delight

Page 55: Netflix Edge Engineering Open House Presentations - June 9, 2016

Developer Experience?

Page 56: Netflix Edge Engineering Open House Presentations - June 9, 2016

DEVELOP (rapidly)

DEPLOY (reliably)

OPERATE (effectively)

Experimentation driven innovation

~700 apps, dozens of pushes a day

15+ client teams, ~200 developers

~50 direct services, 100s of A/B tests, dozens of new features

The Innovation Funnel

API

Devices

Netflix Services

Client Adaptor Applications

Page 57: Netflix Edge Engineering Open House Presentations - June 9, 2016

Why care about DevEx?

Developer Productivity

Product Innovation

Tools

Automation

Insights

Customer Satisfaction

Page 58: Netflix Edge Engineering Open House Presentations - June 9, 2016

App Development and Management

DEVELOP (rapidly)

DEPLOY (reliably)

OPERATE (effectively)

Page 59: Netflix Edge Engineering Open House Presentations - June 9, 2016

Developer Ergonomics

[Diagram labels: API SERVER JVM; app (several); js; java; CLIENT LIBRARIES; SERVICE LAYER; Large / Complex; WAN Boundary; Netflix Microservices]

Page 60: Netflix Edge Engineering Open House Presentations - June 9, 2016

Developer Ergonomics ...

[Diagram labels: DOCKER CONTAINERS; app (several); js; REMOTE SERVICE LAYER; API SERVER JVM; java; CLIENT LIBRARIES; WAN Boundary; Netflix Microservices]

Page 61: Netflix Edge Engineering Open House Presentations - June 9, 2016

Setup Canary

Support

Prod Push

Pre-Prod

Metrics

Tracing

Lifecycle

Alerts

Build

Bootstrap

API Discovery

REPL

Unit Test

SDK Debug Logging

Profiling

Audits

Security

Custom Routing

Dependency Management

Client Application Development Critical Component!

Dx Developer Experience

Page 62: Netflix Edge Engineering Open House Presentations - June 9, 2016

$ newt init

Just bring your JavaScript business logic

NeWT: Netflix Workflow Toolkit

Continuous Integration

Deployment Pipelines

Autoscaling

Dashboards

Alerting

Logging

Lifecycle Management

Audits and Analytics

Container tooling

Canaries

Dependency Management

Page 63: Netflix Edge Engineering Open House Presentations - June 9, 2016

Titus

ATLAS

NeWT: Netflix Workflow Toolkit

Page 64: Netflix Edge Engineering Open House Presentations - June 9, 2016

Edge PaaS UI

Page 65: Netflix Edge Engineering Open House Presentations - June 9, 2016

$ newt auto-deploy -d

Node.js project

Docker Machine

node-inspector

Debugger

File watcher / live reload trigger

File watcher agent

NeWT: Local Container Development

Local Container

docker build / run

Page 66: Netflix Edge Engineering Open House Presentations - June 9, 2016

$ newt auto-deploy -d

Docker Machine

NeWT: Local Container Development

Local Container

Cloud Microservices

Cloud Proxy

Terminate security

Discovery Agent

Service Discovery

Local System

Cloud

Page 67: Netflix Edge Engineering Open House Presentations - June 9, 2016

App Operations and Insights

DEVELOP (rapidly)

DEPLOY (reliably)

OPERATE (effectively)

Page 68: Netflix Edge Engineering Open House Presentations - June 9, 2016

• Low Latency, High throughput, Highly Efficient
• Handle bursty or large scale loads
• Extensible programming model

600 jobs in production, 8M messages/sec at peak, 100Gbps network throughput

Mantis - Stream Processing Platform

Page 69: Netflix Edge Engineering Open House Presentations - June 9, 2016

Monitoring facets of aggregate application health, globally

Aggregate Insights

Page 70: Netflix Edge Engineering Open House Presentations - June 9, 2016

Aggregate Insights

Page 71: Netflix Edge Engineering Open House Presentations - June 9, 2016

Analyze in real-time, requests matching a precise set of conditions

Surgical Insights
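
"Requests matching a precise set of conditions" can be pictured as a filter-and-project query over a live event stream. The sketch below is a generic Python illustration, not the Mantis API or its query language; the function name, the event fields, and the sample events are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator

Event = Dict[str, object]


def stream_query(events: Iterable[Event],
                 where: Callable[[Event], bool],
                 select: Iterable[str]) -> Iterator[Event]:
    """Lazily yield the selected fields of every event that matches the predicate."""
    fields = tuple(select)
    for event in events:
        if where(event):
            yield {name: event.get(name) for name in fields}


# Hypothetical request events; a real stream would be unbounded.
sample_events = [
    {"device": "tv", "country": "BR", "status": 500, "latency_ms": 910},
    {"device": "phone", "country": "BR", "status": 200, "latency_ms": 85},
    {"device": "tv", "country": "US", "status": 500, "latency_ms": 640},
]

# "Surgical" condition: server errors from TVs in one country only.
matches = list(stream_query(
    sample_events,
    where=lambda e: e["device"] == "tv" and e["country"] == "BR" and e["status"] >= 500,
    select=["status", "latency_ms"],
))
```

Because the function is a generator, nothing is buffered: each event is tested and either dropped or projected as it arrives.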

Page 72: Netflix Edge Engineering Open House Presentations - June 9, 2016

Surgical Insights - Real-time Stream Queries

Page 73: Netflix Edge Engineering Open House Presentations - June 9, 2016

Surgical Insights - Real-time Stream Queries

Page 74: Netflix Edge Engineering Open House Presentations - June 9, 2016

Surgical Insights - Real-time Stream Queries

Page 75: Netflix Edge Engineering Open House Presentations - June 9, 2016

Monitoring server side calling pattern and internal application profile

Session Tracing

Page 76: Netflix Edge Engineering Open House Presentations - June 9, 2016

Session Tracing

Page 77: Netflix Edge Engineering Open House Presentations - June 9, 2016

Session Tracing - Request Profile

Page 78: Netflix Edge Engineering Open House Presentations - June 9, 2016

Session Tracing - Per Node Profile

Page 79: Netflix Edge Engineering Open House Presentations - June 9, 2016

Automatic monitoring of high cardinality data across multiple dimensions

Real-time Anomaly Detection
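
The slides do not say which detection method is used. As one possible illustration only, the Python sketch below flags a value whose z-score against a rolling window exceeds a threshold, and keeps one detector per dimension key to mimic monitoring across many dimensions. The class name, window size, threshold, and key format are assumptions made for this example.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flag values that deviate strongly from a rolling window of recent values."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.values = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def observe(self, value):
        """Return True if the value looks anomalous, then record it in the window."""
        anomalous = False
        if len(self.values) >= self.min_samples:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > self.threshold
            else:
                anomalous = value != mu  # flat history: any change stands out
        self.values.append(value)
        return anomalous


# One detector per (hypothetical) dimension key, created on first use.
detectors = defaultdict(RollingAnomalyDetector)


def observe(key, value):
    return detectors[key].observe(value)
```

Calling `observe(("tv", "BR", "error_rate"), 1.0)` repeatedly builds a baseline for that key; a later spike for the same key returns True without affecting any other key.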

Page 80: Netflix Edge Engineering Open House Presentations - June 9, 2016

Real-time Anomaly Detection

Page 81: Netflix Edge Engineering Open House Presentations - June 9, 2016

• Scaling developer productivity with business growth

• Provide fully managed PaaS experience to client developers
• Shift Left Insights to power smart development
• Curated, blended visualizations that simplify devops

In conclusion...

Page 82: Netflix Edge Engineering Open House Presentations - June 9, 2016

Tech Soup

Page 83: Netflix Edge Engineering Open House Presentations - June 9, 2016

Scaling Playback Services

Suudhan Rangarajan Senior Software Engineer, Playback Features

@suudhan

Page 84: Netflix Edge Engineering Open House Presentations - June 9, 2016
Page 85: Netflix Edge Engineering Open House Presentations - June 9, 2016

Playback Lifecycle

DECIDE

COLLECT & LEARN

AUTHORIZE

Page 86: Netflix Edge Engineering Open House Presentations - June 9, 2016

Decide

MANIFEST (Tracks and URLs)

Page 87: Netflix Edge Engineering Open House Presentations - June 9, 2016

Authorize

LICENSE

❏ Content usage / resolution policies

❏ Plan / device limits enforcement

❏ DRM / License generation

Page 88: Netflix Edge Engineering Open House Presentations - June 9, 2016

Collect & Learn

Bookmarks & Hours Watched

Streaming Errors and Metrics

Quality Of Experience metrics

Page 89: Netflix Edge Engineering Open House Presentations - June 9, 2016

Let’s look at Play Decisions

DECIDE

MANIFEST

AUTHORIZE

COLLECT & LEARN

LICENSE

SESSION

Page 90: Netflix Edge Engineering Open House Presentations - June 9, 2016

Huge number of Streams

Resolutions - 720p, 1080p, 4K, etc.
Codecs - H.264, HEVC, etc.
Bitrates - 230, 780, 3000, etc.

Channels - Stereo, Surround Sound
Languages - English, French, etc.

Types - Subtitles, Closed Captions, Forced Narratives
Languages - English, French, etc.

Page 91: Netflix Edge Engineering Open House Presentations - June 9, 2016

Streams to Tracks

- H.264 Main Profile
- English 5.1 Audio
- No Subtitle

- HEVC Dash Profile
- French 2.0 Audio
- English CC

- HDR Dash Profile
- Spanish AAC Audio
- English Forced Narrative

Page 92: Netflix Edge Engineering Open House Presentations - June 9, 2016

Decide & Filter

MANIFEST SERVICE

Page 93: Netflix Edge Engineering Open House Presentations - June 9, 2016

Many Many Dimensions

PLAYBACK MANIFEST

USER PREFERENCES

TITLE METADATA

COUNTRY

DEVICE

NETWORK
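
To make the "decide and filter" idea concrete, here is a minimal Python sketch that narrows a title's streams using three of the dimensions listed above (device, network, user preferences). It is an illustration under stated assumptions, not the manifest service's actual logic: the `Stream` and `PlaybackContext` types, their fields, and the filtering rules are hypothetical, and the bitrate values simply reuse the example numbers from the streams slide.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Stream:
    kind: str              # "video", "audio", or "text"
    codec: str = ""        # e.g. "H.264" or "HEVC"; unused for text
    language: str = ""     # unused for video
    bitrate: int = 0       # video only, same unit as the network limit


@dataclass(frozen=True)
class PlaybackContext:
    device_codecs: FrozenSet[str]   # DEVICE: codecs the device can decode
    max_bitrate: int                # NETWORK: highest bitrate worth offering
    languages: FrozenSet[str]       # USER PREFERENCES: audio/text languages


def filter_streams(streams: List[Stream], ctx: PlaybackContext) -> List[Stream]:
    """Keep video the device can play and audio/text in the preferred languages."""
    kept = []
    for s in streams:
        if s.kind == "video":
            if s.codec in ctx.device_codecs and s.bitrate <= ctx.max_bitrate:
                kept.append(s)
        elif s.language in ctx.languages:
            kept.append(s)
    return kept


catalog = [
    Stream("video", codec="H.264", bitrate=230),
    Stream("video", codec="H.264", bitrate=3000),
    Stream("video", codec="HEVC", bitrate=780),
    Stream("audio", codec="AAC", language="en"),
    Stream("audio", codec="AAC", language="fr"),
    Stream("text", language="en"),
]
ctx = PlaybackContext(frozenset({"H.264"}), max_bitrate=1000, languages=frozenset({"en"}))
manifest_streams = filter_streams(catalog, ctx)
```

Each additional dimension (country, title metadata) would add another predicate, which is why the number of distinct manifests grows quickly with the number of dimensions.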

Page 94: Netflix Edge Engineering Open House Presentations - June 9, 2016

Big Opportunity

Rich playback experiences

Tremendous increase in scale

Customer growth

Page 95: Netflix Edge Engineering Open House Presentations - June 9, 2016

Challenge: Efficient Scaling

Targeting sub-linear growth

# of Requests

Cloud Costs

Page 96: Netflix Edge Engineering Open House Presentations - June 9, 2016

Predictable Viewing Patterns

Key Insight

Page 97: Netflix Edge Engineering Open House Presentations - June 9, 2016

Key Insight

CONTENT RANK

PLAY REQUESTS

Page 98: Netflix Edge Engineering Open House Presentations - June 9, 2016

Also: Manifest Request for one title

PLAY REQUESTS

TIME

Page 99: Netflix Edge Engineering Open House Presentations - June 9, 2016

Current: Completely Real-time

Real-time manifest generation

Page 100: Netflix Edge Engineering Open House Presentations - June 9, 2016

With Caching

Real-time manifest generation

80% Cached

20% Real-time

Page 101: Netflix Edge Engineering Open House Presentations - June 9, 2016

Challenges

How do we determine the optimal combination of attributes to cache on?

Page 102: Netflix Edge Engineering Open House Presentations - June 9, 2016

Challenges

Cache Considerations:

● When to populate?
● When to bust?
● How to scale for cache-miss or failures?
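
The questions above (which attributes to cache on, when to populate, when to bust) can be made concrete with a small sketch. The Python below is a hypothetical illustration, not the design described in the talk: it populates on first miss, expires entries after a TTL, and busts by attribute value. The class name, attribute names, and TTL are assumptions.

```python
import time


class ManifestCache:
    """Cache manifests under a key built from a chosen subset of request attributes."""

    def __init__(self, key_attrs, ttl_s=300.0, clock=time.monotonic):
        self.key_attrs = tuple(key_attrs)   # the "combination of attributes to cache on"
        self.ttl_s = ttl_s
        self.clock = clock
        self.entries = {}                   # key -> (expires_at, manifest)
        self.hits = 0
        self.misses = 0

    def _key(self, request):
        return tuple(request[a] for a in self.key_attrs)

    def get(self, request, generate):
        key = self._key(request)
        entry = self.entries.get(key)
        if entry is not None and entry[0] > self.clock():
            self.hits += 1
            return entry[1]
        self.misses += 1
        manifest = generate(request)        # real-time path on miss or expiry
        self.entries[key] = (self.clock() + self.ttl_s, manifest)
        return manifest

    def bust(self, attr, value):
        """Drop every entry whose key attribute equals the given value."""
        i = self.key_attrs.index(attr)
        self.entries = {k: v for k, v in self.entries.items() if k[i] != value}
```

Fewer key attributes means more requests share an entry (higher hit rate) but less personalization per manifest; that trade-off is the "optimal combination" question. The cache-miss path still needs enough real-time capacity to absorb a cold cache, which this sketch does not address.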

Page 103: Netflix Edge Engineering Open House Presentations - June 9, 2016

Potential Win

10x increase in requests with only 4x increase in costs

Page 104: Netflix Edge Engineering Open House Presentations - June 9, 2016

Optimize computation

Can we re-imagine our service processing to dramatically increase throughput?

Page 105: Netflix Edge Engineering Open House Presentations - June 9, 2016

Anatomy of a Playback Manifest Request

Metadata Access

27%

36%

Tracks Generation

16%

Streams Filtering

21%

Serialization

Page 106: Netflix Edge Engineering Open House Presentations - June 9, 2016

Potential Win

10x increase in requests with just 2x increase in service costs
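
The "potential win" figures quoted in this talk (10x requests at 4x cost with caching, 10x requests at 2x service cost with optimized computation) are easiest to compare as cost per request relative to today. The short Python calculation below does only that arithmetic; the function name is made up for the example.

```python
def relative_cost_per_request(request_growth: float, cost_growth: float) -> float:
    """Cost per request after growth, as a fraction of today's cost per request."""
    return cost_growth / request_growth


with_caching = relative_cost_per_request(10, 4)         # 10x requests, 4x cost
with_optimized_code = relative_cost_per_request(10, 2)  # 10x requests, 2x cost
linear_scaling = relative_cost_per_request(10, 10)      # baseline: cost tracks requests
```

In other words, each request would cost roughly 40% of today's cost under the caching estimate and roughly 20% under the computation estimate, which is what sub-linear growth means here.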

Page 107: Netflix Edge Engineering Open House Presentations - June 9, 2016

Two-pronged Strategy to Scaling

Cache Manifests

Re-architect code to reduce processing time

Page 108: Netflix Edge Engineering Open House Presentations - June 9, 2016

Scaling Problems Across Services

Decide Authorize Collect & Learn

Playback Features

Playback Access

Playback Data Systems

Page 109: Netflix Edge Engineering Open House Presentations - June 9, 2016

Thanks!

@suudhan

Come Talk to Us!

Page 110: Netflix Edge Engineering Open House Presentations - June 9, 2016

Image Attribution

All images used are under Creative Commons or public domain license:

● Video icon - http://simpleicon.com/wp-content/uploads/video-camera-1.png
● Speaker icon - https://upload.wikimedia.org/wikipedia/commons/thumb/2/21/Speaker_Icon.svg/1024px-Speaker_Icon.svg.png
● Subtitle icon - https://thenounproject.com/term/subtitles/78795/
● Uptrend image - https://pixabay.com/en/chart-line-line-chart-diagram-trend-148256/
● Funnel image - https://commons.wikimedia.org/wiki/File:Funnel_Mech.svg
● Business Intelligence image - https://pixabay.com/static/uploads/photo/2015/04/14/23/17/it-business-722950_960_720.png
● Key icon - https://pixabay.com/static/uploads/photo/2014/04/03/10/55/key-311738_960_720.png
● Person icon - https://pixabay.com/static/uploads/photo/2015/12/22/04/00/photo-1103596_960_720.png
● Mobile icon - https://upload.wikimedia.org/wikipedia/commons/thumb/1/14/Mobile_phone_font_awesome.svg/1024px-Mobile_phone_font_awesome.svg.png
● Globe image - https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Simple_Globe.svg/1024px-Simple_Globe.svg.png
● Devices icon - https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Simple_Globe.svg/1024px-Simple_Globe.svg.png
● wifi icon - https://pixabay.com/static/uploads/photo/2016/01/03/11/32/wireless-signal-1119306_960_720.png
● cell tower - https://pixabay.com/static/uploads/photo/2012/04/13/00/23/tower-31235_960_720.png