Automated JVM Tuning with Bayesian Optimization



Page 1: AUTOMATED JVM TUNING WITH BAYESIAN OPTIMIZATION · automated jvm tuning with bayesian optimization. material developed with ian brown @igb kevin swersky @ kswersk jasper snoek @latentjasper

AUTOMATED JVM TUNING WITH BAYESIAN OPTIMIZATION

JIANQIAO LIU @TRACKPOINT10
RAMKI RAMAKRISHNA @YSR1729
ALEX WILTSCHKO @AWILTSCH

TWITTER ENGINEERING
ADVANCED TECHNOLOGY GROUP

MATERIAL DEVELOPED WITH

IAN BROWN @IGB
KEVIN SWERSKY @KSWERSK
JASPER SNOEK @LATENTJASPER
RYAN ADAMS @RYAN_P_ADAMS
HUGO LAROCHELLE @HUGOLAROCHELLE

ACKNOWLEDGEMENTS

DAVE BARR @DAVEBARR
JOHN COOMES @JOHN_COOMES
IAN DOWNES @NDWNS
DAVE ROBINSON @DAVEROBINSON
CHRIS REGADO @CHRISREGADO
TODD STUMPF @STUMPF

WHO WE ARE

• Jianqiao Liu: Graduate Student, ECE at Purdue University (work done while a summer intern at Twitter San Francisco)

• Ramki Ramakrishna: Staff Engineer, JVM Engineering at Twitter San Francisco

• Alex Wiltschko: Research Engineer, Advanced Technology Group at Twitter Boston

JAVAONE 2016

TWITTER RUNS ON MICRO-SERVICES

• O(10^3) services
• O(10^5) service instances
• Heterogeneous hardware
• Varying resources

Source: “How we built a metering and chargeback system to incentivize higher resource utilization of Twitter infrastructure”, Micheal Arul, Vinu Charanya, LinuxCon 2016, Toronto, August 22-24, 2016.

A PERFORMANCE STACK AT TWITTER
A simplified view

[Diagram: Microservice A and Microservice B, each on its own JVM inside its own Mesos Container, on top of shared Kernel + OS Services and Hardware. Each layer has its own parameters (h1, h2, …; k1, k2, …; m1, m2, …; j1, j2, …; s1, s2, …), and performance is a function f(h,k,m,j,s) of all of them.]

TUNING AT THE JVM LAYER

• HotSpot JVM has hundreds of tunable knobs:

    $ java -XX:+PrintFlagsFinal -version | grep "="
    uintx AdaptiveSizePolicyWeight     = 10     {product}
    uintx AdaptiveSizeThroughPutPolicy = 0      {product}
    uintx AdaptiveTimeWeight           = 25     {product}
    bool  AdjustConcurrency            = false  {product}
    bool  AggressiveOpts               = false  {product}
    intx  AliasLevel                   = 3      {C2 product}
    bool  AlignVector                  = false  {C2 product}
    …

    $ java -XX:+PrintFlagsFinal -version | grep "=" | wc -l
    757

A large variety of parameters:

• performance-sensitivity
• hardware-dependency
• mutual (in)dependency
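The flag listing above is easy to turn into a machine-readable inventory of knobs. The following is a minimal sketch, not part of the talk: the function name parse_flags and the SAMPLE text are illustrative, and the regular expression assumes the "type name = value {kind}" layout shown on the slide (":=" is also accepted, which HotSpot prints for flags changed from their defaults).

```python
import re

# Matches lines such as "uintx AdaptiveSizePolicyWeight = 10 {product}".
FLAG_LINE = re.compile(r"^\s*(\S+)\s+(\S+)\s+:?=\s*(\S*)\s+\{(.+)\}\s*$")


def parse_flags(text):
    """Return {flag_name: (type, value, kind)} parsed from PrintFlagsFinal output."""
    flags = {}
    for line in text.splitlines():
        match = FLAG_LINE.match(line)
        if match:
            flag_type, name, value, kind = match.groups()
            flags[name] = (flag_type, value, kind)
    return flags


# Sample lines copied from the slide; real input would be the captured
# stdout of `java -XX:+PrintFlagsFinal -version`.
SAMPLE = """\
uintx AdaptiveSizePolicyWeight = 10 {product}
bool AggressiveOpts = false {product}
intx AliasLevel = 3 {C2 product}
"""

flags = parse_flags(SAMPLE)
print(len(flags), flags["AliasLevel"])
```

Values are kept as strings here; converting them by type (bool, intx, uintx, ...) is left to the caller.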

PERFORMANCE OPTIMIZATION
Needs to be continuous

Hand-tuning doesn’t scale:
• few parameters handled manually
• time-consuming, labor-intensive, error-prone

Cargo-culted configurations

Upgrades make optimality fleeting

Hypothesis: Most micro-services operate below optimality
Preview: 80% improvement on a large service

PERFORMANCE TUNING
As a formal optimization problem

• Given a function f(x1, x2, …, xn) defined over domain X

• Find a configuration A = (a1, a2, …, an) that maximizes f

PERFORMANCE TUNING
As a constrained optimization problem

• Simple constraints:

x1 <= x2 : e.g. NewSize <= HeapSize

a <= x3 <= b : e.g. 0 <= MaxTenuringThreshold <= 15

• More complex constraints:

g(x1, x2) <= h(x3, x4)

• Constraints on behavior:

w(X) < k : e.g. 99th percentile of response latency < 5 ms

r(X) = t(X) : e.g. no requests result in errors
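To make the two kinds of constraints concrete, here is a minimal sketch of how a tuning harness might check them. It is an illustration only: the function names, the dictionary keys (taken from the slide's examples) and the choice of megabytes as the size unit are assumptions, not the talk's implementation.

```python
def satisfies_static_constraints(cfg):
    """Constraints that can be checked before running an experiment."""
    return (
        cfg["NewSize"] <= cfg["HeapSize"]               # x1 <= x2
        and 0 <= cfg["MaxTenuringThreshold"] <= 15      # a <= x3 <= b
    )


def satisfies_behavior_constraints(p99_latency_ms, error_count):
    """Constraints that can only be checked after measuring the system."""
    return p99_latency_ms < 5.0 and error_count == 0


# Hypothetical candidate; sizes are in MB for illustration.
candidate = {"NewSize": 512, "HeapSize": 4096, "MaxTenuringThreshold": 4}
print(satisfies_static_constraints(candidate))
print(satisfies_behavior_constraints(p99_latency_ms=3.2, error_count=0))
```

The split matters in practice: static constraints can reject a suggestion for free, while behavioral constraints cost a full experiment to evaluate.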

PERFORMANCE TUNING
As optimization of a noisy, non-stationary cost function

The environment may introduce hidden, uncontrollable, and possibly time-varying parameters, e.g.:
• inter-container cross-talk
• seasonal/diurnal environmental conditions or load
• heterogeneous hardware

PERFORMANCE TUNING

• Design (and refine) a suitable performance metric

• Decide on (and refine) knobs to tune

• Use an iterative strategy to tune these knobs

PERFORMANCE TUNING
The manual approach

[Diagram: a loop between the Performance Engineer and the System being tuned. Measure: measure performance with new parameter settings. Analyze: pick new parameters to test based on results obtained.]

PERFORMANCE TUNING
Using an automation assistant

[Diagram: a loop between the Black Box Tuning Assistant and the System being tuned. “Evaluation”: measure performance with new parameter settings. “Suggestion”: pick new parameters to test based on results obtained.]
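The suggestion/evaluation loop can be sketched in a few lines. In this sketch the assistant is a plain random search standing in for BayesOpt, purely to show the shape of the loop; the class, method and function names are hypothetical and are not the API of Twitter's service, and toy_system is an invented black box.

```python
import random


class RandomSearchAssistant:
    """Stand-in tuning assistant: suggests uniformly random settings."""

    def __init__(self, bounds, seed=0):
        self.bounds = bounds              # {name: (low, high)}
        self.rng = random.Random(seed)
        self.history = []                 # (params, score) pairs reported so far

    def suggest(self):
        return {k: self.rng.uniform(lo, hi) for k, (lo, hi) in self.bounds.items()}

    def report(self, params, score):
        self.history.append((params, score))

    def best(self):
        return max(self.history, key=lambda pair: pair[1])


def tune(assistant, evaluate, iterations):
    """Run the loop: ask for a suggestion, evaluate it, report the score."""
    for _ in range(iterations):
        params = assistant.suggest()      # "suggestion"
        score = evaluate(params)          # "evaluation"
        assistant.report(params, score)
    return assistant.best()


def toy_system(params):
    """Toy black box whose score peaks at x = 3."""
    return -(params["x"] - 3.0) ** 2


assistant = RandomSearchAssistant({"x": (0.0, 10.0)})
best_params, best_score = tune(assistant, toy_system, iterations=50)
print(best_params, best_score)
```

A Bayesian optimizer would keep the same suggest/report interface but use the reported history to decide where to sample next.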

HOW SHOULD WE BUILD AN AUTOMATION ASSISTANT?

A technique from machine learning: Bayesian Optimization

• A machine learning approach to black-box optimization
• A method to learn (potentially noisy) cost functions
  • iteratively
  • efficiently
• Finds very good answers very quickly on a wide variety of problems

I'll show you how it works in practice

BAYESIAN OPTIMIZATION

Each experiment we run with a different setting of our parameters is expensive

BAYESIAN OPTIMIZATION

If choosing what experiments to run is important, how do we do it well?

[Plot label: Expected Improvement]

BAYESIAN OPTIMIZATION
BayesOpt in action

Global optimum discovered

[Plot label: Expected Improvement]
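Expected Improvement has a closed form when the model's prediction at a candidate point is Gaussian, as it is with a Gaussian-process surrogate. The slides do not say which surrogate model the service uses, so the sketch below is an assumption-laden illustration of the textbook formula for a maximization problem, with mu and sigma the predicted mean and standard deviation and best the highest score observed so far.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization under a Normal(mu, sigma**2) prediction."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    return (mu - best) * cdf + sigma * pdf


# Two candidates with the same predicted mean, below the current best of 1.2:
# the more uncertain one has the higher expected improvement.
print(expected_improvement(mu=1.0, sigma=0.1, best=1.2))
print(expected_improvement(mu=1.0, sigma=0.5, best=1.2))
```

This is how the criterion balances exploitation (high mu) against exploration (high sigma) when picking the next experiment.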

BAYESIAN OPTIMIZATION

BayesOpt works in much higher dimensions than humans do

BUILDING AN AUTOTUNING SERVICE
What should an ideal autotuning system look like?

What do we want an implementation of BayesOpt to look like in practice?
• Easy to use
• Minimal coding required by the user
• Support multiple languages
• Running concurrent experiments should be trivial

BUILDING AN AUTOTUNING SERVICE
An example API call

BUILDING AN AUTOTUNING SERVICE
The service layout of BayesOpt at Twitter

[Diagram components: Lua, Matlab, Python, Scala and CLI clients; Webserver; Middleware; Queue; Worker (Queue and State Manager); BayesOpt Engine; Auto-Scaling Group; Mesos.]

ALTERNATIVE APPROACHES
BayesOpt isn't the only way to do it, but it's by far our favorite

Random Search (Bergstra 2012, shockingly good for zero effort!)
Parzen Trees (Bergstra 2011)
Random Forests (Hutter et al., 2011)
Reinforcement Learning (e.g., Google Datacenter Cooling)

We prefer BayesOpt because it's
• Robust
• Extensible
• Battle-tested on many types of real-world, high-impact problems

BAYESOPT WINS AT TWITTER
We're just getting started

Spam detection (+8%)
Abuse detection (+6%)
All deep learning applications ("set it and forget it" prototyping)
Vine video recommendations (+30% user engagement on recs)
Hadoop cost reduction (-80% cost)
Revenue applications
JVM performance (take it away, Jianqiao!)

A SAMPLING OF JVM PARAMETERS

Garbage Collector Type
New Generation Size
Survivor Ratio
Parallel GC Threads
Concurrent GC Threads
Pre-fetch Interval Size
Clip In-lining
Biased Locking
There are dozens more

A TYPICAL MICRO-SERVICE STACK

[Diagram: a stack of Service (S1, S2, … , Sm), JVM (J1, J2, … , Jn), Kernel (K1, K2, … , Ko) and Hardware (H1, H2, … , Hp). The JVM layer alone gives F(J); the whole stack gives F(J,S,K,H,…).]

SPECJBB2015
A JVM benchmark

[Diagram: SPECjbb2015 running on the JVM (J1, J2, … , Jn) above the Kernel; the function being tuned is F(J).]

A MICROSERVICE ON THE JVM

[Diagram: Microservice A running on the JVM (J1, J2, … , Jn) above the Kernel; the function being tuned is F(J).]

MICROSERVICE "A"

• A large production service

• Access to User Objects via a Thrift interface

• Why this service?
  • Mature, does not undergo frequent redeploys
  • Large number of service instances

THE SET-UP

• Environment
  • Microservice staging environment
  • Real production traffic (dark read)
  • Portion of production workload

• Performance metric
  • RPS: requests per second
  • GC_cost: wall-clock time spent in GC
  • Perf = RPS / GC_cost
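The metric is a single division, but writing it down shows how it rewards both higher throughput and lower GC time. This is a minimal sketch: the function name and the sample numbers are made up, and the units of GC_cost are not stated on the slide, so none are assumed beyond it being a positive number.

```python
def perf(rps, gc_cost):
    """Perf = RPS / GC_cost, as defined on the slide."""
    if gc_cost <= 0:
        raise ValueError("gc_cost must be positive")
    return rps / gc_cost


# Hypothetical measurements: the same throughput with half the GC time
# doubles the score.
print(perf(rps=1000.0, gc_cost=4.0))
print(perf(rps=1000.0, gc_cost=2.0))
```

A ratio like this is sensitive to very small GC_cost values, which is one reason a metric may need refining as experiments accumulate.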

[Diagram: the JVM Tuning Service connected to the BayesOpt Service, the Version Controlled File Store Service, the Aurora Scheduler (Mesos), the Observability/Metrics Service, and Microservice A (Shards #0 to #4). Arrow labels: Suggestion, Score ratio, Config, Stop, Restart, Baseline and Experiment scores.]

1. Get a new parameter suggestion from BayesOpt

2. Generate JVM configuration

3. Upload new configuration to File Store Service

4. Get baseline platform information

5. Stop-and-restart test instance on specific hardware

6. Test for a fixed duration

7. Get valid baselines

8. Obtain baseline performance score

9. Obtain performance score from experiment

10. Compute the ratio and inform BayesOpt
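Steps 2 and 10 of the workflow above are simple enough to sketch. The code below is an illustration under stated assumptions, not the tuning service's code: the function names are hypothetical, the suggestion is assumed to arrive as a dictionary keyed by HotSpot flag name, and the ratio is assumed to be the experiment score divided by the baseline score. The -XX:+Name, -XX:-Name and -XX:Name=value forms are the standard HotSpot option syntax.

```python
def to_jvm_flags(params):
    """Render a suggestion as HotSpot -XX options (step 2)."""
    flags = []
    for name, value in sorted(params.items()):
        if isinstance(value, bool):          # check bool first: bool is a subclass of int
            flags.append(f"-XX:{'+' if value else '-'}{name}")
        else:
            flags.append(f"-XX:{name}={value}")
    return flags


def score_ratio(experiment_score, baseline_score):
    """Normalize an experiment against its baseline (step 10)."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return experiment_score / baseline_score


# Hypothetical suggestion using real HotSpot flag names.
suggestion = {"SurvivorRatio": 6, "ParallelGCThreads": 8, "UseBiasedLocking": False}
print(" ".join(to_jvm_flags(suggestion)))
print(score_ratio(experiment_score=300.0, baseline_score=200.0))
```

Reporting a ratio against a baseline measured on the same hardware, rather than a raw score, is what lets experiments from different platforms be compared.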

EVALUATION
BayesOpt in action

[Plot: performance score ratio (y-axis, 0 to 2.5) versus iterations (x-axis, 0 to 90).]

PERFORMANCE OF THE OPTIMUM RESULT

Performance ratio: 590.253 / 324.237 = 182.0%

REQUESTS PER SECOND
Of the optimum BayesOpt result

GC COST
Of the optimum BayesOpt result

GC_cost ratio: 15.3 / 27.9 = 54.8%, a reduction of 45%
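The two reported ratios can be checked with a few lines of arithmetic. The numbers are the ones quoted on the slides; only the variable names are added here.

```python
perf_ratio = 590.253 / 324.237   # optimum score over the other score, as quoted
gc_ratio = 15.3 / 27.9           # GC_cost of the optimum over the other GC_cost

print(f"Performance ratio: {perf_ratio:.1%}")   # 182.0%
print(f"GC_cost ratio: {gc_ratio:.1%}")         # 54.8%
print(f"GC_cost reduction: {1 - gc_ratio:.1%}") # 45.2%
```

Both figures match the slides, and the 182.0% ratio corresponds to the roughly 80% improvement previewed earlier in the talk.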

PRACTICAL EFFORT

Apples-to-apples comparison:
• Over 50 different platforms in Mesos; ~12 major platforms

System services or Python bugs could cause failure:
• timeout (subprocess)
• return empty content
• throw exceptions

Mesos/Aurora scheduler might interrupt & restart an instance

Authentication service ticket timeout
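The three failure modes listed for subprocess calls (timeouts, empty output, exceptions) can all be handled by one small retry wrapper. This is a hedged sketch rather than the authors' code: run_with_retries and its parameters are hypothetical, and the example command simply invokes the current Python interpreter so that the block is self-contained.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, timeout_s=30.0, attempts=3, backoff_s=0.0):
    """Run cmd, retrying on timeout, non-zero exit, launch failure, or empty output."""
    last_error = None
    for attempt in range(attempts):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout_s, check=True
            )
            if result.stdout.strip():
                return result.stdout
            last_error = RuntimeError("command returned empty output")
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError) as exc:
            last_error = exc
        time.sleep(backoff_s * (attempt + 1))   # simple linear backoff between attempts
    raise RuntimeError(f"{cmd!r} failed after {attempts} attempts") from last_error


# Stand-in for a call to an external service's command-line tool.
output = run_with_retries([sys.executable, "-c", "print('ok')"])
print(output.strip())
```

Retries only help with transient failures; an instance restarted by the scheduler mid-experiment still needs its measurement discarded.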

NEXT STEPS

• Several concurrent evaluations
• Trade off with longer experiment duration
• Experiment set per hardware platform
• Terminate obviously poor suggestions early

• Stress test/validation of optimal configuration(s)

• General framework/service for optimizing an arbitrary micro-service

• Clean up service authentication and use more robust, official APIs
• Replace Python subprocess calls with direct calls to Python APIs

LESSONS FROM OPTIMAL RESULT

Optimal Parameter Settings:
• Similar new generation size
• Smaller tenuring threshold, smaller survivor spaces
• Larger prefetch read interval for GC scan
• Old generation promotion allocator filter parameters
• More GC threads
• Higher compilation size threshold

Performance Gains:
• GC overhead
• Tail response latency
• Data center footprint

MORE LESSONS

• Choice of performance function

• Choice of parameters to tune

• Duration of evaluation runs

• Concurrent evaluations

• Factor out hardware effects

• Protect against noise

• Use baseline configurations

• Long range effects

• Stress-testing to filter optima

AUTOMATED PERFORMANCE TUNING
Leverage existing micro-services infrastructure

• BayesOpt suggestions/evaluations may be sub-optimal
• Reliability and redundancy designed into micro-services architecture
• Existing monitors/alarms/alerts/sensors/telemetry

CONCLUSION
Automated performance optimization in the DevOps deployment workflow

• Continuous, inexpensive automated optimization of micro-services is possible, even inevitable

• BayesOpt reduces the number of costly experiments to quickly find a near-optimal setting

• Existing micro-services and DevOps frameworks already have most of the infrastructure to support this

QUESTIONS?

THANK YOU!