Debugging Node.js in Production
Yunong Xiao
@yunongx Software Engineer
Node Platform
Node.js @ Netflix
❖ 65+ Million Subscribers
❖ Website (netflix.com)
❖ Dynamic asset packager
❖ PaaS on Node
❖ Internal Services
“Let's work the problem, people. Let's not make things any worse by guessing.”
–Gene Kranz, Flight Director, Apollo 13
Apply the Scientific Method
1. Construct a Hypothesis
2. Collect data
3. Analyze data and draw a conclusion
4. Repeat
Production Crisis
❖ Runtime Performance
❖ Runtime Crashes
❖ Memory Leaks
Netflix is “Slow”
Gather Request Data
http://restify.com
http://github.com/restify/node-restify
Observable REST Framework to the Rescue
[2014-12-09T14:07:26.293Z] INFO: shakti/restify-audit/20067: handled: 200, latency=1402 (req_id=b3fa3820-7fac-11e4-8908-a5c7b70d676f, latency=1435)
    GET / HTTP/1.1
    host: www.netflix.com
    --
    HTTP/1.1 200 OK
    x-netflix.client.instance: i-057e47ef
    x-frame-options: DENY
    content-type: text/html
    --
    req.timers: {
      "parseBody": 700123,
      "apiRPC": 301911,
      "render": 400031
    }
req.timers: { "parseBody": 700123, "apiRPC": 301911, "render": 400031 }
On CPU
CPU is Critical
❖ Node is essentially “single threaded”
❖ Cascading effect on ALL requests in process
req.timers: { "parseBody": 700123, "apiRPC": 301911, "render": 400031 }
Can’t process ANY other request for 1.1 seconds
On CPU
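As a rough check on the 1.1 second figure, the sketch below sums the handlers that keep the CPU busy. It assumes the req.timers values are microseconds (which is consistent with the logged latency of 1402 ms) and that parseBody and render are the on-CPU handlers while apiRPC is waiting on I/O. The helper name onCpuMillis is hypothetical.

```javascript
// Timer values copied from the req.timers line above; assumed to be microseconds.
const timers = { parseBody: 700123, apiRPC: 301911, render: 400031 };

// Hypothetical helper: sum the named handlers and convert to milliseconds.
function onCpuMillis(allTimers, onCpuHandlers) {
  const micros = onCpuHandlers.reduce((sum, name) => sum + allTimers[name], 0);
  return micros / 1000;
}

// Assumption: parseBody and render are CPU bound, apiRPC is an async network call.
const blockedMs = onCpuMillis(timers, ['parseBody', 'render']);
console.log(Math.round(blockedMs) + ' ms on CPU'); // 1100 ms on CPU
```

While those handlers run, the event loop cannot pick up any other request, which is where the cascading effect comes from.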
How Much Code?
$ find . -name "*.js*" | xargs cat | wc -l
6 042 301
Statistically Sample Stack Traces
Snapshot What’s Currently Executing
Stacktrace: A stack trace is a report of the active stack frames at a certain point in time during the execution of a program.
> console.log(ex, ex.stack.split("\n"))
ReferenceError: ex is not defined
    at repl:1:13
    at REPLServer.defaultEval (repl.js:132:27)
    at bound (domain.js:254:14)
    at REPLServer.runBound [as eval] (domain.js:267:12)
    at REPLServer.<anonymous> (repl.js:279:12)
    at REPLServer.emit (events.js:107:17)
    at REPLServer.Interface._onLine (readline.js:214:10)
    at REPLServer.Interface._line (readline.js:553:8)
    at REPLServer.Interface._ttyWrite (readline.js:830:14)
    at ReadStream.onkeypress (readline.js:109:10)
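The REPL call above fails only because ex is not defined, yet the error it raises still carries a stack. A minimal sketch of capturing a stack on purpose, using the standard stack property that V8 attaches to Error objects; the function names are hypothetical:

```javascript
// Hypothetical helper: return the current call stack as an array of frame strings.
function currentFrames() {
  // V8 fills in .stack when the Error object is created.
  const lines = new Error('snapshot').stack.split('\n');
  // The first line is the error message; the rest are "at ..." frames.
  return lines.slice(1).map((line) => line.trim());
}

function inner() {
  return currentFrames();
}

function outer() {
  return inner();
}

// Innermost frame first: currentFrames, then inner, then outer.
const frames = outer();
console.log(frames.slice(0, 3).join('\n'));
```

This captures a stack from inside the process, so it costs time on the same thread being measured. That is the second of the two problems the next slide raises.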
Two Problems
1) How to sample stack traces from a running process?
2) How to do 1) without affecting the process?
Linux Perf Events
PERF(1)                     perf Manual                     PERF(1)
NAME
    perf - Performance analysis tools for Linux
SYNOPSIS
    perf [--version] [--help] COMMAND [ARGS]
DESCRIPTION
    Performance counters for Linux are a new kernel-based subsystem that provide a framework for all things performance analysis. It covers hardware level (CPU/PMU, Performance Monitoring Unit) features and software features (software counters, tracepoints) as well.
Sample Stack Traces w/ perf(1)
# perf record -F 99 -p `pgrep -n node` -g -- sleep 30
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.524 MB perf.data (~22912 samples) ]
Sample Stack Trace
ab2fee v8::internal::Heap::DeoptMarkedAllocationSites() (/apps/node/bin/
a69754 v8::internal::StackGuard::HandleInterrupts() (/apps/node/bin/node)
c9f13b v8::internal::Runtime_StackGuard(int, v8::internal::Object**
3c793e3060bb (/tmp/perf-5382.map)
3c793e3060bb (/tmp/perf-5382.map)
3c793e3060bb (/tmp/perf-5382.map)
3c793e3060bb (/tmp/perf-5382.map)
(repeated 30 more lines)
8e6b2f v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) (/apps/node/bin/node)
8f2281 v8::Function::Call(v8::Local<v8::Value>, int, v8::Local<v8::Value>*) (/apps/node/bin/node)
df599a node::MakeCallback(node::Environment*, v8::Local<v8::Value>,...
df5ccb node::CheckImmediate(uv_check_s*) (/apps/node/bin/node)
fb1597 uv__run_check (/apps/node/bin/node)
fabcee uv_run (/apps/node/bin/node)
dfaa50 node::Start(int, char**) (/apps/node/bin/node)
7fcc3ef6876d __libc_start_main (/lib/x86_64-linux-gnu/libc-2.15.so)
Missing JS Frames
Why? V8 generates code and its symbols just in time (JIT), so perf cannot resolve JS frames on its own
node --perf_basic_prof_only_functions
“outputs the files in a format that the existing perf tool can consume.”
node --perf_basic_prof_only_functions
Available right now in Node v5.x
Coming soon to Node v4.3: https://github.com/nodejs/node/pull/3609
Results
node 5382 cpu-clock:
    3c793e38b0c1 LazyCompile:DELETE native runtime.js:349 (/tmp/perf-5382.map)
    3c793e31981d Builtin:JSConstructStubGeneric (/tmp/perf-5382.map)
    3c793ff2ca94 (/tmp/perf-5382.map)
    3c793e98a10f LazyCompile:~AtlasClient._run /apps/node/webapp/node_modules/nf-atlas-client/lib/client/AtlasClient.js:85 (/tmp/perf-5382.map)
    3c793f47de29 LazyCompile:*AtlasClient.timer /apps/node/webapp/node_modules/nf-atlas-client/lib/client/AtlasClient.js:70 (/tmp/perf-5382.map)
    3c793e9eee38 LazyCompile:~fetchSingleGetCallback /apps/node/webapp/singletons/ShaktiFetcher.js:120 (/tmp/perf-5382.map)
    3c793f6cffee LazyCompile:*Model.get /apps/node/webapp/node_modules/nf-models/lib/Model.js:90 (/tmp/perf-5382.map)
    3c793ed3e2ad (/tmp/perf-5382.map)
    3c7940e4357b Handler:ca (/tmp/perf-5382.map)
    3c793f060e3c Function:~ /apps/node/webapp/node_modules/vasync/lib/vasync.js:134 (/tmp/perf-5382.map)
    3c79404edbfa (/tmp/perf-5382.map)
    3c79401fd3f7 (/tmp/perf-5382.map)
    3c79400e307b LazyCompile:*fetchMulti /apps/node/webapp/singletons/ShaktiFetcher.js:50 (/tmp/perf-5382.map)
    3c793fb9a59f LazyCompile:*fetch /apps/node/webapp/singletons/ShaktiFetcher.js:32 (/tmp/perf-5382.map)
    3c793e896697 (/tmp/perf-5382.map)
    3c7943aaabbe (/tmp/perf-5382.map)
    3c793ef4c53c Function:~ /apps/node/webapp/node_modules/vasync/lib/vasync.js:245 (/tmp/perf-5382.map)
    3c793eaf4f01 LazyCompile:* /apps/node/webapp/node_modules/nf-packager/lib/index.js:194 (/tmp/perf-5382.map)
    3c793eab130a LazyCompile:processImmediate timers.js:352 (/tmp/perf-5382.map)
    3c793e319f7d Builtin:JSEntryTrampoline (/tmp/perf-5382.map)
    3c793e3189e2 Stub:JSEntryStub (/tmp/perf-5382.map)
    a65baf v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, bool) (/apps/node/bin/node)
    8e6b2f v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) (/apps/node/bin/node)
    8f2281 v8::Function::Call(v8::Local<v8::Value>, int, v8::Local<v8::Value>*) (/apps/node/bin/node)
    df599a node::MakeCallback(node::Environment*, v8::Local<v8::Value>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*) (/apps/node/bin/node)
    df5ccb node::CheckImmediate(uv_check_s*) (/apps/node/bin/node)
    fb1597 uv__run_check (/apps/node/bin/node)
    fabcee uv_run (/apps/node/bin/node)
    dfaa50 node::Start(int, char**) (/apps/node/bin/node)
    7fcc3ef6876d __libc_start_main (/lib/x86_64-linux-gnu/libc-2.15.so)
JS Frames
Native Frames
Problem: Too Many Traces
$ cat out.nodestacks01 | grep cpu-clock | wc -l
744
$ wc -l out.nodestacks01
58116
Too Many Traces
Solution: Flame Graphs
Flamegraph
❖ Each box represents a function in the stack (stack frame)
❖ x-axis: percent of time on CPU
❖ y-axis: stack depth
❖ colors: random, or can be a dimension
❖ https://github.com/brendangregg/FlameGraph
[Figure: flame graph with regions labeled v8, libc, JS, and built-ins]
Flame Graph Interpretation
[Figure: example flame graph with frames a(), b(), h(), c(), d(), e(), f(), g(), i()]
Flame Graph Interpretation
Top edge shows who is running on-CPU, and how much (width)
[Figure: example flame graph with frames a(), b(), h(), c(), d(), e(), f(), g(), i()]
Flame Graph Interpretation
Top-down shows ancestry, e.g., from g()
[Figure: example flame graph with frames a(), b(), h(), c(), d(), e(), f(), g(), i()]
Flame Graph Interpretation
Widths are proportional to presence in samples, e.g., comparing b() to h() (incl. children)
[Figure: example flame graph with frames a(), b(), h(), c(), d(), e(), f(), g(), i()]
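To make the width rule concrete, here is a small sketch that counts how often each function shows up in a set of sampled stacks. The sample data and the name inclusiveShare are made up for illustration, and the real FlameGraph scripts do considerably more (folding perf output and rendering an SVG).

```javascript
// Hypothetical samples: each entry is one sampled stack, root frame first.
const samples = [
  ['a', 'b', 'c', 'd', 'e'],
  ['a', 'b', 'c', 'd', 'f', 'g'],
  ['a', 'b', 'c', 'd', 'f', 'g'],
  ['a', 'h', 'i'],
];

// Fraction of samples in which each function appears anywhere on the stack.
// In this data every function sits on a single call path, so the fraction
// matches the width of its box, children included.
function inclusiveShare(stacks) {
  const counts = new Map();
  for (const stack of stacks) {
    for (const fn of new Set(stack)) { // Set: count recursion once per sample
      counts.set(fn, (counts.get(fn) || 0) + 1);
    }
  }
  const share = {};
  for (const [fn, n] of counts) share[fn] = n / stacks.length;
  return share;
}

const share = inclusiveShare(samples);
console.log(share.b, share.h); // 0.75 0.25
```

With these made-up samples b() would be drawn three times as wide as h().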
> 50% time on CPU
lodash!
function merge(object) { var args = arguments, length = 2;...
Use _.assign() Instead
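Why the swap can help: _.merge recurses into nested objects, while _.assign copies only top-level properties. The sketch below illustrates that difference with the built-in Object.assign and a hypothetical deepMerge helper. It is not lodash's implementation, and the swap is only safe where nested objects do not need to be combined.

```javascript
// Hypothetical recursive merge, standing in for the idea behind _.merge.
function deepMerge(target, source) {
  for (const key of Object.keys(source)) {
    const value = source[key];
    if (value && typeof value === 'object' && !Array.isArray(value)) {
      const base = target[key] && typeof target[key] === 'object' ? target[key] : {};
      target[key] = deepMerge(base, value); // walks every nested level
    } else {
      target[key] = value;
    }
  }
  return target;
}

const defaults = { ui: { theme: 'dark', rows: 10 } };
const overrides = { ui: { rows: 20 } };

// Shallow: one pass over top-level keys; the nested object is replaced wholesale.
const shallow = Object.assign({}, defaults, overrides);
// Deep: nested keys are combined, at the cost of visiting the whole tree.
const deep = deepMerge(deepMerge({}, defaults), overrides);

console.log(shallow.ui); // { rows: 20 }
console.log(deep.ui);    // { theme: 'dark', rows: 20 }
```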
Before
After
Flame Graphs
Helps you find 1 LoC out of 6 Million
Results
❖ Dramatically reduced request latency
❖ Reduced CPU utilization
❖ Increased throughput
Runtime Performance Technique
❖ Sample stack traces via perf(1)
❖ Visualize code distribution with CPU flame graphs
❖ Identify candidate code paths for performance improvement
❖ Repeat
Runtime Crashes
“The method described in this article was designed to provide a core dump… with a minimal impact on the spacecraft… as the resumption of data acquisition from the spacecraft is the highest priority.”
- Chafin, R. "Pioneer F & G Telemetry and Command Processor Core Dump Program." JPL Technical Report XVI, no. 32-1526 (1971): 174.
Core Dumps — A Brief History
❖ Magnetic core memory
❖ Dump out the contents of “core” memory for debugging
❖ “Core dump” was born
❖ Initially printed on paper!
❖ Postmortem debugging was born!
Production Constraints
❖ Uptime is critical
❖ Not easily reproducible
❖ Can’t simulate environment
❖ Resume normal operations ASAP
Postmortem Debugging
Take core dump
Restart app
Continue serving traffic
Load core dump elsewhere
Engineer: debug and fix
Configure Node to Dump Core on Error
$ node --abort_on_uncaught_exception throw.js
Uncaught Error

FROM
Object.<anonymous> (/Users/yunong/throw.js:1:63)
Module._compile (module.js:435:26)
Object.Module._extensions..js (module.js:442:10)
Module.load (module.js:356:32)
Function.Module._load (module.js:311:12)
Function.Module.runMain (module.js:467:10)
startup (node.js:134:18)
node.js:961:3
[1] 4131 illegal hardware instruction (core dumped) node --abort_on_uncaught_exception throw.js
Node Post Mortem Tooling
❖ Netflix uses Linux in Prod
❖ Linux — Work in progress
❖ https://github.com/tjfontaine/lldb-v8
❖ https://github.com/indutny/llnode
❖ Solaris — Full featured, compatible with Linux cores
❖ https://github.com/joyent/mdb_v8
Socks & Duct Tape: Set Up a Debug Solaris Instance
EC2: http://omnios.omniti.com/wiki.php/Installation#IntheCloud
VM: http://omnios.omniti.com/wiki.php/Installation#Quickstart
Post Mortem Methodology
❖ Where: Inspect stack trace
❖ Why: Inspect heap and stack variable state
mdb(1) JS commands
❖ ::help <cmd>
❖ ::jsstack
❖ ::jsprint
❖ ::jssource
❖ ::jsconstructor
❖ ::findjsobjects
❖ ::jsfunctions
Load the Core Dump
# mdb ./node-v4.2.2-linux/node-v4.2.2-linux-x64/bin/node ./core.7186
> ::load ./mdb_v8_amd64.so
mdb_v8 version: 1.1.1 (release, from 28cedf2)
V8 version: 143.156.132.195
Autoconfigured V8 support from target
C++ symbol demangling enabled
linux node binary
core dump
load mdb_v8 module
::jsstack
> ::jsstack
js:     test
js:     storeHeader
js:     <anonymous> (as OutgoingMessage._storeHeader)
js:     <anonymous> (as ServerResponse.writeHead)
js:     restifyWriteHead
js:     _cb
js:     send
js:     <anonymous> (as <anon>)
js:     <anonymous> (as ReactRenderer._renderLayout)
js:     <anonymous> (as <anon>)
js:     <anonymous> (as <anon>)
js:     <anonymous> (as dispatchHandler)
js:     <anonymous> (as <anon>)
js:     runHooks
js:     runTransitionToHooks
js:     <anonymous> (as assign.to)
js:     <anonymous> (as <anon>)
js:     runHooks
js:     runTransitionFromHooks
js:     <anonymous> (as assign.from)
js:     <anonymous> (as React.createClass.statics.dispatch)
native: _ZN2v88internalL6InvokeEbNS0_6HandleINS0_10JSFunctionEEENS1_INS0...
native: v8::internal::Execution::Call+0xc8
native: v8::internal::Runtime_Apply+0x1ce
js:     <anonymous> (as b)
frame type
func name
Always name your functions!
var foo = function foo() {};
Foo.prototype.bar = function bar() {};
foo(function bar() {});
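A quick way to see what a debugger has to work with is the standard name property of a function. The variable and helper names below are only for illustration.

```javascript
// A named function expression carries its name with it.
const named = function bar() {};

// Hypothetical helper that hands back whatever callback it receives.
function identity(fn) {
  return fn;
}

// An anonymous callback passed directly as an argument has no name to report.
const anonymous = identity(function () {});

console.log(JSON.stringify(named.name));     // "bar"
console.log(JSON.stringify(anonymous.name)); // ""
```

Frames for functions like the second one are the ones that show up as <anonymous> in the ::jsstack output.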
::jsstack -v Frame Source
> ::jsstack -v
js:     storeHeader
          file: http.js
          posn: position 18774
          this: 2ad561306c91 (<unknown>)
          arg1: 3bd67e0669b9 (JSObject: ServerResponse)
          arg2: 3dfe966ae299 (JSObject: Object)
          arg3: 34d5391d8859 (SeqAsciiString)
          arg4: 34d5391d8881 (SeqAsciiString)

  652
  653 function storeHeader(self, state, field, value) {
  654   // Protect against response splitting. The if statement is there to
  655   // minimize the performance impact in the common case.
  656   if (/[\r\n]/.test(value))
  657     value = value.replace(/[\r\n]+[ \t]*/g, '');
  658
  659   state.messageHeader += field + ': ' + value + CRLF;
  660
  661   if (connectionExpression.test(field)) {
  662     state.sentConnectionHeader = true;
  663     if (closeExpression.test(value)) {
  664       self._last = true;
  665     } else {
  666       self.shouldKeepAlive = true;
  667     }
  668
  669   } else if (transferEncodingExpression.test(field)) {
::jsstack -vn0 Frame and Function Args
> ::jsstack -vn0
js:     test
          file: native regexp.js
          posn: position 2677
          this: 2421205bd4d9 (JSRegExp)
          arg1: 34d5391d8859 (SeqAsciiString)
js:     storeHeader
          file: http.js
          posn: position 18774
          this: 2ad561306c91 (<unknown>)
          arg1: 3bd67e0669b9 (JSObject: ServerResponse)
          arg2: 3dfe966ae299 (JSObject: Object)
          arg3: 34d5391d8859 (SeqAsciiString)
          arg4: 34d5391d8881 (SeqAsciiString)
js:     <anonymous> (as OutgoingMessage._storeHeader)
          file: http.js
          posn: position 15652
          this: 3bd67e0669b9 (JSObject: ServerResponse)
          arg1: 3dfe966ae271 (ConsString)
          arg2: 3dfe966add99 (JSObject: Object)
js:     restifyWriteHead
          file: /apps/node/webapp/node_modules/restify/lib/response.js
          posn: position 6964
          this: 3bd67e0669b9 (JSObject: ServerResponse)
        (1 internal frame elided)
js:     _cb
          file: /apps/node/webapp/node_modules/restify/lib/response.js
Func Name
JS File
Line #
Func Args
::jsstack Function Args
> ::jsstack -vn0
js:     test
          file: native regexp.js
          posn: position 2677
          this: 2421205bd4d9 (JSRegExp)
          arg1: 34d5391d8859 (SeqAsciiString)
js:     storeHeader
          file: http.js
          posn: position 18774
          this: 2ad561306c91 (<unknown>)
          arg1: 3bd67e0669b9 (JSObject: ServerResponse)
          arg2: 3dfe966ae299 (JSObject: Object)
          arg3: 34d5391d8859 (SeqAsciiString)
          arg4: 34d5391d8881 (SeqAsciiString)
js:     <anonymous> (as OutgoingMessage._storeHeader)
          file: http.js
          posn: position 15652
          this: 3bd67e0669b9 (JSObject: ServerResponse)
          arg1: 3dfe966ae271 (ConsString)
          arg2: 3dfe966add99 (JSObject: Object)
js:     restifyWriteHead
          file: /apps/node/webapp/node_modules/restify/lib/response.js
          posn: position 6964
          this: 3bd67e0669b9 (JSObject: ServerResponse)
        (1 internal frame elided)
js:     _cb
          file: /apps/node/webapp/node_modules/restify/lib/response.js
Memory Address of Var
Var Type
::jsprint Print JS Objects
> 3bd67e0669b9::jsprint
{
    "_time": 1437690472539,
    "_headers": {
        "content-type": "text/html",
        "req_id": "5b7f18f2-7f12-4c68-b07f-3cd75698ba65",
        "set-cookie": "CENSORED; Domain=.netflix.com; Expires=Fri, 24 Jul 2015 10:27:52 GMT",
        "x-frame-options": "DENY",
        "x-ua-compatible": "IE=edge",
        "x-netflix.client.instance": "i-c420596c",
    },
    "output": [],
    "_last": false,
    "_hangupClose": false,
    "_hasBody": true,
    "socket": {
        "_connecting": false,
        "_handle": [...],
        "_readableState": [...],
        "readable": true,
        "domain": null,
        "_events": [...],
        "_maxListeners": 10,
        "_writableState": [...],
        "writable": true,
        "allowHalfOpen": true,
        "onend": function <anonymous> (as socket.onend),
Actual JS Object Instance
::jsconstructor Show Object Constructor
> 3bd67e0669b9::jsconstructor -v
ServerResponse (JSFunction: 2421205bced9)
::jssource Print f() Source
> 2421205bced9::jssource
file: http.js

 1066 function ServerResponse(req) {
 1067   OutgoingMessage.call(this);
 1068
 1069   if (req.method === 'HEAD') this._hasBody = false;
 1070
 1071   this.sendDate = true;
 1072
 1073   if (req.httpVersionMajor < 1 || req.httpVersionMinor < 1) {
 1074     this.useChunkedEncodingByDefault = chunkExpression.test(req.headers.te);
 1075     this.shouldKeepAlive = false;
 1076   }
 1077 }
 1078 util.inherits(ServerResponse, OutgoingMessage);
Core Dump === Complete Process State
Memory Leaks
Generate Core Dump Ad-hoc
gcore(1)                     GNU Tools                     gcore(1)
NAME
    gcore - Generate a core file for a running process
SYNOPSIS
    gcore [-o filename] pid
Take a Core Dump!
root@demo:~# gcore `pgrep node`
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7facaeffd700 (LWP 5650)]
[New Thread 0x7facaf7fe700 (LWP 5649)]
[New Thread 0x7facaffff700 (LWP 5648)]
[New Thread 0x7facbc967700 (LWP 5647)]
[New Thread 0x7facbd168700 (LWP 5617)]
[New Thread 0x7facbd969700 (LWP 5616)]
[New Thread 0x7facbe16a700 (LWP 5615)]
[New Thread 0x7facbe96b700 (LWP 5614)]
0x00007facbea5b5a9 in syscall () from /lib/x86_64-linux-gnu/libc.so.6
Saved corefile core.5602
Problem: Find Leaking Objects
::findjsobjects
NAME
    findjsobjects - find JavaScript objects
SYNOPSIS
    [ addr ] ::findjsobjects [-vb] [-r | -c cons | -p prop]
::findjsobjects Find ALL JS Objects on Heap
> ::findjsobjects
OBJECT        #OBJECTS  #PROPS  CONSTRUCTOR: PROPS
...
3dfe97453121        18    6721  Array
157a020e01        1304     101  <anonymous> (as Constructor): ...
8f1a53211        13879      12  ReactDOMComponent: _tag, tagName, props, ...
8f1a05691        85776       2  Array
3dfe97451a99        36    5589  Array
23e5d7d44351         1  218020  Object: .2f5hpw2hgjk.1.0.3, ...
8f1a05f31        40533       6  <anonymous> (as ReactElement): type, ...
8f1a04da1       252133       1  Array
8f1a04dc1       125869       7  Array
8f1a04f01       114914       8  Array
8f1a04d39       230924       7  Module: id, exports, parent, filename, ...
Memory Leak Strategy
❖ Look at objects on heap for suspicious objects
❖ Take successive core dumps and compare object counts
❖ Growing object counts are likely leaking
❖ Inspect object for more context
❖ Walk reverse references to find root object
Look at Object Delta Between Successive Core Dumps
Uptime = 45 mins
> ::findjsobjects
OBJECT        #OBJECTS  #PROPS  CONSTRUCTOR: PROPS
...
8f1a04d39       230924       7  Module: id, exports, parent, filename, ...
Uptime = 90 mins
> ::findjsobjects
OBJECT        #OBJECTS  #PROPS  CONSTRUCTOR: PROPS
...
8f1a04d39       323454       7  Module: id, exports, parent, filename, ...
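The comparison can be scripted. This sketch assumes the ::findjsobjects output of each core dump was saved as text and that the columns are address, object count, property count, and constructor. parseCounts and growth are hypothetical helpers, not part of mdb_v8.

```javascript
// Rows copied from the two dumps above (45 and 90 minutes of uptime).
const dump45 = '8f1a04d39 230924 7 Module: id, exports, parent, filename, ...';
const dump90 = '8f1a04d39 323454 7 Module: id, exports, parent, filename, ...';

// Hypothetical parser: map constructor name -> total object count.
function parseCounts(text) {
  const counts = new Map();
  for (const line of text.split('\n')) {
    const match = /^\s*[0-9a-f]+\s+(\d+)\s+\d+\s+([^:]+)/.exec(line);
    if (!match) continue; // skip headers, "..." and blank lines
    const name = match[2].trim();
    counts.set(name, (counts.get(name) || 0) + Number(match[1]));
  }
  return counts;
}

// Hypothetical helper: constructors whose counts grew, largest growth first.
function growth(before, after) {
  const rows = [];
  for (const [name, count] of after) {
    const delta = count - (before.get(name) || 0);
    if (delta > 0) rows.push({ name, delta });
  }
  return rows.sort((x, y) => y.delta - x.delta);
}

console.log(growth(parseCounts(dump45), parseCounts(dump90)));
// [ { name: 'Module', delta: 92530 } ]
```

A constructor that keeps climbing between dumps, as Module does here, is the first candidate to inspect.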
Analyze Leaked Objects
Representative Object
> ::findjsobjects
OBJECT        #OBJECTS  #PROPS  CONSTRUCTOR: PROPS
...
8f1a04d39       323454       7  Module: id, exports, parent, filename, ...
Representative Object, 1 of 323454
Look Closer
> 8f1a04d39::jsprint
{
    "id": "/apps/node/webapp/ui/js/pages/akiraClient.js",
    "exports": {},
    "parent": {
        "id": "/apps/node/webapp/middleware/autoClientStrings.js",
        "exports": function autoExposeClientStrings,
        "parent": [...],
        "filename": "/apps/node/webapp/middleware/autoClientStrings.js",
        "loaded": true,
        "children": [...],
        "paths": [...],
    },
    "filename": "/apps/node/webapp/ui/js/pages/akiraClient.js",
Use ::findjsobjects to Find All “Module” Objects
> 8f1a04d39::findjsobjects
8f1a04d39
3fd996bffb39
3fd996bfcff1
3fd996bfbac1
3fd996bf8a19
3fd996bf7949
3fd996bf3ce9
3fd996bf0f19
3fd996bead71
3fd996bea821
3fd996bea001
3fd996be92b1
3fd996be73d1
3fd996be58d1
3fd996bd88b1
3fd996bcb459
3fd996bcaa41
3fd996bc7009
3fd996bc3321
Analyze All 320K+ Objects?
Custom Querying With Pipes and Unix Tools
8f1a04d39::findjsobjects | ::jsprint ! grep filename | sort | uniq -c
Results
...
      1 "filename": "/apps/node/webapp/ui/js/akira/components/messaging/paymentHold.js",
      2 "filename": "/apps/node/webapp/ui/js/common/commonCore.js",
      1 "filename": "/apps/node/webapp/ui/js/common/playPrediction/playPrediction.js",
      3 "filename": "/apps/node/webapp/ui/js/common/presentationTracking/presentationTracking.js",
 111061 "filename": "/apps/node/webapp/ui/js/common/playPrediction/playPrediction.js",
   7103 "filename": "/apps/node/webapp/ui/js/pages/reactClientRender.js",
 111061 "filename": "/apps/node/webapp/ui/js/pages/akiraClient.js",
 118257 "filename": "/apps/node/webapp/middleware/autoClientStrings.js",
...
Client Side Modules
What’s holding on to these modules?
Aim: Find Root Object
Walk Reverse Refs with ::findjsobjects -r
> 8f1a04d39::findjsobjects -r
8f1a04d39 referred to by 14fd6c5b13c1.parent
Root Object
> 1f313791bb41::jsprint
[
    {
        "id": "/apps/node/webapp/ui/js/pages/akiraClient.js",
        "exports": [...],
        "parent": [...],
        "filename": "/apps/node/webapp/ui/js/pages/akiraClient.js",
        "loaded": false,
        "children": [...],
        "paths": [...],
    },
    {
        "id": "/apps/node/webapp/ui/js/pages/akiraClient.js",
        "exports": [...],
        "parent": [...],
        "filename": "/apps/node/webapp/ui/js/pages/akiraClient.js",
        "loaded": false,
        "children": [...],
        "paths": [...],
    },
    {
        "id": "/apps/node/webapp/ui/js/pages/akiraClient.js",
        "exports": [...],
        "parent": [...],
        "filename": "/apps/node/webapp/ui/js/pages/akiraClient.js",
Spot the Leak
var cache = {};

function checkCache(someModule) {
  var mod = cache[someModule];
  if (!mod) {
    try {
      mod = require(someModule);
      cache[someModule] = mod;
      return mod;
    } catch (e) {
      return {};
    }
  }

  return mod;
}
Module could be client only, must catch
Should cache the fact we caught an exception here
Root Cause
❖ Node caches metadata for each module
❖ If require process throws an exception, the module metadata is leaked (bug?)
❖ Client side module meant we were throwing during every request, and not caching the fact we tried to require it
❖ Each request leaks 3+ module metadata objects
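One possible fix follows the annotation that the failure itself should be cached. This is a sketch, not the code Netflix shipped: the loader is injected (defaulting to require) so the behaviour is easy to exercise, and makeCheckCache and MISSING are hypothetical names.

```javascript
// Sentinel stored in the cache for modules that failed to load.
const MISSING = Object.freeze({});

// Hypothetical factory; `load` defaults to Node's require.
function makeCheckCache(load = require) {
  const cache = new Map();

  return function checkCache(someModule) {
    if (cache.has(someModule)) {
      return cache.get(someModule); // hit, including remembered failures
    }
    let mod;
    try {
      mod = load(someModule);
    } catch (e) {
      mod = MISSING; // remember the failure so the load is not retried per request
    }
    cache.set(someModule, mod);
    return mod;
  };
}

// Usage with a fake loader that always throws, like a client-only module.
let attempts = 0;
const checkCache = makeCheckCache(() => {
  attempts += 1;
  throw new Error('client-only module');
});
checkCache('./akiraClient.js');
checkCache('./akiraClient.js');
console.log(attempts); // 1
```

With the failure remembered, the failing require runs once per module instead of once per request.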
Memory Leaks
❖ Take successive core dumps (gcore(1))
❖ Compare object counts (::findjsobjects)
❖ Growing objects are likely leaking
❖ Inspect object for more context (::jsprint)
❖ Walk reverse references to find root obj (::findjsobjects -r)
Post Mortem Debugging is Critical to Large Scale Prod Node Deployments
More State than Just Logs
❖ Detailed stack trace (::jsstack)
❖ Function args for each frame (::jsstack -vn0)
❖ Get state of any object and its provenance (::jsprint, ::jsconstructor)
❖ Get source code of any function (::jssource)
❖ Find arbitrary JS objects (::findjsobjects)
❖ Unmodified Node binary!
Production Failures are Inevitable
But We Can Learn From Them
Production Debugging
❖ Runtime Performance
❖ CPU profiling/flame graphs
❖ Runtime Crashes
❖ Inspect program state with core dumps and mdb
❖ Memory leaks
❖ Analyze objects and references with core dumps and mdb
Use the Scientific Method
Epilogue — State of Tooling
❖ Join Working Group https://github.com/nodejs/post-mortem
❖ Help make mdb_v8 cross platform https://github.com/joyent/mdb_v8
❖ Contribute to https://github.com/tjfontaine/lldb-v8 and https://github.com/indutny/llnode
Acknowledgements
❖ mdb_v8
❖ Dave Pacheco, TJ Fontaine, Julien Gilli, Bryan Cantrill
❖ CPU Profiling/Flamegraphs
❖ Brendan Gregg, Google V8 team, Ali Ijaz Sheikh
❖ Linux Perf
❖ Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Jiri Olsa, Peter Zijlstra
❖ lldb-v8
❖ TJ Fontaine
❖ llnode
❖ Fedor Indutny
Get Involved!
Citations
❖ Slides 29-32 used with permission from “Java Mixed-Mode Flame Graphs”, Brendan Gregg, Oct 2015
❖ Slide 26 used with permission from http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html