jÄk: Using Dynamic Analysis to Crawl and Test Modern Web Applications
Giancarlo Pellegrino(1), Constantin Tschürtz(2), Eric Bodden(2), and Christian Rossow(1)
18th International Symposium on Research in Attacks, Intrusions and Defenses
November 3rd, Kyoto, Japan
(1) CISPA, Saarland University, Germany
(2) Fraunhofer SIT / TU Darmstadt, Germany
Nov. 3, 2016
Web Application Scanners
(Semi-)automated security testing tools
Follow a dynamic and black-box testing approach
Architecture
Crawler Module, Attacker Module, Analysis Module
Crawler
Seed URL: http://shop.foo
<html>
  <head>
    <title>Online shopping</title>
  </head>
  <body>
    <a href="/contacts">Contacts</a>
    <form action="/search">
      <input type="text" name="q"/>
      <input type="submit"/>
    </form>
  </body>
</html>
http://shop.foo
New URL
New search HTML form
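The crawling step shown on these slides (parse the page, collect new URLs and forms) can be sketched with simple pattern matching. This is a minimal illustration, not the code of any real scanner: `extractTargets` and its regular expressions are assumptions made for this example, and a production crawler would use a proper HTML parser.

```javascript
// Hypothetical helper: extract link targets and form actions from static HTML
// with regular expressions, resolving them against the page URL.
function extractTargets(html, baseUrl) {
  const urls = [];
  const forms = [];
  // <a href="..."> links
  for (const m of html.matchAll(/<a\s[^>]*href="([^"]*)"/gi)) {
    urls.push(new URL(m[1], baseUrl).href);
  }
  // <form action="..."> elements and the names of their inputs
  const formRe = /<form\s[^>]*action="([^"]*)"[^>]*>([\s\S]*?)<\/form>/gi;
  for (const m of html.matchAll(formRe)) {
    const inputs = [...m[2].matchAll(/<input\s[^>]*name="([^"]*)"/gi)].map((i) => i[1]);
    forms.push({ action: new URL(m[1], baseUrl).href, inputs });
  }
  return { urls, forms };
}

const page = `<html><head><title>Online shopping</title></head><body>
<a href="/contacts">Contacts</a>
<form action="/search">
<input type="text" name="q"/><input type="submit"/>
</form></body></html>`;

const found = extractTargets(page, "http://shop.foo");
console.log(found.urls);  // the new URL to visit
console.log(found.forms); // the new form to test
```

On the example page this yields one URL (the contacts page) and one form (the search form with its `q` field), matching the two discoveries highlighted on the slides.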
http://shop.foo/contacts
Next?
<html>
  <head>
    <title>Contact Page</title>
  </head>
  <body>
    <form action="/comments">
      <input type="text" name="msg"/>
      <input type="submit"/>
    </form>
  </body>
</html>
http://shop.foo/contacts
New HTML form
Security Testing
<form action="/search">
  <input type="text" name="q"/>
  <input type="submit"/>
</form>
Tests == Attacks
[Figure: XSS and SQL payloads are sent to shop.foo; the responses are then examined (?)]
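The "tests == attacks" idea could be sketched as follows: for each input field, the scanner submits a payload and checks the response. Everything named here is a hypothetical stand-in for illustration (the `PAYLOADS` list, `buildTests`, `looksReflected`, and the fake `vulnerableSearch` server); real scanners use much richer payloads and response analysis.

```javascript
// Illustrative payloads only; real scanners ship far larger lists.
const PAYLOADS = {
  xss: "<script>alert(1)</script>",
  sqli: "' OR '1'='1",
};

// Build one test request per (input field, payload) pair for a form.
function buildTests(form) {
  const tests = [];
  for (const field of form.inputs) {
    for (const [kind, payload] of Object.entries(PAYLOADS)) {
      const url = new URL(form.action);
      url.searchParams.set(field, payload);
      tests.push({ kind, field, payload, url: url.href });
    }
  }
  return tests;
}

// Naive analysis: a payload echoed back unescaped hints at reflected XSS.
function looksReflected(responseBody, payload) {
  return responseBody.includes(payload);
}

// Stand-in for the target application (assumption: echoes q without escaping).
function vulnerableSearch(urlString) {
  const q = new URL(urlString).searchParams.get("q");
  return "<p>Results for " + q + "</p>";
}

const tests = buildTests({ action: "http://shop.foo/search", inputs: ["q"] });
const findings = tests.filter(
  (t) => t.kind === "xss" && looksReflected(vulnerableSearch(t.url), t.payload)
);
console.log(tests.length, "tests,", findings.length, "suspicious response(s)");
```

The sketch only covers forms the crawler has already found, which is why crawler coverage bounds what the attacker module can test.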
Crawler Critical for Coverage
Crawler explores the Web application attack surface
● Missing parts → missing possible vulnerabilities
Existing crawlers based on:
● HTML parsing and pattern matching to extract URLs
● “clickable” areas to further explore the surface
Crawler and Modern Web Applications
Complexity of the client side has dramatically increased (i.e., stateful JS programs)
Links and forms can be built and inserted in the webpage at run-time
➔ HTML parsing and pattern matching no longer sufficient
var url = scheme() + '://' + domain() + '/' + endpoint();
document.getElementById('myLink').href = url;
JS is an event-driven language
● Functions executed upon events
➔ Lack of support of event-based execution model
[Figure: events (click, mouse movement, timeout, Ajax response received) trigger JS functions that generate URLs/HTML forms, register new events, and send Ajax requests]
Large parts of web applications remain unexplored!
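To see why pattern matching falls short, consider a sketch in which the link target is only assembled at run-time. The markup, the return values of the `scheme`/`domain`/`endpoint` helpers, and the plain object standing in for the DOM element are assumptions made so the example runs outside a browser.

```javascript
// Static view: the markup contains an anchor without any href.
const markup = '<a id="myLink">Offers</a>';
const staticUrls = [...markup.matchAll(/href="([^"]*)"/g)].map((m) => m[1]);

// Dynamic view: the application script computes the URL when it executes.
// Hypothetical helpers standing in for the functions named on the slide.
const scheme = () => "http";
const domain = () => "shop.foo";
const endpoint = () => "offers";

// Stand-in for document.getElementById('myLink') (no DOM available here).
const myLink = { id: "myLink", href: "" };
myLink.href = scheme() + "://" + domain() + "/" + endpoint();

console.log(staticUrls);  // empty: nothing for a pattern matcher to find
console.log(myLink.href); // the URL exists only after the script has run
```

A crawler that never executes the script sees only the first view and never learns about the URL.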
We addressed the coverage problem with
● JavaScript client-side dynamic analysis
● Model-based crawler
Built a tool: jÄk
Our Approach
Combine dynamic analysis with model-based crawler
● Dynamic analysis monitors client side program execution
● Crawler builds, maintains, uses a model of the visited attack surface
[Figure: architecture. Model-based Crawler (Seed URL, Model Inference/Update, Navigator, Action) and Dynamic Analysis (Trace Analysis, Trace, JS Engine, Probe, APIs: I/O, Handler reg.)]
Dynamic Analysis
Different approaches:
1) JS engine instrumentation → laborious task, engine-dependent
2) JS program instrumentation → JS code is not entirely available
3) Modification of execution environment
Modify execution environment via function hooking:
● Intercept API calls (e.g., network I/O and event handler registration)
● Object manipulations (i.e., object properties)
● Schedule DOM inspections
Hooks installed by injecting own JS code:
● Function redefinition
● Set functions
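As a rough illustration of the second hook type, a "set function" can observe writes to an object property. The sketch below uses `Object.defineProperty` on a plain object; the `hookProperty` helper, the `trace` array, and the stand-in link object are hypothetical and are not taken from jÄk's code.

```javascript
// Collected observations (hypothetical trace format).
const trace = [];

// Replace a data property with an accessor pair that logs every write.
function hookProperty(obj, prop) {
  let value = obj[prop];
  Object.defineProperty(obj, prop, {
    configurable: true,
    enumerable: true,
    get() {
      return value;
    },
    set(newValue) {
      trace.push({ prop, value: newValue });
      value = newValue;
    },
  });
}

// Stand-in for a DOM anchor element (assumption: no browser available).
const link = { href: "" };
hookProperty(link, "href");

// Application code runs unchanged and is unaware of the hook.
link.href = "http://shop.foo/offers";

console.log(link.href); // the assignment still takes effect
console.log(trace);     // the write was recorded
```

The application's behaviour is preserved because the setter stores the value as before; the only addition is the entry appended to the trace.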
Function Redefinition
Application JS code:
function handler() {
  alert("hello world");
}
el = document.getElementById('img');
el.addEventListener("click", handler);
API:
Element.prototype.addEventListener = function(e, h) {
  […]
  listeners[e].append(h);
}
Intercept!
Preamble:
var orig_f = Element.prototype.addEventListener;
Element.prototype.addEventListener = function() {
  console.log("new handler registration");
  return orig_f.apply(this, arguments);
};
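The preamble technique can be tried outside a browser with a stand-in for the DOM API. In this sketch `FakeElement`, `registrations`, and `fire` are hypothetical names invented for the example; in a browser the same wrapper would be applied to `Element.prototype.addEventListener`.

```javascript
// Stand-in for the browser's Element class (assumption: no DOM available).
class FakeElement {
  constructor(id) {
    this.id = id;
    this.listeners = {};
  }
  addEventListener(type, handler) {
    (this.listeners[type] = this.listeners[type] || []).push(handler);
  }
}

// --- Preamble: runs before the application code ---
const registrations = []; // what the crawler learns about the page
const origAdd = FakeElement.prototype.addEventListener;
FakeElement.prototype.addEventListener = function (type, handler) {
  registrations.push({ element: this.id, type, handler });
  return origAdd.apply(this, arguments); // keep the original behaviour
};

// --- Application code: unchanged ---
let greeted = false;
function handler() {
  greeted = true;
}
const el = new FakeElement("img");
el.addEventListener("click", handler);

// --- Crawler: fire every event it has learned about ---
function fire(element, type) {
  for (const h of element.listeners[type] || []) h.call(element);
}
for (const r of registrations) fire(el, r.type);

console.log(registrations.length, greeted);
```

Because the wrapper forwards to the saved original, the page keeps working as before while every registration becomes visible to the crawler, which can then trigger the corresponding events.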
Model-based Crawler
Creates and maintains a web application model
● Directed graph: nodes are page clusters and edges are URLs, HTML forms, or events
Model used to decide on the next action
● Priority: Events → high, URLs/forms → low
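One way to picture the model is a directed graph with a prioritized frontier of unexplored actions. The sketch below is an illustration under assumptions: the `CrawlModel` class, its method names, the cluster labels, and the numeric priorities are invented here; the slides only state that events rank above URLs and forms.

```javascript
// Priorities from the slide: events high, URLs/forms low (numbers are arbitrary).
const PRIORITY = { event: 2, url: 1, form: 1 };

class CrawlModel {
  constructor() {
    this.edges = [];    // {from, to, action}; `to` is null until explored
    this.frontier = []; // unexplored actions
  }
  // Record a newly discovered action (URL, form, or event) on a page cluster.
  addAction(fromCluster, action) {
    const edge = { from: fromCluster, to: null, action };
    this.edges.push(edge);
    this.frontier.push(edge);
  }
  // Pick the unexplored action with the highest priority (first found wins ties).
  nextAction() {
    if (this.frontier.length === 0) return null;
    let best = 0;
    for (let i = 1; i < this.frontier.length; i++) {
      const p = PRIORITY[this.frontier[i].action.kind];
      if (p > PRIORITY[this.frontier[best].action.kind]) best = i;
    }
    return this.frontier.splice(best, 1)[0];
  }
  // After executing an action, store the page cluster it led to.
  markExplored(edge, toCluster) {
    edge.to = toCluster;
  }
}

const model = new CrawlModel();
model.addAction("home", { kind: "url", target: "/contacts" });
model.addAction("home", { kind: "form", target: "/search" });
model.addAction("home", { kind: "event", target: "click on #img" });

const first = model.nextAction();
model.markExplored(first, "home+popup");
console.log(first.action.kind);              // event
console.log(model.nextAction().action.kind); // url
```

Although the event was discovered last, it is explored first; the URL and the form follow in discovery order.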
Assessment
Our Tool
New tool: jÄk [pron. Jack]
Source code on GitHub
● https://github.com/ConstantinT/jAEk
Free to run, copy, distribute, study, change and improve it
● Free Software (GPL3) … and also free as in free beer!
Experiments
Comparative analysis
● Skipfish, W3af, Wget, and Crawljax
Case studies:
● WIVET (Web Input Vector Extractor Teaser)
  ● assess strengths and limitations of existing crawlers
● 13 web applications
  ● studied coverage and vulnerability detection power
Coverage
Explored surface by jÄk (# of unique URL structs.)
● x16 (Crawljax) to x2 (Skipfish) bigger
[Chart: surface explored by jÄk compared with the other crawlers: ~x6 bigger]
Relative size of new surface:
● From +70% (Wget) to +98% (Crawljax) of URLs are new
[Chart: new surface vs. known surface for jÄk: ~86% more]
Global surface missed by jÄk:
● From 22% (Skipfish) to 0.5% (Crawljax) are missed
● Further analysis:
  ● 75% of missed are due to URL forgery
  ● 25% to static resources, unsupported action, and others
[Chart: URLs found by the other crawlers but unknown to jÄk: ~15% missed URLs]
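The coverage metric counts unique URL structures rather than raw URLs. As a sketch, and assuming here that a URL structure is the URL with its query parameter values dropped (the slides do not spell out the definition), the count could be computed like this; `urlStructure` and `countUniqueStructures` are hypothetical helpers.

```javascript
// Assumed definition: origin + path + sorted query parameter names, no values.
function urlStructure(urlString) {
  const u = new URL(urlString);
  const names = [...new Set(u.searchParams.keys())].sort();
  return u.origin + u.pathname + (names.length ? "?" + names.join("&") : "");
}

function countUniqueStructures(urls) {
  return new Set(urls.map(urlStructure)).size;
}

const crawled = [
  "http://shop.foo/search?q=shoes",
  "http://shop.foo/search?q=hats",
  "http://shop.foo/item?id=1&ref=home",
  "http://shop.foo/item?ref=mail&id=2",
  "http://shop.foo/contacts",
];
console.log(countUniqueStructures(crawled)); // 3
```

Counting structures keeps a crawler from inflating its score by visiting the same endpoint with many different parameter values.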
Conclusion
Conclusion/Takeaway
Novel technique based on
● dynamic analysis of JS program + model-based crawling
Built jÄk, a tool implementing our approach
Assessed against 13 web applications
Our results show that jÄk explores a surface ~6x larger with +86% new URLs