Take This Personally: Pollution Attacks on Personalized Services
Xinyu Xing, Wei Meng, and Dan Doozan, Georgia Institute of Technology
Alex C. Snoeren, UC San Diego
Nick Feamster and Wenke Lee, Georgia Institute of Technology
22nd USENIX Security Symposium (August 2013)
A Seminar at Advanced Defense Lab
Outline
Introduction
Overview and Attack Model
Pollution Attacks on YouTube
Google Personalized Search
Pollution Attacks on Amazon
2013/9/3
Introduction
Modern Web services are increasingly relying upon personalization to improve the quality of their customers’ experience.
Many services with personalized content log their users’ Web activities.
This paper...
We demonstrate that contemporary personalization mechanisms are vulnerable to exploitation.
Our Attack
We show that YouTube, Amazon, and Google are all vulnerable to the same class of cross-site scripting attack, which we call a pollution attack, that allows third parties to alter the customized content.
A distinguishing feature of our attack is that it does not exploit any vulnerability in the user’s Web browser.
Overview and Attack Model
The main instrument that a service provider can use to affect the content that a user sees is modifying the choice set.
When a user issues a query, a service’s personalization algorithm affects the user’s choice set for that query.
Overview and Attack Model (cont.)
In this paper, we focus on how changes to a user’s history can affect the choice set, holding other factors fixed.
This attack requires three steps:
1. Model the service’s personalization algorithm.
2. Create a “seed” to pollute the user’s history.
3. Inject the seed with a vector of false clicks.
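The effect of a polluted history on the choice set can be illustrated with a toy model. The sketch below is not any real service's algorithm: `choice_set`, the catalog, and the tag-overlap scoring are all hypothetical, chosen only to show how entries added to a history change what is ranked first while everything else is held fixed.

```python
from collections import Counter

# Hypothetical catalog: item -> descriptive tags (illustrative data only).
CATALOG = {
    "cooking-101": {"cooking", "howto"},
    "knife-skills": {"cooking", "tools"},
    "guitar-basics": {"music", "howto"},
    "amp-review": {"music", "tools"},
    "target-item": {"music", "gear"},
}


def choice_set(history, k=2):
    """Rank unseen items by tag overlap with the history (toy model)."""
    # Count how often each tag appears across the items in the history.
    tag_weight = Counter(tag for item in history for tag in CATALOG[item])
    unseen = [item for item in CATALOG if item not in history]

    def score(item):
        return sum(tag_weight[tag] for tag in CATALOG[item])

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(unseen, key=lambda item: (-score(item), item))
    return ranked[:k]


clean_history = ["cooking-101"]
# Seed entries added to the history (steps 2 and 3 above, simulated locally).
polluted_history = clean_history + ["guitar-basics", "amp-review"]

print(choice_set(clean_history))
print(choice_set(polluted_history))
```

In this toy setup the target item is absent from the choice set for the clean history and appears once the two seed entries are added, even though the catalog and the ranking function are unchanged.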
Pollution Attacks on YouTube
Personalization rules:
Consider only those videos that the user watched for a long period of time.
Recommend based on similar viewing histories.
Do not recommend a video the user has already watched.
Two of the suggested videos are recommended based upon personalization.
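The rules above can be read as a simple co-visitation recommender. The following is a minimal sketch under stated assumptions, not YouTube's actual algorithm: the 60-second threshold, the overlap-based similarity score, and the function name `personalized_suggestions` are all hypothetical.

```python
from collections import Counter

MIN_WATCH_SECONDS = 60  # assumed threshold for "a long period of time"
PERSONALIZED_SLOTS = 2  # the slide says two suggestions are personalized


def long_watched(views):
    """Keep only the videos that were watched long enough to count."""
    return {vid for vid, secs in views if secs >= MIN_WATCH_SECONDS}


def personalized_suggestions(user_views, other_users_views):
    """user_views: list of (video_id, seconds_watched) for one user.
    other_users_views: list of such lists, one per other user."""
    mine = long_watched(user_views)
    already_watched = {vid for vid, _ in user_views}
    scores = Counter()
    for views in other_users_views:
        theirs = long_watched(views)
        overlap = len(mine & theirs)  # assumed similarity of viewing histories
        if overlap == 0:
            continue
        # Never suggest a video the user has already watched.
        for vid in theirs - already_watched:
            scores[vid] += overlap
    ranked = sorted(scores, key=lambda vid: (-scores[vid], vid))
    return ranked[:PERSONALIZED_SLOTS]


user = [("a", 120), ("b", 5)]
others = [
    [("a", 200), ("c", 90), ("b", 100)],
    [("a", 80), ("d", 70), ("c", 10)],
    [("e", 300), ("f", 300)],
]
print(personalized_suggestions(user, others))
```

Under this model, a history entry only influences suggestions if it passes the watch-time threshold, which is why the injection step has to register a long viewing and not just a visit.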
Preparing Seed Videos
[Figure: a video channel (C) containing seed videos ΩS and target videos ΩT]
Inject Seed Videos
When we open the video, the browser requests: http://www.youtube.com/user_watch?plid=<value>&video_id=<value>
After we watch for a period of time, the browser requests: http://www.youtube.com/set_awesome?plid=<value>&video_id=<value>
Experimental Design
Account and channel combinations:
New account, new channel: two 3-minute videos (about 65 sequential viewings)
New account, existing channels: 100 channels (in the top 2,000) × 25 videos
Existing accounts (22 volunteers), existing channel: OnlyyouHappycamp × 15 videos
Evaluation
We evaluated the effectiveness of our pollution attacks by logging in as the victim user and viewing 114 representative videos.
Evaluation (New Accounts)
The attacks succeeded.
We computed:
the Pearson correlation between the showing frequencies and the lengths of the target videos
○ 0.54 => medium
the Pearson correlation between the showing frequencies and the view counts of the target videos
○ 0.23 => moderate
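For reference, the Pearson correlation is the covariance of two samples divided by the product of their standard deviations. The sketch below computes it with the standard library only; the sample numbers are made up for illustration and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers, not data from the study.
showing_frequency = [3, 5, 2, 8, 6]
video_length_sec = [120, 200, 90, 260, 240]
r = pearson(showing_frequency, video_length_sec)
print(round(r, 2))
```

The coefficient ranges from -1 to 1; values near 0 indicate little linear relationship between showing frequency and the video property being compared.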
Evaluation (Existing Accounts)
For the existing channel OnlyyouHappycamp:
The attack succeeded for 14 of the 22 volunteers (64%).
Ten of our volunteers shared their histories.
The majority of the videos recommended to users for whom our attacks have low promotion rates are longer and have higher view counts than our target videos.
Google Personalized Search
We describe two classes of personalization algorithms:
contextual personalization
persistent personalization
Identifying Search Terms
Contextual Personalization
The keywords injected into a user’s search history should be both relevant to the promoting keyword and unique to the website being promoted.
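The two selection criteria (relevant and unique) can be sketched as a filter over candidate keywords. This is an illustration only: the word-overlap relevance test, the function name `select_injection_keywords`, and the example data are assumptions, not the selection procedure used in the paper.

```python
def select_injection_keywords(promoting_keyword, site_keywords, other_sites_keywords):
    """Keep keywords relevant to the promoting keyword and unique to the site."""
    promoted_words = set(promoting_keyword.lower().split())
    used_elsewhere = {kw.lower() for kws in other_sites_keywords for kw in kws}
    selected = []
    for kw in site_keywords:
        normalized = kw.lower()
        # Assumed relevance test: shares at least one word with the promoting keyword.
        relevant = bool(promoted_words & set(normalized.split()))
        # Unique: no other site in the comparison set uses the same keyword.
        unique = normalized not in used_elsewhere
        if relevant and unique:
            selected.append(kw)
    return selected


# Hypothetical example data.
site = ["acme led lamp", "led lamp", "acme garden hose"]
others = [["led lamp", "cheap led"], ["garden hose"]]
print(select_injection_keywords("led lamp", site, others))
```

Here "led lamp" is dropped because other sites also use it, and "acme garden hose" is dropped because it is unrelated to the promoting keyword.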
Identifying Search Terms (cont.)
Persistent Personalization
In this case, the size of the keyword set should be larger than that used for a contextual attack in order to have a greater effect on the user’s search history.
An attacker can safely inject roughly 50 keywords a minute using cross-site request forgery.
We assume an attacker can inject at most 25 keywords into a user’s profile.
Contextual Personalization
[Figure: selection pipeline for contextual personalization. 5,761 search terms from made-in-china.com; Google results (30 URLs each); 151,363 unique URLs; URLs having unique <meta> keywords; 2,136 URLs and 1,739 search terms.]
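The filtering stage "URLs having unique <meta> keywords" can be sketched with the standard-library HTML parser. The interpretation of "unique" (a keyword that no other page in the set declares), the helper names, and the example pages are assumptions made for illustration.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collect the comma-separated content of <meta name="keywords"> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords.extend(
                k.strip().lower() for k in content.split(",") if k.strip()
            )


def meta_keywords(html):
    parser = MetaKeywordParser()
    parser.feed(html)
    return set(parser.keywords)


def urls_with_unique_keywords(pages):
    """pages: dict of url -> html. Return url -> keywords no other page declares."""
    per_url = {url: meta_keywords(html) for url, html in pages.items()}
    result = {}
    for url, kws in per_url.items():
        others = set().union(*(k for u, k in per_url.items() if u != url))
        unique = kws - others
        if unique:
            result[url] = unique
    return result


# Hypothetical pages.
pages = {
    "http://a.example/": '<head><meta name="keywords" content="led lamp, acme lamp"></head>',
    "http://b.example/": '<meta name="Keywords" content="LED lamp, desk light"/>',
}
print(urls_with_unique_keywords(pages))
```

Pages whose keywords are all shared with other pages drop out of the result, which mirrors how the pipeline narrows a large URL set to a much smaller one.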
2,136 URLs for Contextual Personalization
Persistent Personalization
[Figure: selection pipeline for persistent personalization. 551 search terms from made-in-china.com; Google results (30 URLs each); 151,363 unique URLs; URLs having unique Google AdWords keywords; 15,979 URLs.]
Evaluation
Contextual Personalization
[Figure: contextual personalization results. Values shown: 1.1%, 62.8%, 28%, 44%.]
Evaluation (cont.)
Persistent Personalization
[Figure: persistent personalization results. Values shown: 4.3%, 22.7%, ??%, 17%.]
Evaluation (cont.)
Real Users
97.1% of our 729 previously successful contextual attacks remain successful.
Only 77.78% of the persistent pollution attacks that work on fresh accounts achieve similar success.
Pollution Attacks on Amazon
Amazon tailors a customer’s homepage based on the user’s previous purchasing, browsing, and searching behavior.
We focused on the personalized recommendations Amazon generates based on browsing and searching activities.
Amazon Recommendations
Amazon’s personalization is based on history that is maintained by the user’s web browser (session cookie).
Identifying Seed Products and Terms
Visit-Based Pollution
The attacker visits the Amazon page of the targeted product and retrieves the related products that are shown on that page.
Search-Based Pollution
An attacker could use a natural language toolkit to automatically extract a candidate keyword set from the targeted product’s name.
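Keyword extraction from a product name can be approximated without a full natural language toolkit. The sketch below swaps in plain tokenization and a small stop-word list from the standard library; the function name `candidate_keywords`, the stop-word list, and the product name are hypothetical.

```python
import re

# Small illustrative stop-word list; a real toolkit would supply a fuller one.
STOP_WORDS = {"a", "an", "and", "by", "for", "in", "of", "the", "to", "with"}


def candidate_keywords(product_name, max_len=2):
    """Return lowercase words and adjacent word groups from a product name."""
    tokens = [
        t for t in re.findall(r"[a-z0-9]+", product_name.lower())
        if t not in STOP_WORDS
    ]
    candidates = []
    # Build phrases of 1..max_len consecutive tokens, skipping duplicates.
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase not in candidates:
                candidates.append(phrase)
    return candidates


print(candidate_keywords("Acme Stainless Steel Water Bottle with Lid"))
```

Each candidate could then be checked against the service's search results to see whether it actually leads to the targeted product; that checking step is not shown here.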
Q & A