Adversarial Information Retrieval Aspects of Sponsored Search

Jim Jansen

College of Information Sciences and Technology

The Pennsylvania State University

[email protected]


Agenda

• What is sponsored search?

• How does sponsored search work?

• Adversarial IR aspects – click fraud

• How does click fraud work?

• Click fraud prevention (or at least mitigation)

• Conclusion


Sponsored Search

• is an increasingly important, popular, and uniquely contextual form of information interaction on the Web.

• adversarial techniques to subvert sponsored search have received little attention in the research community.

• the negative effect of spam on the sponsored search process may have greater implications than on the algorithmic process.


Sponsored Search

• content providers pay Web search engines to display specific links in response to user queries alongside the algorithmic results.

• is increasingly important in locating information on the Web.

• a distinctive form of IR - uniquely dynamic contextual relationship.

• significant social and political repercussions if the process is significantly compromised.

Fain, D. C. and Pedersen, J. O., Sponsored Search: A Brief History, Bulletin of the American Society for Information Science and Technology, vol. 32, pp. 12-13, 2006


Sponsored Search – Goals and Processes

1. Provider Content

2. Provider Bid

3. Search Engine Review Process

4. Search Engine Keyword and Content Index

5. Search Engine User Interface

6. Search Engine Tracking

7. Searcher

[Figure: diagram of the sponsored search process, with components labeled 1 to 7]
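As a rough illustration of how the numbered components listed above fit together, the following Python sketch models providers bidding on keywords, the engine indexing approved listings, ranking them for a query, and tracking billable clicks. All names (`Listing`, `SponsoredIndex`, the sample providers) are hypothetical, and ranking purely by bid with the provider charged its own bid per click is a simplifying assumption; the slides do not specify ranking or pricing rules.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Listing:
    provider: str      # content provider (component 1)
    keyword: str       # keyword the provider bids on
    bid: float         # amount offered per click (component 2)
    approved: bool     # outcome of the review process (component 3)


class SponsoredIndex:
    """Toy keyword index plus click tracking (components 4 and 6)."""

    def __init__(self):
        self._by_keyword = defaultdict(list)
        self.charges = defaultdict(float)

    def add(self, listing):
        # Only listings that passed review are indexed.
        if listing.approved:
            self._by_keyword[listing.keyword].append(listing)

    def results(self, query):
        # Component 5: show listings for the query, highest bid first.
        # Assumption: exact keyword match and bid-only ranking.
        return sorted(self._by_keyword[query], key=lambda l: l.bid, reverse=True)

    def record_click(self, listing):
        # Assumption: the provider is charged its own bid for each click.
        self.charges[listing.provider] += listing.bid


index = SponsoredIndex()
index.add(Listing("shoes-r-us", "running shoes", 0.40, approved=True))
index.add(Listing("fast-feet", "running shoes", 0.55, approved=True))
index.add(Listing("spam-site", "running shoes", 0.90, approved=False))

ranked = index.results("running shoes")   # the searcher (component 7) issues a query
index.record_click(ranked[0])             # and clicks the top sponsored link
```

In this sketch the unapproved listing never reaches the searcher, which mirrors the role of the review step, and every recorded click is billed whether or not the searcher intended to buy anything.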


Sponsored Search – Goals and Processes

1. Searcher – find right information, site, or service with the right characteristics at the right time

2. Search Engine – serve relevant content

3. Provider – infer searcher intent or affect searcher attitude

[Figure: diagram of sponsored search goals, with participants labeled 1 to 3]


Adversarial Aspects

• Sponsored search significantly reduces spam.

• Why? The cost motive for the provider and search engine to present relevant content.

• Search engines have review processes consisting of both automated and manual aspects to help ensure this.

• Monetary factors significantly reduce spam content.


Adversarial Aspects

• There is the issue of click fraud with sponsored search.

• Click fraud is the intentional clicking on a sponsored link where the perpetrator does not intend to buy (or use) the products or services advertised.

• Click fraud has not been widely perceived as search engine spamming, but its negative effect is severe.


Click Fraud

• can take various forms, but the final result is usually the same: content providers pay for unproductive traffic.

• produces revenue for the major search engines and Web sites.

• the clicks generate sales commissions based on the content provider’s bid even if the click does not result in a sale.


Adversarial Aspects

• content providers are contractually obligated to pay for all valid clicks.

• the search engine company has discretion over what is valid.

• sponsored search programs suffered a click fraud rate of 12%, translating to more than $1.5 billion of Google's ad revenue in 2005.

• some content providers complain that their individual click fraud rate is as high as 35%.

Liedtke, M., Click Fraud Concerns Hound Google, in ABC News Money, 2006 http://abcnews.go.com/Technology/wireStory?id=1934655&CMP=OTC-RSSFeeds0312
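To make the percentages above concrete from a single content provider's point of view, the short calculation below estimates how much of a pay-per-click budget goes to fraudulent clicks at a given fraud rate. The click count and cost per click are made-up illustrative inputs, not figures from the talk; only the 12% and 35% rates come from the slide.

```python
def wasted_spend(total_clicks, cost_per_click, fraud_rate):
    """Estimate money paid for fraudulent clicks, assuming every click is billed."""
    if not 0.0 <= fraud_rate <= 1.0:
        raise ValueError("fraud_rate must be between 0 and 1")
    return total_clicks * cost_per_click * fraud_rate


# Hypothetical campaign: 10,000 billed clicks at $0.50 each ($5,000 total).
for rate in (0.12, 0.35):
    loss = wasted_spend(10_000, 0.50, rate)
    print(f"fraud rate {rate:.0%}: about ${loss:,.2f} of $5,000.00 wasted")
```

The estimate assumes the search engine treats every fraudulent click as valid; any clicks it filters out and refunds would reduce the loss.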


Click Fraud Implementation

An Example of Click Fraud on the Sponsored Listings of a Web Search Engine.


Click Fraud Implementation

An Example of Contextual Link Where Click Fraud Can Occur


Click Fraud Implementation

A Sponsored Link Concerning Google’s AdSense Program


Click Fraud Prevention

• Automated and Human Filters: employ both automated and human filters in an attempt to identify current and prevent future click fraud.

• Pay-per-action Paradigm: a shift in paradigm from pay-per-click to pay-per-action (i.e., the searcher actually executes an action, such as purchasing a product).

• Block Blacklisted IP addresses: IPs that are known spammer sites. Click fraud perpetrators also use these IPs.
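A minimal sketch of what an automated filter might look like, assuming clicks arrive as (IP address, timestamp) pairs. It rejects clicks from blacklisted IPs and repeated clicks from the same IP within a short window. The function name, the 60-second window, and the sample addresses are illustrative assumptions; the slides do not describe how any search engine's filters actually work.

```python
def filter_clicks(clicks, blacklist, window_seconds=60):
    """Split clicks into (valid, rejected).

    clicks: iterable of (ip, timestamp_in_seconds), in any order.
    blacklist: set of IP addresses known to be used by spammers.
    """
    valid, rejected = [], []
    last_valid_time = {}  # ip -> timestamp of the most recent valid click
    for ip, ts in sorted(clicks, key=lambda c: c[1]):
        if ip in blacklist:
            rejected.append((ip, ts))
        elif ip in last_valid_time and ts - last_valid_time[ip] < window_seconds:
            # Rapid repeat from the same address: treat as suspicious.
            rejected.append((ip, ts))
        else:
            valid.append((ip, ts))
            last_valid_time[ip] = ts
    return valid, rejected


# Sample addresses come from the documentation-only ranges (RFC 5737).
clicks = [
    ("192.0.2.10", 0),
    ("192.0.2.10", 5),      # repeat within the window
    ("198.51.100.7", 20),   # blacklisted address
    ("192.0.2.10", 90),     # same IP, but outside the window
]
valid, rejected = filter_clicks(clicks, blacklist={"198.51.100.7"})
```

Only the clicks in `valid` would be billed under this scheme. A rule this simple is easy to evade (for example by rotating addresses), which is one reason the slide pairs automated filters with human review and with a move to pay-per-action.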


Click Fraud Prevention

• Aggressive monitoring of click fraud perpetrators: Click fraud is similar to what occurred in the online music industry: the Recording Industry Association of America’s (RIAA) campaign against illegal file sharing via peer-to-peer networks.

• Search engines must make efforts to ensure trust: Trust is a, if not the, critical element in the sponsored search paradigm. Whether through independent auditing or internal efforts, content providers and searchers must have trust in the process if it is to be a long-term business model.


Conclusion

• View of sponsored search as “solely advertising” may be incorrect.

• From the searcher’s point of view, sponsored links are just another type of search engine result.

• Sponsored search model will have increasing impact as new players enter the field (e.g., Microsoft Research and AskJeeves) and second tier players (e.g., FindWhat, Kanoodle)


Conclusion

• Major search engines continue to transform the basic sponsored search model.

– Cross-medium linkage can significantly increase the synchronization of information pull-and-push, perhaps providing more relevant content to the searcher.

• Click fraud threatens the entire process. With rates between 12% and 16%, it translates into billions of dollars per year, and it jeopardizes the entire model as it decreases trust in the system.


Applicable References (Available at http://ist.psu.edu/faculty_pages/jjansen/)

• Jansen, B. J. and Resnick, M. Forthcoming. An examination of searcher's perceptions of non-sponsored and sponsored links during ecommerce Web searching. Journal of the American Society for Information Science and Technology.

• Jansen, B. J. 2006. Paid search. IEEE Computer. 39(7), 88-90.

• Jansen, B. J. and Molina, P. 2006. The effectiveness of Web search engines for retrieving relevant ecommerce links. Information Processing & Management. 42(4), 1075-1098.

• Jansen, B. J. 2006. Paid Search as an Information Seeking Paradigm. Bulletin of the American Society for Information Science and Technology. 32(2), 7-8.

• Jansen, B. J. and Resnick, M. 2005. Examining Searcher Perceptions of and Interactions with Sponsored Results. Workshop on Sponsored Search Auctions, The Sixth ACM Conference on Electronic Commerce (EC'05). Vancouver, Canada. 5-8 June.


CFP - International Journal of Electronic Business (IJEB)

• Special issue on Sponsored Search (Due date: 15 February 2007)

• Guest Editors

– Jim Jansen, The Pennsylvania State University

– Abdur Chowdhury, AOL and the Illinois Institute of Technology.


Questions and Discussion

Jim Jansen

College of Information Sciences and Technology

The Pennsylvania State University

[email protected]