PEERSPECTIVE.MPI-SWS.ORG ALAN MISLOVE KRISHNA P. GUMMADI PETER DRUSCHEL BY RAGHURAM KRISHNAMACHARI Exploiting Social Networks for Internet Search

Post on 04-Jan-2016


PEERSPECTIVE.MPI-SWS.ORG

ALAN MISLOVE
KRISHNA P. GUMMADI
PETER DRUSCHEL

BY RAGHURAM KRISHNAMACHARI

Exploiting Social Networks for Internet Search


Motivation

WWW, search engines, social networking
Hyperlinks – author, human, index, rank

Social Networks
    No study to examine information exchange
    Explicit links between users, not content
    Can these links be used by search engines?

In this paper
    Compare mechanisms for publishing and locating content
    Experiment: Social network based Web search
    Challenges in leveraging social networks in the future


The Web versus Social Networks

Publishing
    Users place documents on a Web server
    Author places hyperlinks on Web page that refer to related pages
    Links placed to increase rank and promote indexing

Locating
    Web search engines employ sophisticated technologies
    Google: Uses hyperlink structure and query/page relevance
    Limitations:
        New pages: discovering/indexing, hyper-linking, link(s) discovery
        # of links determines relevance -> reflects interests/biases of the Web
        Ignored: Unlinked/private pages, pages with insufficient relevance


The Web versus Social Networks

Ex: User shares Web content with friends; content is invisible to others; content is now linked between users

Publishing
    Content is posted by the user and is recommended by others
    Links among users: Directed (distinct) & Undirected (mutual)

Locating
    Traversing the social network, keyword search, top 10 lists
    Timely, relevant & reliable (non-)textual info can be found
    Content is rated by consumers, not producers
    Content is rated almost immediately; doesn’t rely on discovery


Integrating Web search and social networks

Problem
    No unified tool for searching, and none for finding content
    Social network-based search not used in Web and vice versa

Questions
    Leverage social network links to improve search results
    Explore benefits of social network-based Web search

Solution
    Conduct an experiment to investigate these questions


PeerSpective: The experiment

Web content of 10 students/researchers is shared

An HTTP proxy indexes all URLs visited by a user

When a Google search (query) is performed
    Local proxy forwards query to Google and to all peer proxies
    All proxies execute the query on local index & return results
    Results are collated and presented alongside the Google results
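
The query flow on this slide can be sketched roughly as follows. This is a minimal illustration, not the PeerSpective implementation: the class `PeerProxy`, the function `peer_search`, the substring matching, and the "rank by number of peers holding the page" rule are all assumptions made for the example. The deployed system used Lucene for local indexing and FreePastry for the overlay.

```python
from concurrent.futures import ThreadPoolExecutor


class PeerProxy:
    """Hypothetical stand-in for one user's proxy and its local index."""

    def __init__(self, name, pages=None):
        self.name = name
        self.pages = dict(pages or {})  # url -> page text seen by this user

    def index_page(self, url, text):
        # Called whenever the user visits a page through the proxy.
        self.pages[url] = text

    def search_local(self, query):
        # Naive matching: every query term must occur in the page text.
        terms = query.lower().split()
        return [url for url, text in self.pages.items()
                if all(term in text.lower() for term in terms)]


def peer_search(query, proxies):
    """Send the query to every proxy and collate the answers.

    Assumed ranking rule: pages held by more peers come first,
    ties are broken alphabetically by URL.
    """
    with ThreadPoolExecutor() as pool:
        per_peer = list(pool.map(lambda p: p.search_local(query), proxies))
    counts = {}
    for hits in per_peer:
        for url in hits:
            counts[url] = counts.get(url, 0) + 1
    return sorted(counts, key=lambda url: (-counts[url], url))


alice = PeerProxy("alice", {"http://x.org/1": "social network search",
                            "http://x.org/2": "cooking pasta"})
bob = PeerProxy("bob", {"http://x.org/1": "social network search",
                        "http://x.org/3": "network search tools"})
peer_results = peer_search("network search", [alice, bob])
```

In a full proxy the same query would also be forwarded to the Web search engine, and `peer_results` would be shown next to those results rather than merged into them.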


PeerSpective: Measurements & Experiences

In a month-long experimental deployment (10 users)
    439,384 HTTP requests
    198,492 distinct URLs (45%)
    113,800 HTML and PDF requests (25.9%)

User base is small, with highly specialized interests
The results may not represent a large, diverse group

Technology
    Local text search engine – Lucene
    Local peer-to-peer overlay engine – FreePastry
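
As a quick check of the percentages on this slide, the sketch below recomputes them from the reported counts. It assumes both percentages are taken relative to the total number of HTTP requests; the helper name `share` is made up for this example.

```python
# Counts as reported on the slide.
total_requests = 439_384
distinct_urls = 198_492
html_pdf_requests = 113_800


def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


distinct_share = share(distinct_urls, total_requests)      # slide reports 45%
html_pdf_share = share(html_pdf_requests, total_requests)  # slide reports 25.9%
```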


Limits of hyperlink-based search

Web search engines index only well linked content

Limit: URLs visited by users but not indexed by Google

Reasons why a page might not be indexed
    The page could be too new (blogs, news)
    The page could be in deep web and not well connected
    The page could be in dark web (private pages)


PeerSpective versus Google

For each HTTP request
    Does Google’s index contain this URL?
    Has some peer in PeerSpective viewed this URL?

Static HTML content (No GET/POST)
    6,679 requests (<6%) for 3,987 URLs (2%)

Google Index
    Covers 62.5% of the requests, 68.1% of the distinct URLs
    About one third of all URL requests cannot be retrieved by Google

PeerSpective Index
    Covers 30.4% of requested URLs
    Achieves half of Google’s coverage with a much smaller size
    13.3% of the URLs were in PeerSpective but not in Google’s index
    19.5% improvement by PeerSpective compared to Google search

What are the documents that interest our users, but not Google?
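
The coverage figures above relate to each other by simple arithmetic, sketched below. The sketch assumes that 68.1%, 30.4%, and 13.3% all share the same base (the distinct static URLs); the function names `coverage` and `relative_gain` and the example URL sets are hypothetical.

```python
# Percentages as reported on the slide (assumed to share one base).
GOOGLE_COVERAGE = 68.1
PEERSPECTIVE_COVERAGE = 30.4
PEERSPECTIVE_ONLY = 13.3  # in PeerSpective's index but not in Google's


def coverage(visited, index):
    """Percentage of visited URLs (a set) that also appear in an index (a set)."""
    if not visited:
        return 0.0
    return 100 * len(visited & index) / len(visited)


def relative_gain(extra, base):
    """Improvement contributed by `extra`, relative to `base`, in percent."""
    return round(100 * extra / base, 1)


# URLs found only via peers, relative to what Google already covers.
gain = relative_gain(PEERSPECTIVE_ONLY, GOOGLE_COVERAGE)
# Coverage when both indexes are consulted.
combined = round(GOOGLE_COVERAGE + PEERSPECTIVE_ONLY, 1)
# PeerSpective's coverage as a fraction of Google's ("half", roughly).
ratio = round(PEERSPECTIVE_COVERAGE / GOOGLE_COVERAGE, 2)

# Toy usage of coverage() on made-up URL sets.
toy = coverage({"u1", "u2", "u3", "u4"}, {"u1", "u2", "u9"})
```

Under these assumptions `gain` reproduces the slide's 19.5% figure, and `ratio` comes out a little under one half.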


Benefits of social network-based search

Search engines have to rank pages
    Users rarely go beyond first 20 search results

1,730 Google searches were observed
    First page results: Google – 9.45, PeerSpective – 5.17
    1,079 (62.3%) resulted in clicks on result(s)
    307 (17.7%) were followed by a refined query
    Users gave up 344 (19.8%) of the time
    933 (86.5%) of clicked results were returned only by Google
    83 (7.7%) of clicked results were returned only by PeerSpective
    63 (5.8%) of clicked results were returned by both
    9% improvement in search result clicks over Google alone
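
The click statistics can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts from the slide; the dictionary keys and the helper `pct` are made up for this example. The slide does not say which base the "9% improvement" uses, so two plausible readings are computed and neither is claimed to be the authors' definition. Computed shares can differ from the slide in the last digit because of rounding.

```python
searches = 1730
outcomes = {"clicked": 1079, "refined": 307, "gave_up": 344}
clicks = {"google_only": 933, "peerspective_only": 83, "both": 63}


def pct(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The three outcomes partition the searches; the three sources partition the clicks.
outcomes_consistent = sum(outcomes.values()) == searches
clicks_consistent = sum(clicks.values()) == outcomes["clicked"]

click_shares = {name: pct(n, outcomes["clicked"]) for name, n in clicks.items()}

# Reading 1 (assumption): extra clicks relative to Google-only clicks.
gain_vs_google_only = pct(clicks["peerspective_only"], clicks["google_only"])
# Reading 2 (assumption): extra clicks relative to every click Google could serve.
gain_vs_all_google = pct(clicks["peerspective_only"],
                         clicks["google_only"] + clicks["both"])
```

The first reading gives about 8.9%, which is the closer match to the 9% on the slide.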


How PeerSpective outperforms Google

Disambiguation
    Search terms have multiple meanings depending on the context

Ranking
    Search engine: Top rank, Social Network: Nearby pages

Serendipity
    Making unexpected or fortunate discoveries


Opportunities and Challenges

Online social networking enables new forms of information exchange
    Users can very easily and conveniently publish information
    Makes it possible to locate and access word-of-mouth (“WOM”) information
    Organizes information according to tastes and preferences of smaller groups of individuals

Opportunities and Challenges
    Privacy – willingness of individuals to share information
    Membership and clustering of social networks
    Content rating and ranking (page rank, views)
    System architecture (centralized or distributed)


Thank You