Optimizing Web Search Using Social Annotations

Optimizing Web Search Using Social Annotations By Worasit Choochaiwattana


Post on 17-Mar-2016



TRANSCRIPT

Page 1: Optimizing Web Search  Using Social Annotations

Optimizing Web Search Using Social Annotations

By Worasit Choochaiwattana

Page 2: Optimizing Web Search  Using Social Annotations

Agenda
• Optimizing Web Search Using Social Annotations
– Studies on Improving the Quality of Web Search
– Social Annotation Based Web Search
– Experimental Results
– Discussion and Conclusion

Page 3: Optimizing Web Search  Using Social Annotations

Optimizing Web Search Using Social Annotations

• Exploring the use of social annotations to improve web search
• Social annotations can benefit web search
– They are good summaries of the corresponding web pages
– The count of annotations indicates the popularity of web pages
• The study's authors proposed SocialSimRank (SSR) and SocialPageRank (SPR)

Page 4: Optimizing Web Search  Using Social Annotations

Studies on Improving the Quality of Web Search

• Two aspects
– Ordering the web pages according to query-document similarity, e.g. anchor text generation, metadata extraction, link analysis, and search log mining
– Ordering the web pages according to their quality, also known as query-independent ranking or static ranking, e.g. PageRank, HITS, and fRank
• The retrieved results are ranked based on both page quality and query-page similarity

Page 5: Optimizing Web Search  Using Social Annotations

Studies on Improving the Quality of Web Search

• Web users are creating annotations for web pages at an incredible speed
• del.icio.us has more than 1 million registered users
• Social annotations are useful information that can be used in various ways, e.g. folksonomy, the Semantic Web, and enterprise search

Page 6: Optimizing Web Search  Using Social Annotations

Social Annotation Based Web Search

Page 7: Optimizing Web Search  Using Social Annotations

Social Annotation Based Web Search
• Web page creators provide not only the web pages and anchor texts for similarity ranking, but also the link structure for static ranking
• The interaction logs of search engine users also benefit web search by providing click-through data
• Social annotation based web search focuses on how web page annotators can contribute to web search

Page 8: Optimizing Web Search  Using Social Annotations

Social Annotation Based Web Search
• SocialSimRank (SSR) measures the similarity between the query and annotations based on their semantic relation
• SocialPageRank (SPR) measures the popularity of web pages from the web page annotators’ point of view

Page 9: Optimizing Web Search  Using Social Annotations

Similarity Ranking Between Query and Social Annotations
• Term-Matching Based Similarity Ranking
– Calculates the similarity based on the count of terms shared between the query and the annotations
– Some pages’ annotations are quite sparse, and the term-matching based approach suffers from the synonymy problem
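The term-matching idea can be sketched in a few lines of Python. This is only an illustration: the function name `term_match_similarity` and the whitespace tokenization are assumptions, not part of the original slides.

```python
def term_match_similarity(query, annotations):
    """Count the query terms that also appear among a page's annotations."""
    query_terms = set(query.lower().split())
    annotation_terms = {a.lower() for a in annotations}
    return len(query_terms & annotation_terms)

# A page tagged "linux" and "ubuntu" shares one term with the query.
shared = term_match_similarity("ubuntu install", ["linux", "ubuntu"])

# Synonymy problem: "automobile" never matches a page tagged only "car".
missed = term_match_similarity("automobile", ["car", "vehicle"])
```

The second call shows why sparse annotations hurt: a relevant page scores zero whenever the user picks a synonym of the tag.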

Page 10: Optimizing Web Search  Using Social Annotations

Similarity Ranking Between Query and Social Annotations
• Social Similarity Ranking
– Observation
• Similar (semantically related) annotations are usually assigned to similar (semantically related) web pages by users with common interests. In the social annotation environment, the similarity among annotations in various forms can further be identified by the common web pages they annotate

Page 11: Optimizing Web Search  Using Social Annotations

Similarity Ranking Between Query and Social Annotations
• Social Similarity Ranking

Page 12: Optimizing Web Search  Using Social Annotations

Similarity Ranking Between Query and Social Annotations
• Social Similarity Ranking
Assume that there are N_A annotations, N_P web pages, and N_U web users.
– M_AP is the N_A × N_P association matrix between annotations and pages
– M_AP(a_x, p_y) denotes the number of users who assign annotation a_x to page p_y
– S_A is the N_A × N_A matrix whose element S_A(a_i, a_j) indicates the similarity score between annotations a_i and a_j
– S_P is the N_P × N_P matrix each of whose elements stores the similarity between two web pages
– SocialSimRank (SSR) is an iterative algorithm that quantitatively evaluates the similarity between any two annotations
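The definitions above suggest a SimRank-style mutual iteration between annotation similarity and page similarity. The following Python sketch illustrates that idea under stated assumptions: the min/max weighting of user counts, the damping constants `c_a` and `c_p`, and the default of 12 iterations are illustrative choices, and the exact update rule from the slides is not reproduced here.

```python
from itertools import product

def social_sim_rank(m_ap, c_a=0.7, c_p=0.7, iterations=12):
    """m_ap[annotation][page] = number of users who gave that annotation to that page."""
    annotations = sorted(m_ap)
    pages = sorted({p for row in m_ap.values() for p in row})
    # Transposed view: for each page, the annotations attached to it.
    m_pa = {p: {a: m_ap[a][p] for a in annotations if p in m_ap[a]} for p in pages}

    # Initialization: every item is similar only to itself.
    s_a = {(x, y): float(x == y) for x, y in product(annotations, repeat=2)}
    s_p = {(x, y): float(x == y) for x, y in product(pages, repeat=2)}

    def update(items, links, other_sim, damping):
        new = {}
        for x, y in product(items, repeat=2):
            if x == y:
                new[(x, y)] = 1.0
            elif not links[x] or not links[y]:
                new[(x, y)] = 0.0
            else:
                # Average the similarity of the linked items, weighted by how
                # comparable the two user counts are (assumed min/max weighting).
                total = sum(
                    min(w1, w2) / max(w1, w2) * other_sim[(n1, n2)]
                    for (n1, w1), (n2, w2) in product(links[x].items(), links[y].items())
                )
                new[(x, y)] = damping * total / (len(links[x]) * len(links[y]))
        return new

    for _ in range(iterations):
        s_a = update(annotations, m_ap, s_p, c_a)  # annotations via their pages
        s_p = update(pages, m_pa, s_a, c_p)        # pages via their annotations
    return s_a, s_p

# Hypothetical toy data: two related tags share pages, a third does not.
m_ap = {
    "ubuntu": {"p1": 3, "p2": 1},
    "linux": {"p1": 2, "p2": 2},
    "recipe": {"p3": 4},
}
s_a, s_p = social_sim_rank(m_ap)
```

In this toy data, "ubuntu" and "linux" end up with a positive similarity because they annotate the same pages, while "recipe" stays at zero similarity to both.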

Page 13: Optimizing Web Search  Using Social Annotations

Similarity Ranking Between Query and Social Annotations
• Social Similarity Ranking
– The time complexity of the SSR algorithm is O(N_A^2 N_P^2)
– If the scale of social annotations keeps growing exponentially, the speed of convergence of the algorithm may slow down
– The query-page similarity is then calculated from the SocialSimRank scores
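One plausible way to turn annotation similarities into a query-page score is to sum the similarity between every query term and every annotation of the page. The sketch below assumes that summation form; `ssr_similarity` and the toy scores in `s_a` are hypothetical names and values used only for illustration.

```python
def ssr_similarity(query_terms, page_annotations, s_a):
    """Sum annotation similarity over every (query term, page annotation) pair."""
    return sum(
        s_a.get((q, a), 0.0)
        for q in query_terms
        for a in page_annotations
    )

# Hypothetical similarity scores, such as an SSR run might produce.
s_a = {
    ("ubuntu", "ubuntu"): 1.0,
    ("ubuntu", "linux"): 0.4,
    ("ubuntu", "recipe"): 0.0,
}
linux_page = ssr_similarity(["ubuntu"], ["linux", "ubuntu"], s_a)
recipe_page = ssr_similarity(["ubuntu"], ["recipe"], s_a)
```

Unlike plain term matching, a page tagged only "linux" would still receive a non-zero score for the query "ubuntu".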

Page 14: Optimizing Web Search  Using Social Annotations

Page Quality Estimation Using Social Annotations

• Social Page Rank
– Observation
• High quality web pages are usually popularly annotated. Popular web pages, up-to-date web users, and hot social annotations have the following relations:
– popular web pages are bookmarked by many up-to-date users and annotated by hot annotations;
– up-to-date users like to bookmark popular pages and use hot annotations;
– hot annotations are used to annotate popular web pages and are used by up-to-date users.

Page 15: Optimizing Web Search  Using Social Annotations

Page Quality Estimation Using Social Annotations

• Social Page Rank
– Quantitatively evaluates the page quality (popularity) indicated by social annotations
– The intuition behind the algorithm is the mutual enhancement relation among popular web pages, up-to-date web users, and hot social annotations

Page 16: Optimizing Web Search  Using Social Annotations

Page Quality Estimation Using Social Annotations

• Social Page Rank
Assume that there are N_A annotations, N_P web pages, and N_U web users.
– M_PU is the N_P × N_U association matrix between pages and users
– M_AP is the N_A × N_P association matrix between annotations and pages
– M_UA is the N_U × N_A association matrix between users and annotations
– Element M_PU(p_i, u_j) is assigned the count of annotations used by user u_j to annotate page p_i
– Elements of M_AP and M_UA are initialized similarly
– P_0 is the vector containing randomly initialized SocialPageRank scores

Page 17: Optimizing Web Search  Using Social Annotations

Page Quality Estimation Using Social Annotations

• Social Page Rank
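The mutual enhancement among pages, users, and annotations can be sketched as a power-style iteration over the three association matrices. This is a minimal illustration, not the algorithm as printed on the slide: the propagation order (pages to users to annotations and back), the uniform starting vector (used instead of a random one for reproducibility), and the per-iteration normalization are all assumptions.

```python
def mat_vec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def normalize(vector):
    total = sum(vector)
    return [v / total for v in vector] if total else vector

def social_page_rank(m_pu, m_ap, m_ua, iterations=50, tol=1e-9):
    """m_pu: pages x users, m_ap: annotations x pages, m_ua: users x annotations."""
    n_pages = len(m_pu)
    p = [1.0 / n_pages] * n_pages  # assumed uniform start
    for _ in range(iterations):
        u = mat_vec(transpose(m_pu), p)             # pages -> users
        a = mat_vec(transpose(m_ua), u)             # users -> annotations
        p_mid = mat_vec(transpose(m_ap), a)         # annotations -> pages
        a_back = mat_vec(m_ap, p_mid)               # pages -> annotations
        u_back = mat_vec(m_ua, a_back)              # annotations -> users
        p_next = normalize(mat_vec(m_pu, u_back))   # users -> pages
        if max(abs(x - y) for x, y in zip(p, p_next)) < tol:
            return p_next
        p = p_next
    return p

# Hypothetical toy data: 3 pages, 2 users, 2 annotations.
m_pu = [[2, 1], [1, 0], [0, 1]]   # page 0 is bookmarked by both users
m_ap = [[2, 1, 0], [1, 0, 1]]
m_ua = [[2, 1], [1, 1]]
scores = social_page_rank(m_pu, m_ap, m_ua)
```

In this toy setup, page 0 receives the highest score because both users bookmarked it and both annotations were applied to it.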

Page 18: Optimizing Web Search  Using Social Annotations

Page Quality Estimation Using Social Annotations

• Social Page Rank
– The time complexity of the algorithm is O(N_U N_P + N_A N_P + N_U N_A)

Page 19: Optimizing Web Search  Using Social Annotations

Dynamic Ranking with Social Information

• Incorporates both the similarity features and the static features derived from social annotations into the ranking function by using RankSVM
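RankSVM learns a linear scoring function from pairwise preferences. The sketch below uses the same pairwise hinge objective but trains it with plain subgradient descent rather than an SVM solver, so it is a simplified stand-in. The feature layout (document similarity, SSR similarity, SPR score), the hyperparameters, and the toy data are all hypothetical.

```python
def train_pairwise_ranker(queries, epochs=200, lr=0.1, reg=0.01):
    """queries: list of per-query lists of (feature_vector, relevance_label)."""
    n_features = len(queries[0][0][0])
    w = [0.0] * n_features
    # Preference pairs (better minus worse) are built within each query only.
    pairs = [
        [x - y for x, y in zip(f_better, f_worse)]
        for docs in queries
        for f_better, r_better in docs
        for f_worse, r_worse in docs
        if r_better > r_worse
    ]
    for _ in range(epochs):
        for diff in pairs:
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Hinge loss pushes w . (better - worse) above 1; reg shrinks w.
            grad = [reg * wi - (di if margin < 1 else 0.0) for wi, di in zip(w, diff)]
            w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def rank_score(w, features):
    return sum(wi * fi for wi, fi in zip(w, features))

# Hypothetical features per page: [doc similarity, SSR similarity, SPR score].
best, middle, worst = [0.9, 0.8, 0.7], [0.5, 0.3, 0.2], [0.4, 0.1, 0.1]
queries = [
    [(best, 2), (middle, 1), (worst, 0)],
    [([0.6, 0.9, 0.5], 1), ([0.7, 0.2, 0.1], 0)],
]
w = train_pairwise_ranker(queries)
```

At query time, pages would be sorted by `rank_score(w, features)` in descending order.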

Page 20: Optimizing Web Search  Using Social Annotations

Experimental Results
• Delicious Data
– The data set was crawled from del.icio.us during May 2006 and consists of 1,736,268 web pages and 269,566 different annotations
– Compound annotations in various forms, e.g. java.programming or java/programming, were split into standard words with the help of WordNet before being used in the experiments
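The splitting step can be sketched as follows. A small hypothetical `vocabulary` set stands in for the WordNet lookup, and `split_annotation` is an illustrative name; the actual preprocessing used in the experiments is not described in more detail in the slides.

```python
import re

def split_annotation(tag, vocabulary):
    """Split a compound tag on punctuation and keep only known words."""
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", tag.lower()) if p]
    return [p for p in parts if p in vocabulary]

# Hypothetical stand-in for a WordNet vocabulary check.
vocabulary = {"java", "programming", "web", "search"}
dotted = split_annotation("java.programming", vocabulary)
slashed = split_annotation("java/programming", vocabulary)
```

Both compound forms reduce to the same pair of standard words, so they can be matched against each other and against query terms.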

Page 21: Optimizing Web Search  Using Social Annotations

Experimental Results
• Evaluation of Annotation Similarities
– The SocialSimRank algorithm converged after 12 iterations and was able to find semantically related annotations

Page 22: Optimizing Web Search  Using Social Annotations

Experimental Results
• Evaluation of SPR Results

Page 23: Optimizing Web Search  Using Social Annotations

Experimental Results
• Dynamic Ranking with Social Annotation
– Both a manual query set (MQ) and an automatic query set (AQ) are used
– 50 manual queries and their corresponding ground truths were obtained from a group of CS students
– 3000 automatic queries and their corresponding ground truths were obtained from the Open Directory Project

Page 24: Optimizing Web Search  Using Social Annotations

Experimental Results
• Dynamic Ranking with Social Annotation
– DocSimilarity, calculated with the BM25 formula, is taken as the baseline
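For reference, a BM25 baseline can be sketched as below. The parameters `k1=1.2` and `b=0.75` are common defaults, and the smoothed IDF variant and whitespace tokenization are assumptions; the slides do not state which settings were used.

```python
import math
from collections import Counter

def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for d in tokenized for term in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        total = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            total += idf * tf[term] * (k1 + 1) / norm
        scores.append(total)
    return scores

# Hypothetical mini-collection.
documents = [
    "ubuntu linux install guide",
    "chocolate cake recipe",
    "linux kernel news",
]
doc_scores = bm25_scores("ubuntu linux", documents)
```

The first document matches both query terms and scores highest, while the recipe page matches none and scores zero.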

Page 25: Optimizing Web Search  Using Social Annotations

Experimental Results
• Dynamic Ranking with Social Annotation

– Two popular retrieval metrics are used to evaluate the ranking algorithms

• Mean Average Precision

• NDCG at K
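Both metrics can be computed from a ranked list of relevance labels. The sketch below assumes that every relevant document for a query appears somewhere in its ranked list, and it uses the common `2^gain - 1` gain with a log2 discount for NDCG; the slides do not specify the exact variant.

```python
import math

def average_precision(relevances):
    """relevances: binary labels of one ranked list, best-ranked first."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0

def mean_average_precision(ranked_lists):
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)

def ndcg_at_k(gains, k):
    """gains: graded relevance labels of one ranked list, best-ranked first."""
    def dcg(values):
        return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(values[:k]))
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

# Two hypothetical queries with binary relevance labels.
map_value = mean_average_precision([[1, 0, 1], [0, 1]])

# Graded labels: a perfect ordering versus a reversed one.
perfect = ndcg_at_k([3, 2, 1], k=3)
reversed_order = ndcg_at_k([1, 2, 3], k=3)
```

MAP rewards placing relevant pages early across all queries, while NDCG at K also accounts for graded relevance within the top K results.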

Page 26: Optimizing Web Search  Using Social Annotations

Experimental Results
• Dynamic Ranking with Social Annotation

– Dynamic Ranking Using Social Similarity

Page 27: Optimizing Web Search  Using Social Annotations

Experimental Results
• Dynamic Ranking with Social Annotation

– Dynamic Ranking Using Social Page Rank

Page 28: Optimizing Web Search  Using Social Annotations

Experimental Results
• Dynamic Ranking with Social Annotation

– Dynamic Ranking Using Both SSR and SPR

Page 29: Optimizing Web Search  Using Social Annotations

Discussion and Conclusion
• Social annotations do benefit web search, but there are still several problems
– Annotation Coverage
• Submitted queries may not match any social annotation
• Many web pages may have no annotations
– Annotation Ambiguity
• SSR may find terms similar to the query terms while failing to disambiguate terms that have more than one meaning
– Annotation Spamming
• Malicious annotations have a good opportunity to harm the search quality

Page 30: Optimizing Web Search  Using Social Annotations

Discussion and Conclusion
• The main contributions can be summarized as follows:
– A study of how to use social annotations to improve the quality of web search
– The SocialSimRank algorithm to measure the association among various annotations
– The SocialPageRank algorithm to measure a web page’s static ranking based on social annotations

Page 31: Optimizing Web Search  Using Social Annotations

Q & A