[lecture notes in social networks] the influence of technology on social network analysis and mining...

25
Chapter 26 Integrating Online Social Network Analysis in Personalized Web Search Omair Shafiq, Tamer N. Jarada, Panagiotis Karampelas, Reda Alhajj, and Jon G. Rokne Abstract With the emergence of high speed internet applications and advanced Web 2.0 based Rich Internet Applications (i.e., blogs, wikis, etc.), it has become much easier for the users to publish data over the Web. This brings a challenge for the Web search solutions to let individual users find the right information as per their preferences, because traditional Web search engines have been built on “one size fits for all” concept. Different users of the Web may have different preferences. Search results for the same query raised by different users may differ in priority for individual users. In this book chapter, we present the extended version and results of our proposal on community-aware personalized Web search. It is quite challenging to know the preferences of the users by the search engines. We have designed and developed our unique approach of finding the preferences of users from the relevant parts of the user’s social network and community. We believe that the information related to the queries posed by the users may have strong correlation with the relevant information in their social networks. In order to find out personal O. Shafiq () J.G. Rokne Department of Computer Science, University of Calgary, Alberta, Canada e-mail: moshafi[email protected]; [email protected] T.N. Jarada University of Calgary, Calgary, Alberta, Canada e-mail: [email protected] P. Karampelas Department of Information Technology, Hellenic American University, Manchester, NH, USA e-mail: [email protected] R. Alhajj Department of Computer Science, University of Calgary, Alberta, Canada Department of Information Technology, Hellenic American University, Manchester, NH, USA Department of Computer Science, Global University, Beirut, Lebanon e-mail: [email protected] T. Özyer et al. (eds.), The Influence of Technology on Social Network Analysis and Mining, Lecture Notes in Social Networks 6, DOI 10.1007/978-3-7091-1346-2__26, © Springer-Verlag Wien 2013 589

Upload: arno-hp

Post on 23-Dec-2016

214 views

Category:

Documents


0 download

TRANSCRIPT

Chapter 26Integrating Online Social Network Analysisin Personalized Web Search

Omair Shafiq, Tamer N. Jarada, Panagiotis Karampelas, Reda Alhajj,and Jon G. Rokne

Abstract With the emergence of high speed internet applications and advancedWeb 2.0 based Rich Internet Applications (i.e., blogs, wikis, etc.), it has becomemuch easier for the users to publish data over the Web. This brings a challenge forthe Web search solutions to let individual users find the right information as pertheir preferences, because traditional Web search engines have been built on “onesize fits for all” concept. Different users of the Web may have different preferences.Search results for the same query raised by different users may differ in priorityfor individual users. In this book chapter, we present the extended version andresults of our proposal on community-aware personalized Web search. It is quitechallenging to know the preferences of the users by the search engines. We havedesigned and developed our unique approach of finding the preferences of usersfrom the relevant parts of the user’s social network and community. We believe thatthe information related to the queries posed by the users may have strong correlationwith the relevant information in their social networks. In order to find out personal

O. Shafiq (�) � J.G. RokneDepartment of Computer Science, University of Calgary, Alberta, Canadae-mail: [email protected]; [email protected]

T.N. JaradaUniversity of Calgary, Calgary, Alberta, Canadae-mail: [email protected]

P. KarampelasDepartment of Information Technology, Hellenic American University, Manchester, NH, USAe-mail: [email protected]

R. AlhajjDepartment of Computer Science, University of Calgary, Alberta, Canada

Department of Information Technology, Hellenic American University, Manchester, NH, USA

Department of Computer Science, Global University, Beirut, Lebanone-mail: [email protected]

T. Özyer et al. (eds.), The Influence of Technology on Social Network Analysisand Mining, Lecture Notes in Social Networks 6, DOI 10.1007/978-3-7091-1346-2__26,© Springer-Verlag Wien 2013

589

590 O. Shafiq et al.

interest and social-context, we find (1) activities of users in their social-network,and (2) relevant information from user’s social networks, based on our proposedtrust and relevance matrices. We have further developed a mechanism that extractsfrom user’s social network information to be used to re-rank search results from asearch engine. We also have discussed the implementation and evaluation details ofour proposed solution.

26.1 Introduction

The amount of information available over the Web is increasing rapidly. Throughthe use of high-speed internet, high capacity network, and upcoming advancedinteractive websites, like facebook, YouTube and blogging, it has become eveneasier for the internet users to publish data over the Web. With this informationflooding, it becomes hard for a user to find out right information over the Web.Search engines like Google, Altavista, Yahoo, Bing, etc. bring thousands of searchresults for a particular search query. It is almost impossible and mostly frustratingfor a human user to go through the content of all the search results manually.With the increase of information over the Web, scalability of Web search enginesbecame important, e.g., [13]. Search engines, like Google, even provide manualmechanisms to rank up and down search results. But for each and every searchresult, it is not possible for a user to adjust the ranking manually in advance; asa result categorization and personalization of search results become an importantissue [24]. Different efforts are being made in order to enable community andsocial network aware [1], user-terminology-aware [2] and group-aware [4,16] searchengines that may help in bringing results closer to the user’s interest. After providingthe personalized version [9] of Web search by Google, researchers still seem tofeel the need to move towards social network-aware search [17]. Not only Websearch, but the personalized search has been explored in other real-life scenarios,e.g., medical and digital libraries [15].

Our goal is to find out a way to prioritize search results based on the interests,activities and community of users. There are different ways that exist in order tofind out contextual information, i.e., location etc. But this information is most of thetime not enough or limited to find out the right search results for the right user atthe right time. Different users searching over the Web for the same thing may havedifferent preferences. An executive going for a business meeting would probably beinterested in a hotel closer to the meeting place; however a back-packer searches fora hotel in the same city might want to search for, rather cheaper hotel regardless ofthe location.

We propose to exploit user’s social network in order to find out the interests,activities of the users and their community, i.e., friends in the social network.This information can help in bringing word-of-mouth recommendation system fromtrusted friends in the social network. Every group of users may have certain set ofactivities over the social network, i.e. postings, public profile information, tagging,

26 Integrating Online Social Network Analysis in Personalized Web Search 591

blogging, etc. Similarly different friends of the user may also have similar activities.Most of the times, friends of a user in a social network have to some extent similarinterests and therefore could help in finding out the right information the userneeds. The information about the related search query from friends in a socialnetwork, or activities of a user in social network could help in finding out therelevant information over the Web and prioritize it for the user. It could help theuser to find out the right information at the right time. Such kind of community-aware personalized Web search utility could be used in many different ways andin many applications involving Web Intelligence and Business Intelligence, i.e., forsummarizing important search results, caching of important search results, as wellas collaboration techniques.

We have developed methodology to find out activities of users over a socialnetwork as well as their friends/community. Secondly, we find out how to use thisinformation to prioritize relevant information over the Web. This way, we are able touse social networks for finding relevant information over the Web in a personalizedway. There are different kinds of friends and communities in a social network basedon which we can find out the relevant information. Same as in real-life, a user canhave different trust matrices for different friends in a social network. This means,information collected from a more trusted friend is more important than that ofthe one from lesser trusted one. As a part of this research, we have found outmetrics based on which the trust can be built and used in a social network. Similarly,different metrics to rank the results over the Web have also been sought.

The rest of the chapter is organized as follows. Section 26.2 provides an overviewof the related work together with the pros and cons. Section 26.3 describes ourproposed solution of community-aware personalized Web search and discusses howto model the information based on user’s individual preferences as well as trustedand relevant friends of the users in their social network. Section 26.4 describes thedetails implementation and evaluation, followed by conclusion.

26.2 Related Work

Various approaches have been developed for personalized Web search based oncontextual information, i.e., interests and preferences of users [20, 23]. A numberof approaches are based on modeling and collecting contextual information. Forexample, context-aware search methods [18] use special ontology that is constructedfrom the table of contents of the book, to help formulating the query for efficiencyand to refine query search results that are relevant to the user’s interests. Location-awareness is another form of context-awareness that is used to filter out searchresults based on the current location of the users. There are different ways in whichclients access the services over the Internet. With the advancements of requirementsand complexity in consumer applications, consumers expect the services to beaware of their current environment and situation, i.e., the type of device through

592 O. Shafiq et al.

which consumers are communicating, their preferences, locations, etc. The contextinformation can be used by services to provide their clients with a customized andpersonalized behavior. Chen et al. [6] describe an ontology for supporting pervasivecontext-aware systems. An ontology is developed as a part of the Context BrokerArchitecture (CoBrA), a broker-centric agent architecture that provides knowledgesharing, context reasoning, and privacy protection supports for pervasive context-aware systems.

Other approaches take context into account one way or the other. For instance,Almeida and Almeida [1] propose an algorithm for search results ranking based oncommunity-based evidence; they call it query contextualization which is obtainedfrom user’s interactions. First, the algorithm identifies interest-based communityusing session interest graphs followed by community identification in the graphs.Based on community identification, search results are ranked. This way, communityinformation is used as a way to provide context for the query specified by the user.Authors claim to improve the search results up to 48 %.

Computing word-of-mouth recommendations is another form of context-awareness that has been introduced recently. The idea is to identify who knowswhat in the social-network of a user, and based on that some information is broughtup. Not only this, the process also includes finding out who is trustworthy sourceof information. For example, Granovetter [10] proposes how social networks canserve as a source of new information to which an individual may not otherwisehave access and Sugiyamaet et al. [21] proposes adaptive Web search based onautomatically created users profiles. This is especially useful in use-cases like job-hunting where weak social ties might be better than strong social ties to bringsome new and interesting information. Heath [11] proposes algorithms to computetrust metrics in his developed system based on social-network. The informationis collected from tags used in the system to comment on and recommend someinformation. Another algorithm helps in calculating the prevalence of an individualin the reviews of items that have been tagged with a particular tag, thereby providinga relative measure of their experience with the topic. Calculation of affinity scorebetween one individual and another is based on the analysis of reviews where theinformation is provided to the algorithm in the form of FOAF (Friend of a friend)description.

With the evolution of Semantic Web [3] over the last decade, several approachesfor personalized Web search from social networks based on Semantic Web have alsobeen initiated. For example, Carminati et al. [5] propose the usage of tags as a basisto enforce advanced personalization policies for Web based search. A multi-layerframework has been proposed where data collected by social tagging communitiesis complemented with additional services that provide users the ability of expressingtheir agreement or disagreement with existing tags, denoting the members thatthey trust based on their characteristics and relationships, or specifying policiesenforcing the criteria adopted by a given end user to access the resources basedon the associated tags and ratings. The framework helps in representing the tagginginformation as well as the user policies based on Semantic Web standards so that itsexpressivity could be exploited easily. The layers include data layer which consists

26 Integrating Online Social Network Analysis in Personalized Web Search 593

of services in-charge of managing social network data, Web metadata as well asratings by the users. Second is Rule layer that consists of modeling user-preferencesand trust policies specified by the users. This is the key of the informationthat helps in identifying contextual information based on social network. Last isApplication Layer where the output of data and rule layers is used by a set of agentscorresponding to the application layer, for a variety of purposes, i.e., social networkdata represented in Semantic Web is used by semantic search engines in order tofind users and resources matching given queries.

Zeng et al. [25] first introduce user interest models which are then used to builda social network to track the group interest related to the given user. The authorsthen refine the search task based on the semantic dataset using the acquired groupinterest. The search refinement is carried out based on the interest of the individualuser, as well as interest of the group which the user is part of.

After covering the related approaches, in the next section, we present our method-ologies for contextualizing users’ interest based on the information from their socialnetwork and use it to refine Web search results. The novelty of our approach is thatour methodologies are generic enough and can be customized for any social networkand any Web search engine. Secondly, the proposed methodologies prescribe theusage of Semantic Web technologies to make the modeling of user preferences moreexpressive that could be used during the search refinement process.

The key problems observed in the related works as described above could bearticulated as follows. Most of the related works consider location information as theonly important part of context [6]. However, in case of user-based customized Websearch, location is not the only important factor that should be taken into account.Therefore, most of the related work approaches have not considered user contextto its full potential. Secondly, several user based context calculation techniquesare computation exhaustive. For example, the work described in [1] proposes togenerate interest graphs and then tries to identify community, which makes thewhole approach highly computational extensive, and hence causes high resourceconsumption and hence not cost effective with respect to time. Further, most of theapproaches are limited to the interest represented in the form of keywords only.Considering keyword based approaches, this is a quite straight forward, simpleand cost effective method. However, it imposes a limitation because complexinformation about user’s preferences cannot be modeled using simple keywords;therefore, we have taken into account the modeling of complex preferences of usersusing semantic expressions.

Most of the approaches blindly take into account the information from social-network of a user, regardless of its relevance. There could be some information inthe social network which might be relevant from syntactic point-of-view, however,might be quite different from semantic point-of-view. Therefore, most of theapproaches are unable to identify relevant and irrelevant information from the socialnetwork of a user. We have introduced the use of relevance metrics to take intoaccount only information which is relevant to users’ interests.

594 O. Shafiq et al.

26.3 Community Aware Personalized Web Search

Currently available Web search solutions treat users mostly the same, in a “one sizefits all” manner. Web search engines give same search results for the same queriesby different users. However, in reality it is very likely that different users may havedifferent preferences. Therefore, we believe that the search mechanisms need to beaware of the contextual information, preferences and interests of the users. In thissection, we describe our proposed solution to tackle this problem.

Figure 26.1 depicts our solution in a layered manner that acts as a middlewarebetween the user and the Web search engine. The first layer takes the informationfrom the social network of users, and finds out the activities of the users in theirsocial network, e.g., what kind of postings a user is doing, tagging, and keywordsfrom user’s own profile. It helps us in knowing the current state of personal interestsof the user him/herself. Secondly, we have built a similar mechanism to find outthe activities and information posted by the community or friends of the user inthe social network; it can be tagging, posting a video, blogging and informationfrom friends’ public profile, etc. This may help us in finding out what kind ofinformation and interest does members of the community of a user have. We believethat the information collected from the community of the user may be of directinterest to the user, and hence may serve as preference of the user. However, theremay be several cases where some part of the community is not relevant for someinformation for the user (e.g., a user searching for computer science material mightnot find as relevant information from his/her friends belonging to the medical field).In order to tackle this, we have developed trust and relevance schemes to find out therelevant information from the user’s social network. This mechanism allows users tospecify trust relationship with their friends or community in the social network. Forexample, there can be some friends who are most close or trustworthy, from whomthe user might want to use more information whereas some of the friends may be oflesser trust, i.e., not close enough, from whom a user might not want to consider theinformation, similar to what happens in real-life.

We allow users to assign weights, which describe the level of trust, to thefriends and other related communities in a social network. This helps in finding outwhat information from the community is more trustworthy than others. The trustmechanism allows to pullout trustworthy information from user’s social network,which is then used to re-rank Web search results. Similarly, users are also providedwith mechanisms to specify the level of relevance of their friends in their socialnetwork, with respect to different search topics. This allows the personalizedmechanism to consider only relevant subset of the user’s social network, and henceavoid irrelevant information.

Our approach is to build our proposed solution in a generic manner, such that it isnot dependent on any particular social-network or Web search engine. Our solutioncan be customized for any search scenario, and hence can be applied on any kindof known social networks (e.g., facebook, youtube, blogs, etc.) as well as any kindof Web search engine (i.e., Google, Yahoo, Alta Vista, etc.). Our Social-Network

26 Integrating Online Social Network Analysis in Personalized Web Search 595

Fig. 26.1 Layered model forcommunity-awarepersonalized Web search

Analysis based contextual mechanism does not require any change or interferencein the Web search engine because it acts as a search-engine front-end for userssearching information over the Web; it helps them in ranking and reordering of Websearch results according to their preferences obtained from their social-network.

Figure 26.1 provides an overview of our proposed solution in the form of layeredmodel architecture. Each user and his/her friends have their activities on the socialnetwork from which the preferences and the trust are identified. Based on thetrust and relevance schemes, preferences of a user and his/her community fromthe social-network are obtained. Certain standard social-network analysis basedtechniques are used to select relevant subset preferences from the user’s community.The Social Network Aware Strategies are applied to Web search data with theinformation coming from the user’s social network, and Web search results areranked accordingly. While ranking the search results, our solution does not hideor change any results from the user. It just re-ranks and re-orders Web search resultsfor the user based on his/her social-network information, which might be interestingfor the user. We define the social network, in a generic way, as follows:

Definition 1 (Social Network SN). Let SN be a social network that shows connec-tivity between different members in the social network. It can be represented as atwo-dimensional matrix of n rows and n columns.

Social Network WD SN .n ; n/

where, SN is the two-dimensional matrix. Each of the entries in SN is referred to as,SNi;j , where 1 � i � n and 1 � j � n. The value of each entry SNi;j in the matrixdescribes the connectivity between two particular members of the social network,namely i and j. Null values refer to no connectivity, whereas, positive values indicatethe existence of some connectivity between the members.

In the subsections below, the sub-modules of the community-aware personalizedWeb search model shown in Fig. 26.1 are described in detail.

596 O. Shafiq et al.

26.3.1 User Preferences and Trust Model

In order to know user’s preferences in a social network, activities of users andtheir friends in the social-networks are to be modeled in a systematic way. Thisinformation could be used for the purpose of ranking and re-ordering Web searchresults. The design methodology behind our solution is generic. Therefore, oursolution is not restricted or dependent on any particular type of social network.In any social network, a user has certain activities which can be postings of certaininformation, doing some activities, attending some events, tagging and commentingon photos of the users, etc. The same holds for the friends of the considered userin the social-network; they are expected to have similar kind of activities. We haveadopted a simpler way to model user preferences by having this information askey-pair values. This is a practical approach where all the activities of a user aresummarized in the form of a set of keywords. For example, if a user is blogging onthe topic of soccer in his/her social network about the upcoming FIFA World Cup in2010 in South-Africa, in that case a simple set of keywords mentioning interests ofthe user may be listed as {FIFA, World Cup, soccer, 2010, South Africa}. Similarsets of keywords are collected for each of the users performing their activities onthe social network.

26.3.1.1 User Preferences Metric

A sample list of keywords of interest for users is given in Table 26.1 that shows theid of the users against the important keywords collected based on their activities inthe social network.

Based on the activities of the users, the keywords provided in Table 26.1 representtheir interests. These keywords, once collected, can be taken into account forranking Web search results for the users. Similarly, keywords of other users (whichmay be friends of each other) could be taken into account while ranking based onthe interest of a group of people.

Definition 2 (Preference List). Let PL be the list of preferences of a member inthe social network, which may also be a user searching for information over theWeb. PL can be represented as tuple:

PL . index; keywords .m/ /

where, index is a numeric item that corresponds to the user ID in the socialnetwork SN, and keywords is a one dimensional array of keywords, having eachitem keywordsi as preference of the user, with length 1 � i � m.

26.3.1.2 User Trust Matrix

Each user in a social network has his/her friends. However, as in real-life, thereare different ties with different friends. These ties may also vary with time. Close

26 Integrating Online Social Network Analysis in Personalized Web Search 597

Table 26.1 List of keyword sets for users in a social network

User ID Set of keywords

1 FIFA, World Cup, soccer, 20102 Cricket, World Cup, Australia, 19923 Politics, North America4 Programming, Java, Web, Enterprise5 Traveling, Airline tickets, holidays6 Electronics, Boxing day, sales7 Movies, Hollywood, Bollywood, Avatar

friends or important friends normally have closer ties and hence higher trust factorthan that of others. Our approach takes into consideration the social ties and trustof users with their friends in the social network. This information is necessary andis required in ranking the information. For some of the searches, friends of a userwith closer ties and high trust metric might be of more importance than others, e.g.,while buying a computer. However, there might be cases where weak social tiesare more important for a user than stronger social ties, e.g., in case of job search.Our approach is to represent the social ties based on trust in a numerical scale from0 to 10, ranging from 0 as the no/weakest ties to 10 as the strongest ties. Suchinformation is calculated for each user with each of his/her friends in the form of amatrix; it is taken into account while performing social network aware personalizedWeb search ranking.

Definition 3 (Trust Matrix TM). Let TM be the social network (as describedin Definition 1) that shows connectivity between different members in the socialnetwork. The value of each entry TMi;j in the matrix describes the level of trustbetween the two connected members.

TM .n; n/

As described in Definition 1, null values mean no connectivity; whereas, positivevalues describe connectivity between members, with its value ranging from 1 to 10as the level of trust (i.e., 1 as for lower level of trust and 10 as for higher level oftrust). We describe standard representation of a social network as follows:

1. Vertex v represents a user in social network n2. Edge e represent connection between two particular users in social network n3. Weight w of each edge represents the level of trust between the two connected

users in social network n4. Preferences p denote the list of preferences/interest values or logical expressions

for each of the users in the social network

Table 26.2 shows a one-mode adjacency matrix of a sample social network.Rows and columns show users A to G and each of the cells shows the weightof the connecting edge, i.e., the strength of trust relationship between the users.

598 O. Shafiq et al.

Table 26.2 TM – Trust matrix for a simple social network

A B C D E F G

A X 9 9 9 4 2 3B 2 X 7 6 6 5 1C 6 2 X 4 9 6 8D 4 5 3 X 8 1 4E 0 8 4 4 X 7 6F 8 5 7 2 2 X 2G 2 4 6 5 5 2 X

The diagonal of the matrix shows the highest weight as we assume that a user mayhave his/her own interests at highest priority.

26.3.1.3 User Relevance Matrix with Respect to Topics

As described in the previous section about user trust matrix, where the level of trustbetween different users in a social network is quantified with numbers, in a similarway, we represent the relevance of user or user-groups relationship with each other,with respect of a particular topic, in a quantifiable manner. For example, the user isinterested in sports (particularly football), however, there are couple of friends in theuser’s social network relevant to football as well as cricket. Now, in order to performgroup-level preference calculation, it will be important to include the informationobtained from friends relevant to football, rather than cricket. In order to representthis level of relevance, we propose the use of relevance matrix.

Definition 4 (Relevance Matrix RM). Let RM be the relevance matrix that showsthe level of relevance between different members in a social network, with respectto a particular topic.

RM . n; n; topic /

where, RM is the two-dimensional matrix,. Each of the entries in RM is referredto as, RMi;j , where 1 � i � n and 1 � j � n. The value of each entry RMi;j inthe matrix describes the level of relevance between members i and j, with respect toa particular topic, with the value ranging from 1 to 10 (i.e. 1 as for lower level ofrelevance and 10 as for higher level of relevance).

Tables 26.3a and b shows two relevance matrices that show relationships betweenvarious users in a social network, based on different topics.

After representing the level of relevance as well as the trust between the differentusers, we calculate balance between the trust and relevance matrices, and henceidentify the most relevant and trust friends of a user in his/her social network to findthe list of preferences. The balance between trust and relevance is calculated usingthe expression given as follows:

26 Integrating Online Social Network Analysis in Personalized Web Search 599

Table 26.3 RM – Relevance matrix of users based on topics ‘Football’ and ‘Cricket’

A B C D E F G

(a) Topic: Football

A X 5 7 2 5 3 2B 5 X 6 8 2 6 6C 3 2 X 5 6 5 3D 8 5 3 X 2 4 1E 0 8 4 4 X 2 2F 1 5 7 2 2 X 4G 2 4 6 5 5 2 X

(b) Topic: Cricket

A X 7 3 5 8 9 2B 5 X 7 6 4 6 6C 4 2 X 4 7 5 5D 7 5 3 X 2 3 3E 3 8 4 4 X 2 7F 4 5 7 2 2 X 6G 2 4 6 5 5 2 X

Table 26.4 Calculated measures based on relevance and trust

A B C D E F G

A X 0 0 0 0 6 4B 10 X 42 48 12 30 36C 18 4 X 20 54 30 9D 32 25 9 X 16 4 1E 0 64 16 16 X 14 4F 0 25 49 4 4 X 16G 4 16 36 25 25 4 X

˛TMi;j C ˇRMi;j D Pi;j (26.1)

where TM is the trust matrix and RM is the relevance matrix. Moreover, ˛ and ˇ

have been introduced as two threshold measures, which can be fine tuned by theuser to reflect whether he/she gives more preference to trust, relevance or both.

Table 26.4 shows the calculated measures between different users in a socialnetwork, based on the relevance as well as trust. Once the balance of trust andrelevance between a user and his/her friends is calculated, most trustworthy aswell as highly relevant part of the social network is selected to deduce preferenceinformation from the relevant part of user’s social network. This way, we manageto focus on the key and required part of the social network of the user, ratherthan performing calculation for whole network, and hence avoid non-trusted andirrelevant information.

600 O. Shafiq et al.

26.3.2 Social Network Analysis Strategies

Once the important part of the social network (with higher relevance and trust) isselected, social network analysis (SNA) strategies [8] constitute the next importantstep of our proposed solution. They further help in narrowing-down the requiredmembers of the user’s social network, for whom the preferences are to be collectedin order to re-rank Web search results. SNA measures help in analyzing the socialnetwork in terms of network and graph theory consisting of nodes and ties. Nodesare individual actors in social networks. For example, betweenness centrality ofa node calculates the extent to which the node lies between other nodes in thenetwork. This measure takes into account the connectivity of the node’s neighbors,giving a higher value for nodes which bridge clusters. An edge that is said to bea bridge, if deleted, would cause its endpoints to lie in different components ofthe social network. Closeness measure shows the degree to which an individualis near all other individuals in a network (directly or indirectly). Degree measurecounts the number of direct ties to other actors in the network. This is also knownas the “geodesic distance”. We have used such SNA measures to identify for a givenuser his/her friends whose preferences and interests can help in better ranking Websearch results. Below we list a couple of general purpose strategies which could beeffective by employing SNA measures:

Strategy 1: A user may want to get preferences from his/her friends to rank Websearch results.

Technique: Preferences of friends with higher value of betweenness centralityshould be taken into account while ranking Web search results. List of prefer-ences would be calculated by considering Eq. (26.1) and Definition 2 as:

From Eq. (26.1) . . . ˛ TMi,j C ˇ RMi,j D Pi,j and based on Definition 2 : : :

PL(index, keywords(m))Betweenness Centrality is calculated as

CB.v/ DX

s¤v¤t2V

�st .v/

�st

where �st is the number of shortest paths from s to t, and �st (v) is the number ofshortest paths from s to t that pass through vertex v.

PL. Max .Cb.P ///

where P is the sub-part of the social network that was identified after calculatingmembers with higher relevance and trust. Preference list of members with maximumbetweenness centrality in the identified P sub-social network would be consideredfor re-ranking Web search results.

26 Integrating Online Social Network Analysis in Personalized Web Search 601

Strategy 2: A user may want to get preferences from his/her friends havinghigher degree of dependence within the social network; this is to be used inranking Web search results.

Technique: Preferences of the friends connected with Bridge edge should betaken into account while ranking Web search results.

The bridge measure is computed as B(SN); its deletion increases the number ofconnected components within the social network SN. The algorithm follows fromTarjan [22].

PL ..B.P ///

where P is the sub-part of the social network that was identified after calculatingmembers with higher relevance and trust. Preference list of members in the Bridgevertex in the identified P sub-social network would be considered for re-rankingWeb search results.

Strategy 3: A user may want to get preferences from his/her friends, who arefound closely related in the social network, to rank Web search results.

Technique: Preferences of the friends with higher Closeness measure should betaken into account while ranking Web search results.

Closeness is calculated as

CC .v/ DX

s¤v¤t2V

1Pt2V=v

dG.v; t/

which is the reciprocal of the sum of geodesic distances to all other vertices inthe social-network.

PL.Max.Cc.P ///

where P is the sub-part of the social network that was identified after calculatingmembers with higher relevance and trust. Preference list of members with thehighest Closeness, in the identified P sub-social network, would be considered forre-ranking Web search results.

Strategy 4: A user may want to get preferences from his/her friends, who arefound receiving higher degree of connections in the social network, to rank Websearch results.

Technique: Preferences of the friends, with higher In-degree measure, should betaken into account while ranking Web search results.

In-degree is calculated as indeg�.v/ DX

v2V

deg�.v/

PL.Max.indeg�.P ///

602 O. Shafiq et al.

where P is the sub-part of the social network that was identified after calculatingmembers with higher relevance and trust. Preference list of members with thehighest in-degree value, in the identified P sub-social network, would be consideredfor re-ranking Web search results.

Strategy 5: A user may want to get preferences from his/her friends who arefound actively communicating with other friends in the social network, to rankWeb search results.

Technique: Preferences of the friends, with higher Out-degree measure shouldbe taken into account while ranking Web search results.

Out-degree is calculated as outdeg C (v) DX

v2V

degC.v/

PL.Max.outdeg C .P ///

where P is the sub-part of the social network that was identified after calculatingmembers with higher relevance and trust. Preference list of members with thehighest out-degree, in the identified P sub-social network, would be considered forre-ranking Web search results.

Strategy 6: A user may want to get preferences from his/her close friends to rankWeb search results.

Technique: Preferences of the friends in the social network that are directlyconnected to the user should be taken into account while ranking Web searchresults.

Directly connected friends D are the ones at distance dD0

PL.Max.D.d D 0/.P ///

where P is the sub-part of the social network that was identified after calculatingmembers with higher relevance and trust. Preference list of members with thefriends connected at distance d D 0, in the identified P sub-social network, wouldbe considered for re-ranking Web search results.

Strategy 7: A user may want to have recommendation for ranking search resultsbased only on his/her own interests.

Technique: Only preferences of the user him/herself should be taken into accountwhile ranking Web search results.

PL (user) shows the Preference list of the user him/herself only, and does not takeinto account the social-network of the user.

Strategy 8: A user may want to have recommendations for ranking search resultsbased on the interest of closely connected friend in his/her community.

26 Integrating Online Social Network Analysis in Personalized Web Search 603

Technique: Preferences of the friends forming a Clique, should be taken intoaccount while ranking Web search results.

Clique is calculated as Cl D subset of at least three members in the social-network, such that every two members in the subset are directly connected.

PL.C l.P //

where P is the sub-part of the social network that was identified after calculatingmembers with higher relevance and trust. Preference lists of members who arepart of the clique, in the identified P sub-social network, would be considered forre-ranking Web search results.

The above mentioned strategies use standard SNA measures to help the usergetting preference list from the members of his/her social network in differentpossible ways, while performing Web search. These measures are generic and can beapplied to any kind of social network and any Web search engine. These strategieshelp the user to know the preferences as a set of recommendations that are takeninto account while performing re-ranking of Web search results.

26.3.3 Ranking Mechanism for Web Search

In this section, we provide the details of our next key component of the social-network aware personalized Web search solution. It takes into account the pref-erences list which is retrieved as described in the previous sub-sections, usesthe preferences to re-rank Web search results. This way, Web search results arere-ordered, based on the information obtained from the social network of a user, tohelp in displaying more relevant and interesting Web search results at the top of theretrieved list.

Ranking of information is carried out on individual as well as group basis. Incase of individual basis, the user performing the search is the point of reference andall the calculations from the social network measures are carried out accordingly.However, in case of group-based ranking, the calculations from the social networkmeasures are carried out with reference to all users taking part in the group.

26.3.3.1 Individual Ranking

Ranking of Web search results based on the preferences of an individual user iscarried out while taking the user as the point of reference and all the calculationsfrom the social network measures are carried out accordingly. Preferences of theusers, after performing the calculations from the social network measures, are taken

604 O. Shafiq et al.

Algorithm 1 Individual Ranking of Web search resultsInput: User Query Q, Social-Network SN represented as the adjacency matrixOutput: List of re-ranked Web Search resultsbegin

1. TM (SN) is the adjacency matrix of the user’s social network representing the trust level asdescribed in the previous sub-sections

2. RM (SN, topic) is the adjacency matrix of the user’s social network representing the level ofrelevance with respect to a topic, as described in the previous sub-sections

3. P D ˛ TM C ˇ RM is the subpart of the social-network, as specified in Eq. (26.1)4. PL D ( index , keywords (m) ) P, where PL is the list of preferences from the members of the

sub social-network P5. List R’ D Re-ordered (R, PL), where R’ is the re-ordered list of Web search results based on the

list of preferences PL

end

into account while ranking Web search results. Listing 1 is proposed for rankingWeb search results based on individual ranking.

Listing 1. Algorithm for individual ranking of Web search results SourceCodewww

Algorithm 1 is described below step-wise.

Step 1: Calculate the sub-social network based on higher level of trust andrelevance matrices as P

Step 2: Calculate the list of members from the sub-social network P from thesocial network analysis strategy selected by the user

Step 3: Retrieve the preferences/interest keywords of all the users identified bythe calculation in Step 1 and Step 2

Step 4: Take top 100 search results from the search engine based on the searchquery provided by the user

Step 5: Match the list of preferences/interest keywords obtained in Step 3 withthe Web search results

Step 6: Re-order the Web search results in descending order to show the mostrelevant Web search results on the top, with respect to the selected social-network

There might be the possibility of having thousands of Web search results formost Web search queries; therefore, matching the list of preferences with each ofthe returned Web search results is not feasible. Hence, we propose to perform thematching of the preference/interest keywords for the top 100 Web search results.

26.3.3.2 Group Based Ranking

Ranking Web search results based on a certain group in the social network is carriedout while taking into account all users who are part of the group. Calculations for

26 Integrating Online Social Network Analysis in Personalized Web Search 605

Algorithm 2 Group Ranking of Web search resultsInput: User Query Q, Social-Network SN, represented by the adjacency matrixOutput: List of re-ranked Web Search resultsbegin

1. SN (n, e) is the network with n vertices and e edges2. SN’ D Social Network SN with vertices merged into one for the members identified in a group3. TM (SN’) is the adjacency matrix of the user’s social network representing the trust level among

members, as described in the previous sub-sections4. RM (SN’, topic) is the adjacency matrix of the user’s social network representing the level of

relevance among members, with respect to a topic, as described in the previous sub-sections5. P D ˛ TM C ˇ RM, is the sub part of the social-network, as described in Eq. (26.1)6. PL D (index, keywords(m)) P, where PL is the list of preferences from the members of the sub

social-network P7. List R’ D Re-ordered (R, PL), where R’ is the re-ordered list of Web search results based on the

list of preferences PL

end

the social network strategies are done for each of the user in the group. Then for aparticular social network strategy, preferences of the users identified by the group inthe social network are taken into account while ranking Web search results. Listing 2is proposed for ranking Web search results based on group ranking.

The key difference between the two algorithms for individual ranking and group-based ranking is the additional step in the group-based ranking algorithm, wherethe calculations are carried out for all users taking part in the group in the socialnetwork.

Listing 2. Algorithm for group ranking of Web search resultsThe algorithm is described below step-by-step.

Step 1: Retrieve the social network of the members belonging to a particulargroup.

Step 2: Merge vertices representing the group as one node in the social network,and calculate the sub social-network based on higher level of trust and relevancematrices as P

Step 3: Calculate the list of members from the sub social-network P derived byapplying selected the social network analysis strategy, with respect to the group

Step 4: Retrieve the preferences/interest keywords of all the users identified bythe calculation in Step 3

Step 5: Take the top 100 search results from the search engine based on the searchquery provided by the user

Step 6: Match the list of preferences/interest keywords obtained in Step 4 withthe Web search results

Step 7: Re-order the Web search results in descending order to show, with respectto the selected social-network, the most relevant Web search results on top

606 O. Shafiq et al.

26.3.3.3 Fractional Cascading for Ranking Search Results

After calculating the preferences based on personal and group level informationfrom the user’s social network, the final step is to rank the relevant Web searchresults based on the best matches with the preferences obtained. However, lookingat the huge size of Web search results, this task can become very cumbersomeand time-consuming to do (i.e., if we start matching each Web search result witheach of the preferences and then calculate and rank them in the order of bestmatch). Especially if the list of preferences is higher, the ranking process can takea significantly large amount of time, in comparison to the time in which the resultsare obtained from the Web search engines. Therefore, we employ the techniques oforthogonal range search to perform the ranking of Web search results for orderingthe best match.

Orthogonal range searching formalizes the search based on multiple dimensions(from one to n dimensions). We take each preference item as one dimension.So, if we have three items in the calculated preference list, then we have a three-dimensional match/ranking mechanism. Similarly, if there are 4, 5 or n items in thepreference list, the dimensions will be 4, 5 or n, respectively. In order to keep oursolution as generic as possible, we have chosen to use a dynamic data structure formatching the search results based on n dimensions. In this case, we use fractionalcascading mechanism, which takes into account each dimension, and calculates thematch with the first dimension; if a good match is found with the first dimension,only then proceeds to the other dimensions, otherwise it drops the calculation forthe rest of the dimensions (i.e., preference items) for the particular search result, andthen proceeds to the next search result accordingly. In this case, we save the time, bycalculating the necessary dimension match only, and drop the search results whereno good match is found with any of the dimensions.

The Listing 3 sketches the fractional cascading based algorithm for ranking Websearch results based on the preferences mentioned as dimensions. It calculates thematch of each Web search result R with each dimension D. As soon as a Web searchresult R does not match any of the dimensions, the algorithm stops calculating thefurther matches for that Web search result and continues to the next search result;hence it saves time by calculating only for the required dimensions. Once dimensionmatch check is performed for each Web search result D, the counter array itemshaving count equal to the number of dimensions is the answer, i.e., it ranks thoseWeb search results as the highest ones; they matches with every dimension, andvice versa. Web search results are then re-ordered based on the ranking obtainedfrom this mechanism.

Listing 3: Fractional-cascading based Ranking algorithm

As already described by fractional cascading technique [19], cost estimation ofthe algorithm is O.k/ C n, where n is the time spent in matching dimension D witha Web search result.

26 Integrating Online Social Network Analysis in Personalized Web Search 607

Algorithm 3 Fractional cascading for ranking search resultsCounter [1: : :. D] + 0For each Web search result RFor each Preference as dimension Dif (D matches R)counter[D] CCelsebreak and continue to next Web search result

Evaluation and Discussion

Ranking algorithms are the ones that involve maximum processing in re-ordering theWeb search results based on the preferences of the user’s community. As describedin Chap. 3, the complexity and cost estimation of our ranking algorithm, as beingbased on fractional cascading technique [19], is O.k/ C n, where n is the timespent in matching dimension D with a Web search result. The increase in thecost of the algorithm, with increasing dimension is linear, therefore, the approachis sustainable, as the list of preferences increases or extends. Moreover, we havelimited the re-ordering of only top 100 results, rather than attempting to re-orderall of the search results. This is because, after analyzing click-through and click-stream datasets, we found that, although many of the Web search engines providemillions of the search results. However, users mostly visit the top range of the Websearch results, and do not normally visit the results going beyond certain initialrange, normally in terms of hundreds. Therefore, re-ordering of the top 100 searchresults is beneficial for the users.

From implementation point-of-view, we have evaluated our solution using a real-life data set of social network, based on wikipedia where different users postedinformation, and compared the personalized Web search results with the originalsearch results in many different ways. One way to evaluate was the comparison ofranked search results with original search results from a search engine, through userfeedback, i.e. by first allowing the users to build a sample social network of webblogs through a wiki, and then using the community information collected fromwiki pages, to rank the search results. We assumed that different users posting onsame wiki page, were taken as connected to each other as “friends” in the socialnetwork.

For this purpose, we used the history of wiki page updates, to find out whichusers have been posting and sharing information of different wiki pages. In order toevaluate the usability of personalized search results, each participant was required toissue a certain number of test queries and determine whether each result is relevantto his/her interest or not. Users compared the original search results from a searchengine versus ranked search results, and rated the usefulness of the personalizedWeb search results in different aspects. The important observation was that, not onlyusers found a significant number of re-ranked Web search results more relevant, andalso did not find our results irrelevant as compared to the original search results.

608 O. Shafiq et al.

Search Query: FootballOriginal Search Result titles:

1. NFL.com – Official Site of the National Football League2. BBC SPORT j Football3. FIFA.com – Fédération Internationale de Football Association (FIFA)4. CBC.ca Meyer’s all-in approach a theme among college football coachesý5. Official Site of the Premier League – Barclays Premier League News6. The Official Website of Arsenal Football Club j Arsenal.com7. AFL – The official site of the Australian Football League – AFL.com.au8. Manchester United Football Club9. American football – Wikipedia, the free encyclopedia

10. Sky Sports j Football News11. Chelsea Football Club – Official Site for News, Tickets & Fixtures

Search Query: FootballPersonal Preferences: FIFA, American Football, ChelseaRe-ordered Search Results:

1. FIFA.com – Fédération Internationale de Football Association (FIFA)2. American football – Wikipedia, the free encyclopedia3. Chelsea Football Club – Official Site for News, Tickets & Fixtures4. NFL.com - Official Site of the National Football League5. BBC SPORT j Football6. CBC.ca Meyer’s all-in approach a theme among college football coachesý7. Official Site of the Premier League – Barclays Premier League News8. The Official Website of Arsenal Football Club j Arsenal.com9. AFL – The official site of the Australian Football League – AFL.com.au

10. Manchester United Football Club11. Sky Sports j Football News

This is because, our approach does not hide any search results, but only showsthe same search results, but in a re-ordered fashion. Therefore, the risk of bringingany irrelevant search results goes almost to none. The advantage of this evaluationapproach, based on user feedback, was that the relevance of the search resultswas explicitly judged by the participants, and hence, there was no compromise oncalculating the rating for the relevance of a Web search result, as per user needs.

Given below are the listings that show the search results obtained from Websearch engine against the reordered search results based on the preferences of theuser in its social network. Listing 4 shows the re-ordered search results based on thepreferences of the user itself. Listing 5 shows the re-ordered search results basedon the preferences of the user. While Listing 6 shows the re-ordered search resultsbased on the preferences computed from the friends in the user’s social network.

Listing 4. Original search results for a sample query

26 Integrating Online Social Network Analysis in Personalized Web Search 609

Search Query: FootballCommunity Preferences: FIFA, American Football, Australian Football, Chelsea,ArsenalRe-ordered Search Results:

1. FIFA.com – Fédération Internationale de Football Association (FIFA)2. American football – Wikipedia, the free encyclopedia3. Chelsea Football Club – Official Site for News, Tickets & Fixtures4. AFL – The official site of the Australian Football League – AFL.com.au5. The Official Website of Arsenal Football Club j Arsenal.com6. NFL.com – Official Site of the National Football League7. BBC SPORT j Football8. CBC.ca Meyer’s all-in approach a theme among college football coachesý9. Official Site of the Premier League – Barclays Premier League News

10. Manchester United Football Club11. Sky Sports j Football News

Listing 5. Reordered search results for a sample query based on individual user’sinterests

Listing 6. Reordered search results for a sample query based on interests of usersand its social network

However, we believe that the constraints on the number of participants and testqueries did effect and may have caused some level of biasness in the evaluationresults on accuracy and reliability of the ranking algorithms. In order to avoid thislimitation, we have used click-through data that is recorded in search engine logsto simulate user experiences in Web search, and analyze it, as described in [19].In general, when a user issues a query, the user usually checks search results intop to bottom manner. The user may click one or more results that look relevantand skips those documents that the user is not interested in. If our personalizationmechanism ranks relevant results for the users higher in the search results list, theuser might be more satisfied and browse through the re-ordered search results morethan that of original search results. Therefore, we may utilize the user click historyas a way to judge the level of satisfaction of the user to evaluate the ranking searchaccuracy. In this way, it has become easier for us to perform a large-scale evaluation.Furthermore, click-through data reflects the real world distribution of queries, users,and search scenarios. Thus, in our evaluation, usage of click-through data hasbrought results closer to the practical use cases in the evaluation of personalizedsearch than that of using user surveys.

Click-through data in search engines can be seen as a set of triples (q, r, c)consisting of the query q, the ranking r presented to the user, and the set c of links theuser clicked on. Listing 5 shows an example, the user asked a query, and receivedthe ranking shown as order of results, and then clicked on the links. Since everyquery corresponds to one triple, the amount of data that is potentially available isvirtually unlimited. Users, normally, do not click on links at random, but make a(somewhat) informed choice. While click-through data is typically noisy and clicks

610 O. Shafiq et al.

Table 26.5 Average DGC for search results

Result set DGC Standard deviation

Original search results 0.518 0.119Re-ordered search results 0.892 0.147

are not perfect relevance judgments, the clicks are likely to convey at least someinformation. Click-through data can be recorded with little overhead and withoutcompromising the functionality and usefulness of the search engine. In particular,compared to explicit user feedback, it does not add any overhead for the user. Thequery q and the returned ranking r can easily be recorded whenever the resultingranking is displayed to the user. For recording the clicks, a log file can be kept by asimple proxy system.

In order to collect the data and use it for experiments and testing, a simple Websearch engine was built, that uses well-known Web search engines (like Google,Yahoo and Altavista) to collect the Web search results, and performs the re-orderingof the Web search results (as per user preferences) on client side. Our searchengine works as follows: the users type query into the basic interface. This queryis forwarded to the search engine. The results are returned by one of these basicsearch engines and the top 100 suggested links are extracted. Re-ordering of thesearch results is then performed and top 10 results are shown to the user. For eachof the presented link, the system displays the title of the page along with its URL.The clicks of the user are recorded using the proxy system.

To be able to compare the quality of different retrieval functions, methodsprescribed in [14] have been used. The idea is to present two rankings as A and Bat the same time and then compare them after combining them as C. This particularform of presentation leads to a blind statistical test so that the clicks of the userdemonstrate unbiased preferences. If the user scans the links of C from top tobottom, at any point he has seen almost equally many links from the top of A as fromthe top of B. To measure the ranking quality, we use the Discounted Cumulativegain (DCG) [12], which is a measure that takes into consideration the rank ofrelevant documents and allows the incorporation of different relevance levels. DCGis defined as follows:

DCG.i/ D�

G.1/ if i D 1

DCG.i � 1/ C G.i/= log.i/ otherwise

where i is the rank of the result within the result set, and G.i/ is the relevancelevel of the result. We used G.i/ D 2 for highly relevant documents, G.i/ D 1 forrelevant ones, and G.i/ D 0 for non-relevant ones.

As shown in Table 26.5, automatic re-ordering of search results clearly improvessearch result quality. For example, the search results for query “football” withdifferent users having different preferences have been reformulated by our systemresulting in a better DCG.

26 Integrating Online Social Network Analysis in Personalized Web Search 611

The pure personalized results slightly overdue personalization, however, whencombined with the original web ranks using the personalization, the original Googleresults are outperformed. The ranking quality obtained by re-ordering the searchresults based on the user preferences indicates the need for our approach ofcommunity-aware personalization strategies based on the information need. Clickentropy, as introduced in [7], is defined as a measure of the variation in theinformation needs of users for a query as follows:

ClickEntropy.q/ DX

p2P.q/

�P.pjq/ log P.pjq/

where, ClickEntroy(q) is the click entropy of query q. P.q/ is the collection of webpages clicked on query q. P(pjq) is the percentage of clicks on web page p amongall clicks on q, i.e., P.pjq/ D jC licks.q;p;�/j

jC licks.q;�;�/jClick entropy is a direct indication of query click variation. If all users click only

one identical page on query q, we have ClickEntroy(q) D 0. Smaller click entropymeans that the majority of users agree with each other on a small number of webpages. In such cases, there is no need to do any personalization. Large click entropyindicates that many web pages were clicked for the query. This may mean that eithera user has to select several pages to satisfy his/her information need, which meansthat the query is most likely an informational query. Therefore, personalization-based reordering of search results can help to in bringing more relevant results tothe users or different users have different selections for a search query, which meansthat the query is an ambiguous query. In this case, personalization-based reorderingof search results can be used to provide different web pages to different users, as pertheir needs.

We show the average search performance improvement of different personal-ization strategies on test queries with different click entropies in Fig. 26.2 thatshows the improvement of personalized search performance increases with clickentropy being larger, especially when click entropy is higher than 1.5. For click-based methods, individual-level personalization and group-based personalization,the improvement is limited for the queries with click entropy being lower, i.e.between 0 and 0.5. In the case of low click entropy, our approach brings only0.37 % of improvement which is not statistically significant. It shows that theusers have general consistent clicks for the queries with low click entropy, andthus, no personalization is necessary for the current search results. However,for the queries where click entropy is significantly higher, say greater than 3.0,the result is obviously different. Both individual-level personalization and group-based personalization gets a dramatic improvement by having more relevant searchresults. In terms of ranking and re-ordering of search results, both, the group-based personalization and the individual-level personalization improve by 25 %.This shows that the search queries with higher click entropy benefit more frompersonalization. Therefore, we perform personalization and re-ordering of searchresults, only when it is beneficial to do so (i.e. click entropy is significantly high).

612 O. Shafiq et al.

Fig. 26.2 Click entropy vs. improvement in ranking

26.4 Conclusions

In this book chapter, we presented our proposed solution to re-order Web searchresults based on the contextual information collected from a user’s social network ina systematic manner. We have shown how the proposed system can help in bringingmore relevant and interesting information for different users by re-ordering thesearch results from Web search engines, while avoiding irrelevant information. Itenables the users to find out the information according to their interests in an easierand automated manner. We have developed various strategies, based on standardsocial network analysis techniques, for pulling out important publicly availableinformation from the community of a user that is taken as important contextualinformation. It helps in extracting the required information of the community ofa user from his/her social network and uses it to rank and re-order the searchresults of a Web search engine, based on the personal interests of the user andhis/her community. Our design methodology in developing the community-awarepersonalized Web search kept our solution generic, to be able to customize it andapply it to any search scenario. We presented the overall architecture, describedthe ranking algorithms, and discussed the implementation and evaluation usingwell-known techniques and real-life datasets to show the viability of our proposedsolution in real-life search scenarios.

References

1. Almeida, R.B., Almeida, V.A.F.: A community-aware search engine. In: Proceedings ofInternational World Wide Web Conference (WWW 2004). New York

2. Anick, P.: Using terminological feedback for web search refinement: a log-based study. InProceedings of WWW 2004, New York, pp. 89–95 (2004)

26 Integrating Online Social Network Analysis in Personalized Web Search 613

3. Berners-Lee, T., Hendler, J., Lassila, O.: The semantic web. Scientific American Magazine(2001). http://www.scientificamerican.com/article.cfm?id=the-semantic-web

4. Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. In:Proceedings of WWW, Santa Clara (1997)

5. Carminati, B., Ferrari, E., Perego, A.: Combining social networks and semantic web tech-nologies for personalizing web access. In: Proceedings of the 4th International Conference onCollaborative Computing (CollaborateCom 2008), Orlando (2008)

6. Chen, H., Finin, T., Joshi, A.: An ontology for context-aware pervasive computing environ-ments. Special Issue on Ontologies for Distributed Systems, Knowledge Engineering (2004)

7. Dou, Z., Song, R., Wen, J.R., Yuan, X.: Evaluating the effectiveness of personalized websearch. IEEE Trans. Knowl. Data Eng. 21(8), 1178–1190 (2009)

8. Freeman, L.: The development of social network analysis. Empirical Press, Vancouver (2006)9. Google Personal (2009). http://labs.google.com/personalized

10. Granovetter, M.S.: The strength of weak ties. Am. J. Soc. 78, 1360–1380 (1973)11. Heath, T., Motta, E., Petre, M.: Computing word-of-mouth trust relationships in social

networks from semantic Web and Web2.0 data sources. In: Proceedings of the Workshopon Bridging the Gap Between Semantic Web and Web 2.0, 4th European Semantic WebConference (ESWC2007), Innsbruck (2007)

12. Jarvelin, K., Keklinen, J.: Ir evaluation methods for retrieving highly relevant documents.In: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval, July 2000, Athens (2000)

13. Jeh, G., Widom, J.: Scaling personalized web search. In: Proceedings of WWW 2003, pp. 271–279 (2003)

14. Joachims, T., Granka, L., Pan, B., Hembrooke, H., Gay, G.: Accurately interpreting click-through data as implicit feedback. In: Proceedings of the Conference on Research andDevelopment in Information Retrieval (SIGIR), Salvador (2005)

15. McKeown, K.R., Elhadad, N., Hatzivassiloglou, V.: Leveraging a common representation forpersonalized search and summarization in a medical digital library. In: Proceedings of ICDL2003, pp. 159–170. New Delhi, India (2003)

16. Morris, M.R., Teevan, J., Bush, S.: Enhancing collaborative Web search with personalization– Groupization, smart splitting, and group hit-highlighting. In: Proceedings of CSCW 2008.SIGCHI/Association for Computing Machinery, New York (2008)

17. Parr, B.: BREAKING News: Google announces social search. October 21st, 2009. http://mashable.com/2009/10/21/breaking-google-launches-social-search

18. Pattanasri, N., Jatowt, A., Tanaka, K.: Context-aware search inside e-learning materials usingtextbook ontologies. In: Advances in Data and Web Management. LNCS, vol. 4505/2009.Springer, Berlin (2009). doi:10.1007/978-3-540-72524-4, ISBN:978-3-540-72483-4

19. Shafiq, O., Alhajj, R., Rokne, J.G.: Community aware personalized web search. In: Proceed-ings of the 2010 International Conference on Advances in Social Networks Analysis andMining (ASONAM 2010), Odense (2010)

20. Speretta, M., Gauch, S.: Personalizing search based on user search history. In: Proceedings ofCIKM, Washington (2004)

21. Sugiyama, K., Hatano, K., Yoshikawa, M.: Adaptive web search based on user profileconstructed without any effort from user. In: Proceedings of WWW, New York, pp. 675–684(2004)

22. Tarjan, R.E.: A note on finding the bridges of a graph. Inf. Process. Lett. 2, 160–161 (1974)23. Teevan, J., Dumais, S.T., Horvitz, E.: Beyond the commons: investigating the value of person-

alizing Web search. In: Proceedings of the Workshop on New Technologies for PersonalizedInformation Access (PIA), Edinburgh (2005)

24. Teevan, J., Dumais, S.T., Horvitz, E.: Characterizing the value of personalizing search. In:Proceedings of SIGIR 2007, Amsterdam, pp. 757–756 (2007)

25. Zeng, Y., Ren, X., Qin, Y., Zhong, N., Huang, Z., Wang, Y.: Social relation based scalablesemantic search refinement. In: Workshop on Scalable Semantic Data Processing, AsianSemantic Web Conference (ASWC 2009), 7th December 2009, Shanghai (2009)