[IEEE 2011 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops) - Seattle, WA, USA (2011.03.21-2011.03.25)] 2011 IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops) - Detecting social network profile cloning

Download [IEEE 2011 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops) - Seattle, WA, USA (2011.03.21-2011.03.25)] 2011 IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops) - Detecting social network profile cloning

Post on 30-Sep-2016




1 download

Embed Size (px)


<ul><li><p>Detecting Social Network Profile Cloning</p><p>Georgios Kontaxis, Iasonas Polakis, Sotiris Ioannidis and Evangelos P. MarkatosInstitute of Computer Science</p><p>Foundation for Research and Technology Hellas{kondax, polakis, sotiris, markatos}@ics.forth.gr</p><p>AbstractSocial networking is one of the most popularInternet activities, with millions of users from around theworld. The time spent on sites like Facebook or LinkedInis constantly increasing at an impressive rate. At the sametime, users populate their online profile with a plethora ofinformation that aims at providing a complete and accuraterepresentation of themselves. Attackers may duplicate a usersonline presence in the same or across different social networksand, therefore, fool other users into forming trusting socialrelations with the fake profile. By abusing that implicit trusttransferred from the concept of relations in the physicalworld, they can launch phishing attacks, harvest sensitiveuser information, or cause unfavorable repercussions to thelegitimate profiles owner.</p><p>In this paper we propose a methodology for detecting socialnetwork profile cloning. We present the architectural designand implementation details of a prototype system that canbe employed by users to investigate whether they have fallenvictims to such an attack. Our experimental results from theuse of this prototype system prove its efficiency and alsodemonstrate its simplicity in terms of deployment by everydayusers. Finally, we present the findings from a short study interms of profile information exposed by social network users.</p><p>I. INTRODUCTION</p><p>Social networking has become a prevalent activity inthe Internet today, attracting hundreds of millions of users,spending billions of minutes on such services. Facebookhas more than 500 million users [1] and recently surpassedGoogle in visits [2]. At the same time, LinkedIn hostsprofiles for more than 70 million people and 1 millioncompanies [3]. As the majority of users are not familiar withprivacy issues, they often expose a large amount of personalinformation on their profiles that can be viewed by anyone inthe network. In [4] the authors demonstrate an attack of pro-file cloning, where someone other than the legitimate ownerof a profile creates a new profile in the same or differentsocial network in which he copies the original information.By doing so, he creates a fake profile impersonating thelegitimate owner using the cloned information. Since usersmay maintain profiles in more than one social networks, theircontacts, especially the more distant ones, have no way ofknowing if a profile encountered in a social networking sitehas been created by the same person who created the profilein the other site.</p><p>The usual assumption is that a new profile, claiming tobe related to a pre-existing contact, is a valid profile; either</p><p>a new or secondary one. Unsuspecting users tend to trustthe new profile and actions initiated from it. This can beexploited by attackers to lure victims into clicking linkscontained in messages that can lead to phishing or drive-by-download sites. Furthermore, a cloned profile could be usedto send falsified messages in order to harm the original user.The victimized user has no way of knowing the existence ofthe fake profiles (especially if across social networks). Forthat matter, we believe profile cloning is a silent but seriousthreat in todays world of social networks, where peoplemight face consequences in the real world for actions oftheir (counterfeit) electronic profiles [5], [6].</p><p>In this paper, we propose a tool that automatically seeksand identifies cloned profiles in social networks. The keyconcept behind its logic is that it employs user-specific(or user-identifying) information, collected from the usersoriginal social network profile to locate similar profilesacross social networks. Any returned results, depending onhow rare the common profile information is considered to be,are deemed suspicious and further inspection is performed.Finally, the user is presented with a list of possible profileclones and a score indicating their degree of similarity withhis own profile.</p><p>The contributions of this paper are the following. We design and implement a tool that can detect cloned</p><p>profiles in social domains, and conduct a case study inLinkedIn.</p><p> We present a study of the information publicly avail-able on a large number of social network profiles wecollected.</p><p>II. RELATED WORKSocial networks have gained the attention of the research</p><p>community that tries to understand, among other, their struc-ture and user interconnection [7], [8] as well as interactionsamong users [9] and how user privacy is compromised [10].In [4], the authors demonstrate the feasibility of automatedidentity theft attacks in social networks, both inside thesame network and different ones. They are able to createforged user profiles and invite the victims contacts to formsocial links or open direct messages. By establishing a sociallink with the forged profile, the attacker has full access tothe other partys protected profile information. Furthermore,direct messages, originating from the stolen and implicitly</p><p>3rd International Workshop on Security and Social Networking978-1-61284-937-9/11/$26.00 2011 IEEE 295</p></li><li><p>trusted identity, may contain malicious HTTP links to phish-ing or malware web sites. This attack is possible mostly dueto users revealing a large amount of information on theirprofiles that can be accessed by practically anyone. A studyconducted by Gross et al [11] revealed that only 0.06% ofthe users hide the visibility of information such as interestsand relationships, while in [7] the authors report that 99%of the Twitter users that they checked retained the defaultprivacy settings. Therefore, a first defense measure againstsuch attacks could be employed by social networking sitesif they promoted more strict default privacy policies. Badenet al [12] argue that by using exclusive shared knowledgefor identification, two friends can verify the true identity ofeach other in social networks. This can enable the detectionof impersonation attacks in such networks, as attackers thatimpersonate users will not be able to answer questions. Oncea users identity has been verified, public encryption keyscan be exchanged. Furthermore, by using a web of trust onecan discover many keys of friends-of-friends and verify thelegitimacy of user profiles that they dont know in the realworld and dont share any secret knowledge.</p><p>III. DESIGN</p><p>In this section we outline the design of our approachfor detecting forged profiles across the Web. Our systemis comprised of three main components and we describe thefunctionality of each one.</p><p>1) Information Distiller. This component is responsiblefor extracting information from the legitimate socialnetwork profile. Initially, it analyzes the users profileand identifies which pieces of information on thatprofile could be regarded as rare or user-specific andmay therefore be labeled as user-identifying terms.The information extracted from the profile is usedto construct test queries in search engines and socialnetwork search services. The number (count) of resultsreturned for each query is used as a heuristic andthose pieces of information that stand out, havingyielded significantly fewer results than the rest ofthe information on the users profile, are taken intoaccount by the distiller. Such pieces of information arelabeled as user-identifying terms and used to createa user-record for our system along with the usersfull name (as it appears in his profile). The recordis passed on to the next system component that usesthe information to detect other potential social networkprofiles of the user.</p><p>2) Profile Hunter. This component processes user-records and uses the user-identifying terms to locatesocial network profiles that may potentially belong tothe user. Profiles are harvested from social-network-specific queries using each networks search mecha-nism that contain these terms and the users real name.</p><p>All the returned results are combined and a profile-record is created. Profile-records contain a link to theusers legitimate profile along with links to all theprofiles returned in the results.</p><p>3) Profile Verifier. This component processes profile-records and extracts the information available in theharvested social profiles. Each profile is then exam-ined in regards to its similarity to the users originalprofile. A similarity score is calculated based on thecommon values of information fields. Furthermore,profile pictures are compared, as cloned profiles willuse the victims photo to look more legitimate. Afterall the harvested profiles have been compared to thelegitimate one, the user is presented with a list of allthe profiles along with a similarity score.</p><p>We can see a diagram of our system in Figure 1. In step(1) the Information Distiller extracts the user-identifying in-formation from the legitimate social network profile. This isused to create a user-record which is passed on to the ProfileHunter in Step (2). Next, Profile Hunter searches onlinesocial networks for profiles using the information from theuser-record in step (3). All returned profiles are inserted ina profile-record and passed on to the Profile Verifier in step(4). The Profile Verifier compares all the profiles from theprofile-record to the original legitimate profile and calculatesa similarity score based on the common values of certainfields. In step (5) the profiles are presented to the user, alongwith the similarity scores, and an indication of which profilesare most likely to be cloned.</p><p>IV. IMPLEMENTATION</p><p>In this section we provide details of the proof-of-conceptimplementation of our approach. We use the social networkLinkedIn [13] as the basis for developing our proposeddesign. LinkedIn is a business-oriented social networkingsite, hosting profiles for more than 70 million registeredusers and 1 million companies. As profiles are createdmostly for professional reasons, users tend to make theirprofiles viewable by almost all other LinkedIn users, or atleast all other users in the same network. Thus, an adversarycan easily find a large amount of information for a specificuser. For that matter, we consider it a good candidate forinvestigating the feasibility of an attack and developing ourproposed detection tool.</p><p>A. Automated Profile Cloning AttacksWe investigate the feasibility of an automated profile</p><p>cloning attack in LinkedIn. Bilge et al. [4] have demon-strated that scripted profile cloning is possible in Facebook,XING and the German sites StudiVZ and MeinVZ. Inall these services but XING, CAPTCHAs were employedand CAPTCHA-breaking techniques were required. In thecase of LinkedIn CAPTCHA mechanisms are not in place.296</p></li><li><p>Figure 1: Diagram of our system architecture.</p><p>The user is initially prompted for his real name, valid e-mail address and a password. This suffices for creatinga provisionary account in the service, which needs to beverified by accessing a private URL, sent to the user via e-mail, and entering the accounts password. Receiving suchmessages and completing the verification process is trivialto be scripted and therefore can be carried out withouthuman intervention. To address the need for different valide-mail addresses, we have employed services such as 10Min-uteMail [14] that provide disposable e-mail inbox accountsfor a short period of time. Once the account has beenverified, the user is asked to provide optional informationthat will populate his profile.</p><p>We have implemented the automated profile creation tooland all subsequent experiments detailed in this paper relyon this tool and not manual input from a human. This wasdone to test its operation under real-world conditions. Letit be noted that all accounts created for the purposes oftesting automated profile creation and carrying out subse-quent experiments have been now removed from LinkedIn,and during their time of activity we did not interact withany other users of the service. Furthermore, due to ethicalreasons, in the case where existing profiles were duplicated,they belonged to members of our lab, whose consent we hadfirst acquired.B. Detecting Forged Profiles</p><p>In this section we present the details of implementing ourproposed detection design in Linkedin. We employ the cURL[15] command-line tool to handle HTTP communicationwith the service and implement the logic of the variouscomponents of our tool using Unix bash shell scripts.</p><p>1) Information Distiller. This component requires thecredentials of the LinkedIn user, who wishes to checkfor clones of his profile information, as input. Thecomponents output is a user-record which containsa group of keywords, corresponding to pieces of</p><p>information from the users profile, that individuallyor as a combination identify that profile. After loggingin with the service, this component parses the HTMLtags present in the users profile to identify the dif-ferent types of information present. Consequently, itemploys the Advanced Search feature of LinkedIn toperform queries that aim to identify those keywordsthat yield fewer results that the rest 1. Our goal isto use the minimum number of fields. If no resultsare returned, we include more fields in an incrementalbasis, according to the number of results they yield. Inour prototype implementation, we identify the numberof results returned for information corresponding to apersons present title, current and past company andeducation. We insert the persons name along with theother information in a record and provide that data tothe next component.</p><p>2) Profile Hunter. This component employs the user-record, which contains a persons name and informa-tion identified as rare, to search LinkedIn for simi-lar user profiles. We employ the services AdvancedSearch feature to initially find out the number ofreturned matches and subsequently use the protectedand, if available, public links to those profiles tocreate a profile-record which is passed on to the nextcomponent. The upper limit of 100 results per query isnot a problem since at this point queries are designedto be quite specific and yield at least an order ofmagnitude less results, an assumption which has beenvalidated during our tests.</p><p>3) Profile Verifier. This component receives a profile-record which is a list of HTTP links pointing toprotected or public profiles that are returned whenwe search for user information similar to the original</p><p>1Those that yield a number of results in the lowest order of magnitudeor, in the worst case, the one with the least results.297</p></li><li><p>user. Subsequently, it accesses those profiles, uses theHTML tags of those pages to identify the differenttypes of information and performs one to one stringmatching with the profile of the original user. Thisapproach is generic and not limited to a specific socialnetwork, as the verifier can look for specific fieldsaccording to each network. In our prototype imple-mentation, we also employ naive image comp...</p></li></ul>