Analysis of Fusing Online and Co-presence Social Networks


Post on 22-Feb-2016






Juan (Susan) Pan, Daniel Boston, and Cristian Borcea. Department of Computer Science, New Jersey Institute of Technology. Analysis of Fusing Online and Co-presence Social Networks. Pervasive social applications. Location-aware social apps. Traditional social apps. Socially-aware apps.


<p>Analysis of Fusing Online and Co-presence Social Networks</p> <p>Juan (Susan) Pan, Daniel Boston, and Cristian Borcea</p> <p>Department of Computer Science, New Jersey Institute of Technology</p> <p>Hello everybody. My name is Susan Pan, from the New Jersey Institute of Technology, and today I will present our work on the analysis of fusing online and co-presence social networks.</p> <p>What we try to do in this paper is to see whether the online and co-presence networks are similar or different. Does it make sense to fuse them?</p> <p>Pervasive social applications. Traditional social apps.</p> <p>Location-aware social apps.</p> <p>Socially-aware apps. BUBBLE Rap: use social knowledge to improve packet forwarding in delay-tolerant networks. Tribler: use social knowledge to reduce peer-to-peer communication overhead.</p> <p>There are three types of social applications. Traditional social applications simply let users declare friendships online. Location-aware applications incorporate GPS location information to notify users when friends are close enough. Socially-aware applications use social knowledge to improve the performance of other systems, such as packet forwarding and P2P communication.</p> <p>Social information collection. Declared by users: implicitly, through online social networks; explicitly, through surveys. Extracted from user online interactions. Extracted from user mobility traces: location traces; co-presence traces (e.g., using Bluetooth).</p> <p>So, what social information is collected or used by these applications?</p> <p>Social relationships can be between pairs of individuals or between individuals and groups.</p> <p>There are three ways to collect it: declared by users, extracted from user online interactions, and extracted from user mobility traces. If declared by users, it could be through online friendship declarations or through surveys. 
In this study, we harvest online Facebook social declarations, given user permission. Mobility traces can be location traces, such as GPS, or co-presence traces, such as those collected using Bluetooth. Because of lower power consumption and for privacy reasons, we use Bluetooth co-presence traces.</p> <p>Social information representation. Multiple social graphs (e.g., Facebook and co-presence): vertices -&gt; users, edges -&gt; social ties. Online social networks (OSN) provide a relatively stable social graph, but many connections are weak (example: actors have millions of friends) and not all social contacts use OSN apps. A co-presence social network (CSN) identifies social ties grounded in real-world interactions, but it is hard to differentiate social connections from passers-by.</p> <p>Pairwise information can typically be represented as social graphs. The vertices are users and the edges are social ties. We can assign weights to social ties based on the density of interaction.</p> <p>Online social networks and co-presence networks provide different social perspectives, each with strengths and drawbacks. An OSN provides a relatively stable social graph. However, many connections are weak, and OSN apps cannot accurately represent the whole picture of a user's social contacts. Moreover, users rarely delete relationships from their profiles.</p> <p>Co-presence identifies social ties grounded in real-world interactions. However, it is hard to differentiate social connections from passers-by. Some meetings are just chance encounters without social significance. 
For instance, two students sitting at nearby tables in the cafeteria.</p> <p>OSN: users slowly add new relationships after the initial bootstrap and rarely delete relationships from their profiles.</p> <p>Research questions. Do OSN and CSN just reinforce each other, or do they capture different types of social ties? Can a fused network take advantage of the strengths of both? How can we quantify the benefits of this fusion? Can we measure the contribution of each source network to the fused network?</p> <p>The goal of our research is to investigate the potential of fusing OSN and CSN.</p> <p>Most systems maintain the two social graphs separately. For instance, Prometheus, described in a paper published by our lab in 2010, manages different social graphs as a multi-graph with labels on the edges; applications can query social information from peers. Another paper developed an application that maintains the two networks separately but uses both to help balance youngsters' social connections. Thus, we ask the following research questions. Do OSN and CSN just reinforce each other, or do they capture different types of social ties and complement each other? Does it make sense to fuse them? If so, how can we quantify the benefits of this fusion? Can we measure the contribution of each source network to the fused network?</p> <p>CB: Need to say that other systems (Prometheus, etc.) keep them separate. CB: We want to investigate the potential benefits of fusion.</p> <p>Outline: Motivation; Data collection; Social graph representation; Analysis of global network parameters; Analysis of local network parameters; Conclusions.</p> <p>Study participants. One month of CSN data and Facebook data for the same set of 104 students. Volunteers. Received compensation. Belong to various departments at NJIT.</p> <p>Based on those questions, we conducted an experiment. We recruited 104 student volunteers at our university. The slide shows the statistics and demographics of our subjects. 
We believe our subjects are representative of a medium-sized campus in an urban area.</p> <p>Don't say one month of Facebook data.</p> <p>Bluetooth-based co-presence data</p> <p>Records (User, Seen, Time), uploaded over the Internet: (A, B, 1:00), (B, A, 1:05), (A, B, 1:07), (B, C, 1:05), (A, C, 1:07).</p> <p>A performs a scan and creates record (A,B). B performs a scan and creates records (B,A) and (B,C). Then A performs a scan again and creates (A,B) and (A,C). Records are uploaded by the end of the day. The average scan period is 2 minutes, 40 seconds, with a standard deviation of 1 minute. This slide illustrates how we generate our Bluetooth co-presence records. We distributed mobile phones to the student volunteers; on each phone, a program quietly scans for nearby devices and sends the results back to the central server.</p> <p>HTC Windows Mobile 8595 and 8525 phones, which come preloaded with Windows Mobile 5.</p> <p>Co-presence statistics. Meeting duration: max 220 hrs 2 min, mean 1 hr 16 min, standard deviation 7 hrs 34 min. Meeting frequency: max 51, mean 2.2, standard deviation 3.79.</p> <p>The first graph illustrates the contribution of volunteers by hours. Given that our sample size (104 volunteers) was small compared to the university population (9000 students) and that many students are commuters, our trace data is relatively sparse. 50% of the students contributed less than 32 hours in total during the month. We made no effort to select friends.</p> <p>The typical user provided a few hours of data per day, especially on weekdays. 80% of the people provided less than 2 hours per day. The meeting duration is the sum of all meeting times during one month. The meeting frequency is the total number of meetings during one month.</p> <p>We use an elimination algorithm to merge meeting records separated by very short intervals. 
(The granularity is 5 minutes.)</p> <p>CB: use the color version of the graph (if you don't have it, ask Daniel for the source).</p> <p>Facebook data. Subjects gave us permission to collect data: friends, wall writings, comments, photo tags. An online interaction is a wall writing, comment, or photo tag. We count the number of interactions between user pairs.</p> <p>Online interactions: max 40, mean 2, standard deviation 4.</p> <p>Outline: Motivation; Data collection; Social graph representation; Analysis of global network parameters; Analysis of local network parameters; Conclusions.</p> <p>Global refers to the structure of the network: parameters averaged across the entire network. Local refers to parameters calculated from individual properties of users, edges, and communities.</p> <p>What is the difference between global and local?</p> <p>Weighted social graphs are more accurate. OSN: Weight_online = number of interactions. CSN: Weight_co-presence = 0.5 × Weight_duration + 0.5 × Weight_frequency.</p> <p>Weight_online in [1, 40]</p> <p>Remember it's the 2nd max.</p> <p>How to remove edges due to passers-by in CSN? Very short and infrequent co-presence does not indicate the presence of a social tie.</p> <p>CSN noise reduction. Find duration &amp; frequency thresholds for adding a CSN edge. Increase the thresholds until the edit distance between CSN and OSN stabilizes. Edit distance: number of edge additions/deletions needed to transform one graph into the other. Keep OSN unchanged because Facebook friendship confirmations validate social ties.</p> <p>In order to eliminate the noise introduced by familiar strangers, we introduce two parameters: the total meeting duration and the total number of meetings. 
We consider two people to have a social tie only when both their total meeting duration and their total number of meetings reach the thresholds. In a sense, their social tie is then strong enough.</p> <p>Threshold selection</p> <p>Total meeting duration threshold = 160 minutes per month. Total meeting frequency threshold = 3 times per month. Draw two arrows and textboxes to show it's the threshold. Say per month.</p> <p>Tij is the total time users i and j spent together; Fij is the total number of meetings in the encounter history. The thresholds for meeting duration and meeting frequency are varied during the analysis (duration within [30 min, 1800 min] and frequency within [1, 10]). We pick 160 minutes and 3 meetings as thresholds based on the results in Figures 2 and 3, which show the edit distance remaining stable beyond these values.</p> <p>Resulting social graphs</p> <p>Co-presence Social Network. Online Social Network.</p> <p>Fused Network (51 shared edges)</p> <p>Outline: Motivation; Data collection; Social graph representation; Analysis of global network parameters (degree, connectivity, centrality, cohesiveness); Analysis of local network parameters; Conclusions.</p> <p>What is the difference between global and local? Global refers to the structure of the network. Local refers to individual properties of users, edges, and communities.</p> <p>Degree distribution. Correlation (online, co-presence) = 0.202. Average degree: OSN 3.17, CSN 3.77, Fused 5.96.</p> <p>OSN degree approximately follows a power-law distribution. CSN degree does not resemble a power-law distribution as strongly as OSN's, due to meetings with familiar strangers. Consequently, a similar result is observed for the fused network. 3 nodes are social butterflies: most nodes have high degree in either CSN or OSN, but not both; 3 nodes have high degree in both CSN and OSN. Increased average degree means people meet different sets of contacts in the two source networks.</p> <p>Increased average degree means people meet different sets of contacts.</p> <p>OSN degree closely follows a power-law distribution: few nodes with many degrees and many 
others with few degrees. Due to mobility, in a real-life CSN people meet a lot of familiar strangers, whom they may know by face but with whom they have not declared friendship. In OSN, however, people make friends through friends. CSN degree does not resemble a power-law distribution as strongly as OSN's, because people's mobility results in meetings with familiar strangers. Most of the nodes have more contacts in either CSN or OSN, thus decreasing the max count on the y axis.</p> <p>A familiar stranger is an individual who is recognized from regular activities, but with whom one does not interact. They are people whom one does meet frequently or for a long time.</p> <p>The validation of power-law claims remains a very active field of research in many areas of modern science. We remove only the people who meet rarely and for a short time.</p> <p>Only 3 nodes have a high number of contacts in both OSN and CSN (social butterflies). Most of the nodes have more contacts in either CSN or OSN, thus decreasing the max count on the y axis (degree correlation = 0.202).</p> <p>Connectivity (columns: OSN, CSN, Fused, Weighted). Number of edges: 165, 196, 310, N. Size of LCC (largest connected component): 63, 84, 98, N.</p> <p>Diameter of LCC: 7, 8, 7, N. Average length of shortest path: 12.3, 21.98, 8.77, Y. CSN contributes 27% more edges than OSN. Compared to OSN, CSN has 55% more connected people. Almost all people are connected in the fused network. The average weighted shortest path is reduced in the fused network. Stronger social connectivity: a reason to leverage it in social apps.</p> <p>The co-presence network contributes 27% more edges. Compared to OSN, 55% more people are connected in the LCC. The diameter and the average weighted shortest path are reduced. People become closer and more involved in each other's lives if the fused network is leveraged in social applications. Only 51 edges are shared, but the average degree increased. This indicates that people interact with different people online and in real life. 
Among all the people one interacts with in real life, only 26% are Facebook friends.</p> <p>Diameter measures the longest shortest path in the connected component. The weighted shortest path is the path with the greatest capacity for carrying information.</p> <p>There are 51 shared edges between the online and co-presence networks, which is less than a third of the number of edges in each of the two networks. Connected components: Facebook 63, 15, 2, 4, 2, 2, 2; Co-presence 84, 4; Fused 98.</p> <p>Betweenness centrality and cluster coefficient (columns: OSN, CSN, Fused, Weighted). Average weighted betweenness: 49.1, 90.13, 94.83, Y. Average length of shortest path: 12.3, 21.98, 8.77, Y. Average edge weight: 3.02, 3.64, 1.95, Y. Average weighted cluster coefficient: 0.156, 0.122, 0.157, Y. CSN has a much longer average shortest path than OSN; hence, its average betweenness is high. In the fused network, the average shortest path is low but betweenness is highest: social centrality is improved. Average edge weight shows that people interact more in real life than online. A person who is highly socially active online is not necessarily highly socially active in real life; thus, values are smaller in the fused network. OSN has higher cohesiveness: people become friends when sharing common friends. OSN contributes more to the fused network.</p> <p>(Panels: OSN, CSN.) The corresponding random graph (same number of vertices and edges) has a cluster coefficient of only 0.029. I may need to calculate the number of the structure. Centrality determines the relative importance of a vertex within the graph.</p> <p>Co-presence has a much longer average length of shortest path; hence, the average betweenness is high. In the fused network, the average length of shortest path is small; however, betweenness is highest, thus showing that social centrality is improved. In OSN, people have a higher tendency to make transitive friends, meaning they become friends because of their common friends (cohesiveness). Co-presence does not contribute to the cohesiveness. The betweenness centrality counts the number of times a node occurs on the shortest path between other pairs of nodes. The local cluster coefficient (also known as transitivity) is a measure of the extent to which nodes in a graph cluster together. It is the fraction of the number of present ties over the total number of possible ties between the node's neighbors. In the weighted version [18], the contribution of each tri-set (visualized as a triangle) of nodes is weighted by a ratio of the average weight of the two adjacent edges of the triangle to the average weight of the node.</p> <p>[18] A. Barrat, M. Barthelemy, R. Pastor-Satorras, and A. Vespignani, "The architecture of complex weighted networks," Proceedings of the National Academy of Sciences of the United States of America, vol. 101, no. 11, pp. 3747-3752, 2004.</p> <p>Outline: Motivation; Data collection; Social...</p>
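<p>As a minimal sketch of the CSN noise reduction and fusion steps described in the talk, the Python below keeps a co-presence edge only when both the duration and frequency thresholds are reached, measures the edit distance to the OSN as the size of the symmetric difference of the two edge sets (assuming both graphs share the same vertex set), and fuses the graphs by taking the union of their edges. The function names and the toy meeting data are hypothetical, chosen only for illustration; the 160-minute and 3-meeting defaults are the thresholds quoted in the talk. This is not the authors' implementation.</p>

```python
def csn_edges(meetings, min_minutes=160, min_count=3):
    """Keep a co-presence edge only if both thresholds are reached.

    meetings maps an unordered user pair (u, v) to a tuple
    (total_minutes_together, number_of_meetings) for the month.
    """
    return {
        frozenset(pair)
        for pair, (minutes, count) in meetings.items()
        if minutes >= min_minutes and count >= min_count
    }


def edit_distance(edges_a, edges_b):
    """Edge additions plus deletions needed to turn one graph into the other.

    With a shared vertex set this is the size of the symmetric difference.
    """
    return len(edges_a ^ edges_b)


def fuse(edges_a, edges_b):
    """Fused network: an edge exists if it exists in either source network."""
    return edges_a | edges_b


# Hypothetical toy data (not from the study).
meetings = {
    ("a", "b"): (300, 5),   # long and frequent: kept
    ("a", "c"): (20, 1),    # passer-by: dropped
    ("b", "c"): (200, 2),   # long but too infrequent: dropped
    ("c", "d"): (500, 10),  # kept
}
osn = {frozenset(("a", "b")), frozenset(("b", "c"))}

csn = csn_edges(meetings)
fused = fuse(osn, csn)
shared = osn & csn

print("CSN edges:", sorted(tuple(sorted(e)) for e in csn))
print("edit distance:", edit_distance(csn, osn))
print("fused edges:", len(fused), "shared edges:", len(shared))
```

<p>The union also accounts for the edge counts reported in the talk: 165 OSN edges plus 196 CSN edges minus 51 shared edges gives the 310 edges of the fused network.</p>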