[ieee 2009 international conference on advances in social network analysis and mining (asonam) -...

Developing Compelling Social-Enabled Applications with Context-based Social Interaction Analysis

Ryan Skraba, Mathieu Beauvais, Johann Stan, Abderrahmane Maaradji and Johann Daigremont

Alcatel-Lucent Bell Labs France, Centre de Villarceaux, Route de Villejust

94160 Nozay

Abstract—We present in this paper a new approach for constructing implicit social networks from electronic sources, like emails, SMS and phone calls. The main novelty is the use of Social Interaction Analysis to assist end-users in their communication needs. We discuss how a social proximity can be calculated between two persons in a social network and show how a weighted, directed network is constructed based on interactions between people. After a description of the architecture of the framework, we show how a contextual, weighted social network can help in automatically finding the contact with the highest probability of knowing the whereabouts of a person who is currently not reachable. The implemented social helper application uses a specific contextual view of the social network, exploiting only interactions that occurred in the last 48 hours.

Keywords-Social Network Analysis, Social Proximity, Context, Interactions, Communications

I. INTRODUCTION

With the modern growth of online social networking sites, there has been a corresponding explosion in social networking applications. For the most part, these applications are simple and fun, and propagate among users’ friends. However, with the wide deployment of mobile phones in developed countries, there is an opportunity to develop social networking applications that truly act on the users’ social networks to enable efficient social communications. Our approach for constructing an end-to-end Social Interaction Analysis (SIA) system involves three stages: collection from different user devices and identities, ongoing analysis in the network and a semantic user profile database for applications.

Social Interaction Analysis (SIA) attempts to construct and qualify the users’ social networks by examining the implicit interactions between users instead of the explicit declarations of “friendship” or community. Users can have a contextual view of their interactions (Which of my interactions are related to cinema? Who is the person I called most often in the last 48 hours?), and this view can serve as a back-end for many innovative services. By processing social interactions and by logging the content and context of interactions, a matrix of interpersonal communications can be constructed.

We present the architecture of our framework and show how context-based filtering can be applied in the specific scenario of an “emergency call” service, a contact recommendation scenario. Social Interaction Analysis and semantic user profiling can be combined in a framework that handles similar scenarios efficiently. We present the design of an end-to-end system that enables innovative social applications by automatically constructing context-based social graphs from end-users’ daily communications.

The remainder of this paper is organized as follows: in Section 2, we detail our proposed SIA framework. Section 3 presents preliminary results to illustrate the interest of our approach. In Section 4, we demonstrate the SIA concept in a case study called Social Helper, in which we show a potential real usage of our framework using a mobile social network. In Section 5, we discuss difficulties in tuning parameters to experimentally validate our system. Section 6 discusses related work. We conclude and discuss future work in Section 7.

II. SOCIAL INTERACTION ANALYSIS FRAMEWORK

Many types of social interactions are already performed in a manner that can be captured by a SIA framework. Direct communications between actors, such as email, text messages (SMS), and phone calls can already be logged by the user, in the network or on their devices. These direct interactions are typically between two actors that can be identified by the endpoints of the communication. An interaction can be considered direct even if it is mediated by a communication object or technology. For example, face-to-face interactions can be inferred using Bluetooth technology to pinpoint users in a location. An indirect interaction can be between a person and an object, but have the same targeted communication intention as a direct interaction, such as leaving a voice message when a call cannot be completed. A social proximity can also be attached to the communication object to profile how two users prefer to communicate, which has implications for their calculated social proximity. Interactions via objects can define an indirect social relationship between actors without any direct communications between them by implying a shared interest or community, such as by exchanging media (downloading a photo set that another user uploaded) or by participating in the same internet forums. A social proximity can also be inferred indirectly from direct social interactions between multiple recipients, such as when two people are always included together as recipients of an interaction despite never communicating directly (forums, mailing lists).

2009 Advances in Social Network Analysis and Mining, 978-0-7695-3689-7/09 $25.00 © 2009 IEEE, DOI 10.1109/ASONAM.2009.7

In order to create a weighted, directed edge between two actors in a social graph, a number of variables are taken into account. The number of interactions between each pair of users (and for each context) is counted separately for each type of interaction, and the type of interaction can have a different subjective impact for a given user. For example, for some users, email is used almost exclusively for professional relationships, while SMS is considered more intimate and immediate and is reserved for personal use. There are no fixed rules for every user, so the values for social proximity are normalized independently for each type of interaction, and then weighted according to the individual’s usage patterns to calculate a global social proximity between any two actors.
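The normalization and weighting step described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the choice of dividing by the largest count for each interaction type, and the example weights are hypothetical and are not specified in the paper.

```python
from collections import defaultdict


def normalize_per_type(counts):
    """Scale each interaction type's counts to [0, 1] by its largest count.

    counts: {interaction_type: {contact: number_of_interactions}}
    (Max-based scaling is an assumption made for this sketch.)
    """
    normalized = {}
    for itype, per_contact in counts.items():
        top = max(per_contact.values(), default=0)
        normalized[itype] = {
            contact: (n / top if top else 0.0)
            for contact, n in per_contact.items()
        }
    return normalized


def global_proximity(counts, type_weights):
    """Weighted average of the normalized per-type proximities for one user."""
    normalized = normalize_per_type(counts)
    total_weight = sum(type_weights.get(t, 0.0) for t in normalized)
    scores = defaultdict(float)
    if total_weight == 0:
        return dict(scores)
    for itype, per_contact in normalized.items():
        w = type_weights.get(itype, 0.0) / total_weight
        for contact, value in per_contact.items():
            scores[contact] += w * value
    return dict(scores)


# Hypothetical data: Bob's outgoing interactions, with weights reflecting
# that Bob uses SMS for closer relationships than email.
bob_counts = {
    "email": {"alice": 40, "george": 10},
    "sms": {"george": 6, "alice": 2},
}
bob_weights = {"email": 0.25, "sms": 0.75}
proximity = global_proximity(bob_counts, bob_weights)
```

With these example values, George ends up closer to Bob than Alice does, even though Alice receives far more email, because the per-user weights favor SMS.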

In order to validate the usefulness of this approach, the experimental framework implements a system for context-based Social Interaction Analysis. Figure 1 shows an architecture for an end-to-end system that consumes social interaction data from the end user (collection), handles and processes it (analysis) and provides a context-based view of a social graph to innovative social applications (presentation). The architecture is flexible; future types of interactions and analysis functions can be integrated dynamically.

A. Collection

Many diverse sources exist for social interaction data, and the social interaction collection layer is responsible for observing these interactions, reporting them and storing them in the system as required.

One way to acquire data is to run a collector in the background on the users’ end terminal. In this use case, when an SMS is sent or received, a log is kept on the terminal and periodically sent to the collection layer. In the initial implementation, the log format is a simple XML construction containing all of the attributes belonging to an SMS and is transmitted over HTTP. For this approach to succeed, users need to see a benefit in providing this information, and they have the opportunity to configure the information that is sent to the collection layer from that device.
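To make the terminal-side log concrete, the sketch below builds a small XML document for one SMS. The paper does not publish its log schema, so every element and attribute name here (`interactions`, `interaction`, `sender`, `receiver`, `timestamp`, `length`) is a hypothetical placeholder, as are the phone numbers.

```python
import xml.etree.ElementTree as ET


def sms_log_entry(sender, receiver, timestamp, length, direction):
    """Build one XML log record for an SMS (element names are hypothetical)."""
    entry = ET.Element("interaction", {"type": "sms", "direction": direction})
    ET.SubElement(entry, "sender").text = sender
    ET.SubElement(entry, "receiver").text = receiver
    ET.SubElement(entry, "timestamp").text = timestamp
    ET.SubElement(entry, "length").text = str(length)
    return entry


def build_log(entries):
    """Wrap log records in a single document to be uploaded periodically."""
    root = ET.Element("interactions")
    root.extend(entries)
    return ET.tostring(root, encoding="unicode")


payload = build_log([
    sms_log_entry("+33100000001", "+33100000002",
                  "2009-03-01T18:30:00Z", 42, "sent"),
])
# The payload string would then be sent to the collection layer in the body
# of an HTTP request; the endpoint is deployment-specific and not shown here.
```

Keeping the message text out of the record by default is one way to reflect the user-configurable reporting mentioned above; a content element could be added only when the user allows it.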

Another method is to acquire data using components on the network side, without any local device installation for each end user, and in certain cases, without requiring their permission. A responsible privacy policy would ensure that the end user is aware of how this information is collected and used in social applications, preferably by having them opt in to the system. Like SMS interactions, email interactions are reported to the collection layer using an XML format, and some attributes are common to both types, such as sender and receiver identifiers, length, timestamp and potentially the complete text contents. Email interactions, however, have a richer set of header information and can be shared among multiple recipients, with different recipient classes (“To:”, “Cc:” and “Bcc:”). This information is summarized and captured at the collection layer, and stored in a large relational database.

Figure 1. System Architecture

B. Analysis

The social interaction analysis layer acquires information from the intermediary database provided by the social interaction collection layer, and uses it to compute social closeness, or social proximities, between end-users. Given the ephemeral and complex nature of human relationships, it is difficult to define exactly what “social closeness” means. However, a useful model can be created by assuming certain properties. First of all, social relationships are certainly not equal or interchangeable. Unlike on modern social networking sites, declaring someone a “friend” in real life doesn’t put them in an unordered list at the same rank as every other “friend”. Likewise, it is extremely difficult to order the list; some “friends” inspire affection but not confidence, while others have your respect although you don’t share important views. Finally, relationships are not symmetrical or reciprocal; the regard you hold for another is not necessarily the regard they hold for you.

Given these basic observations, we can assume that if two actors are nodes in a graph, their social proximity is a directed, weighted value between them, and that different social proximities can exist between two actors for different contexts. The analysis layer deduces these social proximities from the basic attributes of social interactions from the collection layer, refining them with any content or context that is available. For example, given the SMS and email interaction attributes, the system can calculate the frequency of contact between two people, the direction of contact and potentially keywords or topics (from the content, or subject in the case of email). The user may also have reported some contextual information such as when and where the interaction took place, defined either by the GPS or cell ID coordinates of the interaction, or by a looser notion such as the sphere: users can be in the home or work environment, as declared explicitly or deduced from the time, the terminal (professional or personal), or the location.

Email interactions have more information about the shape of the conversation, such as threads of replies and forwards between multiple recipients. A long thread of back-and-forth exchanges between two people implies more intimacy than one message broadcast to a large number of people without any response. Likewise, between any two actors, the system can calculate who takes the initiative in creating a conversation, or the probability of responding (qualities which have been experimentally named “sendiness” and “replyness”). In fact, given the nature of email, a deduction can be made about whether two actors belong to some social group if they are consistently included as recipients together in email messages, even if they never communicate directly (a quality which has been investigated under the name of “groupiness” between actors). These qualities are calculated using the interactions from the collection layer and weighted according to the usage patterns of each individual end user, and a weighted, directed edge is then computed between the two nodes representing two end-users.
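The three qualities named above can be illustrated with simple ratios. The paper does not give formulas, so the definitions below are one possible reading, chosen for this sketch only: "sendiness" as the share of new conversations that one actor starts, "replyness" as the share of received messages that get an answer, and "groupiness" as the Jaccard overlap of messages that two actors both receive. The `Email` record and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Email:
    msg_id: str
    sender: str
    recipients: tuple
    reply_to: Optional[str] = None  # msg_id of the message being answered


def sendiness(emails, a, b):
    """Share of new conversations between a and b that a started."""
    by_a = sum(1 for e in emails
               if e.reply_to is None and e.sender == a and b in e.recipients)
    by_b = sum(1 for e in emails
               if e.reply_to is None and e.sender == b and a in e.recipients)
    total = by_a + by_b
    return by_a / total if total else 0.0


def replyness(emails, a, b):
    """Share of b's messages to a that a answered."""
    received = {e.msg_id for e in emails
                if e.sender == b and a in e.recipients}
    answered = {e.reply_to for e in emails
                if e.sender == a and e.reply_to in received}
    return len(answered) / len(received) if received else 0.0


def groupiness(emails, a, b):
    """Jaccard overlap of the messages in which a and b are recipients."""
    got_a = {e.msg_id for e in emails if a in e.recipients}
    got_b = {e.msg_id for e in emails if b in e.recipients}
    union = got_a | got_b
    return len(got_a & got_b) / len(union) if union else 0.0


# Hypothetical mailbox contents.
emails = [
    Email("m1", "bob", ("george",)),
    Email("m2", "george", ("bob",), reply_to="m1"),
    Email("m3", "bob", ("george",), reply_to="m2"),
    Email("m4", "george", ("bob",)),
    Email("m5", "carol", ("bob", "george")),
    Email("m6", "carol", ("bob", "george", "dave")),
]
```

In this example, Bob and Dave never write to each other, yet `groupiness(emails, "bob", "dave")` is non-zero because both appear as recipients of the same message.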

Once a social graph has been automatically constructed between actors, social network analysis techniques can be applied to identify clusters/cliques of socially related users, to identify key users/hubs/bridges for a context or a topic, or to analyze graph topological attributes over time.

C. Social interaction presentation and API

The presentation layer stores the computed social proximity graphs in a fluid, accessible way for social applications. By using semantic web technologies with computer-understandable vocabularies, a wide range of social applications can be targeted. The current implementation uses a format based on the FOAF/RDF representation of social data stored in a semantic social user profile database, and provides access through SPARQL semantic queries as well as a lighter web service interface.
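As a rough illustration of what a FOAF-based representation of a weighted, directed proximity edge could look like, the sketch below emits Turtle text and shows a matching SPARQL query. Only the FOAF namespace and the terms `foaf:Person` and `foaf:knows` are real; the `sia:` vocabulary, its property names and the `example.org` namespace are assumptions invented for this example, since the paper does not publish its schema.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
SIA = "http://example.org/sia#"  # hypothetical vocabulary, not from the paper


def proximity_to_turtle(edges):
    """Serialize (source, target, context, weight) edges as Turtle text."""
    lines = [
        f"@prefix foaf: <{FOAF}> .",
        f"@prefix sia: <{SIA}> .",
        "",
    ]
    for i, (source, target, context, weight) in enumerate(edges):
        lines += [
            # FOAF alone only records that the two people know each other.
            f"sia:{source} a foaf:Person ; foaf:knows sia:{target} .",
            # The weight and context need a separate (hypothetical) edge node.
            f"sia:edge{i} a sia:Proximity ;",
            f"    sia:from sia:{source} ;",
            f"    sia:to sia:{target} ;",
            f'    sia:context "{context}" ;',
            f"    sia:weight {weight:.2f} .",
            "",
        ]
    return "\n".join(lines)


# A query an application might send to find Bob's closest recent contact.
BEST_RECENT_CONTACT_QUERY = """
PREFIX sia: <http://example.org/sia#>
SELECT ?target ?weight WHERE {
  ?edge sia:from sia:bob ; sia:to ?target ;
        sia:context "last48h" ; sia:weight ?weight .
}
ORDER BY DESC(?weight) LIMIT 1
"""

turtle = proximity_to_turtle([("bob", "george", "last48h", 0.8)])
```

Modeling the edge as its own node is one common way to attach a weight and a context to a relationship, because a plain `foaf:knows` triple cannot carry extra attributes.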

III. PRELIMINARY RESULTS

An implementation of the social interaction system was used to investigate how a model of a social network is constructed from interactions between actors, and to demonstrate an example social application that takes advantage of the resulting graph to provide an advanced social communication service.

Figure 2. SIA constructed social graphs.

A. Automatically constructed social graphs

As the social interaction collection layer acquires information from the actors, it connects those who have had interactions in the past. To test the automatic construction of a social graph from historical email logs, the email collection tools were run on volunteers’ professional email accounts (with emails collected over periods ranging from six months to four years). These data sets have the benefits of involving actors for the most part from a closed community (work colleagues) and interactions that have a degree of content continuity (ongoing projects).

The constructed graph linked two actors if an email was exchanged between them. The social graph could be filtered by running the collection tools on a subset of the exchanged messages, based on keyword matching in the subject line. Figure 2 shows two social graphs constructed in this manner, and demonstrates that standard social network analysis techniques can be applied to these graphs to determine clusters, key players, actor centrality, etc. [1]. In both social graphs, some links are greyed out based on a measure of centrality of their nodes, identifying related clusters of users as a consequence. Since email interactions contain recipients that are not part of the experiment, these results show that it is possible to make intelligent inferences about a larger social community, even when only a subset of actors participate.

B. Social proximity in a directed graph

When automatically constructing the social graph, the system weights each link with a value representing the social strength or proximity between two actors. In general, the more that two actors communicate, the higher the social proximity between them; the simplest computation is to count the number of interactions that involve the two actors.

A more refined computation uses the attributes that can be collected for different types of social interactions. For example, an email sent to a community is not as personal as an email sent to a single recipient. The social proximity that one email adds to a link is inversely proportional to the number of recipients.
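The recipient-count rule can be sketched as follows. The paper only states that the contribution is inversely proportional to the number of recipients; using exactly 1/n (a proportionality constant of 1) and the names below are assumptions for illustration.

```python
from collections import defaultdict


def email_edge_weights(emails):
    """Accumulate directed edge weights; each email adds 1/len(recipients)."""
    weights = defaultdict(float)
    for sender, recipients in emails:
        if not recipients:
            continue  # nothing to attribute for an email without recipients
        share = 1.0 / len(recipients)
        for recipient in recipients:
            weights[(sender, recipient)] += share
    return dict(weights)


# Hypothetical messages: one personal email and one broadcast.
emails = [
    ("bob", ["george"]),                            # adds 1.0 to bob->george
    ("bob", ["george", "alice", "carol", "dave"]),  # adds 0.25 to each link
]
weights = email_edge_weights(emails)
```

Here the personal email contributes four times as much to the Bob-to-George link as the broadcast does.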

The conversational aspect of the interactions is also relevant. Consider an actor who composes ten unanswered emails to another actor versus a back-and-forth conversation of ten emails. Although the number of interactions between the two actors is the same, only the second scenario shows an engagement between the two.
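One simple way to separate these two scenarios is a reciprocity ratio over the directed message counts. This particular measure is an assumption introduced for illustration; the paper does not state how engagement is quantified.

```python
def reciprocity(a_to_b, b_to_a):
    """Return 0.0 for one-way traffic and 1.0 for a perfectly balanced exchange."""
    total = a_to_b + b_to_a
    return 2 * min(a_to_b, b_to_a) / total if total else 0.0


unanswered = reciprocity(10, 0)      # ten emails, none answered
back_and_forth = reciprocity(5, 5)   # ten emails, alternating
```

Both scenarios involve ten interactions, but the first scores 0.0 and the second 1.0, so the raw count could be scaled by this ratio to reward engagement.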

C. Context-based social interaction

Further depth to the links between actors can be extracted from the context of the interaction (and the content, where available). For example, emails between two actors using email identifiers from the same business domain, that occur during work hours, that concern typically professional subjects and are sent from a work location very strongly suggest that the two actors are professional colleagues. Thus, content and context (such as time, location, content of messages, etc.) can be a useful characterizer for different types of social proximity.

Other studies have shown that observing interactions over a long period doesn’t always result in a social graph with immediate relevance [8]. Therefore, the Social Interaction Analysis system looks at two types of social proximity based on the interaction period: a general “all-time” social proximity and a “last 48 hours” social proximity that only considers recent interactions. The “all-time” proximity is useful for observing long-term relationships and long-term usage patterns of an actor, and social applications can take advantage of the “last 48 hours” proximity for immediacy.
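The two views can be computed from the same interaction log by applying an optional time window. The sketch below uses plain interaction counts as the proximity measure; the function name, the tuple layout and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def proximity_counts(interactions, now, window=None):
    """Count interactions per directed pair, optionally within a recent window."""
    counts = Counter()
    for source, target, when in interactions:
        age = now - when
        if window is None or timedelta(0) <= age <= window:
            counts[(source, target)] += 1
    return counts


now = datetime(2009, 7, 20, 12, 0)
interactions = [
    ("bob", "alice", now - timedelta(days=30)),
    ("bob", "alice", now - timedelta(days=12)),
    ("bob", "alice", now - timedelta(days=3)),
    ("bob", "george", now - timedelta(hours=20)),
    ("bob", "george", now - timedelta(hours=2)),
]
all_time = proximity_counts(interactions, now)
last_48h = proximity_counts(interactions, now, window=timedelta(hours=48))
```

In this data, Alice is Bob's top contact in the all-time view, while George is the only contact in the 48-hour view, which mirrors the Social Helper scenario.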

IV. CASE STUDY: SOCIAL HELPER

The purpose of the Social Interaction Analysis system is to enable innovative social applications, initially targeting a social helper mobile application running on end-users’ devices. The Social Helper finds a social contact in an emergency context.

The use case is as follows: Bob and George are youths organizing a birthday party for a mutual friend. Although they have never communicated in the past, George becomes Bob’s best social contact for the “last 48 hours” social proximity as they interact via SMS, email and telephone calls. At one point, Bob’s mother, Jena, is unable to get in touch with him on his phone for a family emergency, even through their mutual contacts. She then launches the Social Helper on her phone, and is able to discover George as Bob’s best “last 48 hours” social link thanks to their recent communications about the birthday party, despite not having George in her own address book.

This use case was implemented with a Social Interaction Analysis server running in the network, with collection components deployed on end-users’ terminals. In the case of Bob and George, this application monitors their SMS and telephone call usage and reports the interactions to the collection layer. In addition, a collector in the network reports their email interactions to the collection layer. The analysis layer incorporates all new interactions into the calculation of an overall social proximity between the actors of the use case, and of a “last 48 hours” rolling window.

Figure 3. SIA Viewer. Numbers represent the number of exchanged calls.

Figure 3 shows the weighted, directed social graph constructed in the use case, centered on Bob. This view is only available to the mobile operator; individual actors are only allowed to see their own social links. Both Bob and George have used the Social Helper to set their mothers as emergency contacts, and this information has been stored in the semantic social user profile database. Therefore, via the Social Helper, Bob has given his mother, Jena, permission to see his social links, so she is eventually able to find George and his contact information. Figure 4 shows Jena’s Social Helper when George has become Bob’s new best contact. She remains unable to see farther into the social network to George’s social links, but she is able to place an emergency call to the person George has configured as his emergency contact.
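The access rule in this scenario can be sketched as a lookup that first checks whether the requester is a declared emergency contact of the person being sought. The data structures, names and proximity values below are hypothetical stand-ins for the semantic profile database and the “last 48 hours” graph.

```python
def best_recent_contact(requester, target, recent_proximity, emergency_contacts):
    """Return target's strongest recent contact if requester is authorized."""
    if requester not in emergency_contacts.get(target, set()):
        raise PermissionError(f"{requester} may not view {target}'s social links")
    links = recent_proximity.get(target, {})
    if not links:
        return None
    return max(links, key=links.get)


# Hypothetical "last 48 hours" proximities and emergency-contact settings.
recent_proximity = {
    "bob": {"george": 0.9, "alice": 0.2},
    "george": {"bob": 0.8},
}
emergency_contacts = {
    "bob": {"jena"},
    "george": {"georges_mother"},
}

found = best_recent_contact("jena", "bob", recent_proximity, emergency_contacts)
```

Jena can resolve Bob's best recent contact, but the same call with George as the target raises `PermissionError`, which matches the restriction that she cannot see further into George's social links.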

The Social Helper mobile application is an example of an end-to-end social application that can take advantage of the Social Interaction Analysis system to provide a new communication service to the end user.

V. DISCUSSION

One of the most important criticisms of the proposed Social Interaction Analysis system is that though the computations for social proximity appear useful, they are difficult to validate. In order to experimentally confirm the results, the system needs to be run against a large set of real users, with feedback collected about whether the calculated social proximity corresponds to the expectations and assumptions of the end-users. This feedback could either be obtained by performing a user survey on the social applications delivered by the system, or incorporated into the collection stage by soliciting ongoing user feedback. Likewise, the value of “last 48 hours” as a measure of relevant social proximity was chosen as a plausible value; an experiment on a set of live social interactions is needed to confirm an appropriate value.

Figure 4. Social helper mobile application.

The studies that calculate quantitative statistics on email database logs show that “sendiness” (i.e. given the past interactions, the probability that one user will compose or reply to another) is sufficient to pick the top social contact for a user. However, it is unknown how the other calculated values can be used to sort the remaining social contacts in an intelligent and useful order. Another difficulty in collecting historical email interactions is that end users have already pre-processed their emails over time, deleting unimportant or trivial messages. It is unclear how the results are affected given that the retained emails are more likely to be important, or whether more precise results could be obtained by considering every email on arrival.

Further consideration needs to be given to the subjective impact of the type of the interactions for a given user. An email can reflect a professional relationship for some users, while an SMS can be considered something more intimate and immediate for others, but there is no fixed rule for everyone. When calculating a global social proximity, there should be a justifiable way to weigh the contribution of the interaction type based on global or individual actor patterns.

Another important aspect to raise is scalability. A network operator can handle between 2000 and 5000 calls per second in a network of a quarter of a billion unique telephone numbers [2]. Storage capacity management through incremental storage, and the duration and periodicity of analysis, must be addressed with the right architectural design. Finally, the experimental framework for Social Interaction Analysis was designed to be very flexible to permit adding new functions to its three layers. In order to remain viable in a real-life deployment (in a mobile network, for example), the system will require tuning for specifically targeted social applications and services. The social interaction user manager is a component that exists outside of the three layers, and provides a provisioning interface to the mobile operator. The mobile operator can determine which users are of interest for social interaction collection, and can use the interface to provide further identity information from their subscriber databases, for example correlating multiple phone numbers and/or email addresses to a single user identity.
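The identity correlation mentioned above amounts to mapping every known endpoint identifier to one user identity before interactions are analyzed. The class below is a minimal sketch of that idea; its name, methods and sample identifiers are hypothetical and do not describe the actual user manager interface.

```python
class IdentityRegistry:
    """Maps phone numbers and email addresses to a single user identity."""

    def __init__(self):
        self._owner = {}

    @staticmethod
    def _key(identifier):
        # Normalize so that the same address in different cases matches.
        return identifier.strip().lower()

    def provision(self, user_id, identifiers):
        """Register all identifiers that belong to one subscriber."""
        for identifier in identifiers:
            self._owner[self._key(identifier)] = user_id

    def resolve(self, identifier):
        """Return the user identity, or the normalized identifier if unknown."""
        key = self._key(identifier)
        return self._owner.get(key, key)


registry = IdentityRegistry()
registry.provision("bob", ["+33100000001", "Bob@example.org"])
```

With this mapping in place, an SMS from `+33100000001` and an email from `bob@example.org` contribute to the same node in the social graph, while unknown endpoints keep their raw identifier as a stand-in identity.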

The mobile operator can also use the social interaction user manager to limit the significant computation resources required by the social interaction analysis layer. For example, one user may have subscribed to an enterprise application that calculates social proximity in the work sphere, while another user uses an application that only considers social interactions with certain keyword content. The necessary resources can be reduced by only performing the calculations required for mapping social proximity data to the selected vocabulary in the presentation layer. In addition, by provisioning the interesting users and required calculations via the social interaction user manager, the analysis and presentation layers of the system can be distributed across the network, where different machines are responsible for subsets of users and/or specific context-based calculations.

VI. RELATED WORK

Social networking applications are based for the most part on a constructed social graph. There are several ways to build a social graph. The most widely used method is based on the web declarative mode (explicit declaration of relationships). Users of Social Networking Sites fill in their profiles by inviting contacts. In addition, aggregation and incentive mechanisms are used.

A second method for social graph construction is the analysis of content, mostly known as web crawling (implicit declaration of relationships). The targeted content can be simple web pages, scientific publications or email exchanges [3], [4]. The analysis can be applied to one or more content sources at a time. In [5], the authors present an end-to-end system for mining and building a social network based on exchanged email and web crawling. [6] proposes a new approach to study the dynamics of key players in social networks from an experiment using 57,158 emails exchanged during 113 days in a large university. The Flink system [7] extracts knowledge about social relationships from email, web pages and publications, performs several computations on the data and consolidates what is learned using a common RDF vocabulary based on the FOAF user profile ontology [8]. A very complete description of this system, with a special focus on the application of semantic technologies for social network analysis, is given in [9].

The third method is the collection and analysis of interactions based on telecommunications means such as voice calls, SMS, MMS, instant messaging, etc. In [1], the authors constructed a social graph of 3.9 million nodes based on telephone call logs. This graph allowed calculating and identifying the properties of a large-scale, weighted social graph.

In our case, we combine several interaction sources from the telecommunication world (SMS and phone calls) and Internet communications (notably email) to build a social network that is as close as possible to the real social network and that serves as an enabler for compelling applications. Moreover, we build several social networks depending on the context considered. Our case study, Social Helper, uses the interactions of the last 48 hours to build a social network describing the relationships of that period.

VII. CONCLUSION AND FUTURE WORK

We have presented in this paper a complete framework for building a comprehensive social network from a large set of interaction sources. We show how this network can be the entry point for compelling applications on a mobile device, by using contextual filters on the network. The Social Helper application addresses the specific case of an emergency call: leveraging the social proximities to find the closest person based on interactions in the last 48 hours. Our framework is designed to deal with almost any kind of use case that needs social network information. This application uses a simple definition of the social strength, primarily based on the number of email or SMS messages received. Since social proximity may be positive or negative depending on the interaction content and context, we are currently working on the improvement of the social strength calculation by taking into account those features. Indexing interactions with content and context will enable rich information retrieval through social data mining based on semantic tagging rather than keyword matching to identify content topics.

The long-term goal is to enrich the set of social interactions that the system can capture and analyze (from the Internet or telecommunication networks) in order to build a social network that closely models reality. The area of media sharing is an example of a rich source of interactions that can be exploited to determine social proximity between actors [10]. More importantly, capturing proximity or other indicators of face-to-face communications is an interesting way to reach this goal since up to two thirds of social interactions occur face-to-face [11] as opposed to digital message exchanges.

ACKNOWLEDGMENTS

The authors would like to thank the following department members for their contribution to this framework: Lionel Natarianni, Denis Leclerc, Ronan Daniellou, Adrien Joly, Linas Maknavicius and Hakim Hacid. This work is being performed as part of a collaborative research project called HERMES within the European Eureka cluster programme CELTIC for telecommunications [10] and is partially funded by the French Ministry of Economy, Industry and Labor, DGCIS Directorate.

REFERENCES

[1] J. Onnela, J. Saramaki, J. Hyvonen, G. Szabo, M. de Menezes, K. Kaski, A. Barabasi, and J. Kertesz, “Analysis of a large-scale weighted network of one-to-one human communication,” New Journal of Physics, vol. 9, no. 6, p. 179, 2007.

[2] N. Easter, “What would you do with the telephone call network of an entire country?” 2006. [Online]. Available: http://www.iq.harvard.edu/blog/

[3] A. L. Barabasi, Linked: The New Science of Networks. Perseus Publishing, 2002.

[4] A. L. Barabasi, H. Jeong, Z. Neda, E. Ravasz, A. Schubert, and T. Vicsek, “Evolution of the social network of scientific collaborations,” Physica A: Statistical Mechanics and its Applications, vol. 311, no. 3-4, pp. 590–614, Aug. 2002.

[5] A. Culotta, R. Bekkerman, and A. McCallum, “Extracting social networks and contact information from email and the web,” in CEAS, 2004.

[6] Y. Matsuo, J. Mori, M. Hamasaki, K. Ishida, T. Nishimura, H. Takeda, K. Hasida, and M. Ishizuka, “Polyphonet: an advanced social network extraction system from the web,” in WWW ’06: Proceedings of the 15th international conference on World Wide Web. New York, NY, USA: ACM Press, 2006, pp. 397–406.

[7] P. Mika, “Flink: Semantic web technology for the extraction and analysis of social networks,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 3, no. 2-3, pp. 211–223, October 2005.

[8] D. Brickley and L. Miller, “The Friend Of A Friend (FOAF) vocabulary specification,” November 2007.

[9] P. Mika, T. Elfring, and P. L. M. Groenewegen, “Application of semantic technology for social network analysis in the sciences,” Scientometrics, vol. 68, no. 1, pp. 3–27, 2006.

[10] “Celtic eureka cluster programme, integrated telecommunications systems, numeric referencing: Project information.” 2009. [Online]. Available: http://www.celtic-initiative.org/Projects/HERMES/

[11] S. Farnham, S. U. Kelly, W. Portnoy, and J. L. Schwartz, “Wallop: Designing social software for co-located social networks,” Hawaii International Conference on System Sciences, vol. 4, p. 40107a, 2004.
