Diffusion in (Social) Networks
Rajesh Sharma
http://rajshpec.github.io/
[email protected]
October, 2014
This presentation is based on several works, including some with: Prof. Danilo Montesi (University of Bologna, Italy), Prof. Matteo Magnani (Uppsala University, Sweden), Prof. Anwitaman Datta (NTU, Singapore), and Prof. Mostafa Salehi (University of Tehran, Iran).
*Some slide content is taken from Jure Leskovec's course work.
Agenda
• Preliminaries
  – Overview of networks
  – Diffusion on monoplex networks: models, algorithms, etc.
• Algorithm for diffusion in decentralized settings
• Diffusion on multilayer networks: models, algorithms, etc.
• Conclusion and future work
Networks: collection of objects where some pairs of objects are connected by links
Examples: protein-protein, ISP (routers etc.), transportation (metro), sexual contact, co-citation, recipe, friendship, human diseases, food web
Networks Really Matter
• If you want to understand the structure of the Web, it is hopeless without working with the Web’s topology.
• If you want to understand the spread of diseases, can you do it without social networks?
• If you want to understand dissemination of news or evolution of science, it is hopeless without considering the information networks.
Networks & Diffusion
• Networks
  – Human-human network
  – Communication network (e.g., OSN, Internet, mobile)
  – Transportation network
• Diffusion (what spreads)
  – Innovation, idea
  – Virus (e.g., SARS)
  – Rumor
  – Behavior (e.g., smoking, selfie)
  – Goods (vegetables etc.)
• Other examples shown: Occupy Square, Maria, Ronaldo, inflation
Effect of Diffusion in ML Networks
• Internal entity
  – A diffusion process happening in a network affects entities inside that network.
  – Example: influence (product, behavior, etc.)
• External entity
  – A diffusion process happening in a network affects an entity outside it.
  – Example: effect of tweets on stock prices
Diffusion Dynamics: What can be done?
A) Models
• Decision-based models
  – Independent Contagion Model
  – Threshold Model
  – Questions:
    • Finding influential nodes
    • Detecting cascades
• Epidemic-based models
  – SIS: Susceptible-Infected-Susceptible (e.g., flu)
  – SIR: Susceptible-Infected-Recovered (e.g., chicken pox)
  – Question:
    • Will the virus take over the network?
B) Explanatory/Empirical Analysis
• Infer the underlying spreading cascade.
• Questions
  – What does diffusion look like?
  – What do cascades look like?
C) Algorithms
  – Influence maximization
  – Outbreak detection
  – etc.
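To make the epidemic-based models on this slide concrete, here is a minimal sketch of a discrete-time SIR simulation on a small graph. It is an illustration under assumptions, not the model used in any of the cited works: the function name `simulate_sir`, the per-contact infection probability `beta`, the per-step recovery probability `gamma`, and the toy path graph are all introduced here for the example.

```python
import random


def _counts(state):
    """Count how many nodes are susceptible, infected, and recovered."""
    values = list(state.values())
    return values.count("S"), values.count("I"), values.count("R")


def simulate_sir(adjacency, seeds, beta, gamma, steps, rng=None):
    """Discrete-time SIR on a graph given as {node: set_of_neighbors}.

    beta  : probability that an infected node infects a susceptible
            neighbor in one step (assumed parameter)
    gamma : probability that an infected node recovers in one step
            (assumed parameter)
    Returns a list of (S, I, R) counts, one entry per step, step 0 included.
    """
    rng = rng or random.Random(0)
    state = {node: "S" for node in adjacency}
    for node in seeds:
        state[node] = "I"
    history = [_counts(state)]
    for _ in range(steps):
        infected = [node for node, s in state.items() if s == "I"]
        newly_infected = set()
        # Infection attempts along every edge leaving an infected node.
        for node in infected:
            for nbr in adjacency[node]:
                if state[nbr] == "S" and rng.random() < beta:
                    newly_infected.add(nbr)
        # Recovery of nodes that were infected at the start of the step.
        for node in infected:
            if rng.random() < gamma:
                state[node] = "R"
        for node in newly_infected:
            state[node] = "I"
        history.append(_counts(state))
    return history


# Toy path graph 0 - 1 - 2 - 3 with certain infection and recovery,
# so the run is deterministic.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
history = simulate_sir(path, seeds=[0], beta=1.0, gamma=1.0, steps=4)
print(history)
```

Sending recovered nodes back to "S" instead of "R" would turn the same loop into an SIS simulation. The question "will the virus take over the network?" then becomes a question about how the final counts depend on `beta`, `gamma`, and the graph structure.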
Information Dissemination: Algorithm
• Objectives
  – Effective
    • High precision (low spam) and recall (good coverage)
  – Efficient
    • Low latency, low duplication
• Challenges: decentralized settings
  – No global list, no explicit subscriptions or coordination
• Intuition
  – Use social links in each hop
    • Locally available (interest) information
    • Less likely to be spammed
    • Easier accountability
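The effectiveness objective above can be made concrete with a small calculation. The sketch below assumes that precision is the fraction of recipients who were actually interested (low spam) and recall is the fraction of interested nodes that were reached (good coverage). The function name and the toy node sets are hypothetical.

```python
def precision_recall(recipients, interested):
    """Precision and recall of one dissemination round.

    recipients : set of nodes that received the message
    interested : set of nodes whose interests match the message
    """
    hits = recipients & interested
    precision = len(hits) / len(recipients) if recipients else 0.0
    recall = len(hits) / len(interested) if interested else 0.0
    return precision, recall


# Hypothetical round: five nodes received the message, four were interested.
recipients = {"a", "b", "c", "e", "j"}
interested = {"a", "b", "c", "k"}
precision, recall = precision_recall(recipients, interested)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the five recipients were interested and three of the four interested nodes were reached, so forwarding to more nodes could raise recall at the cost of precision.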
Approach/Algorithm
• Two logically independent mechanisms/phases
  – Control phase (runs in the background)
    • Collect neighbor nodes' information (interest, degree)
    • Dissemination behavior (forwarding behavior, activeness)
  – Propagation of messages using selective gossip
[4] Anwitaman Datta and Rajesh Sharma, GoDisco: Selective Gossip based Dissemination of Information in Social Community based Overlays, ICDCN 2011 (best paper award in Networking track)
Intuitions for designing selective gossip
• Social science principles
  – Reciprocity-based incentives
  – Social triads to reduce duplicates
• Feedback
  – Learning and adapting to neighbor interests
• Interest communities
  – Naturally clustered
    • But there may be isolated islands
Information agent (IA) categories
• Interest classification:
  – Main category (MC)
  – Subcategory (SC)
• Order of preference
  – Shared main category
  – Irrelevant but good forwarding history
  – Irrelevant but well connected (high degree)
Approach
• If there are relevant neighbors
  – Forward to all relevant neighbors
• Duplication saving: social triad
  – a and b don't send to each other
  – Not for cases like c
• What about non-relevant neighbors?
  – Send to e (closely related) with probability p
• Boundary nodes
  – Score: αh + βd + γa (h: history, d: degree, a: activeness)
  – c selects j
  – j starts a random walk
• α, β, γ can be changed
• Feedback mechanism
[Figure: example network with nodes a, b, c, d, e, h, i, j, k, l, m, n]
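The boundary-node rule on this slide scores each candidate with αh + βd + γa. A minimal sketch of that selection step follows. It assumes h, d, and a are already normalised to [0, 1] and that the highest-scoring neighbor is chosen; the class name, the function names, the default weights, and the sample values are hypothetical and are not taken from the GoDisco paper.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    name: str
    history: float     # h: past forwarding behavior, assumed in [0, 1]
    degree: float      # d: connectivity, assumed normalised to [0, 1]
    activeness: float  # a: how active the node is, assumed in [0, 1]


def score(nbr, alpha, beta, gamma):
    """Weighted sum alpha*h + beta*d + gamma*a from the slide."""
    return alpha * nbr.history + beta * nbr.degree + gamma * nbr.activeness


def pick_forwarder(neighbors, alpha=0.5, beta=0.3, gamma=0.2):
    """Return the neighbor with the highest score (weights are assumptions)."""
    return max(neighbors, key=lambda n: score(n, alpha, beta, gamma))


# Hypothetical non-relevant neighbors of a boundary node.
candidates = [
    Neighbor("i", history=0.2, degree=0.9, activeness=0.1),
    Neighbor("j", history=0.8, degree=0.4, activeness=0.6),
    Neighbor("k", history=0.5, degree=0.5, activeness=0.5),
]
chosen = pick_forwarder(candidates)
degree_heavy = pick_forwarder(candidates, alpha=0.1, beta=0.8, gamma=0.1)
print(chosen.name, degree_heavy.name)
```

With the default weights the node with the best forwarding history wins; shifting weight to β favors the best-connected node instead, which is one way to read "α, β, γ can be changed".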
Message Dissemination
More on Information Dissemination
• Swarm particle approach [2]
  – Communities: multi-dimensional network (based on relations)
  – Particle swarm technique: mobility (particles/agents can move)
  – Orthogonal to GoDisco (because of the multi-dimensional network and mobility)
• GoDisco++ [3]
  – Combines the best of the ICDCN 2011 and 2012 approaches
  – Social sciences plus multi-dimensional network
[2] Rajesh Sharma and Anwitaman Datta, Decentralized information dissemination in multidimensional semantic social overlays, ICDCN 2012, Hong Kong.
[3] Rajesh Sharma and Anwitaman Datta, GoDisco++: A Gossip algorithm for information dissemination in multi-dimensional community networks, Journal of Pervasive and Mobile Computing, Oct. 2012.
Multilayer Networks
• Multiplex networks
  – Every node is present in every network.
  – Multiple types of relationships.
• Interconnected networks
  – Not every node is present in every network.
  – Multiple networks.
• Model
  – Diffusion
Modeling: cascade process
• C1: (v4, l2)
• C2: (v4, l1)
• Diffusion network: aggregation of cascades C1 and C2
[5] Mostafa Salehi, Rajesh Sharma, Moreno Marzolla, Danilo Montesi, Payam Siyari, and Matteo Magnani, Spreading processes in Multilayer Networks, under review at IEEE Transactions on Network Science and Engineering.
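One possible reading of "aggregation of cascades" is a union of the cascades' edges, weighted by how many cascades used each edge. The sketch below follows that reading as an assumption. Only the two starting points (v4, l2) and (v4, l1) come from the slide; the cascade edges, the function name, and the weighting scheme are invented for illustration.

```python
from collections import Counter


def aggregate_cascades(cascades):
    """Union the edges of several cascades.

    Each cascade is a list of edges ((node, layer), (node, layer)).
    The result maps every edge to the number of cascades that contain it.
    """
    weights = Counter()
    for cascade in cascades:
        weights.update(set(cascade))  # count an edge once per cascade
    return dict(weights)


# Hypothetical cascades: C1 starts at (v4, l2), C2 starts at (v4, l1).
c1 = [
    (("v4", "l2"), ("v1", "l2")),
    (("v1", "l2"), ("v1", "l1")),
]
c2 = [
    (("v4", "l1"), ("v4", "l2")),
    (("v4", "l2"), ("v1", "l2")),
]
diffusion_network = aggregate_cascades([c1, c2])
for edge, weight in sorted(diffusion_network.items()):
    print(edge, weight)
```

In this toy case the edge from (v4, l2) to (v1, l2) is shared by both cascades, so it carries weight 2 in the aggregated diffusion network.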
4 possibilities of diffusion in ML
• Same-node inter-layer
  – Cascade switches layer but remains on the same node
  – Example: a Facebook post is shared on Twitter
• Other-node inter-layer
  – Cascade continues spreading to another node in another layer
  – Example: the spread of a disease in an interconnected network of cities
• Other-node intra-layer
  – Cascade continues spreading through the same layer
  – Example: retweeting a post on Twitter
• Same-node intra-layer
  – ??
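The four cases follow from two yes/no questions about one cascade step: does the node change, and does the layer change? A minimal sketch, assuming a step is a pair of (node, layer) tuples; the function name and the sample users are hypothetical.

```python
def classify_step(src, dst):
    """Classify one cascade step between two (node, layer) pairs."""
    (src_node, src_layer), (dst_node, dst_layer) = src, dst
    node_part = "same-node" if src_node == dst_node else "other-node"
    layer_part = "intra-layer" if src_layer == dst_layer else "inter-layer"
    return f"{node_part} {layer_part}"


# A user shares their own Facebook post on Twitter.
print(classify_step(("alice", "facebook"), ("alice", "twitter")))
# A follower retweets a post on Twitter.
print(classify_step(("alice", "twitter"), ("bob", "twitter")))
```

The fourth combination, same node and same layer, corresponds to a step that moves nowhere, which may be why the slide leaves it as an open question.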
Dependent variables used in different diffusion studies
Milgram Experiment (late 1960s)
• The navigation problem
  – Small world community
• The experiment setup
  – One target (Massachusetts)
  – Many originators (Nebraska)
  – Acquaintance chains of letters
• Output
  – Six degrees of separation
• New version (2003) by Dodds et al.
  – Multiple sources and targets
  – Web-based experiment
History of Diffusion (Timeline)
• 1967: Milgram, navigation in small world [1]
• 1975: Epidemic model [2]
• 1978: Granovetter, threshold model
• 1993: Internet
• 1998: SW (small world)
• 1999: SF (scale free)
• 2001: Vespignani, the underlying network is important
• Also shown: AIDS impact on Swedish population; Wiki, Friendster, Myspace, FB, blogs, Flickr, YouTube, smartphones
• 2014, 2015: ??
Milgram Reloaded!
• Attempt to understand the navigation process
• Multiple networks (FB, Twitter, WhatsApp, etc.)
• Across the globe
• Multiple originators
• Multiple targets
• Multilingual
[Figure: originators O1 to O5 and targets T1 to T6]
Output: average path length, network usage (geographically), originator <-> target impact
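Two of the outputs named above, average path length and network usage, could be computed from the collected chains roughly as follows. This is a sketch under assumptions: the record layout (one dictionary per chain, with a list of the networks used at each hop) and the function name are hypothetical, and the sample chains are made up.

```python
from collections import Counter


def summarize_chains(chains):
    """Return (average hop count of completed chains, hops per network)."""
    completed = [chain for chain in chains if chain["reached_target"]]
    lengths = [len(chain["hops"]) for chain in completed]
    average = sum(lengths) / len(lengths) if lengths else None
    usage = Counter(network for chain in chains for network in chain["hops"])
    return average, usage


# Hypothetical records: each hop names the network used to pass the message on.
chains = [
    {"origin": "O1", "target": "T1", "reached_target": True,
     "hops": ["FB", "WhatsApp", "FB"]},
    {"origin": "O2", "target": "T3", "reached_target": True,
     "hops": ["Twitter", "FB", "FB", "WhatsApp", "Twitter"]},
    {"origin": "O3", "target": "T2", "reached_target": False,
     "hops": ["FB"]},
]
average, usage = summarize_chains(chains)
print(average, dict(usage))
```

Only completed chains count toward the average path length, while unfinished chains still contribute to network usage.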
Milgram Reloaded!
• What data will we ask for?*
  – Who are you: email ID or phone number
  – Network: through which network did you receive the message?
  – Who sent it to you: ID of the person
  – Which networks are you going to use to move the message towards its destination?
• Web link: http://m.web.cs.unibo.it/
• If you have comments or feedback, please contact:
  – [email protected] or [email protected]
Reasoning about Networks
• How do we reason about networks?
  – Empirical: study network data to find organizational principles
    • How do we measure and quantify networks?
  – Mathematical models: graph theory and statistical models
    • Models allow us to understand behaviors and distinguish surprising from expected phenomena.
  – Algorithms: for analyzing graphs
    • Hard computational challenges
Networks: Structure & Process
• What do we study in networks?
  – Structure and evolution:
    • What is the structure of a network?
    • Why and how did it come to have such structure?
  – Processes and dynamics:
    • Networks provide the “skeleton” for the spreading of information, behavior, and diseases
    • How do information and diseases spread?
Networks: Impact
• Companies: Google (382.61B), Cisco (125.29B), Facebook (207.04B), Twitter (25.32B), LinkedIn (28.9B)
• Predicting epidemics: flu
• Intelligence and fighting (cyber) terrorism: find the leaders/hubs of terrorist organizations/regimes
• Financial impact: recession in Europe (who is lending to whom)
Networks: Size Matters
• Network data: orders of magnitude
  – 436-node network of email exchange at a corporate research lab [Adamic-Adar, SocNets ‘03]
  – 43,553-node network of email exchange at a university [Kossinets-Watts, Science ‘06]
  – 4.4-million-node network of declared friendships on a blogging community [Liben-Nowell et al., PNAS ‘05]
  – 240-million-node network of communication on Microsoft Messenger [Leskovec-Horvitz, WWW ’08]
  – 800-million-node Facebook network [Backstrom et al. ‘1
Group Activity
• Big data: network (and non-network) data, mostly from the web
  – Understanding and analysis
• A few examples:
  – Impact of tweets on:
    • Financial patterns
    • Reputation of companies
  – Community patterns in networks: information dissemination
  – GPS data: insurance fraud
Rajesh Sharma
University of Bologna
http://rajshpec.github.io/
[email protected]
Research Group: http://sigsna.net/impact/
Thank you!! Questions?