analysis of social information networks

33
Analysis of Social Information Networks Thursday January 20 th , Lecture 1: Small World 1

Upload: basil

Post on 24-Feb-2016

38 views

Category:

Documents


0 download

DESCRIPTION

Analysis of Social Information Networks. Thursday January 20 th , Lecture 1: Small World. What is the small world effect?. “You’r e in NY, b y the way, do you happen to know x? - In fact, yes, what a small world!” W hat seems remotely distant is indeed socially close. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Analysis of  Social Information Networks

1

Analysis of Social Information

NetworksThursday January 20th, Lecture 1: Small World

Page 2: Analysis of  Social Information Networks

2

What is the small world effect?“You’re in NY, by the way, do you happen to know x?- - In fact, yes, what a

small world!”

What seems remotely distant is indeed socially close. The small world problem,

S. Milgram, Psychology today (1967)

Page 3: Analysis of  Social Information Networks

3

Small world hypothesis is necessary for−Fast and wide information propagation−Tipping point: small change has large effect−Network robustness and consistency

Studying the conditions of the “small world” effects tells us a lot about how we are connected?

Man, by nature, an animal of a society−“ζῷον πολιτικὸν” Aristotle, Politics 1

Why study small world?

Page 4: Analysis of  Social Information Networks

4

Remember how the King lost his fortune to the chess player?

What would you take?−An increasing series {1¢,2¢,4¢, etc.} over a

month?−Against $1,000−Against $100,000−Against $1,000,000−Against $100,000,000

Small world: a simplistic argument

Page 5: Analysis of  Social Information Networks

5

How many people would you recognize by name?−‘67 M. Gurevitch (MIT): about 500

Roughly, how many are socially related to you?

Small world: a simplistic argument

how close to you? Compares to %. US pop.500 direct acquaintance C.S. dept 0.00017%

250,000

share an acquaintance with

you

Harlem district 0.083%

125m share an acquaintance with a

friend of yours

Northeast + Midwest

42%

Page 6: Analysis of  Social Information Networks

6

The previous model is way too optimistic!−Reason #1: it assumes acquaintance set are

disjoint whereas they are related as part of a social graph searching through the graph creates inbreeding.

−Reason #2: social acquaintances are biasedGeography, occupation & social status, race Favors clusters, inbreeding. Increases social

distance−Other reasons: unbalanced degrees

Small world: The skeptics

Page 7: Analysis of  Social Information Networks

7

Milgram’s “small world” experiment

It’s a “combinatorial small world” It’s a “complex small world” It’s an “algorithmic small world”

Outline

Page 8: Analysis of  Social Information Networks

8

A direct experimental approach−Main issue: we do not know the graph (This is

1967!)−Can we still find a method to exhibit small chains

of acquaintance between two arbitrary people? At least we can try it out:

Pick a single “target” and an arbitrary sample of peopleSend a “folder” with target description, asking participants to Add her name, and forward to her friend or acquaintance

who she thought would be likely to know the target, Send a “tracer” card directly to a collection point

Milgram’s experiment

Page 9: Analysis of  Social Information Networks

9

Would this work?First folders arrived using 4 days and 2 intermediaries

2 successful experiments− 1st: 44 folders out of

160− 2nd : 64 folders on 296− Average distance = 6.2.

The small world problem, S. Milgram, Psychology today (1967)

Page 10: Analysis of  Social Information Networks

10

More observations− Chains conditioned

locally: gender, family/ friendsglobally: hometown/job

− Some received multiple folders (up to 16)

− Drop out-rate ~25%May artificially reduce distance and can be corrected

The small world problem, S. Milgram, Psychology today (1967)

Page 11: Analysis of  Social Information Networks

11

Are these results reproducible?−Using new forms of communication

P. Dodds, R. Muhamad, and D. Watts. An experimental study of search in global social networks. Science (2003).

Another method: assuming graph is known −Exhaustive search:− J. Leskovec and E. Horvitz. Planetary-scale views on a large

instant-messaging network. WWW (2008)− Erdös number, Kevin Bacon number− More evidence of short paths, including in different

domains

Experiment sequels

Page 12: Analysis of  Social Information Networks

12

Confirmation using email− Larger scale

18 targets, 24,000 starting points

− Drop-out rates : 65%success rate is lower: 1.5%median distance: 4 raw data and 7 corrected data

− Role of social statussuccess rate changes, but not mean distance

An experimental study of search in global social networks. P. Dodds, R. Muhamad, and D. Watts. Science (2003).

Page 13: Analysis of  Social Information Networks

13

Erdös numberConnections = collaboration− 2 persons connected if

they co-authored a paper.

How far are you from Erdös?− 94% within distance 6− Median: 5− Small number prestigious

Field medalists: 2-5; Abel: 2-4

The Erdös Number Project, www.oakland.edu/enp/

Page 14: Analysis of  Social Information Networks

14

Kevin Bacon numberConnections = collaboration− 2 actors connected if

they appear in one movie

− ~ 88% actors connected

− 98.3% actors: 4 and less Maximum number is 8.

The oracle of Bacon, oracleofbacon.org

Page 15: Analysis of  Social Information Networks

15

Instant MessagingDifferent scales• 180m nodes, 13b

edges Connections = collaboration− 180m nodes, 1.3b

edges

Planetary-scale views on a large instant-messaging network. J. Leskovec and E. Horvitz. WWW (2008)

Page 16: Analysis of  Social Information Networks

16

Milgram’s “small world” experiment

It’s a “combinatorial small world” It’s a “complex small world” It’s an “algorithmic small world”

Outline

Page 17: Analysis of  Social Information Networks

17

Yes, according to our (simplistic) combinatorial argument shown before.−Can we extend it to more realistic model?

Approach #1: 1. construct a graph from model of social

connections2. Characterizes connectivity, distance,

diameter −Main difficulty: connections between people

are difficult to predict

Do short paths exist?

Page 18: Analysis of  Social Information Networks

18

Uniform random graphs (Erdös-Rényi), 3 flavors1. Each edge exists independently (with fixed prob.

p)2. Choose a subset of m edges3. Choose for each node d random neighbors

Main merit: properties are established rigourously for large scale asymptotic (size n going to ∞)−Properties of connected components−Presence/absence of isolated nodes−Small diameter

Random Graphs

Page 19: Analysis of  Social Information Networks

19

Flavor 1 and 2 : series of phase transitions

All monotone properties have sharp threshold

Properties of Unif. Rand. Graph

p11/n ln(n)(1+ω(1))/n0

Conn. Comp.<ln(n)

1 large connected component; others <ln(n)Isolated nodes

graph is connected

ω(ln(n))/n

Diameter O(ln(n))

E. Friedgut, G. Kalai, Proc. AMS (1996) following Erdös-Rényi’s results

Page 20: Analysis of  Social Information Networks

20

Flavor 3 (each node with d random neighbors) How to construct?

1. Assign d “semi-edge” to each node2. Pair “semi-edge” into edge randomly3. Remove self-loops and multi-edges (rejection)(NB: This construction can be used for any degree dist.)

Degree constant: should we expect connectivity?

Properties of Unif. Rand. Graph

Page 21: Analysis of  Social Information Networks

21

Transitions is even faster:−d=1: collection of disjoint pairs (not

connected)−d=2: collection of disjoint cycles (a.s.

disconnected)−d=3: connected, small diameter, in fact

much more ... Thm: for d≥3 there exists γ>0 such that

for any sizeG is a γ_expander with high

probability:

Properties of Unif. Rand. Graph

Page 22: Analysis of  Social Information Networks

22

Proposition 1: If G is a γ_expander with degree at most d, its diameter is at most −Proof: combinatorial geometric expansion

Expanders have other properties (more later)−Robustness w.r.t. edges removal−Behavior of random walks

Properties of expanders

Page 23: Analysis of  Social Information Networks

23

Are these graphs expanders?

Page 24: Analysis of  Social Information Networks

24

Assuming random uniform connections−Phase transitions (i.e. tipping points) are the rule.−Connectivity, small diameter, even expander

property can be guaranteed without strict coordination… this may outclass deterministic structure!

−Elegant and precise mathematical formulation

Is small world only a reflection of the combinatorial power of randomness?

Summary

Page 25: Analysis of  Social Information Networks

25

Milgram’s “small world” experiment

It’s a “combinatorial small world” It’s a “complex small world” It’s an “algorithmic small world”

Outline

Page 26: Analysis of  Social Information Networks

26

Do not underestimate the effect of the inbreeding!−How many of your friends are friends (strong

ties)?−Which are friends with different people

(weak ties)? Several evidences that these play different

roles−Strong ties: resistance to innovation and

danger(e.g., reduce risk of teen suicide)

−Weak ties: propagate information, connect groups(e.g., who to ask when looking for a job)

Social networks are not flat

Suicide and Friendships Among American Adolescents. P. Bearman, J. Moody, Am. J. Public Health (2004)

The Strength of Weak Ties, M. Granovetter Am. J Soc. (1983)

Page 27: Analysis of  Social Information Networks

27

A metric of strong ties−Clustering coefficient−Also writes

Structured networks: rings, grids, torus−Clustering remains constant in large graph−As indeed, in most empirical data set

Random graphs−Clustering coefficient vanishes in large graph

… hence most results are biased towards weak ties

Can we quantify the difference?

Page 28: Analysis of  Social Information Networks

28

Main idea: social networks follows a structure with a random perturbation

Formal construction:1. Connect nearby nodes in a regular lattice2. Rewire each edge uniformly with probability

p(variant: add a new uniform edge with probability p)

Small-world model

Collective dynamics of ‘small-world’ networks. D. Watts, S. Strogatz, Nature (1998)

Page 29: Analysis of  Social Information Networks

29

Main idea: social networks follows a structure with a random perturbation

Small-world model

Collective dynamics of ‘small-world’ networks. D. Watts, S. Strogatz, Nature (1998)

Page 30: Analysis of  Social Information Networks

30

Between order and randomnessAs a function of p:− C(p): clustering

coefficient− L(p): median length of

the shortest path− Both normalized by

p=0.For small p (~ 0.01)− large clustering − small diameterA few rewire suffice!

Page 31: Analysis of  Social Information Networks

31

Thm: ∀p>0, augmented lattice has small diameter,

there is A such that P[D≤A ln(N)] → 1−Elementary proof if assuming regular augmentation−Otherwise proof similar to Uniform Random Graph

Statistical analysis randomly perturbed structures−Degree distribution, assortative mixing,

hypergeometry

A new statistical mechanics

Page 32: Analysis of  Social Information Networks

32

Milgram’s “small world” experiment

It’s a “combinatorial small world” It’s a “complex small world” It’s an “algorithmic small world”

Outline

Page 33: Analysis of  Social Information Networks

33

Where are we so far?Analogy with a cosmological principle− Are you ready to accept a

cosmological theory that does not predict life?

In other words, let’s perform a simple sanity check