
Section 8 – Ec1818

Jeremy Barofsky, jbarofsk@hsph.harvard.edu

March 31st and April 1st, 2010


Section 8 Outline (lectures 15, 16)

• Social Network Introduction

• Types of Networks / Graphs
  1) Random
  2) Regular
  3) Small-world

• Erdos / Bacon Numbers

• Review Questions?

• Evaluations

• Office Hours - Thursday, 4/1/10 10-11am, outside 320 CGIS North.


Small World Phenomenon - Milgram

• Question: What is the probability that two randomly selected people know each other?

• Design: In 1967, Milgram sent packages to 160 random people living in Omaha, NE, asking each to forward the package to a friend or acquaintance they thought might know, or be connected to, the final target: a stockbroker in Boston. (Postcards were also sent back to Harvard to track progress.)

• Results:
  1) Of the letters that reached their destination, the average path length was 5.5 to 6.
  2) Significant selection bias: in one experiment, 232 of 296 letters were not sent on.
  3) Most of the cards reached the target through a few people. In the experiment with 160 packages sent, 24 reached the target at his home, and 16 of these were handed to him by one person (a single node in the network).

• Reasons for under-estimate or over-estimate of avg. path length?
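As a quick check on the counts quoted in the results, the shares they imply can be worked out directly. This is only arithmetic on the slide's numbers; the variable names are illustrative. One possible reading, offered as an assumption rather than the lecture's answer: path length is observed only for chains that finish, so if longer chains are more likely to be abandoned, the observed average would tend to understate the true one.

```python
# Counts quoted on the slide (they come from two separate experiments).
not_sent_on, letters = 232, 296                  # letters never forwarded
packages, reached_home, via_one = 160, 24, 16    # Omaha experiment

attrition_rate = not_sent_on / letters       # share of letters not sent on
completion_rate = reached_home / packages    # share reaching the target's home
funnel_share = via_one / reached_home        # share delivered by one person

print(f"not sent on:            {attrition_rate:.1%}")
print(f"reached target's home:  {completion_rate:.1%}")
print(f"delivered by one person: {funnel_share:.1%}")
```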


Social networks

• A graph G consists of a set V(G) of vertices (or nodes) together with a set E(G) of edges (or links) that connect vertices.

• Degree: the number of edges connected to a given vertex.

• Order: the number of vertices in V(G).

• Size: the number of edges in E(G).

• Directed graph / undirected graph: a graph is directed if all its edges are directional, i.e., the network tells us not just whether two people are friends but whether each person considers the other a friend. If none of the edges are directional, G is undirected.
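The definitions above can be made concrete on a tiny example. The sketch below uses a made-up four-person friendship network (the names and edges are hypothetical) and computes order, size, and degree for the undirected case, then in-degree and out-degree for a directed variant.

```python
# Hypothetical undirected friendship network; names and links are made up.
vertices = {"Ann", "Bob", "Cat", "Dan"}
edges = {
    frozenset(pair)
    for pair in [("Ann", "Bob"), ("Ann", "Cat"), ("Bob", "Cat"), ("Cat", "Dan")]
}

order = len(vertices)   # number of vertices
size = len(edges)       # number of edges
degree = {v: sum(v in e for e in edges) for v in vertices}

# Directed variant: (x, y) means "x considers y a friend".
directed_edges = {("Ann", "Bob"), ("Bob", "Ann"), ("Dan", "Cat")}
out_degree = {v: sum(src == v for src, _ in directed_edges) for v in vertices}
in_degree = {v: sum(dst == v for _, dst in directed_edges) for v in vertices}

print("order:", order, "size:", size)
print("degree:", degree)
print("in-degree:", in_degree, "out-degree:", out_degree)
```

In the undirected case the degrees always sum to twice the size, since each edge touches two vertices.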


Networks / Graphs and 3 elements


Social Network Metrics

• Characteristic path length L(G, p): measures the average distance between pairs of vertices. By distance we mean the length of the shortest path that connects vertices v and v'.

• Clustering coefficient C(G, p): measures a vertex's (person's) level of cliquishness within its neighborhood. It answers: are the friends of my friends also my friends?

• Formally, C(G, p) = actual edges among the vertices in a neighborhood / maximum possible edges in that neighborhood.

• Maximum number of edges in a graph (connections in a social network): n(n-1)/2, where n = number of vertices.
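Both metrics can be computed by hand on a small network. The sketch below is a minimal illustration under two assumptions that the slide does not spell out: a vertex's neighborhood is taken to be its direct neighbors (excluding the vertex itself), and a vertex with fewer than two neighbors contributes a clustering value of 0. The five-vertex graph is hypothetical.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected network: a triangle A-B-C with a tail C-D-E.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

def shortest_distances(graph, source):
    """Breadth-first search: shortest path length from source to each vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def characteristic_path_length(graph):
    """Average shortest-path distance over all pairs (graph assumed connected)."""
    total = sum(sum(shortest_distances(graph, v).values()) for v in graph)
    n = len(graph)
    return total / (n * (n - 1))   # each unordered pair is counted twice

def clustering_coefficient(graph):
    """Average over vertices of (edges among neighbors) / (k(k-1)/2)."""
    values = []
    for v, neighbors in graph.items():
        k = len(neighbors)
        if k < 2:
            values.append(0.0)     # assumption: no pairs of neighbors means 0
            continue
        actual = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
        values.append(actual / (k * (k - 1) / 2))
    return sum(values) / len(values)

L = characteristic_path_length(graph)
C = clustering_coefficient(graph)
print(f"L = {L:.3f}, C = {C:.3f}")
```

The denominator k(k-1)/2 is the same maximum-edges formula as n(n-1)/2, applied to the k vertices of one neighborhood.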


Types of Graphs

• Regular Network: each vertex is connected only to the same number k of its nearest neighbors, so all vertices have the same degree. Long characteristic path length, because it takes many steps to get from one vertex to a distant one; large clustering coefficient, because each vertex's nearby vertices are also connected to one another.

• Random Network: edges between vertices occur at random with a fixed probability. Connectedness emerges abruptly (non-linearly) rather than gradually: a large connected component appears once Pr(connection) reaches about 1/|V(G)|. Small characteristic path length and small clustering coefficient.

• Adjacency matrix: a way to represent network data in which each row and column corresponds to a vertex and each entry records whether that pair of vertices has a connection.
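To illustrate the adjacency matrix and the non-linear onset of connectedness, the sketch below builds a random network as a 0/1 matrix and reports the size of its largest connected group as the connection probability is scaled around 1/n. The function names, the seed, and the choice n = 200 are assumptions made for illustration, not part of the lecture.

```python
import random

def random_adjacency_matrix(n, p, seed=0):
    """Symmetric 0/1 matrix: entry [i][j] is 1 if vertices i and j are linked."""
    rng = random.Random(seed)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                matrix[i][j] = matrix[j][i] = 1
    return matrix

def largest_component(matrix):
    """Size of the largest set of vertices that can all reach one another."""
    n = len(matrix)
    seen, best = set(), 0
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in range(n):
                if matrix[v][w] and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

n = 200
for scale in (0.5, 1.0, 2.0, 4.0):
    matrix = random_adjacency_matrix(n, scale / n)
    print(f"p = {scale}/n -> largest component: {largest_component(matrix)} of {n}")
```

Exact sizes depend on the seed, so no expected output is shown; the point is to compare the values below and above p = 1/n.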


From Regular to Random Graphs via Small Worlds


Regular -> Small World -> Random Graphs

• Rewire: start with a regular graph with vertices arranged in a circle, each connected to its 4 closest neighbors. Rewire each edge at random with probability p.

• Changing p tunes the graph: p = 0 gives a regular graph, p = 1 a random graph.

• Watts and Strogatz define small-world networks by two characteristics:

1) Large clustering coefficient C(G, p): most of my neighbors are friends with each other as well as with me.

2) Small characteristic path length L(G, p): the presence of random, long-distance connections means that moving from one part of the graph to another can be done quickly.
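The rewiring procedure can be sketched in a few lines. This is a simplified illustration, not necessarily the exact algorithm in Watts and Strogatz (1998): each edge keeps one endpoint and, with probability p, moves the other to a uniformly chosen vertex that avoids self-loops and duplicate edges. The function names, the seed, and n = 200 are assumptions; path length is averaged over reachable pairs in case rewiring disconnects the graph.

```python
import random
from collections import deque
from itertools import combinations

def ring_lattice(n, k=4):
    """Regular graph: n vertices on a circle, each linked to its k nearest."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def rewire(adj, p, rng):
    """Return a copy in which each original edge is rewired with probability p."""
    out = {v: set(nb) for v, nb in adj.items()}
    original_edges = [(v, w) for v in adj for w in adj[v] if v < w]
    for v, w in original_edges:
        if rng.random() < p:
            candidates = [u for u in out if u != v and u not in out[v]]
            if candidates:
                new = rng.choice(candidates)
                out[v].discard(w)
                out[w].discard(v)
                out[v].add(new)
                out[new].add(v)
    return out

def clustering(adj):
    """Average share of a vertex's neighbor pairs that are themselves linked."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k >= 2:
            links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)

def path_length(adj):
    """Average shortest-path distance over reachable ordered pairs (BFS)."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(1818)
lattice = ring_lattice(200)
for p in (0.0, 0.05, 1.0):
    g = rewire(lattice, p, rng)
    print(f"p = {p}: C = {clustering(g):.3f}, L = {path_length(g):.2f}")
```

At p = 0 the graph is the unchanged lattice, whose clustering coefficient is 0.5 when each vertex has 4 neighbors. A small-world regime would show up as an intermediate p where L has already dropped while C is still close to that value.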


Characteristic path length L(p) and clustering coefficient C(p) for rewired graphs as p varies.

(Watts and Strogatz, 1998)


Empirical Examples (Watts and Strogatz, 1998)


Power Laws Again? Are you serious? (Random means normal distribution and small world means power law)


Erdos and Bacon Numbers

• Small-world networks exhibit strong connections between neighbors (cliques), but information can still travel quickly because of random connections to other highly connected groups of vertices.

• Erdos / Bacon numbers: degrees of separation measured through co-authorship of peer-reviewed journal articles (Erdos) or shared movie credits (Bacon). A Bacon number of 1 means the individual acted in the same movie as Kevin Bacon. Nearly all actors are connected in this way, so the network exhibits the characteristics of a small-world network.
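A Bacon number is a shortest-path distance in the co-star network, so it can be computed with a breadth-first search. The sketch below uses entirely made-up films and actor labels (only the choice of Kevin Bacon as the center comes from the slide); the function names are illustrative.

```python
from collections import deque

# Hypothetical cast lists; film titles and "Actor X" labels are made up.
films = {
    "Film 1": ["Kevin Bacon", "Actor A", "Actor B"],
    "Film 2": ["Actor B", "Actor C"],
    "Film 3": ["Actor C", "Actor D"],
    "Film 4": ["Actor E", "Actor F"],   # no link to the rest of the network
}

def costar_graph(films):
    """Link every pair of actors who share at least one film credit."""
    graph = {}
    for cast in films.values():
        for actor in cast:
            graph.setdefault(actor, set()).update(a for a in cast if a != actor)
    return graph

def separation_numbers(graph, center):
    """Breadth-first search: number of co-star links from center to each actor."""
    numbers = {center: 0}
    queue = deque([center])
    while queue:
        actor = queue.popleft()
        for costar in graph[actor]:
            if costar not in numbers:
                numbers[costar] = numbers[actor] + 1
                queue.append(costar)
    return numbers

bacon = separation_numbers(costar_graph(films), "Kevin Bacon")
for actor, number in sorted(bacon.items(), key=lambda item: item[1]):
    print(number, actor)
```

Actors with no chain of shared credits to the center (Actor E and Actor F here) simply do not appear in the result. Swapping the center and the data for authors and co-authored papers gives the analogous Erdos number.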


Is Bacon Best?

• “By processing all of the 1.6 million people in the Internet Movie Database I discovered that there are currently 506 people who are better centers than Kevin Bacon!” –Oracle of Bacon website.

• Compute average Bacon number and compare to others.


Degree distribution of Bacon / Connery Numbers for Actors in IMDb.

Bacon / Connery number    # of actors (Bacon)    # of actors (Connery)
0                         1                      1
1                         2349                   2446
2                         223940                 282628
3                         666941                 674960
4                         153220                 89845
5                         9662                   6533
6                         877                    598
7                         134                    118
8                         15                     10
Average                   2.95                   2.82
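The averages in the last row are weighted means of the number column, with the actor counts as weights. The sketch below recomputes them from the counts in the table; the function and variable names are illustrative. With these counts the Bacon average comes out at 2.95, and the Connery average at roughly 2.83, which matches the table's 2.82 if that figure was truncated rather than rounded.

```python
# Actor counts by Bacon / Connery number, copied from the table above.
bacon_counts = {0: 1, 1: 2349, 2: 223940, 3: 666941, 4: 153220,
                5: 9662, 6: 877, 7: 134, 8: 15}
connery_counts = {0: 1, 1: 2446, 2: 282628, 3: 674960, 4: 89845,
                  5: 6533, 6: 598, 7: 118, 8: 10}

def average_number(counts):
    """Weighted mean: sum(number * actors) / total actors."""
    total_actors = sum(counts.values())
    return sum(number * actors for number, actors in counts.items()) / total_actors

bacon_avg = average_number(bacon_counts)
connery_avg = average_number(connery_counts)

print(f"actors linked to Bacon:   {sum(bacon_counts.values())}")
print(f"actors linked to Connery: {sum(connery_counts.values())}")
print(f"average Bacon number:   {bacon_avg:.4f}")
print(f"average Connery number: {connery_avg:.4f}")
```

A lower average means a better "center" of the network, which is the comparison the previous slide asks for.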


Midterm Questions?