gabriela ochoa goc/...10/19/2015 2 network ≡ graph graph refers to the mathematical abstraction,...

14
10/19/2015 1 BIG DATA SCIENTIFIC AND COMMERCIAL APPLICATIONS (ITNPD4) LECTURE: COMPLEX NETWORKS Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Networks and Complexity Social networks Definition Motivation History Networks in general Representation and types Properties and metrics Software packages Summary & What’s next? Gabriela Ochoa, [email protected] 2

Upload: others

Post on 12-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

1

BIG DATA SCIENTIFIC AND COMMERCIAL

APPLICATIONS (ITNPD4)

LECTURE: COMPLEX NETWORKS

Gabriela Ochoa

http://www.cs.stir.ac.uk/~goc/

OUTLINE

Networks and Complexity

Social networks

Definition

Motivation

History

Networks in general

Representation and types

Properties and metrics

Software packages

Summary & What’s next?

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

2

Page 2: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

2

Network ≡ Graph

Graph refers to the mathematical abstraction, while

network to the real world instantiation

Graphs are just collections of “points” joined by “lines”

Points Lines

vertices edges, arcs Math

nodes links Computer Science

sites bonds Physics

actors ties, relations Sociology

NETWORKS AND GRAPHS

WHAT IS A SOCIAL NETWORK?

A social network is a collection of people,

each of whom is acquainted with some

subset of the others

Represented as a set of points (or vertices)

denoting people, joined in pairs by lines (or

edges) denoting acquaintance.

One could, in principle, construct the social

network for a company or firm, for a school

or university, or for any other community up

to and including the entire world.

Page 3: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

3

WHAT IS A COMPUTER NETWORK?

A collection of computing devices

connected in order to communicate

and share resources

Connections between computing

devices can be

Physical: wire or cables

Wireless: radio waves or infrared

signals Resources • Computers

• Data and data storage

• Printing

• Authentication

MANY OTHER COMPLEX NETWORKS …

Technological networks

Internet, WWW, Telephone, Railway, Airlines, …

Biological networks

Neural networks, metabolic networks, food web, …

Social networks

Friendships, sicentific collaborations, …

Demos and links

Facebook social network graph

Mapping the Human ‘Diseasome’

Gallery of network images

6

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Page 4: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

4

COMPLEX SYSTEMS G

ab

riela

Och

oa

, goc@

cs.stir.ac.u

k

7

[adj., v. kuh m-pleks, kom-pleks; n. kom-pleks]

–adjective

1.

composed of many interconnected parts; compound; composite: a complex highway system.

2.

characterized by a very complicated or involved arrangement of parts, units, etc.: complex machinery.

3.

so complicated or intricate as to be hard to understand or deal with: a complex problem.

Source: Dictionary.com

Complexity, a scientific theory which asserts that some systems display behavioral phenomena that are completely inexplicable by any conventional analysis of the systems’ constituent parts. These phenomena, commonly referred to as emergent behaviour, seem to occur in many complex systems involving living organisms, such as a stock market or the human brain.

Source: John L. Casti, Encyclopædia Britannica

Complex Complexity

Network Science: Introduction January 10, 2011

THE ROLE OF NETWORKS

Behind each complex system there is an

intricate wiring diagram, or a network, that

defines the interactions between the

component.

We will never understand complex system

unless we map out and understand the

networks behind them.

Page 5: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

5

RESOURCES

Books

Network Science Book Project by Laszlo

Barabasi et al.

Networks: An Introduction, M. E. J.

Newman, Oxford University Press, Oxford

(2010)

More books

Articles

Newman, M. E. (2003). The structure and

function of complex networks. SIAM review,

45(2):167–256

Newman, M. E. (2001) The structure of

scientific collaboration networks.

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

9

A S

OC

IAL

NE

TW

OR

K

555 scientists and

their co-

authorships

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Lothar Krempel, MPI für Gesellschaftsforschung,

Lothringerstr.78, 50677 Köln, Germany

email: [email protected] 10

Page 6: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

6

AN

EX

AM

PL

E O

F A

SM

AL

L C

OA

UT

HO

RS

HIP

NE

TW

OR

K

Collaborations among

scientists at a private

research institution.

Nodes in the network

represent scientists,

and a line between

two of them indicates

they coauthored a

paper during the

period of study. This

particular network

appears to divide into

a number of

subcommunities, as

indicated by the

shapes of the nodes,

and these

subcommunities

correspond roughly to

topics of research,

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Newman M E J PNAS 2004;101:5200-5205

11

Potterat J J et al. Sex Transm Infect 2002;78:i159-i163

HIV

/AID

S N

ET

WO

RK

Largest connected

component, early

period (1980s),

Colorado Springs

(n = 250).

The stereotypic

member was a

white gay man

nearly 30 years old

who associated with

injecting drug users.

Node labels:

G: Gay man F: Female M: Heterosexual man +: HIV positive −: HIV negative ? : Unknown HIV status N: injecting drug using needles

Page 7: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

7

WHY TO STUDY SOCIAL NETWORKS?

Inherent interest in the patterns of human

interaction

Their structure has important implications for

the spread of information and disease.

For example, varying the average no. of

acquaintances individuals have (average degree)

might substantially influence the propagation of

a rumour, a fashion, a joke, or this year’s flu.

Understanding influence and public opinion

formation

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

13

HISTORY OF SOCIAL NETWORK ANALYSIS

14

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Two original sources

1. Graph theory: Euler (1735) solved the Seven

Bridges of Konigsberg problem

2. Sociometry: quantitative method for

measuring social relationships. Developed by

J. L. Moreno (1970’s) in his studies of the

relationship between social structures and

psychological well-being.

Social network analysis • Started by H. White and R. Metton, University of Columbia.

(Harvard revolution) 1970’s

• Essential idea: people’s actions have to be related to their

attributes, but to really understand them you also need to

look at the networks that enable them to do something

Page 8: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

8

SEMINAL PAPERS, MODELS OF COMPLEX NETWORKS

Small-world networks (Watts & Strogatz, Nature, 1998),

Scale-free networks (Barabasi & Albert, Science, 1999)

Neither ordered nor

completely random

Nodes are highly

clustered yet path

length between them is

small

the degree distribution is “right-skewed” with a heavy tail

Most nodes have less-than-average degree, whilst a small fraction of hubs have a large number of connections

Described mathematically by a power-law

Cited by 27,237 Cited by 23,580

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

15

TYPES OF NETWORKS

(a) un-weighted,

undirected

(b) discrete vertex

and edge types,

undirected

(c) varying vertex

and edge

weights,

undirected

(d) Directed (also

called arcs)

16

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

From (Newman, 2003)

Page 9: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

9

GLOSSARY

Degree. The number of edges connected to a vertex. A directed graph has both an in-degree and an out-degree for each vertex, which are the numbers of in-coming and out-going edges respectively.

Component: The component to which a vertex belongs is that set of vertices that can be reached from it by paths running along edges of the graph.

Geodesic path: A geodesic path is the shortest path through the network from one vertex to another. Note that there may be and often is more than one geodesic path between two vertices.

Diameter: The diameter of a network is the length (in number of edges) of the longest geodesic path between any two vertices. 17

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Topology (Degree distribution)

• Gives an idea of the spread in the

number of links the nodes have

• P(k) is the probability that a randomly

selected node has k links

Distance

• Number of links that make up the

path between two points

• “Geodesic” = shortest path

MAIN GLOBAL PROPERTIES OF NETWORKS

Cohesion or Clustering

• Cliques in social network analysis

• Circles of friends in which every member knows each other

• Example: "6-degrees" of distance phenomenon

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

18

Page 10: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

10

DIAMETER AND SHORTEST PATH

Small world: only 6 hops separate any two people

in the world

How do we measure this property in a network?

Let dij be the shortest-path distance between nodes i

and j

19

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Diameter (longest

shortest path distance)

Average shortest path

distance

CLUSTERING COEFFICIENT OR

TRANSITIVITY

In social networks: a friend of a friend is also

frequently a friend

How do we measure the this property

To check whether “the friend of a friend is also

frequently a friend”, we use:

The transitivity or clustering coefficient, which basically

measures the probability that two of my friends are also

friends

20

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Metrics

• Global CC

• Local CC

Page 11: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

11

DEGREE DISTRIBUTION: RANDOM VS. SCALE-FREE

NETWORKS

Linked: The New Science of

Networks

by Albert-László Barabási

Perseus Publishing, April

2002, Hard cover, 229 pgs,

• Random network: like a national highway network (nodes: cities, links: highways)

• Scale-free network: like an air traffic system (nodes: airports, links: flights)

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

21

SCALE-FREE NETWORKS

The degree distribution of most real-world networks follows a power-law distribution:

fk = ck-α

Where α is a parameter whose value is in the range 2 < α < 3

“heavy-tail” distribution, implies existence of hubs (nodes with very high degree)

Called scale-free because power laws have the same functional form at all scales. Power low remain unchanged (other than multipicative factor) when rescaling independent var. k

22

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Examples:

• World Wide Web

links,

• Biological networks

• Social networks,

Page 12: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

12

PREFERENTIAL ATTACHMENT

“Rich get richer” dynamics

The more someone has, the more she is likely to have

Examples

the more friends you have, the easier it is to make

new ones

the more business a firm has, the easier it is to win

more

the more people there are at a restaurant, the more

who want to go

23

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

WHAT DO REAL NETWORKS LOOK LIKE?

A number of models have been proposed to study

complex networks

Real networks exhibit:

Small diameter: also present in the Erdos-Reny or

random model

High clustering coefficient: also present in the

Watts-Strogatz model

Power-law degree distribution: also present in the

Barabasi-Albert or preferential attachment model

24

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Page 13: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

13

CENTRALITY

Centrality is a node’s measure w.r.t. others

A central node is important and/or powerful

A central node has an influential position in the

network

A central node has an advantageous position in the

network

25

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

CENTRALITY MEASURES

There are various centrality measures. The key

ones are:

1. degree – This counts how many people are

connected to you.

2. closeness – If you are close to everyone, you have

a high closeness score.

3. betweenness – People who connect people who

are otherwise separate. If information goes

through you, you have a high betweenness score.

4. eigenvector – A person who is popular with the

popular kids has high eigenvector centrality.

Google’s page rank is an example.

26

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

Page 14: Gabriela Ochoa goc/...10/19/2015 2 Network ≡ Graph Graph refers to the mathematical abstraction, while network to the real world instantiation Graphs are just collections of “points”

10/19/2015

14

SOFTWARE PACKAGES

NodeXL, a plugin for Excel,

NetworkX for Python,

igraph for both Python and R

statnet for R

Jure Leskovec at Stanford Book and network

package for C.

27

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k

SUMMARY

Many real-world complex systems can be

represented as a network

Networks capture the connectivity pattern

Real-world networks have several common

structural characteristics

Network metrics (distance, topology, cohesion,

centrality)

What is next?

Seminar this week, Friday 23 Oct by Dr Gabriela Ochoa

Seminar Friday 13 No Dr by Paweł Widera ,Newcastle

University

28

Ga

brie

la O

choa

, goc@

cs.stir.ac.u

k