Statistical Latent Variable and Event Models for Network Data Padhraic Smyth Department of Computer Science University of California, Irvine January 7 th 2016 Workshop on Big Graphs: Theory and Practice UCSD


Statistical Latent Variable and Event Models for Network Data

Padhraic Smyth

Department of Computer Science

University of California, Irvine

January 7th, 2016

Workshop on Big Graphs: Theory and Practice

UCSD


Padhraic Smyth, January 2016: 2

Acknowledgements

Students and Colleagues

Chris Dubois, Jimmy Foulds, Arthur Asuncion, Carter Butts, Zach Butler

Funding


References

Multiplicative latent factor models for description and prediction of social networks

P. D. Hoff, Computational and Mathematical Organization Theory, 15(4), 2009

Dyadic data analysis with amen

P. D. Hoff, available online, June 2015

A relational event framework for social action

C. T. Butts, Sociological Methodology, 2008

A survey of statistical network models

A. Goldenberg, A. Zheng, S. Fienberg, E. Airoldi, Foundations and Trends in Machine Learning, 2009


Email Contact Network

Data from HP Labs


Goals

• Learn a predictive distribution over future events in the network

– Incorporate node and edge attributes

• Be able to answer queries such as

– What will the network look like at time t + k?

– How likely is it that node i will communicate with node j?

– How much influence does node i have on node j?

• Understand the dynamics of the network process


C. Butts, Science, 2009


Descriptive/Exploratory Analysis of Networks

Long history in social network analysis, complex systems, etc

– Degree distributions, power laws, scale-free networks

– Clustering effects

– Betweenness and centrality

Often focused on broad network properties

Very useful, but does not support inferential or predictive statements about specific nodes or edges


Statistical Network Modeling

Basic idea: hypothesize a (simple) generative model for the data given parameters, and then infer the parameters given observed data

• Learning

– Systematic methods for estimating network parameters

• Prediction/Querying

– reduces to computation of conditional probabilities and expectations

• Noise/Missing Data

– Systematic way to handle real-world noise

• Covariates

– Relatively straightforward to integrate “non-network” information


Modeling Approaches

• Static model

– Aggregate event data into a single network

– e.g., static model for binary edges

• Discrete time models

– Aggregate event data into temporal windows, e.g., per week

• Continuous-time models

– Model event rates directly

– e.g., stationary Poisson (simple)

– e.g., non-stationary Poisson (more complex)

• Sequences of dependent events

– Cascade models


Static Network Models


Network Notation

N actors (node set)

• Generally assume that set of actors is known and fixed

Edges between actors: adjacency matrix Y

y_ij = 1: an edge between actors i and j

y_ij real-valued or counts: indication of strength of relationship

Covariates/Attributes X

• e.g., for each actor, for each edge


Example of a Y matrix: counts of 200,000 email messages between 3,000 individuals over 3 months


Sidenote: Graphical Models

It is tempting to think of our N x N network as being related to a graphical model on N variables

However, in network modeling, the edges are viewed as the random variables, not the nodes

This hints at the complexity of the problem, i.e., O(N^2) edge variables and, for binary undirected edges, 2^(N(N-1)/2) possible graph realizations


Network Models via Regression

Mean effect, row effect, column effect
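The equation these labels annotated did not survive extraction. One form consistent with the three labels, and with the additive row and column effects model described in Hoff's amen paper, is sketched below. The symbols \mu, a_i, b_j and \epsilon_{ij} are assumed notation, not recovered from the slide.

```latex
y_{ij} = \mu + a_i + b_j + \epsilon_{ij}
% \mu            : mean effect (overall level of the relation)
% a_i            : row effect (e.g., how active actor i is as a sender)
% b_j            : column effect (e.g., how popular actor j is as a receiver)
% \epsilon_{ij}  : residual noise for the pair (i, j)
```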


Binary Undirected Edges


Likelihood

Note that edges are conditionally independent given parameters
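As a rough illustration of what conditional independence buys, the log-likelihood becomes a plain sum over pairs. The sketch below assumes a logistic link with one intercept `mu` and one effect `a[i]` per node; that parameterization and all names are assumptions for illustration, not taken from the slide.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def log_likelihood(Y, mu, a):
    """Log-likelihood of a symmetric 0/1 adjacency matrix Y (list of lists).

    Assumes logit P(y_ij = 1) = mu + a[i] + a[j]; given (mu, a) the edges are
    independent, so the log-likelihood is a sum over unordered pairs.
    """
    n = len(Y)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # visit each unordered pair once
            p = sigmoid(mu + a[i] + a[j])
            total += math.log(p) if Y[i][j] == 1 else math.log(1.0 - p)
    return total

Y = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
ll = log_likelihood(Y, mu=0.0, a=[0.0, 0.0, 0.0])
```

With all parameters at zero every pair has probability 0.5, so `ll` is three times log(0.5).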


Special Case: Erdos-Renyi Graph
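The slide's equation is missing, but the usual reading of this special case is that every pair shares one edge probability p. Under that assumption, a minimal sketch of sampling such a graph and recovering p by maximum likelihood could look like this (function names are illustrative):

```python
import random

def sample_erdos_renyi(n, p, rng):
    """Sample a symmetric 0/1 adjacency matrix with independent edge probability p."""
    Y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                Y[i][j] = Y[j][i] = 1
    return Y

def mle_edge_probability(Y):
    """Maximum-likelihood estimate of p: observed edges / possible pairs."""
    n = len(Y)
    edges = sum(Y[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)

rng = random.Random(0)
Y = sample_erdos_renyi(200, 0.1, rng)
p_hat = mle_edge_probability(Y)   # should land close to 0.1
```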


Likelihood

We can learn the θ’s using maximum likelihood or Bayesian methods, using a variety of techniques such as gradient methods, MCMC, variational approximations, etc.


Adding Node and Edge Covariates

Covariates

Weights

Example:


Adding Latent (Hidden) Variables

Hypothesize that the nodes are embedded in a latent (hidden) space

The probability of a link is higher if nodes are closer in this space

Given a set of observed links can we infer a set of “good locations”?

Old idea in social science, e.g., McFarland and Brown, “Social distance as metric…”, 1973

See also more recent word embedding methods


Latent Space Model

K-dimensional real-valued latent space vector for each node

Intuition:

• Embed nodes in a K-dimensional latent space, K much smaller than N

• Probability (or log-odds) of edge (i,j) decreases as i and j move farther apart in the latent space

Hoff, Raftery, Handcock, JASA, 2002
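The intuition above can be sketched in a few lines. This assumes the common distance form, log-odds = alpha - ||z_i - z_j||, with a single intercept `alpha`; the exact form on the slide was lost, so treat the names and the link as assumptions.

```python
import math

def edge_probability(z_i, z_j, alpha):
    """P(y_ij = 1) with log-odds alpha - ||z_i - z_j|| (Euclidean distance)."""
    dist = math.dist(z_i, z_j)
    return 1.0 / (1.0 + math.exp(-(alpha - dist)))

# Hypothetical 2-D latent positions for three nodes
z = {"A": (0.0, 0.0), "B": (0.5, 0.0), "C": (3.0, 4.0)}
p_near = edge_probability(z["A"], z["B"], alpha=1.0)   # distance 0.5
p_far = edge_probability(z["A"], z["C"], alpha=1.0)    # distance 5.0
```

Nodes A and B sit close together and get a much higher edge probability than A and C.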


Figure from Hoff, Raftery, Handcock, 2002


Additive Latent Interactions

This model implies transitivity:

e.g., if (A,B) close and if (B,C) close then (A,C) close (and has high probability)

…but some relations are not transitive, e.g., “conflict”


Multiplicative Latent Interactions

Hoff, 2009

K x K real-valued matrix (learned from the data)

Hoff (NIPS 2008) showed that for a diagonal W matrix (the latent eigenmodel) this model is a strict generalization of the distance model

For directed networks or rectangular matrices we can replace z_j with v_j, yielding links to matrix factorization
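A small sketch of the multiplicative interaction term z_i' W z_j, written with plain Python lists. The values of W and z are made up; the point is only that a negative diagonal entry can make nodes with similar latent vectors less likely to link, which a pure distance model cannot express.

```python
def bilinear(z_i, W, z_j):
    """Return z_i' W z_j for plain Python lists."""
    return sum(z_i[a] * W[a][b] * z_j[b]
               for a in range(len(z_i)) for b in range(len(z_j)))

W = [[1.0, 0.0],
     [0.0, -2.0]]    # diagonal W, as in the latent eigenmodel case
z_a = [1.0, 1.0]
z_b = [1.0, 1.0]     # identical to z_a
score = bilinear(z_a, W, z_b)   # 1.0 - 2.0
```

For a directed network the same function applies with a separate receiver vector in place of `z_b`.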

Building Blocks for Network Modeling

e.g., g = log(p/(1-p))

Network density

Row and column effects

Edge covariates and regression weights

K-dimensional latent vector per node

Similarity function on latent vectors

See also P. Hoff, Dyadic data analysis with amen, arXiv, 2015


Stochastic Block Model

Each node assumed to belong to 1 of K “stochastically equivalent” blocks

z vectors are K-dimensional indicators, e.g., z = [0, 0, 1, 0]

Within-block and between-block edge probabilities at block level, K x K matrix W

Nowicki and Snijders, 2001

(Figure from Goldenberg et al, 2010)


Example:

Interaction:

Nowicki and Snijders, 2001
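The example and interaction formulas on this slide were lost in extraction. Under the description above, with one-hot indicators z, the interaction z_i' W z_j simply selects the block-pair entry of W. A minimal sketch with made-up numbers:

```python
import random

def sbm_edge_probability(z_i, z_j, W):
    """z_i' W z_j for one-hot z picks out the block-pair entry of W."""
    return W[z_i.index(1)][z_j.index(1)]

def sample_sbm(Z, W, rng):
    """Sample a symmetric 0/1 adjacency matrix given one-hot block indicators Z."""
    n = len(Z)
    Y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < sbm_edge_probability(Z[i], Z[j], W):
                Y[i][j] = Y[j][i] = 1
    return Y

W = [[0.9, 0.1],
     [0.1, 0.8]]                       # within-block probabilities on the diagonal
Z = [[1, 0], [1, 0], [0, 1], [0, 1]]   # two nodes in each of K = 2 blocks
Y = sample_sbm(Z, W, random.Random(1))
```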


Binary Relational Feature Model

Each node can “turn on” any subset of K binary features (latent)

z vectors are K-dimensional binary vectors, e.g., z = [0, 0, 1, 1]

K x K weight matrix W captures feature interactions

Miller, Jordan, Griffiths, NIPS 2009


Binary Relational Feature Model

Hidden Features

Actors

Presence of edge between actor i and actor j is (e.g.) a logistic function of a weighted sum of features they have in common

Estimation: based on MCMC or variational EM

Miller, Jordan, Griffiths, NIPS 2009
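A hedged sketch of the link function described here: the score is z_i' W z_j, which for 0/1 vectors is the sum of W entries over every pair of features that are switched on, passed through a logistic function. The weights below are invented for illustration.

```python
import math

def link_probability(z_i, z_j, W):
    """Logistic function of z_i' W z_j, where z_i and z_j are 0/1 feature vectors."""
    score = sum(W[a][b]
                for a, on_a in enumerate(z_i) if on_a
                for b, on_b in enumerate(z_j) if on_b)
    return 1.0 / (1.0 + math.exp(-score))

W = [[2.0, 0.0, 0.0],
     [0.0, 1.0, -3.0],
     [0.0, -3.0, 0.5]]
p_shared = link_probability([1, 1, 0], [1, 1, 0], W)   # share features 0 and 1: score 3
p_clash = link_probability([0, 1, 0], [0, 0, 1], W)    # features 1 and 2 interact: score -3
```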


Binary Relational Feature Model

Example:

Interaction:

Miller, Jordan, Griffiths, NIPS 2009

(Originally proposed as an infinite-dimensional non-parametric model)


Predictions on NIPS Coauthorship Data

From Miller, Griffiths, Jordan, 2009


Other Models

Mixed membership stochastic blockmodel (MMSB), Airoldi et al, 2008

Each node: a probability vector z_i over K possible groups

W is a matrix of Bernoulli probabilities

Relational topic model, Chang and Blei 2009

For modeling linked documents, e.g., via citations

Each node = document = K-dimensional topic probability vector

Various possible combination functions to reflect topic similarity


General Formulation

e.g., g = log(p/(1-p))

Network density

Row and column effects

Edge covariates and regression weights

K-dimensional latent vector per node

Similarity function on latent vectors
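The equation that these labels annotate is missing from the transcript. A form that matches the labels one for one might be written as follows; every symbol here is assumed notation chosen to line up with the labels, not a recovered formula.

```latex
g\big(\mathbb{E}[y_{ij}]\big) = \mu + a_i + b_j + \beta^{\top} x_{ij} + f(z_i, z_j)
% g            : link function, e.g. g(p) = \log\frac{p}{1-p} for binary edges
% \mu          : network density (intercept)
% a_i, b_j     : row and column effects
% x_{ij}       : edge covariates, with regression weights \beta
% z_i          : K-dimensional latent vector for node i
% f(z_i, z_j)  : similarity function on latent vectors,
%                e.g. -\lVert z_i - z_j \rVert or z_i^{\top} W z_j
```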


Scalability

• The O(N^2) term in the likelihood is problematic for scalability

• However, there is hope

– In most real-world social networks the number of edges often scales as O(N), not O(N^2)

– But the number of non-edges still scales as O(N^2)

• This suggests factoring the likelihood into 2 pieces

– A product over edges, with O(N) terms

– A product over non-edges, with O(N^2) terms that we approximate with O(N) terms

– This idea has been discovered (and rediscovered) several times


Approximating the Log-Likelihood

Can approximate this term with O(N) randomly-sampled non-edges

See Raftery et al, 2012, J. Computational and Graphical Statistics

This idea can also be combined with stochastic gradient methods
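One simple way to realize this approximation is uniform subsampling of non-edges with rescaling, sketched below. This is an illustrative version only; the sampling scheme in Raftery et al. may differ, and `prob` stands in for whatever edge-probability model is being fitted.

```python
import math
import random

def approx_log_likelihood(edges, n, prob, num_samples, rng):
    """Exact sum over edges plus a rescaled sum over randomly sampled non-edges.

    edges: set of (i, j) pairs with i < j; prob(i, j): model edge probability.
    """
    ll_edges = sum(math.log(prob(i, j)) for i, j in edges)
    num_non_edges = n * (n - 1) // 2 - len(edges)
    sampled = 0
    ll_non = 0.0
    while sampled < num_samples:
        i, j = sorted(rng.sample(range(n), 2))
        if (i, j) in edges:
            continue  # rejection step: keep only non-edges
        ll_non += math.log(1.0 - prob(i, j))
        sampled += 1
    # Scale the sampled average up to the full count of non-edges
    return ll_edges + (num_non_edges / num_samples) * ll_non

edges = {(0, 1), (2, 3)}
approx = approx_log_likelihood(edges, n=50, prob=lambda i, j: 0.1,
                               num_samples=100, rng=random.Random(0))
```

With a constant edge probability every non-edge term is identical, so the estimate matches the exact log-likelihood; with a real model it is an unbiased but noisy estimate.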


Stochastic Variational Inference: a-MMSB model

From Gopalan et al, 2012

Red: stochastic gradient with mini-batch

Blue: conventional gradient batch algorithm


Variations and Extensions

• Sender and receiver effects

– Latent vectors for sender and receiver roles can be different

• Rectangular matrices, bipartite graphs

– rows and columns each get their own latent vectors

• Multi-way arrays and tensors

• Bayesian estimation

– Fully Bayesian methods: infer posterior locations in latent space

– MAP and regularized variations: enforce sparsity in solutions

• Non-linear “deep” models

– Could incorporate non-linearities in various ways


Dynamic Networks: Adding Time


Networks over Time

• Many network problems are dynamic rather than static

– e.g., social relationships are changing over time

– instantaneous communication events (emails, phone calls)

• Edges, nodes, and covariates may all be evolving over time

– We will assume node set is fixed and edges and covariates may change

– Systematic temporal effects often important (TOD, DOW, seasonality)

• Different ways to define networks over time

– Snapshots at time t

– Aggregation over time windows

– Continuous time models


Discrete-Time Models

Y_t represents the network at discrete time t

Data D = {Y_1, ..., Y_t, ..., Y_T}

Example

actors = students in a school

Y_t = friendships between students measured in month t, t = 1, ..., 12

Interest is often in network dynamics and evolution

e.g., Markov models for P(Y_{t+1} | Y_t)

(See work of Tom Snijders, Eric Xing, and others)


Figure from Carter Butts


General Formulation

In principle we can add time-dependence to any or all terms

One approach is to make the z’s time-dependent

i.e., allow the latent features of each actor to change over time

Example: linear Gaussian dynamics in z-space

- Sarkar and Moore (2005) for actors’ latent-space positions

- Fu, Song, and Xing (2009) for actors’ mixed membership vectors
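As a toy illustration of linear Gaussian dynamics in z-space, the sketch below moves each actor's latent position by an independent Gaussian step at every time point (a random walk). The step size `sigma` and the random-walk form are assumptions; the cited papers use richer transition and observation models.

```python
import random

def simulate_latent_paths(z0, num_steps, sigma, rng):
    """Gaussian random walk z_t = z_{t-1} + noise, applied to every actor."""
    paths = [list(z0)]
    for _ in range(num_steps):
        prev = paths[-1]
        paths.append([tuple(x + rng.gauss(0.0, sigma) for x in z) for z in prev])
    return paths

z0 = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]   # three actors in a 2-D latent space
paths = simulate_latent_paths(z0, num_steps=12, sigma=0.1, rng=random.Random(0))
# paths[t][i] is actor i's latent position at time t
```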


Dynamic Relational Binary Feature Model

Recall for the static version:

• z_i = k-dimensional binary vector, e.g., (1, 0, 1, 0, 1)

• f(z_i, z_j) = z_i' W z_j, where W is a k x k matrix

• Common set of k features across all actors

Dynamic version (Dynamic Relational Features)

• Assume discrete time

• The kth feature for actor i, z_ik(t), is a binary hidden Markov process

• Features can turn on, persist, or turn off at each time step

• For the infinite version, new features can be born over time

• Inference via MCMC: tricky, but works

Foulds, Asuncion, DuBois, Butts, Smyth 2011
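The per-feature dynamics can be pictured as a two-state Markov chain. The sketch below simulates only that prior process for a finite number of features; the transition probabilities `p_on` and `p_off` are hypothetical names, and nothing here attempts the MCMC inference or the infinite version.

```python
import random

def simulate_feature_chain(num_steps, p_on, p_off, rng, start=0):
    """Two-state Markov chain: an off feature turns on with probability p_on,
    an on feature turns off with probability p_off, otherwise it persists."""
    state, chain = start, [start]
    for _ in range(num_steps):
        if state == 0:
            state = 1 if rng.random() < p_on else 0
        else:
            state = 0 if rng.random() < p_off else 1
        chain.append(state)
    return chain

rng = random.Random(0)
# z[i][k] is the on/off history of feature k for actor i
z = [[simulate_feature_chain(10, p_on=0.2, p_off=0.1, rng=rng) for k in range(3)]
     for i in range(4)]
```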


Hidden Features

Actors

Time

Presence of edge i,j at time t depends on interaction of actor i’s and j’s feature vectors at that time t


Example of DRIFT Predictions on Enron


Continuous-Time Data and Models

Relational events: <i, j, t>

y_t is an edge between some pair i and j at time t

Birth-death edges: each y_t has start and end times

Instantaneous edges: each y_t is (effectively) instantaneous

• Data D = {y_1, ..., y_t, ..., y_T}

In a certain sense there is no graph!

Example

actors = students in a school

y_t = text message between 2 students at time t

Interest is often in rates and patterns of communication

e.g., Poisson rates for y_ij given network history up to time t


Multinomial Models for Relational Events

• Let λ_ij be the rates of Poisson processes for each pair of nodes in a network

• Assume for simplicity that these processes are conditionally independent given model parameters

• We can decompose the network process into

– A global rate λ, which generates events globally

– A choice process: given an event, which pair generated it
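The decomposition relies on the superposition property of independent Poisson processes: the merged stream has rate equal to the sum of the pair rates, and each event belongs to pair (i, j) with probability equal to that pair's rate divided by the total. A minimal simulation sketch with made-up constant rates:

```python
import random

def sample_events(rates, horizon, rng):
    """Simulate (time, sender, receiver) events from independent Poisson processes.

    rates: dict mapping (i, j) -> rate for that pair. Waiting times are
    exponential with the global rate; each event is then assigned to a pair
    with probability rate / global rate (the multinomial choice step).
    """
    pairs = list(rates)
    weights = [rates[p] for p in pairs]
    total_rate = sum(weights)
    t, events = 0.0, []
    while True:
        t += rng.expovariate(total_rate)
        if t > horizon:
            return events
        i, j = rng.choices(pairs, weights=weights)[0]
        events.append((t, i, j))

rates = {(0, 1): 2.0, (1, 0): 1.0, (0, 2): 0.5}
events = sample_events(rates, horizon=100.0, rng=random.Random(0))
```

Here the global rate is 3.5, so roughly 350 events are expected over the horizon, with about 4/7 of them on the pair (0, 1).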


Marginal Product Mixture Model

DuBois and Smyth, 2010

Multinomial over N^2 possible edges

Mixture over K unobserved groups

Distribution over senders for group k

Distribution over receivers for group k

Marginal probability of group k

Edge events (rather than nodes) belong to latent groups (unlike MMSB)

Straightforward to learn via EM or collapsed Gibbs sampling
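Reading the labels on this slide together, the probability of an event on edge (i, j) appears to be a mixture over groups of a sender probability times a receiver probability. The sketch below writes that out; the names `pi`, `theta` and `phi` and all numbers are assumptions for illustration.

```python
def edge_probability(i, j, pi, theta, phi):
    """P(sender = i, receiver = j) = sum_k pi[k] * theta[k][i] * phi[k][j]."""
    return sum(pi[k] * theta[k][i] * phi[k][j] for k in range(len(pi)))

pi = [0.6, 0.4]                  # marginal probability of each group
theta = [[0.7, 0.2, 0.1],        # theta[k][i]: distribution over senders for group k
         [0.1, 0.1, 0.8]]
phi = [[0.1, 0.8, 0.1],          # phi[k][j]: distribution over receivers for group k
       [0.5, 0.4, 0.1]]

n = 3
total = sum(edge_probability(i, j, pi, theta, phi)
            for i in range(n) for j in range(n))
```

Because each factor is a proper distribution, the N^2 edge probabilities sum to one, while the model needs only on the order of K x N parameters.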


Likelihood

DuBois and Smyth, 2010

Product over events

Product over pairs with non-zero counts

For large sparse networks, number of non-zero pairs << N^2

Similar to use of multinomial versus Bernoulli models for text


Application to Email Data: 200,000 email messages among 3,000 individuals (data from Eckmann, Moses, Sergi, 2004)

Most likely Edge Assignments by Group

Figures from DuBois and Smyth, 2010


International Relations Data

40,000 events

2,700 actors

171 action types

(King, 2003)


Prediction and Evaluation

• Use future data to evaluate predictive power and compare models

– e.g., predict network at time t+1 given network up to time t

• Metrics

– Log score = log probability of events that actually occurred

– Brier/MSE style scores

– Ranking/ROC scores
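The first two metrics are easy to state in code. The sketch below scores a set of predicted edge probabilities against 0/1 outcomes; the numbers are invented for illustration.

```python
import math

def log_score(probs, outcomes):
    """Sum of log predicted probabilities of what actually happened (higher is better)."""
    return sum(math.log(p if y == 1 else 1.0 - p) for p, y in zip(probs, outcomes))

def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

probs = [0.9, 0.2, 0.6, 0.1]     # predicted edge probabilities at time t + 1
outcomes = [1, 0, 0, 0]          # edges actually observed at time t + 1
ls = log_score(probs, outcomes)
bs = brier_score(probs, outcomes)
```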


Simple Baseline for Comparison

• We could predict the likelihood of i and j communicating based directly on i and j’s history

– Multinomial with O(N^2) entries

– Can use smoothing to combat sparsity

• Problems

– Data can be extremely sparse for large N – smoothing is non-informative, and does not “borrow strength” from the graph

• Nonetheless this is a useful baseline when evaluating predictions

– Historically, few papers evaluate models predictively

– Even fewer compare their models to simple baselines
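A baseline of this kind can be sketched as a smoothed multinomial over ordered pairs. Additive smoothing with a single constant `alpha` is an assumption here; the talk does not say which smoothing scheme was used.

```python
from collections import Counter

def smoothed_pair_probabilities(events, n, alpha=1.0):
    """Additive smoothing over all n * n ordered pairs.

    events: list of (sender, receiver) pairs observed in the training window.
    Returns a function giving the predictive probability of any pair.
    """
    counts = Counter(events)
    denom = len(events) + alpha * n * n
    return lambda i, j: (counts[(i, j)] + alpha) / denom

history = [(0, 1), (0, 1), (1, 0), (2, 0)]
prob = smoothed_pair_probabilities(history, n=3, alpha=0.5)
total = sum(prob(i, j) for i in range(3) for j in range(3))
```

Every unseen pair receives the same probability, which is the sense in which the smoothing does not borrow strength from the graph.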


From DuBois and Smyth, 2010


Relational Event Model (Butts, 2008)

• Time-varying Poisson rate for edge i,j, built from

– Base rate

– Sender and receiver effects

– p-dim vector of regression parameters

– p-dim vector of historical statistics on edge i,j

• Edge rates are time-varying functions of historical features

• Results in a piecewise constant (between events) Poisson process

• Features can include conversation effects, recency, persistence, etc.
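
The slide's equation did not survive extraction, so the sketch below assumes a log-linear form, a common choice for relational event models: the log rate is the sum of the base rate, a sender effect, a receiver effect, and an inner product of the regression parameters with the historical statistics. All argument names are hypothetical.

```python
import math

def rem_rate(i, j, base_rate, sender_effect, receiver_effect, beta, stats):
    """Poisson rate for edge i -> j under an assumed log-linear form.

    base_rate:       scalar intercept on the log scale
    sender_effect:   per-actor sender terms, indexed by i
    receiver_effect: per-actor receiver terms, indexed by j
    beta:            p-dim vector of regression parameters
    stats:           p-dim vector of historical statistics for (i, j) at time t
    """
    linear = base_rate + sender_effect[i] + receiver_effect[j]
    linear += sum(b * s for b, s in zip(beta, stats))
    return math.exp(linear)
```

Because the statistics only change when a new event arrives, the rate returned here stays constant between events.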


Parameter Estimation

• Likelihood includes terms for all events that occurred and all events that did not occur, for all inter-event times

– Computation of likelihood is O(T N^2), T = number of events

– Some computational tricks possible to improve scalability

– See Vu et al. (ICML 2011, NIPS 2011) for extensions to large social networks and citation networks

• Can use point estimates (optimization) or Bayesian inference (MCMC)
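
The O(T N^2) cost is visible in a direct implementation of the log-likelihood for a piecewise constant Poisson process. This is a naive sketch, assuming rates stay fixed between events and that `rate_fn` (a hypothetical callback) returns the rate of an edge given the history so far; it ignores the interval after the last event.

```python
import math

def rem_log_likelihood(events, n_actors, rate_fn, t0=0.0):
    """Log-likelihood of a time-ordered event sequence.

    events:  list of (time, sender, receiver), sorted by time
    rate_fn: rate_fn(i, j, history) -> positive rate for edge i -> j,
             where history holds the events strictly before the current one
    """
    loglik = 0.0
    prev_time = t0
    history = []
    for time, sender, receiver in events:
        # Events that did not occur: sum the rates of all N(N-1) edges.
        total_rate = 0.0
        for i in range(n_actors):
            for j in range(n_actors):
                if i != j:
                    total_rate += rate_fn(i, j, history)
        # The event that did occur contributes its log rate.
        loglik += math.log(rate_fn(sender, receiver, history))
        # No event occurred anywhere during the waiting time.
        loglik -= (time - prev_time) * total_rate
        history.append((time, sender, receiver))
        prev_time = time
    return loglik
```

The double loop over actors runs once per event, which gives the O(T N^2) cost; scalable variants avoid recomputing rates for edges whose statistics did not change.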


Applications?

• Modeling classroom interactions in education [DuBois, Butts, McFarland, Smyth, J Math Psych, 2013]

• Understanding and predicting citation patterns among documents [Vu et al., NIPS 2011, ICML 2011; Foulds and Smyth, EMNLP 2013]

• Modeling communication patterns among individuals [DuBois, Smyth, KDD 2010; Foulds et al., AISTATS 2011]

• Clustering individuals in email networks over time [Navaroli, DuBois, Smyth, MLJ, 2013]



Modeling Cascades

• Given a structural network with binary directed/undirected edges

• A cascade is a sequence of “node infections” (may have time-stamps)

– E.g., a post that spreads on a network such as Facebook or LinkedIn

• We observe a set of cascades, e.g.,

{A, B, E}, {B, A, D, F}, {A, B, C, E, F}, …

• Given the cascades, make inferences about the “infection process”

[Figure: example network with nodes A, B, C, D, E, F]


Prior Work

• Ideas based on epidemics in networks

– Analyze how infection spreads as a function of network structure

• e.g., work by Kempe, Kleinberg, Newman, and many others

– Typically assume a single homogeneous infection rate b

– Typically do not look at learning from data

• Statistical models (more recent)

– Define a generative model (i.e., likelihood) for cascades on a network

– Example

• Assume cascades are independent

• Assume heterogeneous infection rates for different edges

• Define a probabilistic model of how infection spreads to the next node

– Learn parameters, e.g., a matrix of infection rates b

(see work by Manuel Gomez-Rodriguez and colleagues)
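
A generative model of this kind can be sketched as a simulator. The version below assumes a discrete-round independent cascade process in which each newly infected node gets one chance to infect each uninfected neighbor, with a per-edge probability stored in `b`. This is an illustrative simplification, not the specific continuous-time model from the cited work.

```python
import random

def simulate_cascade(b, seed_node, rng):
    """Simulate one cascade and return nodes in order of infection.

    b:         dict mapping a directed edge (u, v) to its infection
               probability (heterogeneous across edges)
    seed_node: the initially infected node
    rng:       a random.Random instance, passed in for reproducibility
    """
    infected = [seed_node]
    seen = {seed_node}
    frontier = [seed_node]
    while frontier:
        next_frontier = []
        for u in frontier:
            for (src, dst), p in b.items():
                # Each newly infected node tries each outgoing edge once.
                if src == u and dst not in seen and rng.random() < p:
                    seen.add(dst)
                    infected.append(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return infected
```

Learning then runs in the opposite direction: given many observed cascades, choose the entries of `b` that make those cascades most probable under the model.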


Summary

• Static networks

– Statistical models can be built up from basic building blocks

– Latent representations (“node embeddings”) can be broadly useful

• Dynamic networks

– Modeling networks over time can be more straightforward than the static case

– More natural representation of the underlying data

– Notion of prediction is clearer

– Can build these models using the same building blocks as for static networks

• Scalability of the learning algorithms is a general issue, but there are promising approaches emerging