word2vec, node2vec, graph2vec, X2vec:
Towards a Theory of Vector Embeddings of Structured Data

Martin Grohe
RWTH Aachen

X2vec

Vector Embeddings
Represent objects from an arbitrary class X by vectors in some vector space H:

    space X of objects  →  vector space H

Example
Word embeddings in natural language processing: word2vec


Motivation and Ideas

How?
- Embed objects into a vector space (Hilbert space) in such a way that semantic relationships between objects are reflected in geometric relationships of their images.
- Most importantly: "semantic distance" between objects is correlated to geometric distance.

Why?
- Standard machine learning algorithms operate on vector representations of data.
- Embeddings enable a quantitative analysis using mathematical techniques ranging from linear algebra to functional analysis.


Node Embeddings

- The objects that are embedded are the vertices of a graph.
- The graph can be directed, labelled, weighted, et cetera.
- Typical applications: node classification in social networks, link prediction in knowledge graphs.


Graph Embeddings

- The objects that are embedded are graphs.
- The graphs can be directed, labelled, weighted, et cetera.
- Typical applications: classification and regression on chemical molecules.


Node Embeddings vs Graph Embeddings

Commonality
- Same structure.
- A graph embedding applied to the neighbourhoods of nodes yields a node embedding.

Important differences
- The nodes of a single graph have explicit relation(s) between them and a clearly defined distance, but different graphs are only "semantically" related.
- Node embeddings can (often) be trained for fixed graphs, but we rarely have to deal with a finite set of graphs fixed in advance.


Towards a Theory of Vector Embeddings

- Vector embeddings allow for a quantitative analysis of discrete data using standard machine learning algorithms and mathematical techniques from linear algebra to functional analysis.
- Vector embeddings make it possible to integrate data from different sources into a common framework.
- Node and graph embeddings have been studied in machine learning for a long time. The existing work focusses on graphs and binary structures; not much is known about embedding arbitrary relations.
- Except for the mathematical foundations, there is little theory. Vector embeddings deserve a deeper study using the tools and methods of TCS (analysis of algorithms and complexity, logic, structural graph theory, et cetera).


Key Questions

Algorithms and Complexity
Can we compute the embedding efficiently? Can we invert it efficiently (i.e., reconstruct graphs from their images)?

Logic and Semantics
What is the semantic meaning of the embeddings? Which properties are preserved? Can we answer queries on the embedded data?

Comparison
How do different embeddings compare? Do they induce the same metric, the same topology?


Embedding Techniques

Matrix Factorisation

A classical approach to defining node embeddings based on the graph metric (a sketch follows below).

Graph G = (V, E):
- Define a similarity matrix S ∈ R^{V×V}, e.g. by S_{vw} := exp(−dist_G(v, w)).
- Compute a low-dimensional factorisation, e.g. a matrix A ∈ R^{V×k} minimising ‖AAᵀ − S‖.
- The rows a_v of the matrix A define an embedding V → R^k. The inner product ⟨a_v, a_w⟩ approximates the similarity S_{vw}.
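As a concrete illustration of this recipe, here is a minimal NumPy sketch. All identifiers are our own, and taking the top-k eigenpairs of S is just one standard way to minimise ‖AAᵀ − S‖ in the Frobenius norm (exact for the positive semidefinite part of S):

```python
import numpy as np

def node_embedding(adj: np.ndarray, k: int) -> np.ndarray:
    """Rank-k node embedding: rows a_v of A with A @ A.T ~ S = exp(-dist_G)."""
    n = adj.shape[0]
    # All-pairs shortest paths (Floyd-Warshall) on the unweighted graph.
    dist = np.where(adj > 0, 1.0, np.inf)
    np.fill_diagonal(dist, 0.0)
    for m in range(n):
        dist = np.minimum(dist, dist[:, [m]] + dist[[m], :])
    S = np.exp(-dist)                                  # similarity matrix S_vw
    # Top-k eigenpairs of the symmetric S; eigh returns ascending eigenvalues.
    vals, vecs = np.linalg.eigh(S)
    idx = np.argsort(vals)[::-1][:k]
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

adj = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
A = node_embedding(adj, k=2)                           # a 4-cycle, k = 2
print(np.round(A @ A.T, 2))                            # approximates S
```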


Random Walks

DeepWalk and node2vec are node embedding algorithms based on the following idea (a sketch follows below):
- Compile a list of short random walks in the graph.
- Treat them like sentences in natural language, with the nodes of the graph as words, and use word embedding techniques (the skip-gram technique of word2vec) to compute the embedding.

(Perozzi, Al-Rfou, Skiena 2014; Grover, Leskovec 2016)
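A minimal DeepWalk-style sketch in Python, assuming the gensim library is available for the word2vec skip-gram step; walk length, walk count, and dimension are arbitrary illustrative choices (node2vec would additionally bias the walks):

```python
import random
from gensim.models import Word2Vec  # assumed dependency for skip-gram word2vec

def random_walks(adj: dict, num_walks: int = 10, walk_len: int = 8) -> list:
    """Uniform random walks; each walk is a 'sentence' of node ids (as strings)."""
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk, v = [str(start)], start
            for _ in range(walk_len - 1):
                v = random.choice(adj[v])          # uniform neighbour step
                walk.append(str(v))
            walks.append(walk)
    return walks

# Toy graph as an adjacency-list dict.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = random_walks(adj)
# sg=1 selects the skip-gram architecture, as in DeepWalk.
model = Word2Vec(sentences=walks, vector_size=16, window=3, min_count=0, sg=1)
print(model.wv["0"])   # embedding vector of node 0
```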


Graph Neural Networks

Graph Neural Networks (GNNs) are a deep learning framework for graphs (a one-layer sketch follows below):
- learned message passing network on the input graph
- each node v has a vector-valued state x_v^{(t)} ∈ R^k at each point t in time
- states are updated based on the states of the neighbours, e.g.

      Aggregate:  a_v^{(t+1)} ← Σ_{w ∈ N(v)} W_agg · x_w^{(t)},
      Update:     x_v^{(t+1)} ← σ( W_up · ( x_v^{(t)} ; a_v^{(t+1)} ) )

  with learned parameter matrices W_agg, W_up and a nonlinear "activation function" σ (here (x ; a) denotes the concatenation of the two vectors)
- iterated for a fixed number s of rounds; v ↦ x_v^{(s)} is the resulting node embedding
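A minimal NumPy sketch of the aggregate/update step above. The choice of ReLU for σ and the random weight matrices (stand-ins for learned parameters) are illustrative assumptions:

```python
import numpy as np

def gnn_layer(adj, X, W_agg, W_up):
    """One message-passing round: a_v = sum_{w in N(v)} W_agg x_w,
    then x_v' = relu(W_up (x_v ; a_v))."""
    A = adj @ X @ W_agg.T                   # row v holds a_v, summed over N(v)
    Z = np.concatenate([X, A], axis=1)      # (x_v ; a_v) for every node
    return np.maximum(Z @ W_up.T, 0.0)      # ReLU activation

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], float)  # path 1 - 0 - 2
k = 4
X = rng.normal(size=(3, k))                 # initial states x_v^{(0)}
W_agg = rng.normal(size=(k, k))
W_up = rng.normal(size=(k, 2 * k))
for _ in range(2):                          # s = 2 rounds
    X = gnn_layer(adj, X, W_agg, W_up)
print(X)                                    # rows are the embeddings x_v^{(s)}
```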


Graph Neural Networks (cont'd)

- A GNN encodes the embedding function and not just a list of vectors for the nodes of a fixed graph (as the previous techniques do).
- Once trained, a GNN can still be applied if the graph changes, or even to an entirely different graph.
- We say that GNNs provide an inductive node-embedding method, whereas matrix factorisation techniques and random walk techniques are transductive.
- By aggregating the vectors of all nodes, we can also use GNNs to compute graph embeddings (see the sketch below).

(Hamilton, Ying, Leskovec 2017)
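To make the inductive point and the graph-level readout concrete, the following toy sketch (not the authors' pipeline; sum-pooling is one common aggregation choice) applies one fixed set of "trained" weights to two different graphs:

```python
import numpy as np

def gnn_layer(adj, X, W_agg, W_up):
    A = adj @ X @ W_agg.T                       # neighbour aggregation
    return np.maximum(np.concatenate([X, A], axis=1) @ W_up.T, 0.0)

def graph_embedding(adj, X, W_agg, W_up, rounds=2):
    """Run the GNN and sum-pool the final node states into one vector."""
    for _ in range(rounds):
        X = gnn_layer(adj, X, W_agg, W_up)
    return X.sum(axis=0)                        # aggregate over all nodes

rng = np.random.default_rng(1)
k = 4
W_agg = rng.normal(size=(k, k))                 # one fixed parameter set
W_up = rng.normal(size=(k, 2 * k))
X0 = np.ones((3, k))                            # constant initial states
triangle = np.ones((3, 3)) - np.eye(3)
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
# The same parameters embed two different graphs (inductive method).
print(graph_embedding(triangle, X0, W_agg, W_up))
print(graph_embedding(path, X0, W_agg, W_up))
```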


Graph Kernels

Graph kernels are the inner product functions of graph embeddings, that is, functions

    K(G, H) = ⟨Φ(G), Φ(H)⟩

for some graph embedding Φ.
- Kernel trick: compute and use K(G, H) without ever computing Φ(G), Φ(H).
- Graph kernels are widely used for ML problems on graphs (see the sketch below).
- Common graph kernels are based on comparing random walks, counting small subgraphs, and the Weisfeiler-Leman algorithm.

(Kashima et al. 2003; Gärtner et al. 2003; Shervashidze et al. 2009; ...)
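For instance, a precomputed kernel matrix with entries K(G_i, G_j) can be fed directly to a kernel method. The sketch below does this with scikit-learn's SVC (an assumed dependency), using a trivial stand-in embedding Φ in place of a real graph kernel:

```python
import numpy as np
from sklearn.svm import SVC   # assumed dependency; any kernel method works

def phi(adj: np.ndarray) -> np.ndarray:
    """Stand-in embedding: (number of vertices, number of edges)."""
    return np.array([adj.shape[0], adj.sum() / 2.0])

graphs = [np.ones((3, 3)) - np.eye(3),                          # triangle
          np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float),   # path
          np.zeros((3, 3))]                                     # empty graph
labels = [1, 0, 0]

feats = [phi(g) for g in graphs]
K = np.array([[f @ h for h in feats] for f in feats])  # K(G,H) = <Phi(G),Phi(H)>

clf = SVC(kernel="precomputed").fit(K, labels)
print(clf.predict(K))   # predictions from the kernel matrix alone
```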


The Weisfeiler-Leman Algorithm

A Simple Combinatorial Algorithm

The 1-dimensional Weisfeiler-Leman algorithm (1-WL), a.k.a. colour refinement, iteratively computes a colouring of the vertices of a graph G (see the sketch below).

Initialisation: All vertices get the same colour.
Refinement step: Two nodes v, w get different colours if there is some colour c such that v and w have different numbers of neighbours of colour c.

The refinement step is repeated until the colouring stays stable.

Run 1-WL (demo by Erkal Selman).
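A compact Python implementation of colour refinement as just described; colours are represented by integers and refined via sorted neighbour-colour multisets (all identifiers are our own):

```python
def colour_refinement(adj: dict) -> dict:
    """1-WL / colour refinement. adj maps each vertex to its neighbour list."""
    colours = {v: 0 for v in adj}                 # initially all vertices equal
    while True:
        # New colour of v: old colour plus the multiset of neighbour colours.
        sig = {v: (colours[v], tuple(sorted(colours[w] for w in adj[v])))
               for v in adj}
        relabel = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {v: relabel[sig[v]] for v in adj}
        if len(set(refined.values())) == len(set(colours.values())):
            return refined                        # colouring is stable
        colours = refined

# Example: a path on 4 vertices; endpoints and inner vertices get split.
print(colour_refinement({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}))
```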


Example

[Figure: a graph and its 1-WL colourings — the initial graph, the colouring after round 1, the colouring after round 2, and the stable colouring after round 3.]


Complexity

Theorem (Cardon, Crochemore 1982; Paige, Tarjan 1987)
The stable colouring of a given graph with n vertices and m edges can be computed in time O((n + m) log n).

This is optimal for a fairly general class of partitioning algorithms (Berkholz, Bonsma, G. 2017).


1-WL as an Isomorphism Test

1-WL distinguishes two graphs G, H if their colour histograms differ, that is, some colour appears a different number of times in G and H.

Thus 1-WL can be used as an incomplete isomorphism test (see the sketch below).
- It works on almost all graphs (Babai, Erdős, Selkow 1980).
- It fails on some very simple graphs; for example, it cannot distinguish a 6-cycle from the disjoint union of two triangles, since both are 2-regular.
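A standard way to implement this test is to run colour refinement on the disjoint union of G and H (so that colour names are shared) and compare the colour histograms. A minimal sketch, restating the refinement loop so the snippet is self-contained:

```python
from collections import Counter

def refine(adj):
    colours = {v: 0 for v in adj}
    while True:
        sig = {v: (colours[v], tuple(sorted(colours[w] for w in adj[v])))
               for v in adj}
        relabel = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {v: relabel[sig[v]] for v in adj}
        if len(set(refined.values())) == len(set(colours.values())):
            return refined
        colours = refined

def wl_distinguishes(adj_g, adj_h):
    """True iff 1-WL tells G and H apart (run on their disjoint union)."""
    union = {("G", v): [("G", w) for w in nbs] for v, nbs in adj_g.items()}
    union.update({("H", v): [("H", w) for w in nbs] for v, nbs in adj_h.items()})
    col = refine(union)
    hist_g = Counter(c for (side, _), c in col.items() if side == "G")
    hist_h = Counter(c for (side, _), c in col.items() if side == "H")
    return hist_g != hist_h

c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}          # 6-cycle
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(wl_distinguishes(c6, two_triangles))   # False: 1-WL fails here
```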


Colours and Trees

Λ_r = set of all colours used by 1-WL in the first r rounds
Λ = ⋃_{r≥0} Λ_r = set of all colours used

We can think of the colours in Λ_r as rooted trees of height r.

Example
[Figure: a graph G and the tree-shaped colours appearing in rounds 0, 1, and 2.]


Weisfeiler-Leman Graph Kernels

For λ ∈ Λ: wl(λ, G) = number of vertices of G of colour λ.

WL Graph Embeddings
Graph embeddings WL_r for every r ≥ 0, defined by

    WL_r(G) = ( wl(λ, G) | λ ∈ Λ_r ) ∈ R^{Λ_r}.

WL Graph Kernels
The r-round WL-kernel is the mapping K_WL^{(r)} defined by

    K_WL^{(r)}(G, H) := Σ_{ℓ=0}^{r} Σ_{λ ∈ Λ_ℓ} wl(λ, G) · wl(λ, H).

(Shervashidze, Schweitzer, van Leeuwen, Mehlhorn, Borgwardt 2011)

We can also define an infinite-dimensional version based on the embedding WL(G) = ( wl(λ, G) | λ ∈ Λ ) ∈ R^Λ. A sketch of the r-round kernel follows below.
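A minimal Python sketch of the r-round WL-kernel: both graphs are refined with shared colour names (the nested signature tuples play the role of the trees λ), the histograms wl(λ, ·) are collected per round, and the round-wise inner products are summed:

```python
from collections import Counter

def wl_histograms(adj, rounds):
    """Colour histograms (wl(lam, G)) for rounds 0..r; colour names are
    nested signature tuples, hence comparable across graphs."""
    colours = {v: () for v in adj}                 # round 0: uniform colour
    hists = [Counter(colours.values())]
    for _ in range(rounds):
        colours = {v: (colours[v], tuple(sorted(colours[w] for w in adj[v])))
                   for v in adj}
        hists.append(Counter(colours.values()))
    return hists

def wl_kernel(adj_g, adj_h, rounds):
    """K_WL^(r)(G,H): sum over rounds and colours of wl(lam,G) * wl(lam,H)."""
    hg, hh = wl_histograms(adj_g, rounds), wl_histograms(adj_h, rounds)
    return sum(hg[l][lam] * hh[l][lam]
               for l in range(rounds + 1) for lam in hg[l])

path = {0: [1], 1: [0, 2], 2: [1]}                 # path on 3 vertices
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(wl_kernel(path, triangle, rounds=2))         # 12 = 9 + 3 + 0
```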


Higher-Dimensional Weisfeiler-Leman

The k-dimensional Weisfeiler-Leman algorithm (k-WL) iteratively colours k-tuples of vertices (Weisfeiler and Leman 1968; Babai ~1980).

Running time: O(n^k log n)

k-WL is much more powerful than 1-WL, but still not a complete isomorphism test.

Theorem (Cai, Fürer, Immerman 1992)
For every k there are non-isomorphic graphs G_k, H_k not distinguished by k-WL.


Weisfeiler and Leman go Neural

Theorem (Morris, Ritzert, Fey, Hamilton, Lenssen, Rattan, G. 2019; Xu, Hu, Leskovec, Jegelka 2019)
GNNs can extract exactly the same information from a graph as 1-WL.

Digression: GNNs in Chemical Engineering

Higher-Dimensional GNNs
Based on the higher-dimensional WL algorithm (Morris, Ritzert, Fey, Hamilton, Lenssen, Rattan, G. 2019).

Predicting Fuel Ignition Quality
A prediction model for combustion-related properties of hydrocarbons based on 2-dimensional GNNs.

[Figure: the prediction pipeline — a SMILES input is processed by a 1-GNN and a 2-GNN stage (implemented in Python with RDKit, the k-GNN library, PyTorch, and PyTorch Geometric) to predict DCN, MON, or RON.]

(Schweidtmann, Rittig, König, G., Mitsos, and Dahmen 2020)


Logical Characterisation

C is the (syntactic) extension of first-order predicate logic with counting quantifiers ∃^{≥i} x.

Theorem (Cai, Fürer, Immerman 1992)
For every k ≥ 1 and all graphs G, H, the following are equivalent:
1. k-WL does not distinguish G and H.
2. G and H satisfy the same sentences of the logic C with at most k + 1 variables.


Fractional Isomorphisms

G, H graphs with vertex sets V, W and adjacency matrices A, B.

System L(G, H) of linear equations in variables X_{vw}:

    Σ_{v' ∈ V} A_{vv'} X_{v'w} = Σ_{w' ∈ W} X_{vw'} B_{w'w}   for all v ∈ V, w ∈ W,
    Σ_{v' ∈ V} X_{v'w} = Σ_{w' ∈ W} X_{vw'} = 1               for all v ∈ V, w ∈ W.

Observation
L(G, H) has a nonnegative integer solution iff G and H are isomorphic.

Theorem (Tinhofer 1990)
L(G, H) has a nonnegative rational solution (called a fractional isomorphism) if and only if 1-WL does not distinguish G and H. (A feasibility check is sketched below.)
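A minimal feasibility check for L(G, H) using scipy.optimize.linprog (an assumed dependency): X is flattened into n·m variables, AX = XB and the doubly-stochastic constraints become equalities, and we ask for any nonnegative solution:

```python
import numpy as np
from scipy.optimize import linprog   # assumed dependency

def fractionally_isomorphic(A: np.ndarray, B: np.ndarray) -> bool:
    """Feasibility of L(G,H): AX = XB, row/column sums 1, X >= 0."""
    n, m = A.shape[0], B.shape[0]
    if n != m:
        return False
    rows, rhs = [], []
    for v in range(n):                    # AX = XB, entrywise
        for w in range(m):
            row = np.zeros((n, m))
            row[:, w] += A[v, :]          # + sum_{v'} A[v,v'] X[v',w]
            row[v, :] -= B[:, w]          # - sum_{w'} X[v,w'] B[w',w]
            rows.append(row.ravel()); rhs.append(0.0)
    for v in range(n):                    # row sums = 1
        row = np.zeros((n, m)); row[v, :] = 1.0
        rows.append(row.ravel()); rhs.append(1.0)
    for w in range(m):                    # column sums = 1
        row = np.zeros((n, m)); row[:, w] = 1.0
        rows.append(row.ravel()); rhs.append(1.0)
    res = linprog(c=np.zeros(n * m), A_eq=np.array(rows), b_eq=np.array(rhs),
                  bounds=(0, None), method="highs")
    return res.status == 0                # status 0 = feasible optimum found

C6 = np.roll(np.eye(6), 1, 0) + np.roll(np.eye(6), -1, 0)   # 6-cycle
T2 = np.kron(np.eye(2), np.ones((3, 3)) - np.eye(3))        # two triangles
print(fractionally_isomorphic(C6, T2))   # True: 1-WL cannot distinguish them
```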


Higher-Dimensional Version

Atserias and Maneva 2013
- There is a close correspondence between k-WL and the k-th level of the Sherali-Adams hierarchy over L(G, H) (together with the inequalities X_{vw} ≥ 0).
- An exact correspondence was given by G. and Otto 2015.

We denote the system of linear equalities whose nonnegative rational solutions characterise k-WL by L^{(k)}(G, H).


Homomorphism Vectors

Counting Graph Homomorphisms

Homomorphism Counts
hom(F, G) := number of homomorphisms from F to G

Homomorphism Graph Embeddings
For every class F of graphs, we define Hom_F : G → R^F (where G is the class of all graphs) by

    Hom_F(G) := ( hom(F, G) | F ∈ F ).

We can define an inner product on rg(Hom_F) by

    ⟨Hom_F(G), Hom_F(H)⟩ = Σ_{k≥1} 1 / (2^k · |F_k|) Σ_{F ∈ F_k} hom(F, G) · hom(F, H),

where F_k := { F ∈ F : |F| = k }. (A brute-force sketch of hom and Hom_F follows below.)
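A brute-force Python sketch of hom(F, G) and the vector Hom_F(G): it enumerates all vertex maps V(F) → V(G) and counts the edge-preserving ones (exponential in |F|, fine for tiny graphs):

```python
from itertools import product

def hom(F, G):
    """Number of homomorphisms from the pattern F to G.
    F = (vertex list, edge list); G = (vertex list, set of frozenset edges)."""
    vf, ef = F
    vg, eg = G
    count = 0
    for image in product(vg, repeat=len(vf)):
        m = dict(zip(vf, image))
        if all(frozenset((m[u], m[v])) in eg for (u, v) in ef):
            count += 1
    return count

def hom_vector(class_F, G):
    """Hom_F(G) = (hom(F, G) | F in the class F)."""
    return tuple(hom(F, G) for F in class_F)

edge = ([0, 1], [(0, 1)])
triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
G = ([0, 1, 2, 3],
     {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]})   # 4-cycle
print(hom_vector([edge, triangle], G))   # (8, 0): 2|E| edge maps, no triangles
```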


Example

[Figure: the embedding Hom_F for a class F of three small graphs, applied to two graphs G and H.]

The graphs G and H are mapped to the same vector:

    Hom_F(G) = Hom_F(H) = (10, 20, 0).


Homomorphism Indistinguishability

Graphs G and H are homomorphism indistinguishable over a class F of graphs if Hom_F(G) = Hom_F(H).

Theorem (Lovász 1967)
For all graphs G, H, the following are equivalent:
1. G and H are homomorphism indistinguishable over the class of all graphs.
2. G and H are isomorphic.


Proof

aut(G) := number of automorphisms of G
I(F, G) := number of injective homomorphisms from F to G
S(F, G) := number of surjective homomorphisms from F to G

We view hom, I, and S as infinite matrices in R^{G×G} (with G the class of all graphs). Row and column indices are ordered by increasing size.

Observation 1
I is upper triangular with positive diagonal entries, and S is lower triangular with positive diagonal entries.

Observation 2
hom(F, H) = Σ_G S(F, G) · (1 / aut(G)) · I(G, H).

That is, hom = S · D · I for a diagonal matrix D with positive diagonal entries.

Hence the columns of hom are linearly independent and thus mutually distinct.


Page 104: word2vec, node2vec, graph2vec, X2vec: Towards a Theory of ... · word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data MartinGrohe RWTHAachen

Bounded Tree Width

Theorem (Dvořák 2010)
For all graphs G, H, the following are equivalent.
1. G and H are homomorphism indistinguishable over the class of all graphs of tree width at most k.
2. G and H are not distinguished by the k-dimensional Weisfeiler-Leman graph isomorphism algorithm.

Corollary
For all graphs G, H, the following are equivalent.
1. G and H are homomorphism indistinguishable over the class of all graphs of tree width at most k.
2. L(k)(G, H) has a nonnegative rational solution.
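Aside (not from the slides): in the numbering used here, the 1-dimensional Weisfeiler-Leman algorithm is classical colour refinement. A minimal sketch, again on adjacency dicts: vertex colours are repeatedly refined by the multiset of neighbouring colours until the partition stabilises, and two graphs are distinguished iff their stable colour histograms differ.

def colour_refinement(adj):
    # Start from the uniform colouring; replace each vertex's colour by
    # (old colour, sorted multiset of its neighbours' colours), renamed
    # to small integers, until a fixpoint is reached.
    colour = {v: 0 for v in adj}
    while True:
        sig = {v: (colour[v], tuple(sorted(colour[w] for w in adj[v])))
               for v in adj}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: ids[sig[v]] for v in adj}
        if new == colour:
            return colour
        colour = new

def wl1_distinguishes(adj_g, adj_h):
    # Refine on the disjoint union so colours are comparable across the
    # two graphs, then compare the colour histograms.
    union = {('G', v): {('G', w) for w in nbs} for v, nbs in adj_g.items()}
    union.update({('H', v): {('H', w) for w in nbs}
                  for v, nbs in adj_h.items()})
    colour = colour_refinement(union)
    hist = lambda side: sorted(c for (s, _), c in colour.items() if s == side)
    return hist('G') != hist('H')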


Paths, Cycles, and Planar Graphs

Theorem (Dell, G., Rattan 2018)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all paths.
2. L(G, H) has a rational solution.

Theorem (Folklore)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all cycles.
2. G and H are co-spectral.

Theorem (Mančinska and Roberson 2019)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all planar graphs.
2. G and H are quantum isomorphic.
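Aside (not from the slides): the cycle case is easy to make computational, since hom(C_k, G) is the number of closed walks of length k in G, i.e. tr(A^k) = ∑_i λ_i^k for the adjacency matrix A of G. So homomorphism indistinguishability over cycles can be checked numerically via the spectrum; a sketch using NumPy.

import numpy as np

def cycle_hom_counts(A, max_length):
    # hom(C_k, G) = closed walks of length k = trace(A^k), computed
    # here from the eigenvalues as sum_i lambda_i^k.
    eigenvalues = np.linalg.eigvalsh(A)
    return [float(np.sum(eigenvalues ** k)) for k in range(3, max_length + 1)]

def cospectral(A, B, tol=1e-8):
    # Equal spectra (with multiplicities); for graphs with the same
    # number of vertices this matches equal cycle homomorphism counts.
    ea, eb = np.sort(np.linalg.eigvalsh(A)), np.sort(np.linalg.eigvalsh(B))
    return ea.shape == eb.shape and bool(np.allclose(ea, eb, atol=tol))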


Logical Characterisations
Recall: C is the (syntactical) extension of first-order predicate logic with counting quantifiers ∃≥i x.

Corollary (Dvořák 2010)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all graphs of tree width at most k.
2. G and H satisfy the same sentences of the logic C with at most k + 1 variables.

Theorem (G. 2020)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all graphs of tree depth at most k.
2. G and H satisfy the same sentences of the logic C of quantifier rank at most k.
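An illustrative example (not from the slides): the C sentence ∀x (∃≥d y E(x, y) ∧ ¬∃≥d+1 y E(x, y)) says that every vertex has degree exactly d. It uses only the two variables x and y and has quantifier rank 2, so by the two results above, d-regularity is already determined by homomorphism counts over graphs of tree width at most 1, and even over graphs of tree depth at most 2.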


Remarks

▶ Lovász’s deep theory of Graph Limits is based on the metric induced by HomG.

▶ Homomorphism embeddings can be extended to arbitrary relational structures and to weighted graphs and structures; many of the results are preserved.

▶ There are also homomorphism node embeddings based on homomorphism counts of rooted graphs.


Concluding Remarks


Research Directions

▶ Study relations between graph metrics, for example, the measure induced by WL-Graph Kernels and the homomorphism vectors over trees.

▶ Design algorithms for reconstructing graphs from vectors.

▶ Design vector embeddings for more complex objects like relational structures, processes, dynamical systems.

▶ Develop a better understanding of the semantics of vector embeddings and answer queries directly on the embedded data.
