
Page 1

Permutation Invariant Representations

Radu Balan

Department of Mathematics, CSCAMM and NWC, University of Maryland, College Park, MD

January 17, 2020
AMS Special Session on Mathematical Analysis in Data Science, II

JMM 2020, Denver, CO

Page 2

Acknowledgments

"This material is based upon work partially supported by the National Science Foundation under grant no. DMS-1816608 and LTS under grant H9823013D00560049. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation."

Joint works with: Naveed Haghani (UMD), Debdeep Bhattacharya (GWU), Maneesh Singh (Verisk)




Page 5

Overview

Two related problems. Given a discrete group G acting on a normed space V:

1. Construct a (bi)Lipschitz Euclidean embedding of the quotient space V/G, α : V → Rm. Classification of cosets.
2. Construct projections onto cosets, π : V → {g.y , g ∈ G}. Optimizations within cosets.


Page 6

Overview

In this talk we discuss the first problem. Given the discrete group G = Sn acting on the normed space V = Rn×d:

1. Construct a (bi)Lipschitz Euclidean embedding of the quotient space V/G, α : V → Rm. Application: classification of cosets.
2. Construct projections onto cosets, π : V → {g.y , g ∈ G}.


Page 7

Notations

Permutation Invariant Representations

Consider the equivalence relation ∼ on V = Rn×d induced by the group of permutation matrices Sn acting on V by left multiplication: for any X, X′ ∈ Rn×d,

X ∼ X′ ⇔ X′ = PX, for some P ∈ Sn.

The quotient space Rn×d/∼ is endowed with the natural distance induced by the Frobenius norm ‖·‖F:

d(X1, X2) = min_{P∈Sn} ‖X1 − PX2‖F ,   X1, X2 ∈ Rn×d.

The Problem: Construct a Lipschitz embedding α : Rn×d → Rm, i.e., an integer m = m(n, d), a map α : Rn×d → Rm and a constant L = L(α) > 0 so that for any X, X′ ∈ Rn×d:

1. If X ∼ X′ then α(X) = α(X′)
2. If α(X) = α(X′) then X ∼ X′
3. ‖α(X) − α(X′)‖2 ≤ L · d(X, X′) = L min_{P∈Sn} ‖X − PX′‖F
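The distance d(X, X′) above can be computed exactly: minimizing over P ∈ Sn is a linear assignment problem on the squared distances between rows. The sketch below is an illustrative check, not part of the slides; the helper name quotient_distance and the use of SciPy are my own choices.

```python
# Minimal sketch (assumption: d(X1, X2) = min_P ||X1 - P X2||_F as defined on this slide).
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def quotient_distance(X1, X2):
    """min over permutations P in S_n of ||X1 - P X2||_F, via linear assignment."""
    C = cdist(X1, X2, metric="sqeuclidean")   # C[i, j] = ||x1_i - x2_j||^2
    rows, cols = linear_sum_assignment(C)     # optimal row matching
    return np.sqrt(C[rows, cols].sum())

# quick check: a row permutation of X has distance 0 to X
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
P = np.eye(5)[rng.permutation(5)]
assert np.isclose(quotient_distance(X, P @ X), 0.0)
```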



Page 9

Motivation

Motivation (1): Graph Learning Problems

Given a data graph (e.g., social network, transportation network, citation network, chemical network, protein network, biological network):
Graph adjacency or weight matrix, A ∈ Rn×n;
Data matrix, X ∈ Rn×d, where each row corresponds to a feature vector per node.

Construct a map f : (A, X) → f(A, X) that performs:
1. classification: f(A, X) ∈ {1, 2, · · · , c}
2. regression/prediction: f(A, X) ∈ R.

Key observation: the outcome should be invariant to vertex permutations: f(PAP^T, PX) = f(A, X), for every P ∈ Sn.


Page 10

Motivation

Motivation (2): Graph Convolutional Networks (GCN), Graph Neural Networks (GNN)

[Figure: general architecture of a GCN/GNN]

GCN (Kipf and Welling ('16)) chooses Ã = I + A; GNN (Scarselli et al. ('08), Bronstein et al. ('16)) chooses Ã = pl(A), a polynomial in the adjacency matrix. An L-layer GNN has parameters (p1, W1, B1, · · · , pL, WL, BL).

Note the covariance (or equivariance) property: for any P ∈ O(n) (including Sn), if (A, X) ↦ (PAP^T, PX) and Bi ↦ PBi, then Y ↦ PY.
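A quick numerical illustration of this equivariance, not from the slides: I assume a single layer of the common form Y = ReLU(Ã X W + B) with Ã = I + A and a node-indexed bias B (so that Bi ↦ PBi, as on the slide); the graph, weights, and bias below are made up for the check.

```python
# Minimal sketch of one GCN-style layer and its permutation equivariance.
import numpy as np

def gcn_layer(A, X, W, B):
    A_tilde = np.eye(A.shape[0]) + A           # assumed layer: A_tilde = I + A
    return np.maximum(A_tilde @ X @ W + B, 0)  # ReLU activation

rng = np.random.default_rng(1)
n, d, d_out = 6, 4, 3
A = rng.integers(0, 2, size=(n, n)); A = np.triu(A, 1); A = A + A.T  # symmetric adjacency
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, d_out))
B = rng.normal(size=(n, d_out))                # per-node bias rows
P = np.eye(n)[rng.permutation(n)]

# equivariance: relabeling the graph (and the bias rows) relabels the output
Y  = gcn_layer(A, X, W, B)
Yp = gcn_layer(P @ A @ P.T, P @ X, W, P @ B)
assert np.allclose(Yp, P @ Y)
```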



Page 12

Motivation

Motivation (3): Deep Learning with GCN

Our solution for the two learning tasks (classification or regression) is to utilize the following scheme: a GCN, followed by a permutation invariant map α (extractor), followed by an SVM/NN, i.e., a single-layer or deep neural network (Support Vector Machine or Fully Connected Neural Network) trained on the invariant representations.

The purpose of this talk is to analyze the α component.


Page 13

Motivation

Example on the Protein Dataset: Enzyme Classification

Protein Dataset: the task is classification of each protein into enzyme or non-enzyme. Dataset: 450 enzymes and 450 non-enzymes.

Architecture (ReLU activation):
GCN with L = 3 layers and d = 25 feature vectors in each layer;
no permutation invariant component: α = Identity;
fully connected NN with 3 dense layers and 120 internal units.


Page 14

Embeddings

The Universal Embedding

Consider the map

µ : Rn×d → P(Rd) ,   µ(X)(x) = (1/n) ∑_{k=1}^n δ(x − xk)

where P(Rd) denotes the convex set of probability measures over Rd, and δ denotes the Dirac measure. Clearly µ(X′) = µ(X) iff X′ = PX for some P ∈ Sn.

Main drawback: P(Rd) is infinite dimensional!
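A finite stand-in for µ, not from the slides: represent µ(X) by the multiset of rows of X; two matrices yield the same multiset exactly when one is a row permutation of the other. The rounding below is only an illustrative device to make float rows hashable.

```python
# Minimal sketch: mu(X) as the multiset of rows of X (empirical measure surrogate).
from collections import Counter
import numpy as np

def empirical_measure(X, decimals=12):
    # multiset of (rounded) rows; equal Counters <=> X' = PX for some permutation P
    return Counter(map(tuple, np.round(X, decimals)))

rng = np.random.default_rng(2)
X = rng.normal(size=(4, 3))
P = np.eye(4)[rng.permutation(4)]
assert empirical_measure(P @ X) == empirical_measure(X)
```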


Page 15

Embeddings

Finite Dimensional Embeddings: Architectures

Two classes of extractors [Zaheer et al. '17, 'Deep Sets']:
1. Pooling Map, based on Max pooling
2. Readout Map, based on Sum pooling

Intuition in the case d = 1. Max pooling:

λ : Rn → Rn ,   λ(x) = x↓ := (x_{π(k)})_{k=1}^n ,   x_{π(1)} ≥ x_{π(2)} ≥ · · · ≥ x_{π(n)}



Page 17

Embeddings

Max pooling as isometric embedding

The Pooling Map, i.e., sorting, produces an isometric embedding of the quotient space.

Proposition. In the case d = 1, λ : Rn → Rn, x ↦ x↓ is an isometric embedding:
1. λ is injective
2. ‖λ(x) − λ(y)‖ = d(x, y), for all x, y ∈ Rn

Proof. The claim is equivalent to min_{Π∈Sn} ‖x − Πy‖ = ‖x↓ − y↓‖. WLOG assume x = x↓ and y = y↓. Then

argmin_{Π∈Sn} ‖x − Πy‖ = argmin_{Π∈Sn} ‖x − xn·1 − Π(y − yn·1)‖ ,

so one may further assume xn = yn = 0 and x, y ≥ 0. The conclusion now follows by induction over n.
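A brute-force check of the proposition for a small n (an illustrative test, not part of the slides):

```python
# Minimal sketch: min over permutations of ||x - Pi y|| equals ||x_sorted - y_sorted||.
from itertools import permutations
import numpy as np

rng = np.random.default_rng(3)
n = 5
x, y = rng.normal(size=n), rng.normal(size=n)

brute = min(np.linalg.norm(x - y[list(p)]) for p in permutations(range(n)))
sorted_dist = np.linalg.norm(np.sort(x)[::-1] - np.sort(y)[::-1])  # decreasing sort
assert np.isclose(brute, sorted_dist)
```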



Page 19

Embeddings

Finite Dimensional Embeddings: Architectures

Intuition in the case d = 1 (continued). Sum pooling:

σ : Rn → Rn ,   σ(x) = (yk)_{k=1}^n ,   yk = ∑_{j=1}^n ν(ak, xj)

where the kernel ν : R × R → R, e.g. ν(a, t) = e^{−(a−t)^2}, or ν(a = k, t) = t^k.
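A minimal sketch, not from the slides, of the two d = 1 extractors, using the Gaussian kernel ν(a, t) = exp(−(a − t)²) above and illustrative anchor points a_k; both outputs are unchanged under permutations of x.

```python
# Minimal sketch: max pooling (sorting) and Gaussian sum pooling for d = 1.
import numpy as np

def max_pool(x):
    return np.sort(x)[::-1]                      # lambda(x) = x sorted decreasingly

def sum_pool(x, anchors):
    # y_k = sum_j nu(a_k, x_j) with nu(a, t) = exp(-(a - t)^2); anchors are illustrative
    return np.exp(-(anchors[:, None] - x[None, :]) ** 2).sum(axis=1)

rng = np.random.default_rng(4)
x = rng.normal(size=6)
a = np.linspace(-2, 2, 6)
xp = rng.permutation(x)
assert np.allclose(max_pool(xp), max_pool(x))
assert np.allclose(sum_pool(xp, a), sum_pool(x, a))
```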


Page 20

Embeddings

Pooling Mapping Approach

Fix a matrix R ∈ Rd×D. Consider the map

Λ : Rn×d → Rn×D ≡ RnD ,   Λ(X) = λ(XR)

where λ acts columnwise (reorders each column in decreasing order). Since Λ(ΠX) = Λ(X) for every Π ∈ Sn, Λ is well defined on the quotient Rn×d/∼.

Theorem. For any matrix R ∈ Rd×(d+1) such that every d × d submatrix is invertible, there is a subset Z ⊂ Rn×d of zero measure so that Λ : Rn×d \ Z → Rn×(d+1) is faithful (i.e., injective).

No tight bound is known yet for the minimum D = D(n, d) such that there is a matrix R for which Λ is faithful (injective). Due to local linearity, if Λ is faithful (injective), then it is stable (bi-Lipschitz).
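An illustrative implementation of Λ, not from the slides; here R is simply drawn at random rather than chosen to satisfy the invertibility condition of the theorem, which is enough for the invariance check.

```python
# Minimal sketch: Lambda(X) = columnwise decreasing sort of X @ R.
import numpy as np

def pooling_map(X, R):
    return -np.sort(-(X @ R), axis=0)   # sort each column of XR decreasingly

rng = np.random.default_rng(5)
n, d, D = 7, 3, 4
X = rng.normal(size=(n, d))
R = rng.normal(size=(d, D))             # illustrative random choice of R
P = np.eye(n)[rng.permutation(n)]
assert np.allclose(pooling_map(P @ X, R), pooling_map(X, R))  # Lambda(PX) = Lambda(X)
```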


Page 21

Embeddings

Enzyme Classification Example: Extraction with a Hadamard Matrix

Protein Dataset; the task is classification into enzyme vs. non-enzyme. Dataset: 450 enzymes and 450 non-enzymes.

Architecture (ReLU activation):
GCN with L = 3 layers and d = 25 feature vectors in each layer;
α = Λ: Z = λ(YR) with R = [I Hadamard], D = 50, m = 50;
fully connected NN with 3 dense layers and 120 internal units.


Page 22

Embeddings

Readout Mapping Approach: Kernel Sampling

Consider

Φ : Rn×d → Rm ,   (Φ(X))_j = ∑_{k=1}^n ν(aj, xk)   or   (Φ(X))_j = ∏_{k=1}^n ν(aj, xk)

where ν : Rd × Rd → R is a kernel, and x1, · · · , xn denote the rows of the matrix X.

Known solutions: if m = ∞, then there exists a Φ that is globally faithful (injective) and stable on compacts.

Interesting mathematical connection: on compacts, some kernels ν define Reproducing Kernel Hilbert Spaces (RKHSs) and yield a decomposition

(Φ(X))_j = ∑_{p≥1} σp fp(aj) gp(X)
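A minimal sketch, not from the slides, of the sum version of Φ with the Gaussian kernel ν(a, x) = exp(−‖a − x‖²); the sample points a_j are drawn at random here purely for illustration.

```python
# Minimal sketch: (Phi(X))_j = sum_k nu(a_j, x_k) with a Gaussian kernel on the rows of X.
import numpy as np

def readout_map(X, A):
    # X: n x d data rows, A: m x d sample points; returns Phi(X) in R^m
    sq = ((A[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # ||a_j - x_k||^2
    return np.exp(-sq).sum(axis=1)

rng = np.random.default_rng(6)
n, d, m = 8, 3, 5
X = rng.normal(size=(n, d))
A = rng.normal(size=(m, d))
P = np.eye(n)[rng.permutation(n)]
assert np.allclose(readout_map(P @ X, A), readout_map(X, A))  # permutation invariant
```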


Page 23

Embeddings

Enzyme Classification Example: Feature Extraction with Exponential Kernel Sampling

Protein Dataset; the task is classification into enzyme vs. non-enzyme. Dataset: 450 enzymes and 450 non-enzymes.

Architecture (ReLU activation):
GCN with L = 3 layers and d = 25 feature vectors in each layer;
extractor: Zj = ∑_{k=1}^n exp(−‖yk − zj‖^2), with m = 120 and the zj random;
fully connected NN with 3 dense layers and 120 internal units.


Page 24

Embeddings

Readout Mapping Approach: Polynomial Expansion - Quadratics

Another interpretation of the moments for d = 1, using Vieta's formulas and the Newton-Girard identities:

P(X) = ∏_{k=1}^n (X − xk)   ↔   ( ∑_k xk , ∑_k xk^2 , . . . , ∑_k xk^n )

For d > 1, consider the quadratic d-variate polynomial:

P(Z1, · · · , Zd) = ∏_{k=1}^n ( (Z1 − xk,1)^2 + · · · + (Zd − xk,d)^2 ) = ∑_{p1,...,pd=0}^{2n} a_{p1,...,pd} Z1^{p1} · · · Zd^{pd}

Encoding complexity: m = (2n+d choose d) ∼ (2n)^d.
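A small NumPy check, not from the slides, of the d = 1 correspondence above: np.poly returns the coefficients of ∏_k (X − xk), i.e., the elementary symmetric polynomials up to sign, and the Newton-Girard identities tie them to the power sums.

```python
# Minimal sketch: coefficients of prod_k (X - x_k) <-> power sums, via Newton-Girard.
import numpy as np

rng = np.random.default_rng(7)
n = 5
x = rng.normal(size=n)

coeffs = np.poly(x)                                     # [1, -e1, e2, ..., (-1)^n en]
e = [(-1) ** k * coeffs[k] for k in range(n + 1)]       # e[k] = e_k(x)
p = [np.sum(x ** k) for k in range(n + 1)]              # p[k] = sum_j x_j^k

# Newton-Girard: k*e_k = sum_{i=1}^{k} (-1)^(i-1) e_{k-i} p_i
for k in range(1, n + 1):
    rhs = sum((-1) ** (i - 1) * e[k - i] * p[i] for i in range(1, k + 1))
    assert np.isclose(k * e[k], rhs)
```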



Page 26

Embeddings

Readout Mapping Approach: Polynomial Expansion - Quadratics (2)

A more careful analysis of P(Z1, . . . , Zd) reveals the form

P(Z1, . . . , Zd) = t^n + Q1(Z1, . . . , Zd) t^{n−1} + · · · + Q_{n−1}(Z1, . . . , Zd) t + Qn(Z1, . . . , Zd)

where t = Z1^2 + · · · + Zd^2 and each Qk(Z1, . . . , Zd) ∈ Rk[Z1, . . . , Zd]. Hence one needs to encode

m = (d+1 choose 1) + (d+2 choose 2) + · · · + (d+n choose n) = (d+n+1 choose n) − 1

coefficients.

A significant drawback: inversion is very hard and numerically unstable.


Page 27

Embeddings

Readout Mapping Approach: Polynomial Expansion - Linear Forms

A stable embedding can be constructed as follows (see also Göbel's algorithm (1996) or [Derksen, Kemper '02]). Consider the n linear forms λk(Z1, . . . , Zd) = xk,1 Z1 + · · · + xk,d Zd. Construct the polynomial in the variable t with coefficients in R[Z1, . . . , Zd]:

P(t) = ∏_{k=1}^n ( t − λk(Z1, . . . , Zd) ) = t^n − e1(Z1, . . . , Zd) t^{n−1} + · · · + (−1)^n en(Z1, . . . , Zd)

The elementary symmetric polynomials (e1, . . . , en) are in one-to-one correspondence (by the Newton-Girard theorem) with the moments

µp = ∑_{k=1}^n λk(Z1, . . . , Zd)^p ,   1 ≤ p ≤ n.
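A minimal sketch, not from the slides: µp(Z) = ∑_k ⟨xk, Z⟩^p is determined by the symmetric tensor ∑_k xk ⊗ · · · ⊗ xk (p factors), so the snippet encodes X by these flattened tensors for p = 1, . . . , n and checks permutation invariance. Note this redundant flattening stores d^p entries per degree rather than the (d+p−1 choose p) distinct coefficients counted on the next slide.

```python
# Minimal sketch: moment-tensor encoding T_p = sum_k x_k^{(tensor p)}, p = 1..n.
import numpy as np

def moment_tensors(X):
    n, d = X.shape
    feats = []
    for p in range(1, n + 1):
        T = np.zeros((d,) * p)
        for xk in X:                     # T_p += x_k tensored with itself p times
            out = xk
            for _ in range(p - 1):
                out = np.multiply.outer(out, xk)
            T += out
        feats.append(T.ravel())
    return np.concatenate(feats)

rng = np.random.default_rng(8)
X = rng.normal(size=(4, 2))
P = np.eye(4)[rng.permutation(4)]
assert np.allclose(moment_tensors(P @ X), moment_tensors(X))  # permutation invariant
```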


Page 28

Embeddings

Readout Mapping Approach: Polynomial Expansion - Linear Forms (2)

Each µp is a homogeneous polynomial of degree p in d variables; hence encoding each of them requires (d+p−1 choose p) coefficients. The total embedding dimension is therefore

m = (d choose 1) + (d+1 choose 2) + · · · + (d+n−1 choose n) = (d+n choose n) − 1.

For d = 1, m = n, which is optimal.

For d = 2, m = (n^2 + 3n)/2. Is this optimal?



Page 30

Embeddings

Algebraic Embedding: Encoding using Complex Roots

Idea: consider the case d = 2. Then each x1, · · · , xn ∈ R2 can be replaced by n complex numbers z1, · · · , zn ∈ C, zk = xk,1 + i xk,2. Consider the complex polynomial

Q(z) = ∏_{k=1}^n (z − zk) = z^n + ∑_{k=1}^n σk z^{n−k}

which requires n complex numbers, or 2n real numbers.
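An illustrative NumPy version of this encoding, not from the slides: rows of X become complex roots, the n polynomial coefficients are the invariant representation, and np.roots recovers the rows up to ordering.

```python
# Minimal sketch of the d = 2 complex-roots encoding and its (multiset) inversion.
import numpy as np

def encode(X):
    z = X[:, 0] + 1j * X[:, 1]
    return np.poly(z)[1:]            # n complex coefficients (leading 1 dropped)

def decode(coeffs):
    z = np.roots(np.concatenate(([1.0], coeffs)))
    return np.column_stack([z.real, z.imag])

rng = np.random.default_rng(9)
X = rng.normal(size=(5, 2))
P = np.eye(5)[rng.permutation(5)]
assert np.allclose(encode(P @ X), encode(X))          # invariant under row permutations

# decoding returns the rows up to ordering: compare after a lexicographic row sort
X_rec = decode(encode(X))
key = lambda M: M[np.lexsort(M.T[::-1])]
assert np.allclose(key(X_rec), key(X), atol=1e-6)
```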

Open problem: can this construction be extended to d ≥ 3?

Remark: a drawback of polynomial (algebraic) embeddings: [Cahill '19] showed that polynomial embeddings of translation invariant spaces cannot be bi-Lipschitz.



Page 32

Bibliography

[1] Vinyals, O., Fortunato, M., and Jaitly, N., Pointer Networks, arXiv e-prints, arXiv:1506.03134 (Jun 2015).
[2] Sutskever, I., Vinyals, O., and Le, Q. V., Sequence to Sequence Learning with Neural Networks, arXiv e-prints, arXiv:1409.3215 (Sep 2014).
[3] Bello, I., Pham, H., Le, Q. V., Norouzi, M., and Bengio, S., Neural Combinatorial Optimization with Reinforcement Learning, arXiv e-prints, arXiv:1611.09940 (Nov 2016).
[4] Williams, R. J., Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning 8(3-4), 229-256 (1992).
[5] Kool, W., van Hoof, H., and Welling, M., Attention, Learn to Solve Routing Problems, arXiv e-prints, arXiv:1803.08475 (Mar 2018).


Page 33

Bibliography

[6] Dai, H., Khalil, E. B., Zhang, Y., Dilkina, B., and Song, L., Learning Combinatorial Optimization Algorithms over Graphs, arXiv e-prints, arXiv:1704.01665 (Apr 2017).
[7] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al., Human-level control through deep reinforcement learning, Nature 518(7540), 529 (2015).
[8] Dai, H., Dai, B., and Song, L., Discriminative embeddings of latent variable models for structured data, in International Conference on Machine Learning, 2702-2711 (2016).
[9] Nowak, A., Villar, S., Bandeira, A. S., and Bruna, J., Revised Note on Learning Algorithms for Quadratic Assignment with Graph Neural Networks, arXiv e-prints, arXiv:1706.07450 (Jun 2017).


Page 34

Bibliography

[10] Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., and Monfardini, G., The graph neural network model, IEEE Transactions on Neural Networks 20(1), 61-80 (2008).
[11] Li, Z., Chen, Q., and Koltun, V., Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search, arXiv e-prints, arXiv:1810.10659 (Oct 2018).
[12] Kipf, T. N. and Welling, M., Semi-Supervised Classification with Graph Convolutional Networks, arXiv e-prints, arXiv:1609.02907 (Sep 2016).
[13] Kingma, D. P. and Ba, J., Adam: A Method for Stochastic Optimization, arXiv e-prints, arXiv:1412.6980 (Dec 2014).
[14] Derksen, H. and Kemper, G., Computational Invariant Theory, Springer (2002).


Page 35

Bibliography

[15] Cahill, J., Contreras, A., and Hip, A. C., Complete Set of Translation Invariant Measurements with Lipschitz Bounds, arXiv:1903.02811 (2019).
[16] Zaheer, M., Kottur, S., Ravanbhakhsh, S., Poczos, B., Salakhutdinov, R., and Smola, A. J., Deep Sets, arXiv:1703.06114 (2017).
[17] Maron, H., Fetaya, E., Segol, N., and Lipman, Y., On the Universality of Invariant Networks, arXiv:1901.09342 [cs.LG] (May 2019).
[18] Bronstein, M. M., Bruna, J., LeCun, Y., Szlam, A., and Vandergheynst, P., Geometric deep learning: going beyond Euclidean data, CoRR abs/1611.08097 (2016).
