Consistency of Spectral Algorithms for Hypergraphs under Planted Partition Model
Debarghya Ghoshdastidar
Ph.D. Thesis Defense
Advisor: Prof. Ambedkar Dukkipati
January 2, 2017
Debarghya Ghoshdastidar Ph.D. Thesis Defense Jan 2, 2017 1 / 47
Overview
Purpose of the work:
Theoretical study of spectral methods for hypergraph partitioning
Contributions:
Model for random hypergraphs with planted partition
Error bounds for partitioning planted hypergraphs
New algorithms with improved error rates
Analysis of edge sampling strategies
Bi-partite hypergraph coloring
Spectral Algorithm for Graph Partitioning
Spectral Clustering
Graph Partitioning
Objective:
High connectivity within clusters
Few edges across clusters (small cut)
Balanced partitions
Applications:
Network partitioning, data clustering, image segmentation
Spectral Graph Partitioning / Spectral Clustering
Input graph → (normalized) adjacency matrix → find k dominant eigenvectors → run k-means on rows → good balanced cut
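The pipeline above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names, the symmetric normalization by square-root degrees, the choice of "dominant" as largest eigenvalue magnitude, and the farthest-point initialization of k-means are assumptions made here, not details given on the slide.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with farthest-point initialization (an illustrative choice)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance of each point to its nearest already-chosen center.
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[dist.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

def spectral_clustering(adj, k):
    """Normalized adjacency -> k dominant eigenvectors -> k-means on rows."""
    deg = adj.sum(axis=1)
    scale = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    norm_adj = scale[:, None] * adj * scale[None, :]
    eigvals, eigvecs = np.linalg.eigh(norm_adj)
    dominant = np.argsort(-np.abs(eigvals))[:k]  # largest magnitude first
    return kmeans(eigvecs[:, dominant], k)

# Toy graph (hypothetical data): two 4-cliques joined by the single edge (3, 4).
adj = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                adj[i, j] = 1.0
adj[3, 4] = adj[4, 3] = 1.0
labels = spectral_clustering(adj, 2)
```

On this toy graph the two cliques receive different labels; which clique gets label 0 depends on the initialization.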
Spectral Clustering (in practice)
Input graph → (normalized) adjacency matrix → find k dominant eigenvectors → run k-means on rows → good balanced cut
Theoretical analysis
Stochastic block model: [Holland, Laskey & Leinhardt '83]
Random hypergraph (V, E) on |V| = n nodes
Nodes have (hidden) class labels, $\psi : \{1, \ldots, n\} \to \{1, \ldots, k\}$
$P(e_{uv} \in E)$ depends on labels of $u, v$
Question:
$\mathrm{Error}(\psi, \psi') = \min_{\sigma} \sum_{i=1}^{n} \mathbf{1}\{\psi_i \neq \sigma(\psi'_i)\}$ ($\psi'$ is output label)
Find $\beta_n$ such that
$\mathrm{Error}(\psi, \psi') \leq \beta_n$ with probability $1 - o(1)$
Consistency of algorithms:
Weakly consistent if $\beta_n = o(n)$; strongly consistent if $\beta_n = o(1)$
Spectral clustering is weakly consistent [Rohe, Chatterjee & Yu '11]
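The error measure above counts misclassified nodes after the best relabeling of the output. A minimal sketch, assuming labels are integers in 0..k-1 and that σ ranges over all permutations of the k labels (the function name is hypothetical, and brute force over permutations is only practical for small k):

```python
from itertools import permutations

def clustering_error(true_labels, pred_labels, k):
    """Number of misclassified nodes, minimized over relabelings sigma of {0..k-1}."""
    best = len(true_labels)
    for sigma in permutations(range(k)):
        # sigma[p] is the true-label name assigned to predicted label p.
        mistakes = sum(1 for t, p in zip(true_labels, pred_labels) if t != sigma[p])
        best = min(best, mistakes)
    return best

# A pure renaming of clusters has zero error.
error_renamed = clustering_error([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2], 3)
# One node assigned to the wrong cluster.
error_one_off = clustering_error([0, 0, 1, 1], [0, 1, 1, 1], 2)
```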
Hypergraph Partitioning
Applications and Algorithms
Hypergraphs
Collection of sets / Generalization of graphs
Each edge can connect more than two nodes
[Figure labels: Graph (2-uniform hypergraph); 3-uniform hypergraph]
m-uniform hypergraph:
Each edge connects m nodes
Adjacencies can be represented by mth-order tensor
$\mathbf{A}_{i_1 i_2 \ldots i_m} = 1$ if there is an edge on $\{i_1, i_2, \ldots, i_m\}$, and $0$ otherwise
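As a concrete illustration of the adjacency tensor, the sketch below fills an mth-order array from a list of edges. The function name and the dense storage are assumptions for illustration; a dense tensor needs n^m entries and is only reasonable for toy sizes.

```python
import numpy as np
from itertools import permutations

def adjacency_tensor(n, edges, m):
    """Dense symmetric 0/1 tensor of an m-uniform hypergraph on n nodes."""
    tensor = np.zeros((n,) * m)
    for edge in edges:
        if len(set(edge)) != m:
            raise ValueError("every edge must contain exactly m distinct nodes")
        # An edge is unordered, so every ordering of its nodes indexes a 1.
        for idx in permutations(edge):
            tensor[idx] = 1.0
    return tensor

# Hypothetical 3-uniform hypergraph on 4 nodes with two edges.
T = adjacency_tensor(4, [(0, 1, 2), (1, 2, 3)], 3)
```

Each edge contributes m! ones, so here the tensor has 2 × 3! = 12 nonzero entries.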
Hypergraphs in Databases [Gibson, Kleinberg & Raghavan '00]
Gender: Male, Female, Male, Male, Female
Hair: Red, Black, Bald, Black, Red
Glasses: Yes, No, Yes, No, No
Edges can be of varying sizes (non-uniform hypergraph)
Male, Black hair, Without glasses, and so on . . .
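One way to read the example is that each record is a node and each attribute value (Male, Black hair, and so on) is a hyperedge containing the records that share it. The sketch below builds those edges for the five records in the table; this reading, the record numbering 0..4, and the function name are assumptions made here for illustration.

```python
from collections import defaultdict

# The five records from the table, one dictionary per column.
records = [
    {"Gender": "Male", "Hair": "Red", "Glasses": "Yes"},
    {"Gender": "Female", "Hair": "Black", "Glasses": "No"},
    {"Gender": "Male", "Hair": "Bald", "Glasses": "Yes"},
    {"Gender": "Male", "Hair": "Black", "Glasses": "No"},
    {"Gender": "Female", "Hair": "Red", "Glasses": "No"},
]

def attribute_hyperedges(rows):
    """Map each (attribute, value) pair to the set of records that carry it."""
    edges = defaultdict(set)
    for node, row in enumerate(rows):
        for attribute, value in row.items():
            edges[(attribute, value)].add(node)
    return dict(edges)

edges = attribute_hyperedges(records)
```

The resulting edges have different sizes, for example three records for Male and two for Black hair, which is why the hypergraph is non-uniform.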
Hypergraphs in Computer Vision [Agarwal et al. '05]
Subspace clustering Motion segmentation
Matching / Image Registration
Involves 3-way / 4-way similarities (uniform hypergraph)
Hypergraph Partitioning Methods
Partitioning circuits [Schweikert & Kernighan '79]
Graph approximation for hypergraphs [Hadley '95]
Spectral hypergraph partitioning [Zien et al. '99]
hMETIS for VLSI design [Karypis & Kumar '00]
Uniform hypergraph in databases [Gibson et al. '00]
Uniform hypergraph in vision [Agarwal et al. '05]
Tensor based algorithms [Govindu '05; Chen & Lerman '09]
Learning with non-uniform hypergraph [Zhou et al. '07]
Higher order learning [Duchenne et al. '11; Rota Bulo & Pelillo '13; etc.]
Algorithms studied in our work
HOSVD / SCC: [Govindu '05; Chen & Lerman '09]
Uniform hypergraph partitioning using higher order SVD of adjacency tensor.
TTM / TeTrIS: (proposed)
Uniform hypergraph partitioning by solving a tensor trace maximization problem.
TeTrIS is an efficient (sampled) version of TTM.
NH-Cut: [Zhou, Huang & Scholkopf '07]
Non-uniform hypergraph partitioning by minimizing normalized hypergraph cut.
COLOR: (proposed)
Vertex 2-coloring of bi-partite non-uniform hypergraph.
Uniform Hypergraph Partitioning
Spectral Algorithms
Approach 1: Higher order SVD of adjacency tensor
Approach 2: Associativity or tensor trace maximization
Approach 1: Higher Order SVD
Matrix eigen decomposition:
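$A = U \Sigma U^T$, with $U$ orthonormal and $\Sigma$ diagonal
HOSVD of 3rd-order tensor: [De Lathauwer et al. '00]
$\mathbf{A} = \Sigma \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}$, with orthonormal $U^{(j)}$ and core tensor $\Sigma$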
HOSVD based Partitioning [Govindu '05]
m-uniform hypergraph → adjacency tensor $\mathbf{A}$ → flattened matrix $A$ → find dominant left singular vectors → run k-means on rows
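The flatten-then-SVD step can be sketched as follows. The function name is hypothetical, the mode-1 unfolding is done with a plain reshape, and the final k-means step is left out; the rows of the returned matrix are what k-means would be run on.

```python
import numpy as np
from itertools import combinations, permutations

def hosvd_embedding(tensor, k):
    """Rows of the k dominant left singular vectors of the flattened tensor."""
    n = tensor.shape[0]
    flat = tensor.reshape(n, -1)  # n x n^(m-1) flattened matrix
    left, _, _ = np.linalg.svd(flat, full_matrices=False)
    return left[:, :k]  # singular values are returned in descending order

# Toy 3-uniform hypergraph (hypothetical data): every triple inside
# {0,1,2,3} and every triple inside {4,5,6,7} is an edge.
tensor = np.zeros((8, 8, 8))
for group in (range(0, 4), range(4, 8)):
    for edge in combinations(group, 3):
        for idx in permutations(edge):
            tensor[idx] = 1.0
embedding = hosvd_embedding(tensor, 2)
```

In this noiseless toy case, nodes of the same group get identical rows and the two groups get different rows, so any reasonable k-means run separates them.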
Approach 2: Associativity Maximization
Normalized associativity:
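For any cluster $V_1 \subset V$:
$\mathrm{associativity}(V_1) = \sum_{e \subset V_1} w(e)$
$\mathrm{volume}(V_1) = \sum_{v \in V_1} \mathrm{degree}(v)$
Normalized associativity of partition:
$\text{N-assoc}(V_1, \ldots, V_k) = \sum_{j=1}^{k} \frac{\mathrm{associativity}(V_j)}{\mathrm{volume}(V_j)}$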
Problem:
Find partition that maximizes N-assoc(V1, . . . ,Vk)
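To make the objective concrete, the sketch below evaluates the normalized associativity of a given partition. It assumes edges are stored as a dictionary from node sets to weights and that degree(v) is the total weight of edges containing v; the function name and the toy hypergraph are illustrative.

```python
def normalized_associativity(edges, clusters):
    """edges: dict frozenset(nodes) -> weight; clusters: list of node sets."""
    degree = {}
    for edge, weight in edges.items():
        for v in edge:
            degree[v] = degree.get(v, 0.0) + weight
    total = 0.0
    for cluster in clusters:
        # Associativity: total weight of edges lying entirely inside the cluster.
        assoc = sum(w for edge, w in edges.items() if edge <= cluster)
        volume = sum(degree.get(v, 0.0) for v in cluster)
        if volume > 0:
            total += assoc / volume
    return total

# Hypothetical hypergraph: two "within" edges and one edge straddling the groups.
edges = {
    frozenset({0, 1, 2}): 1.0,
    frozenset({3, 4, 5}): 1.0,
    frozenset({2, 3, 4}): 1.0,
}
good = normalized_associativity(edges, [{0, 1, 2}, {3, 4, 5}])   # 1/4 + 1/5
bad = normalized_associativity(edges, [{0, 1}, {2, 3, 4, 5}])    # 0/2 + 2/7
```

The partition that keeps both triples intact scores higher, which is the behavior the maximization problem is after.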
Tensor Trace Maximization (TTM)
Problem (reformulated):
For m-uniform hypergraph
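$\text{N-assoc}(V_1, \ldots, V_k) = \frac{1}{m!} \mathrm{Trace}\left(\mathbf{A} \times_1 Y^{b_1} \times_2 \cdots \times_m Y^{b_m}\right)$
$Y \in \mathbb{R}^{n \times k}$ has orthogonal columns, and $\sum_j b_j = 1$
Spectral relaxation of TTM:
Set $b_1 = b_2 = \frac{1}{2}$, $b_3 = \ldots = b_m = 0$ and $X = Y^{1/2}$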
Optimize over all orthonormal X
Spectral TTM Algorithm [Ghoshdastidar & Dukkipati, ICML'15]
m-uniform hypergraph → adjacency tensor $\mathbf{A}$ → add slices of tensor → matrix $A$ → find k dominant eigenvectors → run k-means on rows
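A rough sketch of the slice-summing step is given below. Summing the tensor over all modes beyond the first two gives the n × n matrix; the degree normalization before the eigendecomposition is an assumption made here (suggested by the "normalized" matrix in the consistency results), and the function name is hypothetical. The final k-means step is omitted.

```python
import numpy as np
from itertools import combinations, permutations

def ttm_embedding(tensor, k):
    """Add slices of the tensor, normalize, and return k dominant eigenvectors."""
    m = tensor.ndim
    # Entry (i, j) sums the tensor over the remaining m - 2 indices.
    mat = tensor.sum(axis=tuple(range(2, m)))
    deg = mat.sum(axis=1)
    scale = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    norm_mat = scale[:, None] * mat * scale[None, :]
    eigvals, eigvecs = np.linalg.eigh(norm_mat)
    return eigvecs[:, -k:]  # eigh sorts ascending, so the last k are the largest

# Toy 3-uniform hypergraph (hypothetical data): all triples inside each of
# the two groups {0,1,2,3} and {4,5,6,7} are edges.
tensor = np.zeros((8, 8, 8))
for group in (range(0, 4), range(4, 8)):
    for edge in combinations(group, 3):
        for idx in permutations(edge):
            tensor[idx] = 1.0
embedding = ttm_embedding(tensor, 2)
```

Unlike the flattening approach, this works with an n × n matrix, so the eigendecomposition cost does not grow with m once the slices have been added.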
Uniform Hypergraph Partitioning
Consistency
Planted partition model for uniform hypergraphs
Error bounds for algorithms
Planted Partition Model (graph)
Sparse Stochastic Block Model: [Lei & Rinaldo '15]
Given n nodes, and k (hidden) classes
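An unknown symmetric matrix $B \in [0, 1]^{k \times k}$
An unknown sparsity factor $\alpha_n$
Independent edges with probabilities depending on labels
Example with three classes (Class-1, Class-2, Class-3): an edge between a Class-$a$ node and a Class-$b$ node appears with probability $\alpha_n B_{ab}$, e.g. $\alpha_n B_{11}$, $\alpha_n B_{12}$, $\alpha_n B_{13}$, ...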
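Sampling a graph from this model takes only a few lines. The sketch below assumes labels are integers in 0..k-1 and `B` is a NumPy array; the function name and the seed argument are illustrative choices.

```python
import numpy as np

def sample_sparse_sbm(labels, B, alpha_n, seed=0):
    """Symmetric 0/1 adjacency matrix with P(edge u-v) = alpha_n * B[label_u, label_v]."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n = len(labels)
    probs = alpha_n * B[labels[:, None], labels[None, :]]
    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float)

# Degenerate check case: within-class probability 1, across-class probability 0.
B = np.array([[1.0, 0.0], [0.0, 1.0]])
adj = sample_sparse_sbm([0, 0, 0, 1, 1, 1], B, alpha_n=1.0)
```

Shrinking `alpha_n` with n while keeping `B` fixed gives the sparse regime that the model is meant to capture.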
Planted Partition Model (uniform hypergraph)
Extension of Sparse SBM: (proposed)
Given n nodes, and k (hidden) classes
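Unknown $m$th-order tensor $B \in [0, 1]^{k \times k \times \ldots \times k}$
Unknown sparsity factor $\alpha_n$
Independent edges with label-dependent distribution
Unweighted hypergraph: $\mathrm{Prob}(\text{edge}) = \alpha_n B_{i_1 i_2 \ldots i_m}$
Weighted hypergraph: $w(\text{edge}) \in [0, 1]$, $\mathbb{E}[w(\text{edge})] = \alpha_n B_{i_1 i_2 \ldots i_m}$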
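The unweighted variant of the model can be simulated by visiting every m-subset of nodes once. This is a sketch under assumptions: labels are integers in 0..k-1, `B` is an mth-order NumPy array indexed by the class labels of the edge's nodes, and the function name is hypothetical. Enumerating all subsets costs on the order of n^m, so it is meant for small examples only.

```python
import numpy as np
from itertools import combinations

def sample_planted_hypergraph(labels, B, alpha_n, m, seed=0):
    """List of m-node edges, each kept independently with prob alpha_n * B[labels]."""
    rng = np.random.default_rng(seed)
    edges = []
    for nodes in combinations(range(len(labels)), m):
        prob = alpha_n * B[tuple(labels[v] for v in nodes)]
        if rng.random() < prob:
            edges.append(nodes)
    return edges

# Degenerate check case: only edges whose three nodes share a class can appear.
B = np.zeros((2, 2, 2))
B[0, 0, 0] = B[1, 1, 1] = 1.0
edges = sample_planted_hypergraph([0, 0, 0, 1, 1, 1], B, alpha_n=1.0, m=3)
```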
Consistency of HOSVD [Ghoshdastidar & Dukkipati, NIPS, 2014]
Define:
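$n_{\max}$ (or $n_{\min}$) = max. (min.) cluster size
$\mathcal{A} = \mathbb{E}[A A^T]$ and $\mathcal{A}_{\min} = \min_{i,j} \{\mathcal{A}_{ij} : \mathcal{A}_{ij} > 0\}$
$\delta$ = $k$th eigen-gap of normalized $\mathcal{A}$
Theorem
There exists constant $C > 0$, such that, if
$\delta > 0$ and $\mathcal{A}_{\min} > \frac{C k n_{\max} (\log n)^2}{n_{\min} \delta^2}$
then with probability $(1 - o(1))$
$\mathrm{Error}(\psi, \psi') = O\left(\frac{k n_{\max} \log n}{\delta^2 \mathcal{A}_{\min}}\right) = o(n)$.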
Consistency of TTM [Ghoshdastidar & Dukkipati, ICML, 2015]
Define:
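$d = \min_i \mathbb{E}[\mathrm{degree}(i)] = \min_i \sum_{e \ni i} \mathbb{E}[w(e)]$
$\delta$ = $k$th eigen-gap of normalized $\mathbb{E}[A]$
Theorem
There exists constant $C > 0$, such that, if
$\delta > 0$ and $d > \frac{C k n_{\max} (\log n)^2}{n_{\min} \delta^2}$
then with probability $(1 - o(1))$
$\mathrm{Error}(\psi, \psi') = O\left(\frac{k n_{\max} \log n}{\delta^2 d}\right) = o(n)$.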
Special Case
m-uniform hypergraph
k = O(log n) clusters of equal size
Edge probabilities
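$\mathrm{Prob}(\text{edge}) = \alpha_n p$ if edge lies within a cluster, $\alpha_n q$ otherwise ($p > q$)

| | HOSVD | TTM |
|---|---|---|
| Allowable sparsity | $\alpha_n = \Omega\left(\frac{(\log n)^{m+1.5}}{n^{(m-1)/2}}\right)$ | $\alpha_n = \Omega\left(\frac{(\log n)^{2m+1}}{n^{m-1}}\right)$ |
| Dense hypergraph ($\alpha_n = 1$) | error = $O\left(\frac{(\log n)^{2m+1}}{n^{m-2}}\right)$ | error = $O\left(\frac{(\log n)^{2m-1}}{n^{m-2}}\right)$ |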
Non-uniform Hypergraph Partitioning
Algorithm and Consistency
Approach 3: Normalized hypergraph cut minimization
Planted partition model for non-uniform hypergraphs
Consistency result (with proof sketch)
Normalized Hypergraph Cut
Approach: [Zhou, Huang & Scholkopf '07]
Solve spectral relaxation of minimizing normalized hypergraph cut
Reduction to graph:
$A, D \in \mathbb{R}^{n \times n}$ so that $A_{ij} = \sum_{e \ni i,j} \frac{1}{|e|}$, $D_{ii} = \mathrm{degree}(i)$
Spectral clustering:
Normalized Laplacian, $L = I - D^{-1/2} A D^{-1/2}$
Compute k leading orthonormal eigenvectors of $L$
k-means on normalized rows of eigenvector matrix
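The graph reduction and the Laplacian embedding can be sketched as below. Several details are assumptions made here rather than statements from the slide: edges are unweighted, degree(i) is the number of edges containing i, the sum defining A is taken to include the diagonal (so that the rows of A sum to the degrees), and "leading" eigenvectors of L are taken to be those of the k smallest eigenvalues. The function name is hypothetical and the k-means step is omitted.

```python
import numpy as np

def nh_cut_embedding(n, edges, k):
    """Row-normalized eigenvectors for the k smallest eigenvalues of L."""
    A = np.zeros((n, n))
    deg = np.zeros(n)
    for edge in edges:
        nodes = list(edge)
        # Every pair of nodes in the edge receives weight 1/|e|.
        A[np.ix_(nodes, nodes)] += 1.0 / len(nodes)
        deg[nodes] += 1.0
    scale = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(n) - scale[:, None] * A * scale[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)  # ascending eigenvalues
    X = eigvecs[:, :k]
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)

# Hypothetical non-uniform hypergraph with two disconnected groups of nodes.
edges = [(0, 1, 2), (1, 2, 3), (0, 3), (4, 5, 6), (4, 5)]
X = nh_cut_embedding(7, edges, 2)
```

In this disconnected toy case the normalized rows coincide within each group and are orthogonal across groups, which is the idealized picture that k-means then exploits.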
Planted Partition Model (non-uniform hypergraph)
Model: (proposed)
Given n nodes, and k (hidden) classes
Maximum edge cardinality M
Unknown $m$th-order tensors $B^{(m)} \in [0, 1]^{k \times k \times \ldots \times k}$
Unknown sparsity factors $\alpha_{m,n}$, $m = 2, 3, \ldots, M$
Independent edges with label-dependent distribution
$\mathrm{Prob}(m\text{-edge}) = \alpha_{m,n} B^{(m)}_{i_1 i_2 \ldots i_m}$
Consistency of NH-Cut [Ghoshdastidar & Dukkipati, Ann. Stat., 2017]
Define:
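$\mathcal{A} = \mathbb{E}[A]$, $\mathcal{D} = \mathbb{E}[D]$ and $\mathcal{L} = I - \mathcal{D}^{-1/2} \mathcal{A} \mathcal{D}^{-1/2}$
$d = \min_i \mathbb{E}[\mathrm{degree}(i)]$
$\delta$ = $k$th eigen-gap of $\mathcal{L}$
Theorem
There exists constant $C > 0$, such that, if
$\delta > 0$ and $d > \frac{C k n_{\max} (\log n)^2}{n_{\min} \delta^2}$
then with probability $(1 - o(1))$
$\mathrm{Error}(\psi, \psi') = O\left(\frac{k n_{\max} \log n}{\delta^2 d}\right) = o(n)$.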
Proof of consistency
Stage 1: (expected case)
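If $\delta > 0$, then $\mathcal{A}$ is essentially of rank $k$ (calligraphic symbols denote expected counterparts, e.g. $\mathcal{A} = \mathbb{E}[A]$)
If $\mathcal{A}$ is used instead of $A$, then Error = 0
Stage 2: (matrix concentration)
$A$ can be expressed as a sum of random matrices
$A = \sum_{e \in 2^V} \mathbf{1}\{e \in E\} \left(\frac{1}{|e|} h_e h_e^T\right)$
If $d > 9 \log n$ for all large $n$, then w.p. $\left(1 - \frac{4}{n^2}\right)$,
$\|L - \mathcal{L}\|_2 \leq 12 \sqrt{\frac{\log n}{d}}$
Proof uses matrix concentration inequality [Tropp '12]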
Stage 3: (matrix perturbation)
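$X, \mathcal{X}$ row normalized eigenvector matrices of $L, \mathcal{L}$
If $\delta > 24 \sqrt{\frac{\log n}{d}}$ for all large $n$, then w.p. $\left(1 - \frac{4}{n^2}\right)$
$\|X - \mathcal{X}\|_F \leq \frac{24}{\delta} \sqrt{\frac{2 k n_{\max} \log n}{d}}$
Proof using matrix perturbation [Davis & Kahan '70]
Stage 4: (analyzing k-means)
Rows of $X$ are $\epsilon$-separable for $\epsilon = (\log n)^{-1/2}$
k-means succeeds w.p. $(1 - o(1))$
Error = $O(\|X - \mathcal{X}\|_F^2)$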
Based on guarantees of k-means [Ostrovsky et al. '12]
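Chaining Stages 3 and 4 recovers the error rate stated in the theorem. A short worked version, using only the constants quoted in the stages and writing $\mathcal{X}$ for the eigenvector matrix of the expected Laplacian:

```latex
\begin{align*}
\mathrm{Error}(\psi,\psi')
  &= O\!\left(\|X - \mathcal{X}\|_F^2\right)
   = O\!\left(\frac{24^2}{\delta^2}\cdot\frac{2 k n_{\max}\log n}{d}\right)
   = O\!\left(\frac{k n_{\max}\log n}{\delta^2 d}\right), \\
d > \frac{C k n_{\max}(\log n)^2}{n_{\min}\delta^2}
  &\;\Longrightarrow\;
  \frac{k n_{\max}\log n}{\delta^2 d} < \frac{n_{\min}}{C\log n} = o(n).
\end{align*}
```

The second line shows why the condition on $d$ carries an extra $\log n$ factor: it is what turns the bound into $o(n)$.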
Sampling Hypergraph Edges
Consistency of partitioning with edge sampling
Approach 4: TTM with iterative sampling
Numerical comparison
Edge Sampling (weighted m-uniform hypergraph)
Complexity of tensor methods:
$O(n^m)$ runtime to compute all edge weights
Typically $m = 3$ to $8$ in practice
Efficient variant: Use only $N \ll n^m$ sampled edges
Question:
Edges sampled with replacement
Sampling distribution $(p_e)_{e \in E}$
Find min. number of samples needed for consistency
Sampling bound for TTM: [Ghoshdastidar & Dukkipati, arXiv:1602.06516]
(Special case) Error = $o(n)$ if
Uniform sampling: $N = \Omega\left(\alpha_n^{-1} k^{2m-1} n (\log n)^2\right)$
Weighted, $p_e \propto w(e)$: $N = \Omega\left(k^{2m-1} n (\log n)^2\right)$
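The two sampling distributions compared above can be illustrated on an explicit list of edge weights. This is a toy sketch: the function name and interface are assumptions, and it presumes all weights are already known, which is exactly what a practical method avoids.

```python
import numpy as np

def sample_edges(weights, n_samples, scheme="uniform", seed=0):
    """Sample edge indices with replacement; returns (indices, probabilities p_e)."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    if scheme == "uniform":
        probs = np.full(len(weights), 1.0 / len(weights))
    elif scheme == "weighted":
        probs = weights / weights.sum()  # p_e proportional to w(e)
    else:
        raise ValueError("scheme must be 'uniform' or 'weighted'")
    idx = rng.choice(len(weights), size=n_samples, replace=True, p=probs)
    return idx, probs

# Hypothetical weights: two zero-weight edges and two informative ones.
weights = [0.0, 0.0, 1.0, 3.0]
idx_w, probs_w = sample_edges(weights, 50, scheme="weighted")
idx_u, probs_u = sample_edges(weights, 50, scheme="uniform")
```

Weighted sampling never spends samples on zero-weight edges, while uniform sampling does, which matches the extra $\alpha_n^{-1}$ factor in the uniform bound.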
TTM with Iterative Sampling (TeTrIS)
Iterative Sampling:
Principle:
Sample edges with large weight more frequently
Edges within cluster usually have large weight
Approach (SCC): [Chen & Lerman '09]
Sample few edges
Cluster using HOSVD based method
Re-sample with preference to within cluster edges
Re-cluster and repeat till convergence
TeTrIS Algorithm: [proposed]
Replace HOSVD step by TTM
Sampling bound for TTM justifies the usefulness of sampling large weight edges via iterative sampling
Numerical Comparison
Motion Segmentation:
Cluster motion trajectories
Posed as subspace clustering problem
Each motion – subspace of dimension ≤ 4
Mean clustering error on Hopkins 155 data set (%)
| Method | 2 motion (120 videos) | 3 motion (35 videos) | All |
|---|---|---|---|
| k-means | 19.57 | 26.16 | 21.06 |
| k-flats | 13.05 | 15.78 | 13.67 |
| SSC | 1.53 | 4.40 | 2.18 |
| LRR | 2.13 | 4.03 | 2.56 |
| NSN | 3.62 | 8.28 | 4.67 |
| SCC (HOSVD) | 2.38 | 5.71 | 3.13 |
| TeTrIS (TTM) | 1.36 | 5.38 | 2.27 |
Hypergraph vertex 2-coloring
Objective: No edge can be mono-chromatic
Assume: Planted bi-partite hypergraph
$M = O(1)$ and $\mathbb{E}[\#\text{edges}] \geq C n \log n$
Algorithm: [Ghoshdastidar & Dukkipati, arXiv:1507.00763]
Spectral step:
Let $A \in \mathbb{R}^{n \times n}$ as $A_{ij} = \sum_{e \ni i,j} \frac{1}{|e|}$
Compute eigenvector $x$ for smallest eigenvalue of $A$
Color node-$i$ red if $\mathrm{sign}(x_i) > 0$, else blue
⇒ Achieves error $< cn$ for $c \ll 1$
Iterative refinement:
Re-color node-$i$ red if $\sum_{j \in V_R} A_{ij} < \sum_{j \in V_B} A_{ij}$, else blue
⇒ Error reduces by half in each iteration ($\log_2 n$ steps suffice)
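The two steps can be sketched as follows. Some choices here are assumptions for illustration and not taken from the slide: A is given a zero diagonal (only pairs i ≠ j are counted), all nodes are re-colored simultaneously in each refinement round, and the number of rounds is ceil(log2 n). The function name is hypothetical.

```python
import numpy as np

def spectral_two_coloring(n, edges, n_refine=None):
    """Boolean array: True = red, False = blue."""
    A = np.zeros((n, n))
    for edge in edges:
        nodes = list(edge)
        A[np.ix_(nodes, nodes)] += 1.0 / len(nodes)
    np.fill_diagonal(A, 0.0)  # assumption: only pairs i != j contribute
    eigvals, eigvecs = np.linalg.eigh(A)
    x = eigvecs[:, 0]  # eigenvector for the smallest eigenvalue
    red = x > 0
    steps = int(np.ceil(np.log2(n))) if n_refine is None else n_refine
    for _ in range(steps):
        to_red = A[:, red].sum(axis=1)
        to_blue = A[:, ~red].sum(axis=1)
        # A node turns red when it is tied more strongly to blue nodes.
        red = to_red < to_blue
    return red

# Check case: complete bipartite graph K_{3,3} viewed as a 2-uniform hypergraph.
edges = [(i, j) for i in range(3) for j in range(3, 6)]
red = spectral_two_coloring(6, edges)
```

On this example the spectral step already separates the two sides, and the refinement rounds leave the coloring unchanged.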
Summary
Hypergraph partitioning can be done efficiently
First study of planted non-uniform hypergraphs
Literature considers only planted k-SAT / 2-coloring
Statistical analysis of tensor based methods
Popular in practice, but no known error bound
Removing the assumptions on k-means
First study of sampled spectral algorithms
Justification for iterative sampling
Further works & open questions
Extension to large scale hypergraph partitioning
Down sampling of hypergraphs
Analysis of other approaches under planted model
Move based strategies
Optimization based algorithms
Study of sparse planted hypergraphs
Overlapping communities / Degree heterogeneity
Algorithmic barrier for partitioning [Angelini et al. '15; Florescu & Perkins '16]
Generalization of graph problems to hypergraphs
Theoretical studies
Applications
Thank You
Acknowledgment: The work was supported by Google Ph.D. Fellowship in Statistical Learning Theory
References
Agarwal, S., Lim, J., Zelnik-Manor, L., Perona, P., Kriegman, D. & Belongie, S. (2005). In IEEE Computer Vision and Pattern Recognition 838-845.
Angelini, M. C., Caltagirone, F., Krzakala, F. & Zdeborova, L. (2015). In Annual Allerton Conference on Communication, Control, and Computing.
Chen, G. & Lerman, G. (2009). International Journal of Computer Vision 81(3) 317-330.
Davis, C. & Kahan, W. M. (1970). SIAM Journal on Numerical Analysis 7(1) 1-46.
De Lathauwer, L., De Moor, B. & Vandewalle, J. (2000). SIAM Journal on Matrix Analysis and Applications 21(4) 1253-1278.
Duchenne, O., Bach, F., Kweon, I.-S. & Ponce, J. (2011). IEEE Transactions on Pattern Analysis and Machine Intelligence 33(12) 2383-2395.
Florescu, L. & Perkins, W. (2016). In Conference on Learning Theory.
Ghoshdastidar, D. & Dukkipati, A. (2014). In Advances in Neural Information Processing Systems 397-405.
Ghoshdastidar, D. & Dukkipati, A. (2015). In International Conference on Machine Learning.
Ghoshdastidar, D. & Dukkipati, A. (2015). Annals of Statistics (in press).
Ghoshdastidar, D. & Dukkipati, A. (2015). arXiv preprint 1507.00763.
Ghoshdastidar, D. & Dukkipati, A. (2016). arXiv preprint 1602.06516.
Gibson, D., Kleinberg, J. & Raghavan, P. (2000). VLDB Journal 8 222-236.
Govindu, V. M. (2005). In IEEE Computer Vision and Pattern Recognition 1150-1157.
Hadley, S. W. (1995). Discrete Applied Mathematics 59 115-127.
Hein, M., Setzer, S., Jost, L. & Rangapuram, S. (2013). In Advances in Neural Information Processing Systems 2427-2435.
Holland, P. W., Laskey, K. B. & Leinhardt, S. (1983). Social Networks 5 109-137.
Karypis, G. & Kumar, V. (2000). VLSI Design 11 285-300.
Lei, J. & Rinaldo, A. (2015). Annals of Statistics 43 215-237.
Ostrovsky, R., Rabani, Y., Schulman, L. J. & Swamy, C. (2012). Journal of the ACM 59(6) 28:128.
Rohe, K., Chatterjee, S. & Yu, B. (2011). Annals of Statistics 39 1878-1915.
Rota Bulo, S. & Pelillo, M. (2013). IEEE Transactions on Pattern Analysis and Machine Intelligence 35(6) 1312-1327.
Schweikert, G. & Kernighan, B. W. (1979). In Design Automation Workshop 57-62.
Tropp, J. A. (2012). Foundations of Computational Mathematics 12(4) 389-434.
Zien, J. Y., Schlag, M. D. F. & Chan, P. K. (1999). IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 13(9) 1088-1096.
Zhou, D., Huang, J. & Scholkopf, B. (2007). In Advances in Neural Information Processing Systems 1601-1608.
Consistency of Sampled TTM (General Case)
Define:
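$\gamma = \max_e \frac{w(e)}{p(e)}$, where $p(e) = P(e \text{ is sampled})$
$d = \min_i \mathbb{E}[\mathrm{degree}(i)]$, $\delta$ = $k$th eigen-gap of normalized $\mathbb{E}[A]$
Theorem [Ghoshdastidar & Dukkipati '16]
There exist constants $C, C' > 0$, such that, if
$\delta > 0$, $d > \frac{C k n_{\max} (\log n)^2}{n_{\min} \delta^2}$ and $N > C' \left(1 + \frac{2\gamma}{d}\right) \frac{k n_{\max} (\log n)^2}{n_{\min} \delta^2}$
then with probability $(1 - o(1))$
$\mathrm{Error}(\psi, \psi') = o(n)$.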
More Numerical Results
Numerical Comparison (uniform hypergraph)
Subspace clustering
60 points in 5-dim ambient space
Data from union of three random lines (1-dim subspaces)
Data perturbed by Gaussian noise of standard deviation σa
Fractional error (over 20 runs)
| Algorithm | Noise level $\sigma_a = 0.02$ | Noise level $\sigma_a = 0.05$ |
|---|---|---|
| SNTF | 0.025 | 0.086 |
| hMETIS | 0.045 | 0.118 |
| HGT | 0.083 | 0.222 |
| HOSVD | 0.052 | 0.126 |
| TTM | 0.033 | 0.103 |
Numerical Comparison (sampled uniform hypergraph)
Subspace clustering
5-dim ambient space
Data from union of five 3-dim subspaces (added noise)
[Figure; axis labels: noise level, $\sigma_a$; fractional error (over 50 runs); number of points in each subspace, $n/k$]
Numerical Comparison (non-uniform hypergraph)
Categorical data clustering
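| Data set | #instances | #attributes | #attr. values |
|---|---|---|---|
| Voting records | 435 | 16 | 3 |
| Mushroom | 8124 | 22 | varies |

Fractional error

| Data set | ROCK | CoolCat | LIMBO | hMETIS | NH-Cut |
|---|---|---|---|---|---|
| Voting | 0.16 | 0.15 | 0.13 | 0.24 | 0.12 |
| Mushroom | 0.43 | 0.27 | 0.11 | 0.48 | 0.11 |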