Community Detection in Networks: SDP relaxation and Computational Gaps

Yihong Wu

Department of ECE, University of Illinois at Urbana-Champaign

Joint work with Bruce Hajek (Illinois) and Jiaming Xu (Wharton)

May 20, 2015


Community detection in networks

• Networks with community structures arise in many applications

[Figure: Santa Fe Institute collaboration network, Girvan-Newman ’02]

• Task: Discover underlying communities based on the network topology

• Applications: Friend or movie recommendation in online social networks


Statistical and computational challenges

• The observed network is sparse

• Large solution space

Question

• Is there a computationally efficient and statistically optimal community detection algorithm?


Stochastic block model [Holland et al. ’83]

Planted partition model [Condon-Karp ’01]

[Figure: example adjacency matrices with n = 40, K = 10, r = 3 and p = 0.9, q = 0.1]


Exact recovery

• True cluster: $C^*$

• Estimated cluster: $\hat{C}$

• Goal: exact recovery (strong consistency)

$$\mathbb{P}\{\hat{C} = C^*\} \xrightarrow{n \to \infty} 1$$

• Alternatives

  ◦ almost exact recovery (weak consistency): [Mossel-Neeman-Sly ’13, Abbe-Sandon ’15, Montanari ’15] ...

  ◦ correlated recovery: [Decelle-Krzakala-Moore-Zdeborova ’11, Mossel-Neeman-Sly ’12, Massoulie ’13] ...


Objectives of this talk

• Statistical limit: When is exact recovery possible (impossible)?

• Computational limit: When is exact recovery computationally easy (hard)?


Remainder of the talk

1. Linear community size: Sharp recovery via semidefinite programming

2. Sublinear community size: Computational lower bounds


Two equal-sized communities


Binary symmetric SBM

Model:

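• $n$ nodes partitioned into two communities of size $n/2$ ($\sigma_i = \pm 1$).

• $i \sim j$ independently w.p.

$$p = \frac{a \log n}{n} \ \text{ if } \sigma_i = \sigma_j, \qquad q = \frac{b \log n}{n} \ \text{ if } \sigma_i \neq \sigma_j$$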
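The model above can be sampled directly. The following is a minimal sketch using only the Python standard library; the function name `sample_binary_sbm`, the `seed` argument, and the clamping of probabilities at 1 are illustrative choices, not part of the talk.

```python
import math
import random


def sample_binary_sbm(n, a, b, seed=0):
    """Sample (sigma, A) from the binary symmetric SBM with
    p = a*log(n)/n within communities and q = b*log(n)/n across."""
    if n % 2 != 0:
        raise ValueError("n must be even for two equal-sized communities")
    rng = random.Random(seed)
    p = min(1.0, a * math.log(n) / n)  # within-community edge probability
    q = min(1.0, b * math.log(n) / n)  # cross-community edge probability

    # Balanced labels: exactly n/2 nodes get +1 and n/2 get -1.
    sigma = [1] * (n // 2) + [-1] * (n // 2)
    rng.shuffle(sigma)

    # Symmetric 0/1 adjacency matrix with zero diagonal.
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if sigma[i] == sigma[j] else q
            if rng.random() < prob:
                A[i][j] = A[j][i] = 1
    return sigma, A
```

For example, `sigma, A = sample_binary_sbm(200, a=9.0, b=1.0)` draws one graph in the logarithmic-degree regime considered in the talk.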


MLE ⇔ MIN BISECTION

Assuming p > q

• Maximum likelihood estimator (MLE)

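$$\max_{\sigma}\ \langle A, \sigma\sigma^\top \rangle \quad \text{s.t.}\quad \sigma_i \in \{\pm 1\},\ i \in [n], \quad \sigma^\top \mathbf{1} = 0$$

lift ⇔

$$\max_{Y}\ \langle A, Y \rangle \quad \text{s.t.}\quad \operatorname{rank}(Y) = 1, \quad Y_{ii} = 1,\ i \in [n], \quad \langle J, Y \rangle = 0$$

where $J$ = all-one matrix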
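To make the equivalence with MIN BISECTION concrete: for a 0/1 adjacency matrix with zero diagonal, $\langle A, \sigma\sigma^\top \rangle = 2(|E| - 2\,\mathrm{cut}(\sigma))$, so maximizing the objective over balanced $\sigma$ is the same as minimizing the cut. The sketch below checks this by exhaustive search on tiny graphs; the helper names are hypothetical and the search is exponential in $n$, so it is only an illustration.

```python
from itertools import combinations


def inner_product(A, sigma):
    """<A, sigma sigma^T> = sum over ordered pairs of A[i][j]*sigma[i]*sigma[j]."""
    n = len(A)
    return sum(A[i][j] * sigma[i] * sigma[j] for i in range(n) for j in range(n))


def cut_size(A, sigma):
    """Number of edges whose endpoints carry different labels."""
    n = len(A)
    return sum(A[i][j] for i in range(n) for j in range(i + 1, n)
               if sigma[i] != sigma[j])


def brute_force_mle(A):
    """Maximize <A, sigma sigma^T> over balanced +/-1 labelings (even n).
    Node 0 is fixed to +1 to remove the global sign ambiguity."""
    n = len(A)
    best_value, best_sigma = None, None
    for rest in combinations(range(1, n), n // 2 - 1):
        plus = {0, *rest}
        sigma = [1 if i in plus else -1 for i in range(n)]
        value = inner_product(A, sigma)
        if best_value is None or value > best_value:
            best_value, best_sigma = value, sigma
    return best_sigma, best_value
```

On two triangles joined by a single edge, the search returns the labeling that separates the triangles, which is also the minimum bisection.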


SDP relaxation

• Semidefinite programming (SDP) relaxation of MLE

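$$\hat{Y}_{\mathrm{SDP}} = \arg\max_{Y}\ \langle A, Y \rangle \quad \text{s.t.}\quad Y \succeq 0, \quad Y_{ii} = 1,\ i \in [n], \quad \langle J, Y \rangle = 0.$$

• similar SDP as in [Frieze-Jerrum ’95] for MAX BISECTION

• average-case analysis on generative model (SBM)

• focus on arg max rather than approximating max

• goal: $\mathbb{P}\left\{ \hat{Y}_{\mathrm{SDP}} = \begin{pmatrix} \mathbf{1} & -\mathbf{1} \\ -\mathbf{1} & \mathbf{1} \end{pmatrix} \right\} \to 1$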


Optimal recovery via SDP

Theorem (Abbe-Bandeira-Hall ’14, Mossel-Neeman-Sly ’14)

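• If $(\sqrt{a} - \sqrt{b})^2 > 2$, recovery is achievable in polynomial time.

• If $(\sqrt{a} - \sqrt{b})^2 < 2$, recovery is impossible.

Theorem (Hajek-W.-Xu ’14)

SDP achieves the optimal recovery threshold $(\sqrt{a} - \sqrt{b})^2 > 2$.

Remarks

• originally conjectured in [Abbe-Bandeira-Hall ’14]

• independently proved by [Bandeira ’15]

• $\mathbb{P}\left\{ \hat{Y}_{\mathrm{SDP}} = \begin{pmatrix} \mathbf{1} & -\mathbf{1} \\ -\mathbf{1} & \mathbf{1} \end{pmatrix} \right\} = 1 - n^{-\Omega(1)}$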
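As a quick numerical reading of the threshold, the sketch below returns $(\sqrt{a} - \sqrt{b})^2 - 2$: a positive value falls in the regime where the theorems above say exact recovery is achievable, a negative value in the regime where it is impossible. The function name is illustrative, and the boundary case of equality is not addressed by the statements above.

```python
import math


def recovery_margin(a, b):
    """Return (sqrt(a) - sqrt(b))**2 - 2 for the binary symmetric SBM
    with p = a*log(n)/n and q = b*log(n)/n."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    return (math.sqrt(a) - math.sqrt(b)) ** 2 - 2


# a = 9, b = 1 gives (3 - 1)**2 - 2 = 2  (above the threshold)
# a = 4, b = 1 gives (2 - 1)**2 - 2 = -1 (below the threshold)
```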


Dual certificate

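$$\max_{Y}\ \langle A, Y \rangle$$

s.t. (dual variable for each constraint in parentheses)

$Y \succeq 0$ ($S \succeq 0$)

$Y_{ii} = 1$ ($D = \operatorname{diag}\{d_i\}$)

$\langle J, Y \rangle = 0$ ($\lambda \in \mathbb{R}$)

Lemma

$Y^* = \sigma^*(\sigma^*)^\top$ is the unique solution if $\exists D, \lambda$ s.t. $S = \lambda J + D - A$ satisfies $S\sigma^* = 0$ and $\lambda_2(S) > 0$.

⇒ $d_i$ = (# of nbrs in own cluster) − (# of nbrs in other cluster)

$$d_i = \begin{cases} e(i, C_1) - e(i, C_2) & i \in C_1 \\ e(i, C_2) - e(i, C_1) & i \in C_2 \end{cases}$$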
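The step from $S\sigma^* = 0$ to the formula for $d_i$ can be spelled out as follows. This is a short worked derivation using only the definitions on this slide, with $\mathbf{1}$ denoting the all-one vector (so $J = \mathbf{1}\mathbf{1}^\top$).

```latex
\begin{align*}
(S\sigma^*)_i
  &= \lambda\,(J\sigma^*)_i + d_i\,\sigma^*_i - \sum_{j} A_{ij}\,\sigma^*_j \\
  &= d_i\,\sigma^*_i - \sum_{j} A_{ij}\,\sigma^*_j
     && \text{since } (J\sigma^*)_i = \mathbf{1}^\top \sigma^* = 0, \\
(S\sigma^*)_i = 0
  \;\Longrightarrow\;
  d_i &= \sigma^*_i \sum_{j} A_{ij}\,\sigma^*_j
     && \text{since } (\sigma^*_i)^2 = 1, \\
  &= \sum_{j:\ \sigma^*_j = \sigma^*_i} A_{ij}
     \;-\; \sum_{j:\ \sigma^*_j \neq \sigma^*_i} A_{ij}.
\end{align*}
```

The last line is exactly the number of neighbors of $i$ in its own cluster minus the number in the other cluster.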


Verify PSD

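• Mean adj matrix: $\mathbb{E}[A] = \frac{p+q}{2} J + \frac{p-q}{2} \sigma^*(\sigma^*)^\top - pI$

• $$S = \lambda J - A + D = \left(\lambda - \frac{p+q}{2}\right) J - \frac{p-q}{2} \sigma^*(\sigma^*)^\top + pI + D - (A - \mathbb{E}[A])$$

• $\lambda_2(S) = \inf_{x \perp \sigma^*,\ \|x\|_2 = 1} x^\top S x > 0$ if $\min d_i \ge \|A - \mathbb{E}[A]\|$ and $\lambda \ge (p+q)/2$

• To finish the proof:

1. $\min d_i = \Omega_P(\log n)$ if $\sqrt{a} - \sqrt{b} > \sqrt{2}$

2. $\|A - \mathbb{E}[A]\| = O_P(\sqrt{\log n})$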
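The sufficient condition for $\lambda_2(S) > 0$ follows by bounding the quadratic form term by term. The sketch below fills in that step for a unit vector $x \perp \sigma^*$, using the decomposition of $S$ on this slide and $p > 0$.

```latex
\begin{align*}
x^\top S x
  &= \Big(\lambda - \tfrac{p+q}{2}\Big)\,(\mathbf{1}^\top x)^2
     - \tfrac{p-q}{2}\,\big((\sigma^*)^\top x\big)^2
     + p\,\|x\|_2^2 + x^\top D x - x^\top (A - \mathbb{E}[A])\,x \\
  &\ge 0 \;-\; 0 \;+\; p \;+\; \min_i d_i \;-\; \|A - \mathbb{E}[A]\| \\
  &> 0
  \qquad \text{whenever } \lambda \ge \tfrac{p+q}{2}
  \text{ and } \min_i d_i \ge \|A - \mathbb{E}[A]\|.
\end{align*}
```

The first term is nonnegative because $\lambda \ge (p+q)/2$, the second vanishes because $x \perp \sigma^*$, and $x^\top D x \ge \min_i d_i$ because $D$ is diagonal and $\|x\|_2 = 1$.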


Remarks

1. Necessity

$\sqrt{a} - \sqrt{b} < \sqrt{2}$

⇒ $\min d_i < 0$ w.h.p.

⇒ ∃i : # of nbrs in own cluster < # of nbrs in other cluster

⇒ MLE fails

2. Proof of $\|A - \mathbb{E}[A]\| = O_P(\sqrt{\log n})$

  ◦ 2nd-order stochastic dominance argument [Tomozei-Massoulie ’14] + result for iid matrix [Seginer ’00]

  ◦ [Feige-Ofek ’05]: $G(n, \frac{C \log n}{n})$ for sufficiently large $C$

  ◦ [Bandeira-van Handel ’14]: comparison argument
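The quantity $d_i$ in Remark 1 is easy to compute for a given graph and labeling. The sketch below is illustrative only (the function names are not from the talk): it evaluates $d_i = \sigma_i \sum_j A_{ij}\sigma_j$ and lists the nodes with more neighbors in the other cluster than in their own.

```python
def degree_margins(A, sigma):
    """d_i = (# nbrs of i in own cluster) - (# nbrs of i in other cluster),
    computed as sigma_i * sum_j A[i][j] * sigma_j for labels in {+1, -1}."""
    n = len(A)
    return [sigma[i] * sum(A[i][j] * sigma[j] for j in range(n))
            for i in range(n)]


def nodes_with_negative_margin(A, sigma):
    """Indices i with d_i < 0, i.e. more neighbors across than within."""
    return [i for i, d in enumerate(degree_margins(A, sigma)) if d < 0]
```

A nonempty result for the true labeling corresponds to the situation described in the necessity remark.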


Multiple equal-sized communities


r equal-sized clusters

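• $\{0, 1\}$-cluster matrix:

$$Y^* = \sum_{k=1}^{r} \xi_k \xi_k^\top = \begin{pmatrix} \mathbf{1} & & 0 \\ & \ddots & \\ 0 & & \mathbf{1} \end{pmatrix}$$

where $\xi_k$ = indicator of the $k$th cluster of size $K = n/r$

• SDP relaxation of MLE:

$$\max_{Y}\ \langle A, Y \rangle \quad \text{s.t.}\quad Y \succeq 0, \quad Y_{ii} = 1, \quad Y_{ij} \ge 0, \quad \sum_{j} Y_{ij} = K$$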
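As a sanity check on the formulation, the true cluster matrix $Y^*$ satisfies the linear constraints of this SDP. The sketch below builds $Y^*$ from a label vector and verifies them; the function names are hypothetical, and positive semidefiniteness is not checked in code (it holds for $Y^*$ because it is a sum of outer products $\xi_k \xi_k^\top$).

```python
def cluster_matrix(labels):
    """Y*[i][j] = 1 if nodes i and j share a cluster label, else 0."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]


def satisfies_linear_constraints(Y, K):
    """Check Y_ii = 1, Y_ij >= 0 and sum_j Y_ij = K for every row i."""
    n = len(Y)
    unit_diagonal = all(Y[i][i] == 1 for i in range(n))
    nonnegative = all(Y[i][j] >= 0 for i in range(n) for j in range(n))
    row_sums = all(sum(Y[i]) == K for i in range(n))
    return unit_diagonal and nonnegative and row_sums
```

For instance, with labels `[0, 0, 1, 1, 2, 2]` (so $r = 3$, $K = 2$) the constraints hold with `K=2` and fail with any other `K`.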


Optimality of SDP

Theorem ([Hajek-W.-Xu ’15])

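SDP achieves optimal threshold $(\sqrt{a} - \sqrt{b})^2 > r$.

Proof of correctness:

$$\max_{Y}\ \langle A, Y \rangle$$

s.t. (dual variable for each constraint in parentheses)

$Y \succeq 0$ ($S \succeq 0$)

$Y_{ii} = 1$ ($d_i$)

$Y_{ij} \ge 0$ ($B \ge 0$)

$\sum_{j} Y_{ij} = K$ ($\lambda_i$)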


Construction of the dual witness

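• For node $i \in C_k$,

$$\lambda_i = \frac{1}{K} \left( \max_{\ell \neq k} e(i, C_\ell) - Kq/2 + \sqrt{\log n}/2 \right)$$

$$d_i = e(i, C_k) - \max_{\ell \neq k} e(i, C_\ell) - \frac{1}{K} \sum_{j \in C_k} \max_{\ell \neq k} e(j, C_\ell) + Kq - \sqrt{\log n}$$

• $B$ is a block matrix with zero diagonal blocks, where each off-diagonal block is rank-2, specified by

$$B_{C_k \times C_{k'}}(i, j) = \frac{1}{K} \left( \max_{\ell \neq k} e(i, C_\ell) - e(i, C_{k'}) + \max_{\ell \neq k'} e(j, C_\ell) - e(j, C_k) + \frac{e(C_k, C_{k'})}{K} - Kq + \sqrt{\log n} \right)$$

• $S = D - A - B + \lambda \mathbf{1}^\top + \mathbf{1} \lambda^\top$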


• $S\xi_k = 0$ for $k = 1, \dots, r$.

• $\lambda_{r+1}(S) > 0$ if $\min d_i \ge \|A - \mathbb{E}[A]\| = O_P(\sqrt{\log n})$

• $d_i$ = (# of nbrs in own cluster) − maximal (# of nbrs in other clusters) + $O_P(\sqrt{\log n})$.

• Sharp threshold

  ◦ $\sqrt{a} - \sqrt{b} > \sqrt{r}$ ⇒ $\min d_i = \Omega(\log n)$ ⇒ SDP succeeds

  ◦ $\sqrt{a} - \sqrt{b} < \sqrt{r}$ ⇒ $\min d_i = -\Omega(\log n)$ ⇒ MLE fails


Unequal-sized clusters


Two unequal-sized clusters: known size

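[Figure: block structure of the edge probabilities, with p in the diagonal blocks and q in the off-diagonal blocks]

Two clusters of size $K$ and $n - K$ ($K = \rho n$):

$$\hat{Y}_{\mathrm{SDP}} = \arg\max_{Y}\ \langle A, Y \rangle \quad \text{s.t.}\quad Y \succeq 0, \quad Y_{ii} = 1,\ i \in [n], \quad \langle J, Y \rangle = (2K - n)^2$$

achieves optimal threshold $\eta(\rho, a, b) > 1$.

Note: $\rho \mapsto \eta(\rho, a, b)$ is minimized at $\eta(1/2, a, b) = \frac{1}{2}(\sqrt{a} - \sqrt{b})^2$ ⇒ “suggests” equal-sized case is the hardest for two communities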


Two unequal-sized clusters: unknown size

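[Figure: block structure of the edge probabilities, with p in the diagonal blocks and q in the off-diagonal blocks]

Two clusters of size $K$ and $n - K$ ($K = 0, 1, \dots, n$):

$$\hat{Y}_{\mathrm{SDP}} = \arg\max_{Y}\ \langle A, Y \rangle - \lambda \langle J, Y \rangle \quad \text{s.t.}\quad Y \succeq 0, \quad Y_{ii} = 1,\ i \in [n]$$

with $\lambda = \frac{a - b}{\log a - \log b} \cdot \frac{\log n}{n}$ achieves optimal threshold $(\sqrt{a} - \sqrt{b})^2 > 2$.

Note: If $K = \Omega(n)$, there exists a data-driven choice of $\lambda$.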
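The penalty $\lambda$ above is the logarithmic mean of $a$ and $b$ scaled by $\log n / n$, so for $a \neq b$ it always lies strictly between $q$ and $p$. The sketch below computes it and exposes $p$ and $q$ for comparison; the function name and return convention are illustrative, and it assumes $a, b > 0$ with $a \neq b$.

```python
import math


def penalty_lambda(a, b, n):
    """Return (lam, p, q) with
    lam = (a - b) / (log a - log b) * log(n) / n,
    p = a*log(n)/n and q = b*log(n)/n."""
    if a <= 0 or b <= 0 or a == b:
        raise ValueError("need a, b > 0 and a != b")
    scale = math.log(n) / n
    lam = (a - b) / (math.log(a) - math.log(b)) * scale
    return lam, a * scale, b * scale
```

For example, `penalty_lambda(9.0, 1.0, 1000)` returns a value of `lam` with `q < lam < p`, consistent with reading $\lambda$ as a cutoff between the two edge densities.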


More generally...

• Binary censored block model: $G(n, \frac{a \log n}{n})$, observe edge label flipped w.p. $\epsilon$

  ◦ SDP achieves sharp threshold $a(\sqrt{1 - \epsilon} - \sqrt{\epsilon})^2 > 1$

  ◦ Closes the gap in [Abbe-Bandeira-Bracher-Singer ’14]

• General SBM:

  ◦ Optimality of SDP relaxation remains open (but within a factor of 4)

  ◦ Sharp threshold is found in [Abbe-Sandon ’15] via a two-stage procedure.


Detecting a single cluster


Finding a single community

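[Figure: block structure of the edge probabilities, with p in the cluster block and q elsewhere]

• One cluster of size $K$ plus $n - K$ outliers

• Connectivity $p$ within cluster and $q$ otherwise

• Also known as Planted Dense Subgraph model

• Linear community size: $K = \rho n$ and SDP achieves sharp threshold

• Next focus on $K = \Theta(n^\beta)$.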


Conjecture on computational limit

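[Figure: phase diagram in the (α, β) plane for $p = cq = \Theta(n^{-\alpha})$ and $K = \Theta(n^\beta)$, with regions labeled "impossible" and "easy", the spectral barrier, and axis ticks at 1/2 and 1]

Conjecture [Chen-Xu ’14]: no polynomial-time algorithm succeeds beyond the spectral barrier [Nadakuditi-Newman ’12]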


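[Figure: decomposition of $A$ into its mean (a $K \times K$ block with entries $p$, and $q$ elsewhere) plus the noise $A - \mathbb{E}[A]$]

[Figure: eigenvalue distribution of $(A - q\mathbf{1}\mathbf{1}^\top)/\sigma$ for $\sigma = \sqrt{q(1-q)n}$, with the semi-circle law and the value $K(p-q)/\sigma$ marked; horizontal axis from −3 to 5, vertical axis from 0 to 0.3]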


Planted clique hardness hypothesis

$H_0$: Bern(γ) everywhere vs $H_1$: Bern(1) on a $K \times K$ block, Bern(γ) elsewhere

Intermediate regime: $\log n \ll K \ll \sqrt{n}$, $\gamma = \Theta(1)$

• detection is possible but believed to have high computational complexity: [Alon et al. ’11] [Feldman et al. ’13] [Deshpande-Montanari ’15] [Meka-Potechin-Wigderson ’15]

• various hardness results assuming Planted Clique hardness

  ◦ detecting sparse principal component [Berthet-Rigollet ’13]: $\gamma = \frac{1}{2}$

  ◦ detecting sparse submatrix [Ma-W. ’13, Cai-Liang-Rakhlin ’15]: $\gamma = \frac{1}{2}$

  ◦ cryptography [Applebaum-Barak-Wigderson ’10]: $\gamma = 2^{-\log^{0.99} n}$
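For reference, the two hypotheses in the Planted Clique problem can be simulated as follows. This is a minimal standard-library sketch; the function name, the `planted` flag, and the choice of a uniformly random clique location are assumptions made for illustration.

```python
import random


def sample_planted_clique(n, k, gamma, planted, seed=0):
    """Sample an adjacency matrix under H0 (planted=False): every edge is
    Bern(gamma); or under H1 (planted=True): edges inside a random set of
    k nodes are present with probability 1, all others are Bern(gamma)."""
    rng = random.Random(seed)
    clique = set(rng.sample(range(n), k)) if planted else set()
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            in_clique = i in clique and j in clique
            if in_clique or rng.random() < gamma:
                A[i][j] = A[j][i] = 1
    return A, clique
```

A detection algorithm sees only `A` and must decide whether `planted` was true; the hardness hypothesis concerns the regime of `k` stated above.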


Hard regime for recovering a single cluster

Assuming Planted Clique hardness for any constant γ > 0

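[Figure: phase diagram in the (α, β) plane for $p = cq = \Theta(n^{-\alpha})$ and $K = \Theta(n^\beta)$, with regions labeled "impossible", "easy" and "hard", origin O, and axis ticks at 1/2, 2/3 and 1]

Recovering a single cluster in the red regime is at least as hard as detecting a clique of size $K = o(\sqrt{n})$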


Proof step 1: Recovery is harder than detection

Recovery versus Detection [Arias-Castro-Verzelen ’14] :

$H_0$: Bern($q$) everywhere vs $H_1$: Bern($p$) on $S \times S$, Bern($q$) elsewhere

Each node is included in $S$ with probability $K/n$


Proof step 2: Hardness for detecting a single cluster

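[Figure: phase diagram in the (α, β) plane for $p = cq = \Theta(n^{-\alpha})$ and $K = \Theta(n^\beta)$, with regions labeled "impossible", "easy" and "hard", origin O, and axis ticks at 1/2, 2/3 and 1]

• Detecting a single cluster in the red regime is at least as hard as detecting a clique of size $K = o(\sqrt{n})$

• Reduced from Planted Clique detection in polynomial time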


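[Figure: the map $h$ sends the planted-clique adjacency matrix $A_{n \times n}$ to a matrix $\tilde{A}_{N \times N}$. Under $H_0$, Bern(γ) is mapped to Bern($q$); under $H_1$, Bern(γ) with a $k \times k$ clique is mapped to Bern($q$) with a $K \times K$ block of Bern($p$)]

$h : A \mapsto \tilde{A}$ is agnostic to the clique and can be computed in P-time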


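Given an integer $\ell$ and two probability distributions $P, Q$ on $\{0, 1, \ldots, \ell^2\}$:

• Split each node into $\ell$ new nodes: $N = n\ell$, $K = k\ell$

• Assign edges with distributions $P, Q$: a non-edge (0) $\mapsto Q$, an edge (1) $\mapsto P$

[Figure: each pair of original nodes becomes an $\ell \times \ell$ block between the two groups of new nodes]

Resulting edge distribution per block:

• H0 : Bern($\gamma$) $\mapsto (1 - \gamma)Q + \gamma P$

• H1 : Bern(1) (in-clique) $\mapsto P$ (in-cluster)

How to choose $P, Q$?

• Matching H0: $(1 - \gamma)Q + \gamma P = \mathrm{Binom}(\ell^2, q)$

• Matching H1 approximately: $P \approx \mathrm{Binom}(\ell^2, p)$ in total variation

• Main effort: the law of the resulting graph is close to SBM in total variation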
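The node-splitting map can be sketched in a few lines. The sketch below makes several assumptions that the slide does not spell out: `P` and `Q` are supplied by the caller as probability vectors of length $\ell^2 + 1$; the drawn number of edges is placed uniformly at random inside the $\ell \times \ell$ block; and edges among the $\ell$ copies of the same original node are drawn i.i.d. Bern($q$). The name `lift_graph` is hypothetical.

```python
import numpy as np

def lift_graph(A, ell, P, Q, q, rng):
    """Map an n x n adjacency matrix to an (n*ell) x (n*ell) one.

    Illustrative sketch of the node-splitting reduction; not the authors' code.
    P, Q: probability vectors over {0, ..., ell**2} (edge counts per block).
    """
    n = A.shape[0]
    N = n * ell
    out = np.zeros((N, N), dtype=np.int8)
    support = np.arange(ell * ell + 1)
    for i in range(n):
        rows = slice(i * ell, (i + 1) * ell)
        # Assumption: copies of the same node are joined i.i.d. with prob. q.
        block = np.triu(rng.random((ell, ell)) < q, k=1)
        out[rows, rows] = block | block.T
        for j in range(i + 1, n):
            cols = slice(j * ell, (j + 1) * ell)
            # Edge (1) -> count drawn from P; non-edge (0) -> count from Q.
            dist = P if A[i, j] else Q
            m = rng.choice(support, p=dist)
            # Assumption: the m edges occupy uniformly random block positions.
            flat = np.zeros(ell * ell, dtype=np.int8)
            flat[rng.choice(ell * ell, size=m, replace=False)] = 1
            out[rows, cols] = flat.reshape(ell, ell)
            out[cols, rows] = out[rows, cols].T
    return out

# Toy usage with degenerate P, Q: edges become full blocks, non-edges empty.
rng = np.random.default_rng(1)
A = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
P = np.zeros(5); P[4] = 1.0
Q = np.zeros(5); Q[0] = 1.0
lifted = lift_graph(A, ell=2, P=P, Q=Q, q=0.0, rng=rng)
```

The map never looks at which nodes form the clique: it only reads the entries $A_{ij}$.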


Concluding remarks

• Versatility of SDP as a simple, general purpose, computationally feasible methodology for community detection

• Construction of dual witness lacks a general recipe


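[Figure: phase diagram in the $(\alpha, \beta)$ plane, with $p = cq = \Theta(n^{-\alpha})$ and $K = \Theta(n^{\beta})$; regions labeled impossible, easy, and hard, plus a region marked "?"; axis ticks at 1/2, 2/3, and 1]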

References

• B. Hajek, Y. Wu & J. Xu (2014). Computational lower bounds for community detection on random graphs. arXiv:1406.6625 (COLT ’15)

• B. Hajek, Y. Wu & J. Xu (2014). Achieving exact cluster recovery threshold via semidefinite programming. arXiv:1412.6156

• B. Hajek, Y. Wu & J. Xu (2015). Achieving exact cluster recovery threshold via semidefinite programming: Extensions. arXiv:1502.07738


Formal statement of hardness of detecting a cluster

γ: edge probability in Planted Clique

Theorem

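Assume the Planted Clique Hypothesis holds for all $0 < \gamma \le 1/2$. Let $\alpha > 0$ and $0 < \beta < 1$ be such that

$$\alpha < \beta < \frac{1}{2} + \frac{\alpha}{4}.$$

Then there exists a sequence $(N_\ell, K_\ell, q_\ell)_{\ell \in \mathbb{N}}$ satisfying

$$\lim_{\ell \to \infty} \frac{-\log q_\ell}{\log N_\ell} = \alpha \quad \text{and} \quad \lim_{\ell \to \infty} \frac{\log K_\ell}{\log N_\ell} = \beta$$

such that for any sequence of randomized polynomial-time tests $\phi_\ell$ for the PDS$(N_\ell, K_\ell, 2q_\ell, q_\ell)$ problem, the Type-I+II error probability is lower bounded by 1.

Proof ideas: Reduce from Planted Clique in polynomial time. Map approximately:

• $G(n, \gamma) \mapsto G(N, q)$

• $G(n, k, \gamma, 1) \mapsto G(N, K, q, p)$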


Bound the total variation distance

Lemma

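Let $\ell, n \in \mathbb{N}$, $k \in [n]$ and $\gamma \in (0, \frac{1}{2}]$. Let $N = \ell n$, $K = k\ell$, $p = 2q$ and $m_0 = \lfloor \log_2(1/\gamma) \rfloor$. Assume that $16 q \ell^2 \le 1$ and $k \ge 6e\ell$. If $G \sim G(n, \gamma)$, then $\tilde{G} \sim G(N, q)$. If $G \sim G(n, k, 1, \gamma)$, then

$$d_{\mathrm{TV}}\big(P_{\tilde{G}}, G(N, K, p, q)\big) \lesssim e^{-K} + k e^{-\ell} + k^2 (q\ell^2)^{m_0 + 1} + \sqrt{e^{q\ell^2} - 1}.$$

Proof ideas: $d_{\mathrm{TV}}(P, Q) \le \frac{1}{2}\sqrt{\chi^2(P, Q)}$, and use negative association [Dubhashi-Ranjan ’98] to get rid of dependency in calculating the $\chi^2$ distance.

Apply the Lemma by choosing $q = \ell^{-(2+\delta)}$ so that $q\ell^2 \to 0$: $N = \ell^{\frac{2+\delta}{\alpha}}$, $K = \ell^{\frac{(2+\delta)\beta}{\alpha}}$, $n = \ell^{\frac{2+\delta}{\alpha} - 1}$, $k = \ell^{\frac{(2+\delta)\beta}{\alpha} - 1}$. Easy to check that

$$\alpha < \beta < \frac{1}{2} - \delta + \frac{\alpha(1 + 2\delta)}{4 + 2\delta} \;\Rightarrow\; \frac{\log k}{\log n} \le \frac{1}{2} - \delta.$$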
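The implication stated on this slide can be verified by substituting the exponents of $k$ and $n$. The derivation below is a sketch added for illustration. It uses only the parameter choices on this slide, and it assumes $\delta > 0$ and $\alpha < 1$ (which follows from $\alpha < \beta < 1$), so that $(2 + \delta) - \alpha > 0$.

```latex
\begin{align*}
\frac{\log k}{\log n}
  &= \frac{\frac{(2+\delta)\beta}{\alpha} - 1}{\frac{2+\delta}{\alpha} - 1}
   = \frac{(2+\delta)\beta - \alpha}{(2+\delta) - \alpha}, \\
\frac{\log k}{\log n} \le \frac{1}{2} - \delta
  &\iff (2+\delta)\beta \le \Big(\tfrac{1}{2} - \delta\Big)(2 + \delta - \alpha) + \alpha \\
  &\iff (2+\delta)\beta \le \Big(\tfrac{1}{2} - \delta\Big)(2 + \delta)
        + \alpha\Big(\tfrac{1}{2} + \delta\Big) \\
  &\iff \beta \le \frac{1}{2} - \delta + \frac{\alpha(1 + 2\delta)}{4 + 2\delta}.
\end{align*}
```

Letting $\delta \to 0$ in the last line gives $\beta \le \frac{1}{2} + \frac{\alpha}{4}$, which matches the upper limit on $\beta$ in the theorem. The conclusion $\log k / \log n \le \frac{1}{2} - \delta$ says that the planted clique has size $k \le n^{1/2 - \delta} = o(\sqrt{n})$.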


Spectral concentration

Theorem

Let $A$ denote a symmetric and zero-diagonal random matrix, where the entries $\{A_{ij} : i < j\}$ are independent and $[0, 1]$-valued. Assume that $\mathbb{E}[A_{ij}] \le p$, where $c_0 \log n / n \le p \le 1 - c_1$ for arbitrary constants $c_0 > 0$ and $c_1 > 0$. Then for any $c > 0$, there exists $c' > 0$ such that for any $n \ge 1$,

$$\mathbb{P}\big\{ \|A - \mathbb{E}[A]\|_2 \le c' \sqrt{np} \big\} \ge 1 - n^{-c}.$$
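A quick numerical sanity check of the $\sqrt{np}$ scaling follows. It is a sketch, not a proof: it treats only the special case of i.i.d. Bern($p$) entries, and the function name `centered_spectral_norm` and the values $n = 400$, $p = 0.1$ are arbitrary choices for illustration.

```python
import numpy as np

def centered_spectral_norm(n, p, rng):
    """Spectral norm of A - E[A] for a symmetric Bern(p) matrix, zero diagonal."""
    # Independent Bern(p) entries above the diagonal, then symmetrize.
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    A = upper + upper.T
    # E[A] has p off the diagonal and 0 on it.
    mean = p * (np.ones((n, n)) - np.eye(n))
    # ord=2 gives the largest singular value (operator norm).
    return np.linalg.norm(A - mean, ord=2)

rng = np.random.default_rng(0)
n, p = 400, 0.1
ratio = centered_spectral_norm(n, p, rng) / np.sqrt(n * p)
print(f"||A - E[A]||_2 / sqrt(np) = {ratio:.2f}")
```

Repeating this for growing $n$ with $p$ held fixed should give a ratio that stays bounded, which is what the theorem predicts with high probability.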
