twiki.di.uniroma1.it/pub/BDC/Schedule/lecture9-tight-knit.pdf

FINDING TIGHT-KNIT CIRCLES (I.E., TRIANGLE COUNTING) Irene Finocchi


Network properties

¨  Locality of relationships
¤  Relationships tend to cluster: high clustering coefficient
¤  If A is related to both B and C, then B and C are related with probability higher than average

¨  Giant connected component

¨  Sparse: $m = o(n^2)$

¨  Small-world
¤  Small average distance between nodes

¨  Scale-free
¤  Power-law degree distribution

Counting triangles

¨  How many triangles in a graph?

¨  Triangle = smallest clique = smallest tight-knit community

¨  A lot of interest for social network analysis (not only!)

[Figure: example graph on nodes 1 to 5]

Naive approach

¨  List all node triples, and for each triple check if it forms a triangle

¨  $\binom{n}{3}$ triples

¨  $\Theta(n^3)$ time, independent of the graph structure (and of the number of edges)
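A minimal Python sketch of the triple-enumeration approach described here. The function name `count_triangles_naive` and the sample graph are assumptions for illustration; the edge list is hypothetical and not necessarily the graph drawn on the slide.

```python
from itertools import combinations


def count_triangles_naive(nodes, edges):
    """Count triangles by testing every unordered triple of nodes."""
    # Store each undirected edge as a frozenset so (u, v) and (v, u) match.
    edge_set = {frozenset(e) for e in edges}
    count = 0
    # combinations() yields all C(n, 3) triples, whatever the edge set is.
    for a, b, c in combinations(nodes, 3):
        if (frozenset((a, b)) in edge_set
                and frozenset((b, c)) in edge_set
                and frozenset((a, c)) in edge_set):
            count += 1
    return count


# Hypothetical 5-node example graph (assumed, for illustration only).
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
print(count_triangles_naive(nodes, edges))
```

The loop runs over every triple even when the graph has no edges at all, which is why the running time does not depend on the graph structure.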

Three case studies

¨  Clique on n nodes

¨  Star with n-1 leaves

¨  Binary tree on n nodes

Naive approach: $\Theta(n^3)$ in all three cases

Algorithm NodeIterator

Γ(x) = neighbors of node x

$O(n^3)$ iterations
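A minimal Python sketch of NodeIterator as it is commonly stated: for every node v, check each pair of neighbors in Γ(v) for a connecting edge. The adjacency-set representation, the function name `node_iterator`, and the sample edge list are assumptions made for illustration.

```python
from collections import defaultdict


def node_iterator(edges):
    """Count triangles by checking, for each node, all pairs of its neighbors."""
    # adj[x] plays the role of Γ(x), the set of neighbors of x.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    hits = 0
    for v in list(adj):
        for u in adj[v]:
            for w in adj[v]:
                # u < w visits each unordered neighbor pair once
                # (assumes node identifiers are comparable).
                if u < w and w in adj[u]:
                    hits += 1
    # Every triangle is found once from each of its three corners.
    return hits // 3


# Hypothetical example graph (assumed, for illustration only).
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
print(node_iterator(edges))
```

The two inner loops range over Γ(v) only, so a node of degree d costs about d² steps instead of the n² a triple scan would spend on it.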

NodeIterator on the three case studies

¨  Clique on n nodes: $\Theta(n^3)$
¤  Cannot improve on this bound: after all, this is the number of triangles

¨  Star with n-1 leaves: $\Theta(n^2)$
¤  Better than the naive approach, but still quadratic on a sparse graph (constant average degree) with 0 triangles

¨  Binary tree on n nodes: $\Theta(n)$
¤  Cannot do better than linear time
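These three bounds follow from counting neighbor pairs: a node of degree d contributes C(d, 2) pair checks. The sketch below computes that sum from the degree sequence alone. The helper names and the heap-style layout of the binary tree are assumptions for illustration.

```python
from math import comb


def neighbor_pairs(degrees):
    """Total neighbor pairs examined: sum of C(d(v), 2) over all nodes."""
    return sum(comb(d, 2) for d in degrees)


def clique_degrees(n):
    # Every node is adjacent to the other n - 1 nodes.
    return [n - 1] * n


def star_degrees(n):
    # One center of degree n - 1 and n - 1 leaves of degree 1.
    return [n - 1] + [1] * (n - 1)


def binary_tree_degrees(n):
    # Heap-style binary tree (assumed layout): node i has parent (i - 1) // 2.
    deg = [0] * n
    for i in range(1, n):
        deg[i] += 1
        deg[(i - 1) // 2] += 1
    return deg


n = 1000
for name, degs in [("clique", clique_degrees(n)),
                   ("star", star_degrees(n)),
                   ("binary tree", binary_tree_degrees(n))]:
    print(name, neighbor_pairs(degs))
```

In the star, the center alone accounts for all C(n-1, 2) pairs, which is where the quadratic cost comes from.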

[Figure: example graph on nodes 1 to 5, Reduce 1 with v = 1]

[Figure: example graph on nodes 1 to 5, Reduce 2]

MR-NodeIterator: analysis

¨  Rounds: 2

¨  Global space: $O(n^3)$

¨  Local reducer space: O(n)

Better algorithms exist
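A single-machine Python simulation of a two-round MapReduce formulation of NodeIterator, given as a sketch under stated assumptions. Round 1 groups edges by endpoint and emits every pair of neighbors. Round 2 joins those pairs with the original edges and counts the pairs that are closed by an edge. The `shuffle` helper, the `$` edge marker, and the sample edge list are illustrative choices, not a definitive implementation.

```python
from collections import defaultdict
from itertools import combinations

EDGE_MARK = "$"  # assumed marker for "this pair is an edge"; node ids must differ from it


def shuffle(pairs):
    """Group (key, value) pairs by key, as the MapReduce shuffle would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def mr_node_iterator(edges):
    # Round 1, map: send each edge to both of its endpoints.
    round1 = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    # Round 1, reduce: the reducer for v holds Γ(v) and emits all neighbor pairs.
    paths = []
    for v, nbrs in shuffle(round1).items():
        for u, w in combinations(sorted(nbrs), 2):
            paths.append(((u, w), v))
    # Round 2, map: forward the pairs and add one marked record per edge.
    round2 = paths + [((min(u, v), max(u, v)), EDGE_MARK) for u, v in edges]
    # Round 2, reduce: a pair (u, w) closes a triangle only if (u, w) is an edge.
    closed = 0
    for values in shuffle(round2).values():
        if EDGE_MARK in values:
            closed += sum(1 for x in values if x != EDGE_MARK)
    # Each triangle is reported once per corner.
    return closed // 3


# Hypothetical example graph (assumed; no self-loops or duplicate edges).
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
print(mr_node_iterator(edges))
```

The `paths` list is what dominates global space: the reducer for a node of degree d emits C(d, 2) records, and a single reducer must hold all of Γ(v), which matches the O(n) local space.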

MR-NodeIterator performance

Algorithm NodeIterator++

Each triangle is counted only once.

Nodes are ordered by degree, with ties broken by identifier: w follows u if d(w) > d(u), or d(w) = d(u) and w > u.
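A minimal Python sketch of NodeIterator++ under the ordering stated on this slide: each node only pairs up neighbors that come after it in the (degree, identifier) order, so a triangle is found only from its lowest-ranked corner. The function name `node_iterator_pp`, the `rank` helper, and the sample edge list are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def node_iterator_pp(edges):
    """Count each triangle once, from its lowest-ranked node."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def rank(x):
        # Order by degree, then by identifier to break ties.
        return (len(adj[x]), x)

    triangles = 0
    for v in list(adj):
        # Keep only the neighbors that follow v in the ordering.
        higher = [u for u in adj[v] if rank(u) > rank(v)]
        for u, w in combinations(higher, 2):
            if w in adj[u]:
                triangles += 1
    return triangles


# Hypothetical example graph (assumed, for illustration only).
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
print(node_iterator_pp(edges))
```

No division by 3 is needed here, because the ordering makes exactly one corner of each triangle responsible for it.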

NodeIterator++: analysis

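$h$ = # nodes of G with degree > $\sqrt{m}$

$h \cdot \sqrt{m} \le 2m$ (the degrees sum to $2m$), hence $h \le 2\sqrt{m}$

$d^+(v)$ = # neighbors of v with degree > $d(v)$

1)  $d(v) \le \sqrt{m}$: $d^+(v) \le d(v) \le \sqrt{m}$

2)  $d(v) > \sqrt{m}$: nodes counted in $d^+(v)$ have degree > $d(v) > \sqrt{m}$, hence $d^+(v) \le h \le 2\sqrt{m}$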
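Both cases give $d^+(v) \le 2\sqrt{m}$ for every node. One way to turn this into a bound on the number of neighbor pairs examined, assuming each edge is charged only to its lower-ranked endpoint so that $\sum_v d^+(v) \le m$:

```latex
\[
\sum_{v \in V} \binom{d^+(v)}{2}
\;\le\; \sum_{v \in V} d^+(v)^2
\;\le\; 2\sqrt{m} \sum_{v \in V} d^+(v)
\;\le\; 2\sqrt{m} \cdot m
\;=\; O(m^{3/2}).
\]
```

The second inequality replaces one factor of $d^+(v)$ with its upper bound $2\sqrt{m}$, and the third uses the edge-charging assumption.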

Algorithm NodeIterator++

Ordering: d(w) > d(u), or d(w) = d(u) and w > u

# iterations: $O(m^{3/2})$

NodeIterator++ on the three case studies

¨  Clique on n nodes: $\Theta(n^3)$
¤  Cannot improve on this bound: after all, this is the number of triangles

¨  Star with n-1 leaves: $O(n^{1.5})$
¤  At least √n faster than before, by the $O(m^{3/2})$ bound with m = Θ(n)

¨  Binary tree on n nodes: $\Theta(n)$
¤  Cannot do better than linear time

[Figure: example graph on nodes 1 to 5]

MR-NodeIterator++: analysis

¨  Rounds: 2

¨  Global space: $O(m^{3/2})$

¨  Local reducer space: O(n)

Local space can be further reduced

MR-NodeIterator++ performance

The curse of the last reducer

•  Very skewed degree distributions
•  NodeIterator++ deals with skewness much better
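To make the skew concrete, the sketch below compares the number of neighbor pairs each node's reducer would emit under the two variants, on a deliberately skewed graph. The wheel graph (one hub joined to every node of a cycle) and the helper names are assumptions chosen for illustration, not data from the lecture.

```python
from collections import defaultdict
from math import comb


def reducer_loads(edges):
    """Per-node pair counts for NodeIterator and for NodeIterator++."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def rank(x):
        # Degree first, identifier as tie-breaker.
        return (len(adj[x]), x)

    # NodeIterator: every pair of neighbors of v.
    plain = {v: comb(len(adj[v]), 2) for v in adj}
    # NodeIterator++: only pairs of neighbors ranked above v.
    ordered = {v: comb(sum(1 for u in adj[v] if rank(u) > rank(v)), 2)
               for v in adj}
    return plain, ordered


def wheel(n):
    """Hub 0 joined to rim nodes 1..n-1, which form a cycle (assumed test graph)."""
    rim = list(range(1, n))
    spokes = [(0, r) for r in rim]
    cycle = [(rim[i], rim[(i + 1) % len(rim)]) for i in range(len(rim))]
    return spokes + cycle


plain, ordered = reducer_loads(wheel(1001))
print("largest reducer, NodeIterator:  ", max(plain.values()))
print("largest reducer, NodeIterator++:", max(ordered.values()))
```

With the plain variant the hub's reducer carries almost all of the work, so the job finishes only when that last reducer does. Under the degree ordering the hub has no higher-ranked neighbors and the load is spread over the low-degree nodes.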

References

Siddharth Suri & Sergei Vassilvitskii, Counting triangles and the curse of the last reducer, Proc. 20th International Conference on World Wide Web, 2011