BD3 2013

Scalable and Robust Management of Dynamic Graph Data

Alan G. Labouseur Paul W. Olsen Jr. Jeong-Hyon Hwang

{alan, polsen, jhh}@cs.albany.edu

Sunday, September 22, 2013

Large, Dynamic Networks

• Social Networks

• Consumer Commerce Networks

• Financial Networks

• Road Networks

• Internet / WWW

• DNA Interactions

Analysis of Large, Dynamic Networks

• Transportation

[Figure: road maps with route estimates at three times of day: 5:00 AM, 9.1 mi, 20 mins; 6:00 AM, 15 mi, 25 mins; 7:00 AM, 20 mi, 30 mins]

• Social and Political Studies / Marketing / National Security

- How do communities or the centrality of an entity change over time?

- Who are rising stars?

The G* System (1/2)

• distributed, deduplicated storage of graph snapshots

[Figure: snapshots G1, G2, and G3 over vertices a through f, stored on workers α, β, γ, ... Each stored vertex version is labeled with the region of snapshots that contain it: G1∩G2∩G3, (G1∩G2)-G3, (G2∩G3)-G1, G1-G2-G3, or G3-G1-G2.]

The G* System (2/2)

[Figure: query plan computing the average vertex degree of each snapshot. Each of the workers α, β, γ runs the operators vertex, degree, and count, sum over its part of the deduplicated store; a union operator merges the partial results and an average operator produces the final values. Stored vertex versions are labeled with the snapshots that share them: {G1,G2,G3}, {G1,G2}, {G2,G3}, {G1}, {G3}.]

Tuples shown in the figure, one group per worker (G1 and G2 only):

- vertex output: (a,♢,{G1,G2}); (b,♢,{G1,G2}); (c,♢,{G1}), (d,♢,{G1,G2}), (c,♢,{G2}), (e,♢,{G2})

- degree output: (a,2,{G1,G2}); (b,1,{G1,G2}); (c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2})

- count, sum output: (1,2,{G1,G2}); (1,1,{G1,G2}); (2,0,{G1}), (3,1,{G2})

- union output: (1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

- average output: (3/4, G1), (4/5, G2)

• sophisticated queries / sharing across graph snapshots
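The tuple flow in the figure can be traced with a small sketch. This is not the G* operator API: the function names and the tuple layout (vertex, degree, set of snapshots) are assumptions chosen to mirror the slide. The point it illustrates is sharing: a partial result is computed once and tagged with every snapshot it applies to.

```python
from collections import defaultdict
from fractions import Fraction

# Degree-operator output per worker, copied from the slide:
# (vertex, degree, snapshots that share this vertex version).
worker_outputs = [
    [("a", 2, {"G1", "G2"})],
    [("b", 1, {"G1", "G2"})],
    [("c", 0, {"G1"}), ("d", 0, {"G1", "G2"}), ("c", 1, {"G2"}), ("e", 0, {"G2"})],
]

def count_sum(tuples):
    """Per-worker partial aggregate: (count, sum, snapshots).

    Snapshots whose (count, sum) are identical share a single output tuple.
    """
    per_snapshot = defaultdict(lambda: [0, 0])
    for _vertex, degree, snapshots in tuples:
        for g in snapshots:
            per_snapshot[g][0] += 1
            per_snapshot[g][1] += degree
    shared = defaultdict(set)
    for g, (count, total) in per_snapshot.items():
        shared[(count, total)].add(g)
    return [(count, total, frozenset(gs)) for (count, total), gs in shared.items()]

def average(partials):
    """Combine the partial (count, sum) tuples into one average per snapshot."""
    totals = defaultdict(lambda: [0, 0])
    for count, total, snapshots in partials:
        for g in snapshots:
            totals[g][0] += count
            totals[g][1] += total
    return {g: Fraction(total, count) for g, (count, total) in totals.items()}

# union: concatenate the partial results of all workers
partials = [t for output in worker_outputs for t in count_sum(output)]
result = average(partials)
print(sorted(result.items()))
```

With the slide's tuples, the sketch yields 3/4 for G1 and 4/5 for G2, matching the average output in the figure.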

Problem Statements

• How to distribute graph snapshots on G* workers?

- new graph snapshots generated continuously

- must be efficient, scalable, and optimized for queries

• How to replicate graph snapshots?

- aim to maximize both availability and performance

Impact of Snapshot Distribution (Example)

• 100 similarly-sized graph snapshots

• 100 G* workers

• PageRank on one snapshot or all snapshots

query            1 worker/snapshot    100 workers/snapshot
one snapshot     300 seconds          20 seconds
all snapshots    300 seconds          2,000 seconds

- 300 seconds (1 worker/snapshot) = loading: 200 seconds + computation: 100 seconds

- 20 seconds (100 workers/snapshot) = loading + comp.: 3 seconds + transmission: 17 seconds

• Lessons

- balance correlated snapshots on many workers

- distribute each snapshot on a few workers
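The four numbers in the table can be reproduced with a simple cost model. This is a sketch under stated assumptions, not the authors' model: it assumes loading and computation divide evenly among the workers holding a snapshot, that the 17-second transmission cost applies to any multi-worker layout, and that groups of workers process different snapshots in parallel while each group handles its snapshots one after another. The function name `query_time` is hypothetical.

```python
WORKERS = 100
LOAD_S = 200.0      # loading one snapshot on a single worker (from the slide)
COMPUTE_S = 100.0   # PageRank on one snapshot on a single worker (from the slide)
TRANSMIT_S = 17.0   # transmission cost when a snapshot spans 100 workers (from the slide)

def query_time(workers_per_snapshot, snapshots_queried):
    """Estimated seconds to run PageRank on the given number of snapshots."""
    # Assumption: loading and computation parallelize perfectly within a snapshot.
    per_snapshot = (LOAD_S + COMPUTE_S) / workers_per_snapshot
    if workers_per_snapshot > 1:
        # Assumption: a fixed transmission cost once a snapshot spans several workers.
        per_snapshot += TRANSMIT_S
    # Disjoint groups of workers can process different snapshots at the same time.
    parallel_snapshots = WORKERS // workers_per_snapshot
    rounds = -(-snapshots_queried // parallel_snapshots)  # ceiling division
    return rounds * per_snapshot

for layout in (1, 100):
    print(layout, query_time(layout, 1), query_time(layout, 100))
```

Under these assumptions the model gives 300 and 300 seconds for 1 worker/snapshot, and 20 and 2,000 seconds for 100 workers/snapshot, which is why neither extreme layout serves both query types well.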

Snapshot Distribution Overview

• partitions groups of snapshots (e.g., {G1, ..., G10}, {G11, ..., G20} ) into segments with a maximum size (e.g., 10GB)

• workers exchange segments for higher query speed

• whenever a segment becomes full, splits it into two (e.g., METIS [SC 95])

Segment Exchange (Example)

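[Figure: placement of segments on workers α and β before and after an exchange.

Before: α holds G1,1, G2,1, G2,2 and β holds G1,2, G3,1, G3,2 (poor balancing, low locality).

After exchanging G1,1 and G3,1: α holds G3,1, G2,1, G2,2 and β holds G1,2, G1,1, G3,2 (good balancing, high locality).]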

Estimating Segment Migration Benefit

• notation

- s: segment to move from worker α to worker β

- Sα: segments on worker α = {G1,1, G2,1, G2,2}

- Sβ: segments on worker β = {G1,2, G3,1, G3,2}

- Qk: k representative query patterns

- p(q): probability that query pattern q is executed

- time(q, Sα, Sβ): estimated duration of query pattern q given segment placements Sα and Sβ

• estimated benefit of migrating s

$$\sum_{q \in Q_k} p(q)\,\bigl(\mathrm{time}(q, S_\alpha, S_\beta) - \mathrm{time}(q, S_\alpha - \{s\}, S_\beta \cup \{s\})\bigr)$$
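The benefit estimate is a probability-weighted sum of time savings, which a few lines of code make concrete. The sketch below is illustrative only: `migration_benefit` follows the formula, but `toy_time` is a hypothetical stand-in for the time(q, Sα, Sβ) estimator (it simply counts the segments a query touches on the busier worker), and the query patterns and probabilities are invented for the example.

```python
def migration_benefit(s, s_alpha, s_beta, patterns, time):
    """Sum over q of p(q) * (time before the move - time after the move).

    patterns maps each query pattern q to its probability p(q);
    time(q, S_alpha, S_beta) returns the estimated duration of q.
    """
    before_a, before_b = frozenset(s_alpha), frozenset(s_beta)
    after_a, after_b = before_a - {s}, before_b | {s}
    return sum(
        p * (time(q, before_a, before_b) - time(q, after_a, after_b))
        for q, p in patterns.items()
    )

def toy_time(q, s_alpha, s_beta):
    """Hypothetical estimator: segments of the queried snapshots on the busier worker."""
    load_a = sum(1 for snapshot, _part in s_alpha if snapshot in q)
    load_b = sum(1 for snapshot, _part in s_beta if snapshot in q)
    return max(load_a, load_b)

# Segment placements from the slide; a segment is (snapshot group, part).
s_alpha = {("G1", 1), ("G2", 1), ("G2", 2)}
s_beta = {("G1", 2), ("G3", 1), ("G3", 2)}

# Invented query patterns: the snapshot groups each pattern reads, with p(q).
patterns = {("G1",): 0.5, ("G1", "G2", "G3"): 0.5}

benefit = migration_benefit(("G1", 1), s_alpha, s_beta, patterns, toy_time)
print(benefit)
```

Under this toy estimator the one-way move of G1,1 has a benefit of -1.0: it leaves α with two segments and β with four, so both patterns get slower. A positive value would indicate a move worth making.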

Updates of Vertices and Edges

• A vertex v and its edges are initially assigned to a worker w(v) determined by the hash value of the vertex ID.

• Worker w(v) stores vertex v and its edges in a segment s and registers (v, s) in an index.

• If segment s migrates to another worker, the worker that created s maintains the current location of s.
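The three rules above can be sketched as a small in-memory model. Everything here is an assumption made for illustration: the `Worker` class, the function names, the use of `zlib.crc32` as the hash, and the choice of one open segment per worker are not taken from G*.

```python
import zlib

class Worker:
    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.segments = {}          # segment ID -> {vertex ID: edge list}
        self.vertex_index = {}      # vertex ID -> segment ID, the (v, s) index
        self.segment_location = {}  # for segments created here: segment ID -> current worker ID

def home_worker(vertex_id, workers):
    """w(v): the worker selected by hashing the vertex ID."""
    # crc32 is a stable stand-in hash; Python's built-in hash() of str varies per process.
    return workers[zlib.crc32(vertex_id.encode("utf-8")) % len(workers)]

def store(vertex_id, edges, workers):
    """Store v and its edges in a segment on w(v) and register (v, s)."""
    w = home_worker(vertex_id, workers)
    segment_id = (w.worker_id, 0)  # assumption: one open segment per worker
    w.segments.setdefault(segment_id, {})[vertex_id] = list(edges)
    w.vertex_index[vertex_id] = segment_id
    w.segment_location.setdefault(segment_id, w.worker_id)
    return w

def migrate(segment_id, creator, holder, target):
    """Move a segment; the worker that created it records the new location."""
    target.segments[segment_id] = holder.segments.pop(segment_id)
    creator.segment_location[segment_id] = target.worker_id

def locate(vertex_id, workers):
    """Find the segment of v and the worker currently holding that segment."""
    creator = home_worker(vertex_id, workers)
    segment_id = creator.vertex_index[vertex_id]
    return segment_id, creator.segment_location[segment_id]

workers = [Worker(i) for i in range(3)]
creator = store("a", ["b", "c"], workers)
segment_id, _ = locate("a", workers)
target = workers[(creator.worker_id + 1) % len(workers)]
migrate(segment_id, creator, creator, target)
print(locate("a", workers))
```

Because the hash of the vertex ID always leads to the creating worker, a lookup needs at most one extra hop after a migration, and the (v, s) index never has to be rewritten.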

Graph Snapshot Replication

• r copies of each snapshot to mask up to r-1 simultaneous worker failures

• queries classified into r categories

• j-th replica optimized for the j-th query category (e.g., one replica distributed over many workers, another replica distributed over a few workers)
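A minimal sketch of how a query might be routed to a replica follows. The `Replica` type and `choose_replica` function are hypothetical, and the sketch makes two simplifying assumptions: replicas live on disjoint sets of workers, and a replica counts as unavailable as soon as any of its workers has failed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Replica:
    category: int       # query category j that this copy is laid out for
    workers: frozenset  # workers holding this copy

def choose_replica(replicas, category, failed):
    """Prefer the replica optimized for the category; otherwise use any intact one."""
    alive = [r for r in replicas if not (r.workers & failed)]
    if not alive:
        raise RuntimeError("no replica available")
    for r in alive:
        if r.category == category:
            return r
    return alive[0]

# r = 2: category 0 = single-snapshot queries (copy spread over many workers),
#        category 1 = multi-snapshot queries (copy kept on a few workers).
replicas = [
    Replica(category=0, workers=frozenset({0, 1, 2, 3})),
    Replica(category=1, workers=frozenset({4, 5})),
]

print(choose_replica(replicas, 0, failed=frozenset()).workers)
print(choose_replica(replicas, 0, failed=frozenset({2})).workers)
```

With disjoint placements, a single worker failure can disable at most one of the two copies, so the query still runs, only on a layout that is not optimized for it.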

Experimental Settings

• 6 nodes

• each node has 8 cores (2.67 GHz), 16 GB RAM, and a 2TB hard drive

• 500 cumulative graph snapshots, each with 20,000 additional edges.

• SSSP, PageRank

Experimental Results (SSSP)

• Speedup

cores      1     2     4     8     16    24    48
speedup    1.0   1.9   3.7   5.9   9.7   12.5  14.7

• Impact of Graph Distribution

query            all workers     subset of workers
one snapshot     8.2 seconds     19.2 seconds
all snapshots    80.5 seconds    53.2 seconds
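One way to read the speedup row is as parallel efficiency, that is, speedup divided by the number of cores. The short sketch below derives it from the reported values; efficiency is not a metric stated on the slide, only a standard quantity computed here from the table.

```python
# Values copied from the speedup table.
cores = [1, 2, 4, 8, 16, 24, 48]
speedup = [1.0, 1.9, 3.7, 5.9, 9.7, 12.5, 14.7]

# Parallel efficiency: fraction of ideal linear speedup achieved.
efficiency = [s / c for s, c in zip(speedup, cores)]

for c, s, e in zip(cores, speedup, efficiency):
    print(f"{c:>2} cores: speedup {s:>4.1f}, efficiency {e:.2f}")
```

Efficiency stays above 0.9 up to 4 cores and drops to about 0.3 at 48 cores.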

Related Work

• Graph processing systems

- Pregel [SIGMOD 10], GraphLab/GraphChi [OSDI 12], DeltaGraph [ICDE 13]

• Graph Partitioning

- METIS [SC 95], GPS [SSDBM 13], CatchW [ICDE 13]

• Data Replication

- C-Store [VLDB 05]

Future Work

• full implementation of snapshot distribution/replication techniques

• experiments using various data/queries/environments

• fine-grained splitting and migration of data

• scheduling multiple queries

Summary

• Distribution of Graph Snapshots

- balance correlated snapshots on many machines

- store each snapshot on a few machines

• Replication of Graph Snapshots

- optimize each replica for a different type of query

• Supported by NSF CAREER award IIS-1149372

• G* demonstrated at ICDE 2013

• G* available as open source at: http://www.cs.albany.edu/~gstar/

Thank You

http://www.cs.albany.edu/~gstar/
