improving graphchi for large graph...

33
Improving GraphChi for Large Graph Processing Fast Radix Sort in Pre-Processing Stuart Thiel Greg Butler Larry Thiel Department of Computer Science Software Engineering Concordia University July 11, 2016 / IDEAS 2016 Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 1 / 33

Upload: others

Post on 17-Mar-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Improving GraphChi for Large Graph ProcessingFast Radix Sort in Pre-Processing

Stuart Thiel Greg Butler Larry Thiel

Department of Computer Science Software EngineeringConcordia University

July 11, 2016 / IDEAS 2016

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 1 / 33

Page 2: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Who Am I

Stuart Thiel

PhD. Student

Part-time Faculty

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 2 / 33

Page 3: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Contributions

Fast Radix: plug-in replacement Radix Sort

Improved GraphChi performance

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 3 / 33

Page 4: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

In this talk

GraphChi: context of large graph processing

GraphChi: pre-processing examined

Fast Radix: our contribution, an improved LSD Radix Sort

Experimental Results: how Fast Radix improved GraphChi

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 4 / 33

Page 5: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Improving GraphChi

The Cloud / Parallel computing for Large GraphsPregelGiraphGraphLabGraphX

GraphChi: competitive performance, one PC

Improvements can make it more competitive

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 5 / 33

Page 6: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Large Graph Processing

What is large?

This context: 10s of billions of edges

Operations such as Page Rank

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 6 / 33

Page 7: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

A Focus on Pre-processing

GraphChi Pre-processing 20–80% of total processing time

Only needs to be done once per graph

Can use different graph processing operations after

Sorting happens in pre-processing

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 7 / 33

Page 8: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Big Data, Small Details

Many approaches

Some basics used frequentlyWe consider

General: graph processingSpecific: sorting

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 8 / 33

Page 9: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Saving Resources

If we sort frequentlyBetter sorting means saving

TimeMoney

∼ 20% server time spent sorting [1, 2].

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 9 / 33

Page 10: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Graph Chi

Single-machine graph processing

Edge-wise graph processing

Sorts to grid

Processes graph using “parallel sliding windows” approach [3]

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 10 / 33

Page 11: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Graph Chi preprocssing 1

“parallel sliding windows” needs organized data

Sorting during pre-processing creates a “grid”

Processing follows

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 11 / 33

Page 12: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Graph Chi preprocssing 2

Pre-processing involves:Converting graph input into edge formatFilling a shovel buffer with edgesSorting edges by destination into shovelsPerforming a k-way merge of shovels into a shard buffersorting edges in shard buffer into shards

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 12 / 33

Page 13: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Graph Chi Example

1 2

3

4

5

6

Figure : An example Graph

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 13 / 33

Page 14: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Graph Chi Process

graph.txt

1 2

1 3

4 6

2 4

3 6

3 4

3 5

2 3

4 5

1,2 1,3 4,6 2,4 3,6 3,4 3,5 2,3 4,5

1,2

1,3

2,4

4,6

3,6

2,3

3,4

3,5

4,5 1,2 1,3 2,3 2,4 3,4 3,5 4,5 4,6 3,6

3,6

4,6

2,4

3,4

3,5

4,5

1,2

1,3

2,3

conver

t to

edge

s

shovelbuffer

shovel 1 shovel 2

shard 1 shard 2 shard 3

shovels

shards

shardsink

sort on dst sort on dst

k-way merge

sort on src

sort on src

sort

onsr

c

Figure : Creating Shards for our example graph

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 14 / 33

Page 15: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

GraphChi Sorting

GraphChi uses PBBS Radix Sort

Problem Based Benchmark Suite (PBBS)

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 15 / 33

Page 16: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Least Significant Digit Radix Sort

Scan right-most digit

Deal into buckets

Consider next digit and repeat

Like sorting cards by value, then suit

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 16 / 33

Page 17: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

PBBS Radix Sort

Algorithm 1 PBBS Radix Sort

Input: input, val()/* Initialization */

1: buf ← initArray[input.length]2: counts← initArray[bucketCount]3: bIndexT ← initArray[input.length]Output: sorted input

/* deal based on current radix and counts */4: for j = 0 to passes−1 do5: zero(counts)

/* build histogram, store value */6: for i = 0 while i < input.length do7: bIndexT [i]← bitsFor(val(input[i]), j)8: counts[bIndexT [i]]++9: end for

/* convert counts to indices */10: for i = 0 to N−1 do11: convertToIndices(counts)12: end for

/* deal to buf */13: for i = 0 while i < input.length do14: buf [counts[bIndexT [i]]] = i15: end for16: swap(input,buf )17: end for

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 17 / 33

Page 18: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Fast Radix

We introduce Fast RadixA better Radix Sort

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 18 / 33

Page 19: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Fast Radix

Estimate bucket sizeSkips initial counting passManages overflow

Given three or four passes:removing a complete read through data is very noticeable

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 19 / 33

Page 20: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Fast Radix Example

Estimate bucket sizes

Deal into buckets based on LSD

Deal from buckets and overflow based on next digit

Repeat untill all significant digits dealt

Everything is sorted

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 20 / 33

Page 21: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Fast Radix Example: Estimation

ind input

0 A1

1 D7

2 5D

3 89

4 B6

5 18

6 A3

7 39

8 98

9 F3

A 52

B F9

C F6

D 67

E 9A

F 05

buf input

39

A1 98

52 F3

A3 F9

F6

05 67

B6 A3

D7 39

18 98

89 F3

9A 52

F9

F6

5D 67

9A

05

Figure : Dealing and counting pass. The ind is the index. Dealing is frominput to buf, shown by arrows. The second input shows overflow dealt back tothe beginning of that buffer.

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 21 / 33

Page 22: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Fast Radix Example: Overflow Management

buff

A1

52

A3

05

B6

D7

18

89

9A

5D

input

05

18

39

52

5D

67

89

98

9A

A1

A3

B6

D7

F3

F6

F9

over

F3

F6

67

98

39

F9

1

2

3

5

6

7

8

9

A

D

3

6

7

8

9

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

Figure : Dealing pass. The buckets’ low-order hex digits are shown withbraces on B and Over. The interleaving is shown in the dealing, representedby numbered arrows.

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 22 / 33

Page 23: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Data Sets

Social media/Web graphs

Between 1M–150M verticesBetween 5M–5.5B edges

google [4]: web graphlive-journal [5]: social media graphhollywood [6, 7]: related actorstwitter-2010 [8]: social media graphfriendster [9]: social media graphuk-union [10]: web graph

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 23 / 33

Page 24: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Setup

4 Intel Xeon E5-2697 2.6GHz processors x14 core

768GB RAM

6x2 TB HDD

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 24 / 33

Page 25: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Results: Sorting

Graph Name Vertices Edges GraphChi:Sort

Fast Radix:Sort

∆ Sort overflow

google 900K 5M 0.15s 0.077s 99% 0.65%live-journal 4.8M 69M 5.4s 5.4s 0% 0.72%hollywood 1.1M 113M 7.2s 4.9s 47% 0.50%

twitter-2010 42M 1.5B 128s 106s 21% 1.54%friendster 66M 1.8B 165s 133s 24% 0.12%uk-union 134M 5.5B 469s 329s 43% 0.49%

Graph Name: Name of the GraphVertices: Number of Vertices in GraphEdges : Number of Edges in GraphGraphChi: Sort : Total sorting time for GraphChi using PBBS radix sortFast Radix: Sort : Total sorting time for GraphChi using Fast Radix∆ Time : percentage improvement GraphChi:Sort

FastRadix:Sortoverflow : overflow percentage during Fast Radix

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 25 / 33

Page 26: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Results: Pre-processing

Graph Name Vertices Edges GraphChi:Pre-processing

Fast Radix:Pre-processing

∆ Time

google 900K 5M 1.7s 1.5s 14%live-journal 4.8M 69M 32s 32s 0%hollywood 1.1M 113M 45s 38s 18%

twitter-2010 42M 1.5B 759s 693s 9.4%friendster 66M 1.8B 1022s 921s 11%uk-union 134M 5.5B 2549s 2135s 19%

Graph Name: Name of the GraphVertices: Number of Vertices in GraphEdges : Number of Edges in GraphGraphChi: Pre-processing : Total pre-processing for GraphChi using PBBS radix sortFast Radix: Pre-processing : Total pre-processing for GraphChi using Fast Radix

∆ Time : percentage improvement GraphChi:Pre−processingFastRadix:Pre−processing

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 26 / 33

Page 27: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Unexpected Results

Graph Name Vertices Edges GraphChi:Sort

Fast Radix:Sort

∆ Sort overflow

google 900K 5M 0.15s 0.077s 99% 0.65%live-journal 4.8M 69M 5.4s 5.4s 0% 0.72%hollywood 1.1M 113M 7.2s 4.9s 47% 0.50%

twitter-2010 42M 1.5B 128s 106s 21% 1.54%friendster 66M 1.8B 165s 133s 24% 0.12%uk-union 134M 5.5B 469s 329s 43% 0.49%

Expected timing difference with fewer than 224 nodes28 bit digits used in a dealing passless than 224−1 = 28+8+8−1 three dealing passesmore than 224 means four dealing passes

livejournal fared worse than expected

uk-union did better than expected

in both cases, we suspect cache-issues, but unconfirmed

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 27 / 33

Page 28: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Conclusion

In general Fast Radix was much better

It was no worse than PBBS in the worst case

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 28 / 33

Page 29: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Questions?

http://users.encs.concordia.ca/~sthiel/IDEAS2016/

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 29 / 33

Page 30: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Bibliography I

S. Chen and J. Reif, “Using difficulty of prediction to decreasecomputation: fast sort, priority queue and convex hull on entropybounded inputs,” in Proceedings, 34th Annual Symposium onFoundations of Computer Science, 1993., pp. 104–112, 1993.

B. Wagar, “System for MSD radix sort bin storage management,”1995.US Patent 5,440,734.

A. Kyrola, G. Blelloch, and C. Guestrin, “Graphchi: Large-scalegraph computation on just a pc,” in Presented as part of the 10thUSENIX Symposium on Operating Systems Design andImplementation (OSDI 12), pp. 31–46, 2012.

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 30 / 33

Page 31: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Bibliography II

J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,“Community Structure in Large Networks: Natural Cluster Sizes andthe Absence of Large Well-Defined Clusters,” CoRR,vol. abs/0810.1355, 2008.

L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan, “Groupformation in large social networks: membership, growth, andevolution,” in Proceedings of the 12th ACM SIGKDD InternationalConference on Knowledge discovery and data mining, pp. 44–54,ACM, 2006.

P. Boldi and S. Vigna, “The WebGraph framework I: Compressiontechniques,” in Proc. of the Thirteenth International World WideWeb Conference (WWW 2004), (Manhattan, USA), pp. 595–601,ACM Press, 2004.

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 31 / 33

Page 32: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Bibliography III

P. Boldi, M. Rosa, M. Santini, and S. Vigna, “Layered LabelPropagation: A Multiresolution Coordinate-Free Ordering forCompressing Social Networks,” in Proceedings of the 20thInternational Conference on World Wide Web (S. Srinivasan,K. Ramamritham, A. Kumar, M. P. Ravindra, E. Bertino, andR. Kumar, eds.), pp. 587–596, ACM Press, 2011.

H. Kwak, C. Lee, H. Park, and S. Moon, “What is Twitter, a socialnetwork or a news media?,” in WWW ’10: Proceedings of the 19thInternational Conference on World Wide Web, (New York, NY,USA), pp. 591–600, ACM, 2010.

J. Yang and J. Leskovec, “Defining and Evaluating NetworkCommunities based on Ground-truth,” CoRR, vol. abs/1205.6233,2012.

P. Boldi, M. Santini, and S. Vigna, “A Large Time-Aware Graph,”SIGIR Forum, vol. 42, no. 2, pp. 33–38, 2008.

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 32 / 33

Page 33: Improving GraphChi for Large Graph Processingusers.encs.concordia.ca/~sthiel/IDEAS2016/FastRadixIdeas2016Presentation.pdf · Contributions Fast Radix: plug-in replacement Radix Sort

Bibliography IV

Stuart Thiel, Greg Butler, Larry Thiel Improving GraphChi for Large Graph Processing 33 / 33