Hermes: Dynamic Partitioning for Distributed Social Network Graph Databases


Post on 23-Jul-2018






<ul><li><p>Hermes: Dynamic Partitioning for Distributed Social Network Graph Databases</p>
<p>Daniel Nicoara, University of Waterloo</p>
<p>Shahin Kamali, University of Waterloo</p>
<p>Khuzaima Daudjee, University of Waterloo</p>
<p>Lei Chen, HKUST</p>
<p>ABSTRACT</p>
<p>Social networks are large graphs that require multiple graph database servers to store and manage them. Each database server hosts a graph partition with the objectives of balancing server loads, reducing remote traversals (edge-cuts), and adapting the partitioning to changes in the structure of the graph in the face of changing workloads. To achieve these objectives, a dynamic repartitioning algorithm is required to modify an existing partitioning to maintain good quality partitions while not imposing a significant overhead to the system. In this paper, we introduce a lightweight repartitioner, which dynamically modifies a partitioning using a small amount of resources. In contrast to the existing repartitioning algorithms, our lightweight repartitioner is efficient, making it suitable for use in a real system. We integrated our lightweight repartitioner into Hermes, which we designed as an extension of the open source Neo4j graph database system, to support workloads over partitioned graph data distributed over multiple servers. Using real-world social network data, we show that Hermes leverages the lightweight repartitioner to maintain high quality partitions and provides a 2 to 3 times performance improvement over the de-facto standard random hash-based partitioning.</p>
<p>1. INTRODUCTION</p>
<p>Large scale graphs, in particular social networks, permeate our lives. 
The scale of these networks, often in millions of vertices or more, means that it is often infeasible to store, query and manage them on a single graph database server. Thus, there is a need to partition, or shard, the graph across multiple database servers, allowing the load and concurrent processing to be distributed over these servers to provide good performance and increase availability. Social networks exhibit a high degree of correlation for accesses of certain groups of records, for example through frictionless sharing [15]. Also, these networks have a heavy-tailed distribution for popularity of vertices.</p>
<p>© 2015, Copyright is with the authors. Published in Proc. 18th International Conference on Extending Database Technology (EDBT), March 23-27, 2015, Brussels, Belgium: ISBN 978-3-89318-067-7. Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0.</p>
<p>To achieve a good partitioning which improves the overall performance, the following objectives need to be met:</p>
<p>• The partitioning should be balanced. Each vertex of the graph has a weight that indicates the popularity of the vertex (e.g., in terms of the frequency of queries to that vertex). In social networks, a small number of users (e.g., celebrities, politicians) are extremely popular while a large number of users are much less popular. This discrepancy reveals the importance of achieving a balanced partitioning in which all partitions have almost equal aggregate weight, defined as the total weight of vertices in the partition.</p>
<p>• The partitioning should minimize the number of edge-cuts. An edge-cut is defined by an edge connecting vertices in two different partitions and involves queries that need to transition from a partition on one server to a partition on another server. This results in shifting local traversal to remote traversal, thereby incurring significant network latency. 
In social networks, it is critical to minimize edge-cuts since most operations are done on the node that represents a user and its immediate neighbors. Since these 1-hop traversal operations are so prevalent in these networks, minimizing edge-cuts is analogous to keeping communities intact. This leads to highly local queries similar to those in SPAR [27] and minimizes the network load, allowing for better scalability by reducing network IO.</p>
<p>• The partitioning should be incremental. Social networks are dynamic in the sense that users and their relations are always changing, e.g., a new user might be added, two users might get connected, or an ordinary user might become popular. Although the changes in the social graph can be much slower when compared to the read traffic [8], a good partitioning solution should dynamically adapt its partitioning to these changes. Considering the size of the graph, it is infeasible to create a partitioning from scratch; hence, a repartitioning solution, a repartitioner, is needed to improve on an existing partitioning. This usually involves migrating some vertices from one partition to another.</p>
<p>• The repartitioning algorithm should perform well in terms of time and memory requirements. To achieve this efficiency, it is desirable to perform repartitioning locally by accessing a small amount of information about the structure of the graph. From a practical point of view, this requirement is critical and prevents us from applying existing approaches, e.g., [18, 30, 31, 6] for the repartitioning problem.</p>
<p>DOI: 10.5441/002/edbt.2015.04</p></li><li><p>The focus of this paper is on the design and provision of a practical partitioned social graph data management system that can support remote traversals while providing an effective method to dynamically repartition the graph using only local views. The distributed partitioning aims to co-locate vertices of the graph on-the-fly so as to satisfy the above requirements. 
The fundamental contribution of this paper is a dynamic partitioning algorithm, referred to as the lightweight repartitioner, that can identify which parts of graph data can benefit from co-location. The algorithm aims to incrementally improve an existing partitioning by decreasing edge-cuts while maintaining almost balanced partitions. The main advantage of the algorithm is that it relies on only a small amount of knowledge on the graph structure, referred to as auxiliary data. Since the auxiliary data is small and easy to update, our repartitioning algorithm is performant in terms of time and memory while maintaining high-quality partitionings in terms of edge-cut and load balance.</p>
<p>We built Hermes as an extension of the Neo4j<sup>1</sup> open source graph database system by incorporating into it our algorithm to provide the functionality to move data on-the-fly to achieve data locality and reduce the cost of remote traversals for graph data. Our experimental evaluation of Hermes using real-world social network graphs shows that our techniques are effective in producing performance gains and work almost as well as the popular Metis partitioning algorithms [18, 30, 6], which perform static offline partitioning by relying on a global view of the graph.</p>
<p>The rest of the paper is structured as follows. Section 2 describes the problem addressed in the paper and reviews classical approaches and their shortcomings. Section 3 introduces and analyzes the lightweight repartitioner. Section 4 presents an overview of the Hermes system. Section 5 presents performance evaluation of the system. Section 6 covers related work, and Section 7 concludes the paper.</p>
<p>2. PROBLEM DEFINITION</p>
<p>In this section we formally define the partitioning problem and review some of the related results. 
In what follows, the term graph refers to an undirected graph with weights on vertices.</p>
<p>2.1 Graph Partitioning</p>
<p>In the classical \((\alpha, \gamma)\)-graph partitioning problem [20], the goal is to partition a given graph into \(\alpha\) vertex-disjoint subgraphs. The weight of a partition is the total weight of vertices in that partition. In a valid solution, the weight of each partition is at most a factor \(\gamma \geq 1\) away from the average weight of partitions. More precisely, for a partition \(P\) of a graph \(G\), we need to have \(\omega(P) \leq \gamma \times \sum_{v \in V(G)} \omega(v) / \alpha\). Here, \(\omega(P)\) and \(\omega(v)\) denote the weight of a partition \(P\) and vertex \(v\), respectively. Parameter \(\gamma\) is called the imbalance load factor and defines how imbalanced the partitions are allowed to be. Practically, \(\gamma\) is in range [1, 2]. Here, \(\gamma = 1\) implies that partitions are required to be completely balanced (all have the same aggregate weights), while \(\gamma = 2\) allows the weight of one partition to be up to twice the average weight of all partitions. The goal of the minimization problem is to achieve a valid solution in which the number of edge-cuts is minimized.</p>
<p>The partitioning problem is NP-hard [13]. Moreover, there is no approximation algorithm with a constant approximation ratio unless P=NP [7]. Hence, it is not possible to introduce algorithms which provide worst-case guarantees on the quality of solutions, and it makes more sense to study the typical behavior of algorithms. Consequently, the problem is mostly approached through heuristics [20] [12] which aim to improve the average-case performance. Regardless, the time complexity of these heuristics is on the order of \(n^3\), which makes them unsuitable in practice.</p>
<p><sup>1</sup> Neo4j is being used by customers such as Adobe and HP [3].</p>
<p>To improve the time complexity, a class of multi-level algorithms was introduced. 
In each level of these algorithms, the input graph is coarsened to a representative graph of smaller size; when the representative graph is small enough, a partitioning algorithm like that of Kernighan-Lin [20] is applied to it, and the resulting partitions are mapped back (uncoarsened) to the original graph. Many algorithms fit in this general framework of multi-level algorithms; a widely used example is the family of Metis algorithms [19, 30, 6]. The multi-level algorithms are global in the sense that they need to know the whole structure of the graph in the coarsening phase, and the coarsened graph in each stage should be stored for the uncoarsening stage. This problem is partially solved by introducing distributed versions of these algorithms in which the partitioning algorithm is performed in parallel for each partition [4]. In these algorithms, in addition to the local information (structure of the partition), for each vertex, the list of the adjacent vertices in other partitions is required in the coarsening phase. The following theorem establishes that in the worst case, acquiring this amount of data is close to having global knowledge of the graph (the proof can be found in [25]).</p>
<p>Theorem 1. Consider the \((\alpha, \gamma)\)-graph partitioning problem where \(\gamma &lt; 2\). There are instances of the problem for which the number of edge-cuts in any valid solution is asymptotically equal to the number of edges in the input graph.</p>
<p>Hence, the average amount of data required in the coarsening phase of multi-level algorithms can be a constant fraction of all edges. The graphs used in the proof of the above theorem belong to the family of power-law graphs which are often used to model social networks. Consequently, even the distributed versions of multi-level algorithms in the worst case require almost global information on the structure of the graph (particularly when used for partitioning social networks). 
This reveals the importance of providing practical partitioning algorithms which need only a small amount of knowledge about the structure of the graph that can be easily maintained in memory. The lightweight repartitioner introduced in this paper has this property, i.e., it maintains only a small amount of data, referred to as auxiliary data, to perform repartitioning.</p>
<p>2.2 Repartitioning</p>
<p>A variety of partitioning methods can be used to create an initial, static, partitioning. This should be followed by a repartitioning strategy to maintain good partitioning that can adapt to changes in the graph. One solution is to periodically run an algorithm on the whole graph to get new partitions. However, running an algorithm to get new partitions from scratch is costly in terms of time and space. Hence, an incremental partitioning algorithm needs to adapt the existing partitions to changes in the graph structure.</p></li><li><p>It is desirable to have a lightweight repartitioner that maintains only a small amount of auxiliary data to perform repartitioning. Since such an algorithm refers only to this auxiliary data, which is significantly smaller than the actual data required for storing the graph, the repartitioning algorithm is not a system performance bottleneck. The auxiliary data maintained at each machine (partition) consists of the list of accumulated weight of vertices in each partition, as well as the number of neighbors of each hosted vertex in each partition. Note that maintaining the number of neighbors is far cheaper than maintaining the list of neighbors in other partitions. In what follows, the main ideas behind our lightweight repartitioner are introduced through an example.</p>
<p>Example: Consider the partitioning problem on the graph shown in Figure 1. 
Assume there are \(\alpha = 2\) partitions in the system and the imbalance factor is \(\gamma = 1.1\), i.e., in a valid solution, the aggregate weight of a partition is at most 1.1 times the average weight of partitions. Assume the numbers on vertices denote their weight. During normal operation in social networks, users will request different pieces of information. In this sense, the weight of a vertex is the number of read requests to that vertex. Figure 1a shows a partitioning of the graph into two partitions, where there is only one edge-cut and the partitions are well balanced, i.e., the weight of both partitions is equal to the average weight. Assuming user b is a popular weblogger who publishes a post, the request traffic for vertex b will increase as its neighbors poll for updates, leading to an imbalance in load on the first partition (see Figure 1b). Here, the ratio between the aggregate weight of partition 1 (i.e., 15) and the average weight of partitions (i.e., 13) is more than \(\gamma\), since 15/13 is roughly 1.15 and exceeds 1.1. This means that the response time and request rates increase by more than the acceptable skew limit, and the repartitioning needs to be triggered to rebalance the load across partitions (while keeping the number of edge-cuts as small as possible).</p>
<p>The auxiliary data of the lightweight repartitioner available to each partition includes the weight of each of the two partitions, as well as the number of neighbors of each vertex v hosted in the partition. Provided with this auxiliary data, a partition can determine whether load imbalances exist and the extent of the imbalance in the system (to compare it with \(\gamma\)). If there is a load imbalance, a repartitioner needs to indicate where to migrate data to restore load balance. 
Migration is an iterative process which will identify vertices that, when moved, will balance loads (aggregate weights) while keeping the number of edge-cuts as small as possible. For example, when the...</p>
<p>[Figure 1: Graph evolution and effects of repartitioning in response to imbalances. Panels: (a) Balanced partitioned graph, partition weights 11 and 11; (b) Skewed graph, partition weights 15 and 11; (c) Repartitioned graph, partition weights 13 and 13. Each panel shows vertices a to j split between Partition 1 and Partition 2.]</p></li></ul>
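
The validity condition of Section 2.1 and the auxiliary data of Section 2.2 can be illustrated with a minimal Python sketch. It is not the paper's repartitioner. The function names (`partition_weights`, `overloaded`, `neighbor_counts`, `edge_cuts`, `cut_change_if_moved`) are hypothetical, and so are the edge list and the assignment of vertices a to e to the first partition, because the drawing in Figure 1 is not recoverable from the text. Only the vertex weights, the partition totals and the value 1.1 for the imbalance factor come from the example.

```python
from collections import defaultdict

def partition_weights(weight, part, num_parts):
    """Aggregate vertex weight per partition."""
    totals = [0] * num_parts
    for v, w in weight.items():
        totals[part[v]] += w
    return totals

def overloaded(totals, gamma):
    """Partitions whose weight exceeds gamma times the average weight."""
    average = sum(totals) / len(totals)
    return [p for p, w in enumerate(totals) if w > gamma * average]

def neighbor_counts(edges, part):
    """counts[v][p] = number of neighbors of v hosted in partition p."""
    counts = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        counts[u][part[v]] += 1
        counts[v][part[u]] += 1
    return counts

def edge_cuts(edges, part):
    """Number of edges whose endpoints live in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])

def cut_change_if_moved(counts, part, v, target):
    """Change in edge-cuts if v alone moves to target (negative is better)."""
    return counts[v][part[v]] - counts[v][target]

# Vertex weights after b becomes popular (Figure 1b).
weight = dict(zip("abcdefghij", [2, 6, 3, 2, 2, 2, 3, 2, 2, 2]))
# Assumed assignment: a..e on partition 0, f..j on partition 1.
part = {v: (0 if v in "abcde" else 1) for v in "abcdefghij"}
# Hypothetical edges with a single cut edge (e, f).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "c"),
         ("e", "f"), ("f", "g"), ("g", "h"), ("h", "i"), ("i", "j")]

totals = partition_weights(weight, part, 2)
counts = neighbor_counts(edges, part)
print(totals, overloaded(totals, 1.1))            # [15, 11] [0]
print(cut_change_if_moved(counts, part, "e", 1))  # 0
```

In this made-up graph, moving vertex e to the second partition leaves the number of edge-cuts unchanged and brings both partitions to weight 13. The per-partition neighbor counts are enough to evaluate such a single-vertex move, which is why the sketch never needs the neighbor lists of other partitions.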

