GraphLab: how I understood it with sample code

Aapo Kyrola, Carnegie Mellon Univ. Oct 1, 2009


Page 1: GraphLab : how I understood it  with sample code

GraphLab: how I understood it with sample code

Aapo Kyrola, Carnegie Mellon Univ.

Oct 1, 2009

Page 2: GraphLab : how I understood it  with sample code

To test if I got your idea…

• … I created two imaginary GraphLab sample applications using an imaginary GraphLab Java API

• Is this what you imagined GraphLab applications would look like?

Page 3: GraphLab : how I understood it  with sample code

Technology layers

GraphLab
• GraphLab API
– Defined and maintained by us
• GraphLab Engine
– Reference implementation done by us
– Others encouraged to implement their own

OpenGL
• OpenGL API
– Maintained by Khronos group
• glVertex3f(a,b,c)
• glTransform(…)
• OpenGL graphics card drivers
– By Nvidia, ATI, …; interface with their hardware

Page 4: GraphLab : how I understood it  with sample code

Contents

1. GraphLab sample code for belief propagation-based inference
• ML practitioner’s (end-user’s) point of view
• What happens in the Engine?

2. Sample code for stochastic matrix eigenvector calculation by iteration
• Issue with syncs and aggregation functions

Page 5: GraphLab : how I understood it  with sample code

Note about BP

• Bishop’s text uses BP on a bipartite graph (variable + factor nodes), while Koller’s book uses Cluster Factor graphs
– I will use Koller’s representation because it is simpler

Page 6: GraphLab : how I understood it  with sample code

Sample program
• User has a huge Bayes network that models weather in the USA
• He knows it is 37F in Philadelphia, it rained yesterday in Pittsburgh, and it is October (evidence)
– What is the probability of rain in Pittsburgh today?
• See main() below (no GraphLab stuff here yet)

Page 7: GraphLab : how I understood it  with sample code

Initialization of BP
• Create a cluster factor graph with special nodes for Belief Propagation (they have storage for messages; edges contain the variables shared between factors)

This implicitly marks each node ‘dirty’, so the engine will add it to the task queue.

BayesNetwork and ClusterFactorGraph are classes defined by the GraphLab API and/or extend some more abstract Graph class.

Page 8: GraphLab : how I understood it  with sample code

Node function (kernel)
• To run the distributed BP algorithm, we need to define a function (kernel) that runs on each factor node whenever the factor is “dirty” (task queue is not visible?)

We send a message only if it has changed significantly. Sending a message flags the recipient as dirty -> it will be added to the task queue.

Note: an edge might be remote or local, depending on graph partitioning. The kernel may or may not care about this. (For example, the threshold could be higher for remote edges?)
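
The kernel code from the slide is not part of this transcript. As a stand-in, here is a minimal plain-Java sketch of just the "send only if the message changed significantly" rule. All names (Node, sendIfChanged, l1Distance) are hypothetical and not part of any GraphLab API, and computing the actual BP message is left out.

```java
import java.util.HashMap;
import java.util.Map;

class ThresholdedSend {
    // Hypothetical node type: caches the last message received from each neighbor.
    static class Node {
        final String name;
        boolean dirty = false;
        final Map<String, double[]> inbox = new HashMap<>();

        Node(String name) { this.name = name; }
    }

    // L1 distance between two messages of equal length.
    static double l1Distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    // Delivers newMessage only if it differs from the previously delivered
    // message by more than threshold. A delivered message flags the recipient
    // as dirty. Returns true if the message was sent.
    static boolean sendIfChanged(Node from, Node to, double[] newMessage, double threshold) {
        double[] previous = to.inbox.get(from.name);
        if (previous != null && l1Distance(previous, newMessage) <= threshold) {
            return false; // insignificant change: recipient stays as it is
        }
        to.inbox.put(from.name, newMessage.clone());
        to.dirty = true;
        return true;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        System.out.println(sendIfChanged(a, b, new double[] {0.5, 0.5}, 0.01));
        System.out.println(sendIfChanged(a, b, new double[] {0.501, 0.499}, 0.01));
    }
}
```

A higher threshold for remote edges, as suggested in the note, would just mean passing a different threshold value depending on the edge type.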

Page 9: GraphLab : how I understood it  with sample code

Executing the engine
• User can execute the algorithm on different GraphLab implementations
– Data cluster version, multicore version, GPU version, DNA computer version, Quantum computer etc.
• Advanced users can customize graph partitioning algorithm, scheduling priority, timeout etc.
– For example, loopy BP may not converge everywhere, but still be usable?? Need timeout or relaxed convergence criteria.

After lightning fast computation, we have a calibrated belief network. We can use this to efficiently query marginal distributions.

Page 10: GraphLab : how I understood it  with sample code

What does the Engine do?
1. Client sends the graph data and the functions to be run to the Computation Parent
   1. How is code delivered? Good question. In Java, it is easy to send class files.
2. Graph is partitioned into logical partitions (minimizing the number of links between partitions)
   1. Edges that cross partitions are made into remote edges
   2. Each CPU is assigned one or more logical partitions by the Computation Parent
3. In each logical partition, computation is done sequentially
   1. At the beginning of each iteration, partition collects the dirty nodes (-> taskqueue(T))
   2. … and calls the node function on each dirty node sequentially
   3. This results in a new set of dirty nodes (-> taskqueue(T+1))
      • via remote edges, nodes in other partitions are flagged dirty
4. Computation Parent monitors each logical partition for its number of dirty nodes
   1. When the dirty count is zero or under a defined limit, computation is finished.
5. Graph state at the end of computation is sent back to client.

The next example (eigenvector calculation) shows how we can calculate partition-level accumulative functions efficiently and deliver them to the central unit.

Note: in this model, nodes are not able to read from other nodes. Instead they can send data to other nodes, which can then cache this information.
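
Step 3 above can be sketched for a single logical partition in plain Java. This is only my assumption of how such a loop might look: the names (Node, NodeFunction, run, PROPAGATE_MAX) are hypothetical, there are no remote edges, and the node function is a toy "propagate the largest value" rule standing in for BP. Nodes never read from each other; a sender writes into the recipient's cache and flags it dirty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PartitionLoopSketch {
    // Hypothetical node: its own value plus a cache of the largest value sent to it.
    static class Node {
        int value;
        int cachedMax;
        final List<Node> out = new ArrayList<>();

        Node(int value) {
            this.value = value;
            this.cachedMax = value;
        }
    }

    interface NodeFunction {
        // Runs on one dirty node; every node it flags dirty goes into nextQueue.
        void apply(Node node, Set<Node> nextQueue);
    }

    // Runs iterations until no node is dirty or maxIterations is reached
    // (the timeout). Returns the number of iterations executed.
    static int run(Set<Node> initiallyDirty, NodeFunction f, int maxIterations) {
        Set<Node> queue = new LinkedHashSet<>(initiallyDirty); // taskqueue(T)
        int iterations = 0;
        while (!queue.isEmpty() && iterations < maxIterations) {
            Set<Node> next = new LinkedHashSet<>();            // taskqueue(T+1)
            for (Node node : queue) {
                f.apply(node, next);                           // sequential within the partition
            }
            queue = next;
            iterations++;
        }
        return iterations;
    }

    // Toy node function: adopt the largest value heard of, then forward it.
    static final NodeFunction PROPAGATE_MAX = (node, next) -> {
        if (node.cachedMax > node.value) {
            node.value = node.cachedMax;
        }
        for (Node neighbor : node.out) {
            if (node.value > neighbor.cachedMax) {
                neighbor.cachedMax = node.value; // "send": recipient caches the data
                next.add(neighbor);              // ... and is flagged dirty
            }
        }
    };

    public static void main(String[] args) {
        Node a = new Node(1);
        Node b = new Node(5);
        Node c = new Node(2);
        a.out.add(b);
        b.out.add(a);
        b.out.add(c);
        c.out.add(b);
        Set<Node> dirty = new LinkedHashSet<>(Arrays.asList(a, b, c));
        run(dirty, PROPAGATE_MAX, 100);
        System.out.println(a.value + " " + b.value + " " + c.value);
    }
}
```

In a distributed version, the Computation Parent would watch the size of each partition's queue instead of the loop checking it locally.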

Page 11: GraphLab : how I understood it  with sample code

A posteriori

Page 12: GraphLab : how I understood it  with sample code

Stochastic Matrix Eigenvector
• Task: to iterate x = Mx, where x is a probability distribution over (X1..Xn) and M is a stochastic matrix (Markov transition matrix), until we reach convergence (“fixed point”)
– Existence of eigenvector (limit distribution) is guaranteed in all but pathological cases? (= periodic chains?)
• Running iteration in parallel is not stable because of “feedback loops”
– In serial computation, |Mx| = 1 (norm is L1 norm, right?)
– Normalization factor is needed to keep computation under control
– But calculation of |Mx| needs input from all Xi synchronously
• Sync is costly, so we want to do this infrequently
– How well is the effect studied? Are there some runaway problems?
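
The serial-case claim |Mx| = 1 can be checked directly. I assume here that M is column-stochastic (non-negative entries, each column sums to 1) and that x is a probability vector; the slides do not say which convention is meant. Because all terms are non-negative, the absolute values in the L1 norm can be dropped:

```latex
\|Mx\|_1
  = \sum_{i=1}^{n} \sum_{j=1}^{n} M_{ij}\, x_j
  = \sum_{j=1}^{n} x_j \sum_{i=1}^{n} M_{ij}
  = \sum_{j=1}^{n} x_j
  = 1
```

The argument needs every component of Mx to be computed from the same x. When nodes update from cached, possibly stale values, that assumption no longer holds, so the sum is not guaranteed to stay at 1.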

Two players in Markov’s chain talking

Page 13: GraphLab : how I understood it  with sample code

Normalization
1. Each logical partition has its own SumAccumulator
   1. This is passed to each node when the node function is computed. The node subtracts its previous value and adds its new one (=> we need not enumerate all nodes to get an updated sum)
2. After an iteration, the partition sends its accumulator value to the Computation Parent, which has its own SumAccumulator
   • Amount of remote accumulator communication = N (number of partitions)
3. Before each iteration, the partition queries the parent for the current value of the normalization factor. This is passed to all nodes when the node function is computed.
   • If the normalization factor changes significantly, all nodes are renormalized.

But does it work? Good question!
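
To make steps 1 and 2 concrete, here is a small plain-Java sketch of the incremental accumulator idea. Only the name SumAccumulator comes from the slides; its methods, the Partition class and the report step are my assumptions. It does not answer whether the scheme converges, it only shows the bookkeeping.

```java
class SumAccumulatorSketch {
    // Running sum that is updated incrementally instead of re-enumerating all nodes.
    static class SumAccumulator {
        private double sum = 0.0;

        // Replace one contribution: subtract the old value, add the new one.
        void update(double oldValue, double newValue) {
            sum += newValue - oldValue;
        }

        double value() {
            return sum;
        }
    }

    // Hypothetical logical partition holding node values and a local accumulator.
    static class Partition {
        final double[] values;
        final SumAccumulator local = new SumAccumulator();
        private double lastReported = 0.0;

        Partition(double[] initial) {
            values = initial.clone();
            for (double v : values) {
                local.update(0.0, v);
            }
        }

        // Called from the node function when a node's value changes.
        void setNodeValue(int node, double newValue) {
            local.update(values[node], newValue);
            values[node] = newValue;
        }

        // After an iteration: one message to the parent, whatever the node count.
        void report(SumAccumulator parent) {
            parent.update(lastReported, local.value());
            lastReported = local.value();
        }
    }

    public static void main(String[] args) {
        SumAccumulator parent = new SumAccumulator();
        Partition p1 = new Partition(new double[] {0.25, 0.25});
        Partition p2 = new Partition(new double[] {0.5});
        p1.report(parent);
        p2.report(parent);
        p1.setNodeValue(0, 0.5);
        p1.report(parent);
        System.out.println(parent.value());
    }
}
```

The parent's value is what a partition would query before the next iteration and hand to its nodes as the normalization factor.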

Page 14: GraphLab : how I understood it  with sample code

Initialization

Page 15: GraphLab : how I understood it  with sample code

Node Function

Invokes update on outbound nodes only if its value changed significantly. As the computation converges, there are fewer and fewer dirty nodes.
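
The node function itself is missing from this transcript, so the following is only a guess at its shape in plain Java. Names (Node, update, run) are hypothetical, M is assumed column-stochastic, and everything runs in one partition. For simplicity there is no running normalization factor: the vector is normalized once at the end, because the un-normalized sum need not stay at 1.

```java
import java.util.ArrayList;
import java.util.List;

class EigenNodeSketch {
    // Hypothetical node: owns x_i and caches the last value sent to it by each node j.
    static class Node {
        final int index;
        final double[] weights;  // row i of M: weights[j] = M_ij
        final double[] cached;   // cached[j] = last x_j this node was sent
        final List<Node> outbound = new ArrayList<>();
        double value;
        boolean dirty = true;

        Node(int index, double[] weights, double[] initialX) {
            this.index = index;
            this.weights = weights.clone();
            this.cached = initialX.clone();
            this.value = initialX[index];
        }
    }

    // Node function: recompute x_i from cached inputs. Outbound nodes are
    // updated (and flagged dirty) only if the value moved by more than epsilon.
    static void update(Node n, double epsilon) {
        double newValue = 0.0;
        for (int j = 0; j < n.weights.length; j++) {
            newValue += n.weights[j] * n.cached[j];
        }
        n.dirty = false;
        if (Math.abs(newValue - n.value) <= epsilon) {
            return; // insignificant change: keep the old value, notify nobody
        }
        n.value = newValue;
        for (Node out : n.outbound) {
            out.cached[n.index] = newValue;
            out.dirty = true; // may re-flag n itself via a self-loop
        }
    }

    static double[] run(double[][] m, double[] x0, double epsilon, int maxSweeps) {
        int n = x0.length;
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            nodes.add(new Node(i, m[i], x0));
        }
        // Node j feeds node i whenever M_ij != 0 (this includes self-loops).
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (m[i][j] != 0.0) {
                    nodes.get(j).outbound.add(nodes.get(i));
                }
            }
        }
        for (int sweep = 0; sweep < maxSweeps; sweep++) {
            boolean anyDirty = false;
            for (Node node : nodes) {
                if (node.dirty) {
                    anyDirty = true;
                    update(node, epsilon);
                }
            }
            if (!anyDirty) {
                break;
            }
        }
        double sum = 0.0;
        for (Node node : nodes) {
            sum += node.value;
        }
        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            result[i] = nodes.get(i).value / sum; // normalize once at the end
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] m = {{0.9, 0.5}, {0.1, 0.5}};
        double[] x = run(m, new double[] {0.5, 0.5}, 1e-12, 1000);
        System.out.println(x[0] + " " + x[1]);
    }
}
```

For this two-state chain the stationary distribution is (5/6, 1/6), which the normalized result approaches as epsilon shrinks.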

Page 16: GraphLab : how I understood it  with sample code

Partition Interceptor

The interceptor idea is copied from certain web application frameworks

Page 17: GraphLab : how I understood it  with sample code

Computation Parent code

Page 18: GraphLab : how I understood it  with sample code

Putting it together…