Large Graph Analysis Using the GMine System

Large Graph Analysis in the GMine System, by Saurabh Jogalekar (TE C 51). Seminar Guide: Prof. S V Jagtap

Upload: saujog

Post on 26-Jan-2015


DESCRIPTION

A small effort to illustrate the GMine paper from Carnegie Mellon University (CMU)

TRANSCRIPT



Large Graph

A large graph is a graph with hundreds of thousands of nodes and a million or more edges.

Social networks are the most familiar example: friend lists, recommendations, likes and comments all form large graphs.

Other examples include web graphs (web pages pointing to each other through hyperlinks), bipartite graphs, and computer communication graphs, in which IP addresses send packets to other IP addresses.


Representing Graphs

The three techniques traditionally used for graph representation are

• 1. Adjacency matrix

• 2. Adjacency list

• 3. Binary Decision Diagrams
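As a small illustration of the first two representations, the sketch below builds both from the same edge list. The four-node graph and the variable names are made up for this example and do not come from the GMine paper.

```python
# Illustrative toy graph; node labels and edges are assumptions for this example.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency matrix: an n x n grid where cell [u][v] is 1 when an edge joins u and v.
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = 1
    matrix[v][u] = 1  # undirected graph, so the matrix is symmetric

# Adjacency list: each node maps to the list of its neighbours.
adj = {u: [] for u in range(n)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(matrix[2])  # row of node 2
print(adj[2])     # neighbours of node 2
```

The matrix needs memory proportional to the square of the node count, while the list grows with the number of nodes plus edges, which is one reason the matrix form becomes impractical for graphs with hundreds of thousands of nodes.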


Representing Large Graphs

• Representing large graphs is challenging because the huge number of nodes and edges reduces the overall visibility of the graph.

• As a result, the traditional representation methods fail

Example of a large graph


Large Graph Representation

• Another problem with representing large graphs is that acquiring or mining the required nodes and edges takes several complex calculations

• To overcome these hindrances, a graph summarization method called CEPS (CEntre-Piece Subgraph) is used


GRAPH-TREE

• CEPS operates on the Graph-Tree, a hierarchical representation of the graph built from a SuperGraph, SuperNodes and SuperEdges

• The graph-tree is formed as shown in the figure


FILLING A GRAPH-TREE

Algorithm FillGraphTree(ptr)

• If ptr is a leaf, set ptr->filepath to the file of the corresponding subgraph

• Else, for each child of ptr, call FillGraphTree(child), and then:

• Instantiate a SuperEdge for each pair of children, find matches between the unresolved edges of each pair, and store them in the SuperEdges

• Use the external edges to determine ptr’s open nodes

• Propagate the unresolved external edges to the parent
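The recursion above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the GMine implementation: the `TreeNode` class, its field names, the `.graph` file naming, and the choice to treat an edge as "unresolved" when exactly one endpoint lies inside a node's coverage are all hypothetical. The sketch also assumes sibling coverages are disjoint, and it computes open nodes for leaves as well as inner nodes for convenience.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Optional

@dataclass
class TreeNode:  # hypothetical structure for a Graph-Tree node
    name: str
    children: list = field(default_factory=list)
    nodes: set = field(default_factory=set)        # leaf only: graph nodes in its subgraph
    filepath: Optional[str] = None                 # leaf only: where the subgraph is stored
    coverage: set = field(default_factory=set)     # all graph nodes below this tree node
    super_edges: dict = field(default_factory=dict)
    open_nodes: set = field(default_factory=set)

def crosses_boundary(cov, edge):
    """True when exactly one endpoint of edge lies inside cov."""
    return (edge[0] in cov) != (edge[1] in cov)

def fill_graph_tree(ptr, all_edges):
    """Fill ptr and its descendants; return the edges still unresolved at ptr."""
    if not ptr.children:
        ptr.filepath = f"{ptr.name}.graph"  # hypothetical naming scheme
        ptr.coverage = set(ptr.nodes)
        candidates = set(all_edges)
    else:
        pending = {}
        for child in ptr.children:
            pending[child.name] = fill_graph_tree(child, all_edges)
            ptr.coverage |= child.coverage
        for a, b in combinations(ptr.children, 2):
            # An edge unresolved in both children joins the two of them.
            ptr.super_edges[(a.name, b.name)] = pending[a.name] & pending[b.name]
        candidates = set().union(*pending.values())
    unresolved = {e for e in candidates if crosses_boundary(ptr.coverage, e)}
    ptr.open_nodes = {x for e in unresolved for x in e if x in ptr.coverage}
    return unresolved

# Toy hierarchy: root -> (L1, M), M -> (L2, L3)
l1 = TreeNode("L1", nodes={1, 2})
l2 = TreeNode("L2", nodes={3, 4})
l3 = TreeNode("L3", nodes={5})
m = TreeNode("M", children=[l2, l3])
root = TreeNode("root", children=[l1, m])
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]
leftover = fill_graph_tree(root, edges)
print(root.super_edges, m.open_nodes, leftover)
```

In this toy run the edge between nodes 4 and 5 is resolved at M, the two edges that join L1 to M are resolved at the root, and nothing is left to propagate above the root.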


SuperNodes and GraphNodes connectivity

• SuperNode connectivity between two SuperNodes is the set of edges whose source belongs to the coverage of the first SuperNode and whose target belongs to the coverage of the second SuperNode

• GraphNode connectivity is the set of edges connecting a graph node to other graph nodes that are not part of the coverage of the SuperNode that contains it

• Both kinds of connectivity are useful in reconstructing the graph from its hierarchical representation
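Both definitions reduce to simple filters over an edge list. The sketch below is only an illustration: the function names, the representation of a coverage as a Python set, and the toy edges are assumptions, and edges are read as (source, target) pairs.

```python
def supernode_connectivity(edges, cov_a, cov_b):
    """Edges whose source lies in cov_a and whose target lies in cov_b."""
    return {(u, v) for u, v in edges if u in cov_a and v in cov_b}

def graphnode_connectivity(edges, node, coverage):
    """Edges joining `node` to graph nodes outside the coverage that contains it."""
    return {(u, v) for u, v in edges
            if (u == node and v not in coverage)
            or (v == node and u not in coverage)}

# Toy data: two SuperNodes covering {1, 2} and {3, 4, 5}.
toy_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]
cov_a, cov_b = {1, 2}, {3, 4, 5}

between = supernode_connectivity(toy_edges, cov_a, cov_b)
outward = graphnode_connectivity(toy_edges, 2, cov_a)
print(between, outward)
```

For an undirected graph stored with one tuple per edge, the first function would need to check both orientations of each pair.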


Motivation behind CEPS

• Using a Graph-tree and hierarchical representation of a SuperGraph lessens the problem of inspecting large graphs

• However, the sub-graph reached this way sometimes carries much more information than is required. CEPS is used to close this gap


CEPS

• A centre-piece subgraph contains the collection of paths connecting a subset of graph nodes of interest

• CEPS helps interaction by significantly reducing the number of nodes and edges to be inspected

• CEPS uses the Random Walk with Restart method to find the ‘importance’ score between two nodes


GOODNESS SCORE

• The goodness score is calculated using the Random Walk with Restart method. A matrix A(i, j) is defined that stores the steady-state probability of each node ‘j’ with respect to the query ‘i’.

[Figure: example graph with nodes 1 to 13, each annotated with a steady-state score]

          Q1      Q2      Q3
Node 1    0.5767  0.0088  0.0088
Node 2    0.1235  0.0076  0.0076
Node 3    0.0283  0.0283  0.0283
Node 4    0.0076  0.1235  0.0076
Node 5    0.0088  0.5767  0.0088
Node 6    0.0076  0.0076  0.1235
Node 7    0.0088  0.0088  0.5767
Node 8    0.0333  0.0024  0.1260
Node 9    0.1260  0.0024  0.0333
Node 10   0.1260  0.0333  0.0024
Node 11   0.0333  0.1260  0.0024
Node 12   0.0024  0.1260  0.0333
Node 13   0.0024  0.0333  0.1260

Individual Score Matrix
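A score matrix of this kind can be obtained by iterating a random walk with restart until it settles. The sketch below is a generic version under stated assumptions: the function name `rwr_scores`, the continuation probability `c = 0.5`, the fixed iteration count and the four-node path graph are all chosen for illustration, and the code is not meant to reproduce the values in the matrix above, whose graph and normalisation are not given in the slides. The result is laid out with one row per node and one column per query.

```python
import numpy as np

def rwr_scores(adj, queries, c=0.5, iters=200):
    """Steady-state random-walk-with-restart scores, one column per query node.

    Assumes every node has at least one edge, so no column of adj sums to zero.
    """
    adj = np.asarray(adj, dtype=float)
    # Column-normalise so each column holds the transition probabilities out of a node.
    W = adj / adj.sum(axis=0, keepdims=True)
    n = adj.shape[0]
    # E[:, k] is the restart vector of query k: all mass on the query node.
    E = np.zeros((n, len(queries)))
    E[queries, np.arange(len(queries))] = 1.0
    R = E.copy()
    for _ in range(iters):
        # With probability c follow an edge, otherwise jump back to the query node.
        R = c * (W @ R) + (1 - c) * E
    return R

# Toy path graph 0 - 1 - 2 - 3 with the two end nodes as queries.
path_adj = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
R = rwr_scores(path_adj, queries=[0, 3])
print(np.round(R, 4))
```

Each column sums to one, and a node's score drops the further it lies from the query node, which is the behaviour the importance score relies on.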


EXTRACT ALGORITHM

• The “EXTRACT” algorithm takes as input the weighted graph W, the importance scores on all nodes, and the budget b, and produces as output a small, unweighted, undirected graph H.

• It is performed using dynamic programming or a greedy method

• 1. Initialize the output graph H to be empty

• 2. Let len be the maximum allowable path length

• 3. While H is not big enough

• 3.1. Pick up destination node pd

• 3.2. For each active source node qi with respect to node pd

• 3.2.1. discover a key path P(qi, pd)

• 3.2.2. add P(qi, pd) to H

• 4. Output the final H
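The loop structure of these steps can be sketched as below. This is a simplified greedy stand-in rather than the published EXTRACT procedure: a breadth-first shortest path replaces the key-path discovery, every query node is treated as an active source, and the function names, the `scores` dictionary and the toy graph are assumptions made for illustration. Because a whole path is added at once, the result can overshoot the budget.

```python
from collections import deque

def shortest_path(adj, src, dst, max_len):
    """Breadth-first path from src to dst using at most max_len edges, or None."""
    parent, depth = {src: None}, {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        if depth[u] == max_len:
            continue  # do not expand beyond the allowed path length
        for v in adj[u]:
            if v not in parent:
                parent[v], depth[v] = u, depth[u] + 1
                queue.append(v)
    return None

def extract(adj, scores, sources, budget, max_len):
    """Grow H from the source nodes until it holds at least `budget` nodes."""
    h_nodes, h_edges = set(sources), set()
    # Candidate destination nodes, highest combined score first.
    candidates = sorted((n for n in adj if n not in h_nodes),
                        key=lambda n: scores[n], reverse=True)
    for pd in candidates:
        if len(h_nodes) >= budget:
            break
        if pd in h_nodes:
            continue  # already pulled in by an earlier path
        for q in sources:
            path = shortest_path(adj, q, pd, max_len)
            if path is not None:
                h_nodes.update(path)
                h_edges.update(frozenset(e) for e in zip(path, path[1:]))
    return h_nodes, h_edges

# Toy graph: three query nodes (1, 5, 7) that meet at node 3.
toy_adj = {1: [2], 2: [1, 3], 3: [2, 4, 6], 4: [3, 5], 5: [4], 6: [3, 7], 7: [6]}
toy_scores = {1: 0.1, 2: 0.5, 3: 0.9, 4: 0.5, 5: 0.1, 6: 0.5, 7: 0.1}
h_nodes, h_edges = extract(toy_adj, toy_scores, sources=[1, 5, 7], budget=7, max_len=4)
print(sorted(h_nodes), len(h_edges))
```

Edges are stored as `frozenset` pairs so that the output graph is undirected and unweighted, matching the description of H.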


GMINE SYSTEM

• GMine is a graph visualisation tool, used for handling large graphs.

• The tool makes use of Graph-Trees to offer good and readable graph exploration

• As the user interacts with the visualization, the system keeps track of the connectivity among communities of nodes at different levels of the partitioned graph.

• When the user changes the focus position on the tree structure, the system works on demand to calculate and present contextual information.


GMINE VISUALIZATION


REFERENCES

• Jose F. Rodrigues Jr., Hanghang Tong, Jia-Yu Pan, Agma J.M. Traina, Caetano Traina Jr. and Christos Faloutsos, “Large Graph Analysis in the GMine System”, IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 1, January 2013

• Christos Faloutsos, Jose F. Rodrigues Jr., Hanghang Tong, Agma J.M. Traina, “GMine: A system for scalable, interactive, graph visualization and mining”, in IEEE/ACM International Conference, pages 1195–1198, Oconomowoc, Wisconsin, USA

• Hanghang Tong, Christos Faloutsos, “Center Piece Subgraphs: Problem definition and fast solutions”, Carnegie Mellon University, Research Track Paper, pages 404-414

• www.cmu.edu (Carnegie Mellon University site)

• Jose F. Rodrigues Jr., Agma J.M. Traina, Caetano Traina Jr., Caio, Cesar Moreli, “GMine: Interactive browsing of large graphs”, Workshop On Information Visualization and Analysis In Social Networks – WIVA 2008
