+
Sparsification and Sampling of Networks for Collective Classification
Tanwistha Saha, Huzefa Rangwala and Carlotta Domeniconi
Department of Computer Science, George Mason University
Fairfax, VA, USA
+Outline
Introduction
Motivation
Related Work
Proposed Methods
Results
Conclusion and Future Work
+Sparsification and Sampling of Networks for Collective Classification
Given:
Partially labeled weighted network
Node attributes for all the nodes
Goal:
Predict the labels of unlabeled nodes in network
Points to consider:
Networks with fewer edges can be formed using sparsification algorithms
The selection of labeled nodes for training influences the overall accuracy, which motivates research on sampling algorithms for collective classification
+Sample Input Network (partially labeled)
+Relational Network Sparsification
Study of networks involves Relational Learning
Relational network consists of nodes representing entities and edges representing pairwise interactions
Edges can be weighted / unweighted
Weights represent the similarity between a pair of nodes
Edges with low weights don’t carry much information – we can remove them based on some criteria!
Sparsify the network without losing much information
+Example: Network with noisy edges
+Example: Noise edges removed!
+Importance of Sparsification in Network
Problems:
Data analysis is time consuming
Noisy edges do not convey useful information in relational data
Solutions:
Identify and remove the noisy edges
Make sure to remove noisy edges only, and not the others!
Classify the unlabeled nodes in sparsified network using Collective Classification and compare results with unsparsified network
+Graph sparsification methods for clustering
(GS) Global Graph Sparsification (Satuluri et al. SIGMOD 2011)
(LS) Local Graph Sparsification (Satuluri et al. SIGMOD 2011)
Drawbacks:
Methods designed for fast clustering, not suitable for classification
All edges treated equally
Sparsified network becomes more disconnected
+Global Graph Sparsification
(Satuluri et al. SIGMOD 2011)
[Figure labels: singleton nodes; disconnected component]
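The global strategy can be sketched as ranking all edges by similarity and keeping only the top fraction. This is a minimal illustration, not the implementation from Satuluri et al.: the function name `global_sparsify`, the `(u, v, score)` edge format, and the rounding rule are assumptions made for the example.

```python
def global_sparsify(edges, s):
    """Keep the top fraction s of edges, ranked by similarity score.

    edges: list of (u, v, score) tuples (hypothetical format).
    s: sparsification ratio in (0, 1], i.e. the fraction of edges to keep.
    """
    keep = max(1, round(s * len(edges)))  # rounding rule is an assumption
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)
    return ranked[:keep]


edges = [("a", "b", 0.9), ("b", "c", 0.8), ("a", "c", 0.7),
         ("c", "d", 0.2), ("d", "e", 0.1)]
kept = global_sparsify(edges, 0.6)

# Nodes that lost all of their edges become singletons.
remaining_nodes = {n for u, v, _ in kept for n in (u, v)}
```

In this toy graph the two weakest edges are dropped, so nodes d and e are left without any edge, which is the kind of disconnection the slide points out.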
+Local Graph Sparsification
(Satuluri et al. SIGMOD 2011)
In addition to edges marked red, some more edges marked blue were removed!
The edges removed with this method might not be a superset of the edges removed by the global sparsification method.
Removal of this edge disconnects the graph
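A rough sketch of the local strategy, assuming the commonly described rule in which each node of degree d keeps its top d^e incident edges by similarity, and an edge survives if either endpoint keeps it. The exponent name `e`, the use of a ceiling, and the edge format are assumptions for illustration; the slides do not state them.

```python
import math


def local_sparsify(edges, e=0.5):
    """Each node keeps its ceil(d**e) highest-scoring incident edges.

    edges: list of (u, v, score) tuples (hypothetical format).
    An edge is retained if at least one endpoint selects it.
    """
    incident = {}
    for u, v, score in edges:
        incident.setdefault(u, []).append((score, u, v))
        incident.setdefault(v, []).append((score, u, v))

    kept = set()
    for node, inc in incident.items():
        k = math.ceil(len(inc) ** e)  # ceiling is an assumption
        for score, u, v in sorted(inc, reverse=True)[:k]:
            kept.add((u, v, score))
    return kept


# Complete graph on four nodes: every node has degree 3 and keeps 2 edges.
edges = [("a", "b", 0.9), ("a", "c", 0.8), ("a", "d", 0.1),
         ("b", "c", 0.7), ("b", "d", 0.2), ("c", "d", 0.3)]
kept = local_sparsify(edges)
```

Because every node keeps at least one edge, this rule avoids singleton nodes, but it gives no guarantee that the graph stays connected.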
+Adaptive Global Sparsifier
Aims to address the drawbacks of LS and GS
Doesn’t remove an edge if the removal is going to make the graph more disconnected
Note:
This method is less aggressive in removing edges compared to local and global sparsification algorithms by Satuluri et al.
(Saha et al. SBP 2013)
+Adaptive Global Sparsifier
Keep the edges with top similarity scores (here, score >= 0.3)
+Adaptive Global Sparsifier (contd.)
Removing red edges doesn’t increase the number of connected components
Mauve colored edges have low similarity scores, but we put them back to avoid disconnected components
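One way to realize the two steps above is sketched below: keep every edge at or above the score threshold, then put back a low-score edge only when it joins two components that would otherwise be separated. This is an assumption-laden illustration using a union-find structure, not the published procedure from Saha et al.; the names `adaptive_sparsify` and `find` are hypothetical.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def adaptive_sparsify(edges, threshold):
    """Keep high-score edges, then restore low-score edges that act as bridges.

    edges: list of (u, v, score) tuples (hypothetical format).
    """
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)
    parent = {}
    for u, v, _ in ranked:
        parent.setdefault(u, u)
        parent.setdefault(v, v)

    kept, low = [], []
    for u, v, score in ranked:
        if score >= threshold:
            kept.append((u, v, score))
            parent[find(parent, u)] = find(parent, v)
        else:
            low.append((u, v, score))

    # Restore a low-score edge only if its endpoints are still in different
    # components, so the result has as many components as the input graph.
    for u, v, score in low:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            kept.append((u, v, score))
            parent[ru] = rv
    return kept


edges = [("a", "b", 0.9), ("b", "c", 0.8), ("a", "c", 0.7),
         ("c", "d", 0.2), ("b", "d", 0.15), ("d", "e", 0.1)]
kept = adaptive_sparsify(edges, threshold=0.3)
```

Here the edge b-d is the only one removed: c-d and d-e fall below the threshold but are restored because dropping them would split the graph.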
+Collective Classification in Networks
Input: A graph G = (V,E) with a given percentage of labeled nodes for training, and node features for all the nodes
Output: Predicted labels of the test nodes
Model:
1. Relational features and node features of the labeled nodes are used to train a local classifier
2. Test node labels are initialized with the labels predicted by the local classifier using node attributes
3. Inference proceeds through iterative classification of test nodes until a convergence criterion is reached
[Figure: network of researchers; node labels shown: MLDM, SW, AI, Bio; "?" marks an unlabeled node]
+Datasets & Experiments
Cora citation network, directed graph of 2708 research papers belonging to one of 7 research areas (classes) in Computer Science (data downloaded from http://www.cs.umd.edu/projects/linqs/projects/lbc/index.html )
DBLP co-authorship network among 5602 researchers in 6 different areas of computer science (raw data downloaded from http://arnetminer.org and processed)
Number of edges retained by the different sparsification algorithms with sparsification ratio s=70%:
Dataset   Total edges in network   Adaptive Global Sparsifier   Global Sparsifier   Local Sparsifier
Cora      5429                     3850                         3800                2429
DBLP      17265                    12251                        12086               6859
+Experiments (contd.)
Weighted Vote Relational Neighbor (wvRN) is used as the base collective classification algorithm (Macskassy et al. JMLR 2007)
Baseline methods: Global Sparsification Algorithm (GS) and Local Sparsification Algorithm (LS) (Satuluri et al. SIGMOD 2011)
Performance metric: Accuracy of Classification
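The idea behind wvRN can be sketched as follows: each unlabeled node's class distribution is the weighted average of its neighbors' distributions, updated repeatedly. This is a simplified illustration under stated assumptions: it starts from uniform priors rather than a local classifier, uses synchronous updates, and runs a fixed number of iterations instead of testing convergence. The function name `wvrn` and the adjacency format are hypothetical.

```python
def wvrn(adj, labels, classes, iterations=10):
    """Simplified weighted-vote relational neighbor inference.

    adj: {node: {neighbor: edge_weight}} (hypothetical format).
    labels: {node: class} for the training nodes.
    Returns predicted classes for the unlabeled nodes.
    """
    uniform = {c: 1.0 / len(classes) for c in classes}
    probs = {}
    for n in adj:
        if n in labels:
            probs[n] = {c: 1.0 if c == labels[n] else 0.0 for c in classes}
        else:
            probs[n] = dict(uniform)

    for _ in range(iterations):
        updated = {}
        for n in adj:
            total = sum(adj[n].values())
            if n in labels or total == 0:
                updated[n] = probs[n]  # training and isolated nodes stay fixed
                continue
            updated[n] = {
                c: sum(w * probs[m][c] for m, w in adj[n].items()) / total
                for c in classes
            }
        probs = updated

    return {n: max(classes, key=lambda c: probs[n][c])
            for n in adj if n not in labels}


# Chain a - x - y - b with labeled end points.
adj = {"a": {"x": 1.0},
       "x": {"a": 1.0, "y": 0.5},
       "y": {"x": 0.5, "b": 1.0},
       "b": {"y": 1.0}}
pred = wvrn(adj, {"a": "ML", "b": "DM"}, ["ML", "DM"])
```

In the chain, x sits next to the ML-labeled node and y next to the DM-labeled node, so each takes the label of its stronger neighbor.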
+Results
[Figure panels: Cora, DBLP]
+Sampling for Collective Classification
A good sample of a dataset should inherit all of its characteristics
Forest fire sampling, node sampling, edge sampling with induction (Ahmed et al. ICWSM 2012)
We argue: “goodness” of a sample is defined based on the problem we want to solve
Rationale:
Choosing samples for training should make sure that each test node is connected to at least one training node
Why? To facilitate collective classification by ensuring test nodes can have useful relational features computed from training nodes!
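The rationale can be checked directly on a candidate training set. The sketch below assumes "connected" means directly adjacent; the function name `uncovered_test_nodes` and the adjacency-list format are hypothetical.

```python
def uncovered_test_nodes(adj, train):
    """Return the test nodes that have no training node among their neighbors.

    adj: {node: list_of_neighbors} (hypothetical format).
    train: iterable of nodes chosen for training; all others are test nodes.
    """
    train = set(train)
    return [n for n in adj
            if n not in train and not (set(adj[n]) & train)]


# Path graph 1 - 2 - 3 - 4.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
poor = uncovered_test_nodes(adj, [1])     # nodes 3 and 4 see no training node
good = uncovered_test_nodes(adj, [1, 3])  # every test node is covered
```

A sample that leaves this list empty lets every test node compute relational features from at least one labeled neighbor.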
+Adaptive Forest Fire Sampling
Modified version of Forest Fire Sampling (Leskovec et al. KDD 2005)
Selects a random node as the “seed node” to start and marks it as “visited”
“Adaptive” because it randomly selects only a certain percentage of edges incident on a visited node, to propagate along the network and mark the nodes on the other end of edges as “visited”
Maintains a queue of unvisited nodes as propagation occurs in the network
Ensures that each test node is connected to at least one training node
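The steps above can be sketched roughly as follows. This is an illustration built only from the bullet points, not the authors' algorithm: visited nodes are treated as training nodes, the queue holds visited nodes that still need to be expanded, and the final repair loop that promotes a neighbor of any uncovered test node is an assumption about how the coverage guarantee might be enforced. The names `adaptive_forest_fire`, `burn_fraction`, and `train_size` are hypothetical.

```python
import random
from collections import deque


def adaptive_forest_fire(adj, train_size, burn_fraction=0.5, seed=0):
    """Pick training nodes by spreading along a random share of incident edges.

    adj: {node: list_of_neighbors} (hypothetical format).
    The result may exceed train_size because of the repair step.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    train_size = min(train_size, len(nodes))
    visited, queue = set(), deque()

    while len(visited) < train_size:
        if not queue:
            # Start (or restart) the fire from a random unvisited seed node.
            start = rng.choice([n for n in nodes if n not in visited])
            visited.add(start)
            queue.append(start)
            continue
        node = queue.popleft()
        incident = list(adj[node])
        rng.shuffle(incident)
        k = max(1, round(burn_fraction * len(incident)))
        for m in incident[:k]:
            if m not in visited and len(visited) < train_size:
                visited.add(m)
                queue.append(m)

    # Repair step (assumption): give every test node a training neighbor.
    for n in nodes:
        if n not in visited and adj[n] and not (set(adj[n]) & visited):
            visited.add(rng.choice(sorted(adj[n])))
    return visited


# Ring of 19 nodes, each linked to its two neighbors.
ring = {i: [(i - 1) % 19, (i + 1) % 19] for i in range(19)}
train = adaptive_forest_fire(ring, train_size=6)
```

On sparse graphs such as this ring, the repair step can add noticeably more training nodes than requested, so a real implementation would need to balance the coverage guarantee against the training budget.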
+Adaptive Forest Fire Sampling of network with 19 nodes
[Figure labels: test nodes]
+Experiments
Baseline classifiers used for comparing Random Sampling with Adaptive Forest Fire sampling:
wvRN (Macskassy et al. JMLR 2007)
Multi-class SVM (Crammer and Singer JMLR 2001, Tsochantaridis et al. ICML 2004)
RankNN for single labeled data (Saha et al. ICMLA 2012)
+Results (Cora citation network)
[Figure panels: Random Sampling, Adaptive Forest Fire Sampling]
+Conclusions
Introduced a sparsification method for collective classification of network datasets that does not lose much information and yields comparable accuracies
Introduced a network sampling algorithm for facilitating collective classification
These algorithms work on single-labeled networks; in the future we would like to extend these approaches to handle multi-labeled networks as well
These algorithms are designed for static networks; an interesting direction would be to formulate sampling methods for networks that change over time
+Thank You!