Discovering the Most Potential Stars in Social
Networks
Zhuo Peng, Chaokun Wang, Lu Han, Jingchao Hao and Yiyuan Ba
Proceedings of the Third International Conference on Emerging Databases, Incheon, Korea (August 2011)
Introduction
Related Work
Preliminary
Algorithm
Experiments
Conclusion
Outline
Purpose: to find the most potential stars in social networks, i.e. the members most worth promoting
How to measure importance
  incoming edges and outgoing edges
  most potential stars = members with the minimum promotion cost
How to find the most potential stars
  skyline query
  promote a non-skyline member into the skyline by adding new edges directly connected to it
  adding a new edge incurs a cost
Introduction
Member promotion in SNs: identify the most appropriate non-skyline member(s) that can be promoted into the skyline by adding edges at minimum cost
To the best of our knowledge, our paper is the first to raise the member promotion problem in SNs.
Problem Definition
First to raise the member promotion problem in SNs and to provide its formal definition
Propose a general algorithmic framework for promotion and a brute-force method that solves the problem intuitively
Apply several optimization strategies to improve efficiency, leading to the IDP algorithm
Conduct extensive experiments on both real and synthetic datasets to show the effectiveness and efficiency of the IDP algorithm
Contributions
Introduction
Related Work
  Skyline Query
  Skyline Minimum Vector
Preliminary
Algorithm
Experiments
Conclusion
Outline
Retrieves the subset of points that are not dominated by any other point in a set of D-dimensional data points
Algorithms
  Block Nested Loop (BNL)
  Divide-and-Conquer (D&C)
  Bitmap method
  Nearest Neighbor (NN)
  Branch-and-Bound Skyline (BBS)
Skyline Query
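As a rough illustration of what a skyline query returns, the sketch below uses a naive quadratic scan. It is not one of the algorithms listed on the slide (BNL, D&C, Bitmap, NN, BBS). It assumes the usual dominance definition with larger values being better, which the slides do not spell out; the names `dominates` and `skyline` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Assumed definition: a is at least as large as b in every
    dimension and strictly larger in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) scan: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


points = [(1, 5), (3, 3), (5, 1), (2, 2), (3, 1)]
print(skyline(points))
```

Here (2, 2) and (3, 1) are both dominated by (3, 3), so only the other three points remain.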
Studies the query for the points that can be changed into skyline points at the minimum cost
  Costs are measured by the L1 distance of the skyline vectors, which start from the original position and point to a skyline position. The skyline minimum vector thus indicates the minimum L1 distance.
  The non-skyline points that can be changed into skyline points by the skyline minimum vectors are the solutions to the problem.
Drawbacks
  the virtual points needed to compute the skyline vectors must be provided in advance
  the skip region used for optimization is not good enough
  no theoretical analysis, such as time complexity analysis or a correctness proof, has been provided
Skyline Minimum Vector
An SN is modeled as a directed graph G(V, E, W)
  V = the members in the SN
  E = the existing directed edges
  W : V × V → R+ gives, for each ordered pair of different members, the cost w of establishing the edge between them
Preliminary
Authoritativeness
  Given a node v in an SN G(V, E, W), the authoritativeness of v is defined as the indegree of v, namely d_in(v)
  Shows how much attention v can get
Hubness
  Given a node v in an SN G(V, E, W), the hubness of v is defined as the outdegree of v, namely d_out(v)
  Shows the importance of v as a hub
Authoritativeness and Hubness
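The two measures can be read straight off an edge list. The sketch below is a minimal illustration, assuming edges are given as (source, sink) pairs; the function name `degree_measures` is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]  # (source, sink)


def degree_measures(nodes: Iterable[Hashable],
                    edges: List[Edge]) -> Dict[Hashable, Tuple[int, int]]:
    """Map each node v to (authoritativeness, hubness) = (d_in(v), d_out(v))."""
    indeg = Counter(dst for _, dst in edges)
    outdeg = Counter(src for src, _ in edges)
    # Counter returns 0 for nodes that never appear as sink or source.
    return {v: (indeg[v], outdeg[v]) for v in nodes}


nodes = ["a", "b", "c"]
edges = [("a", "b"), ("c", "b"), ("b", "a")]
print(degree_measures(nodes, edges))
```

In this toy network b receives two edges and sends one, so it has the highest authoritativeness.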
Candidate Set
  Given an SN G(V, E, W), let S_G be the skyline member set. When S_G ≠ V, the set V − S_G, denoted as C*, is the candidate set of G. Each node c ∈ C* is a candidate for member promotion.
Dominator Set
  Given a member v in an SN G(V, E, W), the dominator set of v, denoted as δ(v), is the set of members that dominate v: δ(v) = {n ∈ V | n dominates v}.
Candidate Set and Dominator Set
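A small sketch of both definitions, assuming each member is described by the pair (d_in, d_out) and that dominance means "at least as large on both measures and strictly larger on one" (an assumption; the slides do not state the dominance rule). All function names are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

Measures = Dict[Hashable, Tuple[int, int]]  # member -> (d_in, d_out)


def dominates(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """Assumed dominance rule: no worse on both measures, better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def dominator_set(v: Hashable, measures: Measures) -> Set[Hashable]:
    """delta(v): the members that dominate v."""
    return {n for n in measures if n != v and dominates(measures[n], measures[v])}


def skyline_members(measures: Measures) -> Set[Hashable]:
    """S_G: the members with an empty dominator set."""
    return {v for v in measures if not dominator_set(v, measures)}


def candidate_set(measures: Measures) -> Set[Hashable]:
    """C* = V - S_G."""
    return set(measures) - skyline_members(measures)


measures = {"a": (1, 1), "b": (2, 1), "c": (0, 1), "d": (0, 3)}
print(sorted(skyline_members(measures)), sorted(candidate_set(measures)))
print(sorted(dominator_set("c", measures)))
```

Members b and d form the skyline here, while a and c are the candidates for promotion.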
Given an SN G(V, E, W) and a candidate c ∈ C*, a promotion plan against c, denoted as p ⊆ V × V, is a combination of edges that satisfies:
  (1) p ⊆ {e | (e = (c, ·) ∨ e = (·, c)) ∧ e ≠ (c, c) ∧ e ∉ E},
  (2) c ∈ S_G′, where G′ = (V, E ∪ p, W).
More generally, an edge combination that only meets (1) is called a plan
Promotion
Given an SN G(V, E, W), the cost of any plan p, denoted as γ(p), is the sum of the weights of the edges included in p. Writing the weight of an edge e as ϵ(e):
  γ(p) = Σ_{e ∈ p} ϵ(e) = Σ_{e ∈ p} W[e.from][e.to]
where e.from and e.to are the source node and the sink node of edge e, respectively.
For any c ∈ C*, let P_c be the set of promotion plans against c. The promotion cost of c, denoted as ζ(c), is the minimum cost among all of them:
  ζ(c) = min_{p ∈ P_c} γ(p)
Promotion Cost
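The two formulas translate almost directly into code. This is a minimal sketch assuming W is stored as a nested dictionary indexed as W[source][sink] and that the promotion plans against a candidate are already known; `plan_cost` and `promotion_cost` are hypothetical names, and the weights are made up for illustration.

```python
from typing import Dict, Hashable, Iterable, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]                   # (e.from, e.to)
Weights = Dict[Hashable, Dict[Hashable, float]]    # W[e.from][e.to]


def plan_cost(plan: Iterable[Edge], W: Weights) -> float:
    """gamma(p): sum of the weights of the edges in plan p."""
    return sum(W[src][dst] for src, dst in plan)


def promotion_cost(promotion_plans: Sequence[Sequence[Edge]], W: Weights) -> float:
    """zeta(c): minimum gamma(p) over the promotion plans P_c of one candidate."""
    return min(plan_cost(p, W) for p in promotion_plans)


# Illustrative weights only.
W = {"a": {"c": 2.0}, "b": {"c": 1.5}, "c": {"a": 4.0, "b": 0.5}}
plans_for_c = [[("a", "c"), ("c", "b")], [("c", "a")]]
print(promotion_cost(plans_for_c, W))
```

The first plan costs 2.0 + 0.5 and the second costs 4.0, so the promotion cost is the smaller of the two.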
Member Promotion in SNs
  Given an SN G(V, E, W), member promotion in SNs is to find the member set R that satisfies:
  (1) R ⊆ C*,
  (2) R = {r | r = argmin_{c ∈ C*} ζ(c)}
Problem Statement
A general framework for promotion algorithms
A brute-force method
An index-based dynamic pruning method
Algorithm
1. Offline, calculate the distribution of both measures over all the members
2. Determine the candidate set by a skyline query
3. For each candidate, perform promotions by adding the edges in the promotion plans, and update the minimum promotion cost if necessary
4. Return the optimal candidate(s) and the related optimal promotion plans
General Framework
For each plan size i, verifies all the possible plans with i edges against all the candidates before the best candidate is located
Brute-Force Algorithm
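The sketch below shows one way an exhaustive search of this kind could look. It is not the authors' pseudocode: it assumes dominance on (d_in, d_out) with larger being better, stores W as a nested dictionary, and caps the plan size with a hypothetical parameter `max_edges` so the search terminates. All function names are hypothetical.

```python
from itertools import combinations


def dominates(a, b):
    """Assumed dominance rule: no worse on both measures, better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def measures(nodes, edges):
    """(d_in, d_out) for every node, recomputed from scratch."""
    return {v: (sum(1 for _, d in edges if d == v),
                sum(1 for s, _ in edges if s == v)) for v in nodes}


def in_skyline(c, nodes, edges):
    m = measures(nodes, edges)
    return not any(dominates(m[v], m[c]) for v in nodes if v != c)


def addable_edges(c, nodes, edges):
    """Non-existing edges incident to c, i.e. the edges a plan may use."""
    return [e for v in nodes if v != c
            for e in ((c, v), (v, c)) if e not in edges]


def brute_force(nodes, edges, W, max_edges):
    """Try every plan with 1..max_edges edges against every candidate and
    keep the cheapest successful (candidate, plan) pairs."""
    edges = set(edges)
    candidates = [c for c in nodes if not in_skyline(c, nodes, edges)]
    best_cost, best = float("inf"), []
    for i in range(1, max_edges + 1):
        for c in candidates:
            for plan in combinations(addable_edges(c, nodes, edges), i):
                # Full re-verification on the updated network.
                if not in_skyline(c, nodes, edges | set(plan)):
                    continue
                cost = sum(W[s][d] for s, d in plan)
                if cost < best_cost:
                    best_cost, best = cost, [(c, plan)]
                elif cost == best_cost:
                    best.append((c, plan))
    return best_cost, best


nodes = ["a", "b", "c"]
edges = [("a", "b"), ("c", "b"), ("b", "a")]
W = {u: {v: 5.0 for v in nodes if v != u} for u in nodes}  # illustrative costs
W["a"]["c"], W["c"]["a"] = 3.0, 2.0
print(brute_force(nodes, edges, W, max_edges=2))
```

In this toy network adding the single edge (c, a) promotes either a or c at cost 2.0, which no other plan beats. The number of plans grows combinatorially with the plan size, which is the inefficiency the IDP algorithm targets.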
A number of “meaningless” promotions decrease the efficiency, so we need a way to recognize skippable plans for pruning
Several related theorems and lemmas support the pruning
IDP: The Index-based Dynamic Pruning Algorithm
Given an SN G(V, E, W), if adding an edge e connecting node v_i and the candidate node c still cannot promote c into the skyline set, then no edge e′ connecting a node v_j ∈ δ(v_i) and c in the same direction as e can promote c either
IDP : Theorem
Assume a plan p consisting of n edges e_1, e_2, …, e_n cannot get its target candidate c promoted.
For each edge e_i in p connecting v_i and c, let l_i be the list of all the non-existing edges that link a member of δ(v_i) and c in the same direction as e_i (i = 1, 2, …, n).
All the plans with n edges that belong to the Cartesian product l_1 × l_2 × … × l_n can be skipped in the subsequent verification process against c
IDP : Lemma
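The lemma can be pictured as generating a skip list from one failed plan. The sketch below is only an illustration, not the IDP index structure itself: it assumes the dominator sets δ(v) are supplied as a dictionary and represents each skippable plan as a frozenset of edges. The name `skippable_plans` is hypothetical.

```python
from itertools import product


def skippable_plans(failed_plan, c, dominators, edges):
    """Build l_1 .. l_n for a failed plan against candidate c and return
    the n-edge plans in their Cartesian product.

    dominators: dict mapping a member v to its dominator set delta(v)
                (assumed to be given).
    edges:      set of existing edges as (source, sink) pairs.
    """
    lists = []
    for src, dst in failed_plan:
        outgoing = (src == c)            # direction of e_i relative to c
        vi = dst if outgoing else src    # the endpoint of e_i other than c
        li = []
        for vj in sorted(dominators.get(vi, ())):
            if vj == c:
                continue                 # self-loops (c, c) are never allowed
            e = (c, vj) if outgoing else (vj, c)
            if e not in edges:           # only non-existing edges qualify
                li.append(e)
        lists.append(li)
    n = len(failed_plan)
    # Drop combinations that repeat an edge: they have fewer than n edges.
    return {frozenset(combo) for combo in product(*lists)
            if len(set(combo)) == n}


dominators = {"x": {"y", "z"}, "w": {"u"}}
existing = {("y", "c")}
print(skippable_plans([("x", "c"), ("c", "w")], "c", dominators, existing))
```

Here l_1 contains only (z, c), because (y, c) already exists, and l_2 contains only (c, u), so exactly one two-edge plan is skipped.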
The skyline may change after a plan is applied, so the candidate may still be dominated by other members
The brute-force approach is to recalculate the skyline set on the whole updated network
Theorem
  Given a plan p, let M be the set of members other than the candidate c that are incident to the edges in p. If a member v neither dominates c before the promotion nor belongs to M, v will still not dominate c after p is conducted.
Final Verification
To make sure c is successfully promoted, we only need to rule out that any member is a dominator of the candidate c
Two cases
  the members connected to any edge in the plan may become new dominators of c, because at least one of their two measures increases after the promotion
  the members in the skyline member set may still dominate c
Final Verification
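Following the theorem on the previous Final Verification slide, a reduced check only has to look at the members that dominated c before the promotion and the members in M. The sketch below compares that reduced check against a full recomputation. It assumes dominance on (d_in, d_out) with larger being better; the function names are hypothetical, and for brevity the degrees are recomputed from scratch, which a real implementation would presumably update incrementally.

```python
from collections import Counter


def dominates(a, b):
    """Assumed dominance rule: no worse on both measures, better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def measures(nodes, edges):
    indeg = Counter(dst for _, dst in edges)
    outdeg = Counter(src for src, _ in edges)
    return {v: (indeg[v], outdeg[v]) for v in nodes}


def is_promoted(c, plan, nodes, edges):
    """Reduced check: test only the old dominators of c and the set M."""
    edges = set(edges)
    before = measures(nodes, edges)
    old_dominators = {v for v in nodes
                      if v != c and dominates(before[v], before[c])}
    touched = {v for edge in plan for v in edge if v != c}  # the set M
    after = measures(nodes, edges | set(plan))
    return not any(dominates(after[v], after[c])
                   for v in old_dominators | touched)


def is_promoted_full(c, plan, nodes, edges):
    """Reference check: test every member of the updated network."""
    after = measures(nodes, set(edges) | set(plan))
    return not any(dominates(after[v], after[c]) for v in nodes if v != c)


nodes = ["a", "b", "c"]
edges = [("a", "b"), ("c", "b"), ("b", "a")]
for plan in ([("c", "a")], [("a", "c")], [("b", "c")]):
    print(plan, is_promoted("c", plan, nodes, edges))
```

Only the plan that adds (c, a) promotes c in this toy network; in the other two plans the new neighbour gains a degree as well and dominates c afterwards.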
Implemented in Java with JDK 1.6.0_10, on an Intel Core 2 Duo T7300 2.00 GHz CPU with 1 GB memory and a 120 GB hard disk, running Windows XP
Datasets
  USAir
    Includes 332 nodes and 2126 edges
  Power-law set
    Used a graph data generator, gengraph_win, to generate graph datasets
Experimental Settings
Verified the effectiveness by comparing the promotion costs of the IDP algorithm and a random promotion algorithm
Comparison on Promotion Cost
Compared the time cost of the brute-force algorithm and the IDP algorithm on both the USAir and Power-law datasets
Comparison on Time Cost
Raised a new, interesting problem: member promotion in social networks
Proposed two algorithms
  the brute-force algorithm
  the IDP algorithm
Future work
  Further improve the algorithm
  Allow several members to be promoted concurrently
Conclusion