

Cascading Failures of Social Networks under Attacks

Chengqi Yi¹, Yuanyuan Bao²,³, Jingchi Jiang¹, Yibo Xue²,³,*, Yingfei Dong⁴

¹ School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China.
² Tsinghua National Lab for Information Science and Technology, Tsinghua University, Beijing, 100084, China.
³ Research Institute of Information Technology, Tsinghua University, Beijing 100084, China.
⁴ Department of Electrical Engineering, University of Hawaii, Honolulu 96822, USA.


Abstract—Although cascading failures have occurred on many real-world networks, to our best knowledge, no one has clearly identified this issue on a social network. In this paper, we identify this potential issue on social networks, and develop a theoretical model to analyze related issues. Note that highly-influential “super” users play critical roles on a social network. When they are suddenly unavailable, a large portion of the social network may be seriously disrupted. The proposed model captures this dynamic process and helps us better understand related issues. Furthermore, we evaluate the proposed model under four attack strategies based on real social network datasets collected on Twitter and Sina Weibo. We also analyze the connectivity, the persistent time, and the cascade effect of a social network under these attacks. Our results show that social network service providers have to pay closer attention to super users to avoid dramatic failures.

Keywords-cascading failures; social network; attack strategies; betweenness centrality; super users

I. INTRODUCTION

In recent years, social networks such as Twitter, Facebook and Sina Weibo [1] have become an indispensable part of our lives. Twitter and Facebook are among the most popular social networking websites in the world. Facebook had over 1.23 billion monthly active users in 2013 (with over 945 million monthly active mobile users), and over 757 million daily active users in December 2013 [2]. Twitter had over 241 million monthly active users as of December 2013, with 184 million monthly active mobile users [3]. Sina Weibo is a social forum in China, somewhat like a hybrid of Twitter and Facebook. It is one of the most popular Chinese websites, with about 600 million registered users and 61.4 million daily active users in December 2013 [4]. These networks provide a powerful means for organizing contacts, publishing content, and sharing interests.

Nonetheless, the rapid development of social networks has also brought some potential problems. Although the user base of social networks keeps growing, user activity has decreased. For instance, according to a widely cited report by the third-party data tracking service WeiboReach [5], Sina Weibo’s activity in 2013 was down by 30% compared with the beginning of 2011. There are direct and indirect reasons behind this decline. One of the key reasons is that, when many highly influential users become unavailable, for example because they stop using the service or are attacked intentionally, a large portion of the social network is disrupted; sometimes, the whole network is paralyzed. These users, whom we call super users, play important roles on social networks.

To formalize the above phenomenon, we abstract it as a process of cascading failures. A cascading failure is a process in which the failure of one part of a system triggers the successive failures of many other parts of the system. Cascading failures can occur in many real-world networks [6–11], such as power grids [12–14], communication networks [15], and economic networks [16]. In these networks, loads are transported from node to node. If some nodes fail, their loads are redistributed to other nodes in the system, which may cause more nodes to fail. As this process spreads across the network, cascading failures take place.

In this paper, we focus on this issue on social networks and conduct our investigation in the following steps:

• We first quantify user loads based on betweenness centrality, and then construct the cascading process based on network dynamics.

• We then use four attack strategies to evaluate the correctness of the proposed cascading model based on real social network datasets.

• We also evaluate the connectivity, the persistent time, and the cascade effect under various attack strategies.

We show that the failure of super users can cause the complete failure of a social network. Moreover, our experimental results show that sparse and inhomogeneous social networks are more robust and more tolerant to failures of super users. However, whether a real social network is sparse and inhomogeneous depends on the characteristics of the concrete community. For example, we observe that our Twitter dataset is denser and more homogeneous than our Weibo dataset. We describe the main differences between our Twitter dataset and our Weibo dataset in detail in Section IV.

The remainder of this paper is organized as follows. In Section II, we will present related work on cascading failures on various networks. In Section III, we will introduce the basic ideas and construct the cascading model. In Section IV, we will introduce our experimental datasets collected on Twitter and Sina Weibo. In Section V, we will evaluate the correctness of the proposed cascading model. We will further analyze the connectivity, the persistent time and the cascade effect under various attack strategies. In Section VI, we will conclude this paper and discuss future directions.

2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), August 17-20, 2014, Beijing, China

978-1-4799-5877-1/14/$31.00 ©2014 IEEE


II. RELATED WORK

Researchers have investigated important aspects of cascading failures in many networks and obtained many valuable results. Wang et al. [17] proposed a two-stage cascading model for interdependent networks. They found that link patterns have important effects on improving the robustness of interdependent networks. Moreover, Wang [18] demonstrated the efficiency of mitigation strategies for enhancing the robustness of scale-free networks against cascading failures, and ranked these strategies by effectiveness. These results are very helpful for avoiding cascading-failure-induced disasters in the real world.

Dou et al. [19] proposed a non-linear load-capacity model against cascading failures. Tan et al. [20] extended the cascading failure model used in isolated networks to the case of interconnected networks. Li et al. [21] modeled the cascading dynamics in scale-free networks, WS small-world networks, and ER random networks, and further revealed the process under edge-based attacks. Eppstein [22] described a stochastic RC algorithm to identify large collections of multiple contingencies that initiate large cascading failures in a simulated power system. Bernstein [23] studied the effects of geographically correlated outages and the resulting cascades, and compared the numerical results with the actual events in a recent blackout in the San Diego area, thereby demonstrating that the model’s predictions were consistent with real events.

Our research differs from these existing works in two respects. First, while cascading failures may also occur on social networks, this issue has not been thoroughly investigated so far. Second, we utilize several unique features of social networks to quantify loads, such as a user’s ability to receive information and its location on the network.

III. CASCADING FAILURES ON SOCIAL NETWORKS

A. Example of Cascading Failures

A social network is similar to many other networks. While all users desire to obtain information on social networks quickly and comprehensively, the ability of a user to obtain information depends on its location on the network. If a user is located at the intersection of information transmission routes, it obtains information much more easily. However, the better a user’s ability to receive information, the higher its risk of receiving malicious information, such as advertising messages, illegal messages, and rumors. As too much malicious information may turn off a user, the user may simply drop out of the network. When this happens, the network topology changes. Such changes may increase the probability that other users receive malicious information. As a result, more users may fail, and cascading failures spread across the network, as shown in Fig. 1.

Figure 1. Cascading Failures on a Social Network: (a) original network; (b) network under attack; (c) network after cascading failures; (d) final network

In Fig. 1, the edge between users A1 and B1 is drawn as an arrow, indicating that user A1 follows B1. A white circle represents an active user and a shaded circle represents a failed user. In Fig. 1(a), the failure of user B3 (e.g., due to an attack) disconnects B3’s edges. In Fig. 1(b), B3’s load is redistributed to B3’s neighbors, including A3, B1, B2 and B4. This may lead the loads of B2 and A3 to exceed their capacities and cause them to fail. As this process spreads across the network, it reaches the steady state shown in Fig. 1(d), in which only three users (B1, B5, and A5) remain active.

B. Cascading Dynamics

As cascading failures are caused by load redistribution, we need a precise measure of the load carried by each user. To address this issue, we propose to use betweenness centrality to capture the characteristics of user loads. Betweenness centrality is a measure of a user’s centrality on a social network. It counts the shortest paths between all pairs of other users that pass through a particular user. A user with a high betweenness centrality has a high probability of obtaining information on a social network.

We model a social network as a weighted and directed graph G = {V, E, W}, where node set V represents users, edge set E represents relationships between users, and weight set W represents user loads. We define the direction of an edge as the direction of information dissemination: an edge from user s to user t means that user t follows user s, so information can propagate from user s to user t. Formally, based on the concept of betweenness centrality, the weight of user i is defined as follows.


$$W_i = \sum_{s \neq i \neq t} \frac{\delta^{i}_{st}}{\delta_{st}} \qquad (1)$$

where $\delta^{i}_{st}$ is the number of shortest paths from user s to user t that pass through user i for information transmission, and $\delta_{st}$ is the total number of available shortest paths from user s to user t. If $\delta_{st} = 0$, then $W_i = 0$, and $0 \le W_i \le 1$. Based on formula (1), we can also take the time factor into account when quantifying the load of a user: $W_i(t)$ denotes the load of user i at time t. As the network topology may change over time due to node failures, the load of a user may also change. $W_i(0)$ denotes the initial load of user i on the original network without failures.
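The load in formula (1) can be sketched directly from its definition. The following is a minimal illustration rather than the fast algorithm of [25]: it assumes the network is given as a dictionary mapping each user to the users that receive its information, and the helper names `bfs_counts` and `loads` are hypothetical. Dividing by (n − 1)(n − 2) so that the load stays within [0, 1] is also an assumption, since the normalization is not spelled out above.

```python
from collections import deque

def bfs_counts(adj, source):
    """Return (dist, sigma): hop distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:              # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u lies on a shortest path to v
                sigma[v] += sigma[u]
    return dist, sigma

def loads(adj, normalized=True):
    """Load W_i of every user: sum over pairs (s, t) of delta_st^i / delta_st."""
    nodes = set(adj) | {v for nbrs in adj.values() for v in nbrs}
    info = {s: bfs_counts(adj, s) for s in nodes}
    w = {i: 0.0 for i in nodes}
    for s in nodes:
        dist_s, sigma_s = info[s]
        for t in dist_s:                   # only reachable targets (delta_st > 0)
            if t == s:
                continue
            for i in nodes:
                if i == s or i == t or i not in dist_s:
                    continue
                dist_i, sigma_i = info[i]
                # i is on a shortest s-t path iff d(s,i) + d(i,t) == d(s,t)
                if t in dist_i and dist_s[i] + dist_i[t] == dist_s[t]:
                    w[i] += sigma_s[i] * sigma_i[t] / sigma_s[t]
    if normalized and len(nodes) > 2:
        scale = (len(nodes) - 1) * (len(nodes) - 2)   # assumed normalization
        w = {i: x / scale for i, x in w.items()}
    return w
```

For the three-user chain `{"a": ["b"], "b": ["c"], "c": []}`, only user b lies between another pair, so it carries the whole load. This pairwise version is cubic in the number of users and is meant for small examples only.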

The capacity of a user is defined as the highest load that the user can endure. Since the capacity is usually determined by the initial load, we assume that the capacity of user i is proportional to its initial load [9][24], that is, we assume a linear relationship between the capacity and the initial load of a user [17-19]. Let $C_i$ be the capacity of user i, defined as

$$C_i = (1 + \alpha)\, W_i(0) \qquad (2)$$

where the constant α (0<α<1) is a tolerance parameter. A small α means that a user is sensitive to topology changes among its neighbors; a large α means that a user is less sensitive to them. If $W_i(t) > C_i$, the load at time t causes user i to fail. The redistribution of the loads of failed users may cause more users to fail, and a cascading failure may occur.
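The capacity rule and the failure condition can be illustrated with a short sketch. It assumes that loads are stored in dictionaries keyed by user, and the function names `capacities` and `overloaded` are hypothetical.

```python
def capacities(initial_load, alpha):
    """Capacity C_i = (1 + alpha) * W_i(0) for every user i, as in formula (2)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha is assumed to lie strictly between 0 and 1")
    return {i: (1 + alpha) * w0 for i, w0 in initial_load.items()}

def overloaded(current_load, capacity):
    """Users whose current load W_i(t) exceeds their capacity C_i."""
    return {i for i, w in current_load.items() if w > capacity[i]}

# Example with made-up loads: u is pushed past its capacity, v is not.
cap = capacities({"u": 0.10, "v": 0.20}, alpha=0.3)
failed = overloaded({"u": 0.15, "v": 0.25}, cap)
```

One consequence of formula (2) worth keeping in mind is that a user whose initial load is zero also has zero capacity, so any positive load makes it fail.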

C. Simulation Algorithm of Cascading Failures on Social Networks

Based on the above analysis, we design an algorithm to simulate the process of cascading failures on social networks. The detailed algorithm is shown in Fig. 2.

The algorithm contains three steps: (i) building a weighted social network G = {V, E, W}, (ii) calculating the initial load Wi (0) of each user with a fast algorithm for betweenness centrality [25] and its capacity Ci according to the tolerance parameter α, and (iii) simulating the process of cascading failures.

Algorithm 1: Simulation algorithm of cascading failures on social networks
Input: the initial social network Gini = {Vini, Eini}.
Output: the final social network Gfin = {Vfin, Efin}.
Begin
  Initialize the node vector Vini and the edge vector Eini at time ti → Vini = {V1, V2, ..., Vn}, Eini = {E1, E2, ..., En}, i = 0
  Initialize the initial failure vector → U = {U1, U2, ..., Uk}
  Calculate the weight of each node by formula (1) → W = {W1, W2, ..., Wn}
  Calculate the capacity of each node by formula (2) → C = {C1, C2, ..., Cn}
  Generate the weighted social network from Vini, Eini, and W → G = {V, E, W}
  Remove the nodes U and their edges from G → G’
  Function cascading(G’, r := n - k) do
    Calculate the current load of each node by formula (1) → W’ = {W1’, W2’, ..., Wn’}
    for j := 1 to r do
      if Wj’ > Cj then
        Remove the node j and its edges at time ti → G’’
        r -= 1
      end if
    end for
    i += 1
    if G’ != G’’ then
      cascading(G’’, r)
    else
      break
    end if
  end Function cascading(G, r)
  return G’’

Figure 2. Simulation Algorithm of Cascading Failures on Social Networks
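The procedure in Fig. 2 can be sketched in Python as follows. This is an illustrative reading, not the original implementation: the graph is assumed to be a dictionary that maps every user to the set of users receiving its information, loads are taken as unnormalized betweenness recomputed on the surviving graph, and the names `betweenness`, `remove_nodes` and `simulate_cascade` are hypothetical. The recursion of Fig. 2 is written as a loop, and `rounds` counts only the iterations that removed at least one user.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness of every node of a directed, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:                                   # BFS from s
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = dict.fromkeys(adj, 0.0)
        for v in reversed(order):                      # back-propagate dependencies
            for u in preds[v]:
                delta[u] += sigma[u] / sigma[v] * (1 + delta[v])
            if v != s:
                bc[v] += delta[v]
    return bc

def remove_nodes(adj, failed):
    """Copy of the graph without the failed nodes and their edges."""
    return {u: {v for v in nbrs if v not in failed}
            for u, nbrs in adj.items() if u not in failed}

def simulate_cascade(adj, attacked, alpha):
    """Return (final graph, number of cascade rounds) after attacking some nodes."""
    initial = betweenness(adj)
    capacity = {v: (1 + alpha) * w for v, w in initial.items()}   # formula (2)
    graph = remove_nodes(adj, set(attacked))
    rounds = 0
    while True:
        load = betweenness(graph)                      # loads on the current topology
        failed = {v for v, w in load.items() if w > capacity[v]}
        if not failed:                                 # steady state reached
            return graph, rounds
        graph = remove_nodes(graph, failed)
        rounds += 1
```

As a small usage example, in the graph `{"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}` users b and c each carry half of the traffic from a to d. Attacking b with α = 0.5 shifts the whole path onto c, which exceeds its capacity and fails in the first round, leaving only a and d.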

To illustrate the above algorithm, we built a small synthetic network G according to the Watts-Strogatz small-world model. We then simulated cascading failures on this network with the above algorithm. In the original network G, the number of nodes is N = 100, the number of edges for each node is K = 10, and the probability P that an edge is rewired randomly is 0.5. The simulation results are shown in Fig. 3.
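A synthetic network of this kind can be generated with a few lines of Python. The sketch below follows the textbook Watts-Strogatz construction (a ring lattice in which every node is linked to its K nearest neighbors, followed by random rewiring); whether it matches the generator used for Fig. 3 in every detail is an assumption, and the function name `watts_strogatz` and the `seed` parameter are hypothetical.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Undirected small-world graph as a dict: node -> set of neighbors."""
    if k % 2 or k >= n:
        raise ValueError("k is assumed to be even and smaller than n")
    rng = random.Random(seed)
    neighbors = {u: set() for u in range(n)}
    # Ring lattice: connect each node to its k/2 nearest neighbors on each side.
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            neighbors[u].add(v)
            neighbors[v].add(u)
    # Rewire each lattice edge (u, u + step) with probability p.
    for step in range(1, k // 2 + 1):
        for u in range(n):
            if rng.random() >= p:
                continue
            v = (u + step) % n
            candidates = [w for w in range(n)
                          if w != u and w not in neighbors[u]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            neighbors[u].discard(v)
            neighbors[v].discard(u)
            neighbors[u].add(w)
            neighbors[w].add(u)
    return neighbors

g = watts_strogatz(100, 10, 0.5, seed=1)
```

Rewiring replaces one edge by another, so the graph keeps N·K/2 = 500 edges, which agrees with the size of the original network reported for Fig. 3. To use such a graph with a directed model, each undirected edge can be treated as a pair of opposite directed edges, which is again an assumption.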

Fig. 3 provides a more intuitive way to observe the process of cascading failures, which has five stages. Fig. 3(a) is the original network with 100 nodes and 500 edges. Fig. 3(b) is the network after five nodes are attacked, with 95 nodes and 426 edges. Fig. 3(c) is the network after one round of cascading failures, with 63 nodes and 166 edges. Fig. 3(d) is the network after two rounds of cascading failures, with 45 nodes and 63 edges. Finally, Fig. 3(e) is the final network with only 36 nodes and 34 edges.

IV. DATASET DESCRIPTION AND VISUALIZATION

A. Data Acquisition Method and Datasets

To evaluate the proposed cascading model, we collect corresponding data on Twitter and Sina Weibo. We use an online community perceiving method [26] to collect Chinese users and their relations on Twitter. During the data collection on Twitter, we find that the amount of data remains stable after the number of users reaches 630,000, which suggests that we have collected almost all Chinese users on Twitter. We then sort the 633,471 Chinese users by the number of followers and extract the top 2,000 famous Chinese users and their relations, which form 182,561 edges. The average degree <k> of these users is 91. On Sina Weibo, we collect the top 2,000 famous users according to the official website, together with the 11,702 relations among these users. The average degree <k> is 6.8.


Figure 3. Visualization of Cascading Failures by Using Simulation Algorithm

B. Data Statistics and Visualization

We present the degree distributions in Fig. 4, including the out-degree distribution, the in-degree distribution, and the total-degree distribution. The degree distribution of Sina Weibo follows a power law distribution. To some extent, the degree distribution of Twitter is closer to a Poisson distribution.
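The three degree distributions can be tabulated from an edge list as sketched below. The sketch assumes that each edge is a pair (s, t) meaning that information flows from s to t, and the function name `degree_distributions` is hypothetical. Users without any edge do not appear in an edge list and are therefore not counted.

```python
from collections import Counter

def degree_distributions(edges):
    """Return histograms (degree -> number of users) for out, in and total degree."""
    out_deg, in_deg = Counter(), Counter()
    for s, t in edges:
        out_deg[s] += 1
        in_deg[t] += 1
    nodes = set(out_deg) | set(in_deg)
    total = {v: out_deg[v] + in_deg[v] for v in nodes}

    def histogram(degree):
        # Count how many users share each degree value.
        return Counter(degree[v] for v in nodes)

    return histogram(out_deg), histogram(in_deg), histogram(total)

out_hist, in_hist, total_hist = degree_distributions(
    [("a", "b"), ("a", "c"), ("b", "c")])
```

Plotting such histograms on log-log axes is a common first check for a power law, since a power-law distribution appears there as an approximately straight line.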

Meanwhile, we present the graph distance distribution in Fig. 5, including the closeness centrality distribution, the betweenness centrality distribution and the eccentricity distribution.

Furthermore, we use Gephi [27] to visualize the cascading failures on social networks. We color user nodes according to their degree. After applying the layout algorithm of Yifan Hu [28], the cascading failures can be visualized clearly, as shown in Fig. 6. (Please see the colored version of Fig. 6.)

Figure 4. Degree Distribution of Twitter and Sina Weibo


Figure 5. Graph Distance Distribution of Twitter and Sina Weibo

Figure 6. Visualization of Twitter and Sina Weibo

The average degree, the degree distribution, the graph distance distribution, and the visualization show that Twitter is fairly dense and homogeneous, while Sina Weibo is relatively sparse and inhomogeneous. We believe the reason for the difference between these two datasets is that famous Chinese users on Twitter follow each other at an extremely high rate. The following percentage f of the Twitter dataset is 4.55%, which is calculated by

$$f = \frac{\langle k \rangle}{n - 1} \qquad (3)$$

where <k> is the average degree and n is the number of nodes. In contrast, famous users on Sina Weibo mostly belong to completely different industries, including entertainment, sports, media, finance, IT, literature and government, so their relations are relatively sparse. (We are not sure, however, whether famous users within the same industry on Sina Weibo are as dense and homogeneous as the Twitter dataset.) The following percentage f of the Weibo dataset is only 0.34%.
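As a check, the two reported percentages can be reproduced by substituting the average degrees from Section IV-A, assuming n = 2000 users in each dataset and reading formula (3) as the average degree divided by n − 1:

```latex
\begin{align*}
f_{\text{Twitter}} &= \frac{91}{2000 - 1} \approx 0.0455 = 4.55\%, \\
f_{\text{Weibo}}   &= \frac{6.8}{2000 - 1} \approx 0.0034 = 0.34\%.
\end{align*}
```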

V. CASCADING FAILURES ANALYSIS UNDER ATTACKS

A. Persistent Time Analysis

In this section, we focus on the persistent time of cascading failures. Based on the simulation algorithm in Fig. 2, we regard the number of iterations of the algorithm as the persistent time of cascading failures. We use four attack strategies (HL, MPR, MD, and RD) to analyze our datasets. In Fig. 7 and Fig. 8, the x-axis represents the tolerance parameter α, which varies from 0 to 1, and the y-axis represents the persistent time. We set the number of attacked nodes β to 1, 5, 10, 20, 50 and 100, respectively. We choose β = 100 as the maximum because 100 nodes is no more than 5% of the total nodes. The experimental results on the two datasets are shown in Fig. 7 and Fig. 8.

Interestingly, the persistent time tends to increase first and then drop to a stable value across attack strategies, datasets, and numbers of attacked nodes. This shows that cascade propagation lasts only a short time when users are either particularly sensitive to the failures of their neighbors or simply do not care about them. Although the persistent times of both datasets are similar, cascading failures have a stronger influence on the Twitter dataset than on the Weibo dataset. These results agree with practical intuition and support the correctness of the above cascading dynamics. In addition, the persistent time is longer when the tolerance parameter satisfies 0.2<α<0.6 on Sina Weibo and 0.1<α<0.5 on Twitter. On both sparse and dense social networks, the cascade under the ML attack generally lasts longest.

Figure 7. Persistent Time Analysis of Weibo Dataset under Various Attack Strategies


Figure 8. Persistent Time Analysis of Twitter Dataset under Various Attack Strategies

B. Connectivity Analysis

In this section, we focus on the impact of cascading failures on the connectivity of social networks. The connectivity of social networks is quantified by a connected ratio ω, defined as follows.

$$\omega = \frac{R(G)}{R(G')} \qquad (4)$$

where the numerator R(G) is the number of connected components of the original social network G, and the denominator R(G’) is the number of connected components of the social network G’ after cascading failures. A small ω represents low connectivity after cascading failures, while a large ω represents high connectivity.
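The connected ratio can be computed with a simple union-find pass, as sketched below. Because the graph is directed, the sketch counts weakly connected components (edge direction ignored), which is an assumption. The function names `count_components` and `connected_ratio` are hypothetical.

```python
def count_components(nodes, edges):
    """Number of weakly connected components among the given nodes."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for s, t in edges:
        if s in parent and t in parent:     # skip edges that touch removed nodes
            parent[find(s)] = find(t)
    return sum(1 for v in parent if find(v) == v)

def connected_ratio(nodes_before, nodes_after, edges):
    """omega = R(G) / R(G'), where G' keeps only the surviving nodes."""
    after = count_components(nodes_after, edges)
    if after == 0:
        raise ValueError("no surviving nodes, so the ratio is undefined")
    return count_components(nodes_before, edges) / after

# A chain a-b-c-d is one component; losing b splits the rest into {a} and {c, d}.
omega = connected_ratio({"a", "b", "c", "d"}, {"a", "c", "d"},
                        [("a", "b"), ("b", "c"), ("c", "d")])
```

In this example ω is 1/2: the more the surviving network fragments, the larger R(G’) becomes and the smaller ω gets.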

We use the same attack strategies to analyze the impact of cascading failures on the connectivity of the Weibo and Twitter datasets. As shown in Fig. 9 and Fig. 10, social networks still preserve high connectivity under the RD attack strategy. However, connectivity is greatly reduced under the other three attack strategies. The embedded figures provide a clearer contrast among these three strategies. First, we observe that more attacked nodes result in a smaller ω. The largest connected ratio ω falls from 0.6 to 0.02 on the Weibo dataset, and from 1 to 0.2 on the Twitter dataset. Meanwhile, we observe that Twitter is more robust than Sina Weibo in terms of the connected ratio. Second, regarding the trends of the curves, we find inflection points, where the curvature changes sign from minus to plus, at α ≈ 0.3 on the Twitter dataset and α ≈ 0.2 on the Weibo dataset. Finally, in all cases, the HL, MPR and MD strategies are more effective than the RD strategy. Moreover, on the Twitter dataset, the MPR strategy is more effective than the others when β < 5, while the HL and MD strategies are more effective when β ≥ 5. On the Weibo dataset, however, because of its low density and low homogeneity, the effects of the HL, MPR and MD strategies are not obvious.

Figure 9. Connectivity Analysis of Weibo Dataset under Various Attack Strategies

Figure 10. Connectivity Analysis of Twitter Dataset under Various Attack Strategies

C. Cascade Effect Analysis

In this section, to further understand the process of cascading failures, we examine how many users and their relations ultimately fail. The cascade effect on social networks is quantified by the failure ratio σ, defined as follows.


$$\sigma = \frac{N(G')}{N(G)} \qquad (5)$$

where the denominator N(G) is the number of nodes of the original social network G, and the numerator N(G’) is the number of nodes of the social network G’ after cascading failures. A smaller σ represents a weaker cascade effect, while a larger σ represents a stronger cascade effect. The experimental results on the Weibo dataset are shown in Fig. 11, and those on the Twitter dataset are shown in Fig. 12.
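A minimal sketch of formula (5) follows, assuming that G and G’ are given as collections of user identifiers. Since σ is described as a failure ratio in which larger values mean a stronger cascade, the sketch also reports the share of users removed, 1 − N(G’)/N(G); treating that complement as the quantity of interest is an assumption, and both function names are hypothetical.

```python
def remaining_ratio(original_nodes, final_nodes):
    """N(G') / N(G): formula (5) applied to the node sets as written."""
    original = set(original_nodes)
    if not original:
        raise ValueError("the original network must contain at least one node")
    return len(set(final_nodes) & original) / len(original)

def failed_share(original_nodes, final_nodes):
    """Fraction of the original users missing from the final network (assumed reading)."""
    original = set(original_nodes)
    if not original:
        raise ValueError("the original network must contain at least one node")
    return len(original - set(final_nodes)) / len(original)

# Four users, one survivor: a quarter remains and three quarters have failed.
kept = remaining_ratio(["a", "b", "c", "d"], ["a"])
lost = failed_share(["a", "b", "c", "d"], ["a"])
```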

Figure 11. Cascade Effect Analysis of Weibo Dataset under Various Attack Strategies

Figure 12. Cascade Effect Analysis of Twitter Dataset under Various Attack Strategies

Fig. 11 and Fig. 12 show that the curves for the RD attack strategy fluctuate up and down on both the Weibo and Twitter datasets. The reason is that, under the RD attack strategy, if super users happen to be attacked, they trigger cascading failures and can cause the complete failure of a social network, so the failure ratio σ may be extremely large. Otherwise, the failure ratio σ is low. We therefore conclude that a few highly influential users can cause most of the failures of a social network. The failure ratio σ for β = 1 is similar to that for β = 5, 10, 20, 50 and 100. These results show that attacking a single critical user can inflict serious damage on the entire social network.

In the above figures, we find inflection points at which the failure ratio σ drops dramatically, when 0.1<α<0.4 on the Weibo dataset and 0.2<α<0.4 on the Twitter dataset. The reason is that a larger α represents lower user sensitivity to topology changes. When α < 0.4, users’ sensitivity is quite high, which causes a stronger cascade effect. When α ≥ 0.5, only a few users fail due to cascading.

In Fig. 11 and Fig. 12, the maximum failure ratio is 0.18<σ<0.23 on the Weibo dataset, whereas it is 0.82<σ<0.86 on the Twitter dataset. This illustrates that the Weibo dataset is more difficult to disrupt by attacking a set of super users than the Twitter dataset. We therefore conclude that a sparse and inhomogeneous network is more difficult to disrupt by attacking a set of super users than a dense and homogeneous network. Liu et al. [29] found that dense and homogeneous networks can be controlled much more easily using a few driver nodes. This conclusion is similar to ours, although the analytical methods differ: they developed analytical tools, based mainly on control theory, to study the controllability of an arbitrary complex directed network, whereas we focus on the perspective of cascading failures and network dynamics.

To make the above conclusion more intuitive, we use the HL attack strategy (tolerance parameter α = 0.3, number of attacked nodes β = 5) to compare the cascading failures on the two datasets, as shown in Fig. 13. We reach the same conclusion: sparse and inhomogeneous social networks are more difficult to disrupt by attacking a set of super users than dense and homogeneous social networks.

Figure 13. The Contrast of Cascading Failure for Twitter and Sina Weibo


VI. CONCLUSION AND FUTURE WORK

In this paper, we have quantified user load based on betweenness centrality and constructed the cascading failure process based on network dynamics. Using real social network datasets, we have used four attack strategies to evaluate the correctness of the proposed cascading model. We have further analyzed the connectivity, the persistent time and the cascade effect of social networks under attacks. We have shown that social networks with super users are vulnerable to attacks. Our experimental results have also shown that sparse and inhomogeneous social networks are more difficult to disrupt than dense and homogeneous social networks. Due to the space limit, we can only use small figures to report the general trends; larger and clearer figures are available in a report [30].

In our future investigation, we will examine load redistribution strategies to mitigate cascading failures of social networks. In many physical networks, the traditional mitigation strategy for avoiding disastrous cascading failures is to redistribute a protected node's extra load to its neighboring nodes when the node's load exceeds its capacity. Social networks, however, differ from these networks. The extra load of a protected node should not be redistributed to its neighboring nodes, especially its friends. Otherwise, if the nodes receiving the redistributed load are the protected node's neighbors, they are most likely to cause the protected node itself to fail.
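The traditional neighbor-based rule mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions: the excess load is split equally among neighbors (other rules, such as the preferential redistribution in [9], weight the shares differently), and the function name, the three-node graph and all load and capacity values are hypothetical.

```python
def redistribute_excess(load, capacity, adj, node):
    """Move the excess load of `node` onto its neighbors in equal shares."""
    excess = load[node] - capacity[node]
    if excess <= 0 or not adj[node]:
        return dict(load)  # nothing to shed, or nobody to receive it
    new_load = dict(load)
    new_load[node] = capacity[node]
    share = excess / len(adj[node])
    for w in adj[node]:
        new_load[w] += share
    return new_load


# Hypothetical example: protected user "u" with two friends "a" and "b".
adj = {"u": ["a", "b"], "a": ["u"], "b": ["u"]}
load = {"u": 14.0, "a": 4.0, "b": 9.0}
capacity = {"u": 10.0, "a": 10.0, "b": 10.0}

after = redistribute_excess(load, capacity, adj, "u")
overloaded = sorted(v for v in after if after[v] > capacity[v])
```

In this example, relieving "u" pushes its friend "b" above capacity, which illustrates how shifting load onto direct neighbors can move the overload next to the protected node rather than remove it.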

ACKNOWLEDGMENT

This work was supported by the National Key Technology R&D Program of China under Grant No. 2012BAH46B04. We would like to thank the reviewers for their time and help.

REFERENCES

[1] Gao Q, Abel F, Houben G J, et al. A comparative study of users’ microblogging behavior on Sina Weibo and Twitter. User Modeling, Adaptation, and Personalization. Springer Berlin Heidelberg, 2012: 88-101.

[2] Emil P, “Facebook passes 1.23 billion monthly active users, 945 million mobile users, and 757 million daily users”. [Online] Available: http://thenextweb.com/facebook/2014/01/29/facebook-passes-1-23-billion-monthly-active-users-945-million-mobile-users-757-million-daily-users/#!zXtJr.

[3] Emil P, “Twitter passes 241m monthly active users, 184m mobile users, and sees 75% of advertising revenue from mobile”. [Online] Available: http://thenextweb.com/twitter/2014/02/05/twitter-passes-million-monthly-active-users-x-million-mobile-users/#!zXvTZ.

[4] Paul B, “Sina Weibo adds 2.5 million daily active users, growth cut in half”. [Online] Available: http://www.techinasia.com/sina-weibo-adds-35-million-daily-active-users-growth-cut/.

[5] Digitaljungle, “Is Sina Weibo Losing Its Cool-Activity Down 30%”. [Online] Available: http://www.digitaljungle.com.cn/blogs/is-sina-weibo-losing-its-cool-activity-down-30.

[6] Wang W X, Chen G. Universal robustness characteristic of weighted networks against cascading failure. Physical Review E, 2008, 77(2): 026101.

[7] Ash J, Newth D. Optimizing complex networks for resilience against cascading failure. Physica A: Statistical Mechanics and its Applications, 2007, 380: 673-683.

[8] Wang J W, Rong L L. Edge-based-attack induced cascading failures on scale-free networks. Physica A: Statistical Mechanics and its Applications, 2009, 388(8): 1731-1737.

[9] Wei D Q, Luo X S, Zhang B. Analysis of cascading failure in complex power networks under the load local preferential redistribution rule. Physica A: Statistical Mechanics and its Applications, 2012, 391(8): 2771-2777.

[10] Li W, Bashan A, Buldyrev S V, et al. Cascading failures in interdependent lattice networks: The critical role of the length of dependency links. Physical Review Letters, 2012, 108(22): 228702.

[11] Mirzasoleiman B, Babaei M, Jalili M, et al. Cascaded failures in weighted networks. Physical Review E, 2011, 84(4): 046114.

[12] Albert R, Albert I, Nakarado G L. Structural vulnerability of the North American power grid. Physical Review E, 2004, 69(2): 025103.

[13] Chang L, Wu Z. Performance and reliability of electrical power grids under cascading failures. International Journal of Electrical Power & Energy Systems, 2011, 33(8): 1410-1419.

[14] Wang J W, Rong L L. Cascade-based attack vulnerability on the US power grid. Safety Science, 2009, 47(10): 1332-1336.

[15] Rosato V, Issacharoff L, Tiriticco F, et al. Modelling interdependent infrastructures using interacting dynamical models. International Journal of Critical Infrastructures, 2008, 4(1): 63-79.

[16] Buldyrev S V, Parshani R, Paul G, et al. Catastrophic cascade of failures in interdependent networks. Nature, 2010, 464(7291): 1025-1028.

[17] Wang J, Jiang C, Qian J. Robustness of interdependent networks with different link patterns against cascading failures. Physica A: Statistical Mechanics and its Applications, 2014, 393: 535-541.

[18] Wang J. Mitigation strategies on scale-free networks against cascading failures. Physica A: Statistical Mechanics and its Applications, 2013.

[19] Dou B L, Wang X G, Zhang S Y. Robustness of networks against cascading failures. Physica A: Statistical Mechanics and its Applications, 2010, 389(11): 2310-2317.

[20] Tan F, Xia Y, Zhang W, et al. Cascading failures of loads in interconnected networks under intentional attack. EPL (Europhysics Letters), 2013, 102(2): 28009.

[21] Li S, Li L, Yang Y, et al. Revealing the process of edge-based-attack cascading failures. Nonlinear Dynamics, 2012, 69(3): 837-845.

[22] Eppstein M J, Hines P D H. A “random chemistry” algorithm for identifying collections of multiple contingencies that initiate cascading failure. IEEE Transactions on Power Systems, 2012, 27(3): 1698-1705.

[23] Bernstein A, Bienstock D, Hay D, et al. Sensitivity analysis of the power grid vulnerability to large-scale cascading failures. ACM SIGMETRICS Performance Evaluation Review, 2012, 40(3): 33-37.

[24] Xia Y, Fan J, Hill D. Cascading failure in Watts–Strogatz small-world networks. Physica A: Statistical Mechanics and its Applications, 2010, 389(6): 1281-1285.

[25] Brandes U. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 2001, 25(2): 163-177.

[26] Jingchi J, Chengqi Y, Yuanyuan B, et al. Online Community Perceiving Method on Social Network. The International Workshop on Cloud Computing and Information Security. Atlantis Press, 2013.

[27] Bastian M, Heymann S, Jacomy M. Gephi: an open source software for exploring and manipulating networks. Proc. of 3rd International AAAI Conference on Weblogs and Social Media (ICWSM). 2009.

[28] Hu Y. Efficient, high-quality force-directed graph drawing. Mathematica Journal, 2005, 10(1): 37-71.

[29] Liu Y Y, Slotine J J, Barabási A L. Controllability of complex networks. Nature, 2011, 473(7346): 167-173.

[30] [Online] Available: http://yun.baidu.com/share/link?shareid=451630227&uk=1107764767.
