[ieee 2009 international conference on advances in social network analysis and mining (asonam) -...



Page 1: [IEEE 2009 International Conference on Advances in Social Network Analysis and Mining (ASONAM) - Athens, Greece (2009.07.20-2009.07.22)] 2009 International Conference on Advances in

Mining Organizational Networks for Layoff Prediction Model Construction

Huo-Tsan Chang, Hui-Ju Wu
Graduate Institute of Human Resource Management
National Changhua University of Education
[email protected]

I-Hsien Ting
Department of Information Management
National University of Kaohsiung
[email protected]

Abstract

The global economic recession has caused unpaid leave and massive layoffs in major high-tech firms in Taiwan, and industry reports indicate that both present great potential hardship to many employees. Layoff prediction and management have therefore become great concerns of employees and managers. Employees wish to retain their jobs for a long time, so they need to anticipate a possible layoff and then use their resources to keep their jobs. In response to the difficulty of layoff prediction, this study applies social network and data mining techniques to build a layoff prediction model. The study compares various techniques in order to propose a better approach for generating a list of employees who may be laid off. The results of an empirical study indicate that the proposed approach, which uses organizational networks, employee databases and layoff records to build the layoff prediction model, achieves good prediction accuracy.

1. Introduction

The troubled economy has led some companies to lay off employees. The domino effect of unpaid leave and layoffs has recently been spreading through the high-tech industry in Taiwan. As the growth of firms slowed down and resources became limited, more and more firms were competing for fewer resources and adopted downsizing and survival tactics.

Previous research on employee turnover behavior mainly focuses on the reasons for, and the factors affecting, employees' turnover intention. However, the factors behind layoffs and the construction of a layoff prediction model from real business data have not yet been well examined. Moreover, the application of social network analysis (SNA) together with data mining (DM) techniques to layoff prediction model construction has also received little attention.

Therefore, layoff prediction and management have become of great concern to employees and managers. Employees wish to retain their jobs for a long time, so they need to anticipate a possible layoff and then use their resources to keep their jobs. In response to the difficulty of layoff prediction, this study applies SNA and DM techniques to build a model for layoff prediction.

Social network analysis treats organizations in society as a system of objects (e.g. people, groups, and organizations) joined by a variety of relationships [11]. Research on social networks indicates that network structure and activities influence employees and affect individual organizational outcomes [13]. Data mining is emerging as a class of analytical techniques that go beyond statistics and aim at examining large quantities of data in databases.

This paper aims to introduce the importance of applying DM and SNA to predict layoffs through an empirical study. It first provides a literature review of recent research on and applications of SNA and DM, followed by a discussion of the concepts of DM and SNA. A case study based on one organization is then used to illustrate how SNA and DM are applied to develop a model for predicting layoffs. Future directions for applying SNA and DM to organizational networks are also provided in this paper.

2. Literature Review

In this section, related literature on social network analysis and data mining is reviewed.

2.1. Social Networks

Social network analysis provides a rich and systematic means of assessing such networks by mapping and analyzing relationships among people, teams, departments or even the entire organization [10]. Organizations are considered to be networks of individuals, and researchers have used network analysis to map information flow as well as relational characteristics among strategically important groups in order to improve knowledge creation and sharing [5].

2009 Advances in Social Network Analysis and Mining

978-0-7695-3689-7/09 $25.00 © 2009 IEEE

DOI 10.1109/ASONAM.2009.52

Mapping and understanding social networks within an organization is a means of understanding how social relationships may affect business processes. To understand the complexity of the task, let us consider the various structural measures that can be applied to social networks. These structures are characterized by relationships, entities, context, configurations, and temporal stability. Some of the indices and dimensions that express network outcomes are:

1. Size: density and degree. Size is critical for the structure of social relations because each actor has limited resources for building and maintaining ties. The degree of an actor is defined as the sum of the connections between the actor and others. The density measurement can be used to analyze the connectivity of the nodes and links in a social network [14].

2. Centrality: Centrality measures, such as betweenness and closeness, describe how central an actor is in the social network. They can be used to identify who has the most connections to others in the network (high degree) or whose departure would cause the network to fall apart [14].

3. Structural hole: The structural hole is also a measurement of social network analysis. It can be used to discover the holes in a social network so that they can be filled and the social network expanded [14].

4. Reachability: Reachability can be used to analyze whether one node can be reached from another node in the social network. An actor is reachable by another if there exists any set of connections by which we can trace from the source to the target actor, regardless of how many others fall between them [7].

5. Distance: Because most individuals are not usually connected directly to most other individuals in a population, it can be quite important to go beyond simply examining the immediate connections of actors and the overall density of direct connections in populations. Walks, trails and paths are basic concepts for developing more powerful ways of describing various aspects of the distances among actors in a network [4] [12].

2.2. Data Mining

Data mining applies intelligent methods to cleaned data in order to extract data patterns. Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their huge databases [15].

Given databases of sufficient size and quality, data mining technologies can be used to generate new business opportunities by providing two capabilities: automatic prediction of trends and behaviors, and automatic discovery of previously unknown patterns. The most commonly used techniques in data mining are listed as follows [15]:

1. Classification: The goal of classification is to predict the value of a user-specified goal attribute based on the values of other attributes, called the predicting attributes. This is the most studied data mining approach [6] [8].

2. Clustering: In clustering applications, data mining algorithms must "discover" classes by partitioning the whole data set into several clusters, which is a form of unsupervised learning [2].

3. Associations: Association analysis finds association rules for items in a transaction file, and is able to find all rules, including compound and hierarchical rules [2].

4. Genetic Algorithm: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of evolution [3].

5. Decision Tree: The decision tree technique is one of the data mining methods developed for classification and prediction. It is of great help in revealing explicit relationships between attributes in large amounts of data. Much research has been done with decision tree algorithms because of their strong rule extraction and prediction ability [15].

3. Methods

3.1 Research process

After clarifying the research background and objectives, we now define the process and architecture of the research. To achieve the research goal, the database of former employees of a semiconductor company in the Hsinchu Science Park of Taiwan is used. The research architecture is shown in Figure 1, and each stage is described below:

1. Step 1: Exploratory Data Analysis. The first stage discovers the layoff record files in the former employees' database.

2. Step 2: Constructing the Organizational Network. Social network analysis is used to construct an organizational network from the former employees' database. We use several network indicators, including density, degree, reachability, centrality, and position and role, to analyze the relationship between the manager and the laid-off employees.

3. Step 3: Data Mining Analysis of the Layoff File. Data mining techniques are applied to analyze the employees' attributes. We used cluster analysis to discover classes by partitioning the whole data set into several clusters, and used association rules to discover the important associations among items.

4. Step 4: Constructing the Layoff Prediction Model. Finally, we used the decision tree technique for classification and to construct the layoff prediction model from the laid-off employees' organizational networks.

Figure 1: Research Architecture

3.2 Constructing Organizational Network

The global economic recession has caused unpaid leave and massive layoffs by major high-tech firms in Taiwan. To explore the phenomenon, we used the records of 528 employees who left their jobs during 2007~2009 for the empirical study. We propose ten attributes for building an organizational network by means of social network analysis. These attributes are department, supervisor, sex, age, shift, live register, marriage, position, education level and grade.

Social network analysis is an appropriate analysis method for "relational data". We need to construct the similarity attributes between managers and laid-off employees based on the social network. Records with similar attribute values tend to be assigned to the same network and are considered to have a relationship, whereas records that differ from each other tend to be assigned to distinct networks. In this network, the similarity attributes between an employee and another employee (or manager) are then divided into three kinds. In this way we can indicate whether there is a tie between an employee and a manager. The relationship matrix is shown in Table 1. A1 has a tie with employees A3 and A6, but not with A2, A4, A5 or SUA (Manager A), and not with himself or herself.

Social network analysis is used to construct the organizational network relationships of laid-off employees from the former employees' database. The process is illustrated in Figure 2. This study uses organizational network A as an example to explain the research process.

Table 1: The relationship matrix of laid-off employees

        emA1  emA2  emA3  emA4  emA5  emA6  suA
emA1     0     0     1     0     0     1     0
emA2     0     0     0     0     1     0     0
emA3     1     0     0     0     1     0     0
emA4     0     0     0     0     0     1     0
emA5     0     1     1     0     0     1     1
emA6     1     0     0     1     1     0     1
suA      0     0     0     0     1     1     0

Figure 2: Organizational Networks of laid-off employees

The study uses the network software UCINET 6.182 to analyze the layoff indicators of the employees' organizational networks, including size, reachability, centrality, distance, and position and role analysis. Because each indicator emphasizes a different aspect of the network, together they give us insight into how, and to what degree, employees communicate with each other. The study is thus able to find some clues about the network positions associated with layoffs. These variables are then used to construct the network structure graph and data. The approach assumes that the structure or pattern of ties in a social network is meaningful to the members of the network. Each indicator is described as follows:

1. Size: density and degree.

The density of a network compares the number of ties that exist between employees with the maximum possible number of ties between them. Figure 3 shows the density analysis of this network. The density of the network is 0.3816, based on 16 ties.

Figure 3: Density of the Network
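As an illustration of the density definition, the following minimal Python sketch recomputes it from the ties in Table 1. It is not the UCINET procedure; the function names are hypothetical, and it assumes density is the number of nonzero off-diagonal cells divided by n(n-1). Under that assumption the matrix yields 16 ties and a density of about 0.381, close to the reported 0.3816.

```python
# Ties transcribed from Table 1 (the matrix is symmetric, so each pair is listed once).
NODES = ["emA1", "emA2", "emA3", "emA4", "emA5", "emA6", "suA"]
EDGES = [
    ("emA1", "emA3"), ("emA1", "emA6"), ("emA2", "emA5"), ("emA3", "emA5"),
    ("emA4", "emA6"), ("emA5", "emA6"), ("emA5", "suA"), ("emA6", "suA"),
]


def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 matrix from an undirected edge list."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix


def density(matrix):
    """Return (number of nonzero off-diagonal cells, density).

    Assumption: density = ties / (n * (n - 1)), counting each direction once.
    """
    n = len(matrix)
    ties = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return ties, ties / (n * (n - 1))


MATRIX = adjacency_matrix(NODES, EDGES)
tie_count, network_density = density(MATRIX)
print(tie_count, round(network_density, 4))
```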

Figure 4 shows the descriptive statistics of this network. The degree centralization of the network is 40%.

Figure 4: Degree of the network
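The 40% figure can be checked with a short sketch, assuming it is Freeman's degree centralization, i.e. the sum of differences between the largest degree and every degree, divided by (n-1)(n-2). The helper names are hypothetical and the ties are those of Table 1.

```python
# Ties transcribed from Table 1 (undirected).
EDGES = [
    ("emA1", "emA3"), ("emA1", "emA6"), ("emA2", "emA5"), ("emA3", "emA5"),
    ("emA4", "emA6"), ("emA5", "emA6"), ("emA5", "suA"), ("emA6", "suA"),
]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def degree_centralization(adjacency):
    """Freeman degree centralization for an undirected graph (assumed formula)."""
    n = len(adjacency)
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    max_degree = max(degrees)
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))


ADJACENCY = build_adjacency(EDGES)
degrees = {node: len(neighbours) for node, neighbours in ADJACENCY.items()}
centralization = degree_centralization(ADJACENCY)
print(degrees, round(centralization, 4))
```

With degrees of 2, 1, 2, 1, 4, 4 and 2, the differences from the maximum sum to 12, and 12/30 gives 0.40.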

2. Reachability: Reachability can be used to analyze whether one node can be reached from another node in the social network. Figure 5 shows the result: for each pair of nodes, the algorithm finds whether there exists a path of any length that connects them.

Figure 5: Reachability of the network
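A minimal sketch of such a reachability check, assuming a plain breadth-first search over the ties of Table 1 (the function names are hypothetical and this is not the UCINET routine):

```python
from collections import deque

# Ties transcribed from Table 1 (undirected).
EDGES = [
    ("emA1", "emA3"), ("emA1", "emA6"), ("emA2", "emA5"), ("emA3", "emA5"),
    ("emA4", "emA6"), ("emA5", "emA6"), ("emA5", "suA"), ("emA6", "suA"),
]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def reachable_from(adjacency, source):
    """Return every node connected to `source` by a path of any length."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(source)
    return seen


ADJACENCY = build_adjacency(EDGES)
reachability = {node: reachable_from(ADJACENCY, node) for node in ADJACENCY}
fully_connected = all(len(r) == len(ADJACENCY) - 1 for r in reachability.values())
print(fully_connected)
```

For the example network every employee can reach every other, even where no direct tie exists (for instance emA2 and emA4).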

3. Centrality:

The closeness centrality of a vertex is based on the distances between that vertex and all other vertices; larger distances yield lower closeness centrality scores [12]. Figure 6 shows the descriptive statistics of this network. The closeness centralization of the network is 41.65%.

Figure 6: Closeness centrality
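The closeness figure can be reproduced with a short sketch under two assumptions: closeness is (n-1) divided by the sum of shortest-path distances to all other nodes, and centralization follows Freeman's formula with denominator (n-1)(n-2)/(2n-3). The helper names are hypothetical, and the code requires a connected network.

```python
from collections import deque

# Ties transcribed from Table 1 (undirected).
EDGES = [
    ("emA1", "emA3"), ("emA1", "emA6"), ("emA2", "emA5"), ("emA3", "emA5"),
    ("emA4", "emA6"), ("emA5", "emA6"), ("emA5", "suA"), ("emA6", "suA"),
]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def bfs_distances(adjacency, source):
    """Shortest-path (geodesic) distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness_centrality(adjacency):
    """Normalized closeness: (n - 1) / sum of distances (connected graph assumed)."""
    n = len(adjacency)
    return {
        node: (n - 1) / sum(bfs_distances(adjacency, node).values())
        for node in adjacency
    }


def closeness_centralization(adjacency):
    """Freeman closeness centralization (assumed formula)."""
    n = len(adjacency)
    scores = closeness_centrality(adjacency)
    top = max(scores.values())
    return sum(top - s for s in scores.values()) / ((n - 1) * (n - 2) / (2 * n - 3))


ADJACENCY = build_adjacency(EDGES)
closeness = closeness_centrality(ADJACENCY)
centralization = closeness_centralization(ADJACENCY)
print(round(centralization, 4))
```

Under these assumptions emA5 and emA6 have the highest closeness (0.75), and the centralization evaluates to about 0.4165, in line with the reported 41.65%.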

The betweenness centrality of an actor is the proportion of all geodesics between pairs of other vertices that contain this vertex. Figure 7 shows the descriptive statistics of this network. The group betweenness centralization of the network is 36.67%.

Figure 7: Betweenness centrality
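The sketch below illustrates the betweenness definition on the ties of Table 1. It assumes a connected, undirected network, normalizes each score by (n-1)(n-2)/2, and uses Freeman's centralization with denominator n-1; the function names are hypothetical and the pairwise method is chosen for clarity rather than speed.

```python
from collections import deque
from itertools import combinations

# Ties transcribed from Table 1 (undirected).
EDGES = [
    ("emA1", "emA3"), ("emA1", "emA6"), ("emA2", "emA5"), ("emA3", "emA5"),
    ("emA4", "emA6"), ("emA5", "emA6"), ("emA5", "suA"), ("emA6", "suA"),
]


def build_adjacency(edges):
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path_counts(adjacency, source):
    """Return (distance, number of geodesics) from `source` to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                sigma[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                sigma[neighbour] += sigma[node]
    return dist, sigma


def betweenness_centrality(adjacency):
    """Share of geodesics between other pairs that pass through each node."""
    nodes = list(adjacency)
    n = len(nodes)
    info = {s: shortest_path_counts(adjacency, s) for s in nodes}
    scores = {v: 0.0 for v in nodes}
    for s, t in combinations(nodes, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        for v in nodes:
            # v lies on a geodesic from s to t when the distances add up exactly.
            if v not in (s, t) and dist_s[v] + dist_t[v] == dist_s[t]:
                scores[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    pairs = (n - 1) * (n - 2) / 2
    return {v: score / pairs for v, score in scores.items()}


def betweenness_centralization(adjacency):
    scores = betweenness_centrality(adjacency)
    top = max(scores.values())
    return sum(top - s for s in scores.values()) / (len(scores) - 1)


ADJACENCY = build_adjacency(EDGES)
betweenness = betweenness_centrality(ADJACENCY)
centralization = betweenness_centralization(ADJACENCY)
print(round(centralization, 4))
```

Under these assumptions emA5 and emA6 carry most of the geodesics, and the centralization evaluates to about 0.3667, in line with the reported 36.67%.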

4. Distance

Figure 8 shows the distance analysis. It can be quite important to go beyond simply examining the immediate connections of employees and the overall density of direct connections in the population. Walks, trails and paths are basic concepts for developing more powerful ways of describing various aspects of the distances among employees in a network.

Figure 8: Distance analysis

Figure 9 shows a set of points and a set of lines between pairs of points. The points and lines in the graph represent the employees and their ties. In social network analysis, directed graphs with one-way or two-way arrows are used to display the degree of correlation between employees.

Figure 9: Graph analysis

5. Position and Role Analysis

Position and role analysis defines social positions as collections of employees who are similar in their ties with others, and models social roles as systems of ties between employees or between positions.

Figure 10: Position and role Analysis

Figure 11 shows the dendrogram for complete-link hierarchical clustering of the Euclidean distances between the employees' relations. Comparing the Euclidean distances, the shortest distance, 1.414, is between employees A2 and A3. Employees A2 and A3 therefore have similar positions and are classified into the same cluster.

Figure 11: Clustering analysis of the position
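To make the distance of 1.414 concrete, the following sketch computes a Euclidean distance between the tie profiles of emA2 and emA3 from Table 1. It assumes that a profile is an employee's row concatenated with the matching column of the relationship matrix, with no special treatment of the diagonal; the exact procedure used by the software may differ, and the function names are hypothetical.

```python
import math

NODES = ["emA1", "emA2", "emA3", "emA4", "emA5", "emA6", "suA"]

# Relationship matrix transcribed from Table 1.
MATRIX = [
    [0, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 0],
]


def profile(matrix, i):
    """Assumed tie profile: row i followed by column i."""
    row = list(matrix[i])
    column = [matrix[k][i] for k in range(len(matrix))]
    return row + column


def euclidean_distance(matrix, i, j):
    """Euclidean distance between the tie profiles of actors i and j."""
    pairs = zip(profile(matrix, i), profile(matrix, j))
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs))


d_a2_a3 = euclidean_distance(MATRIX, NODES.index("emA2"), NODES.index("emA3"))
print(round(d_a2_a3, 3))
```

The two profiles differ only in the tie with emA1, once in the row and once in the column, so the distance is the square root of 2, about 1.414.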

3.3 Data Mining Analysis of the Layoff File

In this section, in order to construct the layoff prediction model, we use data mining techniques to extract rules from selected data. This research used 124 training records from the laid-off employees' network of a semiconductor company in the Science Park. The testing data consist of 100 records from the active employees' database of the same source in 2009. Each record in the employees' database consists of 15 attributes. The original attributes are listed in Table 2.

Table 2: The Attributes List

Attribute      Attribute              Attribute
1. ID          6. Compensation_LV     11. Hire_DT
2. Name        7. Live register       12. Termination_DT
3. Dept_ID     8. Education_LV        13. Supervisor_ID
4. Sex         9. Marriage            14. Position
5. Age         10. Grade              15. Shift_DESCR

The data mining techniques used in this research include feature selection techniques for reducing the data dimension, and classification analysis and association rules for extracting rules from the selected data. Each analysis is described as follows:

1. Clustering

Clustering is the task of segmenting a heterogeneous population into a number of more homogeneous subgroups or clusters. From the attributes, we selected age, sex, marriage, grade, education level, shift, position and compensation level to cluster the former employees into 6 segments using K-Means. To keep the number of employees in each cluster roughly equal, we first divided each variable into four parts. We then transformed these numeric data into categorical data for clustering.

2. Association Rule

Association analysis is the discovery of association rules showing attribute-value conditions that frequently occur together in a given set of data. The association rules for layoffs were generated with the WEKA tool, which was developed by the University of Waikato in New Zealand. We used WEKA to perform the association analysis and found 8 useful rules for the laid-off employees' attributes.

3. Decision Tree

A decision tree divides the records in the training set into disjoint subsets, each of which is described by a simple rule on one or more fields [3]. In this research, the training data set contains 124 records and the testing dataset has 100 records. This research combines the training dataset and testing dataset into a table.
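The way a decision tree picks the field for each split can be sketched as follows. The paper does not name the specific tree algorithm, so this example assumes an ID3-style information gain criterion; the records, attribute values and function names are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter

# Hypothetical training records (not from the study's database).
RECORDS = [
    {"Pos": "Manager", "Sex": "M", "laid_off": True},
    {"Pos": "Manager", "Sex": "F", "laid_off": True},
    {"Pos": "Manager", "Sex": "M", "laid_off": True},
    {"Pos": "Staff", "Sex": "M", "laid_off": False},
    {"Pos": "Staff", "Sex": "F", "laid_off": False},
    {"Pos": "Staff", "Sex": "M", "laid_off": False},
    {"Pos": "Staff", "Sex": "F", "laid_off": True},
    {"Pos": "Manager", "Sex": "F", "laid_off": False},
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(records, attribute, target="laid_off"):
    """Reduction in entropy obtained by splitting the records on `attribute`."""
    base = entropy([r[target] for r in records])
    remainder = 0.0
    for value in {r[attribute] for r in records}:
        subset = [r[target] for r in records if r[attribute] == value]
        remainder += len(subset) / len(records) * entropy(subset)
    return base - remainder


gains = {attribute: information_gain(RECORDS, attribute) for attribute in ("Pos", "Sex")}
best_attribute = max(gains, key=gains.get)
print(best_attribute, gains)
```

In this toy data the position field separates laid-off from retained employees better than sex does, so it would be chosen for the first split; each resulting subset is then described by a simple rule such as Pos=Manager.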

We summarize the research results as follows:

1. Layoff, Age=40~50, Sex=M, Mar=Y, Shi=Normal shift, Pos=Manager_Lv and Com 50000 had an accuracy rate of approximately 86.2% for predicting the laid-off employees.

2. Layoff, Age=40~50, Sex=M, Mar=Y, Edu=University, Pos=Manager_Lv and Com50000 had an accuracy rate of approximately 92.17% for predicting the laid-off employees.

3. Layoff, Age=40~50, Sex=M, Mar=Y, Gra 10, Pos=Manager_Lv and Com50000 had an accuracy rate of approximately 96.44% for predicting the laid-off employees.

Figure 12: Decision Tree analysis
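The sketch below shows one way such a rule could be evaluated against employee records. It is an assumption-laden illustration: the records are hypothetical, the rule is a simplified variant of the rules above, the comparison operator for compensation is not legible in the text and is assumed here to be "at least 50000", and "accuracy rate" is interpreted as the share of rule-matching employees who were actually laid off.

```python
# Hypothetical test records (not from the study's database).
RECORDS = [
    {"Age": 45, "Sex": "M", "Mar": "Y", "Pos": "Manager_Lv", "Com": 60000, "laid_off": True},
    {"Age": 42, "Sex": "M", "Mar": "Y", "Pos": "Manager_Lv", "Com": 55000, "laid_off": True},
    {"Age": 48, "Sex": "M", "Mar": "Y", "Pos": "Manager_Lv", "Com": 52000, "laid_off": False},
    {"Age": 30, "Sex": "F", "Mar": "N", "Pos": "Staff", "Com": 30000, "laid_off": False},
    {"Age": 44, "Sex": "M", "Mar": "Y", "Pos": "Manager_Lv", "Com": 40000, "laid_off": False},
]


def matches_rule(record):
    """Simplified rule: Age=40~50, Sex=M, Mar=Y, Pos=Manager_Lv, Com >= 50000.

    The ">=" on compensation is an assumption; the source does not show the operator.
    """
    return (
        40 <= record["Age"] <= 50
        and record["Sex"] == "M"
        and record["Mar"] == "Y"
        and record["Pos"] == "Manager_Lv"
        and record["Com"] >= 50000
    )


def rule_accuracy(records):
    """Return (number of records the rule covers, share of those that were laid off)."""
    covered = [r for r in records if matches_rule(r)]
    if not covered:
        return 0, None
    hits = sum(1 for r in covered if r["laid_off"])
    return len(covered), hits / len(covered)


covered_count, accuracy = rule_accuracy(RECORDS)
print(covered_count, accuracy)
```

On the toy records the rule covers three employees, two of whom were laid off. Other definitions of accuracy, such as overall classification accuracy on the test set, would require counting the non-matching records as well.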

Consequently, we find that employees with high compensation, high position, high grade and high education level are on the danger list for layoff.

5. Discussion and Conclusion

This study aims to identify the main factors behind layoffs. Based on the factors and concepts addressed above, this research seeks the best layoff prediction model using social network and data mining techniques. The empirical evaluation indicates that the proposed approach, which uses organizational network relationships, employee databases and layoff records to build the layoff prediction model, achieves good prediction accuracy. Decision tree techniques are good candidates for developing the model. The main aim of this study is to highlight how layoffs can be predicted for employees, and how the unemployment rate can be reduced, by mining historical databases, and hopefully to provide a layoff prediction model for employees and companies.

Facing the global recession, one of the challenges within high-tech industries such as the semiconductor industry is to understand and retain the employees who are valuable to the company. The current trend in layoffs has cut many highly compensated managers in the high-tech industry. This is an important phenomenon that gives employees much to reflect on.

The research data come from only a single semiconductor company in the Hsinchu Science Park, Taiwan. In the future, we expect to apply this model to other industries.

7. Acknowledgement

This work was partially supported by National Science Council, Republic of China under Contract Numbers NSC 97-2410-H-390-022.

8. References

[1] Burt, R. S. (2000), "The Network Structure of Social Capital", Research in Organizational Behavior, 22, pp. 345-423.

[2] Berson, A., Smith, S. and Thearling, K. (2000), Building Data Mining Applications for CRM, McGraw-Hill, New York, NY.

[3] Berry, M. J. A. and Linoff, G. (1997), Data Mining Techniques: For Marketing, Sales, and Customer Support, Wiley, New York.

[4] Carrington, P. J., Scott, J. and Wasserman, S. (Eds.) (2005), Models and Methods in Social Network Analysis, Cambridge University Press, New York.

[5] Cross, R. and Parker, A. (2004), The Hidden Power of Social Networks, Harvard University Press.

[6] Han, J. and Kamber, M. (2001), Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers.

[7] Hanneman, R. A. (1998), Introduction to Social Network Methods, University of California, Riverside. http://www.faculty.ucr.edu/~hanneman/.

[8] Kudo, M. and Sklansky, J. (2000), "Comparison of Algorithms That Select Features for Pattern Classifiers", Pattern Recognition, 33(1), pp. 25-41.

[9] Kilduff, M. and Tsai, W. (2003), Social Networks and Organizations, SAGE Publications, London.

[10] Lutters, W. G., Ackerman, M. S., Boster, J. and McDonald, D. W. (2001), "Creating a knowledge mapping instrument: approximation techniques for mapping knowledge networks in organizations", ICS Technical Report No. 99-32, Center for Research on Information Technology and Organizations, University of California, Irvine, CA.

[11] Škerlavaj, M. and Dimovski, V. (2006), "Social Network Approach To Organizational Learning", Journal of Applied Business Research, 22(2), pp. 89-97.

[12] Nooy, W. D., Mrvar, A. and Batagelj, V. (2005), Exploratory Social Network Analysis with Pajek, Cambridge University Press, New York.

[13] Sparrowe, R. T., Liden, R. C., Wayne, S. J. and Kraimer, M. L. (2001), "Social networks and the performance of individuals and groups", Academy of Management Journal, 44(2), pp. 316-325.

[14] Wasserman, S. and Faust, K. (1994), Social Network Analysis: Methods and Applications, Cambridge University Press.

[15] Thearling, K. (1999), "An Introduction to Data Mining", Direct Marketing Magazine, Feb.
