intrusion detection by machine learning: a review
Post on 21-Jun-2016
Embed Size (px)
tern nIn les.e le
reviews 55 related studies in the period between 2000 and 2007 focusing on developing single, hybrid,
systems by machine learning are present and discussed. A number of future research directions are also
daily l, suchlar, Intess m
and computer systems usage, whereas the conventional rewallcan not perform this task. Intrusion detection is based on theassumption that the behavior of intruders different from a legaluser (Stallings, 2006).
In general, IDSs can be divided into two categories: anomalyand misuse (signature) detection based on their detection ap-
niques have been used, what experiments have been conducted,and what should be considered for future work based on the ma-chine learnings perspective.
This paper is organized as follows. Section 2 provides anoverview of machine learning techniques and briey describesa number of related techniques for intrusion detection. Section 3compares related work based on the types of classier design,the chosen baselines, datasets used for experiments, etc. Conclu-sion and discussion for future research are given in Section 4.
* Corresponding author. Tel.: +886 5 2720411; fax: +886 5 2720859.
Expert Systems with Applications 36 (2009) 1199412000
Contents lists availab
.eE-mail address: firstname.lastname@example.org (W.-Y. Lin).2007). For the business operation, both business and customers ap-ply the Internet application such as website and e-mail on businessactivities. Therefore, information security of using Internet as themedia needs to be carefully concerned. Intrusion detection is onemajor research problem for business and personal networks.
As there are many risks of network attacks under the Internetenvironment, there are various systems designed to blockthe Internet-based attacks. Particularly, intrusion detection sys-tems (IDSs) aid the network to resist external attacks. That is,the goal of IDSs is to provide a wall of defense to confront theattacks of computer systems on Internet. IDSs can be used ondetect difference types of malicious network communications
developed based on many different machine learning techniques(c.f. Section 3). For example, some studies apply single learn-ing techniques, such as neural networks, genetic algorithms,support vector machines, etc. On the other hand, some systemsare based on combining different learning techniques, such ashybrid or ensemble techniques. In particular, these techniquesare developed as classiers, which are used to classify or recognizewhether the incoming Internet access is the normal access or an at-tack. However, there is no a review of these different machinelearning techniques over the intrusion detection domain.
Therefore, the goal of this paper is to review 55 related studies/systems published from 2000 to 2007 by examining what tech-1. Introduction
The Internet has become a part oftoday. It aids people in many areasment and education, etc. In particuan important component of busin0957-4174/$ - see front matter 2009 Elsevier Ltd. Adoi:10.1016/j.eswa.2009.05.029provided. 2009 Elsevier Ltd. All rights reserved.
ife and an essential toolas business, entertain-ernet has been used asodels (Shon & Moon,
proaches (Anderson, 1995; Rhodes, Mahaffey, & Cannady, 2000).Anomaly detection tries to determine whether deviation fromthe established normal usage patterns can be agged as intrusions.On the other hand, misuse detection uses patterns of well-knownattacks or weak spots of the system to identify intrusions.
In literature, numbers of anomaly detection systems areand ensemble classiers. Related studies are compared by their classier design, datasets used, andother experimental setups. Current achievements and limitations in developing intrusion detectionReview
Intrusion detection by machine learning:
Chih-Fong Tsai a, Yu-Feng Hsu b, Chia-Ying Lin c, WeiaDepartment of Information Management, National Central University, TaiwanbDepartment of Information Management, National Sun Yat-Sen University, TaiwancDepartment of Accounting and Information Technology, National Chung Cheng UniversdDepartment of Computer Science and Information Engineering, National Chung Cheng
a r t i c l e i n f o
Keywords:Intrusion detectionMachine learningHybrid classiersEnsemble classiers
a b s t r a c t
The popularity of using Inmajor research problem isecure internal networks.machine learning techniqurent status of using machin
journal homepage: wwwll rights reserved.review
ang Lin d,*
net contains some risks of network attacks. Intrusion detection is oneetwork security, whose aim is to identify unusual access or attacks toiterature, intrusion detection systems have been approached by variousHowever, there is no a review paper to examine and understand the cur-arning techniques to solve the intrusion detection problems. This chapter
le at ScienceDirect
lsevier .com/locate /eswa
Pattern recognition is the action to take raw data and activity on
chines, articial neural network, decision trees, self-organizing
on-line trains the examples and nds out k-nearest neighbor of
h Apthe new instance.
2.2.2. Support vector machinesSupport vector machines (SVM) is proposed by Vapnik (1998).
SVM rst maps the input vector into a higher dimensional featurespace and then obtain the optimal separating hyper-plane in thehigher dimensional feature space. Moreover, a decision boundary,i.e. the separating hyper-plane, is determined by support vectorsrather than the whole training samples and thus is extremely ro-bust to outliers.
In particular, an SVM classier is designed for binary classica-tion. That is, to separate a set of training vectors which belong totwo different classes. Note that the support vectors are the trainingsamples close to a decision boundary. The SVM also provides a userspecied parameter called penalty factor. It allows users to make atradeoff between the number of misclassied samples and thewidth of a decision boundary.
2.2.3. Articial neural networksThe neural network is information processing units which to
mimic the neurons of human brain (Haykin, 1999). Multilayer per-ceptron (MLP) is the widely used neural network architecture inmaps, etc.) have been used to solve these problems.
2.2.1. K-nearest neighborK-nearest neighbor (k-NN) is one of the most simple and tradi-
tional nonparametric technique to classify samples (Bishop, 1995;Manocha & Girolami, 2007). It computes the approximate dis-tances between different points on the input vectors, and then as-signs the unlabeled point to the class of its K-nearest neighbors. Inthe process of create k-NN classier, k is an important parameterand different k values will cause different performances. If k is con-siderably huge, the neighbors which used for prediction will makelarge classication time and inuence the accuracy of prediction.
k-NN is called instance based learning, and it is different fromthe inductive learning approach (Mitchell, 1997). Thus, it doesnot contain the model training stage, but only searches the exam-ples of input vectors and classies new instances. Therefore, k-NNdata category (Michalski, Bratko, & Kubat, 1998). The methods ofsupervised and unsupervised learning can be used to solve differ-ent pattern recognition problems (Theodoridis & Koutroumbas,2006, 2006). In supervised learning, it is based on using the train-ing data to create a function, in which each of the training datacontains a pair of the input vector and output (i.e. the class label).
The learning (training) task is to compute the approximate dis-tance between the inputoutput examples to create a classier(model). When the model is created, it can classify unknown exam-ples into a learned class labels.
2.2. Single classiers
The intrusion detection problem can be approached by usingone single machine learning algorithm. In literature, machinelearning techniques (e.g. k-nearest neighbor, support vector ma-2. Machine learning techniques
2.1. Pattern classication
C.-F. Tsai et al. / Expert Systems witmany pattern recognition problems. A MLP network consists ofan input layer including a set of sensory nodes as input nodes,one or more hidden layers of computation nodes, and an outputlayer of computation nodes. Each interconnection has associatedwith it a scalar weight which is adjusted during the training phase.
In addition, the backpropagation learning algorithm is usuallyused to train a MLP, which are also called as backpropagation neu-ral networks. First of all, random weights are given at the begin-ning of training. Then, the algorithm performs weights tuning todene whatever hidden unit representation is most effective atminimizing the error of misclassication.
2.2.4. Self-organizing mapsSelf-organizing map (SOM) (Kohonen, 1982) is trained by an
unsupervised competitive learning algorithm, a process of selforganization. The aim of SOM is to reduce the dimension of datavisualization. That is, SOM projects and clusters high-dimensionalinput vectors onto a low-dimensional visualized map, usually 2 forvisualization. It usually consists of an input layer and the Kohonenlayer which is designed as two-dimensional arrangement of neu-rons that maps n dimensional input to two dimensions. KohonensSOM associates each of the input vectors to a representative out-put. The network nds the node closest to each training case andmoves the winning node, which is the closest neuron (i.e. the neu-ron with minimum distance) to the training case.
That is, SOM maps similar input vectors onto the same or sim-ilar output units on such a two-dimensional map. Therefore, out-put units will