
A Real-Time IDS Based on an Artificial Bee Colony-Support Vector Machine Algorithm

Jun Wang, Taihang Li, and Rongrong Ren

Third International Workshop on Advanced Computational Intelligence (IWACI), August 25-27, 2010, Suzhou, Jiangsu, China

This work was supported in part by the Science Foundation of Northeastern University at Qinhuangdao of PRC.

Jun Wang is with the Department of Management, Northeastern University at Qinhuangdao, Hebei Province, 066004 PRC (corresponding author; phone: (86) 335-805-8151; fax: (86) 335-805-7478; e-mail: [email protected]).

Taihang Li is with the Department of Sport Management, Northeastern University at Qinhuangdao, Hebei Province, 066004 PRC (e-mail: [email protected]).

Rongrong Ren is with the Department of Management, Northeastern University at Qinhuangdao, Hebei Province, 066004 PRC (e-mail: [email protected]).

Abstract—Building a successful intrusion detection system (IDS) is a complicated problem because of the nonlinearity of the task and because the quantitative and qualitative network-traffic data stream contains irrelevant and redundant features. Choosing effective key features is therefore an important topic in intrusion detection. The support vector machine (SVM) has been employed to provide potential solutions to the IDS problem, but its practicability is limited by the difficulty of selecting appropriate SVM parameters. The artificial bee colony (ABC) algorithm is an optimization method that not only has strong global search capability but is also very easy to implement. The proposed ABC-SVM model is therefore applied to determine the free parameters of the SVM and to perform feature selection when building an intrusion detection system: the standard ABC determines the free parameters of the SVM, and the binary ABC obtains the optimal feature subset for the IDS from the KDD Cup 99 data set. The experimental results indicate that the ABC-SVM method achieves a higher accuracy rate than particle swarm optimization (PSO) and GA-SVM algorithms in comparable time.

I. INTRODUCTION

Traditional security policies and firewalls have difficulty preventing attacks that exploit hidden vulnerabilities in software applications. An intrusion detection system (IDS) is therefore required as an additional wall protecting systems beyond these prevention techniques. The support vector machine (SVM) has recently received increasing attention in IDS design because of its remarkable speed and results. Unfortunately, determining suitable parameter values is an optimization problem that limits the practicability of SVM. An IDS must also handle a huge amount of data, which causes slow training and testing and a low detection rate, so feature selection is one of the key topics in intrusion detection [1]. Approaches to feature selection fall into two models: the filter model and the wrapper model [2]. Chen and Hsieh (2006) combined latent semantic analysis and web-page feature selection with the SVM technique to extract features [3]. Gold, Holub, and Sollich (2005) presented a Bayesian view of SVM classifiers that tunes hyperparameter values in order to derive useful criteria for pruning irrelevant features [4]. Even though the filter model is fast, the resulting feature subset may not be optimal [2]. In the wrapper model, meta-heuristic approaches are commonly employed to search for the best feature subset; although meta-heuristics are slow, they obtain the best (or a near-best) subset. Lin, Ying, Chen, and Lee (2008) developed a PSO + SVM approach for determining the parameter values of SVM with and without feature selection [5]. Yang, Zhang, Deng, and Du (2007) proposed feature selection for hyperspectral images combining PSO and SVM [6]. Fei, Wang, Miao, Tu, and Liu used a PSO-SVM model to forecast the dissolved-gas content in power-transformer oil [7].

Jack and Nandi (2002) and Shon, Kim, Lee, and Moon (2005) employed GA to screen the features of a dataset; the selected subset of features is then fed into the SVM for classification testing. Zhang, Jack, and Nandi (2005) developed a GA-based approach to discover a beneficial subset of features for SVM in machine condition monitoring [8]. Samanta, Al-Balushi, and Al-Araimi (2003) proposed a GA approach that tunes the RBF width parameter of SVM with feature selection [9]. Nevertheless, since these approaches consider only the RBF width parameter of the SVM, they may miss the optimal parameter setting.

In this paper, the global search capability of the artificial bee colony (ABC) algorithm is therefore used to optimize the SVM model parameters and the feature selection for an IDS on the KDD Cup 99 data set [10]. The method is very easy to implement, finds the true global minimum of a multimodal search space regardless of the initial parameter values, converges fast, is simple and flexible, and needs very few control parameters to adjust [11].

This paper is organized as follows. Section II introduces the regression arithmetic of SVM. Section III presents the selection of SVM parameters and of IDS features based on ABC. Section IV evaluates the performance of the proposed model on the KDD Cup (1999) intrusion detection dataset. Finally, the conclusion is presented in Section V.


II. SUPPORT VECTOR MACHINES

The support vector machine, put forward by Vapnik (1995), has become a popular method in the machine learning area. The basic concept of SVM regression is to map the original data $x$ nonlinearly into a high-dimensional feature space and to solve a linear regression problem in that feature space. Let $\{(x_i, y_i)\}_{i=1}^{m} \in X \times \{\pm 1\}$, where $x_i$ denotes the input vector, $y_i$ the corresponding output value, and $m$ the total number of data patterns. The SVM regression function is $f(x) = w \cdot \phi(x) + b$, where $\phi(x)$ denotes the image of $x$ in the high-dimensional feature space, $w$ the weight vector, and $b$ the bias term. The coefficients $w$ and $b$ are estimated by minimizing the following regularized risk function:

$$R(C) = \frac{1}{2}\|w\|^2 + C\,\frac{1}{m}\sum_{i=1}^{m} L_{\varepsilon}\big(y_i, f(x_i)\big) \qquad (1)$$

where $C$ is the penalty parameter weighting the empirical risk, $\frac{1}{2}\|w\|^2$ is the regularization term, and $L_{\varepsilon}(y_i, f(x_i))$ is the $\varepsilon$-insensitive loss function, which is defined as:

$$L_{\varepsilon}\big(y_i, f(x_i)\big) =
\begin{cases}
\left|y_i - f(x_i)\right| - \varepsilon, & \left|y_i - f(x_i)\right| \ge \varepsilon \\
0, & \left|y_i - f(x_i)\right| < \varepsilon
\end{cases} \qquad (2)$$

In (2), the loss is zero if the forecasting error is less than $\varepsilon$; otherwise the loss equals the amount by which the error exceeds $\varepsilon$. Two positive slack variables $\xi$ and $\xi^{*}$ are introduced to represent the distance from actual values to the corresponding boundary values of the $\varepsilon$-tube. Then $R(C)$ is transformed into the following constrained form:

$$\min\; \phi(w, \xi, \xi^{*}) = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\left(\xi_i + \xi_i^{*}\right)$$
$$\text{s.t.}\quad
\begin{cases}
y_i - w \cdot x_i - b \le \varepsilon + \xi_i, & \xi_i \ge 0 \\
w \cdot x_i + b - y_i \le \varepsilon + \xi_i^{*}, & \xi_i^{*} \ge 0
\end{cases} \qquad (3)$$

This constrained optimization problem is solved using the following Lagrangian (dual) form:

$$\max\; H(\alpha, \alpha^{*}) = \sum_{i=1}^{m} y_i\left(\alpha_i - \alpha_i^{*}\right) - \varepsilon\sum_{i=1}^{m}\left(\alpha_i + \alpha_i^{*}\right) - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\left(\alpha_i - \alpha_i^{*}\right)\left(\alpha_j - \alpha_j^{*}\right) K(x_i, x_j)$$
$$\text{s.t.}\quad \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^{*}\right) = 0, \quad \alpha_i, \alpha_i^{*} \in [0, C] \qquad (4)$$

where $\alpha_i$ and $\alpha_i^{*}$ are the so-called Lagrange multipliers. Once the multipliers $\alpha_i$ and $\alpha_i^{*}$ are calculated, the optimal weight vector is obtained as:

$$w^{*} = \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^{*}\right) K(x_i, x) \qquad (5)$$

Hence, the regression function is:

$$f(x) = \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^{*}\right) K(x_i, x) + b \qquad (6)$$

Based on the Karush–Kuhn–Tucker (KKT) conditions for solving the quadratic programming problem, the data points with $\alpha_i - \alpha_i^{*} \ne 0$ are the support vectors, which are the points employed in determining the decision function. An SVM constructed with the radial basis function (RBF) kernel has excellent nonlinear forecasting performance and few free parameters to determine, so the RBF kernel is adopted in this work. In (6), $K(x_i, x) = \exp\left(-\|x - x_i\|^2 / \sigma^2\right)$. Here $C$, $\sigma$, and $\varepsilon$ are user-determined parameters, and their selection plays an important role in the performance of the SVM [11].
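For concreteness, the following is a minimal sketch of an RBF-kernel support vector regressor exposing these three parameters, assuming scikit-learn is available (a library the paper does not name); scikit-learn writes the RBF kernel as $\exp(-\gamma\|x - x_i\|^2)$, so $\gamma = 1/\sigma^2$ reproduces the kernel above. The data is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic data standing in for preprocessed traffic features (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

# The three user-determined parameters from Section II.
C, sigma, epsilon = 100.0, 1.1, 0.01

# gamma = 1 / sigma**2 matches K(x_i, x) = exp(-||x - x_i||^2 / sigma^2).
model = SVR(kernel="rbf", C=C, gamma=1.0 / sigma**2, epsilon=epsilon)
model.fit(X, y)
print("number of support vectors:", model.support_.size)
```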

III. HYBRID ABC-SVM FOR PARAMETER AND FEATURE SELECTION

A. Standard Artificial Bee Colony Algorithm (SABC)

The artificial bee colony algorithm is a population-based metaheuristic proposed by Karaboga, motivated by the foraging behavior of honeybees [12]. In the ABC algorithm, the position of a food source represents a possible solution to the optimization problem, and the nectar amount of a food source corresponds to the quality (fitness) of the associated solution. The colony of artificial bees (bees for short) contains three types of bees: employed bees, onlookers, and scouts. The number of employed bees and the number of onlooker bees each equal the number of solutions in the population: the first half of the colony comprises the employed bees and the second half the onlookers. The algorithm assumes there is exactly one employed bee per food source; the employed bee of an abandoned food source becomes a scout, and as soon as it finds a new food source it becomes employed again [13]. The ABC algorithm is iterative. At the first step, it generates a randomly distributed initial population of food sources (solutions). Then, during each iteration, every employed bee determines a food source in the neighborhood of its currently associated food source and evaluates its nectar amount (fitness). If the nectar amount is better than that of its current food source, the employed bee moves to the new food source and leaves the old one; otherwise it retains the old food source. After all employed bees complete this search, they share the nectar and position information of their food sources with the onlooker bees. An onlooker bee evaluates the nectar information from all employed bees and chooses a food source with a probability related to its nectar amount. The probability $p_i$ of selecting food source $i$ is determined using the following expression:

$$p_i = \frac{f_i}{\sum_{j=1}^{m} f_j} \qquad (7)$$

where $f_i$ is the fitness of the solution represented by food source $i$ and $m$ is the total number of food sources. Clearly, with this scheme good food sources get more onlookers than bad ones. After all onlookers have selected their food sources, each of them determines a food source in the neighborhood of its chosen food source and computes its fitness. The best food source among all the neighboring food sources determined by the onlookers associated with a particular food source $i$, together with food source $i$ itself, becomes the new location of food source $i$. If a solution represented by a particular food source does not improve for a predetermined number of iterations, that food source is abandoned by its associated employed bee, which becomes a scout, i.e., it searches for a new food source randomly [14]. This is tantamount to assigning a randomly generated food source (solution) to the scout and changing its status back from scout to employed. After the new location of each food source is determined, another iteration of the ABC algorithm begins. The whole process is repeated until the termination condition is satisfied; the population of positions (solutions) is thus subject to repeated cycles, $cycle = 1, 2, \ldots, MCN$.

The food source in the neighborhood of a particular food source is determined by altering the value of one randomly chosen solution parameter while keeping the other parameters unchanged: the chosen parameter is incremented by the product of a uniform random number in $[-1, 1]$ and the difference between the values of this parameter for this food source and some other randomly chosen food source [9]. Formally, suppose each solution consists of $d$ parameters and let $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})$ be a solution. To determine a solution $v_i$ in the neighborhood of $x_i$, a parameter index $j$ and another solution $x_k = (x_{k1}, x_{k2}, \ldots, x_{kd})$ are selected randomly. Except for the value of the selected parameter $j$, all parameter values of $v_i$ equal those of $x_i$, i.e., $v_i = (x_{i1}, x_{i2}, \ldots, x_{i(j-1)}, v_{ij}, x_{i(j+1)}, \ldots, x_{id})$. The value $v_{ij}$ of the selected parameter $j$ is determined using the following formula:

$$v_{ij} = x_{ij} + u\left(x_{ij} - x_{kj}\right) \qquad (8)$$

where $u$ is a random number in $[-1, 1]$. If the resulting value falls outside the acceptable range for parameter $j$, it is set to the corresponding extreme value of that range [12].

The food source whose nectar is abandoned by the bees is replaced with a new food source by the scouts. In ABC this is simulated by producing a position randomly and substituting it for the abandoned one: if a position cannot be improved over a predetermined number of cycles, the food source is assumed to be abandoned. This predetermined number of cycles is an important control parameter of the ABC algorithm, called the "limit" for abandonment. Assume the abandoned source is $x_i$; the scout then discovers a new food source to replace $x_i$. This operation is defined as:

$$x_i = x_{\min} + rand[0, 1]\left(x_{\max} - x_{\min}\right) \qquad (9)$$

The ABC algorithm therefore has three control parameters: the colony size (number of employed bees or food-source positions), the maximum cycle (iteration) number, and the limit value. The SABC search method is used here to select the $C$, $\sigma$, and $\varepsilon$ parameters of the SVM. Similar to genetic algorithms, ABC performs searches using a colony of individual bees that are updated from iteration to iteration; each SABC bee is composed of three parts, $C$, $\sigma$, and $\varepsilon$ [15].
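The cycle described above condenses into a short program. The following is a minimal sketch, assuming a generic fitness function that returns positive values (such as an accuracy rate) so that Eq. (7) yields valid probabilities; all names are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def abc_optimize(fitness, lo, hi, n_food=10, limit=None, max_cycles=100):
    """Standard ABC: employed, onlooker, and scout phases (Eqs. (7)-(9))."""
    d = lo.size
    limit = limit if limit is not None else n_food * d    # cf. Eq. (12)
    X = lo + rng.random((n_food, d)) * (hi - lo)          # random initialization, Eq. (9)
    f = np.array([fitness(x) for x in X])
    trials = np.zeros(n_food, dtype=int)

    def neighbor(i):
        """Perturb one randomly chosen parameter of x_i toward another source, Eq. (8)."""
        v = X[i].copy()
        j = rng.integers(d)
        k = (i + rng.integers(1, n_food)) % n_food        # any food source other than i
        v[j] += rng.uniform(-1.0, 1.0) * (X[i, j] - X[k, j])
        return np.clip(v, lo, hi)                         # clamp to the acceptable range

    def greedy(i, v):
        """Keep the better of the old and new food source; count failed trials."""
        fv = fitness(v)
        if fv > f[i]:
            X[i], f[i], trials[i] = v, fv, 0
        else:
            trials[i] += 1

    for _ in range(max_cycles):
        for i in range(n_food):                           # employed-bee phase
            greedy(i, neighbor(i))
        p = f / f.sum()                                   # selection probabilities, Eq. (7)
        for _ in range(n_food):                           # onlooker phase
            i = rng.choice(n_food, p=p)
            greedy(i, neighbor(i))
        worst = int(np.argmax(trials))                    # scout phase
        if trials[worst] > limit:
            X[worst] = lo + rng.random(d) * (hi - lo)     # Eq. (9)
            f[worst] = fitness(X[worst])
            trials[worst] = 0
    best = int(np.argmax(f))
    return X[best], f[best]
```

For the SABC stage one would call, e.g., `abc_optimize(cv_accuracy, lo=np.array([0.01, 1e-4, 1e-3]), hi=np.array([35000.0, 32.0, 1.0]))`, matching the $C$ and $\sigma$ search ranges given in Section IV-B; the $\varepsilon$ bounds here are arbitrary, since the paper does not state them.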

B. Binary ABC (BABC)

The ABC algorithm above is the standard version: each dimension of a bee takes only real values. It is therefore hard to apply directly to discrete optimization problems such as selecting features of the network connection records in the DARPA data. Datasets with unimportant, noisy, or highly correlated features significantly decrease the classification accuracy rate; by removing these features, efficiency and classification accuracy can be improved. The BABC algorithm therefore adopts a binary encoding, in which each dimension of $x_i$ and $v_i$ is limited to 1 or 0. The sigmoid function of the velocity is a logical choice for this mapping:

$$s(v_{ij}) = \frac{1}{1 + \exp(-v_{ij})} \qquad (10)$$

The position is updated as follows:

$$x_{ij} =
\begin{cases}
1, & \text{if } rand() < s(v_{ij}) \\
0, & \text{otherwise}
\end{cases} \qquad (11)$$

where $rand()$ is a uniform random number in $[0, 1]$, so $s(v_{ij})$ acts as the probability that bit $x_{ij}$ is set. A maximum value $v_{\max}$ can be used to keep this probability away from the extremes 0 and 1. Feature-subset selection in this case is binary, i.e., attack and normal features are distinguished.
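A minimal sketch of this binary update follows, with the clipping by $v_{\max}$ written in explicitly; how the paper applies $v_{\max}$ is not spelled out, so that detail is an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)

def babc_bit(v_ij, v_max=4.0):
    """Turn a real-valued trial value into a bit via Eqs. (10)-(11)."""
    v_ij = np.clip(v_ij, -v_max, v_max)   # assumed use of v_max: keep s(v) off 0 and 1
    s = 1.0 / (1.0 + np.exp(-v_ij))       # Eq. (10): sigmoid transfer
    return 1 if rng.random() < s else 0   # Eq. (11): stochastic threshold

# Example: one 41-bit feature mask, one bit per KDD Cup 99 feature.
mask = np.array([babc_bit(v) for v in rng.normal(size=41)])
print("selected features:", int(mask.sum()))
```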

Detailed pseudo-code of the ABC algorithm is given below [16]:

1: Initialize the population of solutions $x_i$, $i = 1, \ldots, m$;
2: Evaluate the population;
3: cycle = 1;
4: repeat;
5: Produce new solutions $v_i$ for the employed bees using (8) and evaluate them;
6: Apply the greedy selection process for the employed bees;
7: Calculate the probability values $p_i$ for the solutions $x_i$ according to (7);
8: Produce new solutions $v_i$ for the onlookers from the solutions $x_i$ selected depending on $p_i$, and evaluate them;
9: Apply the greedy selection process for the onlookers;
10: Determine the abandoned solution for the scout, if one exists, and replace it with a new randomly produced solution $x_i$ according to (9);
11: Memorize the best solution achieved so far;
12: cycle = cycle + 1;
13: until cycle = MCN.

C. Hybrid ABC-SVM Approach

Firstly, SABC is used to select $C$, $\sigma$, and $\varepsilon$ for the SVM; secondly, the best feature subsets are selected using the BABC algorithm. For feature-subset selection and parameter-value determination, each employed bee represents a solution that encodes the selected subset of features together with the parameter values [17]. The selected features, parameter values, and training dataset are used to build the SVM classifier models. The basic process of the ABC algorithm is as follows [4]:

Step 1 (Initialization): Randomly generate the initial population. For the BABC algorithm, the complete set of features is represented by a binary string of length N, where a bit is set to '1' if the corresponding feature is kept and to '0' if it is discarded, and N is the original number of features.

Step 2 (Fitness): Measure the fitness of each bee in the population. The choice of fitness function is a crucial point in using the ABC algorithm, since it determines what the ABC optimizes. Here the task of the ABC algorithm is to find the optimum of the fitness function; for the basic method the fitness is simply the detection accuracy.

Step 3 (Update): Compute the fitness of each scout.

Step 4 (Construction): Move each bee to its next position.

Step 5 (Termination): Stop the algorithm if the termination criterion is satisfied; otherwise return to Step 2 [18].
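As a concrete reading of Steps 1-2, the sketch below decodes one bee into a feature mask plus $(C, \sigma)$ and scores it by cross-validated accuracy, the fitness used in Section IV-B. It assumes scikit-learn and a classification SVM, where $\varepsilon$ plays no role and is therefore omitted; all names are illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

N_FEATURES = 41  # KDD Cup 99 feature count

def bee_fitness(bee, X, y):
    """Fitness of one solution: cross-validated accuracy of the decoded SVM.

    bee[:N_FEATURES]  - BABC part: one bit per feature
    bee[N_FEATURES:]  - SABC part: C, sigma
    """
    mask = bee[:N_FEATURES].astype(bool)
    if not mask.any():
        return 0.0                                  # an empty feature subset is useless
    C, sigma = bee[N_FEATURES], bee[N_FEATURES + 1]
    clf = SVC(kernel="rbf", C=C, gamma=1.0 / sigma**2)
    # Ten-fold cross-validated accuracy rate, as described in Section IV-B.
    return cross_val_score(clf, X[:, mask], y, cv=10).mean()
```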

IV. EXPERIMENT SETUP & PERFORMANCE EVALUATION

A. Data Set and Processing

The data used here originated from MIT's Lincoln Labs and is considered a standard benchmark for intrusion detection evaluations. Lincoln Labs acquired nine weeks of raw TCP dump data. The data set contains 24 attack types, which fall into four main categories: denial of service (DoS), remote to local (R2L), user to root (U2R), and probing. Our experiments have two phases, a training phase and a testing phase. The data set was divided into training data with 6092 records and testing data with 5890 records, both randomly generated from the MIT data set. All intrusion detection models were trained and tested on the same data. As the data set has five different classes, we perform 5-class classification: normal data belongs to class 1, probe to class 2, denial of service (DoS) to class 3, user to root (U2R) to class 4, and remote to local (R2L) to class 5.

In the experiments we first applied our feature selection algorithm to select the important features, and then built intrusion detection systems using those selected features. The training data set is separated into attack and normal data sets, which are subsequently fed into the hybrid ABC-SVM algorithms. Through the training process, hybrid ABC-SVM models are built; the test data set is then fed into these ABC-SVM models [19].

B. Parameters of the Hybrid ABC-SVM

The proposed model was implemented in the Matlab 7.1 programming language. The experiments were run on a 1.80 GHz Core(TM) 2 personal computer (PC) with 2.0 GB of memory under Microsoft Windows XP Professional. In the ABC algorithm, apart from the common parameters (population size and maximum evaluation number), only one control parameter is employed, called the limit. A food source is no longer exploited and is assumed to be abandoned when its limit is exceeded; that is, a solution whose "trial number" exceeds the limit value cannot be improved anymore. We defined the limit value in terms of the dimension of the problem and the colony size:

$$limit = m \times d \qquad (12)$$

where $d$ is the dimension of the problem and $m$ is the number of food sources (employed bees). The maximum number of iterations was set to 500, the dimension of the solution space to 41 (one dimension per feature, so each attack type is expressed over all 41 features), the colony size to 20, and the maximum evaluation number to 100,000. Each ABC experiment was run 50 times and terminated when it reached the maximum number of evaluations [20].
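These settings translate directly into configuration constants; the following sketch merely records the values quoted above, with the limit computed via Eq. (12).

```python
# Experimental settings from Section IV-B (constant names are illustrative).
DIMENSION   = 41                       # one dimension per KDD Cup 99 feature
COLONY_SIZE = 20                       # food sources / employed bees
MAX_CYCLES  = 500                      # maximum number of iterations
MAX_EVALS   = 100_000                  # maximum evaluation number
RUNS        = 50                       # independent repetitions per experiment
LIMIT       = COLONY_SIZE * DIMENSION  # Eq. (12): limit = m x d
```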

According to a previous study [21], the search range of the SVM parameter $C$ was set between 0.01 and 35,000, while the search range of the parameter $\sigma$ was set between 0.0001 and 32 (Lin & Lin, 2003). The fitness function selected for the ABC directly reflects classification performance: $f =$ accuracy rate, where the accuracy rate is the overall classification accuracy of each individual of the population, obtained using ten-fold cross-validation with SVM classifiers, and is used to guide the optimization of the employed bees [22].

C. Anomaly Detection Results

This research compares the efficiency of SVM and ABC-SVM under different circumstances. Detection and identification of attack and non-attack behavior can be summarized as follows. True positives (TP): the number of records detected as attacks that actually are attacks. True negatives (TN): the number of records detected as normal that actually are normal. False positives (FP): the number of records detected as attacks that are actually normal, i.e., false alarms. False negatives (FN): the number of records detected as normal that are actually attacks, i.e., the attacks that the intrusion detection system fails to detect [22]. From these counts:

True negative rate (TNR): $TN/(TN + FP)$, also known as specificity.

True positive rate (TPR): $TP/(TP + FN)$, also known as the detection rate (DR) or sensitivity; in information retrieval this is called recall.

False positive rate (FPR): $FP/(TN + FP) = 1 -$ specificity, also known as the false alarm rate (FAR).

False negative rate (FNR): $FN/(TP + FN)$.

Accuracy rate: $(TN + TP)/(TN + TP + FN + FP)$.

Precision: $TP/(TP + FP)$, another information retrieval term.
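These definitions reduce to a few lines of code; the sketch below computes every metric of this section from the four confusion-matrix counts (function and key names are illustrative).

```python
def detection_metrics(tp, tn, fp, fn):
    """IDS metrics of Section IV-C from confusion-matrix counts."""
    return {
        "TPR/DR (recall)":    tp / (tp + fn),
        "TNR (specificity)":  tn / (tn + fp),
        "FPR (FAR)":          fp / (fp + tn),
        "FNR":                fn / (fn + tp),
        "accuracy":           (tp + tn) / (tp + tn + fp + fn),
        "precision":          tp / (tp + fp),
    }

# Example with made-up counts: 95 attacks caught, 5 missed, 2 false alarms.
print(detection_metrics(tp=95, tn=98, fp=2, fn=5))
```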

An IDS should have a high DR and a low FAR. Other commonly used metric pairs include precision and recall, or sensitivity and specificity [23]. The accuracy rate is chosen as the ABC-SVM performance metric. Tables I, II, and III summarize the results on the test data, showing the training and testing times in seconds and the accuracy of each classifier for each of the five classes.

TABLE I
PERFORMANCE OF ABC-SVM

Attack type | Training time (s) | Testing time (s) | Accuracy (%)
Normal      | 5.16              | 0.53             | 100.00
Probe       | 2.38              | 0.13             | 100.00
DOS         | 21.26             | 3.15             | 99.92
U2R         | 3.65              | 0.95             | 76.00
R2L         | 1.92              | 1.13             | 87.92

TABLE II
PERFORMANCE OF PSO-SVM

Attack type | Training time (s) | Testing time (s) | Accuracy (%)
Normal      | 7.58              | 2.44             | 98.69
Probe       | 8.25              | 1.59             | 95.53
DOS         | 27.48             | 5.18             | 93.28
U2R         | 8.05              | 1.65             | 65.85
R2L         | 3.52              | 1.17             | 66.08

TABLE III
PERFORMANCE OF GA-SVM

Attack type | Training time (s) | Testing time (s) | Accuracy (%)
Normal      | 9.64              | 5.81             | 97.35
Probe       | 10.25             | 3.78             | 95.53
DOS         | 32.85             | 8.91             | 93.28
U2R         | 12.94             | 4.37             | 60.35
R2L         | 5.73              | 3.82             | 57.12

After selecting the best feature subsets with the BABC algorithm, we built several intrusion detection models on the sampled training datasets. Treating all attacks as a whole, we built two types of intrusion detection system: one using all 41 features and the other using the selected features. BABC selected the attack-relevant features, including service, src_bytes, count, and dst_host_count, and SABC selected the SVM parameters $C = 2238.2041$, $\sigma = 1.1037$, and $\varepsilon = 0.01275$.

For the Normal class the methods give similar performance: there is only a small difference in accuracy among the Normal, Probe, and DOS classes, but a significant difference for the U2R and R2L classes, since these two classes have little training data compared with the others. Speed is always a concern, as intrusion detection systems are intended for on-line applications. Tables I-III show that the most time-consuming part is building the ABC-SVM model: finding the correct combination of parameters, better ways of training the data, a proper training data set, and so on. Training clearly takes much longer than testing [24].

The comparative performance of the hybrid ABC-SVM is illustrated in Tables I-III. None of the classifiers considered detects all of the attacks perfectly, but the empirical results show that the hybrid ABC-SVM works better than PSO-SVM and GA-SVM across the five attack classes. The ABC- and SVM-based feature extraction improves the overall accuracy by 1.31% to 2.65%, and the proposed ABC-SVM approach gives especially better performance than the other individual models for detecting DOS and U2R attacks.

We can see that the ABC-SVM intrusion detection system with selected features achieves a higher accuracy rate (100%) than the PSO-SVM and GA-SVM approaches (98.69% and 97.35%) in detecting attacks [25].

V. CONCLUSION

In this paper we proposed a hybrid ABC-based feature selection algorithm to build a novel IDS. The SVM parameters $C$, $\sigma$, and $\varepsilon$ are selected by SABC, and the IDS feature selection algorithm consists of a search strategy (BABC) and an evaluation criterion (SVM). We carried out a series of experiments on the KDD Cup (1999) intrusion detection dataset to examine the effectiveness of our feature selection and of the free-parameter tuning in building an effective IDS. The experimental results show that our approach is able not only to select the important features but also to yield high accuracy rates for the IDS. In future work we will further improve the search strategy and evaluation criterion of our feature selection algorithm to help build efficient and practical intrusion detection systems.

REFERENCES

[1] Y. Grandvalet and S. Canu, "Adaptive scaling for feature selection in SVMs," in Advances in Neural Information Processing Systems 15, 2003, pp. 553–560.

[2] H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining. Boston: Kluwer Academic, 1998.

[3] R. C. Chen and C. H. Hsieh, "Web page classification based on a support vector machine using a weighted vote schema," Expert Systems with Applications, vol. 31, pp. 427–435, 2006.

[4] C. Gold, A. Holub, and P. Sollich, "Bayesian approach to feature selection and parameter tuning for support vector machine classifiers," Neural Networks, vol. 18, pp. 693–701, 2005.

[5] S. W. Lin, K. C. Ying, S. C. Chen, and Z. J. Lee, "Particle swarm optimization for parameter determination and feature selection of support vector machines," Expert Systems with Applications, vol. 35, pp. 1817–1824, 2008.

[6] H. Yang, S. Zhang, K. Deng, and P. Du, "Research into a feature selection method for hyperspectral imagery using PSO and SVM," Journal of China University of Mining & Technology, vol. 17, pp. 473–478, 2007.

[7] S. Fei, M. J. Wang, Y. Miao, J. Tu, and C. Liu, "Particle swarm optimization-based support vector machine for forecasting dissolved gases content in power transformer oil," Expert Systems with Applications, vol. 35, pp. 1817–1824, 2008.

[8] T. Shon, Y. Kim, C. Lee, and J. Moon, "A machine learning framework for network anomaly detection using SVM and GA," in Proc. IEEE Workshop on Information Assurance and Security, 2005, pp. 176–183.

[9] I. Oh, J. Lee, and B. Moon, "Hybrid genetic algorithms for feature selection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp. 1424–1437, 2004; W. H. Chen, S. H. Hsu, and H. P. Shen, "Application of SVM and ANN for intrusion detection," Computers & Operations Research, vol. 32, pp. 2617–2634, 2005.

[10] KDD Cup 1999 data. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

[11] D. Karaboga, "An idea based on honey bee swarm for numerical optimization," Technical Report TR06, Computer Engineering Department, Erciyes University, Turkey, 2005.

[12] A. Singh, "An artificial bee colony algorithm for the leaf-constrained minimum spanning tree problem," Applied Soft Computing, vol. 9, pp. 625–631, 2009.

[13] F. Kang, J. Li, and Q. Xu, "Structural inverse analysis by hybrid simplex artificial bee colony algorithms," Computers and Structures, vol. 87, pp. 861–870, 2009.

[14] D. Karaboga and B. Basturk, "On the performance of artificial bee colony (ABC) algorithm," Applied Soft Computing, vol. 8, pp. 687–697, 2008.

[15] D. Karaboga and B. Akay, "A comparative study of artificial bee colony algorithm," Applied Mathematics and Computation, vol. 214, pp. 108–132, 2009.

[16] J. Yu, H. Lee, M. S. Kim, and D. Park, "Traffic flooding attack detection with SNMP MIB using SVM," Computer Communications, vol. 31, pp. 4212–4219, 2008.

[17] S. Peddabachigari, A. Abraham, C. Grosan, and J. Thomas, "Modeling intrusion detection system using hybrid intelligent systems," Journal of Network and Computer Applications, vol. 30, pp. 114–132, 2007.

[18] E. Hernández-Pereira, J. A. Suárez-Romero, O. Fontenla-Romero, and A. Alonso-Betanzos, "Conversion methods for symbolic features: A comparison applied to an intrusion detection problem," Expert Systems with Applications, vol. 36, pp. 10612–10617, 2009.

[19] D. Gayo-Avello, "A survey on session detection methods in query logs and a proposal for future evaluation," Information Sciences, vol. 179, pp. 1822–1843, 2009.

[20] A. Tajbakhsh, M. Rahmati, and A. Mirzaei, "Intrusion detection using fuzzy association rules," Applied Soft Computing, vol. 9, pp. 462–469, 2009.

[21] A. Patcha and J. M. Park, "An overview of anomaly detection techniques: Existing solutions and latest technological trends," Computer Networks, vol. 51, pp. 3448–3470, 2007.

[22] G. Giacinto, R. Perdisci, M. Del Rio, and F. Roli, "Intrusion detection in computer networks by a modular ensemble of one-class classifiers," Information Fusion, vol. 9, pp. 69–82, 2008.

[23] S. X. Wu and W. Banzhaf, "The use of computational intelligence in intrusion detection systems: A review," Applied Soft Computing, vol. 10, pp. 1–35, 2010.

[24] C. Xiang, P. C. Yong, and L. S. Meng, "Design of multiple-level hybrid classifier for intrusion detection system using Bayesian clustering and decision trees," Pattern Recognition Letters, vol. 29, pp. 918–924, 2008.

[25] S. Chebrolu, A. Abraham, and J. P. Thomas, "Feature deduction and ensemble design of intrusion detection systems," Computers & Security, vol. 24, pp. 295–307, 2005.
