improvedtlbo …downloads.hindawi.com/journals/complexity/2020/5287684.pdffss is a multiobjective...

18
Research Article ImprovedTLBO-JAYAAlgorithmforSubsetFeatureSelectionand Parameter Optimisation in Intrusion Detection System Mohammad Aljanabi , 1,2 MohdArfianIsmail , 2 andVitalyMezhuyev 3 1 College of Education, Aliraqia University, Baghdad, Iraq 2 Faculty of Computing, College of Computing and Applied Sciences, Universiti Malaysia Pahang, Malaysia 3 Institute of Industrial Management, FH Joanneum University of Applied Sciences, Graz, Austria Correspondence should be addressed to Mohammad Aljanabi; [email protected] Received 16 January 2020; Revised 2 May 2020; Accepted 4 May 2020; Published 31 May 2020 Academic Editor: Harish Garg Copyright © 2020 Mohammad Aljanabi et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Many optimisation-based intrusion detection algorithms have been developed and are widely used for intrusion identification. is condition is attributed to the increasing number of audit data features and the decreasing performance of human-based smart intrusion detection systems regarding classification accuracy, false alarm rate, and classification time. Feature selection and classifier parameter tuning are important factors that affect the performance of any intrusion detection system. In this paper, an improved intrusion detection algorithm for multiclass classification was presented and discussed in detail. e proposed method combined the improved teaching-learning-based optimisation (ITLBO) algorithm, improved parallel JAYA (IPJAYA) algorithm, and support vector machine. ITLBO with supervised machine learning (ML) technique was used for feature subset selection (FSS). e selection of the least number of features without causing an effect on the result accuracy in FSS is a multiobjective optimisation problem. is work proposes ITLBO as an FSS mechanism, and its algorithm-specific, parameterless concept (no parameter tuning is required during optimisation) was explored. IPJAYA in this study was used to update the C and gamma parameters of the support vector machine (SVM). Several experiments were performed on the prominent intrusion ML dataset, where sig- nificant enhancements were observed with the suggested ITLBO-IPJAYA-SVM algorithm compared with the classical TLBO and JAYA algorithms. 1.Introduction Recent advancements and popularisation of network and information technologies have increased the significance of network information security. Compared with conventional network defence mechanisms, human-based smart intrusion detection systems (IDSs) can either intercept or warn of network intrusion. However, most studies on information security have focused on the ways to improve the effec- tiveness of smart network IDSs. e use of smart IDSs is an effective network security solution that can protect against attacks. Nonetheless, machine learning (ML) methods and optimisation algorithms are often used for intrusion de- tection because the detection rate of existing IDSs is low when faced with audit data that have a high overhead [1]. e execution time can sometimes increase substantially when one attempts to rise a detection accuracy. Also, the execution time may be significantly reduced but at the cost of decreased accuracy. erefore, the feature subset selection (FSS) problem can be considered as a multiobjective opti- misation problem; it has more than one solution, from which the best may be chosen. Solutions that offer superior accuracy are selected by customers who prioritise precision. Other clients choose solutions that provide reduced exe- cution times as the best solutions, even though accuracy is compromised to a certain extent. e teaching-learning-based optimisation algorithm (TLBO), as a novel metaheuristic, has been recently applied to various intractable optimisation problems with consid- erable success. TLBO is superior to many other algorithms, Hindawi Complexity Volume 2020, Article ID 5287684, 18 pages https://doi.org/10.1155/2020/5287684

Upload: others

Post on 22-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

Research ArticleImprovedTLBO-JAYAAlgorithm for Subset Feature Selection andParameter Optimisation in Intrusion Detection System

Mohammad Aljanabi 12 Mohd Arfian Ismail 2 and Vitaly Mezhuyev3

1College of Education Aliraqia University Baghdad Iraq2Faculty of Computing College of Computing and Applied Sciences Universiti Malaysia Pahang Malaysia3Institute of Industrial Management FH Joanneum University of Applied Sciences Graz Austria

Correspondence should be addressed to Mohammad Aljanabi mohammadcs88gmailcom

Received 16 January 2020 Revised 2 May 2020 Accepted 4 May 2020 Published 31 May 2020

Academic Editor Harish Garg

Copyright copy 2020 Mohammad Aljanabi et al (is is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work isproperly cited

Many optimisation-based intrusion detection algorithms have been developed and are widely used for intrusion identification(is condition is attributed to the increasing number of audit data features and the decreasing performance of human-based smartintrusion detection systems regarding classification accuracy false alarm rate and classification time Feature selection andclassifier parameter tuning are important factors that affect the performance of any intrusion detection system In this paper animproved intrusion detection algorithm for multiclass classification was presented and discussed in detail (e proposed methodcombined the improved teaching-learning-based optimisation (ITLBO) algorithm improved parallel JAYA (IPJAYA) algorithmand support vector machine ITLBOwith supervisedmachine learning (ML) technique was used for feature subset selection (FSS)(e selection of the least number of features without causing an effect on the result accuracy in FSS is amultiobjective optimisationproblem (is work proposes ITLBO as an FSS mechanism and its algorithm-specific parameterless concept (no parametertuning is required during optimisation) was explored IPJAYA in this study was used to update the C and gamma parameters ofthe support vector machine (SVM) Several experiments were performed on the prominent intrusion ML dataset where sig-nificant enhancements were observed with the suggested ITLBO-IPJAYA-SVM algorithm compared with the classical TLBO andJAYA algorithms

1 Introduction

Recent advancements and popularisation of network andinformation technologies have increased the significance ofnetwork information security Compared with conventionalnetwork defence mechanisms human-based smart intrusiondetection systems (IDSs) can either intercept or warn ofnetwork intrusion However most studies on informationsecurity have focused on the ways to improve the effec-tiveness of smart network IDSs (e use of smart IDSs is aneffective network security solution that can protect againstattacks Nonetheless machine learning (ML) methods andoptimisation algorithms are often used for intrusion de-tection because the detection rate of existing IDSs is lowwhen faced with audit data that have a high overhead [1]

(e execution time can sometimes increase substantiallywhen one attempts to rise a detection accuracy Also theexecution timemay be significantly reduced but at the cost ofdecreased accuracy (erefore the feature subset selection(FSS) problem can be considered as a multiobjective opti-misation problem it has more than one solution fromwhich the best may be chosen Solutions that offer superioraccuracy are selected by customers who prioritise precisionOther clients choose solutions that provide reduced exe-cution times as the best solutions even though accuracy iscompromised to a certain extent

(e teaching-learning-based optimisation algorithm(TLBO) as a novel metaheuristic has been recently appliedto various intractable optimisation problems with consid-erable success TLBO is superior to many other algorithms

HindawiComplexityVolume 2020 Article ID 5287684 18 pageshttpsdoiorg10115520205287684

such as genetic algorithms (GAs) particle swarm and antcolony Moreover TLBO needs fewer parameters for tuningduring execution compared with other algorithms(us thecombination of improved multiobjective TLBO frameworkswith supervised ML techniques was proposed in the presentstudy for FSS in multiclass classification problems (MCPs)for intrusion detection (e selection of the least number offeatures without causing an effect on the result accuracy inFSS is a multiobjective optimisation problem (e firstobjective is the number of features and the second is thedetection accuracy TLBO remarkably outperforms othermetaheuristic algorithms (us ITLBO and a set of su-pervised SVMwere deployed in this study for the selection ofthe optimal feature subset JAYA is a new metaheuristicoptimisation algorithm proposed by Rao (2016) which wasrecently deployed in several intractable optimisation prob-lems JAYA differed from other optimisation algorithms bynot requiring parameter tuning [2] It has been used as abenchmark function for constrained and unconstrainedcases and despite being parameterless like TLBO it requiresno learning phase making it different from TLBO [3] (eprinciple of JAYA is the establishment of the problemrsquossolution by inclining towards the best result and keeping offfrom the bad one (is movement depends on certaincontrol parameters like the number of design variables themaximum number of generations and the size of thepopulation It requires no tunable control parameter beforethe computation phase (us IPJAYA is used to tune theparameters of the SVM In order to improve the featureselection process and SVM parameter tuning in this paperwe propose an improved algorithm for subset feature se-lection using an enhanced TLBO algorithm It uses an ad-ditional phase in TLBO to increase the informationexchange between teachers and learners SVM parametertuning is based on the improved parallel JAYA algorithmwhich uses parallel processing to increase the speed of pa-rameter tuning (e proposed algorithm is called ITLBO-IPJAYA-SVM

(e remaining part of this paper is presented in thefollowing manner Section 2 reviews work related to thisstudy and the FSS problem is introduced in Section 3 (eITLBO is discussed in Section 4 and Section 5 explains MLapplied with ITLBO Section 6 compares the results of theITLBO and TLBO algorithms Finally Section 7 concludesthis study

2 Related Work

Intrusion detection is a prevalent security infrastructuretopic in the era of big data Combinations of different MLmethods and optimisation algorithms have been developedand applied in the IDS to distinguish a normal networkaccess from the attacks Existing combinations includefuzzy logic cuttlefish optimisation algorithm K-nearestneighbour artificial neural network particle swarm algo-rithm support vector machine (SVM) and artificial im-mune system approaches [4] Most methods that combineML with optimisation algorithms outperform conventionalclassification methods Numerous researchers have also

proposed ML and optimisation-based IDSs [5] Louvieriset al [6] proposed a novel combination of techniques (K-means clustering naıve Bayes (NB) KruskalndashWallis (KW)and C45) that pinpointed attacks as anomalies with highaccuracy even within cluttered and conflicted cyber-net-work environments Furthermore the inclusion of the NBfeature selection and the KW test in this method facilitatesthe classification of statistically significant and relevantfeature sets including a statistical benchmark for thevalidity of the method while the detection of SQL injectionin this method remains low De la Hoz et al [7] presented amethod for NIDS that was based on self-organising maps(SOMs) and principal component analysis (PCA) Noisewithin the dataset and low-variance features were filteredby means of PCA and Fisher discriminant ratio (isprocedure uses the most discriminative projections basedon the variance explained by the eigenvectors Prototypesgenerated by the self-organising process are modelled by aGaussian where d is the number of SOM units (ereforethis system must be trained only once however the mainlimitation of this work is that the detection rate remainslow Bamakan et al [8] proposed a chaos-particle swarmoptimisation method to provide a new ML IDS based ontwo conventional classifiers multiple-criteria linear pro-gramming and an SVM

(e proposed approach has been applied to simulta-neously set the parameters of these classifiers and providethe optimal feature subset (e main drawback of this workis the long training time needed (erefore even thoughthese combinations can improve the performance of IDSsin terms of learning speed and detection rate compared toconventional algorithms further improvement is needed(e performance of most IDSs is affected in terms ofclassification accuracy and training time by an increase inthe number of audit data features (e present paperproposes the use of the TLBO technology to address thisissue through the supply of a fast and accurate optimisationprocess that can improve the capability of an IDS to find theoptimal detection model based on ML In the TLBO al-gorithm proposed by Rao et al [9] the optimisationprocess for mechanical design problems does not need anyuser-defined parameter (is novel technique was tested ondifferent benchmark functions and the results demon-strated that the developed TLBO outperformed particleevolutionary swarm optimisation artificial bee colony(ABC) and cultural DE Das and Padhy [10] studied thepossibility of applying a novel TLBO algorithm to theselection of optimal free parameters for an SVM regressionmodel of financial time-series data by using multi-commodity futures index data retrieved from multicutcrossover (MCX) (eir experimental results showed thatthe proposed hybrid SVM-TLBO model successfullyidentified the optimal parameters and yielded better pre-dictions compared to the conventional SVM Das et al [11]proposed an extension of the hybrid SVM-TLBO model byintroducing a dimension reduction technique whereby thenumber of input variables can be reduced by using PCAkernel PCA (KPCA) and independent component analysis(ICA) (three common dimension reduction methods) (is

2 Complexity

study also examined the feasibility of the proposed modelusing multicommodity futures index data retrieved fromMCX Rao et al [12] confirmed the superiority of the modelcompared to some population-inspired optimisationframeworks Rao and Patel [13] investigated the effect ofsample size and number of generations on algorithmicperformance and concluded that this algorithm can beeasily applied to several optimisation cases Crepinsek et al[14] solved the problems presented in [9 12] by usingTLBO Nayak et al [15] developed a multiobjective TLBOin which a matrix of solutions was created for each ob-jective (e teacher selection process in TLBO is mainlybased on the best solution presented in the solution spaceand learners are taught to merely maximise that objectiveAll the available solutions in the solution space were sortedto generate a collection of optimal solutions Xu et al [16]presented multiobjective TLBO based on different teachingtechniques (ey used a crossover operator (rather than ascalar function) between solutions in the teaching andlearning phases Kiziloz et al [17] suggested three multi-objective TLBO algorithms for FSS in binary classification(FSS-BCP) Among the presented methods a multi-objective TLBO with scalar transformation was found to bethe fastest algorithm although it provided a limitednumber of nondominated solutions Multiobjective TLBOwith nondominated selection (MTLBO-NS) explores thesolution space and produces a set of nondominated so-lutions but requires a long execution time MultiobjectiveTLBO with minimum distance (MTLBO-MD) generatessolutions that are similar to those of MTLBO-NS but in asignificantly shorter time (e proposed multiobjectiveTLBO algorithms have been evaluated in terms of per-formance using LR SVM and extreme learning machine(ELM) Wang et al suggested a novel ldquoalcoholism iden-tification method from healthy controls based on a com-puter-vision approachrdquo [18] (is approach relied on threecomponentsmdashthe proposed wavelet Renyi entropy feed-forward neural network and the proposed three-segmentencoded JAYA algorithm (e results showed the proposedmethod exhibits good sensitivity but the accuracy stillneeds improvements Migallon et al [19] developed parallelalgorithms and presented their detailed analysis (eydeveloped a hybrid algorithm that exploited inherentparallelism at two different levels (e lower level wasexploited by parallel shared-memory platforms while theupper level was exploited by distributed shared memoryplatforms (e results of both algorithms were good es-pecially in scalability Hence the proposed hybrid algo-rithm successfully used a number of processes with near-perfect efficiencies (e experiments showed that themethod used about 60 processes to achieve near-ideal ef-ficiencies as analysed on 30 unconstrained functions Gong[20] suggested a ldquonovel E-JAYA algorithm for the per-formance enhancement of the original JAYA algorithmrdquo(e proposed E-JAYA used the average of the better andworse groups to derive the best solution (e solutionprovided by the proposed E-JAYA had better accuracy thanthat of the original JAYA (e swarm behaviours wereconsidered in the E-JAYA rather than considering the best

and worst individual behaviours (e performance ofE-JAYA was assessed on 12 benchmark functions ofvarying dimensionality

Another study proposed an effective demand-sidemanagement scheme for residential HEMS [21] (esystem was proposed for peak creation prevention toreduce electricity bills (is study applied JAYA SBA andEDE to realise its objectives it also deployed the TOUpricing scheme for electricity bill computation From theresult JAYA was sufficient in reducing electricity bill andPAR thereby achieving customer satisfaction Further-more the SBA outperformed JAYA and EDE in achievinguser comfortability as it related negatively with an elec-tricity bill Yu et al [22] developed improved JAYA(IJAYA) for steady and accurate PV model parameterestimation by incorporating a self-adaptive weight for theadjustment of the propensity of reaching the best solutionand avoiding the bad solution while searching (e weighthelps in ensuring the framework achieves the possiblesearch region early and to perform local search laterFurthermore the algorithm contains a learning strategyderived from other individualsrsquo experiences which wasrandomly used for population diversity improvementTable 1 shows the lacks and limitation of IDS studiesmentioned in the related work

3 Feature Subset Selection Problem

(is section explains the representation of the featuresand the problem of choosing the best feature subset FSSrefers to the selection of feature subsets from a largerfeature set FSS reduces the number of features in adataset thereby preventing complex calculations andimproving the speed and performance of classifiersSeveral definitions of FSS exist in literature [23] somedefinitions deal with the reduction in size of the selectedsubset while others focus on the improvement of pre-diction accuracy FSS is essentially a process of con-structing an effective subset that represents theinformation contained in a dataset by eliminating re-dundant and irrelevant features FSS mainly aims atfinding the least number of features without having asignificant influence on classification accuracy Owing tothe complicated nature of optimal subset feature ex-traction as well as the nonexistence of a polynomial-timealgorithm for addressing it FSS has been classified as anNP-hard problem [24] (ere are four steps in typical FSS[23] the first step involves the selection of candidatefeatures that will constitute the subsets while the secondstep is the evaluation and comparison of these subsetswith each other In the third step a check is made for thesatisfaction of the termination condition otherwise thefirst and second steps will be repeated (e final stepchecks if the optimal feature subset has been establishedbased on prior knowledge With these two major aimsFSS can be considered a multiobjective problem A formaldefinition of finding optimal solutions through the sat-isfaction of both objectives is given in the followingequation

Complexity 3

Min (f1)

Max (f2)

subject to f1 |k|

f2 accuracy(k) where ksubeK

(1)

where k is the subset of the original dataset K which opti-mises f1 and f2 (the objectives)

(e establishment of the best solution or the decision onthe improved condition of a new individual is a complicatedtask in a multiobjective optimisation process (is is due tothe chances of enhancement in one objective causing areduction in the other

4 Improved TLBO Algorithm

(e ITLBO algorithm was executed at the FSS phase in thisstudy (e ITLBO algorithm was initialised by randomlygenerated initial population namely the teacher and a set ofstudents which represents the set of solutions To representthe features in the ITLBO algorithm ITLBO borrowed thecrossover and mutation operators from GA by representingthe features as chromosomes (one of the GA properties) Toupdate this chromosome crossover and mutation operatorswere used In the population (called a classroom) eachsolution is taken as an individualchromosome (Figure 1) Afeature gene of a chromosome with a value of 1 is consideredas selected while a value of 0 denotes otherwise Figure 1shows a sample of the dataset regarding Figure 2 features AB C D E I K and L were selected (their values are 1) whilefeatures F G H and J were not (their values are 0) (eTLBO algorithm runs through iterations where the teacher isthe best individual in the population and the rest of theindividuals become the students Having selected theteacher ITLBO works in three phases Teacher BestClassmates (Learner Phase 1) and Learner Phase 2 In theTeacher phase the teacher enhances the knowledge of eachstudent by sharing knowledge with them but in the BestClassmates phase two best students are selected andassigned the task of interacting with the other students Inthe Learner phase there is a random interaction among thestudents in a bid to enhance their levels of knowledge Newchromosomes are generated in the proposed ITLBO usingldquohalf-uniform crossover and bit-flip mutation operatorsrdquowhich are special crossover operators (Figures 3 and 4)Two-parent chromosomes (could be a teacher a student ortwo students) are needed for the crossover operator (ecrossover operator relies on the information of the two-parent chromosomes if both parents feature the same genethe gene is kept but whenever there are different featuregenes in both parents a parentrsquos gene is randomly chosenOnly one new chromosome is generated from this operation

(e ldquobit-flip mutationrdquo works on a single chromosome whentrying to manipulate a single gene based on a probabilisticratio If the gene has a zero value it will be updated as one orvice versa In the proposed ITLBO algorithm nondominatedsorting and selection were used (e dominance of an in-dividual over another individual is determined strictly on thebasis of whether a minimum of one of its objectives is su-perior to that of the other while keeping all the other ob-jectives the same

A nondominated scenario arises when there is nopossibility of an individual being dominated by another (efront line of the solution set is filled by the nondominatedindividuals (ose that are closest to the ideal point in thefront line are chosen as the teachers All the teachers teach allstudents discretely at the Teacher Best Classmate andLearner phases (e details of the ITLBO algorithm arepresented in Figures 5 and 6(e detail steps of ITLBO are asfollows

(i) Step 1 initialise the population randomly with eachpopulation having a different set of features from 1to a maximum number of features (41 in NSL-KDD) (is step is captured in line 2 of Figure 5

(ii) Step 2 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately and a crossover is applied with each oneand then a mutation is applied to all the resultingindividuals (e operators used are half-uniformcrossover and bit-flip mutation operators (repre-sented in lines 4 to 5 in Figure 5)

(iii) Step 3 check the population (chromosome) thatresults from the crossover and mutation if the newchromosome is better than the old then the new oneis kept otherwise the old one is retained All theaforementioned steps are collectively called theTeacher phase because all individuals learn from thebest one (the teacher) (is step is represented inlines 6 to 13 in Figure 5

(iv) Step 4 after that Learner Phase 1 or learning fromthe best classmates is started (is phase begins withthe fifth step which is the selection of the best twoindividuals as students and applying a crossoverbetween them followed by a mutation If the newone is better than the previous two students thenthe newer choice is kept otherwise the older bestchoice is kept (is process is repeated with all otherindividuals (students) At this point Learner Phase1 has terminated (viewed in lines 14 to 27 inFigure 5)

(v) Step 5 this step is Learner Phase 2 which involveschoosing two random individuals (students) be-tween whom a crossover is applied followed by amutation on the new individual If the new

Table 1 IDS existing work

Ref Limitation[6] Detection of SQL injection is low[7] Detection rate is low[8] Long training time

110100011111

Figure 1 Schematic representation of a chromosome 1 selectedfeatures 0 unselected features

4 Complexity

individual is better than the old two students thenthe new one is kept otherwise the best old one isretained (is step is repeated with all other

students At this point the main three stages ofITLBO have been completed and a check should becarried out on whether the termination criteria have

(1) Start (2) Initialize population(3) Calculate_weighted_average_of

_individuals (population)(4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i =1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation (Xnew)(10) if (Xnew is better than Xi) then(11) Xi = [Xnew](12) End if(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i =1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)(17) n = Select_best_individual_from (population) (18) n ne m ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation (Xnew)(21) if (Xnew is better than Xm) then(22) Xm = Xnew(23) End if

(24) if (Xnew is better than Xn) then (25) Xn = Xnew(26) End if (27) End for (28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m = Select_random_individual_from

(population)(31) n = Select_random_individual_from

(population) n ne m ne teacherlowast(32) Xnew = Crossover (Xm Xn)(33) Xnew = Mutation (Xnew)(34) if (Xnew is better than Xm) then(35) Xm = Xnew(36) End if (37) if (Xnew is better than Xn) then(38) Xn = Xnew(39) End if (40) End for (41) Is termination criterion satisfied if yes go to line

42 else continue(42) End for(43) Show_the_pareto_optimal_set (population)(44) End

Figure 5 ITLBO algorithm

Figure 2 Sample of the dataset

0100111111

1101110101

1100110111

Teacher

New student

Student

Figure 3 Crossover operator

0100111111

0100110111

Figure 4 Mutation operator

Complexity 5

been satisfied or not If the termination criteria weresatisfied proceed to the next step otherwise themain three stages are repeated (Teacher phaseLearner Phase 1 and Learner Phase 2) (is step isrepresented in lines 28 to 40 in Figure 5

(vi) Step 6 the final step is the application of non-dominated sorting to the result Nondominatedsortingmeans no result (individual) is better than all

other individuals (is step can be viewed in line 43in Figure 5

5 Parameter Optimisation

After selecting the optimal subset feature several SVMparameters will be tuned(e tuning of SVM parameters is aproblem which can determine algorithm performance (e

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Is new one better thanold one

Keep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Is new one better thanold one

Keep new oneKeep old one

Is the termination criterion satisfied

Is new one better than the worse student based on new

value of C and YKeep new oneKeep old one

Is the terminationcriterion satisfied

End

YesNo

YesNo

No Yes

YesNo

No

Crossover teacher with all other individuals (student) separately and apply mutation

Apply nondominated sorting and find the Pareto setYes

Initialize population randomly

Figure 6 ITLBO flowchart

6 Complexity

radial basis function (RBF) kernel function of the SVM isemployed for the conversion of the completely nonseparableproblem into a separable or approximate separable state(eRBF kernel parameter c suggests data distribution to a newfeature space while parameter C suggests the level of penaltyfor the classification error in the linear nonseparable caseEquations (2) and (3) represent the cost and gamma re-spectively In the next section the two parameters (C and c)were tuned by using the IPJAYA algorithm

w b12

w22

+ C 1113944n

ζnstyn wtxn + b1113872 1113873ge 1 minus ζn (2)

k xn xm( 1113857 exp minusc xn minus xm

23

1113874 1113875 (3)

6 Improved Parallel JAYA Algorithm

(e JAYA algorithm needs improvements to work betterOne of the observations on the JAYA algorithm is that if wesort the populations from best to worst and divide them intotwo groups the best and the worst solutions Obviously theoptimal solution is located in the best solution group [2]Based on this observation an improvement has been done inthe JAYA algorithm rather than selecting the best and worstcases from the whole solutions which puts the worst so-lution further from the best solution and increase the it-erations needed to reach the optimal solution the solutionswere divided into two groups (e best solution is chosenfrom the best solution group as ldquoBestrdquo and the best solutionfrom the worst solution group is the ldquoWorstrdquo (is proce-dure reserves the populationrsquos diversity and makes the so-lution start from a point closer to the optimal solution anddecreases the number of iterations needed to reach theoptimal solution In the proposed work JAYA algorithmwas improved to optimise two parameters of the SVMclassifier simultaneously Figure 7 shows the flowchart ofIPJAYA while Figure 8 shows the IPJAYA algorithm fol-lowed by the detailed steps of IPJAYA

(e detailed steps of IPJAYA are shown as follows(i) Step 1 select the population size and the number of

design variables as well as initialise the terminationcondition To explain the parameter optimisation indetail we assume the following scenario populationsize 3 design variables 2 and terminationcriterion 2 iterations (e value of the populationis the value of parameters C and c in this scenarioeach one has 3 values (ese values were initialisedrandomly for C between 0001 to 100 and for c

between 00001 to 64 Table 2 shows the values of Cand c

(ii) Step 2 SVM needs three things to classify any la-belled data ie features to choose value of C pa-rameter and value of c parameter (is step can beviewed in line 2 of Figure 8

(iii) Step 3 the next step is to evaluate each value forboth C and c separately by using the SVM and on

the first student from Learner Phase 2 after applyingcrossover and mutation as shown in Table 3To continue the optimisation process the pop-ulation is arranged from best to worst and split intotwo groups (Best and Worst groups) as shown inTable 4(e same procedure is repeated for c parameter andthis time C is by default and the new value of c is11006

(iv) Tables 5 and 6 show the details of c parameter(v) Step 4 the result will be considered as the objective

function for both C and c and then compared withother populations and continued until the termi-nation criterion is satisfied (is step can be viewedin line 7 of Figure 8

(ese two new values for C and c will be evaluated usingthe same subset feature at the same time as shown in Table 7(is step can be viewed in lines 5 to 6 of Figure 8

7 The Proposed Method

(is section describes the proposed combination of threedifferent algorithms Each algorithm has a different task todo and these tasks complete the work of the model (e firstalgorithm is the ITLBO whose task is to choose the optimalsubset feature from the whole features(e second algorithmis IPJAYA algorithm and its task is to optimise the pa-rameters of the SVM (e third algorithm is the SVMclassifier which takes the outcome of the first two algorithmsto determine if the processed traffic is intrusion or normaltraffic Figure 9 shows the flowchart of the proposed methodFigure 10 shows the pseudo-code of the proposed methodwhile Figure 11 details the proposed steps of the IPJAYA-ITLBO-SVM method

(e detail steps of ITLBO-IPJAYA-SVM are asfollows

(i) Step 1 initialise the population randomly Eachpopulation is a different set of features from 1 tothe maximum number of features (41 in NSL-KDD) (is step can be viewed in line 2 ofFigure 10

(ii) Step 2 calculate the weighted average of everyindividual population (is step can be viewed inline 3 of Figure 10

(iii) Step 3 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately Apply crossover with each one andthen apply mutation to all resulted individuals(ecrossover used is called the half-uniform crossoverand bit-flip mutation operator (is step can beviewed in lines 4 to 5 of Figure 10

(iv) Step 4 check the population (chromosome)resulting from crossover and mutation if the newone is better than the old one keep the new oneotherwise retain the old one (e best and worstpopulations refer to the degree of accuracy during

Complexity 7

Start

Initialize population size number of design variables and termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution for C Calculate new worst solution for γ

Modify the solution based on eq 4 Modify the solution based on eq 4

Use the new value of C and γ do the classification

Is new Acc Better than old Acc

Keep new C and γKeep old C and γ

Is the termination criterionsatisfied

Return best value of C and γ

End

YesNo

NO

Yes

Parallel pool

Figure 7 IPJAYA flowchart

(1) Start (2) Initialise the population size number of designed

variables and termination criteria (3) Repeat Steps 3ndash6 until the termination criteria are met (4) Arrange the solutions from best to worst and split the

solutions into two groups -best and worst solutions(5) Make the best solution in best group as best and

make the best solution in worst group as worst(6) Modify the solution based on the following equation

Yprimej k I = Yj k I + r1 k I (Yj best I ndash | Yj k I |) ndashr2 k I (Yj worst I ndash | Yj k I|)

(7) Update the previous solution if Yprimej k I gt Yj k Iotherwise do not update the previous solution

(8) Display the established optimum solution(9) End

Figure 8 IPJAYA algorithm

Table 2 C and c values

C c

20 101 210 0701 1

Table 3 Evaluation of C

C c Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

Table 4 Best and Worst groups for C

C cSubsetfeature

Accuracy (objectivefunction)

10 Default Fixed 0994 best of best Best group01 Default Fixed 09920 Default Fixed 097 best of worst Worst

group1 Default Fixed 0899

Table 5 Accuracy based on c

C γ Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

8 Complexity

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 2: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

such as genetic algorithms (GAs) particle swarm and antcolony Moreover TLBO needs fewer parameters for tuningduring execution compared with other algorithms(us thecombination of improved multiobjective TLBO frameworkswith supervised ML techniques was proposed in the presentstudy for FSS in multiclass classification problems (MCPs)for intrusion detection (e selection of the least number offeatures without causing an effect on the result accuracy inFSS is a multiobjective optimisation problem (e firstobjective is the number of features and the second is thedetection accuracy TLBO remarkably outperforms othermetaheuristic algorithms (us ITLBO and a set of su-pervised SVMwere deployed in this study for the selection ofthe optimal feature subset JAYA is a new metaheuristicoptimisation algorithm proposed by Rao (2016) which wasrecently deployed in several intractable optimisation prob-lems JAYA differed from other optimisation algorithms bynot requiring parameter tuning [2] It has been used as abenchmark function for constrained and unconstrainedcases and despite being parameterless like TLBO it requiresno learning phase making it different from TLBO [3] (eprinciple of JAYA is the establishment of the problemrsquossolution by inclining towards the best result and keeping offfrom the bad one (is movement depends on certaincontrol parameters like the number of design variables themaximum number of generations and the size of thepopulation It requires no tunable control parameter beforethe computation phase (us IPJAYA is used to tune theparameters of the SVM In order to improve the featureselection process and SVM parameter tuning in this paperwe propose an improved algorithm for subset feature se-lection using an enhanced TLBO algorithm It uses an ad-ditional phase in TLBO to increase the informationexchange between teachers and learners SVM parametertuning is based on the improved parallel JAYA algorithmwhich uses parallel processing to increase the speed of pa-rameter tuning (e proposed algorithm is called ITLBO-IPJAYA-SVM

(e remaining part of this paper is presented in thefollowing manner Section 2 reviews work related to thisstudy and the FSS problem is introduced in Section 3 (eITLBO is discussed in Section 4 and Section 5 explains MLapplied with ITLBO Section 6 compares the results of theITLBO and TLBO algorithms Finally Section 7 concludesthis study

2 Related Work

Intrusion detection is a prevalent security infrastructuretopic in the era of big data Combinations of different MLmethods and optimisation algorithms have been developedand applied in the IDS to distinguish a normal networkaccess from the attacks Existing combinations includefuzzy logic cuttlefish optimisation algorithm K-nearestneighbour artificial neural network particle swarm algo-rithm support vector machine (SVM) and artificial im-mune system approaches [4] Most methods that combineML with optimisation algorithms outperform conventionalclassification methods Numerous researchers have also

proposed ML and optimisation-based IDSs [5] Louvieriset al [6] proposed a novel combination of techniques (K-means clustering naıve Bayes (NB) KruskalndashWallis (KW)and C45) that pinpointed attacks as anomalies with highaccuracy even within cluttered and conflicted cyber-net-work environments Furthermore the inclusion of the NBfeature selection and the KW test in this method facilitatesthe classification of statistically significant and relevantfeature sets including a statistical benchmark for thevalidity of the method while the detection of SQL injectionin this method remains low De la Hoz et al [7] presented amethod for NIDS that was based on self-organising maps(SOMs) and principal component analysis (PCA) Noisewithin the dataset and low-variance features were filteredby means of PCA and Fisher discriminant ratio (isprocedure uses the most discriminative projections basedon the variance explained by the eigenvectors Prototypesgenerated by the self-organising process are modelled by aGaussian where d is the number of SOM units (ereforethis system must be trained only once however the mainlimitation of this work is that the detection rate remainslow Bamakan et al [8] proposed a chaos-particle swarmoptimisation method to provide a new ML IDS based ontwo conventional classifiers multiple-criteria linear pro-gramming and an SVM

(e proposed approach has been applied to simulta-neously set the parameters of these classifiers and providethe optimal feature subset (e main drawback of this workis the long training time needed (erefore even thoughthese combinations can improve the performance of IDSsin terms of learning speed and detection rate compared toconventional algorithms further improvement is needed(e performance of most IDSs is affected in terms ofclassification accuracy and training time by an increase inthe number of audit data features (e present paperproposes the use of the TLBO technology to address thisissue through the supply of a fast and accurate optimisationprocess that can improve the capability of an IDS to find theoptimal detection model based on ML In the TLBO al-gorithm proposed by Rao et al [9] the optimisationprocess for mechanical design problems does not need anyuser-defined parameter (is novel technique was tested ondifferent benchmark functions and the results demon-strated that the developed TLBO outperformed particleevolutionary swarm optimisation artificial bee colony(ABC) and cultural DE Das and Padhy [10] studied thepossibility of applying a novel TLBO algorithm to theselection of optimal free parameters for an SVM regressionmodel of financial time-series data by using multi-commodity futures index data retrieved from multicutcrossover (MCX) (eir experimental results showed thatthe proposed hybrid SVM-TLBO model successfullyidentified the optimal parameters and yielded better pre-dictions compared to the conventional SVM Das et al [11]proposed an extension of the hybrid SVM-TLBO model byintroducing a dimension reduction technique whereby thenumber of input variables can be reduced by using PCAkernel PCA (KPCA) and independent component analysis(ICA) (three common dimension reduction methods) (is

2 Complexity

study also examined the feasibility of the proposed modelusing multicommodity futures index data retrieved fromMCX Rao et al [12] confirmed the superiority of the modelcompared to some population-inspired optimisationframeworks Rao and Patel [13] investigated the effect ofsample size and number of generations on algorithmicperformance and concluded that this algorithm can beeasily applied to several optimisation cases Crepinsek et al[14] solved the problems presented in [9 12] by usingTLBO Nayak et al [15] developed a multiobjective TLBOin which a matrix of solutions was created for each ob-jective (e teacher selection process in TLBO is mainlybased on the best solution presented in the solution spaceand learners are taught to merely maximise that objectiveAll the available solutions in the solution space were sortedto generate a collection of optimal solutions Xu et al [16]presented multiobjective TLBO based on different teachingtechniques (ey used a crossover operator (rather than ascalar function) between solutions in the teaching andlearning phases Kiziloz et al [17] suggested three multi-objective TLBO algorithms for FSS in binary classification(FSS-BCP) Among the presented methods a multi-objective TLBO with scalar transformation was found to bethe fastest algorithm although it provided a limitednumber of nondominated solutions Multiobjective TLBOwith nondominated selection (MTLBO-NS) explores thesolution space and produces a set of nondominated so-lutions but requires a long execution time MultiobjectiveTLBO with minimum distance (MTLBO-MD) generatessolutions that are similar to those of MTLBO-NS but in asignificantly shorter time (e proposed multiobjectiveTLBO algorithms have been evaluated in terms of per-formance using LR SVM and extreme learning machine(ELM) Wang et al suggested a novel ldquoalcoholism iden-tification method from healthy controls based on a com-puter-vision approachrdquo [18] (is approach relied on threecomponentsmdashthe proposed wavelet Renyi entropy feed-forward neural network and the proposed three-segmentencoded JAYA algorithm (e results showed the proposedmethod exhibits good sensitivity but the accuracy stillneeds improvements Migallon et al [19] developed parallelalgorithms and presented their detailed analysis (eydeveloped a hybrid algorithm that exploited inherentparallelism at two different levels (e lower level wasexploited by parallel shared-memory platforms while theupper level was exploited by distributed shared memoryplatforms (e results of both algorithms were good es-pecially in scalability Hence the proposed hybrid algo-rithm successfully used a number of processes with near-perfect efficiencies (e experiments showed that themethod used about 60 processes to achieve near-ideal ef-ficiencies as analysed on 30 unconstrained functions Gong[20] suggested a ldquonovel E-JAYA algorithm for the per-formance enhancement of the original JAYA algorithmrdquo(e proposed E-JAYA used the average of the better andworse groups to derive the best solution (e solutionprovided by the proposed E-JAYA had better accuracy thanthat of the original JAYA (e swarm behaviours wereconsidered in the E-JAYA rather than considering the best

and worst individual behaviours (e performance ofE-JAYA was assessed on 12 benchmark functions ofvarying dimensionality

Another study proposed an effective demand-sidemanagement scheme for residential HEMS [21] (esystem was proposed for peak creation prevention toreduce electricity bills (is study applied JAYA SBA andEDE to realise its objectives it also deployed the TOUpricing scheme for electricity bill computation From theresult JAYA was sufficient in reducing electricity bill andPAR thereby achieving customer satisfaction Further-more the SBA outperformed JAYA and EDE in achievinguser comfortability as it related negatively with an elec-tricity bill Yu et al [22] developed improved JAYA(IJAYA) for steady and accurate PV model parameterestimation by incorporating a self-adaptive weight for theadjustment of the propensity of reaching the best solutionand avoiding the bad solution while searching (e weighthelps in ensuring the framework achieves the possiblesearch region early and to perform local search laterFurthermore the algorithm contains a learning strategyderived from other individualsrsquo experiences which wasrandomly used for population diversity improvementTable 1 shows the lacks and limitation of IDS studiesmentioned in the related work

3 Feature Subset Selection Problem

(is section explains the representation of the featuresand the problem of choosing the best feature subset FSSrefers to the selection of feature subsets from a largerfeature set FSS reduces the number of features in adataset thereby preventing complex calculations andimproving the speed and performance of classifiersSeveral definitions of FSS exist in literature [23] somedefinitions deal with the reduction in size of the selectedsubset while others focus on the improvement of pre-diction accuracy FSS is essentially a process of con-structing an effective subset that represents theinformation contained in a dataset by eliminating re-dundant and irrelevant features FSS mainly aims atfinding the least number of features without having asignificant influence on classification accuracy Owing tothe complicated nature of optimal subset feature ex-traction as well as the nonexistence of a polynomial-timealgorithm for addressing it FSS has been classified as anNP-hard problem [24] (ere are four steps in typical FSS[23] the first step involves the selection of candidatefeatures that will constitute the subsets while the secondstep is the evaluation and comparison of these subsetswith each other In the third step a check is made for thesatisfaction of the termination condition otherwise thefirst and second steps will be repeated (e final stepchecks if the optimal feature subset has been establishedbased on prior knowledge With these two major aimsFSS can be considered a multiobjective problem A formaldefinition of finding optimal solutions through the sat-isfaction of both objectives is given in the followingequation

Complexity 3

Min (f1)

Max (f2)

subject to f1 |k|

f2 accuracy(k) where ksubeK

(1)

where k is the subset of the original dataset K which opti-mises f1 and f2 (the objectives)

(e establishment of the best solution or the decision onthe improved condition of a new individual is a complicatedtask in a multiobjective optimisation process (is is due tothe chances of enhancement in one objective causing areduction in the other

4 Improved TLBO Algorithm

(e ITLBO algorithm was executed at the FSS phase in thisstudy (e ITLBO algorithm was initialised by randomlygenerated initial population namely the teacher and a set ofstudents which represents the set of solutions To representthe features in the ITLBO algorithm ITLBO borrowed thecrossover and mutation operators from GA by representingthe features as chromosomes (one of the GA properties) Toupdate this chromosome crossover and mutation operatorswere used In the population (called a classroom) eachsolution is taken as an individualchromosome (Figure 1) Afeature gene of a chromosome with a value of 1 is consideredas selected while a value of 0 denotes otherwise Figure 1shows a sample of the dataset regarding Figure 2 features AB C D E I K and L were selected (their values are 1) whilefeatures F G H and J were not (their values are 0) (eTLBO algorithm runs through iterations where the teacher isthe best individual in the population and the rest of theindividuals become the students Having selected theteacher ITLBO works in three phases Teacher BestClassmates (Learner Phase 1) and Learner Phase 2 In theTeacher phase the teacher enhances the knowledge of eachstudent by sharing knowledge with them but in the BestClassmates phase two best students are selected andassigned the task of interacting with the other students Inthe Learner phase there is a random interaction among thestudents in a bid to enhance their levels of knowledge Newchromosomes are generated in the proposed ITLBO usingldquohalf-uniform crossover and bit-flip mutation operatorsrdquowhich are special crossover operators (Figures 3 and 4)Two-parent chromosomes (could be a teacher a student ortwo students) are needed for the crossover operator (ecrossover operator relies on the information of the two-parent chromosomes if both parents feature the same genethe gene is kept but whenever there are different featuregenes in both parents a parentrsquos gene is randomly chosenOnly one new chromosome is generated from this operation

(e ldquobit-flip mutationrdquo works on a single chromosome whentrying to manipulate a single gene based on a probabilisticratio If the gene has a zero value it will be updated as one orvice versa In the proposed ITLBO algorithm nondominatedsorting and selection were used (e dominance of an in-dividual over another individual is determined strictly on thebasis of whether a minimum of one of its objectives is su-perior to that of the other while keeping all the other ob-jectives the same

A nondominated scenario arises when there is nopossibility of an individual being dominated by another (efront line of the solution set is filled by the nondominatedindividuals (ose that are closest to the ideal point in thefront line are chosen as the teachers All the teachers teach allstudents discretely at the Teacher Best Classmate andLearner phases (e details of the ITLBO algorithm arepresented in Figures 5 and 6(e detail steps of ITLBO are asfollows

(i) Step 1 initialise the population randomly with eachpopulation having a different set of features from 1to a maximum number of features (41 in NSL-KDD) (is step is captured in line 2 of Figure 5

(ii) Step 2 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately and a crossover is applied with each oneand then a mutation is applied to all the resultingindividuals (e operators used are half-uniformcrossover and bit-flip mutation operators (repre-sented in lines 4 to 5 in Figure 5)

(iii) Step 3 check the population (chromosome) thatresults from the crossover and mutation if the newchromosome is better than the old then the new oneis kept otherwise the old one is retained All theaforementioned steps are collectively called theTeacher phase because all individuals learn from thebest one (the teacher) (is step is represented inlines 6 to 13 in Figure 5

(iv) Step 4 after that Learner Phase 1 or learning fromthe best classmates is started (is phase begins withthe fifth step which is the selection of the best twoindividuals as students and applying a crossoverbetween them followed by a mutation If the newone is better than the previous two students thenthe newer choice is kept otherwise the older bestchoice is kept (is process is repeated with all otherindividuals (students) At this point Learner Phase1 has terminated (viewed in lines 14 to 27 inFigure 5)

(v) Step 5 this step is Learner Phase 2 which involveschoosing two random individuals (students) be-tween whom a crossover is applied followed by amutation on the new individual If the new

Table 1 IDS existing work

Ref Limitation[6] Detection of SQL injection is low[7] Detection rate is low[8] Long training time

110100011111

Figure 1 Schematic representation of a chromosome 1 selectedfeatures 0 unselected features

4 Complexity

individual is better than the old two students thenthe new one is kept otherwise the best old one isretained (is step is repeated with all other

students At this point the main three stages ofITLBO have been completed and a check should becarried out on whether the termination criteria have

(1) Start (2) Initialize population(3) Calculate_weighted_average_of

_individuals (population)(4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i =1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation (Xnew)(10) if (Xnew is better than Xi) then(11) Xi = [Xnew](12) End if(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i =1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)(17) n = Select_best_individual_from (population) (18) n ne m ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation (Xnew)(21) if (Xnew is better than Xm) then(22) Xm = Xnew(23) End if

(24) if (Xnew is better than Xn) then (25) Xn = Xnew(26) End if (27) End for (28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m = Select_random_individual_from

(population)(31) n = Select_random_individual_from

(population) n ne m ne teacherlowast(32) Xnew = Crossover (Xm Xn)(33) Xnew = Mutation (Xnew)(34) if (Xnew is better than Xm) then(35) Xm = Xnew(36) End if (37) if (Xnew is better than Xn) then(38) Xn = Xnew(39) End if (40) End for (41) Is termination criterion satisfied if yes go to line

42 else continue(42) End for(43) Show_the_pareto_optimal_set (population)(44) End

Figure 5 ITLBO algorithm

Figure 2 Sample of the dataset

0100111111

1101110101

1100110111

Teacher

New student

Student

Figure 3 Crossover operator

0100111111

0100110111

Figure 4 Mutation operator

Complexity 5

been satisfied or not If the termination criteria weresatisfied proceed to the next step otherwise themain three stages are repeated (Teacher phaseLearner Phase 1 and Learner Phase 2) (is step isrepresented in lines 28 to 40 in Figure 5

(vi) Step 6 the final step is the application of non-dominated sorting to the result Nondominatedsortingmeans no result (individual) is better than all

other individuals (is step can be viewed in line 43in Figure 5

5 Parameter Optimisation

After selecting the optimal subset feature several SVMparameters will be tuned(e tuning of SVM parameters is aproblem which can determine algorithm performance (e

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Is new one better thanold one

Keep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Is new one better thanold one

Keep new oneKeep old one

Is the termination criterion satisfied

Is new one better than the worse student based on new

value of C and YKeep new oneKeep old one

Is the terminationcriterion satisfied

End

YesNo

YesNo

No Yes

YesNo

No

Crossover teacher with all other individuals (student) separately and apply mutation

Apply nondominated sorting and find the Pareto setYes

Initialize population randomly

Figure 6 ITLBO flowchart

6 Complexity

radial basis function (RBF) kernel function of the SVM isemployed for the conversion of the completely nonseparableproblem into a separable or approximate separable state(eRBF kernel parameter c suggests data distribution to a newfeature space while parameter C suggests the level of penaltyfor the classification error in the linear nonseparable caseEquations (2) and (3) represent the cost and gamma re-spectively In the next section the two parameters (C and c)were tuned by using the IPJAYA algorithm

w b12

w22

+ C 1113944n

ζnstyn wtxn + b1113872 1113873ge 1 minus ζn (2)

k xn xm( 1113857 exp minusc xn minus xm

23

1113874 1113875 (3)

6 Improved Parallel JAYA Algorithm

(e JAYA algorithm needs improvements to work betterOne of the observations on the JAYA algorithm is that if wesort the populations from best to worst and divide them intotwo groups the best and the worst solutions Obviously theoptimal solution is located in the best solution group [2]Based on this observation an improvement has been done inthe JAYA algorithm rather than selecting the best and worstcases from the whole solutions which puts the worst so-lution further from the best solution and increase the it-erations needed to reach the optimal solution the solutionswere divided into two groups (e best solution is chosenfrom the best solution group as ldquoBestrdquo and the best solutionfrom the worst solution group is the ldquoWorstrdquo (is proce-dure reserves the populationrsquos diversity and makes the so-lution start from a point closer to the optimal solution anddecreases the number of iterations needed to reach theoptimal solution In the proposed work JAYA algorithmwas improved to optimise two parameters of the SVMclassifier simultaneously Figure 7 shows the flowchart ofIPJAYA while Figure 8 shows the IPJAYA algorithm fol-lowed by the detailed steps of IPJAYA

(e detailed steps of IPJAYA are shown as follows(i) Step 1 select the population size and the number of

design variables as well as initialise the terminationcondition To explain the parameter optimisation indetail we assume the following scenario populationsize 3 design variables 2 and terminationcriterion 2 iterations (e value of the populationis the value of parameters C and c in this scenarioeach one has 3 values (ese values were initialisedrandomly for C between 0001 to 100 and for c

between 00001 to 64 Table 2 shows the values of Cand c

(ii) Step 2 SVM needs three things to classify any la-belled data ie features to choose value of C pa-rameter and value of c parameter (is step can beviewed in line 2 of Figure 8

(iii) Step 3 the next step is to evaluate each value forboth C and c separately by using the SVM and on

the first student from Learner Phase 2 after applyingcrossover and mutation as shown in Table 3To continue the optimisation process the pop-ulation is arranged from best to worst and split intotwo groups (Best and Worst groups) as shown inTable 4(e same procedure is repeated for c parameter andthis time C is by default and the new value of c is11006

(iv) Tables 5 and 6 show the details of c parameter(v) Step 4 the result will be considered as the objective

function for both C and c and then compared withother populations and continued until the termi-nation criterion is satisfied (is step can be viewedin line 7 of Figure 8

(ese two new values for C and c will be evaluated usingthe same subset feature at the same time as shown in Table 7(is step can be viewed in lines 5 to 6 of Figure 8

7 The Proposed Method

(is section describes the proposed combination of threedifferent algorithms Each algorithm has a different task todo and these tasks complete the work of the model (e firstalgorithm is the ITLBO whose task is to choose the optimalsubset feature from the whole features(e second algorithmis IPJAYA algorithm and its task is to optimise the pa-rameters of the SVM (e third algorithm is the SVMclassifier which takes the outcome of the first two algorithmsto determine if the processed traffic is intrusion or normaltraffic Figure 9 shows the flowchart of the proposed methodFigure 10 shows the pseudo-code of the proposed methodwhile Figure 11 details the proposed steps of the IPJAYA-ITLBO-SVM method

(e detail steps of ITLBO-IPJAYA-SVM are asfollows

(i) Step 1 initialise the population randomly Eachpopulation is a different set of features from 1 tothe maximum number of features (41 in NSL-KDD) (is step can be viewed in line 2 ofFigure 10

(ii) Step 2 calculate the weighted average of everyindividual population (is step can be viewed inline 3 of Figure 10

(iii) Step 3 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately Apply crossover with each one andthen apply mutation to all resulted individuals(ecrossover used is called the half-uniform crossoverand bit-flip mutation operator (is step can beviewed in lines 4 to 5 of Figure 10

(iv) Step 4 check the population (chromosome)resulting from crossover and mutation if the newone is better than the old one keep the new oneotherwise retain the old one (e best and worstpopulations refer to the degree of accuracy during

Complexity 7

Start

Initialize population size number of design variables and termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution for C Calculate new worst solution for γ

Modify the solution based on eq 4 Modify the solution based on eq 4

Use the new value of C and γ do the classification

Is new Acc Better than old Acc

Keep new C and γKeep old C and γ

Is the termination criterionsatisfied

Return best value of C and γ

End

YesNo

NO

Yes

Parallel pool

Figure 7 IPJAYA flowchart

(1) Start (2) Initialise the population size number of designed

variables and termination criteria (3) Repeat Steps 3ndash6 until the termination criteria are met (4) Arrange the solutions from best to worst and split the

solutions into two groups -best and worst solutions(5) Make the best solution in best group as best and

make the best solution in worst group as worst(6) Modify the solution based on the following equation

Yprimej k I = Yj k I + r1 k I (Yj best I ndash | Yj k I |) ndashr2 k I (Yj worst I ndash | Yj k I|)

(7) Update the previous solution if Yprimej k I gt Yj k Iotherwise do not update the previous solution

(8) Display the established optimum solution(9) End

Figure 8 IPJAYA algorithm

Table 2 C and c values

C c

20 101 210 0701 1

Table 3 Evaluation of C

C c Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

Table 4 Best and Worst groups for C

C cSubsetfeature

Accuracy (objectivefunction)

10 Default Fixed 0994 best of best Best group01 Default Fixed 09920 Default Fixed 097 best of worst Worst

group1 Default Fixed 0899

Table 5 Accuracy based on c

C γ Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

8 Complexity

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 3: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

study also examined the feasibility of the proposed modelusing multicommodity futures index data retrieved fromMCX Rao et al [12] confirmed the superiority of the modelcompared to some population-inspired optimisationframeworks Rao and Patel [13] investigated the effect ofsample size and number of generations on algorithmicperformance and concluded that this algorithm can beeasily applied to several optimisation cases Crepinsek et al[14] solved the problems presented in [9 12] by usingTLBO Nayak et al [15] developed a multiobjective TLBOin which a matrix of solutions was created for each ob-jective (e teacher selection process in TLBO is mainlybased on the best solution presented in the solution spaceand learners are taught to merely maximise that objectiveAll the available solutions in the solution space were sortedto generate a collection of optimal solutions Xu et al [16]presented multiobjective TLBO based on different teachingtechniques (ey used a crossover operator (rather than ascalar function) between solutions in the teaching andlearning phases Kiziloz et al [17] suggested three multi-objective TLBO algorithms for FSS in binary classification(FSS-BCP) Among the presented methods a multi-objective TLBO with scalar transformation was found to bethe fastest algorithm although it provided a limitednumber of nondominated solutions Multiobjective TLBOwith nondominated selection (MTLBO-NS) explores thesolution space and produces a set of nondominated so-lutions but requires a long execution time MultiobjectiveTLBO with minimum distance (MTLBO-MD) generatessolutions that are similar to those of MTLBO-NS but in asignificantly shorter time (e proposed multiobjectiveTLBO algorithms have been evaluated in terms of per-formance using LR SVM and extreme learning machine(ELM) Wang et al suggested a novel ldquoalcoholism iden-tification method from healthy controls based on a com-puter-vision approachrdquo [18] (is approach relied on threecomponentsmdashthe proposed wavelet Renyi entropy feed-forward neural network and the proposed three-segmentencoded JAYA algorithm (e results showed the proposedmethod exhibits good sensitivity but the accuracy stillneeds improvements Migallon et al [19] developed parallelalgorithms and presented their detailed analysis (eydeveloped a hybrid algorithm that exploited inherentparallelism at two different levels (e lower level wasexploited by parallel shared-memory platforms while theupper level was exploited by distributed shared memoryplatforms (e results of both algorithms were good es-pecially in scalability Hence the proposed hybrid algo-rithm successfully used a number of processes with near-perfect efficiencies (e experiments showed that themethod used about 60 processes to achieve near-ideal ef-ficiencies as analysed on 30 unconstrained functions Gong[20] suggested a ldquonovel E-JAYA algorithm for the per-formance enhancement of the original JAYA algorithmrdquo(e proposed E-JAYA used the average of the better andworse groups to derive the best solution (e solutionprovided by the proposed E-JAYA had better accuracy thanthat of the original JAYA (e swarm behaviours wereconsidered in the E-JAYA rather than considering the best

and worst individual behaviours (e performance ofE-JAYA was assessed on 12 benchmark functions ofvarying dimensionality

Another study proposed an effective demand-sidemanagement scheme for residential HEMS [21] (esystem was proposed for peak creation prevention toreduce electricity bills (is study applied JAYA SBA andEDE to realise its objectives it also deployed the TOUpricing scheme for electricity bill computation From theresult JAYA was sufficient in reducing electricity bill andPAR thereby achieving customer satisfaction Further-more the SBA outperformed JAYA and EDE in achievinguser comfortability as it related negatively with an elec-tricity bill Yu et al [22] developed improved JAYA(IJAYA) for steady and accurate PV model parameterestimation by incorporating a self-adaptive weight for theadjustment of the propensity of reaching the best solutionand avoiding the bad solution while searching (e weighthelps in ensuring the framework achieves the possiblesearch region early and to perform local search laterFurthermore the algorithm contains a learning strategyderived from other individualsrsquo experiences which wasrandomly used for population diversity improvementTable 1 shows the lacks and limitation of IDS studiesmentioned in the related work

3 Feature Subset Selection Problem

(is section explains the representation of the featuresand the problem of choosing the best feature subset FSSrefers to the selection of feature subsets from a largerfeature set FSS reduces the number of features in adataset thereby preventing complex calculations andimproving the speed and performance of classifiersSeveral definitions of FSS exist in literature [23] somedefinitions deal with the reduction in size of the selectedsubset while others focus on the improvement of pre-diction accuracy FSS is essentially a process of con-structing an effective subset that represents theinformation contained in a dataset by eliminating re-dundant and irrelevant features FSS mainly aims atfinding the least number of features without having asignificant influence on classification accuracy Owing tothe complicated nature of optimal subset feature ex-traction as well as the nonexistence of a polynomial-timealgorithm for addressing it FSS has been classified as anNP-hard problem [24] (ere are four steps in typical FSS[23] the first step involves the selection of candidatefeatures that will constitute the subsets while the secondstep is the evaluation and comparison of these subsetswith each other In the third step a check is made for thesatisfaction of the termination condition otherwise thefirst and second steps will be repeated (e final stepchecks if the optimal feature subset has been establishedbased on prior knowledge With these two major aimsFSS can be considered a multiobjective problem A formaldefinition of finding optimal solutions through the sat-isfaction of both objectives is given in the followingequation

Complexity 3

Min (f1)

Max (f2)

subject to f1 |k|

f2 accuracy(k) where ksubeK

(1)

where k is the subset of the original dataset K which opti-mises f1 and f2 (the objectives)

(e establishment of the best solution or the decision onthe improved condition of a new individual is a complicatedtask in a multiobjective optimisation process (is is due tothe chances of enhancement in one objective causing areduction in the other

4 Improved TLBO Algorithm

(e ITLBO algorithm was executed at the FSS phase in thisstudy (e ITLBO algorithm was initialised by randomlygenerated initial population namely the teacher and a set ofstudents which represents the set of solutions To representthe features in the ITLBO algorithm ITLBO borrowed thecrossover and mutation operators from GA by representingthe features as chromosomes (one of the GA properties) Toupdate this chromosome crossover and mutation operatorswere used In the population (called a classroom) eachsolution is taken as an individualchromosome (Figure 1) Afeature gene of a chromosome with a value of 1 is consideredas selected while a value of 0 denotes otherwise Figure 1shows a sample of the dataset regarding Figure 2 features AB C D E I K and L were selected (their values are 1) whilefeatures F G H and J were not (their values are 0) (eTLBO algorithm runs through iterations where the teacher isthe best individual in the population and the rest of theindividuals become the students Having selected theteacher ITLBO works in three phases Teacher BestClassmates (Learner Phase 1) and Learner Phase 2 In theTeacher phase the teacher enhances the knowledge of eachstudent by sharing knowledge with them but in the BestClassmates phase two best students are selected andassigned the task of interacting with the other students Inthe Learner phase there is a random interaction among thestudents in a bid to enhance their levels of knowledge Newchromosomes are generated in the proposed ITLBO usingldquohalf-uniform crossover and bit-flip mutation operatorsrdquowhich are special crossover operators (Figures 3 and 4)Two-parent chromosomes (could be a teacher a student ortwo students) are needed for the crossover operator (ecrossover operator relies on the information of the two-parent chromosomes if both parents feature the same genethe gene is kept but whenever there are different featuregenes in both parents a parentrsquos gene is randomly chosenOnly one new chromosome is generated from this operation

(e ldquobit-flip mutationrdquo works on a single chromosome whentrying to manipulate a single gene based on a probabilisticratio If the gene has a zero value it will be updated as one orvice versa In the proposed ITLBO algorithm nondominatedsorting and selection were used (e dominance of an in-dividual over another individual is determined strictly on thebasis of whether a minimum of one of its objectives is su-perior to that of the other while keeping all the other ob-jectives the same

A nondominated scenario arises when there is nopossibility of an individual being dominated by another (efront line of the solution set is filled by the nondominatedindividuals (ose that are closest to the ideal point in thefront line are chosen as the teachers All the teachers teach allstudents discretely at the Teacher Best Classmate andLearner phases (e details of the ITLBO algorithm arepresented in Figures 5 and 6(e detail steps of ITLBO are asfollows

(i) Step 1 initialise the population randomly with eachpopulation having a different set of features from 1to a maximum number of features (41 in NSL-KDD) (is step is captured in line 2 of Figure 5

(ii) Step 2 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately and a crossover is applied with each oneand then a mutation is applied to all the resultingindividuals (e operators used are half-uniformcrossover and bit-flip mutation operators (repre-sented in lines 4 to 5 in Figure 5)

(iii) Step 3 check the population (chromosome) thatresults from the crossover and mutation if the newchromosome is better than the old then the new oneis kept otherwise the old one is retained All theaforementioned steps are collectively called theTeacher phase because all individuals learn from thebest one (the teacher) (is step is represented inlines 6 to 13 in Figure 5

(iv) Step 4 after that Learner Phase 1 or learning fromthe best classmates is started (is phase begins withthe fifth step which is the selection of the best twoindividuals as students and applying a crossoverbetween them followed by a mutation If the newone is better than the previous two students thenthe newer choice is kept otherwise the older bestchoice is kept (is process is repeated with all otherindividuals (students) At this point Learner Phase1 has terminated (viewed in lines 14 to 27 inFigure 5)

(v) Step 5 this step is Learner Phase 2 which involveschoosing two random individuals (students) be-tween whom a crossover is applied followed by amutation on the new individual If the new

Table 1 IDS existing work

Ref Limitation[6] Detection of SQL injection is low[7] Detection rate is low[8] Long training time

110100011111

Figure 1 Schematic representation of a chromosome 1 selectedfeatures 0 unselected features

4 Complexity

individual is better than the old two students thenthe new one is kept otherwise the best old one isretained (is step is repeated with all other

students At this point the main three stages ofITLBO have been completed and a check should becarried out on whether the termination criteria have

(1) Start (2) Initialize population(3) Calculate_weighted_average_of

_individuals (population)(4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i =1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation (Xnew)(10) if (Xnew is better than Xi) then(11) Xi = [Xnew](12) End if(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i =1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)(17) n = Select_best_individual_from (population) (18) n ne m ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation (Xnew)(21) if (Xnew is better than Xm) then(22) Xm = Xnew(23) End if

(24) if (Xnew is better than Xn) then (25) Xn = Xnew(26) End if (27) End for (28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m = Select_random_individual_from

(population)(31) n = Select_random_individual_from

(population) n ne m ne teacherlowast(32) Xnew = Crossover (Xm Xn)(33) Xnew = Mutation (Xnew)(34) if (Xnew is better than Xm) then(35) Xm = Xnew(36) End if (37) if (Xnew is better than Xn) then(38) Xn = Xnew(39) End if (40) End for (41) Is termination criterion satisfied if yes go to line

42 else continue(42) End for(43) Show_the_pareto_optimal_set (population)(44) End

Figure 5 ITLBO algorithm

Figure 2 Sample of the dataset

0100111111

1101110101

1100110111

Teacher

New student

Student

Figure 3 Crossover operator

0100111111

0100110111

Figure 4 Mutation operator

Complexity 5

been satisfied or not If the termination criteria weresatisfied proceed to the next step otherwise themain three stages are repeated (Teacher phaseLearner Phase 1 and Learner Phase 2) (is step isrepresented in lines 28 to 40 in Figure 5

(vi) Step 6 the final step is the application of non-dominated sorting to the result Nondominatedsortingmeans no result (individual) is better than all

other individuals (is step can be viewed in line 43in Figure 5

5 Parameter Optimisation

After selecting the optimal subset feature several SVMparameters will be tuned(e tuning of SVM parameters is aproblem which can determine algorithm performance (e

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Is new one better thanold one

Keep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Is new one better thanold one

Keep new oneKeep old one

Is the termination criterion satisfied

Is new one better than the worse student based on new

value of C and YKeep new oneKeep old one

Is the terminationcriterion satisfied

End

YesNo

YesNo

No Yes

YesNo

No

Crossover teacher with all other individuals (student) separately and apply mutation

Apply nondominated sorting and find the Pareto setYes

Initialize population randomly

Figure 6 ITLBO flowchart

6 Complexity

radial basis function (RBF) kernel function of the SVM isemployed for the conversion of the completely nonseparableproblem into a separable or approximate separable state(eRBF kernel parameter c suggests data distribution to a newfeature space while parameter C suggests the level of penaltyfor the classification error in the linear nonseparable caseEquations (2) and (3) represent the cost and gamma re-spectively In the next section the two parameters (C and c)were tuned by using the IPJAYA algorithm

w b12

w22

+ C 1113944n

ζnstyn wtxn + b1113872 1113873ge 1 minus ζn (2)

k xn xm( 1113857 exp minusc xn minus xm

23

1113874 1113875 (3)

6 Improved Parallel JAYA Algorithm

(e JAYA algorithm needs improvements to work betterOne of the observations on the JAYA algorithm is that if wesort the populations from best to worst and divide them intotwo groups the best and the worst solutions Obviously theoptimal solution is located in the best solution group [2]Based on this observation an improvement has been done inthe JAYA algorithm rather than selecting the best and worstcases from the whole solutions which puts the worst so-lution further from the best solution and increase the it-erations needed to reach the optimal solution the solutionswere divided into two groups (e best solution is chosenfrom the best solution group as ldquoBestrdquo and the best solutionfrom the worst solution group is the ldquoWorstrdquo (is proce-dure reserves the populationrsquos diversity and makes the so-lution start from a point closer to the optimal solution anddecreases the number of iterations needed to reach theoptimal solution In the proposed work JAYA algorithmwas improved to optimise two parameters of the SVMclassifier simultaneously Figure 7 shows the flowchart ofIPJAYA while Figure 8 shows the IPJAYA algorithm fol-lowed by the detailed steps of IPJAYA

(e detailed steps of IPJAYA are shown as follows(i) Step 1 select the population size and the number of

design variables as well as initialise the terminationcondition To explain the parameter optimisation indetail we assume the following scenario populationsize 3 design variables 2 and terminationcriterion 2 iterations (e value of the populationis the value of parameters C and c in this scenarioeach one has 3 values (ese values were initialisedrandomly for C between 0001 to 100 and for c

between 00001 to 64 Table 2 shows the values of Cand c

(ii) Step 2 SVM needs three things to classify any la-belled data ie features to choose value of C pa-rameter and value of c parameter (is step can beviewed in line 2 of Figure 8

(iii) Step 3 the next step is to evaluate each value forboth C and c separately by using the SVM and on

the first student from Learner Phase 2 after applyingcrossover and mutation as shown in Table 3To continue the optimisation process the pop-ulation is arranged from best to worst and split intotwo groups (Best and Worst groups) as shown inTable 4(e same procedure is repeated for c parameter andthis time C is by default and the new value of c is11006

(iv) Tables 5 and 6 show the details of c parameter(v) Step 4 the result will be considered as the objective

function for both C and c and then compared withother populations and continued until the termi-nation criterion is satisfied (is step can be viewedin line 7 of Figure 8

(ese two new values for C and c will be evaluated usingthe same subset feature at the same time as shown in Table 7(is step can be viewed in lines 5 to 6 of Figure 8

7 The Proposed Method

(is section describes the proposed combination of threedifferent algorithms Each algorithm has a different task todo and these tasks complete the work of the model (e firstalgorithm is the ITLBO whose task is to choose the optimalsubset feature from the whole features(e second algorithmis IPJAYA algorithm and its task is to optimise the pa-rameters of the SVM (e third algorithm is the SVMclassifier which takes the outcome of the first two algorithmsto determine if the processed traffic is intrusion or normaltraffic Figure 9 shows the flowchart of the proposed methodFigure 10 shows the pseudo-code of the proposed methodwhile Figure 11 details the proposed steps of the IPJAYA-ITLBO-SVM method

(e detail steps of ITLBO-IPJAYA-SVM are asfollows

(i) Step 1 initialise the population randomly Eachpopulation is a different set of features from 1 tothe maximum number of features (41 in NSL-KDD) (is step can be viewed in line 2 ofFigure 10

(ii) Step 2 calculate the weighted average of everyindividual population (is step can be viewed inline 3 of Figure 10

(iii) Step 3 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately Apply crossover with each one andthen apply mutation to all resulted individuals(ecrossover used is called the half-uniform crossoverand bit-flip mutation operator (is step can beviewed in lines 4 to 5 of Figure 10

(iv) Step 4 check the population (chromosome)resulting from crossover and mutation if the newone is better than the old one keep the new oneotherwise retain the old one (e best and worstpopulations refer to the degree of accuracy during

Complexity 7

Start

Initialize population size number of design variables and termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution for C Calculate new worst solution for γ

Modify the solution based on eq 4 Modify the solution based on eq 4

Use the new value of C and γ do the classification

Is new Acc Better than old Acc

Keep new C and γKeep old C and γ

Is the termination criterionsatisfied

Return best value of C and γ

End

YesNo

NO

Yes

Parallel pool

Figure 7 IPJAYA flowchart

(1) Start (2) Initialise the population size number of designed

variables and termination criteria (3) Repeat Steps 3ndash6 until the termination criteria are met (4) Arrange the solutions from best to worst and split the

solutions into two groups -best and worst solutions(5) Make the best solution in best group as best and

make the best solution in worst group as worst(6) Modify the solution based on the following equation

Yprimej k I = Yj k I + r1 k I (Yj best I ndash | Yj k I |) ndashr2 k I (Yj worst I ndash | Yj k I|)

(7) Update the previous solution if Yprimej k I gt Yj k Iotherwise do not update the previous solution

(8) Display the established optimum solution(9) End

Figure 8 IPJAYA algorithm

Table 2 C and c values

C c

20 101 210 0701 1

Table 3 Evaluation of C

C c Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

Table 4 Best and Worst groups for C

C cSubsetfeature

Accuracy (objectivefunction)

10 Default Fixed 0994 best of best Best group01 Default Fixed 09920 Default Fixed 097 best of worst Worst

group1 Default Fixed 0899

Table 5 Accuracy based on c

C γ Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

8 Complexity

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 4: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

Min (f1)

Max (f2)

subject to f1 |k|

f2 accuracy(k) where ksubeK

(1)

where k is the subset of the original dataset K which opti-mises f1 and f2 (the objectives)

(e establishment of the best solution or the decision onthe improved condition of a new individual is a complicatedtask in a multiobjective optimisation process (is is due tothe chances of enhancement in one objective causing areduction in the other

4 Improved TLBO Algorithm

(e ITLBO algorithm was executed at the FSS phase in thisstudy (e ITLBO algorithm was initialised by randomlygenerated initial population namely the teacher and a set ofstudents which represents the set of solutions To representthe features in the ITLBO algorithm ITLBO borrowed thecrossover and mutation operators from GA by representingthe features as chromosomes (one of the GA properties) Toupdate this chromosome crossover and mutation operatorswere used In the population (called a classroom) eachsolution is taken as an individualchromosome (Figure 1) Afeature gene of a chromosome with a value of 1 is consideredas selected while a value of 0 denotes otherwise Figure 1shows a sample of the dataset regarding Figure 2 features AB C D E I K and L were selected (their values are 1) whilefeatures F G H and J were not (their values are 0) (eTLBO algorithm runs through iterations where the teacher isthe best individual in the population and the rest of theindividuals become the students Having selected theteacher ITLBO works in three phases Teacher BestClassmates (Learner Phase 1) and Learner Phase 2 In theTeacher phase the teacher enhances the knowledge of eachstudent by sharing knowledge with them but in the BestClassmates phase two best students are selected andassigned the task of interacting with the other students Inthe Learner phase there is a random interaction among thestudents in a bid to enhance their levels of knowledge Newchromosomes are generated in the proposed ITLBO usingldquohalf-uniform crossover and bit-flip mutation operatorsrdquowhich are special crossover operators (Figures 3 and 4)Two-parent chromosomes (could be a teacher a student ortwo students) are needed for the crossover operator (ecrossover operator relies on the information of the two-parent chromosomes if both parents feature the same genethe gene is kept but whenever there are different featuregenes in both parents a parentrsquos gene is randomly chosenOnly one new chromosome is generated from this operation

(e ldquobit-flip mutationrdquo works on a single chromosome whentrying to manipulate a single gene based on a probabilisticratio If the gene has a zero value it will be updated as one orvice versa In the proposed ITLBO algorithm nondominatedsorting and selection were used (e dominance of an in-dividual over another individual is determined strictly on thebasis of whether a minimum of one of its objectives is su-perior to that of the other while keeping all the other ob-jectives the same

A nondominated scenario arises when there is nopossibility of an individual being dominated by another (efront line of the solution set is filled by the nondominatedindividuals (ose that are closest to the ideal point in thefront line are chosen as the teachers All the teachers teach allstudents discretely at the Teacher Best Classmate andLearner phases (e details of the ITLBO algorithm arepresented in Figures 5 and 6(e detail steps of ITLBO are asfollows

(i) Step 1 initialise the population randomly with eachpopulation having a different set of features from 1to a maximum number of features (41 in NSL-KDD) (is step is captured in line 2 of Figure 5

(ii) Step 2 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately and a crossover is applied with each oneand then a mutation is applied to all the resultingindividuals (e operators used are half-uniformcrossover and bit-flip mutation operators (repre-sented in lines 4 to 5 in Figure 5)

(iii) Step 3 check the population (chromosome) thatresults from the crossover and mutation if the newchromosome is better than the old then the new oneis kept otherwise the old one is retained All theaforementioned steps are collectively called theTeacher phase because all individuals learn from thebest one (the teacher) (is step is represented inlines 6 to 13 in Figure 5

(iv) Step 4 after that Learner Phase 1 or learning fromthe best classmates is started (is phase begins withthe fifth step which is the selection of the best twoindividuals as students and applying a crossoverbetween them followed by a mutation If the newone is better than the previous two students thenthe newer choice is kept otherwise the older bestchoice is kept (is process is repeated with all otherindividuals (students) At this point Learner Phase1 has terminated (viewed in lines 14 to 27 inFigure 5)

(v) Step 5 this step is Learner Phase 2 which involveschoosing two random individuals (students) be-tween whom a crossover is applied followed by amutation on the new individual If the new

Table 1 IDS existing work

Ref Limitation[6] Detection of SQL injection is low[7] Detection rate is low[8] Long training time

110100011111

Figure 1 Schematic representation of a chromosome 1 selectedfeatures 0 unselected features

4 Complexity

individual is better than the old two students thenthe new one is kept otherwise the best old one isretained (is step is repeated with all other

students At this point the main three stages ofITLBO have been completed and a check should becarried out on whether the termination criteria have

(1) Start (2) Initialize population(3) Calculate_weighted_average_of

_individuals (population)(4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i =1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation (Xnew)(10) if (Xnew is better than Xi) then(11) Xi = [Xnew](12) End if(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i =1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)(17) n = Select_best_individual_from (population) (18) n ne m ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation (Xnew)(21) if (Xnew is better than Xm) then(22) Xm = Xnew(23) End if

(24) if (Xnew is better than Xn) then (25) Xn = Xnew(26) End if (27) End for (28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m = Select_random_individual_from

(population)(31) n = Select_random_individual_from

(population) n ne m ne teacherlowast(32) Xnew = Crossover (Xm Xn)(33) Xnew = Mutation (Xnew)(34) if (Xnew is better than Xm) then(35) Xm = Xnew(36) End if (37) if (Xnew is better than Xn) then(38) Xn = Xnew(39) End if (40) End for (41) Is termination criterion satisfied if yes go to line

42 else continue(42) End for(43) Show_the_pareto_optimal_set (population)(44) End

Figure 5 ITLBO algorithm

Figure 2 Sample of the dataset

0100111111

1101110101

1100110111

Teacher

New student

Student

Figure 3 Crossover operator

0100111111

0100110111

Figure 4 Mutation operator

Complexity 5

been satisfied or not If the termination criteria weresatisfied proceed to the next step otherwise themain three stages are repeated (Teacher phaseLearner Phase 1 and Learner Phase 2) (is step isrepresented in lines 28 to 40 in Figure 5

(vi) Step 6 the final step is the application of non-dominated sorting to the result Nondominatedsortingmeans no result (individual) is better than all

other individuals (is step can be viewed in line 43in Figure 5

5 Parameter Optimisation

After selecting the optimal subset feature several SVMparameters will be tuned(e tuning of SVM parameters is aproblem which can determine algorithm performance (e

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Is new one better thanold one

Keep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Is new one better thanold one

Keep new oneKeep old one

Is the termination criterion satisfied

Is new one better than the worse student based on new

value of C and YKeep new oneKeep old one

Is the terminationcriterion satisfied

End

YesNo

YesNo

No Yes

YesNo

No

Crossover teacher with all other individuals (student) separately and apply mutation

Apply nondominated sorting and find the Pareto setYes

Initialize population randomly

Figure 6 ITLBO flowchart

6 Complexity

radial basis function (RBF) kernel function of the SVM isemployed for the conversion of the completely nonseparableproblem into a separable or approximate separable state(eRBF kernel parameter c suggests data distribution to a newfeature space while parameter C suggests the level of penaltyfor the classification error in the linear nonseparable caseEquations (2) and (3) represent the cost and gamma re-spectively In the next section the two parameters (C and c)were tuned by using the IPJAYA algorithm

w b12

w22

+ C 1113944n

ζnstyn wtxn + b1113872 1113873ge 1 minus ζn (2)

k xn xm( 1113857 exp minusc xn minus xm

23

1113874 1113875 (3)

6 Improved Parallel JAYA Algorithm

(e JAYA algorithm needs improvements to work betterOne of the observations on the JAYA algorithm is that if wesort the populations from best to worst and divide them intotwo groups the best and the worst solutions Obviously theoptimal solution is located in the best solution group [2]Based on this observation an improvement has been done inthe JAYA algorithm rather than selecting the best and worstcases from the whole solutions which puts the worst so-lution further from the best solution and increase the it-erations needed to reach the optimal solution the solutionswere divided into two groups (e best solution is chosenfrom the best solution group as ldquoBestrdquo and the best solutionfrom the worst solution group is the ldquoWorstrdquo (is proce-dure reserves the populationrsquos diversity and makes the so-lution start from a point closer to the optimal solution anddecreases the number of iterations needed to reach theoptimal solution In the proposed work JAYA algorithmwas improved to optimise two parameters of the SVMclassifier simultaneously Figure 7 shows the flowchart ofIPJAYA while Figure 8 shows the IPJAYA algorithm fol-lowed by the detailed steps of IPJAYA

(e detailed steps of IPJAYA are shown as follows(i) Step 1 select the population size and the number of

design variables as well as initialise the terminationcondition To explain the parameter optimisation indetail we assume the following scenario populationsize 3 design variables 2 and terminationcriterion 2 iterations (e value of the populationis the value of parameters C and c in this scenarioeach one has 3 values (ese values were initialisedrandomly for C between 0001 to 100 and for c

between 00001 to 64 Table 2 shows the values of Cand c

(ii) Step 2 SVM needs three things to classify any la-belled data ie features to choose value of C pa-rameter and value of c parameter (is step can beviewed in line 2 of Figure 8

(iii) Step 3 the next step is to evaluate each value forboth C and c separately by using the SVM and on

the first student from Learner Phase 2 after applyingcrossover and mutation as shown in Table 3To continue the optimisation process the pop-ulation is arranged from best to worst and split intotwo groups (Best and Worst groups) as shown inTable 4(e same procedure is repeated for c parameter andthis time C is by default and the new value of c is11006

(iv) Tables 5 and 6 show the details of c parameter(v) Step 4 the result will be considered as the objective

function for both C and c and then compared withother populations and continued until the termi-nation criterion is satisfied (is step can be viewedin line 7 of Figure 8

(ese two new values for C and c will be evaluated usingthe same subset feature at the same time as shown in Table 7(is step can be viewed in lines 5 to 6 of Figure 8

7 The Proposed Method

(is section describes the proposed combination of threedifferent algorithms Each algorithm has a different task todo and these tasks complete the work of the model (e firstalgorithm is the ITLBO whose task is to choose the optimalsubset feature from the whole features(e second algorithmis IPJAYA algorithm and its task is to optimise the pa-rameters of the SVM (e third algorithm is the SVMclassifier which takes the outcome of the first two algorithmsto determine if the processed traffic is intrusion or normaltraffic Figure 9 shows the flowchart of the proposed methodFigure 10 shows the pseudo-code of the proposed methodwhile Figure 11 details the proposed steps of the IPJAYA-ITLBO-SVM method

(e detail steps of ITLBO-IPJAYA-SVM are asfollows

(i) Step 1 initialise the population randomly Eachpopulation is a different set of features from 1 tothe maximum number of features (41 in NSL-KDD) (is step can be viewed in line 2 ofFigure 10

(ii) Step 2 calculate the weighted average of everyindividual population (is step can be viewed inline 3 of Figure 10

(iii) Step 3 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately Apply crossover with each one andthen apply mutation to all resulted individuals(ecrossover used is called the half-uniform crossoverand bit-flip mutation operator (is step can beviewed in lines 4 to 5 of Figure 10

(iv) Step 4 check the population (chromosome)resulting from crossover and mutation if the newone is better than the old one keep the new oneotherwise retain the old one (e best and worstpopulations refer to the degree of accuracy during

Complexity 7

Start

Initialize population size number of design variables and termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution for C Calculate new worst solution for γ

Modify the solution based on eq 4 Modify the solution based on eq 4

Use the new value of C and γ do the classification

Is new Acc Better than old Acc

Keep new C and γKeep old C and γ

Is the termination criterionsatisfied

Return best value of C and γ

End

YesNo

NO

Yes

Parallel pool

Figure 7 IPJAYA flowchart

(1) Start (2) Initialise the population size number of designed

variables and termination criteria (3) Repeat Steps 3ndash6 until the termination criteria are met (4) Arrange the solutions from best to worst and split the

solutions into two groups -best and worst solutions(5) Make the best solution in best group as best and

make the best solution in worst group as worst(6) Modify the solution based on the following equation

Yprimej k I = Yj k I + r1 k I (Yj best I ndash | Yj k I |) ndashr2 k I (Yj worst I ndash | Yj k I|)

(7) Update the previous solution if Yprimej k I gt Yj k Iotherwise do not update the previous solution

(8) Display the established optimum solution(9) End

Figure 8 IPJAYA algorithm

Table 2 C and c values

C c

20 101 210 0701 1

Table 3 Evaluation of C

C c Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

Table 4 Best and Worst groups for C

C cSubsetfeature

Accuracy (objectivefunction)

10 Default Fixed 0994 best of best Best group01 Default Fixed 09920 Default Fixed 097 best of worst Worst

group1 Default Fixed 0899

Table 5 Accuracy based on c

C γ Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

8 Complexity

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 5: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

individual is better than the old two students thenthe new one is kept otherwise the best old one isretained (is step is repeated with all other

students At this point the main three stages ofITLBO have been completed and a check should becarried out on whether the termination criteria have

(1) Start (2) Initialize population(3) Calculate_weighted_average_of

_individuals (population)(4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i =1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation (Xnew)(10) if (Xnew is better than Xi) then(11) Xi = [Xnew](12) End if(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i =1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)(17) n = Select_best_individual_from (population) (18) n ne m ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation (Xnew)(21) if (Xnew is better than Xm) then(22) Xm = Xnew(23) End if

(24) if (Xnew is better than Xn) then (25) Xn = Xnew(26) End if (27) End for (28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m = Select_random_individual_from

(population)(31) n = Select_random_individual_from

(population) n ne m ne teacherlowast(32) Xnew = Crossover (Xm Xn)(33) Xnew = Mutation (Xnew)(34) if (Xnew is better than Xm) then(35) Xm = Xnew(36) End if (37) if (Xnew is better than Xn) then(38) Xn = Xnew(39) End if (40) End for (41) Is termination criterion satisfied if yes go to line

42 else continue(42) End for(43) Show_the_pareto_optimal_set (population)(44) End

Figure 5 ITLBO algorithm

Figure 2 Sample of the dataset

0100111111

1101110101

1100110111

Teacher

New student

Student

Figure 3 Crossover operator

0100111111

0100110111

Figure 4 Mutation operator

Complexity 5

been satisfied or not If the termination criteria weresatisfied proceed to the next step otherwise themain three stages are repeated (Teacher phaseLearner Phase 1 and Learner Phase 2) (is step isrepresented in lines 28 to 40 in Figure 5

(vi) Step 6 the final step is the application of non-dominated sorting to the result Nondominatedsortingmeans no result (individual) is better than all

other individuals (is step can be viewed in line 43in Figure 5

5 Parameter Optimisation

After selecting the optimal subset feature several SVMparameters will be tuned(e tuning of SVM parameters is aproblem which can determine algorithm performance (e

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Is new one better thanold one

Keep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Is new one better thanold one

Keep new oneKeep old one

Is the termination criterion satisfied

Is new one better than the worse student based on new

value of C and YKeep new oneKeep old one

Is the terminationcriterion satisfied

End

YesNo

YesNo

No Yes

YesNo

No

Crossover teacher with all other individuals (student) separately and apply mutation

Apply nondominated sorting and find the Pareto setYes

Initialize population randomly

Figure 6 ITLBO flowchart

6 Complexity

radial basis function (RBF) kernel function of the SVM isemployed for the conversion of the completely nonseparableproblem into a separable or approximate separable state(eRBF kernel parameter c suggests data distribution to a newfeature space while parameter C suggests the level of penaltyfor the classification error in the linear nonseparable caseEquations (2) and (3) represent the cost and gamma re-spectively In the next section the two parameters (C and c)were tuned by using the IPJAYA algorithm

w b12

w22

+ C 1113944n

ζnstyn wtxn + b1113872 1113873ge 1 minus ζn (2)

k xn xm( 1113857 exp minusc xn minus xm

23

1113874 1113875 (3)

6 Improved Parallel JAYA Algorithm

(e JAYA algorithm needs improvements to work betterOne of the observations on the JAYA algorithm is that if wesort the populations from best to worst and divide them intotwo groups the best and the worst solutions Obviously theoptimal solution is located in the best solution group [2]Based on this observation an improvement has been done inthe JAYA algorithm rather than selecting the best and worstcases from the whole solutions which puts the worst so-lution further from the best solution and increase the it-erations needed to reach the optimal solution the solutionswere divided into two groups (e best solution is chosenfrom the best solution group as ldquoBestrdquo and the best solutionfrom the worst solution group is the ldquoWorstrdquo (is proce-dure reserves the populationrsquos diversity and makes the so-lution start from a point closer to the optimal solution anddecreases the number of iterations needed to reach theoptimal solution In the proposed work JAYA algorithmwas improved to optimise two parameters of the SVMclassifier simultaneously Figure 7 shows the flowchart ofIPJAYA while Figure 8 shows the IPJAYA algorithm fol-lowed by the detailed steps of IPJAYA

(e detailed steps of IPJAYA are shown as follows(i) Step 1 select the population size and the number of

design variables as well as initialise the terminationcondition To explain the parameter optimisation indetail we assume the following scenario populationsize 3 design variables 2 and terminationcriterion 2 iterations (e value of the populationis the value of parameters C and c in this scenarioeach one has 3 values (ese values were initialisedrandomly for C between 0001 to 100 and for c

between 00001 to 64 Table 2 shows the values of Cand c

(ii) Step 2 SVM needs three things to classify any la-belled data ie features to choose value of C pa-rameter and value of c parameter (is step can beviewed in line 2 of Figure 8

(iii) Step 3 the next step is to evaluate each value forboth C and c separately by using the SVM and on

the first student from Learner Phase 2 after applyingcrossover and mutation as shown in Table 3To continue the optimisation process the pop-ulation is arranged from best to worst and split intotwo groups (Best and Worst groups) as shown inTable 4(e same procedure is repeated for c parameter andthis time C is by default and the new value of c is11006

(iv) Tables 5 and 6 show the details of c parameter(v) Step 4 the result will be considered as the objective

function for both C and c and then compared withother populations and continued until the termi-nation criterion is satisfied (is step can be viewedin line 7 of Figure 8

(ese two new values for C and c will be evaluated usingthe same subset feature at the same time as shown in Table 7(is step can be viewed in lines 5 to 6 of Figure 8

7 The Proposed Method

(is section describes the proposed combination of threedifferent algorithms Each algorithm has a different task todo and these tasks complete the work of the model (e firstalgorithm is the ITLBO whose task is to choose the optimalsubset feature from the whole features(e second algorithmis IPJAYA algorithm and its task is to optimise the pa-rameters of the SVM (e third algorithm is the SVMclassifier which takes the outcome of the first two algorithmsto determine if the processed traffic is intrusion or normaltraffic Figure 9 shows the flowchart of the proposed methodFigure 10 shows the pseudo-code of the proposed methodwhile Figure 11 details the proposed steps of the IPJAYA-ITLBO-SVM method

(e detail steps of ITLBO-IPJAYA-SVM are asfollows

(i) Step 1 initialise the population randomly Eachpopulation is a different set of features from 1 tothe maximum number of features (41 in NSL-KDD) (is step can be viewed in line 2 ofFigure 10

(ii) Step 2 calculate the weighted average of everyindividual population (is step can be viewed inline 3 of Figure 10

(iii) Step 3 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately Apply crossover with each one andthen apply mutation to all resulted individuals(ecrossover used is called the half-uniform crossoverand bit-flip mutation operator (is step can beviewed in lines 4 to 5 of Figure 10

(iv) Step 4 check the population (chromosome)resulting from crossover and mutation if the newone is better than the old one keep the new oneotherwise retain the old one (e best and worstpopulations refer to the degree of accuracy during

Complexity 7

Start

Initialize population size number of design variables and termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution for C Calculate new worst solution for γ

Modify the solution based on eq 4 Modify the solution based on eq 4

Use the new value of C and γ do the classification

Is new Acc Better than old Acc

Keep new C and γKeep old C and γ

Is the termination criterionsatisfied

Return best value of C and γ

End

YesNo

NO

Yes

Parallel pool

Figure 7 IPJAYA flowchart

(1) Start (2) Initialise the population size number of designed

variables and termination criteria (3) Repeat Steps 3ndash6 until the termination criteria are met (4) Arrange the solutions from best to worst and split the

solutions into two groups -best and worst solutions(5) Make the best solution in best group as best and

make the best solution in worst group as worst(6) Modify the solution based on the following equation

Yprimej k I = Yj k I + r1 k I (Yj best I ndash | Yj k I |) ndashr2 k I (Yj worst I ndash | Yj k I|)

(7) Update the previous solution if Yprimej k I gt Yj k Iotherwise do not update the previous solution

(8) Display the established optimum solution(9) End

Figure 8 IPJAYA algorithm

Table 2 C and c values

C c

20 101 210 0701 1

Table 3 Evaluation of C

C c Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

Table 4 Best and Worst groups for C

C cSubsetfeature

Accuracy (objectivefunction)

10 Default Fixed 0994 best of best Best group01 Default Fixed 09920 Default Fixed 097 best of worst Worst

group1 Default Fixed 0899

Table 5 Accuracy based on c

C γ Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

8 Complexity

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 6: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

been satisfied or not If the termination criteria weresatisfied proceed to the next step otherwise themain three stages are repeated (Teacher phaseLearner Phase 1 and Learner Phase 2) (is step isrepresented in lines 28 to 40 in Figure 5

(vi) Step 6 the final step is the application of non-dominated sorting to the result Nondominatedsortingmeans no result (individual) is better than all

other individuals (is step can be viewed in line 43in Figure 5

5 Parameter Optimisation

After selecting the optimal subset feature several SVMparameters will be tuned(e tuning of SVM parameters is aproblem which can determine algorithm performance (e

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Is new one better thanold one

Keep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Is new one better thanold one

Keep new oneKeep old one

Is the termination criterion satisfied

Is new one better than the worse student based on new

value of C and YKeep new oneKeep old one

Is the terminationcriterion satisfied

End

YesNo

YesNo

No Yes

YesNo

No

Crossover teacher with all other individuals (student) separately and apply mutation

Apply nondominated sorting and find the Pareto setYes

Initialize population randomly

Figure 6 ITLBO flowchart

6 Complexity

radial basis function (RBF) kernel function of the SVM isemployed for the conversion of the completely nonseparableproblem into a separable or approximate separable state(eRBF kernel parameter c suggests data distribution to a newfeature space while parameter C suggests the level of penaltyfor the classification error in the linear nonseparable caseEquations (2) and (3) represent the cost and gamma re-spectively In the next section the two parameters (C and c)were tuned by using the IPJAYA algorithm

w b12

w22

+ C 1113944n

ζnstyn wtxn + b1113872 1113873ge 1 minus ζn (2)

k xn xm( 1113857 exp minusc xn minus xm

23

1113874 1113875 (3)

6 Improved Parallel JAYA Algorithm

(e JAYA algorithm needs improvements to work betterOne of the observations on the JAYA algorithm is that if wesort the populations from best to worst and divide them intotwo groups the best and the worst solutions Obviously theoptimal solution is located in the best solution group [2]Based on this observation an improvement has been done inthe JAYA algorithm rather than selecting the best and worstcases from the whole solutions which puts the worst so-lution further from the best solution and increase the it-erations needed to reach the optimal solution the solutionswere divided into two groups (e best solution is chosenfrom the best solution group as ldquoBestrdquo and the best solutionfrom the worst solution group is the ldquoWorstrdquo (is proce-dure reserves the populationrsquos diversity and makes the so-lution start from a point closer to the optimal solution anddecreases the number of iterations needed to reach theoptimal solution In the proposed work JAYA algorithmwas improved to optimise two parameters of the SVMclassifier simultaneously Figure 7 shows the flowchart ofIPJAYA while Figure 8 shows the IPJAYA algorithm fol-lowed by the detailed steps of IPJAYA

(e detailed steps of IPJAYA are shown as follows(i) Step 1 select the population size and the number of

design variables as well as initialise the terminationcondition To explain the parameter optimisation indetail we assume the following scenario populationsize 3 design variables 2 and terminationcriterion 2 iterations (e value of the populationis the value of parameters C and c in this scenarioeach one has 3 values (ese values were initialisedrandomly for C between 0001 to 100 and for c

between 00001 to 64 Table 2 shows the values of Cand c

(ii) Step 2 SVM needs three things to classify any la-belled data ie features to choose value of C pa-rameter and value of c parameter (is step can beviewed in line 2 of Figure 8

(iii) Step 3 the next step is to evaluate each value forboth C and c separately by using the SVM and on

the first student from Learner Phase 2 after applyingcrossover and mutation as shown in Table 3To continue the optimisation process the pop-ulation is arranged from best to worst and split intotwo groups (Best and Worst groups) as shown inTable 4(e same procedure is repeated for c parameter andthis time C is by default and the new value of c is11006

(iv) Tables 5 and 6 show the details of c parameter(v) Step 4 the result will be considered as the objective

function for both C and c and then compared withother populations and continued until the termi-nation criterion is satisfied (is step can be viewedin line 7 of Figure 8

(ese two new values for C and c will be evaluated usingthe same subset feature at the same time as shown in Table 7(is step can be viewed in lines 5 to 6 of Figure 8

7 The Proposed Method

(is section describes the proposed combination of threedifferent algorithms Each algorithm has a different task todo and these tasks complete the work of the model (e firstalgorithm is the ITLBO whose task is to choose the optimalsubset feature from the whole features(e second algorithmis IPJAYA algorithm and its task is to optimise the pa-rameters of the SVM (e third algorithm is the SVMclassifier which takes the outcome of the first two algorithmsto determine if the processed traffic is intrusion or normaltraffic Figure 9 shows the flowchart of the proposed methodFigure 10 shows the pseudo-code of the proposed methodwhile Figure 11 details the proposed steps of the IPJAYA-ITLBO-SVM method

(e detail steps of ITLBO-IPJAYA-SVM are asfollows

(i) Step 1 initialise the population randomly Eachpopulation is a different set of features from 1 tothe maximum number of features (41 in NSL-KDD) (is step can be viewed in line 2 ofFigure 10

(ii) Step 2 calculate the weighted average of everyindividual population (is step can be viewed inline 3 of Figure 10

(iii) Step 3 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately Apply crossover with each one andthen apply mutation to all resulted individuals(ecrossover used is called the half-uniform crossoverand bit-flip mutation operator (is step can beviewed in lines 4 to 5 of Figure 10

(iv) Step 4 check the population (chromosome)resulting from crossover and mutation if the newone is better than the old one keep the new oneotherwise retain the old one (e best and worstpopulations refer to the degree of accuracy during

Complexity 7

Start

Initialize population size number of design variables and termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution for C Calculate new worst solution for γ

Modify the solution based on eq 4 Modify the solution based on eq 4

Use the new value of C and γ do the classification

Is new Acc Better than old Acc

Keep new C and γKeep old C and γ

Is the termination criterionsatisfied

Return best value of C and γ

End

YesNo

NO

Yes

Parallel pool

Figure 7 IPJAYA flowchart

(1) Start (2) Initialise the population size number of designed

variables and termination criteria (3) Repeat Steps 3ndash6 until the termination criteria are met (4) Arrange the solutions from best to worst and split the

solutions into two groups -best and worst solutions(5) Make the best solution in best group as best and

make the best solution in worst group as worst(6) Modify the solution based on the following equation

Yprimej k I = Yj k I + r1 k I (Yj best I ndash | Yj k I |) ndashr2 k I (Yj worst I ndash | Yj k I|)

(7) Update the previous solution if Yprimej k I gt Yj k Iotherwise do not update the previous solution

(8) Display the established optimum solution(9) End

Figure 8 IPJAYA algorithm

Table 2 C and c values

C c

20 101 210 0701 1

Table 3 Evaluation of C

C c Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

Table 4 Best and Worst groups for C

C cSubsetfeature

Accuracy (objectivefunction)

10 Default Fixed 0994 best of best Best group01 Default Fixed 09920 Default Fixed 097 best of worst Worst

group1 Default Fixed 0899

Table 5 Accuracy based on c

C γ Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

8 Complexity

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 7: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

radial basis function (RBF) kernel function of the SVM isemployed for the conversion of the completely nonseparableproblem into a separable or approximate separable state(eRBF kernel parameter c suggests data distribution to a newfeature space while parameter C suggests the level of penaltyfor the classification error in the linear nonseparable caseEquations (2) and (3) represent the cost and gamma re-spectively In the next section the two parameters (C and c)were tuned by using the IPJAYA algorithm

w b12

w22

+ C 1113944n

ζnstyn wtxn + b1113872 1113873ge 1 minus ζn (2)

k xn xm( 1113857 exp minusc xn minus xm

23

1113874 1113875 (3)

6 Improved Parallel JAYA Algorithm

(e JAYA algorithm needs improvements to work betterOne of the observations on the JAYA algorithm is that if wesort the populations from best to worst and divide them intotwo groups the best and the worst solutions Obviously theoptimal solution is located in the best solution group [2]Based on this observation an improvement has been done inthe JAYA algorithm rather than selecting the best and worstcases from the whole solutions which puts the worst so-lution further from the best solution and increase the it-erations needed to reach the optimal solution the solutionswere divided into two groups (e best solution is chosenfrom the best solution group as ldquoBestrdquo and the best solutionfrom the worst solution group is the ldquoWorstrdquo (is proce-dure reserves the populationrsquos diversity and makes the so-lution start from a point closer to the optimal solution anddecreases the number of iterations needed to reach theoptimal solution In the proposed work JAYA algorithmwas improved to optimise two parameters of the SVMclassifier simultaneously Figure 7 shows the flowchart ofIPJAYA while Figure 8 shows the IPJAYA algorithm fol-lowed by the detailed steps of IPJAYA

(e detailed steps of IPJAYA are shown as follows(i) Step 1 select the population size and the number of

design variables as well as initialise the terminationcondition To explain the parameter optimisation indetail we assume the following scenario populationsize 3 design variables 2 and terminationcriterion 2 iterations (e value of the populationis the value of parameters C and c in this scenarioeach one has 3 values (ese values were initialisedrandomly for C between 0001 to 100 and for c

between 00001 to 64 Table 2 shows the values of Cand c

(ii) Step 2 SVM needs three things to classify any la-belled data ie features to choose value of C pa-rameter and value of c parameter (is step can beviewed in line 2 of Figure 8

(iii) Step 3 the next step is to evaluate each value forboth C and c separately by using the SVM and on

the first student from Learner Phase 2 after applyingcrossover and mutation as shown in Table 3To continue the optimisation process the pop-ulation is arranged from best to worst and split intotwo groups (Best and Worst groups) as shown inTable 4(e same procedure is repeated for c parameter andthis time C is by default and the new value of c is11006

(iv) Tables 5 and 6 show the details of c parameter(v) Step 4 the result will be considered as the objective

function for both C and c and then compared withother populations and continued until the termi-nation criterion is satisfied (is step can be viewedin line 7 of Figure 8

(ese two new values for C and c will be evaluated usingthe same subset feature at the same time as shown in Table 7(is step can be viewed in lines 5 to 6 of Figure 8

7 The Proposed Method

(is section describes the proposed combination of threedifferent algorithms Each algorithm has a different task todo and these tasks complete the work of the model (e firstalgorithm is the ITLBO whose task is to choose the optimalsubset feature from the whole features(e second algorithmis IPJAYA algorithm and its task is to optimise the pa-rameters of the SVM (e third algorithm is the SVMclassifier which takes the outcome of the first two algorithmsto determine if the processed traffic is intrusion or normaltraffic Figure 9 shows the flowchart of the proposed methodFigure 10 shows the pseudo-code of the proposed methodwhile Figure 11 details the proposed steps of the IPJAYA-ITLBO-SVM method

(e detail steps of ITLBO-IPJAYA-SVM are asfollows

(i) Step 1 initialise the population randomly Eachpopulation is a different set of features from 1 tothe maximum number of features (41 in NSL-KDD) (is step can be viewed in line 2 ofFigure 10

(ii) Step 2 calculate the weighted average of everyindividual population (is step can be viewed inline 3 of Figure 10

(iii) Step 3 choose the best individual as a teacher (echosen teacher interacts with all other individualsseparately Apply crossover with each one andthen apply mutation to all resulted individuals(ecrossover used is called the half-uniform crossoverand bit-flip mutation operator (is step can beviewed in lines 4 to 5 of Figure 10

(iv) Step 4 check the population (chromosome)resulting from crossover and mutation if the newone is better than the old one keep the new oneotherwise retain the old one (e best and worstpopulations refer to the degree of accuracy during

Complexity 7

Start

Initialize population size number of design variables and termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution for C Calculate new worst solution for γ

Modify the solution based on eq 4 Modify the solution based on eq 4

Use the new value of C and γ do the classification

Is new Acc Better than old Acc

Keep new C and γKeep old C and γ

Is the termination criterionsatisfied

Return best value of C and γ

End

YesNo

NO

Yes

Parallel pool

Figure 7 IPJAYA flowchart

(1) Start (2) Initialise the population size number of designed

variables and termination criteria (3) Repeat Steps 3ndash6 until the termination criteria are met (4) Arrange the solutions from best to worst and split the

solutions into two groups -best and worst solutions(5) Make the best solution in best group as best and

make the best solution in worst group as worst(6) Modify the solution based on the following equation

Yprimej k I = Yj k I + r1 k I (Yj best I ndash | Yj k I |) ndashr2 k I (Yj worst I ndash | Yj k I|)

(7) Update the previous solution if Yprimej k I gt Yj k Iotherwise do not update the previous solution

(8) Display the established optimum solution(9) End

Figure 8 IPJAYA algorithm

Table 2 C and c values

C c

20 101 210 0701 1

Table 3 Evaluation of C

C c Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

Table 4 Best and Worst groups for C

C cSubsetfeature

Accuracy (objectivefunction)

10 Default Fixed 0994 best of best Best group01 Default Fixed 09920 Default Fixed 097 best of worst Worst

group1 Default Fixed 0899

Table 5 Accuracy based on c

C γ Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

8 Complexity

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 8: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

Start

Initialize population size number of design variables and termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution for C Calculate new worst solution for γ

Modify the solution based on eq 4 Modify the solution based on eq 4

Use the new value of C and γ do the classification

Is new Acc Better than old Acc

Keep new C and γKeep old C and γ

Is the termination criterionsatisfied

Return best value of C and γ

End

YesNo

NO

Yes

Parallel pool

Figure 7 IPJAYA flowchart

(1) Start (2) Initialise the population size number of designed

variables and termination criteria (3) Repeat Steps 3ndash6 until the termination criteria are met (4) Arrange the solutions from best to worst and split the

solutions into two groups -best and worst solutions(5) Make the best solution in best group as best and

make the best solution in worst group as worst(6) Modify the solution based on the following equation

Yprimej k I = Yj k I + r1 k I (Yj best I ndash | Yj k I |) ndashr2 k I (Yj worst I ndash | Yj k I|)

(7) Update the previous solution if Yprimej k I gt Yj k Iotherwise do not update the previous solution

(8) Display the established optimum solution(9) End

Figure 8 IPJAYA algorithm

Table 2 C and c values

C c

20 101 210 0701 1

Table 3 Evaluation of C

C c Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

Table 4 Best and Worst groups for C

C cSubsetfeature

Accuracy (objectivefunction)

10 Default Fixed 0994 best of best Best group01 Default Fixed 09920 Default Fixed 097 best of worst Worst

group1 Default Fixed 0899

Table 5 Accuracy based on c

C γ Subset feature Accuracy (objective function)20 Default Fixed 0971 Default Fixed 089910 Default Fixed 099401 Default Fixed 099

8 Complexity

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 9: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

classification All the aforementioned steps arecalled the Teacher phase because all individualslearn from the best one (teacher) After thatLearner Phase 1 is started (is step can be viewedin lines 6 to 13 of Figure 10

(v) Step 5 select the best two individuals as studentsand apply crossover between these two students(en apply mutation on the new one If the newone is better than the old two students keep thenew one otherwise keep the best old one and

apply this with all other individuals (students) (estudents are chosen once and will not be chosenagain At this point Learner Phase 1 has ended(is step can be viewed in lines 14 to 27 ofFigure 10

(vi) Step 6 Learner Phase 2 is initiated with two objec-tives one is to optimise the SVMparameters and theother is to make students learn from each other(isphase starts with choosing two random individuals(students) and then applying crossover between these

Table 6 Best and Worst groups for c

C c Subset feature Accuracy (objective function)Default 07 Fixed 09941 best of best Best groupDefault 1 Fixed 099Default 2 Fixed 098 best of worst Worst groupDefault 10 Fixed 097

Table 7 Evaluation of features based on new C and c

C c Subset feature Accuracy (objective function)142 11006 Fixed Result

Start

FSS using ITLBO

SVM parameters optimization using IPJAYA

Is termination criterion satisfied for IPJAYA

No

Is termination criterion satisfied for ITLBO

No

Training

Classification

Intrusion detection

End

Yes

Yes

Figure 9 Proposed method flowchart

Complexity 9

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 10: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

two students and applying mutation on the newindividual After that and before the classificationprocess is initiated check if the new student is betterthan the old two students (e SVM parameteroptimisation is started using IPJAYA this processstarts at the 29th step by initialising the populationsize the number of design variables and the ter-mination criteria for IPJAYA (e population sizecan be set before the execution and each populationis generated randomly(e designed variables are thetwo parameters of the SVM which need to beoptimised (e termination criteria can be thenumber of iterations after that each population foreach parameter is evaluated separately (which onegives better accuracy) followed by a parallel poll foreach parameter sorting the population from best toworst (best accuracy to worst accuracy) and sepa-rating them into two groups (best and worst groups)(e best population in the best group is chosen asbest and the best population in the worst group ischosen as worst (en the population is modifiedbased on equation in Figure 8 and updated if the newone is better than the old one IPJAYA is repeated

until the termination criterion is satisfied (e finalstep of IPJAYA is to deliver the best value of the twoparameters to be used by the SVM At this point theparameter optimisation has ended and LearnerPhase 2 continues in the next step (is step can beviewed in lines 28 to 39 of Figure 10

(vii) Step 7 evaluate the individuals (chromosome) byusing the outcome of IPJAYA If the new indi-vidual is better than the old two students keepthe new one otherwise keep the best old oneApply this step to all other students At this stepthe main three stages of the ITLBO have finished(e next step is to check for the satisfaction of thetermination criteria if satisfied proceed to thenext step Otherwise the main three stages arerepeated (is step can be viewed in lines 40 to 48of Figure 10

(viii) Step 8 the last step is to apply nondominatedsorting on the result Nondominated sortingmeans no result (individual) is better than all theother individuals(is step can be viewed in line 49of Figure 10

(1) Start (2) Initialize population(3) Calculate_weighted_average_of_individuals (4) for (k =1 to number_of_generations) do(5) Xteacher = Best_individual (6) Learning from Teacher lowastteacher phase (7) for (i = 1 to number_of_individuals) do(8) Xnew = Crossover (Xteacher Xi)(9) Xnew = Mutation(Xnew)(10) if (Xnew is better than Xi) then

(11) Xi = [Xnew]

(12) End if

(13) End for (14) Learning from Best Classmates lowastlearner phase 1 (15) for (i = 1 to number_of_individuals) do(16) m = Select_best_individual_from

(population)

(17) n = Select_best_individual_from (population)(18) n nem ne teacherlowast(19) Xnew = Crossover (Xm Xn)(20) Xnew = Mutation(Xnew)

(21) if (Xnew is better than Xm) then

(22) Xm = Xnew

(23) End if(24) if (Xnew is better than Xn) then

(25) Xn = Xnew

(26) End if

(27) End for

(28) Learning from Classmates lowastlearner phase 2(29) for (i =1 to number_of_individuals) do(30) m =Select_random_individual_from

(population)

(31 n =Select_random_individual_from (population) n nem ne teacherlowast

(32) Xnew = Crossover (Xm Xn)

(33) Xnew = Mutation (Xnew)(34) IPJAYA algorithm for SVM parameter

(35) Initialize the population size number of designed variables and termination criteria (IPJAYA)

(36) Arrange the solutions from the best to the worstand Split the solutions into two groups best andworst solutions

(37) Modify the solutionYj k I = Yj k I + r1 k I (Yj best I ndash |Yj k I|) ndash r2 k I (Yj worst I ndash | Yj k I|)

(38 Update the previous solution if Yj k I gt Yj k Iotherwise do not update the previous solution

(39) return best value of C and Y(40) if ( Xnew is better than Xm) then(41) Xm = Xnew(42) End if (43) if (Xnew is better than Xn) then(44) Xn = Xnew(45) End if (46) End for (47) Is termination criterion satisfied if yes go to line

42 else continue

(48) End for(49) Show_the_pareto_optimal_set (population)(50) End

Figure 10 Proposed method pseudo-code

10 Complexity

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 11: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

Start

Calculate weighted average of every individual in the population

Choose best individual as teacher

Crossover teacher with all other individuals (student) separately and apply mutation

Is new one better than old oneKeep old one Keep new one

Select best two students apply crossover and mutation

Is new one better than theworse student

Keep new oneKeep old one

Select random two students apply crossover and mutation

Initialize population size number of design variablesand termination criterion

Identify best and worst solutions for C and γ

Calculate new worst solution

Modify the solutions based on equation 32

Use the new value of c and y do the classification

Is new one better than old one Keep new oneKeep old one

Is the termination criterionsatisfied

This is the best value of C and γ

Is new one better thanthe worse student based on new

value of C and γ

Keep new oneKeep old one

Is the termination criterionsatisfied

End

YesNo

YesNo

Yes

No Yes

YesNo

No

No

Initialise population randomly

Apply nondominated sorting and find the Pareto setYes

ITLBO

IPJAYA

Figure 11 Details of the proposed method

Complexity 11

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 12: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

8 Evaluation Metrics

(emetrics measures and validation procedures used in theevaluation of the experimental data were reviewed in thissection (e literature review showed that most studies useoverall accuracy as the major performance measure for IDsystems However other metrics and validation measureshave also been mentioned Some works have detailed theinformation on FAR detections andmissed detections whichare all useful system performance evaluation measures (efollowing section details the analysis based on standardmetrics for objective evaluation of the results achieved byvarious classification methods (e performance of thesystem was evaluated using several metrics based on theNSL-KDD and CICIDS 2017 datasets A detailed descriptionof learning performance measures has been provided bySingh et al and Sokolova and Guy [25 26] whilePhoungphol et al [27] detailed the imbalanced dataset is-sues One of these metrics is the accuracy as given in thefollowing equation

accuracy TP + TN

TP + FN + TF + FP (4)

Accuracy is the capability of the classifier in predictingthe actual class here TP true positive TN true negativeFP false positive and FN false negative

Several metrics can be computed from the confusionmatrix (e false-positive rate (FPR) is another metric it isthe percentage of the samples incorrectly predicted aspositive by the classifier It is calculated by using the fol-lowing equation

false minus positive rate(FPR) FP

FP + TN (5)

(e false-negative rate (FNR) is the percentage of thedata incorrectly classified by the classifier as negative It iscalculated by using the following equation

false minus negative rate(FNR) FN

TP + FN (6)

(e detection rate (DR) is the percentage of the samplescorrectly classified by the classifier to their correct class It iscalculated by using the following equation

detection rate FP

TP + FP (7)

(e recall quantifies the number of correct positivepredictions made out of all positive predictions It is cal-culated by using the following equation

recall TP

TP + FN (8)

F-Measure provides a way to combine both detectionrate and recall into a single measure that captures bothproperties It is calculated by using the following equation

F minus M (2lowastDRlowast recall)

(DR + recall) (9)

(e results were validated by using k-fold cross-vali-dation technique [27ndash30] (is technique requires a randompartitioning of the data into k different parts and one part isselected from each iteration as testing data while the other(kminus1) parts are considered as the training dataset All theconnection records are eventually used for training andtesting For all experiments the value of k is taken as 10 toensure low bias low variance low overfitting and good errorestimate [28]

9 Dataset Preprocessing and Partitioning

(e whole dataset is preprocessed in this stage It consists oftwo steps ie scaling and normalisation In the scaling stepthe dataset is converted from a string representation to anumerical representation For example the class label in thedataset contains two different categories ldquoNormalrdquo andldquoAttackrdquo After implementing this step the label is changedto ldquo1rdquo and ldquo0rdquo where ldquo1rdquo means normal case while ldquo0rdquomeans attack (e second step is normalisation [31] (enormalisation process removes noise from the dataset anddecreases the differences in the ranges between the featuresIn this work the Max-Max normalisation method was usedas shown in the following equation

Fi Fi minus Mini

Maxi minus Minii (10)

where Fi represents the current feature that needs to benormalized and Mini and Maxi represent the minimum andthe maximum value for that feature respectively (e ob-jective function represents the accuracy of the SVM when itis evaluated on the validation set (e validation set is a partof the training set In order to make the validation fairer K-fold validation can be used(e value K is 10(e NSL-KDDand CICIDS 2017 datasets were used to evaluate the per-formance of the proposed models

10 NSL-KDD Dataset

In this study NSL-KDD datasets were used to evaluate theproposed method (is dataset was suggested in 2009 byTavallaee et al [32] due to the drawbacks of KDD CUP99(e NSL-KDD is a variant of the KDD CUP 99 dataset inwhich the redundant instances were discarded followed bythe reconstitution of the dataset structure [29] (e NSL-KDD dataset is commonly used for evaluating the per-formance of new ID approaches especially anomaly-basednetwork ID (ere are a reasonable number of testing andtraining records in the NSL-KDD (e training set(KDDTrain+) consists of 125973 records while the testingset (KDDTest+) contains 22544 records In this dataseteach traffic record has 41 features (six symbolic and 35continuous) and one class label (Table 7) (e features areclassified into basic content and traffic types (Table 8)Attack classification in the NSL-KDD is based on thefeature characteristics [33] (e NSL-KDD dataset can bedownloaded from httpswwwunbcacicdatasetsnslhtml

12 Complexity

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 13: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

11 CICIDS 2017 Dataset

(e CICIDS 2017 dataset consists of benign and the mostcurrent common attacks which mimic real-world data(PCAPs) It also contains the results of a network trafficanalysis obtained by using a CICFlowMeter the flows arelabelled based on the timestamp source and destinationports source and destination IPs protocols and attack (eCICIDS 2017 dataset satisfies the 11 indispensable features ofa valid IDS dataset namely anonymity available protocolsfeature set attack diversity complete capture completeinteraction complete network configuration completetraffic metadata heterogeneity and labelling [34] (ere are2830540 rows in the CICIDS 2017 devised on eight fileswith each row containing 79 features In the CICIDS 2017each row is labelled as benign or as one of the 14 attack typesA summary of the distribution of different attack types andthe benign rows is presented in Table 9

12 Results of ITLBO-IPJAYA vs ITLBOand ITLBO-JAYA

(is section provides the results of the improved method-based ITLBO-IPJAYA algorithm (is method selects thebest features and updates the value of SVM parameters (iswork proposed the idea of ldquoparallel executionrdquo to update theSVM parameters (e parameters for ITLBO ITLBO-JAYAand ITLBO-IPJAYA methods used in this study are shownin Table 10

(e NSL-KDD dataset is used to evaluate the threemethods and the evaluation metrics used are maximumaccuracy (Max Acc) average accuracy (AVR Acc) de-tection rate (DT) false alarm rate (FAR) false negative rate(FNR) F-measure (F-M) recall and error rate (ER)Table 11 shows the comparison in results among ITLBOITLBO-JAYA and ITLBO-IPJAYA

(e results show that ITLBO-IPJAYA performs betterthan ITLBO and ITLBO-JAYA in all metrics Figure 11shows the comparison results based on the accuracy ofITLBO ITLBO-JAYA and ITLBO-IPJAYA

Figure 12 shows a comparison between ITLBO-JAYAand ITLBO-IPJAYA based on the number of iterations Itshows that ITLBO-IPJAYA performs better than ITLBO-JAYA even with less number of iterations (e increase inrate of accuracy for ITLBO-IPJAYA is higher than ITLBO-JAYA (e figure shows that ITLBO-IPJAYA with 20 iter-ations performs better than ITLBO-JAYA with 30 iterationsand that ITLBO-IPJAYA performs better than ITLBO-JAYAwith less number of iterations (is means there is lesscomplexity and less execution time for ITLBO-IPJAYAFigure 13 shows the average FAR of the three methodsshowing that ITLBO-IPJAYA performs better than ITLBOand ITLBO-JAYA even with less number of features whereITLBO-IPJAYA with 19 features performs better than TLBOand ITLBO-JAYA with 21 and 22 features respectively (eimprovements shown in Sections 4 and 6 reduce the exe-cution time for ITLBO-IPJAYA over ITLBO-JAYA (eparallel processing of each SVM parameter independently isthe main factor that reduces the execution time for ITLBO-IPJAYA over ITLBO-JAYA as shown in Figure 14

(e results of the CICIDS 2017 dataset are shown inTable 12

Finally statistical significance tests (T-test) T-test madeon the distribution of values in both samples showed theirsignificant difference which allowed us to reject null hy-pothesis H0 (e test sh2ows the superiority of IPJAYA-ITLBO-SVM over JAYA-ITLBO-SVM (e P values and T-

Table 8 NSL-KDD dataset

Attack classes 22 types of attacks No of instancesNormal 67343DoS smurt neptune pod teardrop back land 45927R2L phf ftp-write imap multihop warezclient warezmaster spy guess password 995U2R perl loadmodule buffer-overflow rootkit 52Probing portsweep ipsweep satan nmap 11656

Table 9 CICIDS dataset

Attack class 14 types of attacks No of instancesBenign (normal) 2359087DOS DDoS slowloris Heratbleed Hulk GoldenEye Slowhttptest 294506PortScan Portscan 158930Bot Bot 1966Brute-Force FTP-Patator SSH-Patator 13835Web attack Web attack XSS web attack SQL injection web attack brute force 2180Infiltration Infiltration 36

Table 10 Parameters used in this study margin

Parameter ValuePopulation size for ITLBO 40Number of generations for ITLBO 60Population size for JAYA 40Number of generations for JAYA 60Population size for IPJAYA 40Number of generations for IPJAYA 60Population size for ITLBO 40Number of generations for IPJAYA 60Crossover type Half-uniformMutation type Bit-flip

Complexity 13

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 14: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

0973097409750976097709780979

098098109820983

Acc

urac

y

ITLBO-JAYAITLBO-IPJAYA

25 30 35 40 45 50 55 60 6520Number of iterations

Figure 13 Accuracy comparison based on the number of iterations for the NSL-KDD dataset

Table 11 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the NSL-KDD dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

16

TLBO 09639 09630 09612 00449 00282 09664 09717 0036ITLBO 09680 09678 09671 00379 00268 09701 09731 0032

ITLBO-JAYA 09688 09685 09676 00373 00258 0971 09741 00312ITLBO-IPJAYA 09708 09705 09712 00331 00256 09727 09742 00292

18

TLBO 09713 0971 09739 00299 00275 09731 09724 00286ITLBO 09718 09713 09744 00292 00273 09736 09726 00282

ITLBO-JAYA 09735 09733 09752 00285 00247 09752 09752 00265ITLBO-IPJAYA 09747 09746 09753 00280 00221 09764 09779 00252

19

TLBO 09738 09735 09727 00313 00225 09755 09774 00261ITLBO 09751 09745 09737 00305 00189 09769 09811 00248

ITLBO-JAYA 09759 09758 09758 00278 00178 09775 09791 00241ITLBO-IPJAYA 09772 09770 09786 00245 00162 09787 09787 00228

21

TLBO 09782 09780 09742 00299 00145 09797 09844 00217ITLBO 09787 09784 09756 00279 00144 0981 09846 00212

ITLBO-JAYA 09793 0979 09789 00273 00132 09811 09867 00207ITLBO-IPJAYA 09802 0980 09792 00263 00123 09812 09716 00198

22

TLBO 09801 0979 09755 00284 00131 09814 09868 00199ITLBO 0981 09805 09758 00277 00117 09823 0989 00191

ITLBO-JAYA 09816 09814 09794 00265 00114 09829 0989 00183ITLBO-IPJAYA 0-9823 09821 09798 00262 00102 09835 09898 00177

Acc

urac

y

16 18 19 21 22Number of features

096

80

9688 097

08

097

180

9735

097

47

097

510

9759

097

72

097

870

9793

098

02

098

10

9816

098

23

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 12 Accuracy based on the number of features for the NSL-KDD dataset

14 Complexity

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 15: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

values are shown in Table 13 the small values show that theIPJAYA-ITLBO-SVM method (MV1) is highly significant

13 The Comparison of the Proposed Methods

To illustrate the effectiveness of our proposed IDS methodsthe performance of the proposed methods is compared withsix recently developed anomaly detection techniquesTable 14 demonstrates the result achieved by the proposedmethods compared with other methods tested on the NSL-KDD dataset in terms of detection rate and false alarm rateIt is very clear that our proposed methods (ITLBO-JAYAand ITLBO-IPJAYA) obtained the best results with 09823accuracy 09798 detection rate and 00102 false alarm ratefor the ITLBO-IPJAYA model and 09816 accuracy 09794detection rate and 00114 false alarm rate for the ITLBO-JAYA method as shown in Table 11 However Table 15demonstrates the result achieved by the proposed methodscompared with other methods tested on the CICIDS 2017dataset in terms of detection rate and false alarm rate

14 Discussion

(is work in general contains 4 sections based on theproposed method Furthermore all methods proposed inthis work were evaluated based on the NSL-KDD andCICIDS 2017 datasets

Firstly the proposed ITLBO-IPJAYA based on networkintrusion detection and method results were compared withTLBO ITLBO and ITLBO-JAYA as shown in Tables 11 and12 Additionally the table shows the different features for thethree algorithms to investigate the influence of the featurersquosincrease on the performance which represents a differentalgorithm structure (e ITLBO-IPJAYA results showedhigher stability and better accuracy than ITLBO and ITLBO-JAYA algorithms

Furthermore Figure 13 shows that ITLBO-JAYA needs60 iterations to reach accuracy of 09816 when the ITLBO-IPJAYA algorithm with 50 iterations achieved higher ac-curacy (erefore ITLBO-IPJAYA achieved better detectionrate and less false alarm rate with less complexity of

0020

0022

0024

0026

0028

003

0032

0034

0036

0038

004

FAR

2217 18 19 20 21 2315 16Number of features

ITLBOITLBO-JAYAITLBO-IPJAYA

Figure 14 Comparison based on the number of features head for the NSL-KDD dataset

Table 12 Comparison of ITLBO ITLBO-JAYA and ITLBO-IPJAYA for the CICIDS dataset

No of features Method MAX Acc AVR Acc DR FAR FNR F-M Recall ER

12ITLBO 09634 09631 09661 00389 00268 0970 09721 00323

ITLBO-JAYA 09685 09683 09682 00360 00267 09713 09722 00315ITLBO-IPJAYA 09704 09702 09701 00310 00265 09725 09724 00298

13ITLBO 09712 09710 09724 00298 00273 09736 09726 00282

ITLBO-JAYA 09745 09744 09728 00290 00264 09741 09794 00272ITLBO-IPJAYA 09768 09767 09732 00285 00260 09752 09787 00264

14ITLBO 09776 09775 09737 00280 00189 09769 09811 00258

ITLBO-JAYA 09789 09787 09742 00270 00174 0978 0986 00235ITLBO-IPJAYA 09801 0980 09749 00265 00134 0981 0987 00210

16ITLBO 09804 09803 09755 00271 0011 09821 0989 00190

ITLBO-JAYA 0981 09808 09773 00266 00109 09825 0989 00183ITLBO-IPJAYA 09817 09815 09782 00264 00105 09831 09896 00170

Complexity 15

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 16: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

iterations Secondly with all the improvement of ITLBO-SVM mentioned above random selection of the main SVMparameters is considered as one of the algorithm limitationswhich may not provide optimal parameter value and affectthe model accuracy negatively

(e results above showed that the ITLBO-IPJAYAperformance improved the basic SVM performance byproviding the best parameter values as shown in the ITLBO-IPJAYA block diagram in Figure 11 In the end the per-formance of ITLBO-IPJAYA is worth reducing the impact ofselected parameters randomly

As a result of the differences in the algorithm structurethe ITLBO structure contains three phases which shouldprevent the algorithm from being trapped in local and globaloptima Also teachers not only teach learners (students) butalso teach other teachers On the contrary the TLBO structurecontains two phases only where teachers teach learners only

Furthermore the ITLBO algorithm achieved higheraccuracy than TLBO because the knowledge exchange rate ishigher in ITLBO since teachers teach learners and otherteachers (erefore ITLBO achieved better detection rateand less false alarm rate with less complexity of iterations

Table 15 Comparison with the existing work for the CICIDS 2017 dataset

Ref Method Dataset Acc DR FAR[41] Hybrid model CICIDS 8976 NG NG[42] Wrapper-based feature selection CICIDS 9768 NG NG[43] Feature selection technique and SVM CICIDS 09803 NG NGTLBO-SVM TLBO and SVM CICIDS 09794 09745 00274ITLBO-SVM Improved TLBO and SVM CICIDS 09804 09755 00271ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 0981 09773 00266ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM CICIDS 09817 09782 00264

Table 13 T-test results

NSL-KDD CICIDS 2017P value 00156 00068T value 3174 4044

Table 14 Comparison with the existing work for the NSL-KDD dataset

Ref Method Dataset Acc DR FAR[35] Two-stage classifier NSL-KDD 9638 NG NG[36] Hypergraph-based genetic algorithm and SVM NSL-KDD 0975 09714 083[8] PSO and SVM NSL-KDD 09784 09723 087[37] Chi-square and SVM NSL-KDD 098 NG 013[38] SVM and hybrid PSO NSL-KDD 07341 06628 281[39] SVM and feature selection NSL-KDD 090 NG NG[40] SVM and GA NSL-KDD 0975 NG NGTLBO-SVM TLBO and SVM NSL-KDD 09801 09755 00284ITLBO-SVM Improved TLBO and SVM NSL-KDD 0981 09758 00277ITLBO-JAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09816 09794 00265ITLBO-IPJAYA-SVM Improved TLBO improved JAYA and SVM NSL-KDD 09823 09798 00262

13534

13239

ITLBO-JAYA ITLBO-IPJAYA130501310013150132001325013300133501340013450135001355013600

Figure 15 Execution time comparison for the NSL-KDD dataset

16 Complexity

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 17: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

Dividing the solutions of the IPJAYA algorithm into twogroups and choosing the best solution from the best solutiongroup as ldquoBestrdquo and the best solution from the worst solutiongroup as ldquoWorstrdquo cause IPJAYA to need less iterations thanJAYA to reach better solutions as shown in Figure 13 (isalso leads to improvement in accuracy and detection rate

(e parallel improvement done on the JAYA algorithmreduces the time needed for execution and hence reduces thetotal execution time for the ITLBO-IPJAYA-SVM model asshown in Figure 15

Data Availability

(e data used to support the findings of this study areavailable online at httpswwwunbcacicdatasetsnslhtml

Conflicts of Interest

(e authors declare that they have no conflicts of interest

Acknowledgments

Special appreciation is due to Universiti Malaysia Pahang forthe sponsorship of this study approved by the Ministry ofHigher Education (MOHE) for Fundamental ResearchGrant Scheme (FRGS) with Vot No RDU190113

References

[1] A Sultana and M Jabbar ldquoIntelligent network intrusiondetection system using data mining techniquesrdquo in Pro-ceedings of the 2016 2nd International Conference on Appliedand 0eoretical Computing and Communication Technology(iCATccT) pp 329ndash333 IEEE Bangalore India July 2016

[2] R Rao ldquoJaya a simple and new optimization algorithm forsolving constrained and unconstrained opti- mizationproblemsrdquo International Journal of Industrial EngineeringComputations vol 7 no 1 pp 19ndash34 2016

[3] M Alsajri M A Ismail and S Abdul-Baqi ldquoA review on therecent application of Jaya optimization algorithmrdquo in Pro-ceedings of the 2018 1st Annual International Conference onInformation and Sciences (AiCIS) Springer Al-Fallujah Iraqpp 129ndash132 November 2018

[4] P Tao Z Sun and Z Sun ldquoAn improved intrusion detectionalgorithm based on GA and SVMrdquo IEEE Access vol 6pp 13624ndash13631 2018

[5] A S Eesa Z Orman and A M A Brifcani ldquoA novel feature-selection ap- proach based on the cuttlefish optimizationalgorithm for intrusion detection systemsrdquo Expert Systemswith Applications vol 42 no 5 pp 2670ndash2679 2015

[6] P Louvieris N Clewley and X Liu ldquoEffects-based featureidentification for network intrusion detectionrdquo Neuro-computing vol 121 pp 265ndash273 2013

[7] E De la Hoz A Ortiz J Ortega and B Prieto ldquoPCA filteringand probabilistic SOM for network intrusion detectionrdquoNeurocomputing vol 164 pp 71ndash81 2015

[8] S Mojtaba H Bamakan H Wang Y Tian and Y Shi ldquoAneffective intrusion detection framework based onMCLPSVMoptimized by time-varying chaos particle swarm optimiza-tionrdquo Neurocomputing vol 199 pp 90ndash102 2016

[9] R V Rao V J Savsani and D P Vakharia ldquoTeach-ingndashlearning-based optimization a novel method for

constrained mechanical design optimization problemsrdquordquoComputer-Aided Design vol 43 no 3 pp 303ndash315 2011

[10] S P Das and S Padhy ldquoA novel hybrid model using teaching-learning-based optimization and a support vector machine forcommodity futures index forecastingrdquo International Journalof Machine Learning and Cybernetics vol 9 no 1 pp 97ndash1112018

[11] S P Das N S Achary and S Padhy ldquoNovel hybrid SVM-TLBO forecasting model incorporating dimensionality re-duction techniquesrdquo Applied Intelligence vol 45 no 4pp 1148ndash1165 2016

[12] R V Rao V J Savsani and J Balic ldquoTeachingndashlearning-based optimization algorithm for unconstrained and con-strained real-parameter optimization problemsrdquordquo EngineeringOptimization vol 44 no 12 pp 1447ndash1462 2012

[13] R V Rao and V Patel ldquoAn improved teaching-learning-based optimization algorithm for solving uncon- strainedoptimization problemsrdquo Scientia Iranica vol 20 no 3pp 710ndash720 2013

[14] M Crepinsek S-H Liu and L Mernik ldquoA note on teach-ingndashlearning-based optimization algo- rithmrdquo InformationSciences vol 212 pp 79ndash93 2012

[15] M R Nayak C K Nayak and P K Rout ldquoApplication ofmulti-objective teaching learning based optimization algo-rithm to optimal power flow problemrdquo Procedia Technologyvol 6 pp 255ndash264 2012

[16] Y Xu L Wang S-Y Wang and M Liu ldquoAn effectiveteaching-learning-based optimization algorithm for theflexible job-shop scheduling problem with fuzzy processingtimerdquo Neurocomputing vol 148 pp 260ndash268 2015

[17] H E Kiziloz A Deniz T Dokeroglu and A Cosar ldquoNovelmultiobjective TLBO algorithms for the feature subset se-lection problemrdquoNeurocomputing vol 306 pp 94ndash107 2018

[18] S H Wang K Muhammad Y Lv et al ldquoIdentification of Al-coholism based on wavelet Renyi entropy and three-segmentencoded Jaya algorithmrdquo Complexity vol 2018 Article ID3198184 13 pages 2018

[19] H Migallon A Jimeno-Morenilla and J-L Sanchez-RomeroldquoParallel improvements of the Jaya optimization algorithmrdquoApplied Sciences vol 8 no 5 p 819 2018

[20] C Gong ldquoAn enhanced Jaya algorithm with a two groupAdaptionrdquo International Journal of Computational Intelli-gence Systems vol 10 no 1 pp 1102ndash1115 2017

[21] O Samuel N Javaid S Aslam and M H Rahim ldquoJAYAoptimization based energy management controller for smartgrid JAYA optimization based energy management con-trollerrdquo in Proceedings of the 2018 International Conference onComputing Mathematics and Engineering Technologies(iCoMET) March 2018

[22] K Yu J J Liang B Y Qu X Chen and H Wang ldquoPa-rameters identification of photovoltaic models using an im-proved JAYA optimization algorithmrdquo Energy Conversionand Management vol 150 pp 742ndash753 2017

[23] M Dash and H Liu ldquoFeature selection for classificationrdquoIntelligent Data Analysis vol 1 no 1-4 pp 131ndash156 1997

[24] S Dumais J Platt D Heckerman and M Sahami ldquoInductivelearning algorithms and representations for text categoriza-tionrdquo in Proceedings of the seventh international conference onInformation and knowledge management-CIKMrsquo98 Novem-ber 1998

[25] R Singh H Kumar and R K Singla ldquoAn intrusion detectionsystem using network traffic profiling and online sequentialextreme learning machinerdquo Expert Systems with Applicationsvol 42 no 22 pp 8609ndash8624 2015

Complexity 17

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity

Page 18: ImprovedTLBO …downloads.hindawi.com/journals/complexity/2020/5287684.pdfFSS is a multiobjective optimisation problem. e first objective is the number of features, and the second

[26] M Sokolova and L Guy ldquoA systematic analysis of perfor-mance measures for classification tasksrdquo Information Pro-cessing amp Management vol 45 no 4 pp 427ndash437 2009

[27] P Phoungphol Y Zhang and Y Zhao ldquoRobust multiclassclassification for learning from imbalanced biomedical datardquoTsinghua Science and Technology vol 17 no 6 pp 619ndash6282012

[28] A Jahan F Mustapha M Y Ismail S M Sapuan andM Bahraminasab ldquoA comprehensive VIKOR method formaterial selectionrdquo Materials amp Design vol 32 no 3pp 1215ndash1221 2011

[29] L Aljarah and S A Ludwig ldquoMapreduce intrusion detectionsystem based on a particle swarm optimization clusteringalgorithmrdquo in Proceedings of the 2013 IEEE Congress onEvolutionary Computation pp 955ndash962 IEEE CancunMexico June 2013

[30] K Khaleel M A Ismail U Yunan and S Kasim ldquoReview onintrusion detection system based on the goal of the detectionsystemrdquo International Journal of Integrated Engineeringvol 10 no 6 2018

[31] J D Rodriguez A Perez and J A Lozano ldquoSensitivityanalysis of k-Fold cross validation in prediction error esti-mationrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 3 pp 569ndash575 2010

[32] M Tavallaee E Bagheri W Lu and A A Ghorbani ldquoAdetailed analysis of the KDD CUP 99 data setrdquo in Proceedingsof the 2009 IEEE Symposium on Computational Intelligence forSecurity and Defense Applications pp 1ndash6 IEEE Ottawa ONCanada July 2009

[33] M Al-Qatf Y Lasheng M Al-Habib and K Al-SabahildquoDeep learning approach combining sparse autoencoder withSVM for network intrusion detectionrdquo IEEE Access vol 6pp 52843ndash52856 2018

[34] R Vinayakumar M Alazab K P Soman P PoornachandranA Al-Nemrat and S Venkatraman ldquoDeep learning approachfor intelligent intrusion detection systemrdquo IEEE Access vol 7pp 41525ndash41550 2019

[35] BA Tama M Comuzzi and K-H Rhee ldquoTSE-IDS a two-stage classifier ensemble for intelligent anomaly-based in-trusion detection systemrdquo IEEE Access vol 7 pp 94497ndash94507 2019

[36] M R Gauthama Raman N Somu K Kannan R Liscano andV S Shankar Sriram ldquoAn efficient intrusion detection systembased on hypergraph - genetic algorithm for parameter op-timization and feature selection in support vector machinerdquordquoKnowledge-Based Systems vol 134 pp 1ndash12 2017

[37] I S (aseen and C Aswani Kumar ldquoIntrusion detectionmodel using fusion of chi- square feature selection and multiclass SVMrdquo Journal of King Saud University - Computer andInformation Sciences vol 29 no 4 pp 462ndash472 2017

[38] Y Li S Yu J Bai and X Cheng ldquoTowards effective networkintrusion detection a hybrid model integrating Gini indexand GBDTwith PSOrdquo Journal of Sensors vol 2018 Article ID1578314 9 pages 2018

[39] A A Aburomman and M B I Reaz ldquoA novel weightedsupport vector machines multi- class classifier based ondifferential evolution for intrusion detection systemsrdquo In-formation Sciences vol 414 pp 225ndash246 2017

[40] J Esmaily and J Ghasemi ldquoA novel intrusion detectionsystems based on genetic algorithms-suggested features by themeans of different permutations of labelsrsquo ordersrdquo Interna-tional Journal of Engineering vol 30 no 10 pp 1494ndash15022017

[41] S Aljawarneh M Aldwairi and M B Yassein ldquoAnomaly-based intrusion detection system through feature selectionanalysis and building hybrid efficient modelrdquo Journal ofComputational Science vol 25 pp 152ndash160 2018

[42] Y Li J L Wang Z H Tian T B Lu and C Young ldquoBuildinglightweight intrusion detection system using wrapper-basedfeature selection mechanismsrdquo Computers amp Security vol 28no 6 pp 466ndash475 2009

[43] S U Jan S Ahmed V Shakhov and I Koo ldquoToward alightweight intrusion detection system for the internet ofthingsrdquo IEEE Access vol 7 pp 42450ndash42471 2019

18 Complexity