an efficient feature reduction comparison of machine learning algorithms for intrusion detection...


  • 7/29/2019 An Efficient Feature Reduction Comparison of Machine Learning Algorithms for Intrusion Detection System


    International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
    Web Site: www.ijettcs.org

    Volume 2, Issue 1, January - February 2013, ISSN 2278-6856, Page 66

    Abstract: Organisations have come to recognize that applied science in network security is very important for protecting their information. Intrusion detection presents an important line of defence against the variety of attacks that can compromise the security and proper functioning of information systems. In this paper we compare the performance of intrusion detection algorithms. Evaluating the execution of an Intrusion Detection System (IDS) for a given security system configuration is necessary to achieve real-time capability. We analyse two learning algorithms (NB and C4.5) for the task of detecting intrusions and compare their relative performance.

    Keywords: intrusion detection, machine learning, C4.5, NB, KDD 99

    1. INTRODUCTION

    An intrusion detection algorithm should consider the composite properties of attack behaviors to improve detection speed and detection accuracy. Because it requires analyzing large volumes of network data with high detection accuracy, intrusion detection has become an important research field for machine learning. In this work we present the C4.5 decision tree algorithm for intrusion detection based on machine learning. An Intrusion Detection System (IDS) is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of possible incidents. The IDS was first introduced in 1980 by James P. Anderson [1] and then improved by D. Denning [2] in 1987. There are two basic approaches to intrusion detection: Anomaly Detection and Misuse Detection (signature-based ID) [3]. Anomaly Detection is based on the assumption that an attacker's behavior differs from a normal user's behavior [4]. In this paper, we present the application of machine learning to intrusion detection. We analyse two learning algorithms (C4.5 and NB) for the task of detecting intrusions and compare their relative performance. The only data set available for intrusion detection experiments is the KDD data set. The KDD data set [5] contains 42 attributes (41 features plus the class label). The classes in the KDD99 [6] dataset can be categorized into five main classes: one normal class and four main intrusion classes. Data mining is a collection of techniques for the efficient automated discovery of previously unknown, valid, novel, useful and understandable patterns in large databases [18]. The field of intrusion detection has received increasing attention in recent years.

    2. RELATED WORK

    In 2007, Panda and Patra [7] proposed a method using naive Bayes to detect signatures of specific attacks, using the KDD99 dataset for their experiments. In the early 1980s, Stanford Research Institute (SRI) developed an Intrusion Detection Expert System (IDES) that monitors user behavior and detects suspicious events. Meng Jianliang [8] used the K-Means algorithm, an unsupervised learning technique, to cluster and analyze the data for intrusion detection. In 2010, Mohammadreza Ektefa et al. [8] compared C4.5 with SVM; the results revealed that C4.5 performs better than SVM in terms of both network intrusion detection and false alarm rate. Zubair A. Baig et al. (2011) proposed an AODE-based Intrusion Detection System for computer networks and suggested that Naive Bayes (NB) does not accurately detect network intrusions [9]. In 2010, Hai Nguyen et al. [10] applied C4.5 and BayesNet to intrusion detection on the KDD CUP 99 dataset. Jiong Zhang and Mohammad Zulkernine [11] performed intrusion detection using random forest algorithms in an anomaly-based NIDS. Cuixiao Zhang, Guobing Zhang and Shanshan Sun [12] used a mixed approach to intrusion detection. Various paradigms, namely Support Vector Machines [13], Neural Networks [14] and K-means based clustering [15], have been applied to intrusion detection because they have the advantage of discovering useful knowledge that describes a user's or program's behavior. There are two basic approaches to intrusion detection: Anomaly Detection and Misuse Detection (signature-based ID). Anomaly Detection is based on the assumption that an attacker's behavior differs from a normal user's behavior [16]. Shadmehr et al. [17] showed that the performance of the Bayes algorithm is better. Recently, research on machine learning for intrusion detection has received much attention in the computational intelligence community. In an intrusion detection algorithm, immense volumes of audit data must be analyzed in order to construct new detection rules for the increasing number of novel attacks in high-speed networks. An intrusion detection algorithm should consider the composite properties of attack behaviors to improve detection speed and detection accuracy.

    An Efficient Feature Reduction Comparison of Machine Learning Algorithms for Intrusion Detection System

    Upendra
    Assistant Professor, CSE Department, NIT Raipur, C.G., India


    3. METHODOLOGY

    3.1 Naive Bayes (NB)

    A Naive Bayes classifier [19], [20] is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions. A more descriptive term for the underlying probability model would be "independent feature model".

    In 2004, an analysis of the Bayesian classification problem showed that there are some theoretical reasons for the apparently unreasonable efficacy of naive Bayes classifiers. Still, a comprehensive comparison with other classification methods in 2006 showed that Bayes classification is outperformed by more current approaches, such as boosted trees or random forests [25].
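    The independence assumption can be illustrated with a minimal sketch of a categorical Naive Bayes classifier. It is not the WEKA implementation used in this paper: the function names, the toy records and the add-one (Laplace) smoothing are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, class_label)][value] -> count
    value_counts = defaultdict(Counter)
    distinct_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            distinct_values[i].add(value)
    return class_counts, value_counts, distinct_values


def predict_naive_bayes(model, row):
    """Return the class with the highest log posterior for one record."""
    class_counts, value_counts, distinct_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior P(class)
        for i, value in enumerate(row):
            # Each feature contributes independently: log P(value | class).
            # Add-one smoothing (an assumption here) avoids zero probabilities.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(distinct_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records (service, logged_in); not taken from KDD99.
rows = [("http", 1), ("http", 1), ("smtp", 1),
        ("private", 0), ("private", 0), ("http", 0)]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]
model = train_naive_bayes(rows, labels)
prediction = predict_naive_bayes(model, ("private", 0))
```

    Summing log probabilities rather than multiplying raw probabilities keeps the score numerically stable when many attributes are used.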

    3.2 C4.5

    Decision tree technology is a common, intuitive and fast classification method [22]. The C4.5 algorithm, implemented in WEKA as J48, was developed by John Ross Quinlan [23]. C4.5 is an extension of Quinlan's earlier Iterative Dichotomiser 3 (ID3) algorithm. J48 builds decision trees from a set of labelled training data using the concept of information entropy. A decision tree is a classifier expressed as a recursive partition of the instance space. It consists of nodes that form a rooted tree, meaning it is a directed tree with a node called the root that has no incoming edges. A node with outgoing edges is referred to as an internal or test node. All other nodes are called leaves (also known as terminal or decision nodes). Decision trees [24] are one of the most commonly used classification methods in supervised learning approaches.
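    The rooted-tree structure described above can be sketched as a small data structure. The tree below is hand-built and hypothetical (it is not a tree learned by C4.5 or J48), and the names Node and classify are chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Internal (test) nodes name an attribute; leaves carry a class label."""
    attribute: Optional[str] = None   # attribute tested at an internal node
    children: Dict[object, "Node"] = field(default_factory=dict)  # outcome -> subtree
    label: Optional[str] = None       # class label stored at a leaf

    def is_leaf(self) -> bool:
        return not self.children


def classify(node: Node, record: dict) -> str:
    """Follow the test outcomes from the root down to a leaf."""
    while not node.is_leaf():
        node = node.children[record[node.attribute]]
    return node.label


# Hypothetical hand-built tree; the splits are illustrative only.
tree = Node(
    attribute="logged_in",
    children={
        1: Node(label="normal"),
        0: Node(
            attribute="service",
            children={"http": Node(label="normal"),
                      "private": Node(label="attack")},
        ),
    },
)

result = classify(tree, {"logged_in": 0, "service": "private"})
```

    Each path from the root to a leaf corresponds to one region of the recursive partition of the instance space.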

    TABLE I. SELECTED 7 ATTRIBUTES WITH HIGHEST INFORMATION GAIN

    S No.   Feature name
    1       service
    2       src_bytes
    3       dst_bytes
    4       logged_in
    5       count
    6       dst_host_diff_srv_rate
    7       dst_host_srv_diff_host_rate
    8       Class
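    Ranking attributes by information gain and keeping the best k can be sketched as follows. This is a minimal illustration, assuming every attribute is categorical (numeric attributes such as src_bytes would first need to be discretized); the function names and the toy data are hypothetical and do not reproduce the WEKA procedure used for Table I.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Label entropy minus the expected entropy after splitting on `values`."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_k_features(columns, labels, k):
    """Rank named feature columns by information gain and keep the best k."""
    gains = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]


# Hypothetical toy data; the values are illustrative, not taken from KDD99.
labels = ["normal", "normal", "attack", "attack"]
columns = {
    "logged_in": [1, 1, 0, 0],                     # separates the classes perfectly
    "service": ["http", "smtp", "http", "smtp"],   # carries no class information
}
selected = top_k_features(columns, labels, k=1)
```

    In this toy example logged_in has a gain of 1 bit and service a gain of 0 bits, so only logged_in is retained.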

    3.3 Evaluation

    We constructed a confusion matrix (contingency table) to evaluate the classifiers' performance.

    TABLE II. A SAMPLE CONFUSION MATRIX

                              Predicted Class Positive    Predicted Class Negative
    Actual Class Positive     A                           B
    Actual Class Negative     C                           D

    In this confusion matrix, the value A is called a true positive and the value D is called a true negative. The value B is referred to as a false negative and C is known as a false positive.
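    The four cells of the confusion matrix can be counted directly from actual and predicted labels. The sketch below is illustrative only: the function name and the six labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive):
    """Return (A, B, C, D) = (TP, FN, FP, TN) for the given positive class."""
    a = b = c = d = 0
    for truth, guess in zip(actual, predicted):
        if truth == positive and guess == positive:
            a += 1   # A: true positive
        elif truth == positive:
            b += 1   # B: false negative
        elif guess == positive:
            c += 1   # C: false positive
        else:
            d += 1   # D: true negative
    return a, b, c, d


# Hypothetical labels for six connections.
actual = ["attack", "attack", "attack", "normal", "normal", "normal"]
predicted = ["attack", "attack", "normal", "attack", "normal", "normal"]
counts = confusion_counts(actual, predicted, positive="attack")
```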

    3.4 Accuracy

    This is the most basic measure of the performance of a learning method. It gives the proportion of correctly classified instances. From the confusion matrix, we can say that:

    $$\text{Accuracy} = \frac{A + D}{A + B + C + D}$$

    Precision is the ratio of the number of retrieved documents that are relevant to the total number of documents retrieved. Referring to the confusion matrix, we can define precision and recall for our purposes as

    $$\text{Precision} = \frac{A}{A + C}$$

    Recall is the ratio of the number of retrieved documents that are relevant to the total number of relevant documents.

    $$\text{Recall} = \frac{A}{A + B}$$
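    These measures can be computed from the four confusion-matrix cells as in the sketch below. The cell counts are hypothetical, and the F-measure is taken here to be the usual harmonic mean of precision and recall, a definition the text does not state explicitly.

```python
def metrics(a, b, c, d):
    """Accuracy, precision, recall and F-measure from confusion-matrix cells."""
    accuracy = (a + d) / (a + b + c + d)
    precision = a / (a + c) if (a + c) else 0.0
    recall = a / (a + b) if (a + b) else 0.0
    # Harmonic mean of precision and recall (assumed definition of F-measure).
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return accuracy, precision, recall, f_measure


# Hypothetical counts: A=90 true positives, B=10 false negatives,
# C=30 false positives, D=870 true negatives.
accuracy, precision, recall, f_measure = metrics(a=90, b=10, c=30, d=870)
```

    With these counts the accuracy is 0.96 although the precision is only 0.75, which shows why accuracy alone can be misleading when one class dominates the data.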

    4. PERFORMANCE EVALUATION AND RESULT

    Tables III and IV show the performance of the C4.5 and NB classification methods in terms of accuracy, learning time, error rate, average true positive rate, average false positive rate, average precision, average recall and average F-measure. The comparison is performed for 41, 11 and 7 attributes. The C4.5 and NB classifier models are built and tested on the dataset by means of 10-fold cross-validation. The Java heap size was set to 1024 MB for WEKA 3.6.2. The simulation platform is an Intel Core i3-2100 processor (3.10 GHz) system with 3 GB RAM and 500 GB of storage, running Microsoft Windows XP Service Pack 2. C4.5 was evaluated on the dataset using the 7 features selected with the Information Gain measure. The results of this test are summarized in the following table.

    TABLE III. RESULT OF C4.5 WITH SELECTED 7 ATTRIBUTES

    Parameter                      Value
    Accuracy                       99.8901 %
    Learning Time                  12.67 sec
    Error Rate                     0.1099 %
    Average True Positive Rate     0.999
    Average False Positive Rate    0.001
    Average Precision              0.999
    Average Recall                 0.999
    Average F-Measure              0.999
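    The 10-fold cross-validation used for these experiments partitions the records into ten folds and tests on each fold in turn while training on the other nine. The sketch below shows only this index bookkeeping; it neither shuffles nor stratifies the data, and it is an illustrative assumption rather than the procedure WEKA applies internally.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Hypothetical example: 25 records split into 10 folds.
folds = list(k_fold_indices(25, k=10))
```

    Every record appears in exactly one test fold, so the reported figures average over predictions made for all records.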

    Similar to the C4.5 tests, NB was also evaluated using 7 features of the dataset. The results of this evaluation are summarized in the table below.

    TABLE IV. RESULT OF NB WITH SELECTED 7 ATTRIBUTES

    Parameter                      Value
    Accuracy                       93.5698 %
    Learning Time                  1.3 sec
    Error Rate                     6.4302 %
    Average True Positive Rate     0.936
    Average False Positive Rate    0.054
    Average Precision              0.949
    Average Recall                 0.936
    Average F-Measure              0.942
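    The values in Tables III and IV can be checked for internal consistency: the error rate should equal 100 % minus the accuracy, and the F-measure should be close to the harmonic mean of precision and recall. The sketch below performs this arithmetic on the reported numbers. Averaged F-measures may be computed per class before averaging, so the second check is only expected to hold approximately; the tolerance of 0.001 is an assumption.

```python
# Values reported in Tables III and IV (7 selected attributes).
reported = {
    "C4.5": {"accuracy": 99.8901, "error": 0.1099,
             "precision": 0.999, "recall": 0.999, "f": 0.999},
    "NB": {"accuracy": 93.5698, "error": 6.4302,
           "precision": 0.949, "recall": 0.936, "f": 0.942},
}


def harmonic_mean(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


checks = {}
for name, row in reported.items():
    # Error rate should be the complement of accuracy (both in percent).
    error_ok = abs((100.0 - row["accuracy"]) - row["error"]) < 1e-6
    # F-measure should agree with the harmonic mean up to rounding.
    f_ok = abs(harmonic_mean(row["precision"], row["recall"]) - row["f"]) < 1e-3
    checks[name] = (error_ok, f_ok)
```

    Both checks pass for both classifiers, so the reported figures are mutually consistent to the precision shown.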

    In this paper, we have reduced the feature set to 7 attributes and reported the results. We now compare the results of the C4.5 and NB algorithms with the reduced 7 attributes; only then can we conclude which algorithm is better for intrusion detection.

    Figure 1 below shows the comparison of the accuracy of C4.5 and NB.

    Figure 1. Comparison of accuracy for C4.5 and NB using all attributes, selected 11 attributes and selected 7 attributes.

    From Figure 1 it is clear that the information gain feature reduction method gives better accuracy, which is desirable for a good Intrusion Detection System. In the case of C4.5 in particular, accuracy is 99 % for all attributes, 11 attributes and 7 attributes.

    We now compare the True Positive Rate of the C4.5 and NB algorithms with the selected 7 attributes.

    [Figure 2: bar chart; y-axis: True Positive Rate (0 to 1.2); x-axis: Attacks; series: C4.5 and NB]

    Figure 2. Comparison of TPR for C4.5 and NB with selected 7 attributes.

    For a good IDS the True Positive Rate should be high. Figure 2 shows that the True Positive Rate of the C4.5 algorithm is higher when we reduce the features of the data set using information gain. In the case of C4.5 in particular the True Positive Rate is approximately 1 (0.999 on average, Table III), and Figure 2 also shows that the TPR of C4.5 is higher than that of the NB algorithm with the selected 7 attributes.

    We now compare the False Positive Rate of the C4.5 and NB algorithms with the selected 7 attributes. The records are grouped into four main attack categories (Probing; DoS: denial-of-service; R2L: remote to local; and U2R: user to root) and one normal class.

    Figure 3. Comparison of FPR for C4.5 and NB using selected 7 attributes.
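    Per-class true positive and false positive rates such as those plotted in Figures 2 and 3 can be derived from a multi-class confusion matrix by treating each class in turn as the positive class. The sketch below is illustrative only: the function name is hypothetical, and the counts cover three of the five classes and are invented, not results from this paper.

```python
def per_class_rates(matrix, classes):
    """One-vs-rest (TPR, FPR) for each class of a square confusion matrix.

    matrix[i][j] is the number of records of actual class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    rates = {}
    for k, name in enumerate(classes):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # actual k, predicted otherwise
        fp = sum(row[k] for row in matrix) - tp   # predicted k, actually otherwise
        tn = total - tp - fn - fp
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        rates[name] = (tpr, fpr)
    return rates


# Hypothetical counts; not results from the paper.
classes = ["normal", "DoS", "Probing"]
matrix = [
    [95, 3, 2],    # actual normal
    [4, 96, 0],    # actual DoS
    [10, 0, 40],   # actual Probing
]
rates = per_class_rates(matrix, classes)
```

    A weighted average of these per-class rates, with weights proportional to class size, gives a single summary value of the kind reported as an average rate.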


    For a better Intrusion Detection System, the false positive rate should be low. Figure 3 shows that the FPR of the C4.5 algorithm is lower when we reduce the data set to 7 features using information gain. In the case of C4.5 in particular the FPR is approximately 0 (0.001 on average, Table III), whereas the False Positive Rate of the NB algorithm is greater than 0 with the selected 7 attributes (0.054 on average, Table IV).

    Figure 4. Comparison of time taken to build the model for C4.5 and NB using all attributes, selected 11 attributes and selected 7 attributes.

    From Figures 1, 2, 3 and 4 it is clear that the accuracy, TPR and FPR of the C4.5 algorithm are better than those of the NB algorithm. So we can say that feature reduction using information gain is a better technique.

    5. CONCLUSIONS

    In this paper we have presented C4.5 with 7 selected attributes for intrusion detection, performed feature set reduction, and evaluated the resulting performance. The results show that, after feature selection from 41 attributes to 11 and 7 attributes, the overall performance of C4.5 is higher than that of NB. The comparison shows that, in general, C4.5 has the highest classification accuracy with the lowest error rate. C4.5 achieves better detection rates than NB and a higher true positive rate. We also found that the learning time of the algorithm decreased drastically. We evaluated two machine learning algorithms, C4.5 and NB, on the dataset built in this exercise. Based on accuracy, true positive rate, false positive rate and error rate, C4.5 was the better classifier; with 7 attributes NB had the shorter learning time (1.3 sec versus 12.67 sec, Tables III and IV).

    REFERENCES

    [1] James P. Anderson, "Computer Security Threat Monitoring and Surveillance," Technical Report, James P. Anderson Co., Fort Washington, Pennsylvania, USA, pp. 9817, April 1980.

    [2] Dorothy E. Denning, "An Intrusion Detection Model," IEEE Transactions on Software Engineering (TSE), Volume 13, No. 2, pp. 222-232, February 1987.

    [3] Li Min and Wang Dongliang, "Anomaly Intrusion Detection Based on SOM," IEEE WASE International Conference on Information Engineering, IEEE Computer Society, 2009, pp. 40-44.

    [4] Lida Rashidi, Sattar Hashem and Ali Hamzeh, "Anomaly detection in categorical datasets using Bayesian networks," AICI'11 Proceedings of the Third International Conference on Artificial Intelligence and Computational Intelligence, Volume Part II, Springer-Verlag, Berlin, Heidelberg, 2011, pp. 610-619.

    [5] Knowledge Discovery in Databases DARPA archive, Task Description, KDDCUP 1999 Dataset, http://www.kdd.ics.uci.edu/databases/kddcup99/task.html

    [6] Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu and Ali A. Ghorbani, "A Detailed Analysis of the KDD CUP 99 Data Set," Proceedings of the 2009 IEEE Symposium on Computational Intelligence in Security and Defense Applications (CISDA 2009), IEEE 2009.

    [7] M. Panda and M. R. Patra, "Network intrusion detection using naive Bayes," International Journal of Computer Science and Network Security (IJCSNS), Volume 7, No. 12, December 2007, pp. 258-263.

    [8] Meng Jianliang and Shang Haikun, "The application on intrusion detection based on K-Means cluster algorithm," International Forum on Information Technology and Application, 2009.

    [9] Zubair A. Baig, Abdulrhman S. Shaheen and Radwan AbdelAal, "An AODE-based Intrusion Detection System for Computer Networks," pp. 28-35, IEEE 2011.

    [10] Hai Nguyen, Katrin Franke and Slobodan Petrovic, "Improving Effectiveness of Intrusion Detection by Correlation Feature Selection," International Conference on Availability, Reliability and Security, pp. 17-24, IEEE 2010.

    [11] Jiong Zhang and Mohammad Zulkernine, "Anomaly based Network Intrusion detection with unsupervised outlier detection," School of Computing, Queen's University, Kingston, Ontario, Canada. IEEE International Conference ICC 2006, Volume 9, pp. 2388-2393, 11-15 June 2006.


    [12] Cuixiao Zhang, Guobing Zhang and Shanshan Sen, "A mixed unsupervised clustering based Intrusion detection model," Third International Conference on Genetic and Evolutionary Computing, 2009.

    [13] Jiaqi Jiang, Ru Li, Tianhong Zheng, Feiqin Su and Haicheng Li, "A new intrusion detection system using Class and Sample Weighted C-Support Vector Machine," Third International Conference on Communications and Mobile Computing, IEEE Computer Society, 2011, pp. 51-54.

    [14] E. T. Ferreira, G. A. Carrijo, R. de Oliveira and N. V. S. Araujo, "Intrusion Detection System with Wavelet and Neural Artifical Network Approach for Network Computers," IEEE Latin America Transactions, Vol. 9, No. 5, September 2011, pp. 832-837.

    [15] Yang Zhong, Hirohumi Yamaki and Hiroki Takakura, "A Grid-Based Clustering for Low-Overhead Anomaly Intrusion Detection," IEEE 2011, pp. 17-24.

    [16] Lida Rashidi, Sattar Hashem and Ali Hamzeh, "Anomaly detection in categorical datasets using Bayesian networks," AICI'11 Proceedings of the Third International Conference on Artificial Intelligence and Computational Intelligence, Volume Part II, Springer-Verlag, Berlin, Heidelberg, 2011, pp. 610-619.

    [17] R. Shadmehr and Z. D'Argenio, "A comparison of a neural network based estimator and two statistical estimators in a sparse and noisy environment," in IJCNN-90 Proceedings of the International Joint Conference on Neural Networks, pp. 289-292, Ann Arbor, MI, IEEE Neural Networks Council, 1990.

    [18] Julie M. David and Kannan Balakrishnan, "Significance of Classification Techniques In Prediction Of Learning Disabilities," International Journal of Artificial Intelligence & Applications (IJAIA), Vol. 1, No. 4, 2010.

    [19] R. Dogaru, "A modified Naive Bayes classifier for efficient implementations in embedded systems," Signals, Circuits and Systems (ISSCS), IEEE 10th International Symposium on, Iasi, June 30 - July 1, 2011, pp. 1-4.

    [20] Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques," Second Edition, University of Illinois at Urbana-Champaign, The Morgan Kaufmann Series in Data Management Systems, Elsevier, 2007.

    [21] Rebecca G. Bace, "Intrusion Detection," Sams, December 1999, 20th International Conference on Very Large Data Bases (VLDB), Morgan Kaufmann, pp. 487-499.

    [22] Pingchuan Ma, "Log Analysis-Based Intrusion Detection via Unsupervised Learning," Master of Science, School of Informatics, University of Edinburgh, 2003.

    [23] John Ross Quinlan, "C4.5: Programs for Machine Learning," Morgan Kaufmann Publishers, San Mateo, CA, 1993.

    [24] Kamarulrifin Abd Jalil and Mohamad Noorman Masrek, "Comparison of Machine learning Algorithms Performance in Detecting Network Intrusion," IEEE 2010 International Conference on Networking and Information Technology, pp. 221-226.

    [25] Shaohua Teng, Hongle Du, Wei Zhang and Xiufen Fu, "A Cooperative Network Intrusion Detection Based on Heterogeneous Distance Function Clustering," 14th IEEE International Conference on Computer Supported Cooperative Work in Design, pp. 140-145, 14-16 April 2010.