COMPARISON OF CLASSIFICATION ALGORITHMS
FOR MACHINE LEARNING BASED
NETWORK INTRUSION DETECTION SYSTEMS
Dr. Srinivasa K. G., P. N. Pavan Kumar
M. S. Ramaiah Institute of Technology, Bangalore, India
22 December 2012
Presented at: International Conference on Electrical, Communication and Information Technologies, ICECIT-2012
Network Intrusion Detection Systems (NIDS)
• Data stored on networked systems is at increasing risk as
network attacks grow in number and sophistication.
• NIDS play a prominent role in any modern security system.
• Traditional IDS are signature-based systems; they cannot
identify novel attacks.
• Newer attacks, whose signatures have not yet been
identified, pose a bigger security threat than the existing,
known ones.
Machine Learning in NIDS
• Based on a model generated from a standard dataset,
attacks can be distinguished from regular network traffic
patterns.
• Machine Learning algorithms:
1. Clustering Algorithms
2. Classification Algorithms
• This paper compares the performance of classification
algorithms for Machine Learning based NIDS.
Classification Algorithms
1. Bayesian Learning: Computes a posterior probability distribution by combining
prior knowledge with observed data. [4]
• Bayesian Network
• Naïve Bayes Classifier
2. Meta Learning: Uses experience to change certain aspects of a learning algorithm,
or the learning method itself, so that the modified learner is better than the
original at learning from additional experience. [6]
• AdaBoostM1
• Rotation Forest
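To make the Bayesian idea above concrete, here is a minimal Naïve Bayes sketch in plain Python. The connection fields (protocol, service, flag) and the toy records are invented for illustration; the paper itself uses Weka's BayesNetwork and NaiveBayes implementations, not this code.

```python
from collections import Counter, defaultdict

def train_naive_bayes(records, labels):
    """Estimate class priors and per-feature value counts per class."""
    priors = Counter(labels)
    cond = defaultdict(Counter)  # (feature_index, class) -> value counts
    for rec, y in zip(records, labels):
        for i, v in enumerate(rec):
            cond[(i, y)][v] += 1
    return priors, cond

def classify(rec, priors, cond):
    """Pick the class maximising P(class) * prod P(feature|class),
    with simple add-one (Laplace) smoothing for unseen values."""
    total = sum(priors.values())
    best, best_score = None, float("-inf")
    for y, n in priors.items():
        score = n / total
        for i, v in enumerate(rec):
            counts = cond[(i, y)]
            score *= (counts[v] + 1) / (n + len(counts) + 1)
        if score > best_score:
            best, best_score = y, score
    return best

# Hypothetical connection records: (protocol, service, flag)
records = [("tcp", "http", "SF"), ("tcp", "http", "SF"),
           ("udp", "dns", "SF"), ("tcp", "private", "S0"),
           ("tcp", "private", "S0")]
labels = ["normal", "normal", "normal", "attack", "attack"]
priors, cond = train_naive_bayes(records, labels)
print(classify(("tcp", "private", "S0"), priors, cond))  # attack
```

Despite its naïve independence assumption between features, this family of classifiers is fast to train, which matters for the time and memory criteria compared later.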
Classification Algorithms (contd.)
3. Rule-Based Learning: These algorithms learn a standard set of nested conditions, arranged hierarchically.
• Decision Table
• NNge Classifier
4. Decision Tree Learning: A decision tree is a classifier depicted in a flowchart-like tree structure. It is comprehensive in nature and resembles human reasoning. [7]
• J48
• NBTree
• Random Forest
• Random Tree
• REP Tree
• Simple CART
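The "nested conditions" of rule-based learning and the flowchart structure of a decision tree can both be pictured as nested tests over connection fields. The hand-built tree below is purely illustrative (the field names and thresholds are invented); algorithms such as J48 induce a structure like this automatically from training data.

```python
def classify_connection(conn):
    """Hand-built decision tree over hypothetical NSL-KDD-style fields.
    Each internal node tests one attribute; each leaf assigns a class."""
    if conn["flag"] == "S0":                       # half-open handshake
        return "attack" if conn["count"] > 50 else "normal"
    if conn["service"] == "private":               # unusual target service
        return "attack" if conn["src_bytes"] == 0 else "normal"
    return "normal"

print(classify_connection({"flag": "S0", "count": 120,
                           "service": "http", "src_bytes": 300}))  # attack
```

Reading a path from the root to a leaf yields a human-readable rule, which is why the slide describes these classifiers as resembling human reasoning.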
Classification Algorithms (contd.)
5. Classifier Functions:
• Support Vector Machine
• Multilayer Perceptron
• Radial Basis Function Network
6. Lazy Learning Algorithms: Do not construct a model during training; computation is
deferred until new instances are classified.
• KStar
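The lazy-learning idea can be sketched with a plain nearest-neighbour classifier: training just stores the instances, and all work happens at query time. KStar itself uses an entropy-based distance; the Euclidean distance and toy points below are simplifications for illustration.

```python
def knn_classify(train, query, k=3):
    """Lazy learner: no model is built up front; the k nearest stored
    instances are found per query and vote on the class."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda t: dist(t[0], query))[:k]
    votes = [y for _, y in nearest]
    return max(set(votes), key=votes.count)

# Toy 2-D feature vectors standing in for scaled connection features
train = [((0.1, 0.2), "normal"), ((0.2, 0.1), "normal"),
         ((0.9, 0.8), "attack"), ((0.8, 0.9), "attack"),
         ((0.85, 0.7), "attack")]
print(knn_classify(train, (0.9, 0.75)))  # attack
```

The trade-off is visible in the results later: training is essentially free, but every classification must scan the stored instances, which is costly on large test sets.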
Comparison of Classification Algorithms
Based on three criteria:
1. Accuracy of the models generated
2. Time taken for model generation
3. Memory performance during model generation and
evaluation
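The three criteria can be measured in one harness. The sketch below uses Python's standard `time` and `tracemalloc` modules; the paper's actual measurements come from Weka, and the memorising "classifier" here is a trivial stand-in, not one of the compared algorithms.

```python
import time
import tracemalloc

def evaluate(build_model, predict, train, test):
    """Measure the three comparison criteria for one classifier:
    build time, peak memory during model generation, test accuracy."""
    tracemalloc.start()
    t0 = time.perf_counter()
    model = build_model(train)
    build_seconds = time.perf_counter() - t0
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    correct = sum(predict(model, x) == y for x, y in test)
    return {"accuracy": correct / len(test),
            "build_seconds": build_seconds,
            "peak_bytes": peak_bytes}

# Trivial stand-in classifier: memorise training pairs
build = lambda train: {x: y for x, y in train}
pred = lambda model, x: model.get(x, "normal")
stats = evaluate(build, pred,
                 [(1, "normal"), (2, "attack")],
                 [(1, "normal"), (2, "attack"), (3, "normal")])
print(stats["accuracy"])  # 1.0
```

Running the same harness over each algorithm on identical train/test splits is what makes the three result tables directly comparable.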
Network Data -> Pre-Processing -> Conversion to ARFF -> NSL-KDD Training Data -> Classifier Algorithm -> NIDS Model
NSL-KDD Test Data -> NIDS Model -> Testing Output
Figure 1: Block Diagram of Experimental Setup
Dataset Used: NSL-KDD
The NSL-KDD data set [14] has the following advantages
over the original KDD data set:
• Does not include redundant records in the train set.
• There are no duplicate records in the proposed test sets.
• The number of selected records from each difficulty level group is
inversely proportional to the percentage of records in the original
KDD data set.
• The number of records in the train and test sets are reasonable.
• Consequently, evaluation results of different research works will be
consistent and comparable.
Software Used: Weka
• Waikato Environment for Knowledge Analysis.
• Weka supports several standard data mining tasks.
• Written in Java. Freely available under the GNU General Public License (GPL).
• Developed at the University of Waikato, New Zealand.
• Attribute-Relation File Format (ARFF).
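The "Conversion to ARFF" step in Figure 1 produces Weka's plain-text input format. A minimal serialiser is sketched below; the relation name, attributes, and rows are invented examples, and a real pipeline would also have to handle quoting, missing values, and the full NSL-KDD attribute list.

```python
def to_arff(relation, attributes, rows):
    """Serialise records into Weka's ARFF text format.
    attributes: list of (name, type) pairs, where type is the string
    'numeric' or a list of nominal values."""
    lines = [f"@relation {relation}", ""]
    for name, typ in attributes:
        spec = typ if typ == "numeric" else "{" + ",".join(typ) + "}"
        lines.append(f"@attribute {name} {spec}")
    lines += ["", "@data"]
    lines += [",".join(map(str, row)) for row in rows]
    return "\n".join(lines)

print(to_arff("nids",
              [("duration", "numeric"),
               ("protocol", ["tcp", "udp", "icmp"]),
               ("class", ["normal", "attack"])],
              [(0, "tcp", "normal"), (12, "udp", "attack")]))
```

Each `@attribute` line declares a column (numeric, or nominal with its allowed values), and each line after `@data` is one comma-separated instance.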
Results: Accuracy of models generated
Class   Algorithm        Accuracy (%)
Bayes   BayesNetwork     74.4322
Bayes   NaiveBayes       76.1178
Meta    AdaBoostM1       78.4422
Meta    RotationForest   77.4885
Rule    DecisionTable    72.5958
Rule    NNge             78.9168
Tree    J48              81.5339
Tree    NBTree           78.6107
Tree    RandomForest     77.7679
Tree    RandomTree       81.3565
Tree    REPTree          81.5073
Tree    SimpleCART       80.3229
Results: Time taken for model generation
Class   Algorithm        Time (seconds)
Bayes   BayesNetwork     7.31
Bayes   NaiveBayes       2.03
Meta    AdaBoostM1       32.77
Meta    RotationForest   1719.03
Rule    DecisionTable    111.36
Rule    NNge             62.14
Tree    J48              45.4
Tree    NBTree           423.25
Tree    RandomForest     18.87
Tree    RandomTree       2.13
Tree    REPTree          7.46
Tree    SimpleCART       266.09
Results: Memory performance
Class   Algorithm        Memory Used (GB)
Bayes   BayesNetwork     3.7
Bayes   NaiveBayes       3.7
Meta    AdaBoostM1       2.8
Meta    RotationForest   4.6
Rule    DecisionTable    2.8
Rule    NNge             2.8
Tree    J48              2.8
Tree    NBTree           3.7
Tree    RandomForest     2.8
Tree    RandomTree       2.8
Tree    REPTree          4.6
Tree    SimpleCART       3.7
Conclusion
• Accuracy of the generated models alone is not a good
criterion for evaluating classifier algorithms.
• Other criteria, such as time taken for model generation and
memory performance during model generation and testing,
are also important.
• Decision Tree based classifier algorithms are better suited
for NIDS, followed by Bayesian learning algorithms.
• Overall, the J48 classification algorithm, followed by Random
Tree, gave the best results considering all criteria.
References
1. Jiong Zhang, Mohammad Zulkernine: Network Intrusion Detection using Random Forests, PST (2005).
2. Nils J. Nilsson, Introduction to Machine Learning. California, United States of America (1999).
3. Mohd Fauzi bin Othman, Thomas Moh Shan Yau, Comparison of Different Classification Techniques Using Weka for Breast Cancer. International Conference on Biomedical Engineering (BIOMED 2006), 11-14 December 2006, Kuala Lumpur (2006).
4. David Poole, Alan Mackworth, Artificial Intelligence: Foundations of Computational Agents, Cambridge University Press, (2010).
5. T. Schaul and J. Schmidhuber. Metalearning. Scholarpedia, 5(6):4650, (2010).
6. George H. John and Pat Langley, Estimating Continuous Distributions in Bayesian Classifiers. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. pp. 338-345. Morgan Kaufmann, San Mateo (1995).
7. Barros, R.C.; Basgalupp, M.P.; de Carvalho, A.C.P.L.F.; Freitas, A.A.; , "A Survey of Evolutionary Algorithms for Decision-Tree Induction," Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on , vol.42, no.3, pp.291-312, (May 2012).
8. Yoav Freund and Robert E. Schapire , Experiments with a new boosting algorithm. Proc International Conference on Machine Learning, pages 148-156, Morgan Kaufmann, San Francisco (1996).
9. Ian H.Witten, Eibe Frank. Data Mining Practical Machine Learning Tools and Techniques, (2005).
References (contd.)
10. Juan J. Rodriguez, Ludmila I. Kuncheva, Carlos J. Alonso, Rotation Forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence. 28(10):1619-1630, (2006).
11. Leonard Bolc, Ryszard Tadeusiewicz, Leszek J. Chmielewski, Konrad W. Wojciechowski: Computer Vision and Graphics - International Conference, ICCVG 2010, Warsaw, Poland, September 20-22, 2010, Proceedings, Part II Springer (2010).
12. Weka software, Machine Learning, [http://www.cs.waikato.ac.nz/ml/Weka/], The University of Waikato, Hamilton, New Zealand.
13. Overview for extended Weka including Ensembles of Hierarchically Nested Dichotomies, [http://www.dbs.informatik.uni-muenchen.de/~zimek/diplomathesis/implementations/EHNDs/doc/].
14. Tavallaee, M.; Bagheri, E.; Wei Lu; Ghorbani, A.A.; , "A detailed analysis of the KDD CUP 99 data set," Computational Intelligence for Security and Defense Applications, 2009. CISDA 2009. IEEE Symposium on , vol., no., pp.1-6, 8-10, (July 2009).
15. Hettich, S. and Bay, S. D. (1999). The UCI KDD Archive [http://kdd.ics.uci.edu]. Irvine, CA: University of California, Department of Information and Computer Science.
16. The NSL KDD Dataset, [http://nsl.cs.unb.ca/NSL-KDD/], (2009).
17. Weka (machine learning) – article on Wikipedia, [http://en.wikipedia.org/wiki/Weka_(machine_learning)].
THANK YOU
Dr. Srinivasa K G
Pavan Kumar P N