layered approach

Post on 18-May-2015

Category:

Education

TRANSCRIPT

  • 1. IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 7, NO. 1, JANUARY-MARCH 2010, page 35

Layered Approach Using Conditional Random Fields for Intrusion Detection

Kapil Kumar Gupta, Baikunth Nath, Senior Member, IEEE, and Ramamohanarao Kotagiri, Member, IEEE

Abstract: Intrusion detection faces a number of challenges; an intrusion detection system must reliably detect malicious activities in a network and must perform efficiently to cope with the large amount of network traffic. In this paper, we address these two issues of Accuracy and Efficiency using Conditional Random Fields and Layered Approach. We demonstrate that high attack detection accuracy can be achieved by using Conditional Random Fields and high efficiency by implementing the Layered Approach. Experimental results on the benchmark KDD '99 intrusion data set show that our proposed system based on Layered Conditional Random Fields outperforms other well-known methods such as the decision trees and the naive Bayes. The improvement in attack detection accuracy is very high, particularly, for the U2R attacks (34.8 percent improvement) and the R2L attacks (34.5 percent improvement). Statistical Tests also demonstrate higher confidence in detection accuracy for our method. Finally, we show that our system is robust and is able to handle noisy data without compromising performance.

Index Terms: Intrusion detection, Layered Approach, Conditional Random Fields, network security, decision trees, naive Bayes.

1 INTRODUCTION

Intrusion detection as defined by the SysAdmin, Audit, Networking, and Security (SANS) Institute is the art of detecting inappropriate, inaccurate, or anomalous activity [6]. Today, intrusion detection is one of the high priority and challenging tasks for network administrators and security professionals. More sophisticated security tools mean that the attackers come up with newer and more advanced penetration methods to defeat the installed security systems [4] and [24]. Thus, there is a need to safeguard the networks from known vulnerabilities and at the same time take steps to detect new and unseen, but possible, system abuses by developing more reliable and efficient intrusion detection systems. Any intrusion detection system has some inherent requirements. Its prime purpose is to detect as many attacks as possible with minimum number of false alarms, i.e., the system must be accurate in detecting attacks. However, an accurate system that cannot handle large amount of network traffic and is slow in decision making will not fulfill the purpose of an intrusion detection system. We desire a system that detects most of the attacks, gives very few false alarms, copes with large amount of data, and is fast enough to make real-time decisions.

Intrusion detection started in around 1980s after the influential paper from Anderson [10]. Intrusion detection systems are classified as network based, host based, or application based depending on their mode of deployment and data used for analysis [11]. Additionally, intrusion detection systems can also be classified as signature based or anomaly based depending upon the attack detection method. The signature-based systems are trained by extracting specific patterns (or signatures) from previously known attacks while the anomaly-based systems learn from the normal data collected when there is no anomalous activity [11].

Another approach for detecting intrusions is to consider both the normal and the known anomalous patterns for training a system and then performing classification on the test data. Such a system incorporates the advantages of both the signature-based and the anomaly-based systems and is known as the Hybrid System. Hybrid systems can be very efficient, subject to the classification method used, and can also be used to label unseen or new instances as they assign one of the known classes to every test instance. This is possible because during training the system learns features from all the classes. The only concern with the hybrid method is the availability of labeled data. However, data requirement is also a concern for the signature- and the anomaly-based systems as they require completely anomalous and attack-free data, respectively, which are not easy to ensure.

The rest of this paper is organized as follows: In Section 2, we discuss the related work with emphasis on various methods and frameworks used for intrusion detection. We describe the use of Conditional Random Fields (CRFs) for intrusion detection [23] in Section 3 and the Layered Approach [22] in Section 4. We then describe how to integrate the Layered Approach and the CRFs in Section 5. In Section 6, we give our experimental results and compare our method with other approaches that are known to perform well. We observe that our proposed system, Layered CRFs, performs significantly better than other systems. We study the robustness of our method in Section 7 by introducing noise in the system. We discuss feature selection in Section 8 and draw conclusions in Section 9.

2 RELATED WORK

The field of intrusion detection and network security has been around since late 1980s. Since then, a number of

The authors are with the Department of Computer Science and Software Engineering, and NICTA Victoria Research Laboratory, The University of Melbourne, Parkville 3010, Australia. E-mail: {kgupta, bnath, rao}@csse.unimelb.edu.au.
Manuscript received 6 Mar. 2007; revised 11 Dec. 2007; accepted 28 Jan. 2008; published online 12 Mar. 2008.
For information on obtaining reprints of this article, please send e-mail to: tdsc@computer.org, and reference IEEECS Log Number TDSC-2007-03-0031.
Digital Object Identifier no. 10.1109/TDSC.2008.20.
1545-5971/10/$26.00 © 2010 IEEE. Published by the IEEE Computer Society.

2. methods and frameworks have been proposed and many systems have been built to detect intrusions. Various techniques such as association rules, clustering, naive Bayes classifier, support vector machines, genetic algorithms, artificial neural networks, and others have been applied to detect intrusions. In this section, we briefly discuss these techniques and frameworks.

Lee et al. introduced data mining approaches for detecting intrusions in [30], [31], and [32]. Data mining approaches for intrusion detection include association rules and frequent episodes, which are based on building classifiers by discovering relevant patterns of program and user behavior. Association rules [8] and frequent episodes are used to learn the record patterns that describe user behavior. These methods can deal with symbolic data, and the features can be defined in the form of packet and connection details. However, mining of features is limited to entry level of the packet and requires the number of records to be large and sparsely populated; otherwise, they tend to produce a large number of rules that increase the complexity of the system [7].

Data clustering methods such as the k-means and the fuzzy c-means have also been applied extensively for intrusion detection [36] and [39]. One of the main drawbacks of the clustering technique is that it is based on calculating numeric distance between the observations, and hence, the observations must be numeric. Observations with symbolic features cannot be easily used for clustering, resulting in inaccuracy. In addition, the clustering methods consider the features independently and are unable to capture the relationship between different features of a single record, which further degrades attack detection accuracy.

Naive Bayes classifiers have also been used for intrusion detection [9]. However, they make strict independence assumption between the features in an observation resulting in lower attack detection accuracy when the features are [...]

[...] often hard to select the best possible architecture for a neural network. Support vector machines have also been used for detecting intrusions [26]. Support vector machines map real-valued input feature vector to a higher dimensional feature space through nonlinear mapping and can provide real-time detection capability, deal with large dimensionality of data, and can be used for binary-class as well as multiclass classification. Other approaches for detecting intrusion include the use of genetic algorithm and autonomous and probabilistic agents for intrusion detection [1] and [5]. These methods are generally aimed at developing a distributed intrusion detection system.

To overcome the weakness of a single intrusion detection system, a number of frameworks have been proposed, which describe the collaborative use of network-based and host-based systems [45]. Systems that employ both signature-based and behavior-based techniques are discussed in [19] and [41]. In [32], the authors describe a data mining framework for building adaptive intrusion detection models. A distributed intrusion detection framework based on mobile agents is discussed in [12].

The most closely related work, to our work, is of Lee et al. [30], [31], and [32]. They, however, consider a data mining approach for mining association rules and finding frequent episodes in order to calculate the support and confidence of the rules separately. Instead, in our work, we define features from the observations as well as from the observations and the previous labels and perform sequence labeling via the CRFs to label every feature in the observation. This setting is sufficient for modeling the correlation between different features of an observation. We also compare our work with [21], which describes the use of maximum entropy principle for detecting anomalies in the network traffic. The key difference between [21] and our work is that the authors in [21] use only the normal data during training and build a baseline system, i.e., a behavior-based system, while we tra
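
As an illustration of the support and confidence measures mentioned above in connection with association rule mining, the following is a minimal Python sketch. It is not taken from the paper or from the work of Lee et al.; the function name `support_and_confidence`, the record encoding as sets of `feature=value` strings, and the four toy records are all assumptions made for this example. Support is computed as the fraction of records containing both sides of the rule, and confidence as the fraction of records containing the antecedent that also contain the consequent.

```python
from typing import FrozenSet, Iterable, List, Tuple


def support_and_confidence(
    records: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent.

    Hypothetical helper for illustration only: each record is a set of
    "feature=value" items describing one connection.
    """
    if not records:
        raise ValueError("at least one record is required")
    lhs = frozenset(antecedent)
    both = lhs | frozenset(consequent)
    # Count records that contain the antecedent, and those that contain
    # the antecedent and the consequent together.
    n_lhs = sum(1 for record in records if lhs <= record)
    n_both = sum(1 for record in records if both <= record)
    support = n_both / len(records)
    # Confidence is undefined when the antecedent never occurs; 0.0 is
    # returned in that case as a convention chosen for this sketch.
    confidence = n_both / n_lhs if n_lhs else 0.0
    return support, confidence


# Toy connection records (assumed values, not drawn from any real data set).
records = [
    frozenset({"service=http", "flag=SF", "label=normal"}),
    frozenset({"service=http", "flag=SF", "label=normal"}),
    frozenset({"service=ftp", "flag=SF", "label=attack"}),
    frozenset({"service=http", "flag=REJ", "label=attack"}),
]

support, confidence = support_and_confidence(
    records, {"service=http"}, {"label=normal"}
)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

In this toy data the rule "service=http implies label=normal" holds in two of the four records, and in two of the three records that contain service=http. Each rule is scored on its own here, which is the contrast the text draws with labeling all features of an observation jointly.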
