adaptive intrusion detection using learning classifiers
Post on 16-Apr-2017
Adaptive Intrusion Detection Using Learning Classifiers
Patrick Nicolas
June 21, 2013
patricknicolas.blogspot.com
www.slideshare.net/pnicolas
github.com/prnicolas
Introduction
The objective of this presentation is to review the different methods for implementing an adaptive intrusion detection system (IDS). The second part dives into learning classifiers, a class of algorithms used to detect, evaluate and act upon a security breach or cyber attack.
Patrick Nicolas © 2013 http://patricknicolas.blogspot.com https://github.com/prnicolas
Data Mining Techniques
Learning Classifiers Systems
Context
The effectiveness of an intrusion detection system depends on its adaptability to:
● Ever-changing IT environment
● Evolving internal policies & regulations
● Agile organization & mobile workforce
Data Mining: Overview
Data mining is becoming a popular method to extract knowledge from historical data. However, traditional data mining techniques fail to capture the evolutionary nature of an organization, its processes, rules and IT infrastructure.
Data Mining: Clustering
Unsupervised learning methods such as clustering or spectral analysis have drawbacks:
● Poor classification of mixed variable types
● No descriptive representation
● Limited leverage of domain expertise
● High computational cost to update models
Data Mining: Supervised Learning
Supervised learning methods can be effective on a large set of historical data but have the following limitations:
● Need for a large training set to alleviate over-fitting
● No descriptive representation
● Limited role for the domain expert
Data Mining Techniques
Learning Classifiers Systems
An evolutionary approach
1. An intrusion detection solution should learn from its suggestions through a process borrowed from human behavior: reward-based learning
2. It should evolve with the system it monitors: Darwinian process
Rule-based Learners
A class of algorithms known as learning classifier systems (LCS), together with their extended variant (XCS), combines genetic algorithms and reinforcement learning to discover and evolve security policies and rules from real-time data.
LCS/XCS Benefits
● The rule-based representation allows security experts to monitor the evolving knowledge
● They learn from each security event, making them well suited for streamed data
● They support various seeding schemes, such as an initial rules set, a training set or clustering
Security rules
Security rules represent the knowledge of a security expert, for example:
IF num. outbound ftp sessions > 5 THEN cost + 2 (source: KDD Cup 1999 dataset)
Those rules are chained to support reasoning about a sequence of events in a data center.
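As an illustration only, a rule of the form IF op(x, value) THEN cost + n could be sketched in Python as below. The class name `Rule`, the operator table `OPS`, the helper `apply_rules` and the feature name `num_outbound_ftp_sessions` are all hypothetical; the slides do not prescribe a data structure.

```python
import operator
from dataclasses import dataclass
from typing import Iterable, Mapping

# Hypothetical operator table; the slides only show ">" as an example.
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


@dataclass(frozen=True)
class Rule:
    feature: str     # x: the monitored variable
    op: str          # key into OPS
    value: float     # threshold the feature is compared against
    cost_delta: int  # action: amount added to the running cost

    def matches(self, event: Mapping[str, float]) -> bool:
        """True when the event carries the feature and op(x, value) holds."""
        return self.feature in event and OPS[self.op](event[self.feature], self.value)


def apply_rules(rules: Iterable[Rule], event: Mapping[str, float], cost: int = 0) -> int:
    """Chain rules: every matching rule adds its cost_delta to the running cost."""
    for rule in rules:
        if rule.matches(event):
            cost += rule.cost_delta
    return cost


# IF num. outbound ftp sessions > 5 THEN cost + 2 (feature name is an assumption)
ftp_rule = Rule("num_outbound_ftp_sessions", ">", 5, 2)
```

Calling `apply_rules([ftp_rule], {"num_outbound_ftp_sessions": 7})` would add the rule's cost, while an event with 3 sessions would leave the cost unchanged.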
Rules Set Evolution
The rules set needs to adapt constantly to the ever-changing environment & objectives.
Rule Encoding
To evolve, rules are represented as genes in a genetic algorithm. A gene is implemented as a binary vector in which the condition (state) of the rule is expressed as op(x, value), e.g. x > value. A rule of the form IF op(x, value) THEN f(cost) is translated into a bit string with the following fields:
op   x        value       cost or action
010  1000101  0101101110  01101110100101010
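The encoding above can be sketched as a pair of encode/decode helpers. This is a minimal sketch: the field widths (3, 7, 10 and 17 bits) are simply read off the example bit string and are an assumption, as are the names `Gene`, `WIDTHS`, `encode` and `decode`.

```python
from typing import NamedTuple

# Assumed field widths, taken from the example bit string (3 + 7 + 10 + 17 bits).
WIDTHS = {"op": 3, "x": 7, "value": 10, "action": 17}


class Gene(NamedTuple):
    op: int      # operator code
    x: int       # identifier of the monitored variable
    value: int   # threshold
    action: int  # cost or action code


def encode(gene: Gene) -> str:
    """Concatenate the fields as fixed-width, zero-padded binary strings."""
    parts = []
    for name, width in WIDTHS.items():
        field = getattr(gene, name)
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name}={field} does not fit in {width} bits")
        parts.append(format(field, f"0{width}b"))
    return "".join(parts)


def decode(bits: str) -> Gene:
    """Split a bit string back into its integer fields."""
    if len(bits) != sum(WIDTHS.values()):
        raise ValueError("unexpected gene length")
    fields, pos = [], 0
    for width in WIDTHS.values():
        fields.append(int(bits[pos:pos + width], 2))
        pos += width
    return Gene(*fields)
```

Fixed-width fields keep every gene the same length, which is what allows cross-over points to be chosen freely between two genes.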
Rules Chains & Chromosomes
As with any rules-based inference engine, encoded rules can be chained by aggregating their binary representations:
IF op1(x1, v1) AND op2(x2, v2) THEN f(cost)
&&   op1  x1       v1        op2  x2      v2          cost or action
001  010  1000101  01011110  010  100101  0101101110  01101110100101010
In evolutionary-algorithm terms, the firing of multiple rules is represented as a sequence of genes, i.e. a chromosome.
Rules Evolutionary Process
The rules set evolves through the genetic recombination of rules using cross-over, mutation and transposition operations.
Cross-over operation
  Parent rules:
    0101101011101110101010111010100111
    1101010101110101001101010110101110
  Offspring rules:
    0101101011101110101010111010100111
    11010101011101101001110101101011101
Mutation operation
    0101101011101110101010111010100111
    0101101011101110101010101010100011
Transposition operation
    0101101011101110101010111010100111
    0101101011101110101010101010100011
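The three operators can be sketched on bit strings as follows. This is one common formulation (single-point cross-over, independent bit-flip mutation, cut-and-reinsert transposition) and is an assumption; the slides do not say which variants are used, and the function names are hypothetical.

```python
import random
from typing import Tuple


def crossover(a: str, b: str, point: int) -> Tuple[str, str]:
    """Single-point cross-over: swap the tails of two equal-length parents."""
    if len(a) != len(b):
        raise ValueError("parents must have the same length")
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in bits
    )


def transpose(bits: str, start: int, length: int, target: int) -> str:
    """Cut bits[start:start+length] and reinsert it at index `target` of the remainder."""
    segment = bits[start:start + length]
    rest = bits[:start] + bits[start + length:]
    return rest[:target] + segment + rest[target:]
```

All three operators preserve the chromosome length, so offspring can still be decoded with the same field layout as their parents.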
Rules Fitness
Rules are selected according to their fitness before being ‘mated’ and mutated. The fitness of a rule represents its contribution to the detection or prevention of an intrusion.
The rules that are repeatedly invoked have the highest fitness values and thrive over time. Other rules slowly become irrelevant.
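Fitness-based selection is often implemented as roulette-wheel (fitness-proportionate) sampling. The sketch below assumes that scheme, which the slides do not name explicitly; `select_parents` is a hypothetical helper built on the standard library's `random.Random.choices`.

```python
import random
from typing import List, Sequence


def select_parents(
    rules: Sequence[str],
    fitness: Sequence[float],
    k: int,
    rng: random.Random,
) -> List[str]:
    """Draw k rules (with replacement) with probability proportional to fitness."""
    if len(rules) != len(fitness):
        raise ValueError("one fitness value is required per rule")
    if any(f < 0 for f in fitness) or sum(fitness) <= 0:
        raise ValueError("fitness values must be non-negative and not all zero")
    return rng.choices(rules, weights=fitness, k=k)
```

Rules with zero fitness are never drawn, which mirrors the idea that rarely invoked rules slowly drop out of the population.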
Overview Genetic Algorithm
Initial rules set → Encoding → Initial chromosomes → Selection → Cross-over → Mutation → New chromosomes → Decoding → New rules set
Fitness (input to Selection)
The rules set is constantly updated by the genetic algorithm so that it keeps identifying intrusions correctly.
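One pass through the selection, cross-over and mutation stages of this pipeline might look like the sketch below. It works directly on already-encoded chromosomes (encoding and decoding are left out), and `next_generation` and `toy_fitness` are hypothetical names; `toy_fitness` is only a stand-in, since a real fitness would come from detection feedback.

```python
import random
from typing import Callable, List


def next_generation(
    population: List[str],
    fitness_fn: Callable[[str], float],
    rng: random.Random,
    mutation_rate: float = 0.01,
) -> List[str]:
    """Selection -> cross-over -> mutation over equal-length bit strings (length >= 2)."""
    weights = [fitness_fn(c) for c in population]  # non-negative, not all zero
    offspring: List[str] = []
    while len(offspring) < len(population):
        mother, father = rng.choices(population, weights=weights, k=2)
        point = rng.randrange(1, len(mother))  # single cross-over point
        child = mother[:point] + father[point:]
        child = "".join(
            ("1" if b == "0" else "0") if rng.random() < mutation_rate else b
            for b in child
        )
        offspring.append(child)
    return offspring


def toy_fitness(chromosome: str) -> float:
    """Placeholder fitness (count of 1 bits, plus 1 so weights stay positive)."""
    return 1.0 + chromosome.count("1")
```

Repeated calls to `next_generation` keep the population size constant while gradually favoring chromosomes with a higher fitness.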
Rule Fitness & Reward
The fitness criteria of one or more rules have to be updated according to the state of the infrastructure, organization & policies. The fitness function is updated to provide the best possible reward (or credit) to the rules that contribute to the detection of an intrusion.
Reinforcement Learning
Reinforcement learning techniques are widely used in robotics. In the context of IDS, they reward (or punish) rules for their contribution (or lack thereof) to identifying threats, taking into account changes in the organization, external accesses and IT infrastructure.
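A simple way to reward or punish a rule is to move its fitness a fraction of the way toward the latest reward. The update below is an exponential moving average; the formula, the function name `update_fitness` and the default `learning_rate` are assumptions, since the slides do not give a specific rule.

```python
def update_fitness(fitness: float, reward: float, learning_rate: float = 0.2) -> float:
    """Move the fitness toward the observed reward by a fraction `learning_rate`."""
    if not 0.0 < learning_rate <= 1.0:
        raise ValueError("learning_rate must be in (0, 1]")
    return fitness + learning_rate * (reward - fitness)
```

With this update, a rule that keeps earning the same reward converges toward that value, while a rule that stops contributing drifts back down, so older feedback fades as the environment changes.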
Evolutionary Security Rules
1. Process new data/event from the system
2. Find the security-related rule(s) whose condition matches the event
3. Create a new rule if none match (covering)
4. Fire the fittest rules with the highest predicted outcome
[Figure: IDS workflow. Components: Data Center/Cloud, Real-time data, Matching, New rule, Threat predictor, Threat level, Threats monitor, State, Reward, Update Fitness, Genetic Algorithm, Evolution, Rules. Arrows are numbered 1 to 7, matching the numbered steps.]
5. Process the new state of the system
6. Reward contributing/matching rules by updating the rule fitness
7. The genetic algorithm updates the existing population of security rules through reproduction and mutation of rules
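Steps 1 to 6 can be sketched as a single function that processes one event. This is a simplified illustration under several assumptions: rules are single threshold tests, covering uses an arbitrary heuristic, the reward is 1 for a correct threat level and 0 otherwise, and step 7 (the genetic algorithm) is left out. The names `ThreatRule`, `cover` and `step` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ThreatRule:
    feature: str
    threshold: float
    threat_level: int      # predicted outcome when the rule fires
    fitness: float = 0.5

    def matches(self, event: Dict[str, float]) -> bool:
        return event.get(self.feature, float("-inf")) > self.threshold


def cover(event: Dict[str, float], rng: random.Random) -> ThreatRule:
    """Step 3: create a rule guaranteed to match the event (hypothetical heuristic)."""
    feature = rng.choice(sorted(event))
    return ThreatRule(feature, event[feature] - 1.0, threat_level=1)


def step(
    rules: List[ThreatRule],
    event: Dict[str, float],
    observed_threat: int,
    rng: random.Random,
    learning_rate: float = 0.2,
) -> int:
    """Process one non-empty event and return the threat level of the fired rule."""
    if not event:
        raise ValueError("event must contain at least one feature")
    matching = [r for r in rules if r.matches(event)]       # steps 1-2
    if not matching:                                         # step 3: covering
        new_rule = cover(event, rng)
        rules.append(new_rule)
        matching = [new_rule]
    fired = max(matching, key=lambda r: r.fitness)           # step 4: fittest rule
    for rule in matching:                                    # steps 5-6: reward
        reward = 1.0 if rule.threat_level == observed_threat else 0.0
        rule.fitness += learning_rate * (reward - rule.fitness)
    return fired.threat_level
```

In a fuller version, a genetic algorithm would periodically run over `rules` (step 7), using the fitness values maintained here to choose which rules reproduce.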
Conclusion
By combining evolutionary algorithms with reinforcement learning, rule-based learners such as learning classifier systems allow security policies and constraints to adapt to changes in the environment or data center, and therefore to stay a step ahead of ever-changing threats.