Name: TAPASI PATI, Roll No: 0401101238 (03/22/22). FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Upload: gilbert-lee

Post on 02-Jan-2016


Name: TAPASI PATI

Roll No: 0401101238

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Contents

Introduction

Cyber Attacks - Intrusions

Intrusion Detection

Firewall

Why We Need Intrusion Detection

System Goals and Preliminary Architecture

Models of Intrusion Detection: Anomaly Detection, Misuse Detection

How Genetic Algorithm Is Used in IDS

Conclusion

References

Introduction

The widespread use of computer networks in today's society, especially the sudden surge in importance of e-commerce to the world economy, has made computer network security an international priority. Since it is not technically feasible to build a system with no vulnerabilities, intrusion detection has become an important area of research.

An intelligent intrusion detection system (IIDS) has been developed to demonstrate the effectiveness of data mining techniques that utilize fuzzy logic and genetic algorithms.

Cyber Attack-Intrusion

Cyber attacks (intrusions) are actions that attempt to bypass the security mechanisms of computer systems. They are caused by:

Attackers accessing the system from the Internet

Insider attackers: authorized users attempting to gain and misuse non-authorized privileges

Typical intrusion scenario

Intrusion Detection

Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusions, defined as attempts to bypass the security mechanisms of a computer or network ("compromise the confidentiality, integrity, availability of information resources").

Intrusion Detection System (IDS): a combination of software and hardware that attempts to perform intrusion detection and raises the alarm when a possible intrusion happens.

Need of Intrusion Detection

Security mechanisms always have inevitable vulnerabilities.

Current firewalls are not sufficient to ensure security in computer networks.

"Security holes" are caused by allowances made to users, programmers, and administrators.

Insider attacks

Multiple levels of data confidentiality in commercial and government organizations need multi-layer protection in firewalls.

System Goals

The long-term goal is to design and build an intelligent intrusion detection system that is:

Distributed

Real-time

Accurate (low false negative and false positive rates)

Flexible

Adaptive in new environments

Modular, with both misuse and anomaly detection components

Not easily fooled by small variations in intrusion patterns

Architecture

Data Mining for Intrusion Detection

Misuse detection

Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive").

These models can be more sophisticated and precise than manually created signatures.

They are unable to detect attacks whose instances have not yet been observed.

Anomaly detection

Build models of "normal" behavior and detect anomalies as deviations from it.

Possible high false alarm rate: previously unseen (yet legitimate) system behaviors may be recognized as anomalies.

Anomaly Detection via Fuzzy Data Mining

The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection.

Fuzzy logic

Partial membership in sets or categories.

Allows one to represent concepts that could be considered to be in more than one category (or, from another point of view, it allows representation of overlapping categories).

Data Mining

Automatically learn patterns from large quantities of data.

Fuzzy Logic Method

ID using Fuzzy Logic

Suppose one wants to write a rule such as:

If the number of different destination addresses during the last 2 seconds was high

Then an unusual situation exists

Using fuzzy logic, a rule like the one shown above could be written as:

If DP = high

Then an unusual situation exists

where DP is a fuzzy variable and high is a fuzzy set.

The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated.
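A minimal Python sketch of how such a rule could be evaluated. The ramp shape of the fuzzy set high, its breakpoints (10 and 30), and the activation threshold of 0.5 are illustrative assumptions, not values from the presentation; the function names are hypothetical.

```python
def membership_high(x, lower=10.0, upper=30.0):
    """Degree of membership of x in the fuzzy set 'high'.

    Assumed shape: 0 up to `lower`, rising linearly to 1 at `upper`.
    """
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    return (x - lower) / (upper - lower)


def unusual_situation(dp, threshold=0.5):
    """Evaluate 'If DP = high Then an unusual situation exists'.

    Returns the membership degree and whether it reaches the
    (hypothetical) activation threshold.
    """
    degree = membership_high(dp)
    return degree, degree >= threshold


for dp in (5, 15, 20, 40):
    degree, fired = unusual_situation(dp)
    print(f"DP={dp:>2}  membership in high={degree:.2f}  rule fired={fired}")
```

Unlike a crisp cutoff, the membership degree changes gradually, so a count of 19 and a count of 21 are treated almost alike.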

ID using Data Mining

Two data mining methods, along with their fuzzy counterparts, have been used to mine audit data to find normal patterns for anomaly intrusion detection:

Association rules

Frequency episodes

Fuzzy association rules

Fuzzy frequency episodes

Association Rules

Association rules were developed to find correlations in transactions using retail data.

For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.

The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
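The support and confidence figures in the soft drink example can be reproduced with a short Python sketch. The eight transactions are made-up data chosen to match the 25% and 50% figures, and the function name is hypothetical; this computes the two measures only and is not an implementation of Apriori.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(transactions)
    if n_total == 0:
        return 0.0, 0.0
    # Transactions containing the antecedent, and those containing both sides.
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n_total
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Made-up retail data: 4 of 8 baskets hold a soft drink, 2 of those hold chips too.
transactions = [
    {"soft drink", "potato chips"},
    {"soft drink", "potato chips"},
    {"soft drink"},
    {"soft drink", "bread"},
    {"bread"},
    {"milk"},
    {"potato chips", "milk"},
    {"bread", "milk"},
]

s, c = support_and_confidence(transactions, {"soft drink"}, {"potato chips"})
print(f"s = {s:.2f}, c = {c:.2f}")
```

A rule is kept only if s is at least minsupport and c is at least minconfidence.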

Fuzzy Association Rules

Mining association rules over quantitative attributes usually means partitioning each attribute into discrete intervals. This gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.

The fuzzy association rule method of Kuok, Fu, and Wong (1998) allows a value to contribute to the support of more than one fuzzy set.

For anomaly detection, we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data, and the similarity of this new rule set and the reference set is computed.

An example of a fuzzy association rule from one set of audit data is:

{SN=LOW, FN=LOW} → {RN=LOW}, c = 0.924, s = 0.49

where SN is the number of SYN flags, FN is the number of FIN flags, and RN is the number of RST flags in a 2 second period.
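A minimal Python sketch of how fuzzy support and confidence for a rule of this form might be computed. The membership shapes (overlapping ramps between 5 and 15), the use of min to combine membership degrees, and the four sample records are all assumptions made for illustration; the presentation does not give these details.

```python
def ramp(x, lo, hi):
    """Linear ramp from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def low(x):
    """Assumed membership in LOW: 1 up to 5, falling to 0 at 15."""
    return 1.0 - ramp(x, 5.0, 15.0)


def high(x):
    """Assumed membership in HIGH: overlaps LOW between 5 and 15."""
    return ramp(x, 5.0, 15.0)


def fuzzy_support(records, items):
    """Average degree to which each record matches all (attribute, set) items."""
    if not records:
        return 0.0
    total = sum(min(fn(rec[attr]) for attr, fn in items) for rec in records)
    return total / len(records)


def fuzzy_rule_stats(records, antecedent, consequent):
    """Fuzzy support and confidence of antecedent -> consequent."""
    s_rule = fuzzy_support(records, antecedent + consequent)
    s_antecedent = fuzzy_support(records, antecedent)
    confidence = s_rule / s_antecedent if s_antecedent > 0 else 0.0
    return s_rule, confidence


# Made-up flag counts per 2 second period.
records = [
    {"SN": 2, "FN": 3, "RN": 1},
    {"SN": 10, "FN": 4, "RN": 0},
    {"SN": 20, "FN": 18, "RN": 16},
    {"SN": 3, "FN": 2, "RN": 10},
]

s, c = fuzzy_rule_stats(records, [("SN", low), ("FN", low)], [("RN", low)])
print(f"s = {s:.3f}, c = {c:.3f}")
print(f"SN=10 belongs to LOW with {low(10):.1f} and to HIGH with {high(10):.1f}")
```

The last line shows the point of the fuzzy formulation: a borderline value contributes partially to two sets instead of falling entirely on one side of a sharp boundary.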

The figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions.

Figure: Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence = 0.6, minsupport = 0.1). Training Data Set: reference (representing normal behavior). Test Data Sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions), and network3 (including simulated port scanning intrusions).

Frequency Episodes

The algorithm discovers simple serial frequency episodes from event sequences based on minimal occurrences. It is later used to mine fuzzy frequency episodes.

An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval.

Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.

So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, …, ek) is called frequent if frequency(P)/n ≥ minfrequency.
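The definitions above can be illustrated with a small Python sketch that counts minimal occurrences of a serial episode. The function names and the sample events are hypothetical, timestamps are assumed to be strictly increasing, and n is taken to be the number of events in the sequence, which is an assumption since the slides do not define it.

```python
def minimal_occurrences(events, episode, window):
    """Return (start_time, end_time) pairs for minimal occurrences.

    events  : list of (timestamp, event_type), strictly increasing timestamps
    episode : non-empty sequence of event types that must occur in order
    window  : only occurrences spanning less than this duration are kept
    """
    occurrences = []
    previous_start = None
    for end, (_, event_type) in enumerate(events):
        if event_type != episode[-1]:
            continue
        # Match the remaining episode events backwards, always taking the
        # latest possible match; this gives the latest start for this end.
        idx = end
        need = len(episode) - 2
        while need >= 0 and idx > 0:
            idx -= 1
            if events[idx][1] == episode[need]:
                need -= 1
        if need >= 0:
            continue  # the episode does not occur ending at this event
        if idx == previous_start:
            continue  # an occurrence with the same start ends earlier
        previous_start = idx
        start_time, end_time = events[idx][0], events[end][0]
        if end_time - start_time < window:
            occurrences.append((start_time, end_time))
    return occurrences


def is_frequent(events, episode, window, minfrequency):
    """frequency(P)/n >= minfrequency, with n assumed to be len(events)."""
    if not events:
        return False
    frequency = len(minimal_occurrences(events, episode, window))
    return frequency / len(events) >= minfrequency


events = [(0, "A"), (1, "B"), (2, "A"), (3, "B"), (9, "B")]
print(minimal_occurrences(events, ("A", "B"), window=5))
print(is_frequent(events, ("A", "B"), window=5, minfrequency=0.3))
```

The occurrence ending at time 9 is not reported because the interval from 2 to 9 contains the shorter occurrence from 2 to 3, so it is not minimal.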

Fuzzy Frequency Episodes

Fuzzy frequency episodes involve quantitative attributes in an event. They result from integrating fuzzy logic with frequency episodes.

An example of a fuzzy frequency episode is given below:

{E1: PN=LOW, E2: PN=MEDIUM} → {E3: PN=MEDIUM}, c = 0.854, s = 0.108, w = 10 seconds

where E1, E2, and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2 second period.

The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate.

Misuse Detection

The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.

A simple example of a rule from the misuse detection component is:

IF the number of consecutive logins by a user is greater than 3

THEN the behavior is suspicious

Information from a number of misuse detection components will be combined by the decision component to determine whether an alarm should result.
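The consecutive-login rule can be sketched in Python as a stand-in for the rule logic; this is not FuzzyCLIPS syntax. The function name is hypothetical, and treating "consecutive" as an unbroken run of logins by the same user in an ordered event stream is an assumption.

```python
def suspicious_users(login_events, limit=3):
    """Flag users with more than `limit` consecutive logins.

    login_events: user names in the order the logins were observed.
    A login by a different user resets the run (assumed interpretation).
    """
    flagged = set()
    current_user, run = None, 0
    for user in login_events:
        if user == current_user:
            run += 1
        else:
            current_user, run = user, 1
        if run > limit:
            flagged.add(user)
    return flagged


stream = ["alice", "bob", "bob", "bob", "bob", "alice"]
print(suspicious_users(stream))
```

In the system described here, the output of such a component would be one input to the decision component rather than an alarm by itself.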

Genetic Algorithms

Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic, it is often difficult for an expert to provide "good" definitions for the membership functions for the fuzzy variables.

Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).

The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set.

The genetic algorithm works by slowly "evolving" a population of chromosomes that represent better and better solutions to the problem.

Figure: The evolution process of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual, and the average fitness of the whole population.

Figure 7: The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.

Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). This graph also demonstrates that the quality of the solution increases as the evolution process proceeds.
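The tuning loop described in this section can be sketched in Python as follows. Everything specific here is an assumption for illustration: the function names, the population size, the selection, crossover, and mutation scheme, and above all toy_fitness, which is only a placeholder for the real fitness (similarity of rules from normal data to the reference rule set minus similarity of rules from intrusion data to it).

```python
import random


def evolve(fitness, n_params, pop_size=20, generations=30,
           mutation_rate=0.2, mutation_scale=0.1, seed=0):
    """Evolve chromosomes (lists of membership-function parameters).

    Returns the best chromosome and the best fitness seen per generation.
    """
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_params)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]      # keep the fitter half
        next_population = [ranked[0][:]]       # elitism: carry over the best
        while len(next_population) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params) if n_params > 1 else 0
            child = mother[:cut] + father[cut:]  # one-point crossover
            child = [g + rng.gauss(0.0, mutation_scale)
                     if rng.random() < mutation_rate else g
                     for g in child]
            next_population.append(child)
        population = next_population
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(chromosome):
    """Placeholder fitness: closeness to an arbitrary target vector.

    Stands in for: similarity(normal, reference) - similarity(intrusion, reference).
    """
    target = [0.2, 0.4, 0.6, 0.8]  # two parameters for each of two functions
    return -sum((g - t) ** 2 for g, t in zip(chromosome, target))


best, history = evolve(toy_fitness, n_params=4)
print("best parameters:", [round(g, 3) for g in best])
print("best fitness, first and last generation:", history[0], history[-1])
```

Because the best chromosome is always carried into the next generation, the best fitness never decreases, which mirrors the rising curve for the most fit individual described above.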

Conclusion

Integrating data mining techniques with fuzzy logic provides new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by the system and to select the most effective set of features for particular types of intrusions.

Current work covers the misuse detection components, the decision module, additional machine learning components, and a graphical user interface for the system. There are plans to extend this system to operate in a high performance cluster computing environment.

References

Ilgun, K., and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.

Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge Systems Laboratory, National Research Council Canada.

Kuok, C., A. Fu, and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from httpwwwacmorgsigssigmodrecord issues9803 on 1 March 1999.)

Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel, and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)

Queries



FirewallContentsContentsContentsThe wide spread use of computer networks in todayrsquos society especially the sudden surge in importance of e-commerce to the world economy has made computer network security an international priority Since it is not technically feasible to build a system with no vulnerabilities intrusion detection has become an important area of research

Intelligent intrusion detection system (IIDS) has been developed to demonstrate the effectiveness of data mining techniques that utilize fuzzy logic and genetic algorithms

042023 3

Introduction

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

042023 4

Cyber Attack-Intrusion

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Cyber attacks (intrusions) are actions that attempt to bypass security mechanisms of computer systemsThey are caused by

Attackers accessing the system from InternetInsider attackers - authorized users attempting to gain and misuse

non-authorized privileges1048714 Typical intrusion scenario

042023 5042023 5

Cyber Attack-Intrusion

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Intrusion Detection Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusionsdefined as attempts to bypass the security mechanisms of a computer or network (ldquocompromise the confidentiality integrity availability of information resourcesrdquo)

Intrusion Detection System (IDS)combination of software and hardware that attempts to perform intrusion

detectionraise the alarm when possible intrusion happens

Security mechanisms always have inevitable vulnerabilities

042023 7

Need of Intrusion Detection

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Current firewalls are not sufficient to ensure security in computer networks

ldquoSecurity holesrdquo caused by allowances made to usersprogrammersadministrators

1048714 Insider attacks Multiple levels of data confidentiality in commercial and

government organizations needs multi-layer protection in firewalls

042023 8042023 8

The long term goal to design and build an intelligent intrusion detection system that are

Distributed

Real-time

Accurate (low false negative and false positive rates)

Flexible

Adaptive in new environments

Modular with both misuse and anomaly detection components

042023 8

System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Not easily fooled by small variations in intrusion patterns

042023 9

Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

042023 10042023 10

Data Mining

for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS

APPLIED TO INTRUSION DETECTION

Misuse detection

Anomaly detection

These models can be more sophisticated and precise than manually created signatures

Unable to detect attacks whose instances have not yet been observed

Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)

Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it

Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies

Anomaly Detection via Fuzzy Data Mining

The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection

Fuzzy logic

Allows one to represent concepts that could be considered to be in more than one category (or, from another point of view, it allows representation of overlapping categories)

Partial membership in sets or categories

Data mining

Automatically learn patterns from large quantities of data

Fuzzy Logic Method

ID using Fuzzy Logic

Suppose one wants to write a rule such as

IF the number of different destination addresses during the last 2 seconds was high THEN an unusual situation exists

Using fuzzy logic, a rule like the one shown above could be written as

IF DP = high THEN an unusual situation exists

DP is a fuzzy variable and high is a fuzzy set

The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated
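As a rough illustration of how a degree of membership can drive the rule above, the sketch below assumes a simple ramp-shaped membership function for the fuzzy set high. The breakpoints (10 and 30) and the function names are hypothetical and are not taken from the system described in these slides.

```python
def membership_high(x, low=10.0, high=30.0):
    """Degree (0..1) to which the count x belongs to the fuzzy set 'high'.

    Assumed shape: 0 at or below `low`, 1 at or above `high`,
    linear in between.
    """
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def unusual_situation(dp_count):
    """Activation degree of: IF DP = high THEN an unusual situation exists."""
    return membership_high(dp_count)


for count in (5, 20, 40):
    print(count, unusual_situation(count))
```

Unlike a crisp threshold, a count in the middle of the ramp activates the rule only partially, so a small change in the count produces a small change in the rule's activation.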

ID using Data Mining

Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection, each with a fuzzy extension

Association Rules

Frequency Episodes

Fuzzy Association Rules

Fuzzy Frequency Episodes

Association Rules

Association rules were developed to find correlations in transactions using retail data

For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.

The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support)
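The support and confidence figures in the soft drink example can be checked with a few lines of code. This is a minimal sketch: the basket data is made up to match the 25% and 50% figures, the function name is hypothetical, and the threshold values are only illustrative. It computes the two measures for a single rule and does not implement the Apriori search itself.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy data: 8 customers, 4 buy soft drinks, 2 of those also buy chips.
baskets = [
    {"soft drink", "chips"},
    {"soft drink", "chips"},
    {"soft drink"},
    {"soft drink", "bread"},
    {"bread"},
    {"milk"},
    {"chips"},
    {"milk", "bread"},
]

s, c = support_and_confidence(baskets, {"soft drink"}, {"chips"})
minsupport, minconfidence = 0.1, 0.4  # illustrative thresholds
print(s, c, s >= minsupport and c >= minconfidence)
```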


Fuzzy Association Rules

Partitioning quantitative attributes into crisp intervals gives rise to the “sharp boundary problem”, in which a very small change in value causes an abrupt change in category

The fuzzy association rule method of Kuok, Fu, and Wong (1998) allows a value to contribute to the support of more than one fuzzy set

For anomaly detection, we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data, and the similarity of this new rule set to the reference set is computed.

An example of a fuzzy association rule from one set of audit data is

{SN=LOW, FN=LOW} → {RN=LOW}, c = 0.924, s = 0.49

where SN is the number of SYN flags, FN is the number of FIN flags, and RN is the number of RST flags in a 2-second period
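To make the idea of fuzzy support concrete, the sketch below computes support and confidence for a rule of the form {SN=LOW, FN=LOW} → {RN=LOW}. It rests on several assumptions not stated in the slides: the shape of the LOW membership function, the use of the product to combine membership degrees, and the sample flag counts are all hypothetical.

```python
def low(x, full=2.0, zero=6.0):
    """Assumed membership in LOW: 1 up to `full`, falling linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def fuzzy_rule_stats(records, antecedent, consequent):
    """Fuzzy support and confidence of (antecedent all LOW) -> (consequent all LOW).

    Each record contributes the product of its membership degrees
    (an assumed combination rule) instead of a crisp 0 or 1.
    """
    sum_antecedent = 0.0
    sum_both = 0.0
    for rec in records:
        a = 1.0
        for name in antecedent:
            a *= low(rec[name])
        b = a
        for name in consequent:
            b *= low(rec[name])
        sum_antecedent += a
        sum_both += b
    support = sum_both / len(records) if records else 0.0
    confidence = sum_both / sum_antecedent if sum_antecedent else 0.0
    return support, confidence


# Hypothetical counts of SYN, FIN and RST flags per 2-second window.
windows = [
    {"SN": 1, "FN": 1, "RN": 0},
    {"SN": 2, "FN": 4, "RN": 1},
    {"SN": 8, "FN": 7, "RN": 9},
    {"SN": 4, "FN": 2, "RN": 4},
]
s, c = fuzzy_rule_stats(windows, ["SN", "FN"], ["RN"])
print(f"support={s:.4f} confidence={c:.4f}")
```

A window with a borderline count still adds a fractional amount to the rule's support, which is how the fuzzy version avoids an abrupt jump at an interval boundary.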

The figure shows results from one experiment comparing the similarity to the reference rule set of rules mined from data without intrusions and from data with intrusions

Figure: Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence = 0.6, minsupport = 0.1). Training Data Set: reference (representing normal behavior). Test Data Sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions), and network3 (including simulated port scanning intrusions).

Frequency Episodes

This algorithm discovers simple serial frequency episodes from event sequences based on minimal occurrences. It is later used to mine fuzzy frequency episodes.

An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t’]. The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval.

Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window

So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, …, ek) is called frequent

if frequency(P)/n ≥ minfrequency
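The definitions above can be sketched in code as follows. This is only an illustration under stated assumptions: events are reduced to a single type label, timestamps are strictly increasing, and n is taken to be the number of events in S, which the slides do not define. The function names and sample data are hypothetical.

```python
def minimal_occurrences(events, episode, window):
    """Return (start_time, end_time) of minimal occurrences of a serial episode.

    events  : list of (timestamp, event_type), strictly increasing timestamps
    episode : sequence of event types that must occur in this order
    window  : only occurrences spanning less than `window` time units are kept
    """
    def earliest_end(start):
        # Greedily match the episode from `start`; return index of its last event.
        pos = start
        for wanted in episode[1:]:
            pos += 1
            while pos < len(events) and events[pos][1] != wanted:
                pos += 1
            if pos >= len(events):
                return None
        return pos

    starts = [i for i, (_, kind) in enumerate(events) if kind == episode[0]]
    ends = [earliest_end(i) for i in starts]
    found = []
    for idx, (i, end) in enumerate(zip(starts, ends)):
        if end is None:
            continue
        # Not minimal if a later start finishes at the same event.
        if idx + 1 < len(starts) and ends[idx + 1] == end:
            continue
        t_start, t_end = events[i][0], events[end][0]
        if t_end - t_start < window:
            found.append((t_start, t_end))
    return found


def is_frequent(events, episode, window, minfrequency):
    """frequency(P)/n >= minfrequency, with n assumed to be len(events)."""
    frequency = len(minimal_occurrences(events, episode, window))
    return frequency / len(events) >= minfrequency


stream = [(1, "A"), (2, "A"), (3, "B"), (6, "A"), (7, "C"), (9, "B")]
print(minimal_occurrences(stream, ("A", "B"), window=5))
print(is_frequent(stream, ("A", "B"), window=5, minfrequency=0.3))
```

In the sample stream, the interval from time 1 to time 3 also contains the episode, but it is not minimal because the shorter interval from time 2 to time 3 contains it too.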


Fuzzy Frequency Episodes

Fuzzy frequency episodes integrate fuzzy logic with frequency episodes and involve quantitative attributes in an event

An example of a fuzzy frequency episode is given below

{E1: PN=LOW, E2: PN=MEDIUM} → {E3: PN=MEDIUM}, c = 0.854, s = 0.108, w = 10 seconds

where E1, E2, and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2-second period

The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate

Misuse Detection

The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.

A simple example of a rule from the misuse detection component is

IF the number of consecutive logins by a user is greater than 3 THEN the behavior is suspicious

Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should result
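The example misuse rule can be mimicked in a few lines of Python. This is a stand-in for illustration only and is not FuzzyCLIPS syntax. The event format, the function name, and the reading of "consecutive" (a user's run of logins is reset by any other action from that user) are assumptions.

```python
from collections import defaultdict


def suspicious_users(events, threshold=3):
    """Apply: IF consecutive logins by a user > threshold THEN suspicious.

    events: iterable of (user, action) pairs in time order.
    """
    run = defaultdict(int)   # current run of consecutive logins per user
    flagged = set()
    for user, action in events:
        if action == "login":
            run[user] += 1
            if run[user] > threshold:
                flagged.add(user)
        else:
            run[user] = 0    # any other action breaks the run
    return flagged


audit_log = [
    ("alice", "login"), ("bob", "login"), ("alice", "logout"),
    ("bob", "login"), ("bob", "login"), ("bob", "login"),
    ("alice", "login"),
]
print(suspicious_users(audit_log))
```

A decision component could then combine the flags produced by several such rules before raising an alarm.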

Genetic Algorithms

Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic, it is often difficult for an expert to provide “good” definitions for the membership functions for the fuzzy variables.

Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).

The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set

The genetic algorithm works by slowly “evolving” a population of chromosomes that represent better and better solutions to the problem
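A generic genetic algorithm over such parameter chromosomes might look like the sketch below. It is not the implementation used in the system: the selection scheme (truncation with elitism), one-point crossover, Gaussian mutation, and all numeric settings are assumptions. The fitness function here is a placeholder that peaks at a hypothetical "ideal" parameter vector; in the approach described above it would instead reward similarity of normal data to the reference rule set and penalize similarity of intrusion data to it.

```python
import random


def evolve(fitness, n_params, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Minimal genetic algorithm over real-valued chromosomes in [0, 1].

    Each chromosome is a flat list of membership-function parameters
    (two per membership function). `fitness` maps a chromosome to a
    score to maximise.
    """
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_params)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]       # truncation selection
        children = [ranked[0][:]]               # elitism: keep the best unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)    # one-point crossover
            child = mother[:cut] + father[cut:]
            for i in range(n_params):           # per-gene Gaussian mutation
                if rng.random() < mutation_rate:
                    child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


def toy_fitness(chromosome):
    # Placeholder score: negative squared distance to a hypothetical ideal.
    ideal = [0.2, 0.4, 0.5, 0.8]
    return -sum((g - t) ** 2 for g, t in zip(chromosome, ideal))


best = evolve(toy_fitness, n_params=4)
print([round(g, 2) for g in best], round(toy_fitness(best), 4))
```

Because the best chromosome is always carried over unchanged, the best fitness in the population never decreases from one generation to the next.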

Figure: The evolution process of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual, and the average fitness of the whole population

Figure 7: The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set

Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the “normal” rules and similarity of the mined rules to the “abnormal” rules). This graph also demonstrates that the quality of the solution increases as the evolution process proceeds.

Conclusion

Integrating data mining techniques with fuzzy logic provides new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by the system and to select the most effective set of features for particular types of intrusions.

Current work covers the misuse detection components, the decision module, additional machine learning components, and a graphical user interface for the system. There are also plans to extend this system to operate in a high-performance cluster computing environment.

References

Ilgun, K., and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.

Orchard, R. 1995. FuzzyCLIPS version 6.04 user’s guide. Knowledge Systems Laboratory, National Research Council Canada.

Kuok, C., A. Fu, and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999)

Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel, and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)


Queries

042023 5042023 5

Cyber Attack-Intrusion

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Intrusion Detection Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusionsdefined as attempts to bypass the security mechanisms of a computer or network (ldquocompromise the confidentiality integrity availability of information resourcesrdquo)

Intrusion Detection System (IDS)combination of software and hardware that attempts to perform intrusion

detectionraise the alarm when possible intrusion happens

Security mechanisms always have inevitable vulnerabilities

042023 7

Need of Intrusion Detection

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Current firewalls are not sufficient to ensure security in computer networks

ldquoSecurity holesrdquo caused by allowances made to usersprogrammersadministrators

1048714 Insider attacks Multiple levels of data confidentiality in commercial and

government organizations needs multi-layer protection in firewalls

042023 8042023 8

The long term goal to design and build an intelligent intrusion detection system that are

Distributed

Real-time

Accurate (low false negative and false positive rates)

Flexible

Adaptive in new environments

Modular with both misuse and anomaly detection components

042023 8

System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Not easily fooled by small variations in intrusion patterns

042023 9

Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

042023 10042023 10

Data Mining

for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS

APPLIED TO INTRUSION DETECTION

Misuse detection

Anomaly detection

These models can be more sophisticated and precise than manually created signatures

Unable to detect attacks whose instances have not yet been observed

Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)

Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it

Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies

One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)

Partial membership in sets or categories

042023 11

Anomaly Detection via Fuzzy Data Mining

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Automatically learn patterns from large quantities of data

The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection

Fuzzy logic

Data Mining

Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Fuzzy Logic

ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Suppose one wants to write a rule such as

If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists

Using fuzzy logic a rule like the one shown above could be written as

If the DP = highThen an unusual situation exists

DP is a fuzzy variable and high is a fuzzy set

The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated

ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining

Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection

Association Rules

Frequency episodes

Fuzzy Association Rules

Fuzzy Frequency episodes

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Association Rules

Association rules are developed to find correlations in transactions using retail data

For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050

The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Fuzzy Association Rules

This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category

Their method allows a value to contribute to the support of more than one fuzzy set

For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed

An example of a fuzzy association rule from one set of audit data is

SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049

where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions

Fuzzy Association Rules

Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Frequency Episodes

This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes

An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval

Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window

So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent

if frequency(P)n geminfrequency

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Fuzzy Frequency Episodes

The fuzzy frequency episodes involves quantitative attributes in an event

An example of a fuzzy frequency episode given below

E1 PN=LOW E2 PN=MEDIUM rarr E3 PN=MEDIUM c = 0854 s = 0108 w = 10 seconds

where E1 E2 and E3 are events that occur in that order PN is the number of distinct destination ports within a 2

second period

The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate

This is Integration of fuzzy logic with frequency episodes

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

A simple example of a rule from the misuse detection component is

IF the number of consecutive logins by a user is greater than 3THEN the behavior is suspicious

Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should be result

The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules

Misuse Detection FUZZY DATA MINING AND GENETIC ALGORITHMS

APPLIED TO INTRUSION DETECTION

Each fuzzy membership function can be defined using two parameters as shown in Figure 3 Each chromosome for the GA consists of a sequence of these parameters (two per membership function) An initial population of chromosomes is generated randomly where each chromosome represents a possible solution to the problem (an set of parameters)

The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set

The genetic algorithm works by slowly ldquoevolvingrdquo a population of chromosomes that represent better and better solutions to the problem

Genetic algorithms are search procedures often used for optimization problems When using fuzzy logic it is often difficult for an expert to provide ldquogoodrdquo definitions for the membership functions for the fuzzy variables

Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Genetic Algorithms

The evolution process of the fitness of the populationincluding the fitness of the most fit individual the fitness of the least fit individual and the average fitness of the whole population

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Genetic Algorithms FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Figure 7: The evolution process for tuning fuzzy membership functions, in terms of the similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) to the reference rule set.

Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). This graph also demonstrates that the quality of the solution increases as the evolution process proceeds.

Conclusion

The integration of data mining techniques with fuzzy logic provides new techniques to support both anomaly detection and misuse detection components, at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by our system and to select the most effective set of features for particular types of intrusions.

Current work covers the misuse detection components, the decision module, additional machine learning components, and a graphical user interface for the system. The next planned step is to extend the system to operate in a high-performance cluster computing environment.

References

Ilgun, K., and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.

Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge System Laboratory, National Research Council Canada.

Kuok, C., A. Fu, and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)

Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel, and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)

Queries


Intrusion Detection

Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusions, defined as attempts to bypass the security mechanisms of a computer or network ("compromise the confidentiality, integrity, availability of information resources").

An Intrusion Detection System (IDS) is a combination of software and hardware that attempts to perform intrusion detection and raises an alarm when a possible intrusion happens.

Need of Intrusion Detection

Security mechanisms always have inevitable vulnerabilities.

Current firewalls are not sufficient to ensure security in computer networks.

"Security holes" are caused by allowances made to users, programmers, and administrators.

Insider attacks.

Multiple levels of data confidentiality in commercial and government organizations need multi-layer protection in firewalls.

System Goals

The long-term goal is to design and build an intelligent intrusion detection system that is:

Distributed

Real-time

Accurate (low false negative and false positive rates)

Flexible

Not easily fooled by small variations in intrusion patterns

Adaptive in new environments

Modular, with both misuse and anomaly detection components

Architecture

Data Mining for Intrusion Detection

Misuse detection

Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive").

These models can be more sophisticated and precise than manually created signatures.

Unable to detect attacks whose instances have not yet been observed.

Anomaly detection

Build models of "normal" behavior and detect anomalies as deviations from it.

Possible high false alarm rate: previously unseen (yet legitimate) system behaviors may be recognized as anomalies.

Anomaly Detection via Fuzzy Data Mining

Fuzzy logic

Allows one to represent concepts that could be considered to be in more than one category (or, from another point of view, it allows representation of overlapping categories).

Partial membership in sets or categories.

Data mining

Automatically learn patterns from large quantities of data.

The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection.

Fuzzy Logic Method

ID using Fuzzy Logic

Suppose one wants to write a rule such as:

If the number of different destination addresses during the last 2 seconds was high
Then an unusual situation exists

Using fuzzy logic, a rule like the one shown above could be written as:

If DP = high
Then an unusual situation exists

DP is a fuzzy variable and high is a fuzzy set.

The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated.
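To make the idea of a degree of membership concrete, here is a small sketch of the rule "If DP = high Then an unusual situation exists". The piecewise-linear shape of the membership function, the parameter values `a` and `b`, and the activation cutoff are all assumptions made for illustration; the slides only state that a membership function is defined by two parameters.

```python
def membership_high(x, a, b):
    """Degree to which x belongs to the fuzzy set 'high':
    0 at or below a, 1 at or above b, linear in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def unusual_situation(dp, a=10, b=30, activation=0.5):
    """Return the membership degree of DP in 'high' and whether the rule fires."""
    degree = membership_high(dp, a, b)
    return degree, degree >= activation


for dp in (5, 20, 25, 40):
    print(dp, unusual_situation(dp))
```

Unlike a crisp threshold, the membership degree rises gradually as DP moves from `a` to `b`, so nearby values are treated similarly.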

ID using Data Mining

Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection:

Association rules

Frequency episodes

Fuzzy association rules

Fuzzy frequency episodes

Association Rules

Association rules were first developed to find correlations in transactions using retail data.

For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.

The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
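The soft drink example can be checked with a few lines of code. This sketch computes support and confidence for a single rule directly from their definitions; it is not the Apriori algorithm itself, and the toy transactions are invented so that the numbers match the example (s = 0.25, c = 0.50).

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Support = fraction of transactions containing both sides of the rule.
    Confidence = fraction of antecedent transactions that also contain the consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Invented toy data: 4 customers, 2 buy soft drinks, 1 of those also buys chips.
transactions = [
    {"soft drink", "potato chips"},
    {"soft drink"},
    {"bread"},
    {"milk"},
]
s, c = support_and_confidence(transactions, {"soft drink"}, {"potato chips"})
print(s, c)
```

A rule would be kept only if `s` is at least minsupport and `c` is at least minconfidence.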

Fuzzy Association Rules

Mining association rules over quantitative attributes usually requires partitioning their values into discrete categories. This gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.

The method of Kuok, Fu, and Wong (1998) allows a value to contribute to the support of more than one fuzzy set.

For anomaly detection, we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data, and the similarity of this new rule set and the reference set is computed.

An example of a fuzzy association rule from one set of audit data is:

{SN=LOW, FN=LOW} → {RN=LOW}, c = 0.924, s = 0.49

where SN is the number of SYN flags, FN is the number of FIN flags, and RN is the number of RST flags in a 2 second period.
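The sharp boundary problem, and the way fuzzy sets soften it, can be illustrated with a single attribute. The sketch below is a simplification made for illustration: the membership shapes, the cutoff of 10, the parameters `a` and `b`, and the sample counts are all assumptions, and the fuzzy support is computed for one attribute only (combining several attributes in one rule is not shown).

```python
def crisp_low(x, cutoff=10.0):
    """Crisp category: a value is LOW or it is not."""
    return 1.0 if x < cutoff else 0.0


def fuzzy_low(x, a=5.0, b=15.0):
    """Fuzzy membership in LOW: 1 at or below a, 0 at or above b, linear in between."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def fuzzy_medium_or_higher(x, a=5.0, b=15.0):
    """Complementary fuzzy set, so one value contributes to both sets."""
    return 1.0 - fuzzy_low(x, a, b)


def fuzzy_support(values, membership):
    """Average membership degree over all records."""
    return sum(membership(v) for v in values) / len(values)


# A small change around the cutoff flips the crisp category but barely moves the fuzzy one.
print(crisp_low(9.9), crisp_low(10.1))
print(fuzzy_low(9.9), fuzzy_low(10.1))

syn_counts = [2, 9, 11, 20]  # invented SYN counts per 2 second window
print(fuzzy_support(syn_counts, fuzzy_low))
```

The value 9 contributes 0.6 to LOW and 0.4 to the complementary set, which is the sense in which a value can support more than one fuzzy set.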

The figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions.

Figure: Comparison of similarities between the training data set and different test data sets for fuzzy association rules (minconfidence = 0.6, minsupport = 0.1). Training data set: reference (representing normal behavior). Test data sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions), and network3 (including simulated port scanning intrusions).

Frequency Episodes

The algorithm discovers simple serial frequency episodes from event sequences based on minimal occurrences. It was later extended to mine fuzzy frequency episodes.

An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval.

Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.

So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, …, ek) is called frequent if frequency(P)/n ≥ minfrequency.
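The definitions above can be turned into a small sketch that counts minimal occurrences of a serial episode. It is a straightforward illustration, not the mining algorithm itself (which searches for all frequent episodes). Assumptions made here: events are `(timestamp, type)` pairs with strictly increasing timestamps, "smaller than window" is read as a strict inequality on the interval length, and `n` is passed in by the caller because the slides do not define it (the example uses the number of events).

```python
def minimal_occurrences(events, episode, window):
    """Return (start, end) timestamps of minimal occurrences of a serial episode
    whose length is smaller than `window`."""
    latest_start_for_end = {}
    for i, (_, etype) in enumerate(events):
        if etype != episode[0]:
            continue
        pos = i
        for wanted in episode[1:]:
            # Greedily take the earliest later event of the wanted type.
            pos = next(
                (j for j in range(pos + 1, len(events)) if events[j][1] == wanted),
                None,
            )
            if pos is None:
                break
        if pos is not None:
            # A later start with the same end replaces the earlier one,
            # so only the shortest (minimal) interval per end is kept.
            latest_start_for_end[pos] = i
    occurrences = [
        (events[start][0], events[end][0])
        for end, start in sorted(latest_start_for_end.items())
    ]
    return [(ts, te) for ts, te in occurrences if te - ts < window]


def is_frequent(frequency, n, minfrequency):
    """Frequency test from the text: frequency(P)/n >= minfrequency."""
    return frequency / n >= minfrequency


# Invented event sequence: (timestamp in seconds, event type).
events = [(0, "A"), (1, "A"), (2, "B"), (5, "A"), (9, "B")]
short = minimal_occurrences(events, ("A", "B"), window=3)
wide = minimal_occurrences(events, ("A", "B"), window=10)
print(short, wide, is_frequent(len(short), len(events), 0.1))
```

In the example, the interval from 0 to 2 is not minimal because the shorter interval from 1 to 2 also contains the episode.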

Fuzzy Frequency Episodes

Fuzzy frequency episodes integrate fuzzy logic with frequency episodes, so that events can involve quantitative attributes.

An example of a fuzzy frequency episode is given below:

{E1: PN=LOW, E2: PN=MEDIUM} → {E3: PN=MEDIUM}, c = 0.854, s = 0.108, w = 10 seconds

where E1, E2, and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2 second period.

The use of fuzzy logic with frequency episodes results in a reduction of the false positive error rate.



042023 8042023 8

The long term goal to design and build an intelligent intrusion detection system that are

Distributed

Real-time

Accurate (low false negative and false positive rates)

Flexible

Adaptive in new environments

Modular with both misuse and anomaly detection components

042023 8

System Goals FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Not easily fooled by small variations in intrusion patterns

042023 9

Architecture FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

042023 10042023 10

Data Mining

for Intrusion Detection FUZZY DATA MINING AND GENETIC ALGORITHMS

APPLIED TO INTRUSION DETECTION

Misuse detection

Anomaly detection

These models can be more sophisticated and precise than manually created signatures

Unable to detect attacks whose instances have not yet been observed

Predictive models are built from labeled data sets (instances are labeled as ldquonormalrdquo or ldquointrusiverdquo)

Build models of ldquonormalrdquo behavior and detect anomalies as deviations from it

Possible high false alarm rate - previously unseen (yet legitimate)system behaviors may be recognized as anomalies

One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)

Partial membership in sets or categories

042023 11

Anomaly Detection via Fuzzy Data Mining

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Automatically learn patterns from large quantities of data

The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection

Fuzzy logic

Data Mining

Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Fuzzy Logic

ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Suppose one wants to write a rule such as

If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists

Using fuzzy logic a rule like the one shown above could be written as

If the DP = highThen an unusual situation exists

DP is a fuzzy variable and high is a fuzzy set

The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated

ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining

Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection

Association Rules

Frequency episodes

Fuzzy Association Rules

Fuzzy Frequency episodes

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Association Rules

Association rules are developed to find correlations in transactions using retail data

For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050

The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Fuzzy Association Rules

This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category

Their method allows a value to contribute to the support of more than one fuzzy set

For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed

An example of a fuzzy association rule from one set of audit data is

SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049

where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions

Fuzzy Association Rules

Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Frequency Episodes

The algorithm for discovering simple serial frequency episodes from event sequences is based on minimal occurrences. It is later extended to mine fuzzy frequency episodes.

An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval.

Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.

So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, …, ek) is called frequent if

frequency(P) / n ≥ minfrequency
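The definitions above can be sketched in code. This is an illustrative sketch, not the algorithm used by the system: it assumes events are (timestamp, type) pairs sorted by distinct timestamps, and, because the slide does not define n, it assumes n is the number of events in the sequence.

```python
def minimal_occurrences(events, episode):
    """Return (start_time, end_time) for each minimal occurrence of a serial episode."""
    found = []
    prev_start = -1
    for end, (_, kind) in enumerate(events):
        if kind != episode[-1]:
            continue
        # Match the episode backwards from `end`, taking each event as late as
        # possible; this gives the latest start of an occurrence ending here.
        need = len(episode) - 2
        pos = end - 1
        while need >= 0 and pos >= 0:
            if events[pos][1] == episode[need]:
                need -= 1
            pos -= 1
        if need >= 0:
            continue  # no complete occurrence ends at this event
        start = pos + 1
        # Latest starts never decrease, so the occurrence is minimal exactly
        # when its start is later than that of the previous occurrence.
        if start > prev_start:
            found.append((events[start][0], events[end][0]))
        prev_start = start
    return found


def frequency(events, episode, window):
    # Count minimal occurrences whose time span is smaller than `window`.
    return sum(1 for s, e in minimal_occurrences(events, episode) if e - s < window)


def is_frequent(events, episode, window, minfrequency):
    n = len(events)  # assumption: n is the number of events in the sequence
    return frequency(events, episode, window) / n >= minfrequency


events = [(0, "A"), (1, "A"), (2, "B"), (3, "B"), (10, "A"), (25, "B")]
print(minimal_occurrences(events, ("A", "B")))  # [(1, 2), (10, 25)]
print(frequency(events, ("A", "B"), window=5))  # 1
```

The interval (1, 3) is also an occurrence of A followed by B, but it is not minimal because it contains the shorter occurrence (1, 2).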

Fuzzy Frequency Episodes

Fuzzy frequency episodes involve quantitative attributes in an event.

An example of a fuzzy frequency episode is given below:

{E1: PN=LOW, E2: PN=MEDIUM} → {E3: PN=MEDIUM}, c = 0.854, s = 0.108, w = 10 seconds

where E1, E2, and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2-second period.

The integration of fuzzy logic with frequency episodes results in a reduction of the false positive error rate.

Misuse Detection

The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.

A simple example of a rule from the misuse detection component is:

IF the number of consecutive logins by a user is greater than 3
THEN the behavior is suspicious

Information from a number of misuse detection components will be combined by the decision component to determine whether an alarm should be raised.
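The example misuse rule (more than 3 consecutive logins by a user) can be sketched as a plain, non-fuzzy check. The sketch is written in Python rather than FuzzyCLIPS, and the input format (a time-ordered list of user names, one per login event) is an assumption made for illustration.

```python
from itertools import groupby


def suspicious_users(login_events, threshold=3):
    """Return users with a run of more than `threshold` consecutive logins."""
    flagged = set()
    # groupby collapses each run of identical adjacent user names into one group.
    for user, run in groupby(login_events):
        if sum(1 for _ in run) > threshold:
            flagged.add(user)
    return flagged


# Hypothetical login stream: bob logs in four times in a row.
logins = ["alice", "bob", "bob", "bob", "bob", "alice"]
print(suspicious_users(logins))  # {'bob'}
```

A decision component could then combine the output of several such checks before raising an alarm.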

Genetic Algorithms

Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic, it is often difficult for an expert to provide "good" definitions for the membership functions for the fuzzy variables.

The genetic algorithm works by slowly "evolving" a population of chromosomes that represent better and better solutions to the problem.

Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).

The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set, while decreasing the similarity of rules mined from intrusion data and the reference rule set.
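The tuning loop described above can be sketched as a small genetic algorithm. Everything specific here is an assumption made for illustration: the fitness is taken as the difference between the two similarities, the operators are truncation selection with elitism, one-point crossover, and Gaussian mutation, and the two similarity functions are toy stand-ins for "mine rules with these membership parameters and compare them with the reference rule set".

```python
import random


def decode(chromosome):
    """Group a flat chromosome into parameter pairs, one pair per membership function."""
    return [tuple(chromosome[i:i + 2]) for i in range(0, len(chromosome), 2)]


def make_fitness(sim_normal, sim_intrusion):
    # Assumed fitness: reward similarity on normal data, penalize it on intrusion data.
    return lambda chromosome: sim_normal(chromosome) - sim_intrusion(chromosome)


def evolve(fitness, n_functions, pop_size=20, generations=40, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    length = 2 * n_functions  # two parameters per membership function
    population = [[rng.random() for _ in range(length)] for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_gen = ranked[:2]  # elitism: keep the two fittest unchanged
        while len(next_gen) < pop_size:
            p1, p2 = rng.sample(ranked[:pop_size // 2], 2)  # parents from the top half
            cut = rng.randrange(1, length)                  # one-point crossover
            child = [g + rng.gauss(0, 0.1) if rng.random() < mutation_rate else g
                     for g in p1[:cut] + p2[cut:]]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness), history


# Toy stand-ins for the two similarity measurements (hypothetical).
sim_normal = lambda c: 1.0 - sum((g - 0.5) ** 2 for g in c) / len(c)
sim_intrusion = lambda c: sum((g - 0.5) ** 2 for g in c) / len(c)

best, history = evolve(make_fitness(sim_normal, sim_intrusion), n_functions=3)
print(decode(best))
print(history[0], history[-1])
```

Because the two fittest chromosomes are always carried over, the best fitness recorded in `history` never decreases from one generation to the next.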

Genetic Algorithms

Figure: The evolution process of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual, and the average fitness of the whole population.

Figure 7: The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.

Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). This graph also demonstrates that the quality of the solution increases as the evolution process proceeds.

Conclusion

The integration of data mining techniques with fuzzy logic provides new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by our system and to select the most effective set of features for particular types of intrusions.

Current work covers the misuse detection components, the decision module, additional machine learning components, and a graphical user interface for the system. The next planned step is to extend this system to operate in a high-performance cluster computing environment.


References

Ilgun, K., and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.

Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge Systems Laboratory, National Research Council Canada.

Kuok, C., A. Fu, and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)

Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel, and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)

Queries

Architecture

Data Mining for Intrusion Detection

Misuse detection

Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")

These models can be more sophisticated and precise than manually created signatures

Unable to detect attacks whose instances have not yet been observed

Anomaly detection

Build models of "normal" behavior and detect anomalies as deviations from it

Possible high false alarm rate: previously unseen (yet legitimate) system behaviors may be recognized as anomalies

Anomaly Detection via Fuzzy Data Mining

The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection

Data Mining

Automatically learn patterns from large quantities of data

Fuzzy logic

Allows one to represent concepts that could be considered to be in more than one category (or, from another point of view, allows representation of overlapping categories)

Partial membership in sets or categories

Fuzzy Logic Method

Fuzzy Logic

ID using Fuzzy Logic

Suppose one wants to write a rule such as:

If the number of different destination addresses during the last 2 seconds was high
Then an unusual situation exists

Using fuzzy logic, a rule like the one shown above could be written as:

If the DP = high
Then an unusual situation exists

DP is a fuzzy variable and high is a fuzzy set.

The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated.
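The rule above can be sketched as a membership function plus an activation check. The linear ramp shape, its two parameters (a = 10, b = 30), and the activation threshold are assumptions chosen for illustration; the slides do not give the actual membership function for the fuzzy set high.

```python
def membership_high(x, a=10.0, b=30.0):
    """Degree to which x belongs to the fuzzy set `high` (assumed linear ramp)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def unusual_situation(dp, threshold=0.5):
    """Fire the rule 'If DP = high Then an unusual situation exists'."""
    degree = membership_high(dp)
    # Return the degree as well, so a caller can weigh the conclusion.
    return degree >= threshold, degree


for dp in (5, 20, 40):
    print(dp, unusual_situation(dp))
```

Returning the degree alongside the yes/no answer lets a later decision step treat a barely activated rule differently from a fully activated one.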

ID using Data Mining

Two data mining methods, and their fuzzy extensions, have been used to mine audit data to find normal patterns for anomaly intrusion detection:

Association rules

Frequency episodes

Fuzzy association rules

Fuzzy frequency episodes


Association Rules

Association rules were developed to find correlations in transactions using retail data.

For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.

The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
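The support and confidence figures in the soft drink example can be reproduced with a short sketch. The eight transactions are made-up data chosen to match the percentages in the example, and the code only evaluates a single rule; it is not an implementation of Apriori.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by the support of its antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical data: 8 customers, 4 buy a soft drink, 2 of those also buy chips.
transactions = [
    {"soft drink", "chips"}, {"soft drink", "chips"},
    {"soft drink"}, {"soft drink", "bread"},
    {"bread"}, {"milk"}, {"milk", "bread"}, {"eggs"},
]
s = support(transactions, {"soft drink", "chips"})
c = confidence(transactions, {"soft drink"}, {"chips"})
print(s, c)  # 0.25 0.5

# A rule is kept only if it clears both thresholds.
minsupport, minconfidence = 0.1, 0.4
print(s >= minsupport and c >= minconfidence)  # True
```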



One to represent concepts that could be considered to be in more than one category (or from another point of viewmdashit allows representation of overlapping categories)

Partial membership in sets or categories

042023 11

Anomaly Detection via Fuzzy Data Mining

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Automatically learn patterns from large quantities of data

The integration of fuzzy logic with data mining methods helps to create more abstract and flexible patterns for intrusion detection

Fuzzy logic

Data Mining

Fuzzy Logic Method FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Fuzzy Logic

ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Suppose one wants to write a rule such as

If the number different destination addresses during the last 2 seconds was highThen an unusual situation exists

Using fuzzy logic a rule like the one shown above could be written as

If the DP = highThen an unusual situation exists

DP is a fuzzy variable and high is a fuzzy set

The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated

ID using Fuzzy Logic FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

ID using Data Mining FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTIONID Using DataMining

Two data mining methods have been used to mine audit data to find normal patterns for anomaly intrusion detection

Association Rules

Frequency episodes

Fuzzy Association Rules

Fuzzy Frequency episodes

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Association Rules

Association rules are developed to find correlations in transactions using retail data

For example if a customer who buys a soft drink (A) usually also buys potato chips (B) then potato chips are associated with soft drinks using the rule A B Suppose that 25 of all customers buy both soft drinks and potato chips and that 50 of the customers who buy soft drinks also buy potato chips Then the degree of support for the rule is s = 025 and the degree of confidence in the rule is c = 050

The Apriori algorithm requires two thresholds of minconfidence (representing minimum confidence) and minsupport (representing minimum support)

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Fuzzy Association Rules

This gives rise to the ldquosharp boundary problemrdquo in which a very small change in value causes an abrupt change in category

Their method allows a value to contribute to the support of more than one fuzzy set

For anomaly detection we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior When considering a new set of audit data a set of association rules is mined from the new data and the similarity of this new rule set and the reference set is computed

An example of a fuzzy association rule from one set of audit data is

SN=LOW FN=LOW rarr RN=LOW c = 0924 s = 049

where SN is the number of SYN flags FN is the number of FIN flags and RN is the number of RST flags in a 2 second period

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Figure shows results from one experiment comparing the similarities with the reference set of rules mined from data without intrusions and with intrusions

Fuzzy Association Rules

Comparison of Similarities Between Training Data Set and Different Test Data Sets for Fuzzy Association Rules (minconfidence=06 minsupport=01Training Data Set reference (representing normal behavior) Test Data Sets baseline (representing normal behavior) network1 (including simulated IP spoofing intrusions) andnetwork3 (including simulated port scanning intrusions)

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Frequency Episodes

This algorithm for discovering simple serial frequency episodes from event sequences based on minimal occurrencesLater it is used to mine to fuzzy frequency episodes

An event is characterized by a set of attributes at a point in time An episode P(e1e2 hellip ek) is a sequence of events that occurs within a time window [ttrsquo] The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval

Given a threshold of window (representing timestamp bounds) the frequency of P(e1e2 hellip ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window

So given another threshold minfrequency (representing minimum frequency) an episode P(e1e2 hellip ek) is called frequent

if frequency(P)n geminfrequency

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION

Fuzzy Frequency Episodes

Fuzzy frequency episodes allow events to carry quantitative attributes.

An example of a fuzzy frequency episode is given below:

{E1: PN=LOW, E2: PN=MEDIUM} → {E3: PN=MEDIUM}, c = 0.854, s = 0.108, w = 10 seconds

where E1, E2, and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2 second period.

Integrating fuzzy logic with frequency episodes in this way results in a reduction of the false positive error rate.


Misuse Detection

The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.

A simple example of a rule from the misuse detection component is:

IF the number of consecutive logins by a user is greater than 3
THEN the behavior is suspicious

Information from a number of misuse detection components will be combined by the decision component to determine whether an alarm should be raised.
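For readers unfamiliar with rule-based systems, the example rule and the role of the decision component can be illustrated in Python. The system described here uses FuzzyCLIPS, so this is only an analogy: both function names are hypothetical, and combining component results with a simple "any" is an assumption, since the slides do not say how the decision component weighs its inputs.

```python
def too_many_logins(consecutive_logins, limit=3):
    """Non-fuzzy misuse rule from the text: more than 3 consecutive logins."""
    return consecutive_logins > limit


def decide(component_flags):
    """Hypothetical decision component: alarm if any component is suspicious."""
    return any(component_flags)


flags = [too_many_logins(5), too_many_logins(1)]
print(decide(flags))
```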

Genetic Algorithms

Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic, it is often difficult for an expert to provide "good" definitions for the membership functions for the fuzzy variables, so a genetic algorithm is used to tune them.

The genetic algorithm works by slowly "evolving" a population of chromosomes that represent better and better solutions to the problem.

Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).

The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set.
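The evolutionary loop can be sketched as below. This is a generic genetic algorithm under stated assumptions, not the implementation used in the system: the selection, crossover, and mutation operators are common textbook choices, and `toy_fitness` is a hypothetical stand-in for the real fitness, which would mine rules with the candidate membership parameters and reward similarity to the reference rules for normal data and dissimilarity for intrusion data.

```python
import random


def toy_fitness(chromosome):
    """Hypothetical stand-in for: sim(normal, reference) - sim(intrusion, reference)."""
    target = [0.2, 0.6, 0.5, 0.9]  # invented optimum, two parameters per function
    return -sum((g - t) ** 2 for g, t in zip(chromosome, target))


def evolve(fitness, n_params, pop_size=20, generations=30, seed=0):
    """Evolve parameter vectors in [0, 1]; return the best one and its history."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 1.0) for _ in range(n_params)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[:pop_size // 2]            # truncation selection
        children = [pop[0][:]]                   # elitism: keep the best as is
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)     # one-point crossover
            child = mum[:cut] + dad[cut:]
            j = rng.randrange(n_params)          # mutate a single gene
            child[j] = min(1.0, max(0.0, child[j] + rng.gauss(0.0, 0.1)))
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve(toy_fitness, n_params=4)
print(best, history[0], history[-1])
```

Keeping the best chromosome unchanged in each generation means the best fitness never decreases from one generation to the next.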

Genetic Algorithms

Figure: The evolution process of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual, and the average fitness of the whole population.


Figure 7: The evolution process for tuning fuzzy membership functions in terms of similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.

Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the "normal" rules and similarity of the mined rules to the "abnormal" rules). This graph also demonstrates that the quality of the solution increases as the evolution process proceeds.


Conclusion

Integrating data mining techniques with fuzzy logic provides new techniques to support both anomaly detection and misuse detection components at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by our system and to select the most effective set of features for particular types of intrusions.

Current work covers the misuse detection components, the decision module, additional machine learning components, and a graphical user interface for the system. The next planned step is to extend this system to operate in a high-performance cluster computing environment.


References

Ilgun, K., and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.

Orchard, R. 1995. FuzzyCLIPS version 6.04 user's guide. Knowledge Systems Laboratory, National Research Council Canada.

Kuok, C., A. Fu, and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)

Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel, and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (httpseicmuedupublicationsdocuments99reports99tr028abstracthtml)


Queries


Fuzzy Logic Method

Fuzzy Logic

ID using Fuzzy Logic

Suppose one wants to write a rule such as:

If the number of different destination addresses during the last 2 seconds was high
Then an unusual situation exists

Using fuzzy logic, a rule like the one shown above could be written as:

If the DP = high
Then an unusual situation exists

DP is a fuzzy variable and high is a fuzzy set.

The degree of membership of the number of destination ports in the fuzzy set high determines whether or not the rule is activated.
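The way a degree of membership activates the rule can be sketched in Python. This is only an illustration: the ramp-shaped membership function `ramp_up`, its breakpoints `a` and `b`, and the activation `threshold` are assumptions, not values taken from the system described here.

```python
def ramp_up(x, a, b):
    """Hypothetical membership in 'high': 0 below a, 1 above b, linear between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def unusual_situation(dp, a=10, b=30, threshold=0.5):
    """Return (degree of membership of dp in 'high', whether the rule fires)."""
    degree = ramp_up(dp, a, b)
    return degree, degree >= threshold


for dp in (5, 20, 40):
    print(dp, unusual_situation(dp))
```

A fuzzy rule engine would normally carry the degree forward rather than cut it off at a threshold; the threshold is used here only to make the activation visible.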

ID using Data Mining

Two data mining methods, association rules and frequency episodes, together with their fuzzy extensions, have been used to mine audit data to find normal patterns for anomaly intrusion detection:

  • Association rules
  • Frequency episodes
  • Fuzzy association rules
  • Fuzzy frequency episodes


Association Rules

Association rules were first developed to find correlations in transactions using retail data.

For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.

The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
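The soft drink example and the two thresholds can be reproduced with a few lines of Python. The transactions below are invented so that the numbers match the example, and the threshold values are arbitrary; this shows only how support and confidence are computed, not the Apriori search itself.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together over support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Invented data: 8 customers, 4 buy soft drinks, 2 of those also buy chips.
transactions = (
    [{"soft drink", "chips"}] * 2
    + [{"soft drink"}] * 2
    + [{"bread"}] * 4
)

s = support(transactions, {"soft drink", "chips"})       # 0.25
c = confidence(transactions, {"soft drink"}, {"chips"})  # 0.5

minsupport, minconfidence = 0.1, 0.4  # arbitrary illustrative thresholds
print(s, c, s >= minsupport and c >= minconfidence)
```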


Fuzzy Association Rules

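Mining association rules over quantitative attributes normally requires partitioning each attribute's values into discrete categories. This gives rise to the "sharp boundary problem", in which a very small change in value causes an abrupt change in category.

The fuzzy association rule method of Kuok, Fu, and Wong (1998) addresses this by allowing a value to contribute to the support of more than one fuzzy set.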
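The contrast between a sharp boundary and fuzzy sets can be seen in a small Python sketch. The category names, the boundary at 10, and the linear overlap between 5 and 15 are all assumptions chosen for illustration.

```python
def crisp_category(x, boundary=10.0):
    """Sharp partition: a value belongs to exactly one category."""
    return "LOW" if x < boundary else "MEDIUM"


def fuzzy_memberships(x, a=5.0, b=15.0):
    """Overlapping fuzzy sets: membership shifts gradually between a and b."""
    if x <= a:
        medium = 0.0
    elif x >= b:
        medium = 1.0
    else:
        medium = (x - a) / (b - a)
    return {"LOW": 1.0 - medium, "MEDIUM": medium}


for x in (9.9, 10.1):
    print(x, crisp_category(x), fuzzy_memberships(x))
```

The two values fall into different crisp categories, while their fuzzy memberships differ only slightly and each value contributes to both sets.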

ID Using Data Mining

Two data mining methods, together with their fuzzy extensions, have been used to mine audit data to find normal patterns for anomaly intrusion detection:

Association rules

Frequency episodes

Fuzzy association rules

Fuzzy frequency episodes

Association Rules

Association rules were first developed to find correlations in transactions using retail data.

For example, if a customer who buys a soft drink (A) usually also buys potato chips (B), then potato chips are associated with soft drinks using the rule A → B. Suppose that 25% of all customers buy both soft drinks and potato chips and that 50% of the customers who buy soft drinks also buy potato chips. Then the degree of support for the rule is s = 0.25 and the degree of confidence in the rule is c = 0.50.

The Apriori algorithm requires two thresholds: minconfidence (representing minimum confidence) and minsupport (representing minimum support).
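The support and confidence figures in the soft drink example can be reproduced with a short sketch. The basket data and the function name `support_and_confidence` are made up for illustration; this only shows how the two measures are computed, not the Apriori search itself.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    # Count transactions that contain the antecedent, and those that contain both sides.
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if both <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical baskets: 8 customers, 4 buy soda, 2 buy soda and chips.
baskets = [
    {"soda", "chips"}, {"soda", "chips"}, {"soda"}, {"soda", "bread"},
    {"bread"}, {"milk"}, {"chips"}, {"milk", "bread"},
]

s, c = support_and_confidence(baskets, {"soda"}, {"chips"})
print(s, c)  # 0.25 0.5

# A rule is kept only if it clears both thresholds (values assumed here).
minsupport, minconfidence = 0.1, 0.4
rule_is_kept = s >= minsupport and c >= minconfidence
```

With these baskets the rule soda → chips has s = 0.25 and c = 0.50, matching the numbers in the example above.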

Fuzzy Association Rules

Mining association rules over quantitative attributes normally requires partitioning each attribute into intervals. This gives rise to the “sharp boundary problem”, in which a very small change in value causes an abrupt change in category.

Fuzzy association rules, following the method of Kuok, Fu, and Wong (1998), address this by allowing a value to contribute to the support of more than one fuzzy set.

For anomaly detection, we mine a set of rules from a data set with no intrusions (termed a reference data set) and use this as a description of normal behavior. When considering a new set of audit data, a set of association rules is mined from the new data, and the similarity of this new rule set and the reference set is computed.

An example of a fuzzy association rule from one set of audit data is:

{SN=LOW, FN=LOW} → {RN=LOW}, c = 0.924, s = 0.49

where SN is the number of SYN flags, FN is the number of FIN flags, and RN is the number of RST flags in a 2-second period.
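As a rough illustration of how a value can contribute partially to a fuzzy set, the sketch below computes fuzzy support and confidence for a rule of the same shape as the one above. The linear shape of the LOW membership function, its parameters (2 and 6), the use of a product to combine membership degrees, and the four audit records are all assumptions made for this example; they are not taken from the system described here.

```python
def low_membership(x, a, b):
    """Degree of membership in LOW: 1 up to a, 0 from b, linear in between (assumed shape)."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def fuzzy_support(records, items):
    """Average, over all records, of the product of the membership degrees of each item."""
    total = 0.0
    for record in records:
        degree = 1.0
        for attribute, membership in items:
            degree *= membership(record[attribute])
        total += degree
    return total / len(records)


def is_low(x):
    return low_membership(x, 2, 6)  # hypothetical parameters


# Hypothetical counts of SYN, FIN and RST flags in four 2-second windows.
records = [
    {"SN": 1, "FN": 1, "RN": 0},
    {"SN": 4, "FN": 1, "RN": 1},  # SN=4 is only "half" LOW
    {"SN": 1, "FN": 1, "RN": 8},
    {"SN": 8, "FN": 8, "RN": 8},
]

antecedent = [("SN", is_low), ("FN", is_low)]
rule = antecedent + [("RN", is_low)]

s = fuzzy_support(records, rule)                   # support of the whole rule
c = s / fuzzy_support(records, antecedent)         # confidence of the rule
print(s, c)
```

The second record shows the point of the fuzzy approach: SN = 4 counts toward LOW with degree 0.5 instead of falling abruptly on one side of an interval boundary.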

The figure shows results from one experiment comparing the similarity to the reference rule set of rules mined from data without intrusions and of rules mined from data with intrusions.

Figure: Comparison of similarities between the training data set and different test data sets for fuzzy association rules (minconfidence = 0.6, minsupport = 0.1). Training data set: reference (representing normal behavior). Test data sets: baseline (representing normal behavior), network1 (including simulated IP spoofing intrusions), and network3 (including simulated port scanning intrusions).

Frequency Episodes

The algorithm used here discovers simple serial frequency episodes from event sequences based on minimal occurrences. It is later extended to mine fuzzy frequency episodes.

An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval.

Given a threshold window (representing timestamp bounds), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.

So, given another threshold minfrequency (representing minimum frequency), an episode P(e1, e2, …, ek) is called frequent if frequency(P)/n ≥ minfrequency.
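The definitions above can be sketched in a few lines. This is an illustrative reading rather than the mining algorithm itself: events are reduced to a single symbol each, timestamps are assumed to be strictly increasing, and n is taken to be the number of events in the sequence, which the text does not state. The function names are hypothetical.

```python
def latest_start(events, episode, end):
    """Index of the latest start of an occurrence of `episode` ending at events[end], or None."""
    i = end
    for symbol in reversed(episode[:-1]):
        i -= 1
        while i >= 0 and events[i][1] != symbol:
            i -= 1
        if i < 0:
            return None
    return i


def minimal_occurrences(events, episode):
    """Return (t_start, t_end) for every minimal occurrence of a serial episode."""
    occurrences = []
    previous_start = -1
    for end, (_, symbol) in enumerate(events):
        if symbol != episode[-1]:
            continue
        start = latest_start(events, episode, end)
        # Same start as an earlier, shorter occurrence means this one is not minimal.
        if start is None or start <= previous_start:
            continue
        previous_start = start
        occurrences.append((events[start][0], events[end][0]))
    return occurrences


def frequency(events, episode, window):
    """Number of minimal occurrences that fit in an interval smaller than `window`."""
    return sum(1 for t0, t1 in minimal_occurrences(events, episode) if t1 - t0 < window)


# Hypothetical event sequence: (timestamp in seconds, event symbol).
events = [(1, "A"), (2, "B"), (3, "A"), (4, "B"), (5, "B")]
episode = ("A", "B")

occurrences = minimal_occurrences(events, episode)
freq = frequency(events, episode, window=2)
n = len(events)                      # assumption: n is the number of events
is_frequent = freq / n >= 0.3        # minfrequency = 0.3, chosen for the example
print(occurrences, freq, is_frequent)
```

The occurrence of A at time 3 followed by B at time 5 is not counted, because the interval [3, 4] already contains the episode.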

Fuzzy Frequency Episodes

Fuzzy frequency episodes involve quantitative attributes in an event.

An example of a fuzzy frequency episode is given below:

{E1: PN=LOW, E2: PN=MEDIUM} → {E3: PN=MEDIUM}, c = 0.854, s = 0.108, w = 10 seconds

where E1, E2, and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2-second period.

Integrating fuzzy logic with frequency episodes in this way results in a reduction of the false positive error rate.

Misuse Detection

The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.

A simple example of a rule from the misuse detection component is:

IF the number of consecutive logins by a user is greater than 3 THEN the behavior is suspicious

Information from a number of misuse detection components will be combined by the decision component to determine if an alarm should result.
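The difference between a non-fuzzy and a fuzzy version of the login rule can be sketched as follows. This is plain Python rather than FuzzyCLIPS, and the membership parameters, the `decide` function, and the alarm level are hypothetical choices made for illustration; the actual decision component is not specified here.

```python
def crisp_rule(consecutive_logins, threshold=3):
    """Non-fuzzy rule: suspicious as soon as the count exceeds the threshold."""
    return consecutive_logins > threshold


def suspicion_degree(consecutive_logins, a=2, b=5):
    """Fuzzy rule: degree of suspicion rising linearly from 0 at a to 1 at b (assumed shape)."""
    if consecutive_logins <= a:
        return 0.0
    if consecutive_logins >= b:
        return 1.0
    return (consecutive_logins - a) / (b - a)


def decide(degrees, alarm_level=0.5):
    """Hypothetical decision step: raise an alarm if any component is suspicious enough."""
    return max(degrees) >= alarm_level


print(crisp_rule(3), crisp_rule(4))                  # False True
print(suspicion_degree(3), suspicion_degree(4))
print(decide([suspicion_degree(4), 0.1]))            # True
```

The crisp rule jumps from "not suspicious" to "suspicious" between 3 and 4 logins, while the fuzzy version reports a gradually increasing degree that a decision step can weigh against other evidence.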

Genetic Algorithms

Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic, it is often difficult for an expert to provide “good” definitions for the membership functions for the fuzzy variables.

The genetic algorithm works by slowly “evolving” a population of chromosomes that represent better and better solutions to the problem.

Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).

The goal is to increase the similarity of rules mined from data without intrusions and the reference rule set while decreasing the similarity of rules mined from intrusion data and the reference rule set.

The evolution process of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual, and the average fitness of the whole population.
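A minimal sketch of such a genetic algorithm is given below. It is not the implementation used in the system: the selection scheme, one-point crossover, Gaussian mutation, elitism, and every numeric setting are assumptions, and the two similarity functions are toy stand-ins for "mine rules with these membership parameters and compare them with the reference rule set". The fitness is taken as the difference of the two similarities, which is one simple way to express the stated goal.

```python
import random


def make_fitness(sim_normal, sim_intrusion):
    """Reward similarity on normal data and penalise similarity on intrusion data."""
    return lambda chromosome: sim_normal(chromosome) - sim_intrusion(chromosome)


def evolve(fitness, n_params, pop_size=20, generations=30,
           mutation_rate=0.1, low=0.0, high=10.0, seed=0):
    rng = random.Random(seed)
    # Each chromosome is a flat list of parameters (two per membership function).
    population = [[rng.uniform(low, high) for _ in range(n_params)]
                  for _ in range(pop_size)]
    history = []  # (best, average, worst) fitness of each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        scores = [fitness(c) for c in ranked]
        history.append((scores[0], sum(scores) / len(scores), scores[-1]))
        parents = ranked[:pop_size // 2]       # keep the fitter half as parents
        children = [list(ranked[0])]           # elitism: best chromosome survives unchanged
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)   # one-point crossover
            child = mom[:cut] + dad[cut:]
            child = [min(high, max(low, g + rng.gauss(0.0, 0.5)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness), history


def closeness(target):
    """Toy similarity: 1 when the chromosome equals `target`, smaller further away."""
    return lambda chromosome: 1.0 / (1.0 + sum((g - t) ** 2
                                               for g, t in zip(chromosome, target)))


# Two membership functions, so four parameters; both targets are invented.
fitness = make_fitness(closeness([2.0, 6.0, 4.0, 9.0]),
                       closeness([8.0, 1.0, 7.0, 3.0]))
best, history = evolve(fitness, n_params=4)
print(history[0][0], history[-1][0], fitness(best))
```

The `history` list records the fitness of the most fit individual, the average fitness, and the fitness of the least fit individual per generation, which are the three curves described above. Because the best chromosome is always carried over, the best fitness never decreases from one generation to the next.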

Figure 7: The evolution process for tuning fuzzy membership functions, in terms of the similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) to the reference rule set.

Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (similarity of the mined rules to the “normal” rules and similarity of the mined rules to the “abnormal” rules). The graph also demonstrates that the quality of the solution increases as the evolution process proceeds.

Conclusion

Integrating data mining techniques with fuzzy logic provides new techniques to support both anomaly detection and misuse detection components, at both the individual workstation level and the network level. Genetic algorithms are used to tune the membership functions for the fuzzy variables used by our system and to select the most effective set of features for particular types of intrusions.

Work is currently under way on the misuse detection components, the decision module, additional machine learning components, and a graphical user interface for the system. There are also plans to extend the system to operate in a high-performance cluster computing environment.

References

Ilgun, K., and A. Kemmerer. 1995. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3): 181-99.

Orchard, R. 1995. FuzzyCLIPS version 6.04 user’s guide. Knowledge Systems Laboratory, National Research Council Canada.

Kuok, C., A. Fu, and M. Wong. 1998. Mining fuzzy association rules in databases. SIGMOD Record 17(1): 41-6. (Downloaded from http://www.acm.org/sigs/sigmod/record/issues/9803 on 1 March 1999.)

Allen, J., Alan Christie, William Fithen, John McHugh, Jed Pickel, and Ed Stoner. 2000. State of the Practice of Intrusion Detection Technologies. CMU/SEI-99-TR-028. Carnegie Mellon Software Engineering Institute. (http://sei.cmu.edu/publications/documents/99.reports/99tr028abstract.html)

Queries

Frequency Episodes

This algorithm discovers simple serial frequency episodes from event sequences based on minimal occurrences. It is later used to mine fuzzy frequency episodes.

An event is characterized by a set of attributes at a point in time. An episode P(e1, e2, …, ek) is a sequence of events that occurs within a time window [t, t']. The episode is minimal if there is no occurrence of the sequence in a subinterval of the time interval.

Given a threshold window (representing a timestamp bound), the frequency of P(e1, e2, …, ek) in an event sequence S is the total number of its minimal occurrences in any interval smaller than window.

Given another threshold minfrequency (representing the minimum frequency), an episode P(e1, e2, …, ek) is called frequent if

frequency(P) / n ≥ minfrequency
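The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm used by the system: the function names are made up, events are assumed to be `(timestamp, event_type)` pairs with strictly increasing timestamps, "smaller than window" is read as end time minus start time being strictly less than `window`, and `n` is passed in by the caller because the slides do not define it.

```python
def minimal_occurrences(events, episode):
    """Return (start_time, end_time) for each minimal occurrence of a serial episode.

    events  -- list of (timestamp, event_type), timestamps strictly increasing
    episode -- non-empty sequence of event types that must appear in this order
    """
    found = []
    last_start = None
    for end, (_, etype) in enumerate(events):
        if etype != episode[-1]:
            continue
        # Walk backwards, matching the remaining types as late as possible,
        # which gives the tightest interval ending at this event.
        pos, need = end, len(episode) - 2
        while need >= 0 and pos > 0:
            pos -= 1
            if events[pos][1] == episode[need]:
                need -= 1
        if need >= 0:
            continue  # the episode does not end at this event
        # If the start equals the previous hit's start, this interval
        # contains that earlier occurrence and is therefore not minimal.
        if pos != last_start:
            found.append((events[pos][0], events[end][0]))
            last_start = pos
    return found


def frequency(events, episode, window):
    """Count minimal occurrences whose time span is smaller than window."""
    return sum(1 for start, end in minimal_occurrences(events, episode)
               if end - start < window)


def is_frequent(events, episode, window, n, minfrequency):
    """Apply the test frequency(P) / n >= minfrequency (n supplied by the caller)."""
    return frequency(events, episode, window) / n >= minfrequency


# Made-up event sequence for illustration.
events = [(1, "A"), (2, "B"), (3, "A"), (4, "B"), (9, "B")]
occurrences = minimal_occurrences(events, ("A", "B"))
```

In this made-up sequence the pair ending at time 9 is not counted, because the interval from 3 to 9 contains the shorter occurrence from 3 to 4.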

Fuzzy Frequency Episodes

Fuzzy frequency episodes involve quantitative attributes in an event.

An example of a fuzzy frequency episode is given below:

{E1: PN = LOW, E2: PN = MEDIUM} → {E3: PN = MEDIUM}, c = 0.854, s = 0.108, w = 10 seconds

where E1, E2, and E3 are events that occur in that order, and PN is the number of distinct destination ports within a 2-second period.

Integrating fuzzy logic with frequency episodes results in a reduction of the false positive error rate.
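To make the labels LOW and MEDIUM concrete, the sketch below maps a numeric PN value to degrees of membership in three fuzzy sets. It is a hypothetical illustration: the piecewise-linear shapes, the set HIGH, and the breakpoints 2, 6, and 10 are assumptions chosen for the example and do not come from the system described here.

```python
def low(x, a, b):
    """Full membership up to a, falling linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def high(x, a, b):
    """No membership up to a, rising linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def medium(x, a, b):
    """Triangle that rises from a to the midpoint and falls back to 0 at b."""
    mid = (a + b) / 2
    if x <= a or x >= b:
        return 0.0
    if x <= mid:
        return (x - a) / (mid - a)
    return (b - x) / (b - mid)


# Hypothetical fuzzy sets for PN; each is described by two parameters (a, b).
PN_SETS = {
    "LOW": (low, 2, 6),
    "MEDIUM": (medium, 2, 10),
    "HIGH": (high, 6, 10),
}


def fuzzify(pn):
    """Return the degree of membership of pn in each fuzzy set."""
    return {name: f(pn, a, b) for name, (f, a, b) in PN_SETS.items()}


memberships = fuzzify(4)  # a PN of 4 is partly LOW and partly MEDIUM
```

With these made-up breakpoints, a PN of 4 belongs to LOW and to MEDIUM with degree 0.5 each, so a small change in PN shifts the memberships gradually instead of moving the event abruptly from one category to another.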

Misuse Detection

The misuse detection components are small rule-based expert systems that look for known patterns of intrusive behavior. The FuzzyCLIPS system allows us to implement both fuzzy and non-fuzzy rules.

A simple example of a rule from the misuse detection component is:

IF the number of consecutive logins by a user is greater than 3
THEN the behavior is suspicious

Information from a number of misuse detection components will be combined by the decision component to determine whether an alarm should be raised.
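The example rule can be written in both a non-fuzzy and a fuzzy form. The sketch below uses Python only for illustration; the system itself expresses such rules in FuzzyCLIPS. The fuzzy breakpoints, the `decide` function, and its alarm level are hypothetical, since the slides do not say how the decision component combines its inputs.

```python
LOGIN_THRESHOLD = 3  # taken from the example rule


def crisp_rule(consecutive_logins):
    """Non-fuzzy rule: suspicious only when the count exceeds the threshold."""
    return consecutive_logins > LOGIN_THRESHOLD


def fuzzy_rule(consecutive_logins, a=2, b=6):
    """Fuzzy variant: degree of suspicion in [0, 1].

    The breakpoints a and b are assumptions chosen for this example.
    """
    if consecutive_logins <= a:
        return 0.0
    if consecutive_logins >= b:
        return 1.0
    return (consecutive_logins - a) / (b - a)


def decide(suspicion_scores, alarm_level=0.5):
    """Hypothetical decision component.

    Raises an alarm when the strongest score from the misuse detection
    components reaches alarm_level.
    """
    return max(suspicion_scores, default=0.0) >= alarm_level


scores = [fuzzy_rule(4), fuzzy_rule(1)]
alarm = decide(scores)
```

Taking the maximum score is only one possible way to combine components; a weighted sum or a further rule base would fit the same interface.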

Genetic Algorithms

Genetic algorithms are search procedures often used for optimization problems. When using fuzzy logic, it is often difficult for an expert to provide "good" definitions for the membership functions of the fuzzy variables. Genetic algorithms are used here to tune these membership functions.

The genetic algorithm works by slowly "evolving" a population of chromosomes that represent better and better solutions to the problem.

Each fuzzy membership function can be defined using two parameters, as shown in Figure 3. Each chromosome for the GA consists of a sequence of these parameters (two per membership function). An initial population of chromosomes is generated randomly, where each chromosome represents a possible solution to the problem (a set of parameters).

The goal is to increase the similarity between the rules mined from data without intrusions and the reference rule set, while decreasing the similarity between the rules mined from intrusion data and the reference rule set.
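A minimal sketch of such a genetic algorithm is shown below. It is not the system's implementation. The selection scheme, one-point crossover, Gaussian mutation, and elitism are common textbook choices assumed here. The fitness is assumed to be the similarity to the "normal" rules minus the similarity to the intrusion rules, and the two toy similarity functions stand in for the real step of mining rules and comparing them with the reference rule set.

```python
import random


def make_fitness(sim_normal, sim_intrusion):
    """Assumed fitness: reward similarity to normal rules, penalize similarity to intrusion rules."""
    return lambda chromosome: sim_normal(chromosome) - sim_intrusion(chromosome)


def evolve(fitness, n_functions, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Evolve chromosomes holding two parameters per membership function."""
    rng = random.Random(seed)
    length = 2 * n_functions
    # Random initial population; every gene is a parameter scaled to [0, 1].
    population = [[rng.random() for _ in range(length)] for _ in range(pop_size)]
    history = []  # (best, worst, average) fitness for each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        scores = [fitness(c) for c in ranked]
        history.append((scores[0], scores[-1], sum(scores) / len(scores)))
        parents = ranked[: pop_size // 2]   # keep the fitter half as parents
        children = [ranked[0][:]]           # elitism: carry the best one over
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # one-point crossover
            child = mom[:cut] + dad[cut:]
            # Mutate some genes with Gaussian noise, clamped to [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness), history


# Toy stand-ins for rule-set similarity (made-up targets, illustration only).
NORMAL_TARGET = [0.2, 0.6, 0.4, 0.9]
INTRUSION_TARGET = [0.8, 0.1, 0.7, 0.3]


def toy_similarity(target):
    return lambda c: 1 - sum(abs(g - t) for g, t in zip(c, target)) / len(c)


fitness = make_fitness(toy_similarity(NORMAL_TARGET), toy_similarity(INTRUSION_TARGET))
best, history = evolve(fitness, n_functions=2)
```

The `history` list records the fitness of the most fit individual, the least fit individual, and the population average per generation, which is the kind of data an evolution plot would show. Because the best chromosome is always carried over, the best fitness cannot decrease from one generation to the next.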

Figure: The evolution process of the fitness of the population, including the fitness of the most fit individual, the fitness of the least fit individual, and the average fitness of the whole population.

Figure 7: The evolution process for tuning fuzzy membership functions in terms of the similarity of data sets containing intrusions (mscan1) and not containing intrusions (normal1) with the reference rule set.

Figure 7 demonstrates the evolution of the population of solutions in terms of the two components of the fitness function (the similarity of the mined rules to the "normal" rules and the similarity of the mined rules to the "abnormal" rules). The graph also shows that the quality of the solution increases as the evolution process proceeds.
