Using Machine Learning in Networks Intrusion Detection Systems

Post on 21-Apr-2017

TRANSCRIPT

  • Using Machine Learning in Networks Intrusion Detection Systems

    OMAR SHAYA

    Georg-August-Universität Göttingen

  • Sections

    Introduction

    Intrusion Detection Methodologies

    A Machine Learning Based IDS (Intrusion Detection System)

    Challenges of Using Machine Learning in Intrusion Detection

    Summary

    References

    Appendix

  • INTRODUCTION

    IDS: Intrusion Detection System

  • Increasing attacks on computer networks and the need for automated detection

    The Internet and networked computer systems have raised numerous security and privacy issues

    Network use has grown explosively for many reasons, e.g. the Internet, wireless networks, and cloud computing

    As a result, malicious attacks on networks have increased year over year

    Detection of these attacks needs to be automated
      Detection can be based on known attacks
      But what about attacks that have not been seen before? Machine learning?

  • Definition: intrusion & intrusion detection

    Intrusion is an attempt to compromise CIA (Confidentiality, Integrity, Availability), or to bypass the security mechanisms of a computer or network

    Intrusion detection is the process of monitoring the events occurring in a computer system or network, and analyzing them for signs of intrusion

  • INTRUSION DETECTION METHODOLOGIES

  • There are 3 main Detection Methodologies

    Signature-based Detection (SD)
      A signature is a string or pattern that corresponds to a known attack or threat
      SD is the process of comparing patterns against captured events to recognize possible intrusions
      Uses the knowledge accumulated from specific attacks and system vulnerabilities
      Also known as Knowledge-based Detection or Misuse Detection

    Anomaly-based Detection (AD)
      An anomaly is a deviation from normal behavior
      Profiles of normal behavior are derived from monitoring network traffic
      AD compares normal profiles with observed events to recognize attacks

    Stateful Protocol Analysis (SPA)
      SPA depends on vendor-developed generic profiles of specific protocols
      The protocols are based on standards from international standards organizations

    Hybrid IDS use multiple methodologies
      SD and AD are complementary methods: the former concerns known attacks and the latter focuses on unknown attacks
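
    The difference between SD and AD can be illustrated with a minimal sketch. The signature patterns, the traffic numbers, and the threshold k below are assumptions made up for illustration; they do not come from the slides or from any real IDS rule set.

```python
import re
import statistics

# Hypothetical signatures: each maps a name to a pattern of a known attack.
SIGNATURES = {
    "path-traversal": re.compile(r"\.\./"),
    "sql-injection": re.compile(r"union\s+select", re.IGNORECASE),
}

def signature_alerts(event):
    """SD: return the names of all known signatures found in the event."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(event)]

def build_profile(normal_samples):
    """AD: summarize normal behavior as (mean, standard deviation)."""
    return statistics.mean(normal_samples), statistics.stdev(normal_samples)

def is_anomalous(value, profile, k=3.0):
    """AD: flag values more than k standard deviations from the normal mean."""
    mean, std = profile
    return abs(value - mean) > k * std

# Requests per minute observed during normal operation (made-up numbers).
profile = build_profile([100, 105, 98, 102, 95])

print(signature_alerts("GET /../../etc/passwd"))  # a known pattern is present
print(signature_alerts("GET /index.html"))        # no known pattern
print(is_anomalous(104, profile))                 # close to the normal profile
print(is_anomalous(400, profile))                 # far from the normal profile
```

    SD can only report what is already in its signature list, while AD reports any strong deviation from the profile, whether or not the cause is a known attack.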

  • There are 3 main Detection Methodologies

    Hybrid IDS use multiple methodologies
      E.g. SD and AD are complementary methods: SD concerns known attacks and AD focuses on unknown attacks

    Signature-based Detection (SD)*
      SD is the process of comparing patterns against captured events to recognize possible intrusions
      A signature is a string or pattern that corresponds to a known attack or threat
      Uses the knowledge accumulated from specific attacks and system vulnerabilities

    Anomaly-based Detection (AD)
      AD compares normal profiles with observed events to recognize attacks
      An anomaly is a deviation from normal behavior
      Profiles of normal behavior are derived from monitoring network traffic

    Stateful Protocol Analysis (SPA)
      SPA depends on vendor-developed generic profiles of specific protocols
      "Stateful" indicates that the IDS can know and trace the protocol states (e.g., pairing requests with replies)
      The protocols are based on standards from international standards organizations

    * Also known as Knowledge-based Detection or Misuse Detection
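
    The idea of tracing protocol state by pairing requests with replies can be sketched as follows. The message kinds and transaction IDs are a hypothetical toy protocol chosen for illustration; real SPA relies on vendor-developed profiles of actual protocols.

```python
class RequestReplyTracker:
    """Tracks outstanding requests of a hypothetical request/reply protocol."""

    def __init__(self):
        self.pending = set()  # transaction IDs that still await a reply

    def observe(self, kind, txn_id):
        """Return None if the message fits the tracked state, else a description."""
        if kind == "request":
            self.pending.add(txn_id)
            return None
        if kind == "reply":
            if txn_id in self.pending:
                self.pending.remove(txn_id)
                return None
            return "reply without matching request"
        return "unknown message kind"

tracker = RequestReplyTracker()
for kind, txn_id in [("request", 1), ("reply", 1), ("reply", 2)]:
    problem = tracker.observe(kind, txn_id)
    if problem:
        print(f"{kind} {txn_id}: {problem}")
```

    Keeping the set of pending requests is what makes the check stateful, and it is also the source of the resource cost listed for SPA.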

  • Pros and cons of Intrusion Detection Methods

    Table 1: Pros and cons of intrusion detection methodologies. Source [2]

    Signature-based Detection (SD)
      Pros
        Simple and effective method for detecting attacks
        Detailed contextual analysis
      Cons
        Ineffective against unknown attacks and variants of known attacks
        Little understanding of states and protocols
        Hard to keep signatures/patterns up to date
        Time-consuming to maintain the knowledge

    Anomaly-based Detection (AD)
      Pros
        Effective at detecting new and unforeseen vulnerabilities
        Less dependent on the OS
        Facilitates detection of privilege abuse
      Cons
        Weak profile accuracy, since observed events keep changing
        Unavailable while behavior profiles are being rebuilt
        Difficult to trigger alerts at the right time

    Stateful Protocol Analysis (SPA)
      Pros
        Knows and traces protocol states
        Distinguishes unexpected sequences of commands
      Cons
        Protocol state tracing and examination consume resources
        Unable to inspect attacks that look like benign protocol behavior
        Might be incompatible with dedicated OSs or APs

  • A MACHINE LEARNING BASED IDS

  • Machine learning in anomaly detection

    Anomaly-based Detection (AD)
      Easy when what is normal in the data can be characterized with a simple mathematical model, e.g. a normal distribution
      Most interesting real-world systems have complex behavior that does not follow such a distribution
      Machine learning is useful for learning the characteristics of the system from observed data

    Feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Feature selection techniques are used for three reasons:
      Simplification of models to make them easier to interpret
      Shorter training times
      Enhanced generalization by reducing overfitting

    Outlier detection: an outlier is an observation point that is distant from other observations

  • Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection

    Couples a feature selection algorithm with an outlier detection method

    Uses robust statistics tools in both procedures
      Reliable results even in the presence of outliers
      Feature selection based on a robust mutual information estimator
        MI (Mutual Information): an information-theoretic metric that captures both linear and non-linear dependencies
      Outlier detection based on robust PCA (Principal Component Analysis)
        PCA: a mathematical procedure used to reduce the dimensionality of a problem
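
    Why MI is attractive for feature selection can be shown with a small sketch: a feature that depends on the target quadratically has almost zero linear correlation but clearly positive MI. This uses a plain histogram estimate of MI, which is an assumption made for illustration; it is not the robust estimator of [1], and the bin count and the data are made up.

```python
import math
from collections import Counter

def to_bins(values, bins):
    """Map each value to the index of an equal-width bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant data
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(xs, ys, bins=5):
    """Histogram estimate of MI in nats (plain, non-robust)."""
    bx, by = to_bins(xs, bins), to_bins(ys, bins)
    n = len(xs)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )

def pearson(xs, ys):
    """Pearson correlation, which only measures linear dependence."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

xs = [-1 + 2 * i / 1000 for i in range(1001)]  # symmetric grid on [-1, 1]
ys = [x * x for x in xs]                        # purely non-linear dependence

print("correlation:", pearson(xs, ys))
print("mutual information:", mutual_information(xs, ys))
```

    A selection rule based only on correlation would discard this feature, while an MI-based rule would keep it.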

  • Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection

    Feature selection
      Important preprocessing step (filter)
      Reduces the dimensionality of high-dimensional data
      Removes irrelevant data
      Increases learning accuracy
      Gives significant performance gains

  • Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection

    Robust statistics
      Reliable results even in the presence of outliers
      Example: in a normal distribution, the inner 95% of values lie within center ± 1.96 × spread
        Center: instead of the mean, take the median
        Spread: instead of the standard deviation, take the MAD (median absolute deviation)

    Source [1]
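
    The effect of swapping mean and standard deviation for median and MAD can be seen in a minimal sketch. The sample values are made up, and the scale factor 1.4826 (which makes the MAD comparable to the standard deviation for normally distributed data) is an addition that does not appear on the slide.

```python
import statistics

MAD_TO_SD = 1.4826  # assumed consistency factor for normally distributed data

def classical_outliers(values, k=1.96):
    """Flag values outside mean ± k * standard deviation."""
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    return [v for v in values if abs(v - center) > k * spread]

def robust_outliers(values, k=1.96):
    """Flag values outside median ± k * scaled MAD."""
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    spread = MAD_TO_SD * mad
    return [v for v in values if abs(v - center) > k * spread]

# Seven regular observations and three gross outliers (made-up numbers).
values = [10, 11, 9, 10, 12, 10, 11, 100, 110, 120]

print("classical:", classical_outliers(values))
print("robust:   ", robust_outliers(values))
```

    In this sample the three large values inflate the mean and the standard deviation so much that the classical rule flags nothing, whereas the median and the MAD are barely affected and the robust rule flags all three.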

  • Dataset creation for training and testing (1/2)

    Dataset collected by mirroring the traffic passing through the switch of a private laboratory network with 17 interconnected PCs
      10 for users producing licit traffic
      1 for the server, 1 for measurements
      5 for attacks

    Licit traffic
      File sharing (BitTorrent)
      Video streaming (IPTV over TCP)
      Web browsing (HTTP)

    Attacks
      Botnets
        Port-scans: identify other targets vulnerable to infection
        Snapshots: a type of identity theft for stealing personal information
        Other botnet attacks are not used, e.g. spyware, malware, denial of service, and email spam
          They happen only at host level
          They can be detected by e.g. anti-virus software, monitoring at routers/firewalls, and email scanning

  • Dataset creation for training and testing (2/2)

    Customer usage profiles
      (a) Soft browsing (HTTP only)
      (b) File sharing machine (BitTorrent only)
      (c) File sharing user (BitTorrent and HTTP)
      (d) Heavy user (HTTP, BitTorrent, and Streaming)

    Network scenarios
      (B) Business user: 100% (a)
      (R) Residential user: 30% (b), 40% (c), 30% (d)

    Attack intensities
      (1) 6% (5% snapshot, 1% port-scan)
      (2) 20% (15% snapshot, 5% port-scan)
      (3) 35% (30% snapshot, 5% port-scan)

    Table 2. Source [1]

  • Results (1/3)

    6 types of anomaly detectors, named A-B
      A: feature selection method, B: outlier detection method
      Methods: R (robust), NR (non-robust), or none (no method)

    Performance measures
      Nr Ftrs: number of selected features
      Recall: probability that an observation is classified as an anomaly when it is in fact an anomaly
      False positive rate (FPR): probability that an observation is classified as an anomaly when it is in fact a regular observation
      Precision: probability that an observation is anomalous given that it is classified as an anomaly

    Table 3. Source [1]
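
    The three rate definitions above follow directly from the counts of true/false positives and negatives. The sketch below computes them from two label lists; the function name and the example labels are illustrative assumptions, not data from [1].

```python
def detection_metrics(y_true, y_pred):
    """Return (recall, fpr, precision); 1 marks an anomaly, 0 a regular observation."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # regular, not flagged
    recall = tp / (tp + fn) if tp + fn else float("nan")
    fpr = fp / (fp + tn) if fp + tn else float("nan")
    precision = tp / (tp + fp) if tp + fp else float("nan")
    return recall, fpr, precision

# Made-up example: 3 anomalies and 5 regular observations.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]

recall, fpr, precision = detection_metrics(y_true, y_pred)
print(f"recall={recall:.3f} fpr={fpr:.3f} precision={precision:.3f}")
```

    In this example one of three anomalies is detected and one of five regular observations is flagged, so recall is 1/3, FPR is 1/5, and precision is 1/2.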

  • Results (2/3)

    The R-R detector achieved the best results
      Recall is always 1
      Performance is maximal for B1, B2, B3, and R3
      FPR and precision are close to their optimal values

    Improvement over the non-robust version is high
      Low recall means that a large percentage of anomalies are not correctly identified
      For B2, B3, and R3, recall improved from 0.167, 0.273, and 0.125 to 1

    Feature selection
      Feature selection reduces Nr Ftrs and improves performance
      B3 and R3: no feature selection is sometimes better than non-robust feature selection

    Table 3. Source [1]

  • Results (3/3)

    Compare R-NR (top) and R-R (bottom)

    Any point with a score or distance larger than a threshold (the lines) is considered an anomaly

    In the R-NR case there is confusion around snapshots, hence the poor recall value of 0.125
      The proximity in behavior between snapshots and some HTTP and BitTorrent traffic fools the non-robust outlier detector
      All consist of small file uploads

    Fig. 2. Source [1]

  • Discussion

    There are advantages to using a feature selection step and to using robust statistics for both feature selection and outlier detection
      The system achieves very high performance
      The system's anomaly detector adapts to different traffic conditions (licit traffic differs significantly between the two scenarios)

    However, the dataset was obtained from a private lab with 17 PCs and is not necessarily representative of a real-world scenario
      The effectiveness of the system still needs to be shown on a larger-scale network traffic dataset

  • CHALLENGES OF USING MACHINE LEARNING IN INTRUSION DETECTION

  • Outliers, cost of error, semantics, and evaluation

    Outlier detection
      Hard to define "normal" in network traffic, as usage varies in every session and with new applications (diversity of network traffic)

    High cost of errors
      The cost of misclassification is extremely high
      False positive: expensive analyst time
      False negative: can cause serious damage to an organization
      Errors in other applications of ML are less expensive, e.g. product recommendations, OCR, spam detection

    Semantic gap
      Currently, the assessment covers only the capability to identify deviations from the normal profile (which could be good or bad)
      Results need to be interpreted from the operator's point of view: what do they mean?

    Difficulties with evaluation
      Designing sound evaluation schemes can be more difficult than building the detector itself
      Lack of public data sets for assessing anomaly detection
        Real data sets are hard to obtain for many reasons, e.g. the risk of leaking personal data
        Simulated data is not accurate
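
    The analyst-time cost of false positives becomes concrete with a little arithmetic. The sketch below uses entirely hypothetical numbers (event volume, attack rate, recall, FPR) to show how a seemingly small FPR can still swamp the true alerts when attacks are rare.

```python
def expected_alerts(n_events, attack_rate, recall, fpr):
    """Return (true alerts, false alerts, precision) for the given rates."""
    attacks = n_events * attack_rate
    benign = n_events - attacks
    true_alerts = attacks * recall
    false_alerts = benign * fpr
    precision = true_alerts / (true_alerts + false_alerts)
    return true_alerts, false_alerts, precision

# Hypothetical day of traffic: 1,000,000 events, 0.01% of them attacks,
# a detector with perfect recall and a 1% false positive rate.
true_alerts, false_alerts, precision = expected_alerts(1_000_000, 0.0001, 1.0, 0.01)
print(f"true alerts:  {true_alerts:.0f}")
print(f"false alerts: {false_alerts:.0f}")
print(f"precision:    {precision:.4f}")
```

    Under these assumed numbers an analyst would face roughly one hundred false alerts for every true one, even though the detector misses nothing.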

  • SUMMARY

  • Summary

    Introduction
      The need for automated Intrusion Detection Systems
      Definition of intrusion and intrusion detection

    Intrusion Detection Methodologies
      Signature-based Detection (SD)
      Anomaly-based Detection (AD)
      Stateful Protocol Analysis (SPA)

    Machine Learning Based IDS
      Using feature selection and robust statistics
      Dataset creation
      Results and evaluation
      Discussion

    Challenges of Using Machine Learning in ID
      Outlier detection, high cost of error, semantic gap, and difficulties with evaluation

  • OMAR SHAYA omar.shaya@stud.uni-goettingen.de

    Thanks!

  • References

    [1] C. Pascoal, M. Oliveira, R. Valadas, P. Filzmoser, P. Salvador and A. Pacheco. Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection. In Proceedings IEEE INFOCOM, pages 1755-1763, 2012.

    [2] H. Liao, C. Lin, Y. Lin and K. Tung. Intrusion Detection System: A Comprehensive Review. In Journal of Network and Computer Applications, pages 16-24, 2013.

    [3] R. Sommer and V. Paxson. Outside the Closed World: On Using Machine Learning For Network Intrusion Detection. In IEEE Symposium on Security and Privacy, pages 305-316, 2010.

    [4] Feature Selection. https://en.wikipedia.org/wiki/Feature_selection, accessed on 6 August 2015.

    [5] Outlier. https://en.wikipedia.org/wiki/Outlier, accessed on 6 August 2015.

    [6] Anomaly Detection Using Machine Learning to Detect Abnormalities in Time Series Data. http://blogs.technet.com/b/machinelearning/archive/2014/11/05/anomaly-detection-using-machine-learning-to-detect-abnormalities-in-time-series-data.aspx, accessed on 6 August 2015.

  • Precision and Recall

    APPENDIX

    Source: Dr. Stephan Sigg's slides from the Machine Learning and Pervasive Computing course, SoSe 2015
