Data Mining for Network Intrusion Detection

Paul Dokas, Levent Ertoz, Vipin Kumar, Aleksandar Lazarevic, Jaideep Srivastava, Pang-Ning Tan
Computer Science Department, University of Minnesota
Presented By: [email protected]
CS685 Presentation


Page 1: Data Mining for Network Intrusion Detection

Data Mining for Network Intrusion Detection

Paul Dokas, Levent Ertoz, Vipin Kumar, Aleksandar Lazarevic, Jaideep Srivastava, Pang-Ning Tan

Computer Science Department, University of Minnesota

Presented By: [email protected]

CS685 Presentation

Page 2: Data Mining for Network Intrusion Detection

Outline

• Motivation

• Related Work

• Detection Models and Approaches

• Experimental Evaluation

• Conclusion

Page 3: Data Mining for Network Intrusion Detection

Motivation

• Organizations are becoming increasingly vulnerable to potential cyber threats, e.g., network intrusions.

[Figure: Incidents reported to the Computer Emergency Response Team/Coordination Center (CERT/CC) per year, 1990 to 2001. Vertical axis: number of incidents, 0 to 60,000.]

Page 4: Data Mining for Network Intrusion Detection

Motivation (cont.)

• Intrusion Detection System (IDS)
  • collect signatures of known attacks
  • enter the attack signatures into the IDS signature database
  • extract features from various audit streams
  • compare these features with the attack signatures
  • raise an alarm when a possible intrusion occurs

• Limitations of traditional signature-based methods
  • the signature database has to be updated manually
  • emerging cyber threats cannot be detected
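The signature-based workflow above can be sketched in a few lines. This is a minimal illustration, not the design of any particular IDS: the `Signature` class, the feature names (`dst_port`, `flags`, `payload_has_shell`) and the two database entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """A hypothetical attack signature: a name plus required feature values."""
    name: str
    pattern: dict


def extract_features(record: dict) -> dict:
    """Pull the fields used for matching out of a raw audit record."""
    return {
        "dst_port": record.get("dst_port"),
        "flags": record.get("flags"),
        "payload_has_shell": "/bin/sh" in record.get("payload", ""),
    }


def match_signatures(features: dict, database: list) -> list:
    """Return the names of all signatures whose pattern the features satisfy."""
    return [
        sig.name
        for sig in database
        if all(features.get(key) == value for key, value in sig.pattern.items())
    ]


# Toy signature database (illustrative entries only).
SIGNATURE_DB = [
    Signature("shell-in-payload", {"payload_has_shell": True}),
    Signature("syn-to-telnet", {"dst_port": 23, "flags": "S"}),
]

record = {"dst_port": 23, "flags": "S", "payload": "login"}
alarms = match_signatures(extract_features(record), SIGNATURE_DB)
if alarms:
    print("ALARM:", alarms)
```

A record that matches no entry in `SIGNATURE_DB` raises no alarm, which is exactly the limitation listed above: an attack is invisible until someone adds its signature by hand.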

Page 5: Data Mining for Network Intrusion Detection

Motivation (cont.)

Why data mining?

• large volumes of network data

• different data mining techniques: clustering, classification

Page 6: Data Mining for Network Intrusion Detection

Related Work

Data mining based intrusion detection techniques

• anomaly detection
  • Build models of normal data
  • Detect any deviation from normal data
  • Flag deviation as suspect
  • Identify new types of intrusions as deviation from normal behavior

• misuse detection
  • Label all instances in the data set ("normal" or "intrusion")
  • Run learning algorithms over the labeled data to generate classification rules
  • Automatically retrain intrusion detection models on different input data
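The anomaly detection steps listed above (model normal data, measure deviation, flag it) can be illustrated with the simplest possible model. The sketch assumes a single numeric feature and a mean/standard-deviation model with a 3-sigma threshold; these choices and the sample values are assumptions made for illustration, not the models used in the presented work.

```python
from statistics import mean, stdev


def fit_normal_model(normal_values):
    """Summarise attack-free training values by their mean and standard deviation."""
    return mean(normal_values), stdev(normal_values)


def is_suspect(value, model, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = model
    return abs(value - mu) > threshold * sigma


# Hypothetical feature: bytes sent per connection during normal operation.
normal_bytes = [480, 510, 495, 505, 520, 490, 500, 515]
model = fit_normal_model(normal_bytes)

print(is_suspect(502, model))   # close to the normal profile
print(is_suspect(9000, model))  # far from the normal profile
```

No intrusion labels are needed to build the model, which is why this family of methods can flag attack types that have never been seen before. Misuse detection, by contrast, needs the labeled "normal"/"intrusion" instances described above.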

Page 7: Data Mining for Network Intrusion Detection

Related Work --- misuse detection

• Classification Model
  • Bayesian classifier
  • Decision tree
  • Association rule
  • Support vector machine
  • Learning from rare class

Page 8: Data Mining for Network Intrusion Detection

Related Work --- anomaly detection

• Anomaly Detection Model
  • Association rule
  • Neural network
  • Unsupervised SVM
  • Outlier detection

Page 9: Data Mining for Network Intrusion Detection

Detection Models

• misuse detection: rare class prediction model
  • targets known intrusions and their variations

• anomaly detection: outlier detection model
  • targets novel attacks whose nature is unknown

Page 10: Data Mining for Network Intrusion Detection

Learning from Rare Class

• Problem: how to build a classification model for a data set with a skewed class distribution?
  • intrusion class << normal class
  • mining a needle in a haystack
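A short calculation shows why a skewed class distribution is a problem. The sketch below assumes a hypothetical data set with 990 normal connections and 10 intrusions, and scores a trivial classifier that labels everything "normal".

```python
def evaluate(y_true, y_pred, positive="intrusion"):
    """Return accuracy, plus recall and precision for the positive (rare) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, recall, precision


# Hypothetical data set: 990 normal connections and 10 intrusions.
y_true = ["normal"] * 990 + ["intrusion"] * 10
always_normal = ["normal"] * 1000

accuracy, recall, precision = evaluate(y_true, always_normal)
print(accuracy, recall, precision)  # 0.99 0.0 0.0
```

The classifier is 99% accurate and detects no intrusions at all, so overall accuracy says little about the rare class. Recall and precision on the intrusion class are the more informative measures here.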

Page 11: Data Mining for Network Intrusion Detection

Learning from Rare Class (cont.)

• Novel classification algorithms
  • PN-rule
    • P-rules: cover most of the intrusive examples
    • N-rules: eliminate false alarms
  • SMOTEBoost
    • SMOTE (Synthetic Minority Over-sampling TEchnique)
    • Boosting
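The over-sampling half of SMOTEBoost can be sketched as follows. SMOTE creates a synthetic minority example by picking a minority sample, picking one of its nearest minority-class neighbours, and interpolating at a random position between the two. The function name, parameters and toy data are assumptions; the boosting rounds that SMOTEBoost wraps around this step are not shown.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating towards near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority-class neighbours of `base`, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D feature vectors for the rare (intrusion) class.
intrusions = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new_points = smote(intrusions, n_synthetic=6)
print(len(new_points))
```

Because every synthetic point lies on a segment between two real minority samples, the rare class is enlarged without simply duplicating existing records. The sketch needs at least two minority samples.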

Page 12: Data Mining for Network Intrusion Detection

Anomaly Detection

• Novel attacks/intrusions: deviation from normal behavior

• Outlier detection algorithms
  • Nearest neighbor approach
  • Distance based approach
  • Density based approach
  • Unsupervised support vector machines
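As an illustration of the nearest neighbor idea, a point can be scored by its distance to the k-th nearest neighbour among known normal points: the larger the distance, the more the point looks like an outlier. This is a generic sketch with made-up data and a hypothetical function name, not the exact scoring used in the experiments.

```python
import math


def knn_outlier_score(point, reference, k=2):
    """Distance from `point` to its k-th nearest neighbour in `reference`."""
    distances = sorted(math.dist(point, other) for other in reference)
    return distances[k - 1]


# Hypothetical 2-D feature vectors taken from normal traffic.
normal = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]

print(knn_outlier_score((0.05, 0.0), normal))  # small: inside the normal cluster
print(knn_outlier_score((3.0, 4.0), normal))   # large: far from all normal points
```

Ranking connections by this score and inspecting the top of the list is one simple way to turn the score into alarms.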

Page 13: Data Mining for Network Intrusion Detection

Anomaly Detection

• Density based approach (LOF)
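The Local Outlier Factor (LOF) compares the density around a point with the density around its neighbours; a value near 1 means the point is about as dense as its neighbourhood, and a value well above 1 marks an outlier. The sketch below follows the usual LOF definition (k-distance, reachability distance, local reachability density) under two simplifying assumptions: each point uses exactly k neighbours even when distances tie, and all points are distinct.

```python
import math


def k_nearest(points, i, k):
    """Indices of the k nearest neighbours of points[i] (ties broken by index)."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def lof_scores(points, k=2):
    """Local Outlier Factor of every point, using exactly k neighbours each."""
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance from a point to its k-th nearest neighbour.
    k_dist = [math.dist(points[i], points[neighbours[i][-1]]) for i in range(n)]

    def reach_dist(i, j):
        """Reachability distance of point i with respect to point j."""
        return max(k_dist[j], math.dist(points[i], points[j]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [
        k / sum(reach_dist(i, j) for j in neighbours[i])
        for i in range(n)
    ]
    # LOF: average ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (k * lrd[i])
        for i in range(n)
    ]


# Four points forming a tight square plus one distant point (toy data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(points)
print([round(score, 2) for score in scores])
```

On this toy data the four square corners score 1 and the distant point scores about 6, so it is ranked first as an outlier.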

Page 14: Data Mining for Network Intrusion Detection

Anomaly Detection

• Identify normal behavior

• Construct a useful set of features

• Define a similarity function

• Flag deviations as suspect

Page 15: Data Mining for Network Intrusion Detection

Experimental Evaluation

• Public data sets
  • DARPA 1998 Intrusion Detection Evaluation Data Set
    • prepared and managed by MIT Lincoln Lab
    • training data and test data
  • KDD Cup 1999 Data
    • the extension of DARPA'98
    • training data and test data

• Real network data
  • Network data from the University of Minnesota

Page 16: Data Mining for Network Intrusion Detection

Experimental Evaluation --- feature construction

Purpose: build a more informative data set from the public data set

Method:

• connection records
• label each connection record 'normal' or 'intrusion'
• features for each connection record
  • # of {packets, bytes}, {ACK, Re-Tx} packets, SYN/FIN, …
  • time-based features (DoS attacks)
  • connection-based features (PROBING attacks)
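To make the two derived feature types concrete, here is a hedged sketch. It assumes that a time-based feature counts related connections within the last few seconds and that a connection-based feature counts them within the last N connections; the `Connection` fields, function names, window sizes and addresses are hypothetical, and the list is assumed to be sorted by start time.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    start_time: float  # seconds since the start of the capture
    src_ip: str
    dst_ip: str


def count_same_dst_in_window(connections, index, window_seconds=5.0):
    """Time-based feature: earlier connections to the same destination
    that started within the last `window_seconds`."""
    current = connections[index]
    return sum(
        1
        for earlier in connections[:index]
        if earlier.dst_ip == current.dst_ip
        and current.start_time - earlier.start_time <= window_seconds
    )


def count_same_src_in_last_n(connections, index, last_n=100):
    """Connection-based feature: how many of the previous `last_n`
    connections came from the same source."""
    current = connections[index]
    previous = connections[max(0, index - last_n):index]
    return sum(1 for earlier in previous if earlier.src_ip == current.src_ip)


log = [
    Connection(0.0, "10.0.0.1", "10.0.0.9"),
    Connection(1.0, "10.0.0.2", "10.0.0.9"),
    Connection(2.0, "10.0.0.3", "10.0.0.9"),
    Connection(60.0, "10.0.0.1", "10.0.0.7"),
    Connection(300.0, "10.0.0.1", "10.0.0.8"),
]

print(count_same_dst_in_window(log, 2))   # burst towards one host
print(count_same_src_in_last_n(log, 4))   # slow activity from one source
```

A burst of connections to one destination shows up in the time window, which suits DoS-style attacks. A slow probe spreads its connections over minutes, so it falls outside any short time window but still stands out among the last N connections.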

Page 17: Data Mining for Network Intrusion Detection

Experimental Evaluation --- single connection attacks

[Figure: ROC curves for different outlier detection techniques on single connection attacks. Horizontal axis: False Alarm Rate. Vertical axis: Detection Rate. Curves: LOF approach, NN approach, Mahalanobis approach, Unsupervised SVM.]
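An ROC curve of this kind is obtained by sweeping a threshold over the outlier scores and recording, for each threshold, the false alarm rate (normal connections flagged) and the detection rate (intrusions flagged). The sketch below uses invented scores and labels purely to show the computation.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per score threshold.

    `labels` holds True for intrusions and False for normal connections;
    a connection is flagged when its outlier score is >= the threshold.
    """
    n_attacks = sum(labels)
    n_normal = len(labels) - n_attacks
    points = []
    for threshold in sorted(set(scores), reverse=True):
        flagged = [score >= threshold for score in scores]
        detected = sum(1 for f, attack in zip(flagged, labels) if f and attack)
        false_alarms = sum(1 for f, attack in zip(flagged, labels) if f and not attack)
        points.append((false_alarms / n_normal, detected / n_attacks))
    return points


# Hypothetical outlier scores and ground-truth labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [True, True, False, True, False, False, False, False]

for false_alarm_rate, detection_rate in roc_points(scores, labels):
    print(f"{false_alarm_rate:.2f}  {detection_rate:.2f}")
```

A technique whose curve rises faster detects more intrusions for the same false alarm budget, which is how the four approaches in the figure are compared.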

Page 18: Data Mining for Network Intrusion Detection

Experimental Evaluation --- bursty attacks

[Figure: ROC curves for different outlier detection techniques on bursty attacks. Horizontal axis: False Alarm Rate. Vertical axis: Detection Rate. Curves: Unsupervised SVM, LOF approach, Mahalanobis approach, NN approach.]

Page 19: Data Mining for Network Intrusion Detection

Experimental Evaluation --- real network data

• Why? Limitations of the DARPA'98 data set

• How? Detect network intrusions in live network traffic

• Result? Successfully identified some novel intrusions (top ranked outliers)

Page 20: Data Mining for Network Intrusion Detection

Conclusion

• promising intrusion detection models

• performance of algorithm (on-line detection)

• new classification and anomaly detection algorithms

Page 21: Data Mining for Network Intrusion Detection

Thanks!

Questions?