
Page 1: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Angelos Stavrou, Department of Computer Science, George Mason University

Joint work with Gabriela Cretu, Michael E. Locasto, Salvatore J. Stolfo, Angelos D. Keromytis

Page 2: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Anomaly Detection (AD) Systems

Supervised: dependent on labeled data, which cannot be prepared for large data sets, e.g., network packets

Semi-supervised: uses a third-party sensor to label some data as known-bad; dependent on clean data for training

Unsupervised: can clean the data by determining the outliers in the training data; no good definition of an anomaly other than low-probability data


Page 3: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Motivation

Detection of zero-day attacks (using only the AD system)

The detection accuracy of all learning-based anomaly detectors depends heavily on the quality of the training data

Training data is often poor, severely degrading the AD's reliability as a detection and forensic-analysis tool


Page 4: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Rest of the Talk

Intuition

Local Training Sanitization

Distributed Cross-Sanitization

Future work

Conclusions


Page 5: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Intuition

Patterns of actions are reflected in traces:

Regular – what we expect based on previous observations

Abnormal – unlikely data requiring further investigation

An attack can pass as normal traffic if it is part of the training set

Sanitize the training data by using a large set of micro-models, where attacks and non-regular data cause only a localized or limited "pollution" of the training data


Page 6: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Training Dataset Sanitization

Attacks and accidental malformed requests/data cause a local "pollution" of the training data

An attack can pass as normal traffic if it is part of the training set

We seek to remove both malicious and abnormal data from the training dataset

Related ML algorithms: ensemble methods [Dietterich00], MetaCost [Domingos99], meta-learning [Stolfo00]


Page 7: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Training Strategies – Uniform Time

Divide data into multiple blocks: micro-datasets with the same time granularity

[Figure: a long trace divided into consecutive micro-datasets of equal time granularity]

Page 8: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Training Strategies – Multiple Models

Divide data into multiple blocks

Build micro-models for each block

[Figure: data blocks M1 … MK, each producing a micro-model µM1 … µMK]

Attacks and non-regular data cause localized "pollution"
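A minimal sketch of these two steps (hypothetical helper names; the slides do not prescribe an implementation), in Python:

    from collections import defaultdict

    def build_micro_models(packets, granularity, ad_train):
        """Split a packet trace into time-delimited micro-datasets and
        train one micro-model per block.

        packets     -- iterable of (timestamp, payload) pairs
        granularity -- block length g in seconds (e.g. 3 hours)
        ad_train    -- any anomaly-detection training routine AD(md)
        """
        micro_datasets = defaultdict(list)
        for ts, payload in packets:
            micro_datasets[int(ts // granularity)].append(payload)
        # One micro-model per micro-dataset: uM_i = AD(md_i)
        return [ad_train(md) for _, md in sorted(micro_datasets.items())]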

Page 9: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Training Strategies – Voting Models

Divide data into multiple blocks

Build micro-models for each block

Test all models against a smaller dataset

[Figure: micro-models µM1 … µMK feeding a voting algorithm]

Simple voting: all micro-models are weighted equally

Weighted voting: wi = number of packets used for training µMi

Page 10: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Training Strategies - Sanitization

Divide data into multiple blocks

Build micro-models for each block

Test all models against a smaller dataset

Build sanitized and abnormal models

[Figure: training phase – micro-models µM1 … µMK vote on a second dataset; packets voted normal build the sanitized model, the rest build the abnormal model]

Sanitized model: built from packets the micro-models vote normal

Abnormal model: built from packets voted abnormal

V = voting threshold
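A sketch of the sanitization split, reusing label_packet from the previous sketch: the second dataset is partitioned by the voting threshold V, and each partition trains its own model:

    def sanitize(second_dataset, micro_models, weights, V, ad_train):
        """Partition a held-out dataset by voting threshold V and train
        the sanitized and abnormal models on the two partitions."""
        sanitized, abnormal = [], []
        for packet in second_dataset:
            if label_packet(packet, micro_models, weights) <= V:
                sanitized.append(packet)   # voted normal by the ensemble
            else:
                abnormal.append(packet)    # flagged by enough micro-models
        return ad_train(sanitized), ad_train(abnormal)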

Page 11: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Shadow Sensor Redirection

Shadow sensor:

Heavily instrumented host-based anomaly detector, akin to an "oracle"

Performs substantially slower than the native application

Use the shadow sensor to classify or corroborate the alerts produced by the AD sensors

[Figure: testing phase – packets flagged by the sanitized model are redirected to the shadow server (host-based IDS); corroborated alerts are kept, the rest are dismissed as false positives]

Feasibility and scalability depend on the number of alerts generated by the AD sensor

Page 12: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Overall Architecture

For each host, use a large set of training data:

Divide data into multiple blocks

Build micro-models for each block

Test all models against a smaller dataset

Sanitize data based on the previous step and build the sanitized model

Build an abnormal model as well

[Figure: overall architecture – training phase (micro-models µM1 … µMK, voting algorithm, sanitized and abnormal models) feeding a testing phase in which alerts from the sanitized model are corroborated by the shadow server / host-based IDS]

Page 13: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Micro-models

Partition a large training dataset into a number of smaller, time-delimited training sets => micro-datasets:

T = {md1, md2, ..., mdN}, where each mdi has a time granularity g

Mi = AD(mdi), where AD can be any chosen anomaly detection algorithm, T is the training dataset, and M denotes the normal model produced by AD

Attacks and non-regular data cause a localized or limited "pollution" of the training data


Page 14: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Voting algorithms

Using a second dataset and testing it against each micro-model µMi:

Lj,i = 0 if µMi deems the packet Pj as normal

Lj,i = 1 otherwise

The generalized label for packet Pj:

L(Pj) = (Σi wi · Lj,i) / (Σi wi), where wi is the weight assigned to µMi

Simple voting: all weights wi are equal

Weighted voting: wi = proportion of all packets used for training µMi


Page 15: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Sanitized and Abnormal Models

Sanitized model: Msan = AD({Pj | L(Pj) <= V})

Abnormal model: Mabn = AD({Pj | L(Pj) > V})

V = voting threshold

Page 16: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Evaluation

Proof of concept using two content-based anomaly detectors:

Anagram: semi-supervised learning (when using Snort), supervised learning (without Snort); analyzes n-grams

Payl: unsupervised learning; analyzes byte (1-gram) frequency distributions


Page 17: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Evaluation dataset

300/100/100 hours of real network traffic

Page 18: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Voting Techniques Comparison

[Figure: performance of the Anagram sensor after sanitization for www1. a) Simple voting, b) Weighted voting]

Page 19: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Datasets Comparison

Performance for www and lists at 3-hour granularity when using Anagram

Page 20: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

AD sensors comparison

Sensor                      FP (%)   FA       TP (%)   TA
Anagram                     0.07     544      0        0
Anagram with Snort          0.04     312      20.20    20
Anagram with sanitization   0.10     776      100      99
Payl                        0.84     6,558    0        0
Payl with sanitization      6.64     70,392   76.76    76

(FA/TA: the corresponding numbers of false/true alerts)

Page 21: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Signal-to-noise ratio comparison

Sensor                      www1    www      lists
Anagram                     0       0        0
Anagram with Snort          505     59.10    370.2
Anagram with sanitization   1000    294.11   1000
Payl                        0       6.64     1.00
Payl with sanitization      11.56   5.84     36.05

Signal-to-noise ratio TP/FP: higher values mean better results

Page 22: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Granularity Impact

Granularity impact on the performance of the system when using Anagram and Payl

Page 23: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Training Dataset Size Impact

Impact of the size of the training dataset for www1

Page 24: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

AD’s Internal Threshold Impact

Impact of the anomaly detector's internal threshold for www1 when using Anagram

Page 25: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Analysis of g and V

[Figure: performance of the Anagram sensor after sanitization. a) Simple voting, b) Weighted voting]

Page 26: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Shadow Sensor Performance Evaluation

Overall computational requirements of an AD sensor and a host-based sensor (e.g., STEM and DYBOC)

l is the standard latency of a protected service

Os is the shadow server overhead

FP is the false positive rate


Sensor                      STEM       DYBOC
N/A (shadow server alone)   44*l       1.2*l
Anagram                     1.031*l    1.0001*l
Anagram with Snort          1.0172*l   1.0000*l
Anagram with sanitization   1.0430*l   1.0002*l
Payl                        1.3612*l   1.0016*l
Payl with sanitization      3.8552*l   1.0132*l
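The tabulated multipliers are consistent with a simple expected-latency model (a reconstruction from the listed FP rates and overheads; the slide does not state it explicitly): a (1 - FP) fraction of requests runs at the native latency l, while the FP fraction is redirected to the shadow server at cost Os * l:

    % Expected per-request latency with shadow-sensor redirection
    l_{expected} = (1 - FP)\,l + FP \cdot O_s\,l = l\,\bigl(1 + FP\,(O_s - 1)\bigr)

For example, Anagram under STEM (FP = 0.07%, Os = 44) gives l * (1 + 0.0007 * 43) ≈ 1.031*l, matching the table entry above.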

Page 27: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Caveat Emptor & Limitations

A long-lasting attack present in the dataset used for computing the micro-models can poison all the micro-models

Page 28: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

AD Distributed Cross-Sanitization

Use external knowledge (models) to generate a better local normal model

Abnormal models are exchanged across collaborating sites [Stolfo00]

Re-evaluate the locally computed sanitized models

Apply model differencing: remove remote abnormal data from the local normal model

Page 29: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Cross-sanitization

Direct model differencing: analytic method, difference of the models

Indirect model differencing: no analytic method, use testing

[Figure: a local sanitized model combined with a remote abnormal model via direct or indirect differencing]
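A sketch of the two approaches, assuming (as with Anagram) that a model can be represented as a set of n-grams; the function names are hypothetical. Direct differencing is a plain set subtraction, while indirect differencing re-tests the local training data against every remote abnormal model and retrains, which explains its much higher cost in the timings on the next slide:

    def direct_cross_sanitize(local_sanitized: set, remote_abnormal: set) -> set:
        """Direct differencing: analytically drop every n-gram that a
        collaborating site's abnormal model also contains."""
        return local_sanitized - remote_abnormal

    def indirect_cross_sanitize(training_packets, remote_models, ad_train):
        """Indirect differencing: no analytic difference exists, so re-test
        the local training data against the remote abnormal models and
        retrain on whatever none of them flags."""
        kept = [p for p in training_packets
                if not any(m.is_abnormal(p) for m in remote_models)]
        return ad_train(kept)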

Page 30: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Cross-sanitization: Evaluation

Model               www1             www              lists
                    FP (%)  TP (%)   FP (%)  TP (%)   FP (%)  TP (%)
Mpois               0.10    44.94    0.27    51.78    0.25    47.53
Mcross (direct)     0.24    100      0.71    100      0.48    100
Mcross (indirect)   0.10    100      0.26    100      0.10    100

Indirect model differencing is more expensive than direct model differencing:

Method     www1         www          lists
direct     13.989s      26.359s      16.849s
indirect   1966.689s    1732.329s    685.819s

Page 31: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Future work

Adversarial scenarios: new techniques to resist training attacks

Distributed sanitization: a distributed architecture to share models and remove training attacks

Model updates: updating AD models to accommodate concept drift

Page 32: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Conclusions

A novel sanitization method that boosts the performance of out-of-the-box anomaly detectors

A simple and general method, without significant additional computational cost

An efficient and accurate online packet classifier, both in real time and in post-processing forensic analysis

Page 33: Casting out Demons: Sanitizing Training Data for Anomaly Sensors

Thank you!

Questions?