A Deep Learning Based Anomaly Detection Approach for Intelligent Autonomous Systems · 2018. 3. 23.



Miguel Villarreal-Vasquez and Bharat Bhargava
Department of Computer Science, Purdue University, West Lafayette, IN, USA

Acknowledgement: This research is supported by NGC Research Consortium. We collaborated with Paul Conoval, Donald Steiner and Jason Kobes.

MOTIVATION

• Advanced modern exploits are characterized by their sophistication in stealthy attacks.
• Code-reuse attacks, such as return-oriented programming, and memory disclosure attacks allow attackers to execute malicious instruction sequences on victim systems without injecting external code.
• This research proposes a new Deep Learning based anomaly detection technique that probabilistically models program control flows for behavioral reasoning and live monitoring.
• We aim to answer a binary classification problem: given a sequence of events e1 e2 e3 … ek, should the sequence occur or not?
• An event ei in the sequence is a function call in a given trace.

ANOMALY DETECTION ALGORITHM

• We define an event as a function call. Every possible function call must be identified, since together they form the vocabulary of events.
• The dynamic code behaviors can be learned by training the model with non-malicious program traces.
• At any time t, each possible event (system call or library call) in the system is assigned a probability estimated with respect to the sequence of events observed until time t-1.
• At classification, the decision is made with respect to a pre-defined threshold on the top-K most probable sequences.
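The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a simple count-based model that conditions only on the previous event stands in for the LSTM/GRU network, the traces and event names are made up, and "threshold" is read here as "the observed event must be among the K most probable next events". The names `train`, `next_event_probs`, `is_anomalous` and `START` are hypothetical.

```python
from collections import Counter, defaultdict

START = "<start>"  # hypothetical marker for the beginning of a trace


def train(benign_traces):
    """Count which event follows which, using benign traces only."""
    counts = defaultdict(Counter)
    for trace in benign_traces:
        prev = START
        for event in trace:
            counts[prev][event] += 1
            prev = event
    return counts


def next_event_probs(counts, prev):
    """Estimate P(event at time t | event at time t-1) from the counts.

    Assumption: a stand-in for the RNN, which would condition on the
    whole sequence observed until time t-1 instead of one event.
    """
    total = sum(counts[prev].values())
    if total == 0:
        return {}
    return {event: c / total for event, c in counts[prev].items()}


def is_anomalous(counts, trace, k):
    """Flag the trace if any observed event is outside the top-K predictions."""
    prev = START
    for event in trace:
        probs = next_event_probs(counts, prev)
        top_k = sorted(probs, key=probs.get, reverse=True)[:k]
        if event not in top_k:
            return True
        prev = event
    return False


# Made-up benign traces; each event is a function call name.
benign = [
    ["open", "read", "close"],
    ["open", "read", "read", "close"],
    ["open", "write", "close"],
]
model = train(benign)

print(is_anomalous(model, ["open", "read", "close"], k=2))     # False
print(is_anomalous(model, ["open", "mprotect", "read"], k=2))  # True
```

A smaller K makes the detector stricter: with `k=1`, the benign-looking trace `["open", "write", "close"]` is flagged because `write` is only the second most probable event after `open`.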

MODEL

[Figure: "Given this sequence" / "Should this sequence occur?" Legend: each color bar represents one of all the possible events (system calls) in the system.]

THREAT MODEL

The Eternal War:
• Attack: Code Injection (e.g., Buffer Overflow)
• Defense: W⊕X (e.g., Windows DEP)
• Attack: Code Reuse (e.g., ROP)
• Defense: ASLR and variants
• Attack: Memory Disclosure (JIT-ROP), which enables Code Reuse
• Defense: Code Re-Randomization / Deep Learning based Anomaly Detection

• Invalid and abnormal control flow of a program:
  ✓ Code Injection
  ✓ Code Reuse
  ✓ Memory Disclosure
• Can be caused by:
  ✓ Human error (e.g., unauthorized use or operation of the program)
  ✓ Software flaws (e.g., buffer overflow vulnerabilities)
  ✓ Attacks by remote attackers
  ✓ Malicious insiders (e.g., through drive-by downloads)

CONTRIBUTIONS

• Systems can be trained in isolation using only data from benign flows (the data is fully available).
• Flow-sensitive anomaly detection. Given the execution paths P1, P2 and P3, our technique captures their occurrence probabilities, which is vital for high-precision anomaly detection.
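One way an occurrence probability for a whole path could be obtained from the per-event probabilities is the chain rule: the probability of a path is the product of the conditional probabilities of its events. Treating each path P_j as an event sequence e_1 … e_k, and writing Pr for probability so it does not clash with the path names, are notational assumptions rather than something stated on the poster.

```latex
\Pr(P_j) = \Pr(e_1, e_2, \dots, e_k)
         = \prod_{t=1}^{k} \Pr(e_t \mid e_1, \dots, e_{t-1}),
\qquad
\log \Pr(P_j) = \sum_{t=1}^{k} \log \Pr(e_t \mid e_1, \dots, e_{t-1})
```

For example, a three-event path whose events have conditional probabilities 0.9, 0.5 and 0.8 has probability 0.9 × 0.5 × 0.8 = 0.36. Summing logarithms instead of multiplying avoids numerical underflow on long traces.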

IMPLEMENTATION

• Model
  ✓ Based on Recurrent Neural Networks (RNN): Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU)
• Programming Language: Python
  ✓ Compatible with several numerical computing libraries suitable for the use of GPUs
  ✓ Some examples: PyTorch, TensorFlow and Theano
• Computing Library: PyTorch
  ✓ Scientific computing package
  ✓ Replacement for NumPy to take advantage of the power of GPUs
  ✓ Python-friendly platform for neural networks (Deep Learning)
• Parallel Computing Platform: CUDA
  ✓ Programming model to use GPUs for general-purpose computing

[Figure labels: CUDA, GPU]


2018 - PDR - F16-68D - A Deep Learning Based Anomaly Detection Approach for Intelligent Autonomous Systems - Miguel Villarreal-Vasquez