machine learning - slides.yowconference.com · dnn: feature engineering anything humans can do in...
TRANSCRIPT
Copyright Cognomotiv 2016
Machine Learning
No: It Can’t Do That!
Hadi Nahari
hadinahari
“Friends, Romans, countrymen, lend me your ears;
I come to bury Caesar, not to praise him.
The evil that men do lives after them…”
Julius Caesar
Act 3, Scene II
Setup
• ML + NetSec
National Academy of Engineering: Grand Challenges for the 21st Century
“The best minds of my generation are thinking about how to make people click ads.” (Jeff Hammerbacher)
Agenda
• Motivations
• Machine Learning 101
• ML & Network Security
• What Works, What Doesn’t
• Conclusion
Agenda: MOTIVATIONS
ML Is NOT New
• This is the 5th round…
ML is HOT!!
• VCs fund ML-companies like crazy
• Amazing new fields have opened
– Autonomous driving, behavior analytics, etc.
• A ton of existing fields have been revived
– Search, personalization/customization, audio processing, image processing, etc., etc.
• Mainly because…
Code Complexity
• Space Shuttle: ~400K LOC
• F22 Raptor fighter: ~2M LOC
• Linux kernel 2.2: ~2.5M LOC
• Hubble telescope: ~3M LOC
• Android core: ~12M LOC
• Future Combat Sys.: ~63M LOC
• Connected car: ~100M LOC
• Autonomous vehicle: ~300M LOC
• Large Hadron Collider: 60M LOC
50 M LOC
Use-Case Complexity
Service provider
On average, each user has only five passwords across 40 online accounts
Where to store the tokens?
Data Procreation
• >2 billion GB of new data is created every day
– 2.3283006436538696 B GB to be exact
• Sparse data: mainly 0s
• In 1993, the information on the internet surpassed all the information that humanity had created before it
Stack Proliferation
HW Architecture(s)
Applications
Algorithms
Agenda: ML 101
Machine Learning (ML)
• Study of pattern recognition & computational learning theory in Artificial Intelligence (AI)
• Algorithms to learn from, and make predictions on, data
• As opposed to following strictly static program instructions
ML Models
• Supervised learning
• Unsupervised learning
• (Semi-supervised learning)
• Reinforcement learning
Supervised Learning
• {(labeled) Input} [map] {Expected Output}
• Find [map]
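A minimal sketch of "find [map]", assuming the map is a straight line fitted by ordinary least squares. The function name `fit_line` and the toy data are illustrative and not from the talk.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b to labeled pairs (x, y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Labeled inputs with their expected outputs (toy data generated from y = 2x + 1).
inputs = [0.0, 1.0, 2.0, 3.0]
expected = [1.0, 3.0, 5.0, 7.0]

a, b = fit_line(inputs, expected)


def predict(x):
    """The learned [map], applied to an input not seen during fitting."""
    return a * x + b


print(predict(4.0))
```

The labels do the work here: without the expected outputs there would be nothing to fit the map against.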
Supervised Learning Model
Unsupervised Learning
• {(unlabeled) Input} [map] {Output}
• Find structure (patterns) in {Input}
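One way to "find structure" without labels is clustering. The sketch below assumes plain k-means on scalar values with a deterministic start; `kmeans_1d` and the sample values are hypothetical, chosen only to illustrate the idea.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalars (k >= 2); returns the sorted cluster centers."""
    # Deterministic start: spread the initial centers across the sorted data.
    ordered = sorted(values)
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Move each center to the mean of its members (keep it if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)


# Unlabeled input: nobody tells the algorithm that there are two groups here.
data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
centers = kmeans_1d(data, k=2)
print(centers)
```

The output is a description of the input (two group centers), not a prediction checked against a known answer.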
Unsupervised Learning Model
Reinforcement Learning
• No correct {Input}/{Output} pairs
• Action, environment, reward
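The action/environment/reward loop can be sketched with an epsilon-greedy agent on a two-armed bandit. This is an assumed, simplified setting (no state), and `run_bandit`, `noisy_reward`, and the payout probabilities are hypothetical.

```python
import random


def run_bandit(reward_fn, n_actions, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: returns its value estimate for each action."""
    rng = random.Random(seed)
    estimates = [0.0] * n_actions
    counts = [0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        reward = reward_fn(action, rng)  # the environment responds
        counts[action] += 1
        # Incremental mean of the rewards seen so far for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


def noisy_reward(action, rng):
    """Hypothetical environment: action 1 pays off more often than action 0."""
    return 1.0 if rng.random() < (0.2, 0.8)[action] else 0.0


estimates = run_bandit(noisy_reward, n_actions=2)
print(estimates)
```

Nobody tells the agent which action is correct; it only sees rewards and has to trade off exploring against exploiting what it has learned.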
Reinforcement Learning Model
Main ML Approaches
• Decision Tree Learning, Association Rule Learning
• Inductive Logic Programming, Support Vector Machines, Clustering, Bayesian Networks
• Representation Learning, Genetic Algorithms
• Similarity and Metric Learning, Sparse Dictionary Learning
• Artificial Neural Networks (ANN), Deep Learning (DL)
Neural Network
• Interpret an Artificial Intelligence (AI) task as the evaluation of complex functions
– Facial Recognition: Map a bunch of pixels to a name
– Handwriting Recognition: Image to a character
• NN: Network of interconnected simple neurons
The Neuron
Feed-forward system, made up of two stages:
• Linear transformation of data
• Point-wise application of a non-linear function
Inputs X1, X2, X3; weights W1, W2, W3
y = F(Σ_i W_i X_i)
F(x) = max(0, x), i.e. the Rectified Linear Unit (ReLU); other choices include sigmoid, etc.
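The two stages of a neuron translate directly into code. The sketch below uses three inputs and three weights as on the slide; the numeric values are illustrative, and the sigmoid is included only as an alternative non-linearity.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, activation=relu):
    """Stage 1: weighted sum (linear). Stage 2: point-wise non-linearity."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return activation(z)


x = [1.0, 2.0, 3.0]     # X1, X2, X3
w = [0.5, -1.0, 0.25]   # W1, W2, W3 (illustrative values)

# 0.5 - 2.0 + 0.75 = -0.75, so ReLU outputs 0.0
print(neuron(x, w))
print(neuron(x, w, activation=sigmoid))
```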
Artificial Neural Network (ANN)
• Layers and layers of neurons, with many connections
[Figure: layered network from Input to Output]
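To show what layering buys, here is a sketch of a tiny two-layer network that computes XOR, a function no single linear neuron can represent. The weights are set by hand rather than trained, and the bias terms are an addition not shown on the neuron slide; both are assumptions made for illustration.

```python
def relu(z):
    return max(0.0, z)


def layer(inputs, weights, biases, activation):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def xor_net(x1, x2):
    # Hidden layer: two ReLU neurons (hand-picked weights, not learned).
    hidden = layer([x1, x2], [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    # Output layer: linear combination h1 - 2*h2.
    (out,) = layer(hidden, [[1.0, -2.0]], [0.0], lambda z: z)
    return out


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

In a real network the weights would be found by training; the point here is only that stacking non-linear layers makes such mappings expressible at all.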
Deep Learning (DL)
• Branch of ML based on a set of algorithms that:
– Attempt to model high-level data abstractions
– Are based on learning representations of data
– Use complex architectures with multiple non-linear transformations
• Some representations make it easier to learn tasks from examples (e.g. AlphaGo)
DNN: Learning Feature Representation
Input Result
DNN: Feature Engineering
Anything humans can do in 0.1 sec, the right, big 10-layer network can do too
Images/video: Image → Vision features → Detection
Audio: Audio → Audio features → Speaker ID
Text: Text → Text features → Text classification, machine translation, information retrieval, ...
ML/DL Improve With Scale
[Figure: performance vs. data & compute, with one curve for ML / DL and one for many previous methods; time axis: past, present, future]
Agenda: ML & NETSEC
Intrusion & Intrusion Detection
“Intrusion is an attempt to compromise CIA
(Confidentiality, Integrity, Availability), or to bypass the
security mechanisms of a computer or network”
“Intrusion detection is the process of monitoring the
events occurring in a computer system or network, and
analyzing them for signs of an intrusion”
3 Main Detection Methodologies
• Signature-based Detection (SD)
– Signature: pattern corresponding to a known attack or threat
– SD: process of comparing signatures against captured events
– A.k.a. “knowledge-based detection”
• Anomaly-based Detection (AD)
– Anomaly: a deviation from “normal” behavior
– Profile of normal is derived from monitoring network traffic
– AD compares the normal profile with observed events
• Stateful Protocol Analysis (SPA)
– Vendor-developed generic profiles for specific protocols
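The difference between the first two methodologies can be sketched in a few lines. Everything concrete below is hypothetical: the signature strings, the feature names, the profile values, and the 50% tolerance are made up for illustration and do not come from any real IDS.

```python
# Hypothetical signature database: name -> byte pattern of a known attack.
SIGNATURES = {"sqli": "' OR 1=1", "traversal": "../../"}


def signature_detect(event):
    """SD: return the names of known-attack patterns found in the event."""
    return [name for name, pattern in SIGNATURES.items() if pattern in event]


def anomaly_detect(observed, normal_profile, tolerance=0.5):
    """AD: flag features that deviate from the normal profile by more than
    `tolerance`, expressed as a fraction of the profile value."""
    return [feature for feature, normal in normal_profile.items()
            if abs(observed.get(feature, 0.0) - normal) > tolerance * normal]


# Hypothetical profile of "normal", as if derived from monitoring traffic.
profile = {"requests_per_min": 40.0, "bytes_per_request": 1200.0}

print(signature_detect("GET /login?user=admin' OR 1=1--"))   # ['sqli']
print(signature_detect("GET /never-seen-before-exploit"))    # []
print(anomaly_detect({"requests_per_min": 400.0,
                      "bytes_per_request": 1100.0}, profile))  # ['requests_per_min']
```

SD can only find what is already in its database, while AD can flag something new but depends entirely on how good the profile of "normal" is.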
Cybersecurity System
• Attacks evolve, ergo building defense systems is nontrivial
• Thus, higher-level & adaptive methodologies are required
Adaptive Cybersecurity
• Data-capturing tools (Libpcap, Winpcap, etc.) capture events from the audit trails of information sources (e.g. the network)
• Data-preprocessing module filters out the attacks for which good signatures have already been learned
• A feature extractor derives basic features (sequence of syscalls, start time, NetFlow duration, src/dest IP/port, protocol, byte and packet counts)
• Analysis engine implements detection methods for infrastructure anomalies, which may or may not have appeared before
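A sketch of the feature-extraction stage, assuming captured packets have already been parsed into dictionaries. The field names (`ts`, `size`, `src_ip`, and so on) are a hypothetical schema, not the output format of Libpcap or any other capture tool.

```python
from collections import defaultdict


def extract_flow_features(packets):
    """Group packets by 5-tuple and derive basic per-flow features."""
    flows = defaultdict(list)
    for p in packets:
        key = (p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"], p["proto"])
        flows[key].append(p)

    features = []
    for key, pkts in flows.items():
        times = [p["ts"] for p in pkts]
        features.append({
            "flow": key,
            "start_time": min(times),
            "duration": max(times) - min(times),
            "packet_count": len(pkts),
            "byte_count": sum(p["size"] for p in pkts),
        })
    return features


# Hypothetical parsed packets: three on one flow, one on another.
web = {"src_ip": "10.0.0.5", "src_port": 51000,
       "dst_ip": "10.0.0.9", "dst_port": 443, "proto": "tcp"}
dns = {"src_ip": "10.0.0.5", "src_port": 53000,
       "dst_ip": "10.0.0.1", "dst_port": 53, "proto": "udp"}
packets = [
    {**web, "ts": 10.0, "size": 60},
    {**web, "ts": 10.5, "size": 1500},
    {**dns, "ts": 11.0, "size": 80},
    {**web, "ts": 12.0, "size": 40},
]

for row in extract_flow_features(packets):
    print(row)
```

The resulting per-flow rows are what an analysis engine would consume; each added feature also adds a dimension to the space the detector has to cover.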
Agenda: WHAT WORKS, WHAT DOESN’T
Curse of Dimensionality
• Data volume is massive
– min. ~100M events per day
• Much of the data is streaming data
– Requires inline, real-time analysis
• Feature space is high dimensional
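One concrete symptom of a high-dimensional feature space is that distances stop being informative: the nearest and the farthest point end up almost equally far away. The sketch below measures that effect on uniform random points, which is an assumption made for illustration and not a model of real network data.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query point
    to n_points uniform random points in the unit hypercube."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


for dim in (2, 20, 200):
    print(dim, round(relative_contrast(dim), 2))
```

As the dimension grows the contrast shrinks toward zero, which is bad news for any detector that defines "anomalous" as "far from everything else".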
$/Detection Performance Abysmal
• Looking for “every anomaly” is cost prohibitive
– if at all [practically] possible
• Narrowing down the criteria too much
– results in false negatives
• Reference data is hard to obtain due to privacy concerns
– Simulated data is useless
• ML was supposed to be better than the signature era
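A back-of-the-envelope calculation shows why the cost per detection gets out of hand. The 100M events per day figure is the one quoted earlier in the talk; the attack fraction, detection rate, and false-positive rate are hypothetical values picked only to make the arithmetic concrete.

```python
def alert_breakdown(events_per_day, attack_fraction, detection_rate,
                    false_positive_rate):
    """Split one day's alerts into true and false ones and compute precision."""
    attacks = events_per_day * attack_fraction
    benign = events_per_day - attacks
    true_alerts = attacks * detection_rate
    false_alerts = benign * false_positive_rate
    precision = true_alerts / (true_alerts + false_alerts)
    return true_alerts, false_alerts, precision


true_alerts, false_alerts, precision = alert_breakdown(
    events_per_day=100_000_000,   # volume quoted in the talk
    attack_fraction=1e-6,         # hypothetical: about 100 malicious events per day
    detection_rate=0.99,          # hypothetical
    false_positive_rate=0.001,    # hypothetical: 0.1% of benign events alert
)
print(f"{true_alerts:.0f} true alerts, {false_alerts:.0f} false alerts, "
      f"precision {precision:.4%}")
```

Under these assumptions a detector that looks excellent on paper produces roughly a thousand false alerts for every real one, and somebody has to triage them all.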
Husky Recognition
• We built an effective snow recognition model…
Learned Features
Models: Simple Correlations
• Simple models are also (usually) wrong
Network Anomalies
• Malicious data packets come in a small variety (low type count) but occur at high frequency
– Current models are not good at detecting this type of anomaly
• What counts as an anomaly/outlier varies among application domains
• Labeled anomalies are not available for training/validation
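The first point can be illustrated with a rarity-based outlier score, assuming (this is an interpretation, not something the slide states) that "current models" treat rare events as suspicious. The traffic mix and packet-type names below are hypothetical.

```python
import math
from collections import Counter


def rarity_scores(packet_types):
    """Score each type by surprisal, -log2(relative frequency).
    Under this scheme, rare means anomalous."""
    counts = Counter(packet_types)
    total = len(packet_types)
    return {t: -math.log2(c / total) for t, c in counts.items()}


# Hypothetical sample: a flood of one malicious packet type
# drowns out the benign variety.
traffic = (["syn_flood"] * 9000 + ["http_get"] * 600
           + ["dns_query"] * 390 + ["ntp_sync"] * 10)

scores = rarity_scores(traffic)
most_anomalous = max(scores, key=scores.get)
print(most_anomalous)
print({t: round(s, 2) for t, s in scores.items()})
```

The flood gets the lowest anomaly score precisely because it is frequent, while a harmless but rare packet type is ranked as the most suspicious.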
Baselining
• Using ML to detect anomalies is easy when the baseline is well-defined and follows a simple mathematical model (e.g. Normal Distribution)
• Most real-world systems don’t render a simple baseline (i.e. their behavior is very complex)
• [!] Sanctity of baseline: “nearly 100% of networks are compromised”
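The easy case, and the way a compromised baseline breaks it, can be sketched with a z-score detector that models the baseline as a normal distribution. The request rates and the 3-sigma threshold are hypothetical.

```python
import statistics


def zscore_detector(baseline, threshold=3.0):
    """Model the baseline as a normal distribution and flag values
    more than `threshold` standard deviations from its mean."""
    mean = statistics.mean(baseline)
    std = statistics.pstdev(baseline)
    return lambda value: abs(value - mean) / std > threshold


# Hypothetical requests per minute observed while baselining.
clean = [100, 102, 98, 101, 99, 100, 103, 97]
# Same period, but the attacker was already active on the network.
compromised = clean + [130, 135, 128, 132]

print(zscore_detector(clean)(130))         # True: flagged
print(zscore_detector(compromised)(130))   # False: attack traffic is now "normal"
```

If the network was already compromised when the baseline was taken, the malicious behavior is baked into the definition of normal and the detector never sees it.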
Time Shifting
• “Window problem”: algorithms can only ingest data in chunks small enough to be processed
– What if the anomaly is seeded outside that window?
• Network traffic diversity: usage varies in every session and with new applications
– The window should also be shifted for recurring training
• Serious impact on performance, real-time operation, and security
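The window problem can be made concrete with a sliding-window event counter. The event timestamps, the 60-second window, and the threshold of 5 are hypothetical; think of the events as failed logins from one source.

```python
from collections import deque


def windowed_alert(timestamps, window, threshold):
    """Return True if any sliding window of `window` seconds holds
    at least `threshold` events."""
    recent = deque()
    for t in sorted(timestamps):
        recent.append(t)
        while recent[0] <= t - window:   # drop events that fell out of the window
            recent.popleft()
        if len(recent) >= threshold:
            return True
    return False


burst = [i * 2 for i in range(10)]           # 10 events within 20 seconds
low_and_slow = [i * 90 for i in range(10)]   # the same 10 events, one every 90 seconds

print(windowed_alert(burst, window=60, threshold=5))           # True
print(windowed_alert(low_and_slow, window=60, threshold=5))    # False
print(windowed_alert(low_and_slow, window=3600, threshold=5))  # True
```

Widening the window catches the slow attacker, but it also means holding and processing sixty times more data per decision, which is where the performance and real-time costs come from.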
There’s More…
• How do you trust what the model predicts?
– i.e. how do we know the model works correctly (husky)?
• Designing sound evaluation schemes can be more difficult than the detector itself
• We really don’t know how ML works
• … or how to reason about ML models
• … or how to debug them
• For now it’s just magic & voodoo
Agenda: CONCLUSION
Summary
• ML is a great and necessary technology
• ML really shines for some classes of problems
• ML is NOT the best solution for every problem (e.g. NetSec)
• Obtaining (and training with) useful data remains a challenge
• ML is just one initial building block of Machine Cognition and Artificial Understanding: there are many more
• Still a long way before machines can replicate humans!
THANK YOU!
Hadi Nahari
hadinahari
Backup
References
• Prof. Karl Friston, seminal works (http://www.fil.ion.ucl.ac.uk/~karl/#_Free-energy_principle)
• “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier”, Carlos Guestrin, et al. (https://arxiv.org/pdf/1602.04938.pdf)
• “Using Machine Learning in Network Intrusion Detection Systems”, Omar Shaya (http://www.slideshare.net/OmarShaya/machine-learning-in-networks-intrusion-detection?next_slideshow=1)
• “Machine Learning Is Not The Answer To Better Network Security”, Matt Harrigan (https://techcrunch.com/2016/02/29/machine-learning-is-not-the-answer-to-better-network-security/)
• “Machine Learning Algorithm Cheat Sheet”, Laura Diane Hamilton (http://www.lauradhamilton.com/machine-learning-algorithm-cheat-sheet)
• “Anomaly Detection Approaches for Communicating Networks” (http://users.ece.gatech.edu/~jic/anomaly-book-chap-09.pdf)
• “A Survey on Machine Learning Techniques for Intrusion Detection Systems”, J. Singh, N.J. Nene (http://ijarcce.com/upload/2013/november/35-o-jayveer_singh-A_Survey_on_Machine.pdf)
• “Machine Learning Techniques for Anomaly Detection: An Overview”, S. Omar, et al. (http://research.ijcaonline.org/volume79/number2/pxc3891478.pdf)
• “Recent Advances in Predictive (Machine) Learning”, J.H. Friedman, et al. (http://statweb.stanford.edu/~jhf/ftp/machine)
• “Outside the Closed World: On Using Machine Learning For Network Intrusion Detection”, R. Sommer, V. Paxson (http://www.utdallas.edu/~muratk/courses/dmsec_files/oakland10-ml.pdf)
• http://xkcd.com
Are Humans Getting Smarter?
• IQ scores are rising
• Underlying biological “HW” is declining
• “Intelligence” is in decline