anomaly detection - new york machine learning
DESCRIPTION
Anomaly detection is the art of finding what you don't know how to ask for. In this talk, I walk through the why and how of building probabilistic models for a variety of problems including continuous signals and web traffic. This talk blends theory and practice in a highly approachable way.
TRANSCRIPT
© 2014 MapR Technologies 1
© MapR Technologies, confidential
How to Find What You Didn’t Know to Look For
Anomaly Detection
October 14, 2014
Anomaly Detection: How To Find What You Didn’t Know to Look For
Ted Dunning, Chief Applications Architect MapR Technologies
Email [email protected] [email protected]
Twitter @Ted_Dunning
Ellen Friedman, Consultant and Commentator
Email [email protected]
Twitter @Ellen_Friedman
e-book available courtesy of MapR
http://bit.ly/1jQ9QuL
A New Look at Anomaly Detection by Ted Dunning and Ellen Friedman © June 2014 (published by O’Reilly)
Practical Machine Learning series (O’Reilly)
• Machine learning is becoming mainstream
• Need pragmatic approaches that take into account real world business settings:
  – Time to value
  – Limited resources
  – Availability of data
  – Expertise and cost of team to develop and to maintain system
• Look for approaches with big benefits for the effort expended
Anomaly Detection
Who Needs Anomaly Detection?
Utility providers using smart meters
Feedback from manufacturing assembly lines
Monitoring data traffic on communication networks
What is Anomaly Detection?
• The goal is to discover rare events
  – especially those that shouldn’t have happened
• Find a problem before other people see it
  – especially before it causes a problem for customers
• Why is this a challenge?
  – I don’t know what an anomaly looks like (yet)
Spot the Anomaly
Looks pretty anomalous to me
Will the real anomaly please stand up?
Basic idea: Find “normal” first
Steps in Anomaly Detection
• Build a model: Collect and process data for training a model
• Use the machine learning model to determine what is the normal pattern
• Decide how far away from this normal pattern you’ll consider to be anomalous
• Use the AD model to detect anomalies in new data
  – Methods such as clustering for discovery can be helpful
How hard is it to set an alert for anomalies?
Grey data is from normal events; x’s are anomalies. Where would you set the threshold?
Basic idea: Set adaptive thresholds
What Are We Really Doing
• We want action when something breaks (dies/falls over/otherwise gets in trouble)
• But action is expensive
• So we don’t want too many false alarms
• And we don’t want too many false negatives
• What’s the right threshold to set for alerts?
  – We need to trade off costs
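The cost trade-off can be made concrete with a small sketch. It assumes you have some past anomaly scores labeled as normal or anomalous (often you will not, which is the point of the adaptive thresholds that follow). The function names, scores and costs are invented for illustration and are not from the talk.

```python
def expected_cost(threshold, normal_scores, anomaly_scores,
                  false_alarm_cost, miss_cost):
    """Total cost of alerting whenever score > threshold on labeled history."""
    false_alarms = sum(1 for s in normal_scores if s > threshold)
    misses = sum(1 for s in anomaly_scores if s <= threshold)
    return false_alarms * false_alarm_cost + misses * miss_cost


def best_threshold(normal_scores, anomaly_scores, false_alarm_cost, miss_cost):
    """Try each observed score as a candidate threshold; keep the cheapest."""
    candidates = sorted(set(normal_scores) | set(anomaly_scores))
    return min(candidates,
               key=lambda t: expected_cost(t, normal_scores, anomaly_scores,
                                           false_alarm_cost, miss_cost))


# Hypothetical scores, invented for illustration only.
normal = [1.0, 1.2, 0.8, 1.1, 2.5, 0.9, 1.3]
anomalous = [2.0, 3.5, 4.0]

# When misses are expensive the threshold drops; when alarms are expensive it rises.
cheap_alarms = best_threshold(normal, anomalous, false_alarm_cost=1, miss_cost=10)
costly_alarms = best_threshold(normal, anomalous, false_alarm_cost=10, miss_cost=1)
```

With these made-up numbers, the threshold moves from 1.3 to 2.5 as false alarms become ten times more expensive than misses.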
A Second Look
[Figure label: 99.9%-ile]
New algorithm: t-digest
How Hard Can it Be?
[Diagram: an online summarizer estimates the 99.9%-ile of the input x as threshold t; an alarm fires when x > t]
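The online-summarizer idea can be sketched in a few lines. This is not t-digest: as a stand-in it keeps every value in a sorted list, which is exact but uses memory proportional to the stream, whereas t-digest keeps a compact approximate summary. The class name `QuantileAlarm` and the `warmup` parameter are hypothetical.

```python
import bisect


class QuantileAlarm:
    """Alarm when a value exceeds the running q-quantile of past values.

    Stand-in for t-digest: stores all values in sorted order (exact, but
    memory grows with the stream).
    """

    def __init__(self, q=0.999, warmup=100):
        self.q = q
        self.warmup = warmup      # hypothetical: no alarms until this many samples
        self.sorted_values = []

    def threshold(self):
        """Current estimate of the q-quantile of everything seen so far."""
        index = min(int(self.q * len(self.sorted_values)),
                    len(self.sorted_values) - 1)
        return self.sorted_values[index]

    def update(self, x):
        """Return True if x is above the current threshold, then absorb x."""
        alarm = (len(self.sorted_values) >= self.warmup
                 and x > self.threshold())
        bisect.insort(self.sorted_values, x)
        return alarm


# Synthetic readings: a repeating 0..9 pattern followed by one large value.
detector = QuantileAlarm(q=0.999, warmup=100)
readings = [float(i % 10) for i in range(1000)] + [50.0]
alarms = [i for i, x in enumerate(readings) if detector.update(x)]
```

Because the threshold is re-estimated from the data itself, nobody has to pick a fixed number by hand; only the percentile is chosen.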
Detecting Anomalies in Sporadic Events
Using t-Digest
• Apache Mahout uses t-digest as an on-line percentile estimator
  – very high accuracy for extreme tails
  – new in Mahout v0.9
• t-digest also available elsewhere
  – in streamlib (open source library on github)
  – standalone (github and Maven Central)
• What’s the big deal with anomaly detection?
• This looks like a solved problem
Already Done? Etsy Skyline?
What About This?
Model Delta Anomaly Detection
[Diagram: the model's prediction is subtracted from the input to give the delta δ; an online summarizer estimates the 99.9%-ile of δ as threshold t; an alarm fires when δ > t]
The Real Inside Scoop
• The model-delta anomaly detector is really just a sum of random variables
  – the model we know about already
  – and a normally distributed error
• The output (delta) is (roughly) the log probability of the sum distribution (really δ²)
• Thinking about probability distributions is good
• But how do you handle AD in systems with sporadic events?
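The link between δ² and log probability can be written out under the normally distributed error assumed above. In this sketch, x is the observed value, m is the model's prediction and σ is the standard deviation of the error; these symbols are introduced here for illustration and do not appear on the slides.

```latex
x = m + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2), \qquad \delta = x - m

-\log p(\delta) = \frac{\delta^2}{2\sigma^2} + \frac{1}{2}\log\left(2\pi\sigma^2\right)
```

The second term is a constant, so ranking points by δ² is the same as ranking them by -log p. A percentile threshold on the delta is therefore a threshold on how improbable the observation is under the model.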
Spot the Anomaly
Anomaly?
Maybe not!
Where’s Waldo?
This is the real anomaly
Normal Isn’t Just Normal
• What we want is a model of what is normal
• What doesn’t fit the model is the anomaly
• For simple signals, the model can be simple …
• The real world is rarely so accommodating
We Do Windows
Windows on the World
• The set of windowed signals is a nice model of our original signal
• Clustering can find the prototypes
  – Fancier techniques available using sparse coding
• The result is a dictionary of shapes
• New signals can be encoded by shifting, scaling and adding shapes from the dictionary
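The windowing and clustering steps above can be sketched in plain Python. This is a simplification under stated assumptions: windows do not overlap, each window is encoded by its single nearest shape (no shifting, scaling or adding), and the clustering is a bare-bones k-means. All function names and the synthetic signal are invented for illustration.

```python
import math
import random


def windows(signal, width):
    """Cut the signal into consecutive non-overlapping windows."""
    return [signal[i:i + width] for i in range(0, len(signal) - width + 1, width)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means; the k centroids play the role of the dictionary of shapes."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster goes empty
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def reconstruction_error(window, dictionary):
    """Distance from a window to its closest dictionary shape."""
    return math.sqrt(min(sq_dist(window, shape) for shape in dictionary))


# Synthetic "normal" signal: two shapes that alternate.
WIDTH = 16
shape_a = [math.sin(2 * math.pi * i / WIDTH) for i in range(WIDTH)]
shape_b = [0.5 * math.sin(4 * math.pi * i / WIDTH) for i in range(WIDTH)]
normal_signal = (shape_a + shape_b) * 20
dictionary = kmeans(windows(normal_signal, WIDTH), k=2)

odd_window = list(shape_a)
odd_window[5] += 3.0  # inject a spike the dictionary has never seen

normal_error = reconstruction_error(shape_a, dictionary)
odd_error = reconstruction_error(odd_window, dictionary)
```

The reconstruction error is near zero for a familiar window and about 3.0 for the spiked one, so it can be fed to a percentile threshold like any other one-dimensional anomaly score.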
Most Common Shapes (for EKG)
Reconstructed signal
[Figure: original signal, reconstructed signal, and reconstruction error; error is < 1 bit / sample]
An Anomaly
Original technique for finding 1-d anomaly works against reconstruction error
Close-up of anomaly
Not what you want your heart to do.
And not what the model expects it to do.
A Different Kind of Anomaly
Anomalies among sporadic events
Sporadic Web Traffic to an e-Business Site
It’s important to know if traffic is stopped or delayed because of a problem…
But visits to site normally come at varying intervals.
How long after the last event should you begin to worry?
And how do you let your CEO sleep through the night?
Basic idea: Time interval between events is how to convert to something useful you can measure
Sporadic Events: Finding Normal and Anomalous Patterns
• Time between events is much more usable than absolute times
• Counts don’t link as directly to probability models
• Time interval is proportional to -log p
• This is a big deal
Event Stream (timing)
• Events of various types arrive at irregular intervals
  – we can assume Poisson distribution
• The key question is whether frequency has changed relative to expected values
  – This shows up as a change in interval
• Want alert as soon as possible
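Turning event times into an anomaly score can be sketched as follows, assuming a Poisson process with a known expected rate. The function names, timestamps and rate are hypothetical.

```python
def intervals(event_times):
    """Gaps between consecutive event timestamps."""
    return [later - earlier
            for earlier, later in zip(event_times, event_times[1:])]


def interval_score(gap, expected_rate):
    """-log P(gap this long or longer) under an exponential model.

    For a Poisson process with rate lambda, P(T > t) = exp(-lambda * t),
    so the score is simply lambda * t.
    """
    return expected_rate * gap


# Hypothetical timestamps in seconds; expected rate of 0.5 events per second.
times = [0.0, 2.0, 3.5, 6.0, 26.0]
scores = [interval_score(g, expected_rate=0.5) for g in intervals(times)]
```

The last gap of 20 seconds scores 10.0 against roughly 1.0 for the others. Multiplying the gap by the expected rate also means the same score scale applies when the rate itself changes over time.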
Converting Event Times to Anomaly
99.9%-ile
99.99%-ile
But in the real world, event rates often change
Time Intervals Are Key to Modeling Sporadic Events
Model-Scaled Intervals Solve the Problem
Model Delta Anomaly Detection
[Diagram: the model's prediction is subtracted from the input to give δ, here labeled log p; an online summarizer estimates the 99.9%-ile of δ as threshold t; an alarm fires when δ > t]
Detecting Anomalies in Sporadic Events
Slipped Week: Simple Rate Predictor
Poisson Distribution
• Time between events is exponentially distributed
• This means that long delays are exponentially rare
• If we know λ we can select a good threshold
  – or we can pick a threshold empirically
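Selecting a threshold from a known λ follows directly from the exponential distribution: the gap exceeded with probability 1 - q is -ln(1 - q) / λ. A minimal sketch, with hypothetical function names and rate:

```python
import math


def interval_threshold(rate, quantile=0.999):
    """Gap length exceeded with probability (1 - quantile) when gaps are
    exponentially distributed with the given rate (events per unit time)."""
    return -math.log(1.0 - quantile) / rate


def overdue(now, last_event_time, rate, quantile=0.999):
    """True once the silence since the last event exceeds the threshold."""
    return (now - last_event_time) > interval_threshold(rate, quantile)


# Hypothetical site with 2 events per second on average.
threshold = interval_threshold(rate=2.0, quantile=0.999)
```

At 2 events per second, a silence longer than about 3.45 seconds happens by chance only one time in a thousand. The `overdue` check can run on a timer, so an alarm can fire during the silence without waiting for the next event to arrive.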
Seasonality Poses a Challenge
Something more is needed …
We need a better rate predictor…
A New Rate Predictor for Sporadic Events
Improved Prediction with Adaptive Modeling
Anomaly Detection + Classification Useful Pair
• Use the AD model to detect anomalies in new data
  – Methods such as clustering for discovery can be helpful
• Once you have well-defined models in your system, you may also want to use classification to tag those
• Continue to use the AD model to find new anomalies
Recap (out of order)
• Anomaly detection is best done with a probability model
• -log p is a good way to convert to anomaly measure
• Adaptive quantile estimation (t-digest) works for auto-setting thresholds
Recap
• Different systems require different models
• Continuous time-series
  – sparse coding to build signal model
• Events in time
  – rate model based on variable rate Poisson
  – segregated rate model
• Events with labels
  – language modeling
  – hidden Markov models
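For events with labels, a very small language model already gives a -log p score per sequence. The sketch below uses a bigram model with add-one smoothing rather than a hidden Markov model. The session data and function names are invented for illustration.

```python
import math
from collections import Counter


def train_bigrams(sequences):
    """Count label pairs and how often each label appears as a predecessor."""
    pair_counts, prev_counts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        vocab.update(seq)
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            prev_counts[prev] += 1
    return pair_counts, prev_counts, vocab


def neg_log_prob(seq, model):
    """Sum of -log p(cur | prev) with add-one smoothing."""
    pair_counts, prev_counts, vocab = model
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        p = (pair_counts[(prev, cur)] + 1) / (prev_counts[prev] + len(vocab))
        total -= math.log(p)
    return total


# Hypothetical session logs, all following the same normal pattern.
normal_sessions = [["login", "browse", "buy", "logout"]] * 50
model = train_bigrams(normal_sessions)

usual = neg_log_prob(["login", "browse", "buy", "logout"], model)
unusual = neg_log_prob(["login", "logout", "login", "logout"], model)
```

The unfamiliar ordering scores much higher than the usual one, and that score can go through the same percentile threshold as the other detectors.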
Why Use Anomaly Detection?
Keep in mind…
• Model normal, then find anomalies
• t-digest for adaptive threshold
• Probabilistic models for complex patterns
• Time intervals are key for sporadic events
• Complex time shift to predict rate with seasonality
• Sequence of events reveals phishing attack
Coming in October: Time Series Databases by Ted Dunning and Ellen Friedman © Oct 2014 (published by O’Reilly)
Thank you for coming today!
Sandbox