A Novel Technique for Learning Rare Events
Margaret H. Dunham, Yu Meng, Jie Huang
CSE Department, Southern Methodist University, Dallas, Texas 75275
11/11/05
This material is based upon work supported by the National Science Foundation under Grant No. IIS-0208741.
Objectives/Outline
Develop modeling techniques which can “learn/forget” past behavior of spatiotemporal events. Apply to prediction of rare events.
• Introduction
• EMM Overview
• EMM Applications to Rare Event Detection
• Future Work
Spatiotemporal Environment
• Events arrive in a stream.
• We cannot look at a snapshot of the data.
• At any time, t, we can view the state of the problem at a site as represented by a vector of n numeric values:
Vt = <S1t, S2t, ..., Snt>

| | V1 | V2 | … | Vq |
|---|---|---|---|---|
| S1 | S11 | S12 | … | S1q |
| S2 | S21 | S22 | … | S2q |
| … | … | … | … | … |
| Sn | Sn1 | Sn2 | … | Snq |

Columns are ordered by time.
Spatiotemporal Modeling
Example Applications:
• Flood Prediction
• Rare Event Detection – network traffic, automobile traffic
Requirements:
• Capture Time
• Capture Space
• Dynamic
• Scalable
• Quasi-Real Time
Technique
Spatiotemporal modeling technique based on Markov models.
However:
• Size of MM depends on size of dataset.
• The required structure of the MM is not known at the model construction time.
• As the real world being modeled by the MM changes, so should the structure of the MM.
• Thus not only should transition probabilities change, but the number of states should be changed to more accurately model the changing world.
MM
A first-order Markov chain is a finite or countably infinite sequence of events {E1, E2, …} over discrete time points, where Pij = P(Ej | Ei), and at any time the future behavior of the process depends solely on the current state.
A Markov Model (MM) is a graph with m vertices or states, S, and directed arcs, A, such that:
• S = {N1, N2, …, Nm}, and
• A = {Lij | i ∈ {1, 2, …, m}, j ∈ {1, 2, …, m}}, and
• each arc Lij = <Ni, Nj> is labeled with a transition probability Pij = P(Nj | Ni).
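As a rough illustration of the definition, the transition probabilities Pij = P(Nj | Ni) of a fixed Markov model can be estimated by counting arcs in an observed state sequence. This is only a sketch: the function name `transition_probabilities` and the toy sequence are illustrative assumptions, not part of the slides.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate P(Nj | Ni) from an observed state sequence by counting arcs."""
    arc_counts = defaultdict(Counter)
    # Count each observed arc <Ni, Nj> between consecutive states.
    for current, nxt in zip(sequence, sequence[1:]):
        arc_counts[current][nxt] += 1
    # Normalize the counts leaving each state so they sum to 1.
    return {
        state: {nxt: count / sum(outgoing.values())
                for nxt, count in outgoing.items()}
        for state, outgoing in arc_counts.items()
    }


# Hypothetical state sequence, made up for illustration.
probs = transition_probabilities(["N1", "N2", "N1", "N3", "N1", "N2"])
print(probs["N1"])  # N1 moves to N2 in two of its three transitions
```

Note that the set of states here is fixed by the data seen so far, which is the limitation the EMM is meant to address.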
Problem with Markov Chains
• The required structure of the MC may not be certain at the model construction time.
• As the real world being modeled by the MC changes, so should the structure of the MC.
• Not scalable: grows linearly with the number of events.
• Markov Property
Our solution:
• Extensible Markov Model (EMM)
• Cluster real world events
• Allow Markov chain to grow and shrink dynamically
Extensible Markov Model (EMM)
• Time Varying Discrete First Order Markov Model
• Nodes are clusters of real world states.
• Learning continues during application phase.
• Learning:
    • Transition probabilities between nodes
    • Node labels (centroid/medoid of cluster)
    • Nodes are added and removed as data arrives
Related Work
• Splitting Nodes in HMMs: create new states by splitting an existing state.
    M. J. Black and Y. Yacoob, “Recognizing facial expressions in image sequences using local parameterized models of image motion,” Int. Journal of Computer Vision, 25(1), 1997, 23-48.
• Dynamic Markov Modeling: states and transitions are cloned.
    G. V. Cormack and R. N. S. Horspool, “Data compression using dynamic Markov Modeling,” The Computer Journal, Vol. 30, No. 6, 1987.
• Augmented Markov Model (AMM): creates new states if the input data has never been seen in the model, and transition probabilities are adjusted.
    Dani Goldberg and Maja J. Mataric, “Coordinating mobile robot group behavior using a model of interaction dynamics,” Proceedings, the Third International Conference on Autonomous Agents (Agents ’99), Seattle, Washington.
EMM vs AMM
Our proposed EMM model is similar to AMM, but is more flexible:
• EMM continues to learn during the application (prediction, etc.) phase.
• The EMM is a generic incremental model whose nodes can have any kind of representatives.
• State matching is determined using a clustering technique.
• EMM not only allows the creation of new nodes, but deletion (or merging) of existing nodes. This allows the EMM model to “forget” old information which may not be relevant in the future. It also allows the EMM to adapt to any main memory constraints for large scale datasets.
• EMM performs one scan of data and therefore is suitable for online data processing.
EMM
Extensible Markov Model (EMM): at any time t, EMM consists of an MM and algorithms to modify it, where algorithms include:
EMMSim, which defines a technique for matching between input data at time t + 1 and existing states in the MM at time t.
EMMBuild algorithm, which updates MM at time t + 1 given the MM at time t and classification measure result at time t + 1.
Additional algorithms are used to modify the model or for applications.
EMMBuild
Input:
    Vt = <S1, S2, …, Sn>: Observed values at n different locations at time t.
    G: EMM with m states at time t-1.
    Nc: Current state at time t-1.
Output:
    G: EMM graph at time t.
    Nc: Current state at time t.

if G = empty then
    // Initialize G, first input vector is the first state
    N1 = Vt; CN1 = 0; Nc = N1;
else
    // update G as new input comes in
    foreach Ni in G
        determine EMMSim(Vt, Ni);
    let Nn be node with largest similarity value, sim;
    if sim >= threshold then
        // update matching state information
        CNc = CNc + 1;
        if Lcn exists
            CLcn = CLcn + 1;
        else
            create new transition Lcn = <Nc,Nn>; CLcn = 1;
        Nc = Nn;
    else
        // create a new state Nm+1 represented by Vt
        create new node Nm+1; Nm+1 = Vt; CNm+1 = 0;
        create new transition Lc(m+1) = <Nc, Nm+1>; CLc(m+1) = 1;
        CNc = CNc + 1;
        Nc = Nm+1;
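The EMMBuild pseudocode might be sketched in Python roughly as follows. Several details are assumptions made only for illustration: the class name `EMM`, the use of cosine similarity as EMMSim, keeping the first vector of a node as its representative (rather than a centroid or medoid), and the toy input vectors.

```python
import math


class EMM:
    """Minimal sketch of EMMBuild; cosine similarity is an assumed EMMSim."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.nodes = []      # node representatives (assumption: first vector seen)
        self.cn = []         # CN_i: number of transitions leaving node i
        self.cl = {}         # CL_ij: number of transitions from node i to node j
        self.current = None  # index of the current node Nc

    @staticmethod
    def similarity(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def build(self, vector):
        """Process one input vector and return the index of the new current node."""
        if not self.nodes:
            # First input vector becomes the first state.
            self.nodes.append(tuple(vector))
            self.cn.append(0)
            self.current = 0
            return self.current
        sims = [self.similarity(vector, rep) for rep in self.nodes]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] >= self.threshold:
            target = best  # input matches an existing node
        else:
            # No node is close enough: create a new state represented by the input.
            self.nodes.append(tuple(vector))
            self.cn.append(0)
            target = len(self.nodes) - 1
        # Both branches of the pseudocode increment CN_c and the arc count CL_c,target.
        self.cn[self.current] += 1
        arc = (self.current, target)
        self.cl[arc] = self.cl.get(arc, 0) + 1
        self.current = target
        return target

    def transition_probability(self, i, j):
        return self.cl.get((i, j), 0) / self.cn[i] if self.cn[i] else 0.0


# Hypothetical two-dimensional inputs, made up for illustration.
emm = EMM(threshold=0.99)
for v in [(1, 0), (1, 0.01), (0, 1), (1, 0)]:
    emm.build(v)
print(len(emm.nodes))                    # 2
print(emm.transition_probability(0, 1))  # 0.5
```

The model is updated in a single pass over the stream, and the number of nodes depends on the threshold rather than directly on the number of inputs.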
EMMSim
• Find closest node to incoming event.
• If none “close”, create new node.
• Labeling of cluster is centroid/medoid of members in cluster.
• Problem: O(n)
• BIRCH: O(lg n)
    • Requires second phase to recluster initial
EMMBuild
Input vectors:
<18,10,3,3,1,0,0>
<17,10,2,3,1,0,0>
<16,9,2,3,1,0,0>
<14,8,2,3,1,0,0>
<14,8,2,3,0,0,0>
<18,10,3,3,1,1,0>
[Figure: successive EMM graphs over nodes N1, N2, and N3 as the inputs arrive; arcs are labeled with count ratios such as 1/1, 1/2, 1/3, 2/3, and 2/2.]
EMMDecrement
[Figure: an EMM with nodes N1, N2, N3, N5, and N6 and labeled transitions, shown before and after node N2 is deleted.]
Delete N2
EMM Advantages
• Dynamic
• Adaptable
• Use of clustering
• Learns rare event
• Scalable:
    • Growth of EMM is not linear on size of data.
    • Hierarchical feature of EMM
• Creation/evaluation quasi-real time
• Distributed / Hierarchical extensions
Growth of EMM
[Figure: number of states in the model versus number of input data (total 1574), one curve per similarity threshold 0.994, 0.995, 0.996, 0.997, and 0.998; Serwent data.]
EMM Performance – Growth Rate
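Columns give the similarity threshold.

| Data | Sim | 0.99 | 0.992 | 0.994 | 0.996 | 0.998 |
|---|---|---|---|---|---|---|
| Serwent | Jaccard | 156 | 190 | 268 | 389 | 667 |
| Serwent | Dice | 72 | 92 | 123 | 191 | 389 |
| Serwent | Cosine | 11 | 14 | 19 | 31 | 61 |
| Serwent | Overlap | 2 | 2 | 3 | 3 | 4 |
| Ouse | Jaccard | 56 | 66 | 81 | 105 | 162 |
| Ouse | Dice | 40 | 43 | 52 | 66 | 105 |
| Ouse | Cosine | 6 | 8 | 10 | 13 | 24 |
| Ouse | Overlap | 1 | 1 | 1 | 1 | 1 |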
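The slides name the four similarity measures but do not define them. The sketch below assumes the common vector forms of Jaccard, Dice, cosine, and overlap; the exact definitions used in the experiments may differ. It applies them to two of the input vectors from the EMMBuild example.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def jaccard(a, b):
    # Assumed vector form: a.b / (|a|^2 + |b|^2 - a.b)
    return _dot(a, b) / (_dot(a, a) + _dot(b, b) - _dot(a, b))


def dice(a, b):
    # Assumed vector form: 2 a.b / (|a|^2 + |b|^2)
    return 2 * _dot(a, b) / (_dot(a, a) + _dot(b, b))


def cosine(a, b):
    # a.b / (|a| |b|)
    return _dot(a, b) / math.sqrt(_dot(a, a) * _dot(b, b))


def overlap(a, b):
    # Assumed vector form: a.b / min(|a|^2, |b|^2); this can exceed 1.
    return _dot(a, b) / min(_dot(a, a), _dot(b, b))


a = (18, 10, 3, 3, 1, 0, 0)
b = (14, 8, 2, 3, 0, 0, 0)
scores = [f(a, b) for f in (jaccard, dice, cosine, overlap)]
print(scores)
```

Under these assumed definitions, vectors with a non-negative dot product always satisfy jaccard ≤ dice ≤ cosine ≤ overlap. That ordering is consistent with the table: at a fixed threshold, a measure that reports lower similarity matches fewer inputs to existing nodes and therefore creates more states.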
EMM Performance – Growth Rate
[Figure: EMM growth rate on the Minnesota traffic data.]
Error Rates
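Normalized Absolute Ratio Error (NARE):

$$\mathrm{NARE} = \frac{\sum_{t=1}^{N} |O(t) - P(t)|}{\sum_{t=1}^{N} O(t)}$$

Root Mean Square (RMS):

$$\mathrm{RMS} = \sqrt{\frac{\sum_{t=1}^{N} \left(O(t) - P(t)\right)^2}{N}}$$

Here O(t) is the observed value and P(t) the predicted value at time t, over N time points.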
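As a small worked example, the two error rates can be computed as below, taking NARE as the summed absolute error divided by the summed observations and RMS as the square root of the mean squared error. The function names and the three-point series are made up for illustration.

```python
import math


def nare(observed, predicted):
    """Normalized Absolute Ratio Error: sum |O(t) - P(t)| / sum O(t)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / sum(observed)


def rms(observed, predicted):
    """Root mean square error: sqrt(sum (O(t) - P(t))^2 / N)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical observed and predicted values.
observed = [2.0, 4.0, 4.0]
predicted = [1.0, 4.0, 6.0]
print(nare(observed, predicted))  # (1 + 0 + 2) / 10 = 0.3
print(rms(observed, predicted))   # sqrt(5 / 3)
```

NARE is scale-free because it is normalized by the total observed value, while RMS is in the units of the data and penalizes large individual errors more heavily.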
EMM Performance - Prediction
| Method | NARE | RMS | No. of States |
|---|---|---|---|
| RLF | 0.321423 | 1.5389 | |
| EMM, Th=0.95 | 0.068443 | 0.43774 | 20 |
| EMM, Th=0.99 | 0.046379 | 0.4496 | 56 |
| EMM, Th=0.995 | 0.055184 | 0.57785 | 92 |
EMM Water Level Prediction – Ouse Data
[Figure: water level (m) over the input time series, comparing RLF prediction, EMM prediction, and observed values.]
Rare Event
• Rare - Anomalous - Surprising
• Out of the ordinary
• Not outlier detection:
    • No knowledge of data distribution
    • Data is not static
    • Must take temporal and spatial values into account
    • May be interested in sequence of events
• Ex: Snow in upstate New York is not rare; snow in upstate New York in June is rare.
• Rare events may change over time
Rare Event Examples
The amount of traffic through a site in a particular time interval is extremely high or low.
The type of traffic (i.e. source IP addresses or destination addresses) is unusual.
Current traffic behavior is unusual based on recent previous traffic behavior.
Unusual behavior at several sites.
What is a Rare Event?
• Not an outlier:
    • We don’t know anything about the distribution of the data.
    • Even if we did, the data continues changing. A model created based on a static view may not fit tomorrow’s data.
• We view a rare event as:
    • Unusual state of the network (or subset thereof).
    • Transition between network states which does not frequently occur.
• Base rare event detection on determining events or transitions between events that do not frequently occur.
Rare Event Examples – VoIP Traffic
The amount of traffic through a site in a particular time interval is extremely high or low.
The type of traffic (i.e. source IP addresses or destination addresses) is unusual.
Current traffic behavior is unusual based on recent previous traffic behavior.
Unusual behavior at several sites.
Rare Event Detection Applications
• Intrusion Detection
• Fraud
• Flooding
• Unusual automobile/network traffic
Rare Event Detection Techniques
• Signature Based
    • Created signatures for normal behavior
    • Rule based
    • Pattern Matching
    • State Transition Analysis
• Statistical Based
    • Profiles of normal behavior
• Data Mining Based
    • Classification
    • Clustering
EMM Rare Event Prediction – VoIP Traffic
• Predict rare events at a specific site (switch) representing an area of the network.
• Use:
    • Identify when rare transition occurs
    • Identify rare event by creation of new node
• Hierarchical EMM: collect rare event information at a higher level by constructing an EMM of more global events from several sites.
Our Approach
• By learning what is normal, the model can predict what is not.
• Normal is based on likelihood of occurrence.
• Use EMM to build model of behavior.
• We view a rare event as:
    • Unusual event
    • Transition between event states which does not frequently occur.
• Base rare event detection on determining events or transitions between events that do not frequently occur.
• Continue learning
EMMRare
EMMRare algorithm indicates if the current input event is rare. Using a threshold occurrence percentage, the input event is determined to be rare if either of the following occurs:
The frequency of the node at time t+1 is below this threshold
The updated transition probability of the MC transition from node at time t to the node at t+1 is below the threshold
Determining Rare
Occurrence Frequency (OFc) of a node Nc is defined by:

$$OF_c = \frac{CN_c}{\sum_i CN_i}$$

Likewise when determining what is meant by small for a transition probability, we should look at a normalized rather than actual value. We, thus, define the Normalized Transition Probability (NTPmn), from one state, Nm, to another, Nn, as:

$$NTP_{mn} = \frac{CL_{mn}}{\sum_i CN_i}$$
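A rare-event check along these lines might look like the sketch below. It takes OFc as the node count CNc divided by the sum of all node counts, and NTPmn as the arc count CLmn divided by the same sum. The function names, the dictionary layout for the counts, the example counts, and the 0.05 threshold are all assumptions chosen for illustration.

```python
def occurrence_frequency(cn, c):
    """OF_c = CN_c / sum_i CN_i, with node counts in a dict."""
    total = sum(cn.values())
    return cn[c] / total if total else 0.0


def normalized_transition_probability(cl, cn, m, n):
    """NTP_mn = CL_mn / sum_i CN_i, with arc counts keyed by (m, n)."""
    total = sum(cn.values())
    return cl.get((m, n), 0) / total if total else 0.0


def is_rare(cn, cl, prev, curr, threshold):
    """Flag the event as rare if either the node or the transition is infrequent."""
    return (occurrence_frequency(cn, curr) < threshold
            or normalized_transition_probability(cl, cn, prev, curr) < threshold)


# Hypothetical counts for a three-node model.
cn = {"N1": 90, "N2": 9, "N3": 1}
cl = {("N1", "N1"): 80, ("N1", "N2"): 9, ("N1", "N3"): 1,
      ("N2", "N1"): 9, ("N3", "N1"): 1}

print(is_rare(cn, cl, "N1", "N3", threshold=0.05))  # True: OF of N3 is 0.01
print(is_rare(cn, cl, "N1", "N1", threshold=0.05))  # False: OF 0.9, NTP 0.8
```

Normalizing by the total count, rather than by the count of the source node alone, keeps a transition out of a rarely visited node from looking frequent just because that node has few outgoing arcs.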
Ongoing/Future Work
• Extend to Emerging Patterns
• Incorporate techniques to reduce False Alarms
• Extend to Hierarchical/Distributed
Conclusion
We welcome feedback