less is more: building selective anomaly ensembles with application to event detection in temporal...
TRANSCRIPT
![Page 1: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/1.jpg)
Rayana & Akoglu
Shebuti Rayana* Leman Akoglu
May 2, 2015
![Page 2: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/2.jpg)
Rayana & Akoglu 2Less is More: Building Selective Anomaly Ensembles
Network intrusion
[Figure: graph snapshot at time point t (time tick 7), and a plot labeled "Event Detection" of score (0 to 1) against time tick (0 to 20).]
![Page 3: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/3.jpg)
Emerging Topic in Social Media
Nepal Earthquake 2015: tweets and retweets with
• #Nepal
• #NepalEarthQuake
• #NepalEarthQuakeRelief
• …
[Figure: plot labeled "Event Detection" of score (0 to 1) against time tick (0 to 20); annotated date: 25th April 2015.]
![Page 4: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/4.jpg)
Given a sequence of graphs $\{G_1, G_2, \ldots, G_t, \ldots, G_T\}$
Find time points $t'$ at which $G_{t'}$ changes significantly from $G_{t'-1}$
[Figure: similarity/distance scores over time.]
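The problem statement above can be sketched in a few lines. This is only an illustration, not the method from the talk: it assumes each graph is given as a set of edges, uses the Jaccard distance between consecutive edge sets as the distance score, and flags a time point when that score exceeds a fixed threshold. The function names and the threshold are hypothetical.

```python
def edge_jaccard_distance(edges_a, edges_b):
    """Jaccard distance between two edge sets (0 = identical, 1 = disjoint)."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def change_scores(graphs):
    """Distance of each graph G_t from its predecessor G_{t-1}, for t = 2..T."""
    return [edge_jaccard_distance(prev, cur) for prev, cur in zip(graphs, graphs[1:])]


def change_points(graphs, threshold=0.5):
    """1-based time points t' whose distance from G_{t'-1} exceeds the threshold.

    The fixed threshold is an assumption made for this sketch.
    """
    return [i + 2 for i, s in enumerate(change_scores(graphs)) if s > threshold]


graphs = [
    {(1, 2), (2, 3)},
    {(1, 2), (2, 3)},
    {(1, 2), (4, 5), (5, 6)},   # most edges change here
    {(1, 2), (4, 5), (5, 6)},
]
print(change_scores(graphs))   # [0.0, 0.75, 0.0]
print(change_points(graphs))   # [3]
```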
![Page 5: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/5.jpg)
Numerous algorithms for event detection
No “winner” algorithm across datasets
Idea: ensemble approach
Combine strength of accurate detectors
Alleviate weakness of inaccurate detectors
Improved accuracy, reduced noise
More robust performance
Better than individual base detectors
T. G. Dietterich. Ensemble methods in machine learning. Springer, 2000
J. Ghosh and A. Acharya. Cluster ensembles: Theory and applications. 2013.
![Page 6: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/6.jpg)
Idea: ensemble approach
Challenge: building anomaly ensembles is a fully unsupervised task
No labels to guide the assessment of detector accuracy
No objective function inherent to task
Combining all the results may deteriorate the overall ensemble accuracy [Rayana&Akoglu’14]
▪ some detectors may be inaccurate
We build SELECTive anomaly ensembles
- identify (in)accurate detectors
- in unsupervised fashion
![Page 7: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/7.jpg)
Event Detection
![Page 8: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/8.jpg)
[Figure: Event Detection (Cybernet), feature: degree. Panels over time ticks: Eigen-behaviors, Parametric modeling, SPIRIT, Subspace Method, Moving Average. Score labels: Z-score, 1 – norm. (sum p-value), projection, SPE, Agg. p-value.]
![Page 9: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/9.jpg)
[Figure: Event Detection (Enron), feature: weighted in-degree. Score labels: Z-score, 1 – norm. (sum p-value), projection, SPE, Agg. p-value.]
![Page 10: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/10.jpg)
Graphs over time → node feature time series
Base detectors
• Anomalous Subspace (ASED) [Lakhina et al. ’04]
• SPIRIT [Papadimitriou et al. ’05]
• Eigen-behavior based (EBED) [Akoglu et al. ’10]
• Parametric modeling (PTSAD) [Rayana&Akoglu ’14]
  ▪ Models: Poisson, ZIP, Bernoulli+ZTP, Markov+ZTP
  ▪ Model selection: likelihood ratio test
• Moving average (MAED)
[Figure axes: Nodes, Features (egonet), Time.]
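To make the idea of a base detector on a feature time series concrete, here is a minimal moving-average sketch. It is not the MAED implementation from the talk: the window length, the use of the trailing standard deviation, and the `min_std` floor are all assumptions chosen for illustration.

```python
import statistics


def moving_average_scores(series, window=3, min_std=1.0):
    """Score each time tick by its deviation from the trailing moving average.

    window and min_std are hypothetical parameters for this sketch; min_std
    keeps a perfectly flat history from producing unbounded scores.
    """
    scores = []
    for t, value in enumerate(series):
        history = series[max(0, t - window):t]
        if len(history) < 2:
            scores.append(0.0)  # not enough history to estimate a spread
            continue
        mean = statistics.fmean(history)
        std = max(statistics.stdev(history), min_std)
        scores.append(abs(value - mean) / std)
    return scores


# Degree of one node over six time ticks, with a burst at tick index 4.
degree_series = [5, 5, 5, 5, 20, 5]
scores = moving_average_scores(degree_series)
print(scores.index(max(scores)))  # 4
```

In a full pipeline one such series would exist per node and feature, and the per-node scores would still need to be aggregated per time tick; that step is left out here.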
![Page 11: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/11.jpg)
ASED SPIRIT EBED PTSAD MAED
Base detector SELECTion
Rank based
• Inverse Rank
• Kemeny-Young [Kemeny ’59]
• Robust Rank Aggregation [Kolde+ ’12]
Score based
• Unification [Zimek+ ’11]
  - avg & max
• Mixture Model [Gao+ ’06]
  - avg & max
Consensus SELECTion & final ensemble
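As an illustration of the simplest rank-based consensus listed above, the sketch below averages inverse ranks across detectors. The exact formulation used in the talk is not shown on the slides, so the averaging and the tie-breaking rule here are assumptions, and the function names are hypothetical.

```python
def ranks_from_scores(scores):
    """Rank 1 = highest anomaly score; ties are broken by position."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def inverse_rank_consensus(score_lists):
    """Average of 1/rank over all detectors; larger means more anomalous."""
    rank_lists = [ranks_from_scores(s) for s in score_lists]
    n = len(score_lists[0])
    m = len(rank_lists)
    return [sum(1.0 / r[i] for r in rank_lists) / m for i in range(n)]


# Three detectors scoring the same three time ticks.
score_lists = [
    [0.1, 0.9, 0.3],
    [0.2, 0.8, 0.7],
    [0.5, 0.4, 0.1],
]
consensus = inverse_rank_consensus(score_lists)
print(consensus.index(max(consensus)))  # 1
```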
![Page 12: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/12.jpg)
Vertical SELECTion (SELECT-V)
Exploits correlation among the rank lists
Horizontal SELECTion (SELECT-H)
Exploits element-wise order statistics to filter out inaccurate detectors
![Page 13: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/13.jpg)
[Figure: Unification maps the score lists S1 S2 S3 S4 S5 to the lists P1 P2 P3 P4 P5.]
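One way to put heterogeneous detector scores on a common [0, 1] scale is Gaussian scaling, a technique described in the outlier-score unification literature. Whether the Unification step in the talk uses exactly this variant is an assumption; the sketch only shows the general idea, and `gaussian_unify` is a hypothetical name.

```python
import math
import statistics


def gaussian_unify(scores):
    """Map raw scores to [0, 1] via max(0, erf((s - mu) / (sigma * sqrt(2)))).

    Scores at or below the mean map to 0; scores far above it approach 1.
    """
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        return [0.0] * len(scores)  # no spread: nothing stands out
    return [max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2)))) for s in scores]


raw = [1, 1, 1, 1, 10]
unified = gaussian_unify(raw)
print(unified)  # first four entries are 0.0, the last is close to 0.95
```

After such a transformation, lists from different detectors can be averaged or compared by correlation without one detector's scale dominating.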
![Page 14: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/14.jpg)
[Figure: the lists P1 P2 P3 P4 P5 are averaged (avg) into the target, which serves as pseudo ground truth.]
P3 is most correlated to the target
![Page 15: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/15.jpg)
[Figure: target = avg of P1 P2 P3 P4 P5; P3 is placed in the Ensemble, whose avg is p.]
![Page 16: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/16.jpg)
[Figure: lists P1 P2 P3 P4 P5; the Ensemble E with avg p.]
P1 is most correlated to p
If corr(avg(E, P1), target) > corr(p, target)
    accept P1
else
    discard P1
![Page 17: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/17.jpg)
[Figure: lists P1 P2 P3 P4 P5; the Ensemble with avg p; P1 under consideration.]
Update until the list of remaining candidates is empty
![Page 18: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/18.jpg)
[Figure: final Ensemble: P2 P3 P4 P5; Discarded: P1.]
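The vertical selection walk-through on the preceding slides can be sketched as a short greedy loop. This is a simplified reading of the slides, not the authors' code: it uses the plain Pearson correlation (the correlation measure actually used in the talk is not stated on the slides, so this is an assumption), and `select_vertical` is a hypothetical name.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation; assumes neither list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def average(lists):
    """Element-wise average of equally long lists."""
    return [sum(col) / len(lists) for col in zip(*lists)]


def select_vertical(P):
    """Return the sorted indices of the lists kept in the ensemble."""
    target = average(P)                      # pseudo ground truth
    candidates = list(range(len(P)))
    # Start with the list most correlated to the target.
    first = max(candidates, key=lambda i: pearson(P[i], target))
    ensemble = [first]
    candidates.remove(first)
    while candidates:                        # update until the list is empty
        p = average([P[i] for i in ensemble])
        best = max(candidates, key=lambda i: pearson(P[i], p))
        candidates.remove(best)
        trial = average([P[i] for i in ensemble + [best]])
        if pearson(trial, target) > pearson(p, target):
            ensemble.append(best)            # accept
        # otherwise the candidate is discarded
    return sorted(ensemble)


P = [
    [0.0, 1.0, 0.0, 1.0, 5.0],
    [1.0, 0.0, 2.0, 0.0, 4.0],
]
print(select_vertical(P))  # [0, 1]
```

With only two lists the second one is always accepted, because adding it makes the ensemble average equal to the target; discards only become possible with more lists.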
![Page 19: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/19.jpg)
[Figure: score lists S1 S2 S3 … Sm; Mixture Modeling turns each into a binary list M1 M2 M3 … Mm (1 = outliers, 0 = inliers); Majority Voting over the binary lists yields the pseudo outliers O.]
Order statistics to choose accurate lists
Given m lists, for each pseudo outlier:
$r = [r_{(1)}, \ldots, r_{(m)}]$, s.t. $r_{(1)} \le \ldots \le r_{(m)}$
Under uniform null, prob. $\hat{r}_{(l)} \le r_{(l)}$:
$\sum_{t=l}^{m} \binom{m}{t} \, r_{(l)}^{t} \, (1 - r_{(l)})^{m-t}$
(at least l ranks drawn uniformly from [0, 1] must be ∈ [0, r(l)])
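The order-statistics probability can be computed directly as a binomial tail: under the uniform null, the l-th smallest of m ranks is at most r(l) exactly when at least l of the m draws fall in [0, r(l)]. The sketch below computes that probability for every l. Choosing the l with the smallest probability as the cut-off is one plausible reading of how the statistic is used, labeled here as an assumption; the function names are hypothetical.

```python
from math import comb


def order_statistic_probs(normalized_ranks):
    """For each l, P(at least l of m uniform draws fall in [0, r_(l)])."""
    r = sorted(normalized_ranks)
    m = len(r)
    probs = []
    for l in range(1, m + 1):
        x = r[l - 1]
        tail = sum(comb(m, t) * x**t * (1 - x) ** (m - t) for t in range(l, m + 1))
        probs.append(tail)
    return probs


def best_cut(normalized_ranks):
    """Assumed rule: the l whose order statistic is least likely under the null."""
    probs = order_statistic_probs(normalized_ranks)
    return probs.index(min(probs)) + 1


# One pseudo outlier ranked near the top by three lists and near the bottom by one.
ranks = [0.01, 0.02, 0.03, 0.9]
print(best_cut(ranks))  # 3
```

In this example the three lists that rank the pseudo outlier near the top fall before the cut, and the fourth list does not.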
![Page 20: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/20.jpg)
Example with 20 detectors
last 5 likely inaccurate
![Page 21: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/21.jpg)
Full Ensemble (Full) [Rayana&Akoglu‘14]
Assemble all the detector/consensus results
Diversity-based Ensemble (DivE) [Schubert et al. 2012]
Select diverse (less correlated) detector/consensus results to assemble
![Page 22: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/22.jpg)
| Dataset | Duration | #Nodes | #Edges | Rate |
|---|---|---|---|---|
| 1. EnronInc | 4 years | ~80K | ~350K | 1 day |
| 2. RealityMining | 50 weeks | ~18K | ~33K | 1 week |
| 3. TwitterSecurity | 4 months | ~130K | ~441K | 1 day |
| 4. TwitterWCup | 1 month | ~54K | ~274K | 5 mins |
| 5. NYTNews | 7.5 years | ~320K | ~2980K | 1 week |

• Ground truth for datasets 1-4
• Qualitative evaluation for NYTNews
![Page 23: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/23.jpg)
![Page 24: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/24.jpg)
![Page 25: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/25.jpg)
![Page 26: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/26.jpg)
![Page 27: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/27.jpg)
![Page 28: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/28.jpg)
![Page 29: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/29.jpg)
Performance comparison
![Page 30: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/30.jpg)
Performance comparison
![Page 31: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/31.jpg)
Performance comparison
![Page 32: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/32.jpg)
Performance comparison
![Page 33: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/33.jpg)
![Page 34: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/34.jpg)
Feature: Weighted Degree
![Page 35: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/35.jpg)
Columbia Disaster
9/11 attack
[Figure: graph snapshots at time tick 89 and time tick 90, each labeled with the nodes: New York City; World Trade Center; Washington (DC); Afghanistan; Bin Laden, Osama; Al Qaeda; Manhattan (NY); Bush, George W; White House; Congress.]
![Page 36: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/36.jpg)
A new Anomaly Ensemble
SELECTive:
▪ Discard inaccurate detectors
▪ Unsupervised
Heterogeneous
▪ different detectors
▪ different consensus
2-phases:
▪ No bias towards detectors & consensus
SELECT outperforms
▪ Full (no selection)
▪ DivE (diversity ensemble)
5 large datasets (4 w/ ground truth)
Hurt by inaccurate detectors
![Page 37: Less is More: Building Selective Anomaly Ensembles with Application to Event Detection in Temporal Graphs](https://reader033.vdocuments.mx/reader033/viewer/2022051521/5a6d94267f8b9ab3418b7f3d/html5/thumbnails/37.jpg)
Event Detection
http://www.cs.stonybrook.edu/~datalab/