February 2019
Forecasting Suspicious Account Activity at Large-Scale Online Service Providers
Hassan Halawa1, Konstantin Beznosov1, Baris Coskun2, Meizhu Liu3, Matei Ripeanu1
1 University of British Columbia
2 Amazon Web Services
3 Yahoo! Research
Automated attacks operating on a large scale, exploiting unsafe decisions made by individual users
■ Phishing
■ Phishing □ Online Services
■ Phishing ■ Overview □ Current vs. Proposed
Current Defenses (identifying attack/attacker patterns)
Reactive
Signatures
Proactive
Anomalies
Evolving Attacks
False Positives
Proposed (mining behavioral/usage patterns)
Forecasting
Early Warning
■ Phishing ■ Overview □ Current vs. Proposed □ Highlights
■ Experiment at a Large-Scale Online Service Provider (4 months production data / 100+ million users / 100+ billion login events)
■ Promising Performance as an Early Warning System (AUROC ~ 0.92 / FPR ~ 0.5% / ACC ~ 99.5% / REC ~ 50.6% / PRE ~ 18.3% using only a 1 week historical trace and predicting 1 month in advance)
■ Supervised ML Pipeline for Forecasting (predict future suspicious account activity from historical traces)
■ Evaluation Across Varied Classification Exercises (1 week trace → [7, 90] day forecast / 3 weeks → [21, 34] days)
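The FPR, ACC, REC and PRE figures in the highlights follow the usual confusion-matrix definitions. The sketch below is a minimal Python illustration of those definitions. The `Confusion` class, the `metrics` function and the counts in `toy` are hypothetical and are not the experiment's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary 'suspicious account' classifier."""
    tp: int  # suspicious accounts flagged as suspicious
    fp: int  # benign accounts flagged as suspicious
    tn: int  # benign accounts left unflagged
    fn: int  # suspicious accounts that were missed


def metrics(c: Confusion) -> dict:
    """Return PRE, REC, FPR and ACC as fractions in [0, 1].

    Assumes every denominator is non-zero (no guard against empty classes).
    """
    return {
        "PRE": c.tp / (c.tp + c.fp),
        "REC": c.tp / (c.tp + c.fn),
        "FPR": c.fp / (c.fp + c.tn),
        "ACC": (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn),
    }


# Hypothetical counts, chosen only to illustrate a rare positive class.
toy = Confusion(tp=50, fp=200, tn=39_750, fn=50)
for name, value in metrics(toy).items():
    print(f"{name}: {value:.3f}")
```

With a rare positive class, as in the toy counts, a false positive rate near 0.5% can still coexist with a precision of only about 20%, because the few false alarms are drawn from a much larger benign population.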
■ Phishing ■ Overview ■ Approach □ Account Lifecycle
[Figure: Overview of the lifecycle of a compromised account. Events along the time axis: Account Registration, Account Compromised, Account Flagged, Account Remediation. Actors shown: Legitimate Owner, Attacker.]
■ Phishing ■ Overview ■ Approach □ Account Lifecycle □ ML Pipeline
Goal: Forecast suspicious account activity using supervised machine learning
Pipeline stages: Data Source (Ground Truth, User Activity), Data Pre-Processing, Model Selection, Model Evaluation
Data: Unstructured Data, Structured & Labeled Data
Outputs: Susp. Acct. Classifier; Susp. Acct. Population & Susp. Acct. Scores (to Defense Systems)
Metrics: AUROC, BTR, PRE, REC, FPR
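Of the metrics listed for model evaluation, AUROC can be read as the probability that a randomly chosen suspicious account receives a higher score than a randomly chosen benign one. The sketch below computes it by direct pairwise comparison in plain Python. The function name `auroc` and the example scores are hypothetical, and a production pipeline would more likely use a library routine than this quadratic loop.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier outputs, higher means more suspicious.
    labels: 1 for suspicious (positive), 0 for benign (negative).
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six accounts, two of which are truly suspicious.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 0, 1, 0, 0, 0]
print(auroc(example_scores, example_labels))  # 6 of 8 pairs ranked correctly -> 0.75
```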
■ Phishing ■ Overview ■ Approach □ Account Lifecycle □ ML Pipeline □ Classification Exercise
[Figure: Overview of a classification exercise along the time axis. Labels: Training Interval, Testing Interval, Buffer Window (BW). Each interval shows a Data Window (DW, User Activity) and a Label Window (LW, Ground Truth) drawn from the Data Source. The Susp. Acct. Classifier yields the Susp. Acct. Population & Susp. Acct. Scores for Defense Systems.]
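The windows of a classification exercise can be sketched as date ranges. The figure did not survive extraction, so the layout used here is an assumption: each interval is taken to be a Data Window followed by a Buffer Window followed by a Label Window, and the testing interval is taken to be the same layout shifted later in time. All names (`Interval`, `build_interval`, `build_exercise`) and the window sizes in the example are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Tuple

Span = Tuple[date, date]  # (start, end) with the end date exclusive


@dataclass(frozen=True)
class Interval:
    data_window: Span    # DW: user activity used to build features
    buffer_window: Span  # BW: gap between features and labels (assumed placement)
    label_window: Span   # LW: ground truth used to build labels


def build_interval(start: date, dw_days: int, bw_days: int, lw_days: int) -> Interval:
    """Lay out DW, BW and LW back to back, starting at `start`."""
    dw_end = start + timedelta(days=dw_days)
    bw_end = dw_end + timedelta(days=bw_days)
    lw_end = bw_end + timedelta(days=lw_days)
    return Interval((start, dw_end), (dw_end, bw_end), (bw_end, lw_end))


def build_exercise(start: date, dw_days: int, bw_days: int, lw_days: int,
                   shift_days: int) -> Tuple[Interval, Interval]:
    """Return (training, testing); testing is training shifted by `shift_days`."""
    train = build_interval(start, dw_days, bw_days, lw_days)
    test = build_interval(start + timedelta(days=shift_days), dw_days, bw_days, lw_days)
    return train, test


# Hypothetical sizes: one week per window, testing starts when training ends.
train, test = build_exercise(date(2018, 1, 1), 7, 7, 7, shift_days=21)
print("train LW:", train.label_window)
print("test  DW:", test.data_window)
```

Keeping the end dates exclusive means adjacent windows share a boundary date without overlapping, which makes it easy to check that no label-window data leaks into a data window.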
■ Phishing ■ Overview ■ Approach ■ Evaluation □ Classification Exercises
Notation: DW - Data Window, BW - Buffer Window, LW - Label Window, H - Prediction Horizon
[Figure: classification exercises laid out along the time axis: Hyperparameter Optimization, Overfit Check, Preprocessing Impact, Performance for Wider Windows]
■ Phishing ■ Overview ■ Approach ■ Evaluation □ Classification Exercises □ AUROC
■ Phishing ■ Overview ■ Approach ■ Evaluation □ Classification Exercises □ AUROC □ PRE/REC vs. Horizon
■ Phishing ■ Overview ■ Approach ■ Evaluation ■ Recap
■ Experiment at a Large-Scale Online Service Provider (4 months production data / 100+ million users / 100+ billion login events)
■ Promising Performance as an Early Warning System (AUROC ~ 0.92 / FPR ~ 0.5% / ACC ~ 99.5% / REC ~ 50.6% / PRE ~ 18.3% using only a 1 week historical trace and predicting 1 month in advance)
■ Supervised ML Pipeline for Forecasting (predict future suspicious account activity from historical traces)
■ Evaluation Across Varied Classification Exercises (1 week trace → [7, 90] day forecast / 3 weeks → [21, 34] days)
Backup/Discussion Slides
■ Discussion □ Account Suspiciousness vs. Vulnerability
[Figure labels along the time axis: Mining Historical Behavioral/Usage Patterns; Vulnerable at Present; Suspicious in Future (Forecast)]
■ Discussion □ Current vs. Proposed
Attack stages: Attack Launched (phishing emails), (1) Operator Filter, System Infiltrated (email in inbox), (2) User Filter, User Victimized (credentials stolen), (3) Remediation Filter, Compromise Detected
Current (feedback based on identifying attack patterns):
(1) Email Classification, Anomaly Detection
(2) HTTPS Browser Lock, Two Factor Auth.
(3) Incident Response, User Reports
Proposed (feedback based on identifying vulnerable users):
(1) Throttled Outbox, Delayed Inbox
(2) Personalized Controls, Targeted Education
(3) Efficient Compromise-Detection Campaigns
Population labels: V, Robust Users
■ Discussion □ Proposed Defense Mechanisms
Design of new defense mechanisms
Filter stages: (1) Operator Filters, (2) User Filters, (3) Remediation Filters
Row labels: Honeypots, Differential Defenses, Prioritization
Mechanisms: Defense Resource Prioritization, Targeted User Education, Efficient Inspection, Effective Exercises, Throttling During Emergencies, Captive Portals, Mitigate Adversarial Learning, Personalised Control & Advice, Infer Attack Origin, Identify New Attacks
■ Discussion □ Evaluation of Mechanisms
Evaluation of proposed defense mechanisms
Flow: ML Pipeline, Vulnerability Classifier, Vuln. Population & Vuln. Scores, Proposed Defenses
Evaluation approaches: Simulation, Analytical Models, Practical Experiments
Input: Attack Propagation, Population Distribution, Defense Parameters
Output Metrics: Cost, Effectiveness
[Figure labels: S I R (three panels); V, Robust; A/B Test; Defense Application: Targeted Education; Defense Evaluation: Security Exercise]
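The S, I and R labels suggest compartmental epidemic models of the SIR family. Whether the slide's panels use exactly the classic form is an assumption. The sketch below is a minimal discrete-time SIR simulation on population fractions. Reading S as not-yet-compromised accounts, I as compromised accounts and R as remediated accounts is an interpretation, and the function name and the `beta` and `gamma` values are hypothetical.

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps):
    """Discrete-time SIR model on population fractions.

    beta:  per-step infection rate (S -> I), scaled by the S * I contact term.
    gamma: per-step recovery rate (I -> R).
    Returns a list of (S, I, R) tuples, one per step including the initial state.
    """
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = min(beta * s * i, s)  # cannot infect more than remain susceptible
        new_recoveries = min(gamma * i, i)     # cannot recover more than are infected
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical parameters: 1% initially compromised, beta = 0.5, gamma = 0.1.
trace = simulate_sir(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, steps=50)
peak_step, peak_state = max(enumerate(trace), key=lambda item: item[1][1])
print(f"peak compromised fraction {peak_state[1]:.3f} at step {peak_step}")
```

In such a model, a defense that lowers `beta` for part of the population (for example, the accounts predicted to be vulnerable) can be compared against a uniform defense by re-running the simulation with different parameters.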
■ Discussion □ Context-Specific Defenses
[Figure labels: Vulnerable, Robust; Long-Term Vulnerability Scores; Context-Specific Vulnerability Scores; Proposed Defenses]
■ Discussion □ Results Presented as Lower Bounds
■ Limited Access to User Data
■ Limited Computational Resources
■ Imperfect Ground Truth
■ Aggressive Pruning Heuristics
■ Discussion □ Buffer Window Sizing
■ Discussion □ Social Engineering
“Social engineering, in the context of information security, refers to psychological manipulation of people into performing actions or divulging confidential information. A type of confidence trick for the purpose of information gathering, fraud, or system access, it differs from a traditional "con" in that it is often one of many steps in a more complex fraud scheme.”
■ Discussion □ Focusing on the Vulnerable Population as a Key Defense Element
■ Cost of attack
■ Multi-Stage Attacks
■ Similar dynamics to epidemics
■ Discussion □ Advantages of Proposed Paradigm
■ Targeted
■ Efficient
■ Proactive
■ Robust
■ Discussion □ Robustness
■ Current defenses are attack/attacker centric
■ Based on attacker-controlled behavior/features
■ Attackers can employ adversarial strategies
■ Discussion □ Reactive Defenses
Focus on identifying attacks/attackers
[Figure: attack/defense cycle. Phases: Begin Attack, Initial Detection, Defender Responds, Attacker Detects. Stages: Attack, Detect, Defense, Mutate. Regions: Attacker Controls, Defender Controls.]
[SNS’11] Tao Stein, Erdong Chen, and Karan Mangla. 2011. Facebook immune system. In Proceedings of the 4th Workshop on Social Network Systems (SNS'11). ACM, pp. 8, New York, NY, USA.
■ Discussion □ User Education
■ First line of defense
■ Direct cost (attack) vs. Indirect cost (effort)
■ Distribute cost proportional to user vulnerability
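Distributing cost in proportion to vulnerability can be sketched as a simple proportional split of a fixed education-effort budget. The function `allocate_effort`, the user names, the scores and the budget unit are all hypothetical. The slides do not specify how effort would actually be assigned.

```python
def allocate_effort(vulnerability, total_budget):
    """Split `total_budget` across users in proportion to their vulnerability score.

    vulnerability: dict mapping user id to a non-negative score.
    Returns a dict mapping user id to that user's share of the budget.
    """
    total = sum(vulnerability.values())
    if total <= 0:
        raise ValueError("at least one user must have a positive score")
    return {user: total_budget * score / total
            for user, score in vulnerability.items()}


# Hypothetical scores: carol is predicted to be six times as vulnerable as alice.
shares = allocate_effort({"alice": 1, "bob": 3, "carol": 6}, total_budget=100.0)
for user, share in shares.items():
    print(f"{user}: {share:.1f}")
```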
■ Discussion □ Legal/Ethical Considerations
■ Paternalism
■ Fairness (Service Discrimination)
■ Discussion □ Adoption Challenges
■ Feasibility of developing a vulnerability classifier
■ Inaccuracies in predicting the vulnerable population
■ Some defense mechanisms may violate user expectations
■ Targeted protection may be confusing / complex
■ Discussion □ Related Work
■ Offline Worlds
■ Online Worlds
■ Our Experience
■ Discussion □ Our Experience (Integro)
■ Large-scale social-bot infiltration feasible
■ Defense system leveraging the proposed paradigm
■ Deployed at Telefonica’s OSN Tuenti (50+ million users)
■ Discussion □ Integro
[ECS’16] Boshmaf, Y., Logothetis, D., Siganos, G., Lería, J., Lorenzo, J., Ripeanu, M., Beznosov, K., and Halawa, H. (2016). Íntegro: Leveraging Victim Prediction for Robust Fake Account Detection in Large Scale OSNs. Elsevier Computers & Security. 61: 142-168.