automatically evading classifiers - ndss symposium · based on genetic programming automated...
Automatically Evading Classifiers: A Case Study on PDF Malware Classifiers
Weilin Xu, David Evans, Yanjun Qi
University of Virginia
Machine Learning is Solving Our Problems
Spam, IDS, Malware, Fake Accounts, …
Machine Learning is Eating the World
Data Scientist
Security Expert
?
No! Security is different.
Security Tasks are Different: Adversary Adapts
Goal: Understand classifiers under attack.
Results: Vulnerable to automated evasion.
Building Machine Learning Classifiers
Training (Supervised Learning): Labelled Training Data → Feature Extraction → Vectors → ML Algorithm → Trained Classifier
Assumption: Training Data is Representative
Training (Supervised Learning): Labelled Training Data → Feature Extraction → Vectors → ML Algorithm → Trained Classifier
Deployment: Operational Data → Trained Classifier → Malicious / Benign
Results: Evaded PDF Malware Classifiers

| | PDFrate* [ACSAC’12] | Hidost [NDSS’13] |
| --- | --- | --- |
| Accuracy | 0.9976 | 0.9996 |
| False Negative Rate | 0.0000 | 0.0056 |
| False Negative Rate with Adversary | 1.0000 | 1.0000 |

* Mimicus [Oakland ’14], an open source reimplementation of PDFrate.
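For readers less familiar with the metrics in this table, here is a minimal sketch of how accuracy and false negative rate are usually computed. The labels are invented toy data, not the paper's dataset, and the function names are illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def false_negative_rate(y_true, y_pred):
    """Fraction of truly malicious samples (label 1) predicted benign (label 0)."""
    malicious_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not malicious_preds:
        raise ValueError("no malicious samples to evaluate")
    missed = sum(1 for p in malicious_preds if p == 0)
    return missed / len(malicious_preds)


# Toy labels (assumed for illustration): 1 = malicious, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]

print(accuracy(y_true, y_pred))             # 7 of 8 correct
print(false_negative_rate(y_true, y_pred))  # 1 of 4 malicious samples missed
```

A false negative rate of 1.0000 "with adversary" means every malicious sample the adversary submitted was labelled benign.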
Very robust against “strongest conceivable mimicry attack”.
Automated Evasion Approach
Based on Genetic Programming
Malicious PDF → Clone → Variants → Mutation (with Benign PDFs) → Variants → Select Variants (✓✓✗✓) → Variants
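The clone, mutate, select loop on this slide can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: a "PDF" is a list of bits, the oracle only checks the first bit, the target classifier is the share of set bits, and `LOW_FITNESS`, the 0.5 threshold, and all function names are hypothetical.

```python
import random

LOW_FITNESS = -1.0  # assumed penalty for variants that lost the malicious behavior


def oracle(variant):
    """Toy oracle: the variant is still 'malicious' if its first bit is set."""
    return variant[0] == 1


def classifier_score(variant):
    """Toy target classifier: share of set bits after the first (1.0 = most malicious)."""
    tail = variant[1:]
    return sum(tail) / len(tail)


def fitness(variant):
    """Higher is better; positive means still malicious and scored below 0.5."""
    if not oracle(variant):
        return LOW_FITNESS
    return 0.5 - classifier_score(variant)


def mutate(variant, rng):
    """Return a copy of the variant with one randomly chosen bit flipped."""
    child = list(variant)
    i = rng.randrange(len(child))
    child[i] ^= 1
    return child


def evolve(seed, pop_size=20, max_generations=200, rng=None):
    """Clone the seed, then mutate and select until an evasive variant appears."""
    rng = rng or random.Random(0)
    population = [list(seed) for _ in range(pop_size)]        # Clone
    for generation in range(1, max_generations + 1):
        variants = [mutate(v, rng) for v in population]       # Mutation
        for v in variants:
            if fitness(v) > 0:                                # evasive and still malicious
                return v, generation
        ranked = sorted(variants, key=fitness, reverse=True)
        survivors = ranked[: pop_size // 2]                   # Select Variants
        population = [rng.choice(survivors) for _ in range(pop_size)]
    return None, max_generations


seed = [1] * 9  # toy "malicious PDF" that the toy classifier scores as 1.0
evasive, generations = evolve(seed)
print(evasive, generations)
```

Selection here simply keeps the better half of each generation; other selection schemes would fit the same loop.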
Modified Parser → /Root, /Catalog, /Pages, 0, /JavaScript, eval(‘…’);
Reference: “Extract Me If You Can: Abusing PDF Parsers in Malware Detectors”, Curtis Carmony, et al.
Mutation: Insert / Replace / Delete (From Benign) → Variants
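The three mutation operators can be illustrated on a toy object tree. This is a sketch under assumptions: the PDF is modelled as a nested dict, the tree contents and helper names (`list_paths`, `get_parent`, `mutate`) are hypothetical, and insert and replace take their material from a benign tree as the slide suggests.

```python
import copy
import random


def list_paths(tree, prefix=()):
    """Return the path (tuple of keys) to every node below the root."""
    paths = []
    for key, child in tree.items():
        path = prefix + (key,)
        paths.append(path)
        if isinstance(child, dict):
            paths.extend(list_paths(child, path))
    return paths


def get_parent(tree, path):
    """Return the dict that directly contains the node at `path`."""
    node = tree
    for key in path[:-1]:
        node = node[key]
    return node


def mutate(malicious, benign, rng):
    """Apply one random insert, replace, or delete; the inputs are left untouched."""
    variant = copy.deepcopy(malicious)
    target = rng.choice(list_paths(variant))   # assumes a non-empty tree
    parent = get_parent(variant, target)
    op = rng.choice(["insert", "replace", "delete"])
    if op == "delete":
        del parent[target[-1]]
    else:
        donor_path = rng.choice(list_paths(benign))
        donor = copy.deepcopy(get_parent(benign, donor_path)[donor_path[-1]])
        if op == "replace":
            parent[target[-1]] = donor
        else:
            # Insert next to the target under the donor's own key
            # (this overwrites a sibling that already uses that key).
            parent[donor_path[-1]] = donor
    return variant, op


# Toy trees, loosely following the labels on the slide (contents are invented).
malicious = {"/Root": {"/Pages": {"/Count": 0}, "/JavaScript": "eval('…');"}}
benign = {"/Root": {"/Pages": {"/Count": 1, "/Kids": {}}, "/Metadata": "benign"}}

variant, op = mutate(malicious, benign, random.Random(1))
print(op, variant)
```

Nothing in the operators checks whether the variant is still malicious; that is left to the selection step.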
Fitness Function
Variants → Oracle → Malicious?
Variants → Target Classifier → Score (Malicious / Benign)
Malicious? + Score → f(x) → Fitness Score
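One way to combine the oracle's verdict with the target classifier's score into a single fitness value is sketched below. The form of f(x) is an assumption made for illustration: the classifier is taken to return a maliciousness score in [0, 1], 0.5 is taken as the decision threshold, and `LOW_FITNESS` and the toy oracle and classifier are hypothetical.

```python
LOW_FITNESS = -1.0  # assumed penalty when the oracle says the behavior was lost


def fitness(variant, oracle, classifier, threshold=0.5):
    """Score a variant: positive means still malicious and classified as benign."""
    if not oracle(variant):
        return LOW_FITNESS            # not malicious any more, so useless to the attacker
    return threshold - classifier(variant)


# Toy stand-ins (assumptions): variants are strings.
def toy_oracle(variant):
    return "payload" in variant


def toy_classifier(variant):
    return 0.75 if "suspicious" in variant else 0.25


print(fitness("payload suspicious", toy_oracle, toy_classifier))  # detected
print(fitness("payload", toy_oracle, toy_classifier))             # evasive
print(fitness("harmless", toy_oracle, toy_classifier))            # no longer malicious
```

Ranking variants by this value favors those that keep the malicious behavior while moving toward the benign side of the classifier.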
Results: Evaded PDFrate 100%
Figure: Original Malware Seeds, Evasive Variants
Evaded PDFrate with Adjusted Threshold
Figure: Original Malware Seeds, Evasive Variants, Evasive Variants with lower threshold
Results: Evaded Hidost 100%
Figure: Original Malware Seeds, Evasive Variants
Results: Accumulated Evasion Rate
Difficulty varies by seed: simple mutations often work; complex mutations are sometimes needed.
Difficulty varies by target: PDFrate took 6 days to evade all; Hidost took 2 days to evade all.
Cross-Evasion Effects
PDF Malware Seeds → Automated Evasion (against Hidost) → Evasive PDF Malware
PDFrate: 387/500 Evasive (77.4%)
Gmail: 3/500 Evasive (0.6%)
Gmail’s classifier is secure?
Gmail’s classifier is different.
Evading Gmail’s Classifier
Evasion rate on : 135/380 (35.5%)
Evasion rate on : 179/380 (47.1%)
Conclusion
Source Code: http://EvadeML.org
Vs.
Who will win this arms race?