When Malware Is Packing Heat
Davide Balzarotti and Giovanni Vigna
USENIX Enigma 2018
Packing
Researchers often have a limited understanding of the complexity of runtime packers
AV software often misclassifies benign packed samples as malicious
We all love ML, but in the presence of packing it just learns the wrong thing
[Figure: Layer 1, Layer 2, …, Layer n; legend: packer code, application code]
Complexity Classes
[Class I] a single unpacking routine is executed before control is transferred to the unpacked program
[Class II] multiple unpacking layers are executed sequentially and lead to the original code at the end
[Class III] intermediate layers are executed in loops
[Class IV] the packer code is interleaved with the execution of the unpacked program
[Class V] pieces of the original program are unpacked on-demand
[Class VI] only a single fragment of the original program (as little as a single instruction) is unpacked in memory at any moment in time
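The six classes above can be read as an ordered checklist. The following is only a minimal sketch of that reading: the `UnpackingTrace` fields and the rule "assign the highest class whose property was observed" are assumptions made for illustration, not a tool or method described in the talk.

```python
from dataclasses import dataclass
from enum import IntEnum


class PackerClass(IntEnum):
    """The six complexity classes listed on the slide."""
    CLASS_I = 1
    CLASS_II = 2
    CLASS_III = 3
    CLASS_IV = 4
    CLASS_V = 5
    CLASS_VI = 6


@dataclass
class UnpackingTrace:
    """Hypothetical summary of one observed run of a packed binary."""
    layers: int = 1                 # number of unpacking layers seen
    layers_loop: bool = False       # intermediate layers executed in loops
    interleaved: bool = False       # packer code interleaved with program code
    on_demand: bool = False         # original code unpacked piece by piece
    single_fragment: bool = False   # only one fragment in memory at a time


def classify(trace: UnpackingTrace) -> PackerClass:
    """Return the highest class whose defining property was observed."""
    if trace.single_fragment:
        return PackerClass.CLASS_VI
    if trace.on_demand:
        return PackerClass.CLASS_V
    if trace.interleaved:
        return PackerClass.CLASS_IV
    if trace.layers_loop:
        return PackerClass.CLASS_III
    if trace.layers > 1:
        return PackerClass.CLASS_II
    return PackerClass.CLASS_I


simple = classify(UnpackingTrace(layers=1))
layered = classify(UnpackingTrace(layers=3))
demand = classify(UnpackingTrace(layers=4, interleaved=True, on_demand=True))
```

Checking the most complex property first means a trace that shows several behaviors is labeled by the hardest one to unpack.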
Off-The-Shelf Packers Custom Malware Packers
Ange Albertini 2009-2010 Creative Commons Attribution http://corkami.blogspot.com
Why Does Packing Matter?
§ Dynamic analysis techniques (e.g., sandboxes) have been introduced to deal with packing…
§ …but static analysis techniques are more efficient!
An Experiment
§ Benign programs from Windows OSs (XP, Vista, 7, NT)
  § 7983 samples
§ Packed with 4 different packers
  § 16663 samples
§ Submitted to VirusTotal
  § Looking for 10+ detections
§ See: http://sarvamblog.blogspot.com/2013/05/nearly-70-of-packed-windows-system.html
Results
§ UPX: 0% False Positives
§ BEP: 72.78% False Positives
§ NsPack: 98.72% False Positives
§ Upack: 99.88% False Positives
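A false-positive percentage of this kind can be computed from per-sample detection counts. The sketch below assumes a sample counts as a false positive when at least 10 engines flag it, following the "10+ detections" criterion of the experiment. The function name and the detection counts are made up for illustration; they are not data from the experiment and no VirusTotal API is used.

```python
def false_positive_rate(detection_counts, threshold=10):
    """Percentage of benign samples flagged by at least `threshold` engines."""
    if not detection_counts:
        raise ValueError("need at least one sample")
    flagged = sum(1 for n in detection_counts if n >= threshold)
    return 100.0 * flagged / len(detection_counts)


# Made-up detection counts for eight benign samples packed with one packer.
toy_counts = [0, 3, 12, 25, 10, 9, 40, 11]
toy_rate = false_positive_rate(toy_counts)  # 5 of 8 samples reach the threshold
```

Raising or lowering `threshold` changes the reported rate, so the criterion should be stated alongside any such percentage.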
Packing = Malware?
§ False Positives
§ Dataset Pollution
How Did We Get Here?
§ Machine Learning has been increasingly used to perform malware detection
§ The misclassification of packed binaries is the result of learning the wrong thing…
§ Let’s take a step back!
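What "learning the wrong thing" can look like is easy to reproduce on a toy example. The sketch below is an illustration under strong assumptions, not the classifier or data used in the talk: the two binary features (`packed`, `suspicious_api`), the synthetic samples, and the one-feature "decision stump" learner are all hypothetical. Every malicious training sample is packed, so the packing feature separates the training set perfectly and the learner selects it.

```python
FEATURES = ("packed", "suspicious_api")  # hypothetical binary features


def train_stump(samples):
    """Pick the single feature whose value best matches the label."""
    def accuracy(i):
        hits = sum(1 for features, label in samples if features[i] == label)
        return hits / len(samples)
    return max(range(len(FEATURES)), key=accuracy)


# Synthetic training set: (features, label), label 1 = malicious.
train = (
    [((1, 1), 1)] * 8    # packed malware that shows suspicious behavior
    + [((1, 0), 1)] * 2  # packed malware whose behavior is not observed
    + [((0, 0), 0)] * 8  # unpacked benign programs
    + [((0, 1), 0)] * 2  # unpacked benign programs that look suspicious
)

chosen = train_stump(train)  # "packed" scores 20/20, "suspicious_api" 16/20

# Benign programs that happen to be packed: the stump flags every one of them.
benign_packed = [(1, 0)] * 5
predictions = [features[chosen] for features in benign_packed]
fp_rate = 100.0 * sum(predictions) / len(benign_packed)
```

The stump reaches perfect training accuracy while encoding nothing about behavior, which is why training accuracy alone does not reveal the problem.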
What Is Machine Learning?
§ “Machine learning explores the study and construction of algorithms that can learn from and perform predictive analysis on data” https://en.wikipedia.org/wiki/Machine_learning
Why Machine Learning?
§ Supports data analysis
§ Supports characterization
§ Supports classification
Machine Learning
[Figure: classification questions: "Round?", "Has >3 sides?", …]
Machine Learning
§ Reds are bad
§ Blues, greens, oranges are good
§ What about greys?
Pitfalls in Machine Learning
Another Experiment
Insight: When most malware is packed, packing is what is actually learned
| ratio % | # packed malicious | # unpacked malicious | # unpacked benign | accuracy % (mean) | variance % | % false positive for benign packed |
|---|---|---|---|---|---|---|
| 0 | 0 | 5173 | 5173 | 94.607 | 0.007 | 7.033 |
| 5 | 258 | 4914 | 5173 | 93.766 | 0.039 | 20.038 |
| 10 | 517 | 4655 | 5173 | 93.283 | 0.107 | 20.577 |
| 15 | 775 | 4397 | 5173 | 94.452 | 0.003 | 26.673 |
| 20 | 1034 | 4138 | 5173 | 94.19 | 0.006 | 27.818 |
| 25 | 1293 | 3879 | 5173 | 94.065 | 0.01 | 29.85 |
| 30 | 1551 | 3621 | 5173 | 94.326 | 0.009 | 28.116 |
| 35 | 1810 | 3362 | 5173 | 94.412 | 0.009 | 31.882 |
| 40 | 2069 | 3103 | 5173 | 94.712 | 0.015 | 32.803 |
| 45 | 2327 | 2845 | 5173 | 94.751 | 0.018 | 35.025 |
| 50 | 2586 | 2586 | 5173 | 94.48 | 0.018 | 35.374 |
| 55 | 2845 | 2327 | 5173 | 94.75 | 0.02 | 35.913 |
| 60 | 3103 | 2069 | 5173 | 94.712 | 0.032 | 36.667 |
| 65 | 3362 | 1810 | 5173 | 94.982 | 0.052 | 36.41 |
| 70 | 3621 | 1551 | 5173 | 94.934 | 0.052 | 38.252 |
| 75 | 3879 | 1293 | 5173 | 95.466 | 0.071 | 34.602 |
| 80 | 4138 | 1034 | 5173 | 95.533 | 0.074 | 35.822 |
| 85 | 4397 | 775 | 5173 | 95.756 | 0.055 | 35.656 |
| 90 | 4655 | 517 | 5173 | 95.407 | 0.234 | 36.128 |
| 95 | 4914 | 258 | 5173 | 96.123 | 0.076 | 49.711 |
| 100 | 5173 | 0 | 5173 | 96.839 | 0.008 | 52.451 |
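The packed and unpacked malicious counts in the table are consistent with splitting 5173 malicious samples by the given ratio and rounding each side down. That rounding rule is an observation from the numbers, not something the slides state, and the function name below is hypothetical. A minimal sketch of the split:

```python
def split_malicious(total, packed_ratio_pct):
    """Split `total` malicious samples into (packed, unpacked) counts.

    Each side is rounded down independently, so the two counts can add
    up to one less than `total`.
    """
    if not 0 <= packed_ratio_pct <= 100:
        raise ValueError("ratio must be between 0 and 100")
    packed = total * packed_ratio_pct // 100
    unpacked = total * (100 - packed_ratio_pct) // 100
    return packed, unpacked


# Rebuild the first three columns of the table for every 5% step.
rows = [(ratio, *split_malicious(5173, ratio)) for ratio in range(0, 101, 5)]
```

Integer arithmetic is used on purpose: it avoids floating-point rounding surprises when the product lands close to a whole number.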
Conclusions
§ Applying machine learning to packed malware might lead to the detection of packing (and not the detection of malicious behavior), resulting in false positives
  § De-sensitization caused by false positives
  § Pollution of datasets
§ Sophisticated dynamic unpacking and analysis is necessary
Questions?
process by Roman from the Noun Project
Machine learning picture: https://xkcd.com/1838/