Identifying Web Attacks Via Data Analysis
DESCRIPTION
This presentation looks at detecting SQL injection with machine learning, as well as profiling web traffic to find misbehaving hosts. The goal is to get beyond "Top N" types of analysis and begin using multiple features to guide us towards interesting traffic. These techniques work with multiple log types, everything from web server logs to proxy logs.
TRANSCRIPT
![Page 1: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/1.jpg)
![Page 2: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/2.jpg)
Mike Sconzo
@sooshie
R&D at Click Security
Focused on data analysis for security use cases
Interested in machine learning/statistical analysis
NetWitness
ERCOT
Sandia National Labs
![Page 3: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/3.jpg)
● Introduction
● How to use basic log information to detect different attack types
  ○ Drive-by
  ○ SQL Injection
● Closing
![Page 4: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/4.jpg)
● Python
  ○ IPython
  ○ pandas
  ○ numpy
  ○ matplotlib
  ○ scikit learn
● Bro
● Google
● sqlmap
● JBroFuzz
● sqlparse
![Page 5: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/5.jpg)
● Gather data
● Clean up data
● Explore data
● Select/create features (numeric only)*
● Run machine learning algorithm*
● Analyze results
*optional
![Page 6: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/6.jpg)
![Page 7: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/7.jpg)
Is it possible to find clients being exploited by various exploit kits by just looking at traffic patterns?
● Gather data
● Clean up data
● Explore data
● Analyze results
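As a rough illustration of profiling clients on several features at once rather than a single "Top N" count, here is a minimal sketch. The record layout, the set of "risky" MIME types, and the thresholds are all assumptions made for this example; they are not taken from the slides, and the sample rows are invented.

```python
from collections import defaultdict

# Hypothetical HTTP log records: (client_ip, server_host, mime_type).
# These rows are invented for illustration only.
records = [
    ("10.0.0.5", "news.example", "text/html"),
    ("10.0.0.5", "news.example", "image/png"),
    ("10.0.0.7", "blog.example", "text/html"),
    ("10.0.0.7", "ads.example", "application/javascript"),
    ("10.0.0.7", "kit.example", "application/java-archive"),
    ("10.0.0.7", "kit.example", "application/x-dosexec"),
]

# Assumption: which MIME types count as "risky" is a choice made for this sketch.
RISKY_MIME = {"application/java-archive", "application/x-dosexec", "application/pdf"}


def profile_clients(rows):
    """Build several per-client features instead of one request count."""
    stats = defaultdict(lambda: {"requests": 0, "hosts": set(), "risky": 0})
    for client, host, mime in rows:
        entry = stats[client]
        entry["requests"] += 1
        entry["hosts"].add(host)
        if mime in RISKY_MIME:
            entry["risky"] += 1
    return {
        client: {
            "requests": e["requests"],
            "distinct_hosts": len(e["hosts"]),
            "risky_ratio": e["risky"] / e["requests"],
        }
        for client, e in stats.items()
    }


profiles = profile_clients(records)

# Assumption: the thresholds below are arbitrary and only show how
# several features can be combined to pick out clients worth a look.
suspects = sorted(
    client
    for client, p in profiles.items()
    if p["distinct_hosts"] >= 3 and p["risky_ratio"] > 0.25
)
print(suspects)
```

Combining features this way narrows attention to clients that touch several hosts and pull a high share of executable or archive content, which a plain request count would not surface.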
![Page 8: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/8.jpg)
![Page 9: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/9.jpg)
● 21GB of Network Traffic
● 7600 Samples
● 687627 Files
● 807537 HTTP Requests
![Page 10: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/10.jpg)
*MHR (Malware Hash Registry) will be used as our ground truth
![Page 11: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/11.jpg)
![Page 12: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/12.jpg)
![Page 13: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/13.jpg)
![Page 14: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/14.jpg)
![Page 15: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/15.jpg)
![Page 16: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/16.jpg)
![Page 17: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/17.jpg)
![Page 18: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/18.jpg)
![Page 19: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/19.jpg)
![Page 20: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/20.jpg)
![Page 21: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/21.jpg)
![Page 22: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/22.jpg)
![Page 23: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/23.jpg)
Is it possible to use supervised learning (classification) to detect strings that are likely SQL Injection?
● Gather data
● Explore data
● Clean up data
● Transform data
● Select/create features (numeric only)
● Run machine learning algorithm
● Analyze results
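The steps above can be sketched end to end with a very small classifier. The transcript does not say which algorithm the talk uses, so this example swaps in a tiny multinomial naive Bayes over character bigrams, written with the standard library only. The training strings, labels, and class names are invented for illustration.

```python
import math
from collections import Counter


def bigrams(text):
    """Turn a string into overlapping two-character features."""
    text = text.lower()
    return [text[i:i + 2] for i in range(len(text) - 1)]


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.docs = Counter(labels)   # label -> number of training strings
        self.counts = {}              # label -> Counter of bigram counts
        for text, label in zip(texts, labels):
            self.counts.setdefault(label, Counter()).update(bigrams(text))
        self.vocab = {g for c in self.counts.values() for g in c}
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, grams in self.counts.items():
            total = sum(grams.values())
            score = math.log(self.docs[label] / total_docs)
            for g in bigrams(text):
                if g not in self.vocab:
                    continue  # ignore bigrams never seen in training
                score += math.log((grams[g] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data: a real run would need far more, and more varied, samples.
texts = [
    "' or 1=1 --",
    "1' union select null --",
    "admin' or 'a'='a",
    "john smith",
    "hello world",
    "search term here",
]
labels = ["sqli", "sqli", "sqli", "legit", "legit", "legit"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("x' or 1=1"))
print(model.predict("hello john"))
```

With six training strings this only demonstrates the mechanics (gather, transform to features, train, predict). Any accuracy claim would need a held-out test set.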
![Page 24: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/24.jpg)
![Page 25: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/25.jpg)
![Page 26: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/26.jpg)
![Page 27: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/27.jpg)
![Page 28: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/28.jpg)
![Page 29: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/29.jpg)
![Page 30: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/30.jpg)
*Transform the data into a form that might give better insight than a signature
![Page 31: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/31.jpg)
![Page 32: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/32.jpg)
● Strings are great, but patterns might be better
● Extract patterns from the strings
● N-Grams!!!
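A minimal sketch of n-gram extraction follows. It assumes character-level n-grams over the raw string. The talk may instead build n-grams over parsed tokens (sqlparse appears in the tool list), so treat the granularity here as an assumption; the sample string is invented.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return every overlapping run of n characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Invented sample input; lower-cased so "OR" and "or" yield the same n-grams.
sample = "1' OR '1'='1"
trigrams = Counter(char_ngrams(sample.lower(), 3))

# The counts can serve as numeric features for a learning algorithm.
for gram, count in trigrams.most_common(5):
    print(repr(gram), count)
```

Counting n-grams turns each string into numbers, which is what the "numeric only" feature step requires.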
![Page 33: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/33.jpg)
![Page 34: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/34.jpg)
![Page 35: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/35.jpg)
![Page 36: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/36.jpg)
![Page 37: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/37.jpg)
![Page 38: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/38.jpg)
![Page 39: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/39.jpg)
![Page 40: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/40.jpg)
![Page 41: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/41.jpg)
![Page 42: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/42.jpg)
![Page 43: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/43.jpg)
![Page 44: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/44.jpg)
![Page 45: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/45.jpg)
● It’s possible to make quality decisions/find interesting activity using data
● The more data you have, the more accurate your predictions can be
● Gathering (the right) data for the use case is important
● Cleaning the data takes a lot of effort, but it’s necessary
● Unfortunately none of this is a silver bullet, but it can help point you in the right direction(s)
● None of this is magic; you can do it too!
![Page 46: Identifying Web Attacks Via Data Analysis](https://reader033.vdocuments.mx/reader033/viewer/2022052903/5575b0b1d8b42a3b498b4cbe/html5/thumbnails/46.jpg)
http://clicksecurity.github.io/data_hacking/