1
Using SQL Hotspots in a Prioritization Heuristic for Detecting All Types of Web Application Vulnerabilities
Ben Smith and Laurie Williams
2
Input Validation Vulnerabilities
• There is a plethora of proposed mitigation techniques, but no solution eliminates all vulnerabilities.
• In the CWE/SANS Top 25 for 2009.
• Continue to be in the CWE/SANS Top 25 for 2010.
• Also indicated by SANS as the most common attacks for compromising web sites.
3
How do we stop this?
• Development organizations do not have the time or resources to detect vulnerabilities in every source file before release.
• Validation and verification must be prioritized to start with vulnerable files first.
• SQL hotspots may help with this prioritization process.
• Though typically associated with SQL injection, hotspots may be useful for predicting any type of vulnerability.
4
Goal
The goal of this research is to improve the prioritization of security fortification efforts by investigating the ability of SQL hotspots to be used as the basis for a heuristic for the prediction of all vulnerability types.
5
Agenda
• What are SQL hotspots?
• Case Studies
  – Projects
  – Methodology
• Results: Eight Hypotheses about Hotspots
• Conclusion: A heuristic for prioritizing V&V efforts
6
SQL Hotspot
A SQL Hotspot is any point in the application source code where the program interacts with a database management system.
Typically indicated with mysql_query() or other library functions in PHP.
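A minimal sketch of how hotspots could be counted, assuming a hotspot is approximated by a call to mysql_query() in PHP source. The pattern, the function name count_hotspots, and the sample source are illustrative assumptions, not the tooling used in the study. A plain text match like this also counts calls inside comments or strings.

```python
import re

# Assumption: a hotspot is approximated by a call to PHP's mysql_query().
# Other database library functions would need to be added to the pattern.
HOTSPOT_PATTERN = re.compile(r"\bmysql_query\s*\(")


def count_hotspots(php_source):
    """Return the number of mysql_query( call sites in a PHP source string."""
    return len(HOTSPOT_PATTERN.findall(php_source))


# Hypothetical PHP file content, for illustration only.
sample = '''<?php
$result = mysql_query("select * from users");
$row = mysql_fetch_array($result);
mysql_query ("delete from sessions");
my_mysql_query_helper();
?>'''

hotspots_in_sample = count_hotspots(sample)
print(hotspots_in_sample)
```

The word boundary keeps identifiers that merely contain the name, such as the hypothetical my_mysql_query_helper, from being counted.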
7
8
SQL Hotspots (2)
$username = $_POST['username'];
$password = $_POST['password'];
// SQL hotspot: user input is interpolated into the query without escaping
$result = mysql_query("select * from users where username = '$username' AND password = '$password'");
$firstresult = mysql_fetch_array($result);
$role = $firstresult['role'];
$_COOKIE['userrole'] = $role;
Study Subjects
• WordPress
  – Advanced blog management
  – 74% of bloggers run WordPress
  – Uses MySQL and PHP
  – 138,967 SLOC
• WikkaWiki
  – Wiki management system
  – 532 websites are using WikkaWiki
  – Uses MySQL and PHP
  – 46,025 SLOC
9
10
CWE Classifications
11
[Figure panels: WordPress, WikkaWiki]
Tracing Vulnerabilities to Files
12
[Figure panels: WikkaWiki, WordPress]
Detecting Hotspots
13
Prediction Model
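• Contained two terms: number of hotspots and SLOC
• Logistic regression
• Trained on releases 1…N, tested on release N+1 (e.g., trained on 1.0 to 1.3, tested on 1.4).
• Evaluated by counting true positives (tp), true negatives (tn), false positives (fp), and false negatives (fn)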
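The model on this slide can be sketched as follows: a two-term logistic regression (hotspots, SLOC) fitted on earlier releases and scored on the next one. This is a sketch under stated assumptions: the per-file data is made up, the fit uses plain gradient descent instead of R or Weka, and the scaling constants and 0.5 threshold are arbitrary choices.

```python
import math

# Hypothetical data: release -> list of (hotspots, sloc, vulnerable) per file.
RELEASES = {
    "1.0": [(0, 120, 0), (1, 300, 0), (0, 800, 0), (12, 400, 1), (25, 900, 1), (8, 250, 1)],
    "1.1": [(0, 130, 0), (1, 310, 0), (0, 820, 0), (13, 420, 1), (25, 910, 1), (9, 260, 1)],
    "1.2": [(0, 150, 0), (2, 330, 0), (0, 850, 0), (13, 450, 1), (27, 950, 1), (9, 270, 1)],
    "1.3": [(0, 150, 0), (2, 340, 0), (1, 860, 0), (14, 460, 1), (28, 990, 1), (10, 280, 1)],
    "1.4": [(0, 160, 0), (2, 350, 0), (1, 900, 0), (15, 480, 1), (30, 1000, 1), (10, 300, 1)],
}


def features(hotspots, sloc):
    # Arbitrary scaling so both terms have a similar magnitude.
    return hotspots / 10.0, sloc / 1000.0


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def probability(weights, hotspots, sloc):
    h, s = features(hotspots, sloc)
    return sigmoid(weights[0] + weights[1] * h + weights[2] * s)


def fit_logistic(files, learning_rate=0.3, epochs=2000):
    """Fit intercept, hotspot and SLOC coefficients by gradient descent."""
    weights = [0.0, 0.0, 0.0]
    n = len(files)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for hotspots, sloc, vulnerable in files:
            h, s = features(hotspots, sloc)
            error = probability(weights, hotspots, sloc) - vulnerable
            grad[0] += error
            grad[1] += error * h
            grad[2] += error * s
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


def evaluate(weights, files, threshold=0.5):
    """Count tp, tn, fp, fn for files of the held-out release."""
    tp = tn = fp = fn = 0
    for hotspots, sloc, vulnerable in files:
        predicted = probability(weights, hotspots, sloc) >= threshold
        if predicted and vulnerable:
            tp += 1
        elif predicted and not vulnerable:
            fp += 1
        elif vulnerable:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


# Train on releases 1.0 to 1.3, test on release 1.4.
train = [f for release in ("1.0", "1.1", "1.2", "1.3") for f in RELEASES[release]]
weights = fit_logistic(train)
tp, tn, fp, fn = evaluate(weights, RELEASES["1.4"])
print(weights, tp, tn, fp, fn)
```

A positive hotspot coefficient (weights[1]) is what a claim like "more hotspots, more likely vulnerable" would look like in such a model.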
14
Descriptive Statistics
                                            WordPress        WikkaWiki
Releases analyzed                           Nine             Six
Security reports analyzed                   97               61
Vulnerable files                            26% (85 / 326)   29% (44 / 209)
Average hotspots                            255              92
Average files having at least one hotspot   14.2%            8.42%
15
Used the open source tools R to test statistical hypotheses and Weka for model evaluation.
Hypotheses about Files
H1: The more hotspots a file contains per line of code, the more likely it is that the file contains any type of web application vulnerability (Logit, p < 0.05).
H2: The more hotspots a file contains, the more times that file was changed due to any kind of vulnerability (SLR, p < 0.0001, adjusted R² = 0.4208, 0.3802).
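For readers unfamiliar with the statistic reported for H2, here is a minimal sketch of a simple linear regression (SLR) and its adjusted R², computed by hand. The data points are made up for illustration and the function name is an assumption; the study itself used R.

```python
def simple_linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared, adjusted_r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    # One predictor, so the residual degrees of freedom are n - 2.
    adjusted = 1.0 - (1.0 - r_squared) * (n - 1) / (n - 2)
    return slope, intercept, r_squared, adjusted


# Hypothetical files: hotspots per file and security-related changes per file.
hotspots = [0, 2, 5, 9, 14]
changes = [0, 1, 1, 3, 4]

slope, intercept, r_squared, adjusted = simple_linear_regression(hotspots, changes)
print(slope, intercept, r_squared, adjusted)
```

Adjusted R² penalizes R² for the number of predictors, so it is always at most R².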
16
Hypotheses about Issue Reports
H3: Input validation vulnerabilities result in a higher average number of repository revisions than any other type of vulnerability (Mann-Whitney-Wilcoxon test, p < 0.05).
(Consistent with the SANS report.)
17
Hypotheses about Prediction
H4: Hotspots can be used to predict files that will contain any type of web application vulnerability in the current release (predictive model that does better than a random guess).
H5: The more hotspots a file contains, the more likely that file will be vulnerable in the next release (coefficients on predictive model).
18
Model Performance - WordPress
19
Hypotheses Comparing Projects
H6: The average number of hotspots per file is more variable in WordPress than WikkaWiki. (F-test, p < 0.000001)
H7: WordPress suffered a higher proportion of input validation vulnerabilities than WikkaWiki. (Chi-Squared Test, p = 0.0692)
H8: In WordPress, more lines of code that were changed due to security issues were hotspots than in WikkaWiki. (Chi-Squared Test, p < 0.000001)
20
Limitations
• We can never find or know all vulnerabilities.
• Our definition of a hotspot may be insufficient or incorrect.
• Issue reports were subject to human error both in reporting and in analyzing.
• We are limited to these two open source projects.
21
Conclusion
• Hotspots can be used in a V&V prioritization heuristic as follows: more SQL and non-SQL vulnerabilities will be found in files that contain more hotspots per line of code.
• Input validation vulnerabilities: prominent problem, no single solution.
• Separating the concern of database interaction is associated with a decrease in the proportion of reported input validation vulnerabilities.
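The heuristic in the first bullet can be sketched as a ranking of files by hotspots per line of code, so that V&V effort starts with the densest files. The file names and counts below are hypothetical, and the function name is an assumption.

```python
def rank_by_hotspot_density(files):
    """Order file paths by hotspots per line of code, densest first.

    files: list of (path, hotspots, sloc) tuples. Empty files are skipped.
    """
    scored = [(hotspots / sloc, path) for path, hotspots, sloc in files if sloc > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [path for _, path in scored]


# Hypothetical project files, for illustration only.
project = [
    ("admin/users.php", 12, 400),  # 0.03 hotspots per line
    ("lib/format.php", 0, 900),    # 0.00 hotspots per line
    ("login.php", 5, 50),          # 0.10 hotspots per line
]

review_order = rank_by_hotspot_density(project)
print(review_order)
```

Ranking by density rather than raw count keeps large files from dominating the list purely because of their size.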
22
Thank you!
• Any questions?
23
Precision & Recall
24
Precision: a measure of the level of exactness exhibited by the model.
Recall: the share of the vulnerable files that the model retrieves.
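Both measures follow from the tp, fp, and fn counts of a prediction run. A minimal sketch with made-up counts:

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion counts.

    precision = tp / (tp + fp): how many flagged files were vulnerable.
    recall    = tp / (tp + fn): how many vulnerable files were flagged.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 30 vulnerable files flagged, 10 false alarms, 20 missed.
precision, recall = precision_recall(tp=30, fp=10, fn=20)
print(precision, recall)
```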
25
SQL Injection Attacks
Attacker-supplied username: ' OR 1=1 --
$username = $_POST['username'];
$password = $_POST['password'];
// The query as it reads once the attacker's username is substituted in:
$result = mysql_query("select * from users where username ='' OR 1=1 --' AND password = '$password'");
$firstresult = mysql_fetch_array($result);
$role = $firstresult['role'];
$_COOKIE['userrole'] = $role;
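The effect of this input can be reproduced in a self-contained sketch. It assumes an in-memory SQLite database as a stand-in for MySQL, a made-up users table, and a hypothetical build_query helper that mirrors the string interpolation on the slide. A parameterized query is included only for contrast.

```python
import sqlite3


def build_query(username, password):
    # Deliberately unsafe: mirrors the slide's string interpolation.
    return ("select * from users where username = '%s' AND password = '%s'"
            % (username, password))


conn = sqlite3.connect(":memory:")
conn.execute("create table users (username text, password text, role text)")
conn.execute("insert into users values ('admin', 's3cret', 'administrator')")

# A wrong password returns no rows, as intended.
honest_rows = conn.execute(build_query("admin", "wrong")).fetchall()

# The injected quote closes the string literal, OR 1=1 matches every row,
# and -- comments out the password check.
attack_rows = conn.execute(build_query("' OR 1=1 --", "anything")).fetchall()

# With bound parameters the same input is treated as data, not SQL.
safe_rows = conn.execute(
    "select * from users where username = ? AND password = ?",
    ("' OR 1=1 --", "anything"),
).fetchall()

print(honest_rows, attack_rows, safe_rows)
```

The attack returns the stored row without a valid password, which in the slide's code would hand the attacker that user's role.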