Tuomas Aura T-110.4206 Information security technology
Software security
Aalto University, autumn 2012
2
Outline
Untrusted input
Buffer overrun
SQL injection
Web vulnerabilities
Input validation
There is no one simple solution that would make programs
safe must learn about all the things that can go wrong
3
Untrusted input Software is typically specified with use cases
– Focus on desired functionality, not on handling exceptional situations
User and network input is untrusted and can be malicious must handle exceptional input safely
Does my software get input from the Internet? – Documents, media streams, messages, photos can all be
untrusted – Network access key part of application functionality – Intranet hosts, backend databases, office appliances etc.
are directly or indirectly exposed to input from the Internet
→ All software must handle malformed and malicious input
4
Format string vulnerability Vulnerable code:
int process_text(char *input){
char buffer[1000];
snprintf(buffer, 1000, input);
...
}
Good practice to use snprintf (not sprintf) in C, so this is not a typical buffer overrun
Bad idea to use the user input as the format string: – Input ”%x%x%x%x%x…” will print data from the stack
– Input ”%s%s%s%s%s…” will probably crash the program
– Input ”ABC%n” will write 3 to somewhere in the stack
5
Types of security bugs
Some typical security flaws: – Buffer overflows in stack or heap
– Integer overflows or signed/unsigned confusion
– Injection of untrusted input into the command line, SQL, Javascript, HTML, XML, format string, file path etc.
– Logical errors, e.g. time of check—time of use
– Crypto mistakes, bad random numbers
Most software bugs first seem harmless but eventually someone figures out how to build an exploit
BUFFER OVERRUN
6
Buffer overrun
Bug: failure to check for array boundary
#define MAXLEN 1000
char *process_input (char *input) {
char buffer[MAXLEN];
int i;
for (i = 0; buffer[i] = input[i]; i++);
...
Loops until a null character found; should check also for i < MAXLEN
Stack smashing Why are buffer overflows
a security issue?
#define MAXLEN 1000
char *process_input (char *input) {
char buffer[MAXLEN];
int i;
for (i = 0; buffer[i] = input[i]; i++);
...
Too long input overwrites the function return address and previous stack frames
Loops until a null character found; should check also for i < MAXLEN
Function arguments
Return address
Local variables
Function arguments
Return address
Local variables
Function arguments
Return address
buffer i
Anything
Call stack
Stack smashing Why are buffer overflows
a security issue?
#define MAXLEN 1000
char *process_input (char *input) {
char buffer[MAXLEN];
int i;
for (i = 0; i++; buffer[i] = input[i]);
...
When the function returns, execution will jump to the new return address, which points to attack code
Loops until a null character found; should check also for i < MAXLEN
Function arguments
Return address
Local variables
Function arguments
Return address
Local variables
Function arguments
Return address
buffer i
Call stack
Attack code
Attack code address
Anything
Buffer overruns Buffer overruns may cause
– data modification unexpected program behavior – access violation process crashing – malicious data modification or code injection
How attacker gains control: – Stack overruns may overwrite function return address or
exception handler address – Heap overruns may overwrite function pointer or virtual
method table
How difficult is it to write an exploit? – Technical progress in exploit writing, like in any area of IT – Instructions and code widely available – Attackers are standing by in hope of new vulnerabilities
ever shorter time to exploits
10
Another example
Vulnerabilities can be difficult to spot
#define BUFLEN 4
void vulnerable(void) {
wchar_t buf[BUFLEN];
int val;
val = MultiByteToWideChar(
CP_ACP, 0, "1234567",
-1, buf, sizeof(buf));
printf("%d\n", val);
}
Should calculate target buffer size as sizeof(buf)/sizeof(buf[0])
[Example thanks to Ulfar Erlingsson]
Buffer overrun prevention Preventing buffer overruns:
– Type-safe languages (e.g. SML, Java, C#) – Programmer training, code reviews – Avoiding unsafe and difficult-to-use libraries:
strcpy, gets, scanf, MultiByteToWideChar, etc. – Fuzz testing – Static code analysis: proving safety
No complete protection need to also mitigate consequences Stack cookies
– store an unguessable value on the top of the stack frame in function prologue; check before returning
– implemented in compilers: GCC -fstack-protector-all, Visual Studio /GS
NX bit in IA-64, AMD64 – set the stack and heap memory pages as non-executable – breaks some code, e.g. JIT compilation
Memory layout randomization
12
NX and memory layout randomization “Return to libc” attack:
– Because of NX, attacker cannot insert new executable code
script existing code, e.g. libc functions
When function returns, execution jumps to the return address in stack
Function arguments
Return address
Local variables
Function arguments
Return address
Local variables
Function arguments
Return address
Local variables
Function arguments
Return address
Local variables
Anything
Call stack
NX and memory layout randomization “Return to libc” attack:
– Because of NX, attacker cannot insert new executable code
script existing code, e.g. libc functions
When function returns, execution jumps to the return address in stack
– Point the return addresses to the beginning of library functions
– Set arguments as desired
Typical script creates a new executable page, copies attack code there and runs it
Combine NX with memory layout randomization: load libc and other library code at a random memory offset attacker does not know where to jump
Function arguments
Return address
Local variables
Function arguments
Return address
Local variables
Function arguments
Return address
Local variables
Function arguments
Return address
Local variables
Anything
Function 3 args
Anything
Zeros for fun 3 locals
Function 2 args
Function 3 address
Zeros for fun 2 locals
Function 1 args
Function 2 address
Zeros for fun 2 locals
Anything
Function 1 address
Anything Buffer overrun
Library functions to be executed in order 1,2,3
Call stack
Integer overflow Integers in programming languages are not ideal
mathematical integers
Vulnerable code: nBytes = (nBits + 7) >> 3;
if (nBytes <= bufferSize)
copyBits(input, buffer, nBits);
Attacker sends input: nBits = UINT_MAX
The buffer-overflow check is evaluated: nBytes = (UINT_MAX + 7) >> 3 = 6 >> 3 = 0
(nBytes <= bufferSize) is true!
buffer overflow
15
SQL INJECTION
16
17
[XKCD]
18
SQL injection example
SQL statement: "SELECT * FROM users WHERE username = '"
+ input + "';"
Attacker sends input: "alice'; DROP TABLE users; --"
The query evaluated by the SQL database: SELECT * FROM users WHERE username =
'alice'; DROP TABLE users; --';
19
SQL injection example 2
Application greets the user by first name: "SELECT firstname FROM users WHERE username = '"
+ input + "';"
Attacker enters username: "nobody' UNION SELECT password FROM users WHERE
username = 'alice'; --"
The query evaluated by the SQL database: SELECT firstname FROM users WHERE username =
'nobody' UNION SELECT password FROM users WHERE
username = ‘alice'; --';
This is why we should always assume that the attacker can read the password database
Swedish parlamentary election 2010
20
Some hand-written votes scanned by machine:
http://www.val.se/val/val2010/handskrivna/handskrivna.skv
Preventing SQL injection
Minimum privilege: set tables as read-only; run different services as different users
Sanitize input: allow only the necessary characters and string formats
Escape input strings with ready-made functions, e.g. mysql_real_escape_string() in PHP
Stored procedures: precompile SQL queries instead of concatenating strings at run time – Prepared statements are similar to stored procedures
Disable SQL error messages to normal users harder to build exploits
21
WEB VULNERABILITIES
22
Why don’t evil hackers get any
friends?
Meet Bob, the evil hacker
23
Cross-site request forgery (CSRF) Web browsers implement the same-origin security
policy: two web pages with different hostname, protocol or port cannot interact
Exception: hyperlink or form submission can cause a GET or POST method to be invoked on another site
Fictional CSRF example: – Users on social.net often stay logged in (session cookie) – JavaScript on Bob’s page: <script type="text/javascript"> frames['hidden'].window.location='http://social.net/AddFriend.php?name=Bob'; </script>
Preventing CSRF: – Server checks Referrer field in HTTP requests – Add a random session id to all URLs and forms after log-in
24
Cross-site scripting (XSS) Web sites must filter user-posted content;
otherwise, someone could post malicious scripts – Another way to circumvent the same-origin policy: insert
the malicious content to the target web site
Fictional XSS example: – Social.net has a blog where users can post comments – Instead of a comment, Bob posts a script: <b onmouseover="window.location=‘
http://social.net/AddFriend.php?name=Bob
';">See here!<b> – Another user reads the blog and moves mouse on the text
Preventing XSS: – Filter <tags> or Javascript from user-posted content – Filter out or escape all angled brackets
25
Another XSS example
Page with XSS vulnerability (e.g. on social.net): <html><body><? php
print “Page not found: " .
urldecode($_SERVER["REQUEST_URI"]);
?></body></html>
Typical output:
Page not found: /foo.html
Bob redirects users from his own site to:
http://social.net/<script>
window.location='http://social.net/AddF
riend.php?name=Bob';</script>
INPUT VALIDATION
26
27
File path vulnerability File names and paths are typical input, e.g. URL path Vulnerable code:
char docRoot[] = "/usr/www/";
char fileName[109];
strncpy (docRoot, fileName, 9);
strncpy (input, fileName+9, 100);
file = fopen(fileName, "r");
// Next, send file to client
Attacker sends input: "../../etc/password“
The same file path has many different representations should allow only one canonical representation, use the realpath() function
Online services should be executed in a sandbox to limit their access: chroot, virtual machines
28
Sanitizing input
Sanitizing input is not easy
Escape sequences enable many encodings for the same character and string:
– URL escapes: %2e%2e%2f2e%2e%2f = ../../
– HTML entities: <SCRIPT> = <SCRIPT>
Cannot simply filter “..” or “<“
International character set issues
Bad implementations of Unicode UTF-8 allowed multiple encodings for the same character:
0x0A The only correct encoding for Linefeed
0xC0 0x8A 0xE0 0x80 0x8A 0xF0 0x80 0x80 0x8A 0xF8 0x80 0x80 0x80 0x8A 0xFC 0x80 0x80 0x80 0x80 0x8A
User may not see the difference between similar-looking characters, e.g. in the URL
– Α vs A vs А
29
SECURE PROGRAMMING
30
How to produce secure code?
Programming: – Learn about bugs and vulnerabilities by example
– Adopt secure coding guidelines
– Use safe languages, libraries and tools
– Code reviews, static checkers
– Fuzz testing, penetration testing
Software process: – Threat modelling
– Define security requirements
– Define quality assurance process
31
Security principles Keep it simple Minimize attack surface Sanitize input and output Least privilege Defense in depth Isolation Secure defaults Secure failure modes Separation of duties No security through obscurity Fix potential bugs
32
33
Reading material Dieter Gollmann: Computer Security, 2nd ed.
chapter 14; 3rd ed. Chapters 10, 18, 20.5–20.6 Stallings and Brown: Computer security, principles
and practice, 2008, chapter 11-12 Michael Howard and David LeBlanc, Writing
Secure Code, 2nd ed. Online:
– Top 25 Most Dangerous Software Errors, http://cwe.mitre.org/top25/ – SQL Injection Attacks by Example, http://unixwiz.net/techtips/sql-
injection.html – OWASP, https://www.owasp.org/, see especially Top Ten – CERT Secure Coding Standards,
https://www.securecoding.cert.org/confluence/display/seccode/CERT+Secure+Coding+Standards
– Aleph One, Smashing The Stack For Fun And Profit (classic paper) http://inst.eecs.berkeley.edu/~cs161/fa08/papers/stack_smashing.pdf
34
Exercises Find examples of actual security flaws in different
categories. Try to understand how the attacks work. Which features in code may indicate poor quality and
potential security vulnerabilities? How may error messages help an attacker? Buffer overrun in Java will raise an exception. Problem
solved — or can these still cause security problems? What security bugs can occur in concurrent systems,
e.g. multiple web servers that use one shared database?
Find out what carefully designed string sanitization functions, such as mysql_real_escape_string or the OWASP Enterprise Security API, actually do.