Compilers and Software Security
Gaurav S. Kc
http://www.cs.columbia.edu/~gskc
Programming Systems Lab
Tuesday, 22nd April 2003
Outline
– Security
– Runtime Management of Processes
– Vulnerabilities and Attack Techniques
– Compilers 4115
– Security Research
– Conclusion
Security
What does security mean?
– Focus: Security of resources
  • No unauthorised access (using Authentication)
  • Availability for authorised users (no DoS)
– Also: Security of data during transit
  • Protection from eavesdropping
  • Protection from malformation
  • Solutions: PKI for encryption, digital signatures for non-repudiation
Security: Models & Threats
Social aspects of security failure
– 3Bs: Burglary, Bribery, Brutality
– Social Engineering
Threats to Security During Transit
– Man-in-the-middle attack
  • Identity spoofing / Masquerading
  • Packet sniffing
  • Communication replay
Threats to Application Security
Trojan Horses
Malicious, security-breaking program disguised as something benign, like a screen saver or game program
– Keystroke loggers & powerful remote-control utilities like Back Orifice
– Abnormal system behaviour, e.g. open server socket, CTRL-ALT-DEL signal handler
– Zombie nodes, awaiting instructions for conducting DDoS
Computer Viruses
Executable code that, when run by someone, infects or attaches itself to other executable code in a computer in an effort to reproduce itself
– Can be malicious: erase files, lock up systems
– Boot Sector, File, Macro, Multipartite, Polymorphic, Stealth
– Anti-virus: search for known signature in suspect files
Threats to Application Security 2
Internet Worms
A worm is a self-replicating program that does not alter files, but resides in active memory and duplicates itself by means of computer networks
– Morris Worm (RTM) exploited fingerd, sendmail, weak passwords
– Code Red exploited a (publicised) vulnerability in Microsoft IIS
– Code Red II had a Trojan payload
– Nimda: Swiss Army knife of worms (worm, virus, trojan!). Spread via its own e-mail engine, IIS servers that it scanned, and shared disks on corporate networks.
Common Trait: Well-crafted input data can let you take control of a computer
– WinNuke: for rebooting remote Win95 machine :)
Process Runtime
int main(int argc, char *argv[], char *env[]) {
    return 0;
}
[Figure: process address space of the program, from high to low addresses]
    0xffffffff
    kernel space
    0xbfffffff
    env[], argv[]
    char *env[], char *argv[], int argc
    runtime stack (Stack)
    runtime heap (Heap)
    .bss
    .data
    .text (Program)
    0x08048000
    0x00000000
x86
– 32-bit von Neumann machine
– 2^32 ≈ 4GB memory locations
Breakdown of process space
stack
– <= 0xbfffffff, Grows downwards
– Environment variables, Program parameters
– Automatically allocated stack variables
– Activation records
heap
– Dynamic allocation
– Explicitly through malloc, free
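As a rough illustration of the stack and heap regions described on this slide, the following C sketch samples one address from each. The names `struct layout`, `probe_layout` and `print_layout` are made up for this example, and the addresses it reports depend on the operating system, the compiler and the individual run.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Addresses sampled from two of the regions named on the slide. */
struct layout {
    uintptr_t stack_addr;  /* an automatic variable in the current frame */
    uintptr_t heap_addr;   /* a block obtained from malloc */
};

/* Record where an automatic variable and a heap block live.
 * Returns 0 on success, -1 if malloc fails. */
int probe_layout(struct layout *out) {
    int automatic = 0;           /* allocated automatically on the stack */
    void *dynamic = malloc(16);  /* allocated explicitly on the heap */
    if (dynamic == NULL)
        return -1;
    out->stack_addr = (uintptr_t)&automatic;
    out->heap_addr = (uintptr_t)dynamic;
    free(dynamic);               /* heap blocks must be released explicitly */
    return 0;
}

/* Print the sampled addresses; the values differ between systems and runs. */
void print_layout(const struct layout *l) {
    printf("stack variable: 0x%" PRIxPTR "\n", l->stack_addr);
    printf("heap block:     0x%" PRIxPTR "\n", l->heap_addr);
}
```

Calling `probe_layout` and then `print_layout` from `main` on a 32-bit Linux system of the kind the slide describes would be expected to show the stack address just below 0xbfffffff and the heap address much lower, though modern systems randomise both.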
Process Runtime 2
[Figure: the same address-space diagram (0xffffffff, kernel space, 0xbfffffff, env[], argv[], char *env[], char *argv[], int argc, runtime stack, runtime heap, .bss, .data, .text, 0x08048000, 0x00000000), with the three lowest sections annotated]
    .bss: Block Started by Symbol // static & global uninitialised data
    .data: Data Section // static & global initialised data
    .text: Text Section // executable machine code
.bss
– assembler directive for IBM 704 assembler
– runtime allocation of space
– RWX
.data
– compile-time space allocation, and initialisation values
– RWX
.text
– program code
– runtime DLLs
– RO, X
.rodata
– RO, X
– constants: const int x = 4; "hello, world"
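To make the section descriptions above concrete, here is a small C sketch with one object of each kind. Which section an object actually lands in is decided by the compiler and linker, so the comments say "typically"; all identifiers are invented for this example.

```c
#include <stddef.h>

static int zeroed[4];                           /* uninitialised static data: typically .bss */
static int initialised = 4;                     /* initialised static data: typically .data */
static const char greeting[] = "hello, world";  /* constant data: typically .rodata */

/* Return 1 if every element of the uninitialised array reads as zero,
 * which C guarantees for objects with static storage duration. */
int bss_is_zero_filled(void) {
    for (size_t i = 0; i < sizeof zeroed / sizeof zeroed[0]; i++)
        if (zeroed[i] != 0)
            return 0;
    return 1;
}

/* Return the value stored in the executable image at compile time. */
int data_initial_value(void) {
    return initialised;
}

/* Return a pointer to the read-only string constant. */
const char *rodata_string(void) {
    return greeting;
}
```

Compiling this to an object file and inspecting it with `readelf -S` or `objdump -h` is one way to check where a given toolchain placed each object.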
Activation Records
Subroutines
– functions and procedures
– abstraction of computation
– structured programming concept
Stack frame, Function frame, Activation frame
– Block of stack space reserved for duration of function
Logical stack frames are crucial for implementing subroutines
– Each frame contains information related to the context of the given function. Grows downwards for each nested invocation.
Reserved registers
– %eip (next instruction), %esp, %ebp (fixed offsets)
Activation Records 2
Source function:
    #include <string.h>
    #define SIZE 9
    void function(char *s, float y, int x) {
        int a;
        int b;
        char buffer[SIZE];
        int c;
        strcpy(buffer, s);
        return;
    }
    int main(void) {
        function("yep", 2.f, 93);
        return 0;
    }
Visualisation of the runtime stack frame (high to low addresses):
    16(%ebp)    int x                    function parameters
    12(%ebp)    float y
     8(%ebp)    char *s
                ret. addr: 0x0abcdef0    return address
                old fp: 0x4fedcba8       old frame pointer
    -12(%ebp)   int a                    automatic variables
    -16(%ebp)   int b
    -40(%ebp)   char buffer[SIZE]
    -44(%ebp)   int c
Registers marked on the diagram: PC, FP, SP
Activation Records 3
Source function: the same function() and main() as on the previous slide
Assembly equivalent:
    .LC0:
        .string "yep"
    main:
        ...
        pushl $93                # int x
        pushl $0x40000000        # float y (2.0f)
        pushl $.LC0              # char *s
        call  function
        ...
    function:
        pushl %ebp               # prologue
        movl  %esp, %ebp
        subl  $56, %esp
        subl  $8, %esp           # function body
        pushl 8(%ebp)            # s
        leal  -40(%ebp), %eax
        pushl %eax               # buffer
        call  strcpy
        addl  $16, %esp
        leave                    # epilogue
        ret
Building the stack frame: main pushes int x, float y and char *s, in that order, before the call
Vulnerabilities
C: Low level, high level systems language
– Efficient execution, Usable for real-time solutions
Pointers and Arrays
– Pointer to (null-terminated?) block of memory
Lack of bounds checking
– Buffer overflow causes havoc
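Because C performs no bounds checking itself, the check has to be written by the programmer. The sketch below shows one possible length-checked replacement for a bare strcpy call; `bounded_copy` is a hypothetical helper written for this example, not a standard library function.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst only if it fits, including the terminating '\0'.
 * Returns 0 on success, -1 if src would overflow dst (dst is left empty). */
int bounded_copy(char *dst, size_t dst_size, const char *src) {
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';   /* refuse the copy instead of writing past the end */
        return -1;
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

With a 9-byte buffer, copying "yep" succeeds, while a longer input is rejected rather than silently overwriting whatever follows the buffer in memory.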
Attack Techniques
Criteria for successful attack
– Locate a buffer that has an unsafe operation applied to it
– Well-crafted input data to trigger the overflow
Buffer overrun vulnerabilities
– Stack-based: Stack-smashing attack
– Heap-based: Function pointers, C++ virtual pointers, Exception handlers (CodeRed)
Format string exploits
– %n format converter for *printf family of functions
– writes #bytes output so far to %n argument (int *)
printf("\x70\xf7\xff\xbf%%n"); //0xbffff770 := 4
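The behaviour of %n can be observed safely when the format string is a literal rather than user input. This is a minimal sketch; `bytes_before_marker` is a name chosen for this example, and some C libraries disable or restrict %n, in which case it would not behave as described.

```c
#include <stdio.h>

/* Format a fixed string and use %n to capture how many bytes had been
 * produced when the conversion was reached. Returns that count, or -1
 * if formatting fails. */
int bytes_before_marker(void) {
    char out[32];
    int count = -1;
    /* The format is a string literal, never attacker-controlled input. */
    if (snprintf(out, sizeof out, "hello%n, world", &count) < 0)
        return -1;
    return count;
}
```

Here %n stores 5, the length of "hello". The exploit on the slide relies on the same mechanism, except that the attacker supplies both the format string and, through it, the address that is written to.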
Smashing the Stack
void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    ... ;
    strcpy(buffer, s);
    ...
}
[Figure: stack frame of function() (int x, float y, char *s; ret. addr: 0x0abcdef0; old fp: 0x4fedcba8; int a, int b; char buffer[SIZE], int c) after the overflow: injected code exec("/bin/sh"), return address overwritten with 0xBadAdda0, PC marked]
Stack-smashing attack
• Buffer overrun
• Code injection
• Return address overwritten
To overflow (automatic) stack buffer, one would need:
– Shellcode, i.e. characters representing machine code (obtain from gdb, as)
– Memory location of injected shellcode (typically buffer address)
Can approximate to make up for lack of precise information
– nop instructions at the beginning of the shellcode
– overwrite locations around 0(%ebp) with shellcode address
suid installed programs. Shellcode: shell, export xterm display
Heap-Based Attacks
Function pointer
– Higher address: function pointer
– Lower address: buffer
C++ Pointer to vtable
– Higher address: virtual pointer
– Lower address: buffer
[Figure: .bss holding char buffer[ ]; followed by int (* f) (void); a C++ object holding char buffer[ ]; followed by void *vptr]
class ABC {
    char buffer[10];
    virtual void print() { cout << buffer; }
    void set(char *s) { strcpy(buffer, s); }
};
int main(int argc, char *argv[]) {
    static char buffer[10];
    static int (*f)(void) = exit;
    // gets(buffer);
    strcpy(buffer, argv[1]);
    (*f)();
    ABC *abc = new ABC();
    abc->set(argv[1]);
    abc->print();
}
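The layout on this slide, a buffer at the lower address with a function pointer above it, can be reproduced with a plain C struct. The sketch below is a defensive variant rather than an exploit: `struct handler`, `benign` and `run_handler` are invented names, and the copy is truncated so that it can never reach the pointer.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 10

/* A buffer followed by a function pointer: the buffer sits at the lower
 * address, the pointer at a higher one, as in the slide's diagram. */
struct handler {
    char buffer[BUF_SIZE];
    int (*f)(void);
};

static int benign(void) {
    return 42;
}

/* Initialise the handler, then copy at most BUF_SIZE - 1 bytes of input so
 * that the write cannot spill over into the function pointer. */
int run_handler(const char *input) {
    struct handler h;
    h.f = benign;
    strncpy(h.buffer, input, BUF_SIZE - 1);
    h.buffer[BUF_SIZE - 1] = '\0';
    return h.f();
}
```

`offsetof(struct handler, f)` is at least BUF_SIZE, which is why an unchecked strcpy into `buffer` would run straight into `f`, while the bounded copy above leaves it intact even for oversized input.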
Compilers 4115
GCC: GNU Compiler Collection
– Just a wrapper for different phases
  • cpp: C preprocessor (program.c → program.i)
  • cc1: C compiler proper (program.i → program.s)
  • as: Assembler (a.out, ELF relocatable files) (program.s → program.o)
  • ld: Link editor (ELF executables) (program.o → program)
GCC
Command line options
gcc -save-temps (-pipe) -Wall -O0 -dr -v -static -I$HOME/include -L$HOME/lib -lsocket -lm -lpthread
Standard libraries
/lib/libc.so.6, /lib/ld-linux.so.2
Standard library header files
/usr/include
Other tools
GNU Debugger: gdb
GNU Binutils
– objcopy: add/remove ELF sections
– readelf, objdump: print ELF information
Miscellaneous
– ldd: list dynamic dependencies (DLLs)
– strace: trace syscall invocations
Security Research
Know thy enemy
– Monitor the attacker’s behaviour and tactics
– In a constrained resource environment
Honeypots
– Illusion of an “easy target” to lure attackers
Jail
– Sandboxed environment using chroot
– All necessary files are available locally
Virtual machines
Sandboxes with limited syscalls
Automatic Defence Mechanisms
Face thy enemy
– Applications fortified with runtime checks
StackGuard, MemGuard, .NET cl.exe /GS
– “canary” word to detect Stack-smashing
– READONLY stack frame
– .NET C/C++ compiler protects 0(%ebp), 4(%ebp)
Libsafe, Libverify
– “safe” implementation of standard libraries
– runtime backup/checking of return address
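The canary idea can be sketched with a toy model of a stack frame held in an ordinary byte array, so that the "overflow" stays within memory the program owns. This is not how StackGuard is implemented; the layout, the CANARY value and all function names are assumptions made for this illustration.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 9
#define CANARY 0x000aff0dU  /* illustrative value chosen for this sketch */
#define FRAME_SIZE (BUF_SIZE + 2 * sizeof(uint32_t))

/* Toy frame laid out as: buffer | canary | return address. */
struct toy_frame {
    unsigned char bytes[FRAME_SIZE];
};

/* "Function entry": clear the buffer, place the canary and the return address. */
void frame_enter(struct toy_frame *f, uint32_t ret_addr) {
    uint32_t canary = CANARY;
    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + BUF_SIZE, &canary, sizeof canary);
    memcpy(f->bytes + BUF_SIZE + sizeof canary, &ret_addr, sizeof ret_addr);
}

/* Models an unchecked copy into the buffer. The length is clamped to the
 * toy frame so the demonstration itself stays well-defined. */
void frame_unchecked_write(struct toy_frame *f, const void *data, size_t len) {
    if (len > FRAME_SIZE)
        len = FRAME_SIZE;
    memcpy(f->bytes, data, len);
}

/* "Function exit": return 1 if the canary is intact, 0 if it was overwritten. */
int frame_canary_intact(const struct toy_frame *f) {
    uint32_t canary;
    memcpy(&canary, f->bytes + BUF_SIZE, sizeof canary);
    return canary == CANARY;
}
```

A write that fits in the buffer leaves the canary intact. A write long enough to reach the return address must pass over the canary first, so the check at function exit detects it before the corrupted address is used.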
Defence through Diversity
Code Diversity
– Code randomisation for diversity
– Security through obscurity even for open-source software
– No more: breach once, breach everywhere
Compiler-based Protection
– Secure the stack data
– Potentially vulnerable heap data
Casper
Paper: Casper: Compiler-assisted securing of programs at runtime
Via added runtime checks as part of function invocations
– Add protection code
– Protect what: control data in stack frames
– What from: most stack-smashing attacks
– Available as patches:
  • Compiler: gcc-2.95
  • Debugger: gdb-5.2.1
Casper in Action
Similar in nature to StackGuard, but with much smaller overhead
XOR property: self-inverse, i.e. applying the same mask twice restores the original value. Simplest form of encryption / obfuscation of data
[Figure: stack frame of function() (int x, float y, char *s; ret. addr: 0x0abcdef0; old fp: 0x4fedcba8; int a, int b; char buffer[SIZE], int c), with the PC marked]
Casper protection
• Mask original return address value when entering function
• Unmask and restore the original return address value when returning from function
• Overwritten value will be “restored” to invalid code address
ret. addr := 32-bit mask XOR ret. addr
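The masking step can be sketched in a few lines of C. The function names and the way the 32-bit key is supplied are assumptions for this illustration; the slide does not say how Casper chooses or stores its mask.

```c
#include <stdint.h>

/* On function entry: store the return address XORed with the key. */
uint32_t mask_return_address(uint32_t ret_addr, uint32_t key) {
    return ret_addr ^ key;
}

/* On function exit: XOR the stored value with the same key again. */
uint32_t unmask_return_address(uint32_t stored, uint32_t key) {
    return stored ^ key;
}
```

If the stored value is untouched, unmasking yields the original return address. If an attacker overwrites the slot with a raw address such as 0xbffff770 without knowing the key, unmasking turns it into some other value, which is very unlikely to be a valid code address.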
Get the Processor Involved
Paper: Countering Code-Injection Attacks With Instruction-Set Randomization
Machine instruction translation – unique per process
Reversible mapping: machine instruction ↔ garbage bit sequence
1. Post-compilation stage
  • Encode all executable sections with key
  • Store codec key in file header
2. Modified von Neumann: fetch, decrypt, decode, execute
  • decrypt: “Processor” restores each block of bytes to valid, original instruction
  • Injected code gets probabilistically transformed to garbage bit-sequence that cannot be decoded
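The encode and decrypt steps can be approximated in C with a repeating 32-bit XOR key applied to a buffer of code bytes. This is only a sketch: `xor_transform` is a hypothetical name, and the byte order in which the key is applied is an assumption rather than a detail taken from the paper.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR each byte of a code image with the matching byte of a 32-bit key,
 * starting from the key's least significant byte. Running it twice with
 * the same key restores the original image. */
void xor_transform(unsigned char *code, size_t len, uint32_t key) {
    for (size_t i = 0; i < len; i++) {
        unsigned shift = 8u * (unsigned)(i % 4);
        code[i] ^= (unsigned char)(key >> shift);
    }
}
```

Legitimate code is transformed once after compilation and once more at fetch time, so the processor sees the original instructions. Injected code was never encoded, so it passes through the transform only once and reaches the decoder as scrambled bytes.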
Binary Encryption and Execution
[Figure: SOURCE CODE → compile → MACHINE EXECUTABLE FILE → encrypt via objcopy (with key) → ENCRYPTED EXECUTABLE FILE → fetch → decrypt (with key)]
Binary Encryption and Execution 2
Bochs Pentium emulator is the “modified machine”
– Support for hidden register %gav
– Interrupt routine handler saves %gav to process structure
Linux 2.2.14
– Kernel recognises new register
– Support for register in process structure
as and objcopy for program encryption and codec storage
Future Work
Randomised ISA on real machine
– Programmable Transmeta chips
– Dynamo: Dynamic optimiser of native code
Activation records
– automatically managed, randomised layout
Heap smashing techniques
– break type-system
– corrupt malloc data
Diversified research
– Languages, Compilers: C++, Sun CC, Visual C++
– Other architectures: Solaris, Alpha (DLX ;-)
Conclusion
Security
– Process Security
Runtime Management of Processes
– Stack, Heap, Activation Records
Vulnerabilities and Attack Techniques
– Buffer overrun. Stacksmashing. Pointer overwriting.
Compilers 4115
– GCC, GDB, Binutils
Security Research
– Monitoring. Runtime protection
References
1. The Bochs Pentium emulator. http://bochs.sourceforge.net/
2. Aleph One. Smashing The Stack For Fun And Profit. http://www.phrack.org/show.php?p=49&a=14
3. Arash Baratloo, N. Singh, T. Tsai. Transparent Run-Time Defense Against Stack Smashing Attacks
4. Crispin Cowan, M. Barringer, et al. FormatGuard: Automatic Protection From printf format string vulnerabilities
5. Crispin Cowan, Calton Pu, et al. StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks
6. Gaurav S. Kc, Stephen A. Edwards, Gail E. Kaiser, Angelos Keromytis. Casper: Compiler-assisted securing of programs at runtime
7. Gaurav S. Kc, Angelos D. Keromytis, Vassilis Prevelakis. Countering Code-Injection Attacks With Instruction-Set Randomization
Optimisation of Tail-Recursion
C source code (plain recursion):
    int factorial(int n) {
        if (1 >= n) return 1;
        return n*factorial(n-1);
    }
    int val = factorial(x);
Assembly:
    factorial:
        ...
        pushl n-1
        call factorial
        ...
C source code (tail recursion):
    int factorial(int n, int v) {
        if (1 >= n) return v;
        return factorial(n-1, v*n);
    }
    int val = factorial(x, 1);
Assembly:
    factorial:
        ...
        v := v*n
        n := n-1
        goto factorial
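The two versions on this slide can be compared side by side, together with the loop that the tail-recursive form reduces to once the call is replaced by a jump. The function names below are chosen for this sketch so that all three can live in one file; whether a given compiler actually performs the optimisation depends on the compiler and its options.

```c
/* Plain recursion: the multiplication happens after the recursive call
 * returns, so every invocation needs its own stack frame. */
int factorial_plain(int n) {
    if (1 >= n) return 1;
    return n * factorial_plain(n - 1);
}

/* Tail recursion: nothing remains to be done after the recursive call,
 * so the current frame can be reused. */
int factorial_tail(int n, int v) {
    if (1 >= n) return v;
    return factorial_tail(n - 1, v * n);
}

/* The loop that the tail-recursive form is equivalent to. Note that v is
 * updated with the old value of n before n is decremented. */
int factorial_loop(int n) {
    int v = 1;
    while (n > 1) {
        v = v * n;
        n = n - 1;
    }
    return v;
}
```

All three return the same result for small inputs; only the first grows the stack with each nested invocation.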
x86 Processor
Dual integer pipeline
Hidden register %eip does not always fetch the “next” instruction
Binary Encryption Code: GNU as
# usage check: the ELF image is required, the key is optional
if [ -z "$1" ]; then echo "usage: $0 <ELF_executable_image> [key]"; exit 1; fi
# without a key argument, derive one from bash's $RANDOM
if [ -z "$2" ]; then XOR_KEY="0x$RANDOM"; else XOR_KEY="$2"; fi
# file names
NEW_FILE="$1.$XOR_KEY"
ORG_FILE="$1"
INTERMEDIATE="$XOR_KEY.o"
# modified objcopy binary (provides --encrypt-xor-key)
OBJCOPY=/home/gskc/usr/binutils-2.13.2/bin/objcopy
# create an intermediate ELF object file with an .xor.stuff section
as -o "$INTERMEDIATE" <<EOF
.section .xor.stuff
.long $XOR_KEY
EOF
# merge the .xor.stuff section into the specified file
"$OBJCOPY" --encrypt-xor-key "$XOR_KEY" --add-section ".xor.stuff=$INTERMEDIATE" "$ORG_FILE" "$NEW_FILE"
# clean up
rm -f "$INTERMEDIATE"