formal reasoning of security vulnerabilities by pointer taintedness semantics

21
Formal Reasoning of Formal Reasoning of Security Security Vulnerabilities by Vulnerabilities by Pointer Taintedness Pointer Taintedness Semantics Semantics S. Chen, K. Pattabiraman, Z. S. Chen, K. Pattabiraman, Z. Kalbarczyk and R. K. Iyer Kalbarczyk and R. K. Iyer Center for Reliable and High- Center for Reliable and High- Performance Computing Performance Computing University of Illinois at Urbana- University of Illinois at Urbana- Champaign Champaign

Upload: ocean-reilly

Post on 03-Jan-2016

19 views

Category:

Documents


0 download

DESCRIPTION

Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics. S. Chen, K. Pattabiraman, Z. Kalbarczyk and R. K. Iyer Center for Reliable and High-Performance Computing University of Illinois at Urbana-Champaign. Our Previous Work on Security Vulnerability Analysis. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Formal Reasoning of Security Formal Reasoning of Security Vulnerabilities by Pointer Vulnerabilities by Pointer

Taintedness SemanticsTaintedness Semantics

S. Chen, K. Pattabiraman, Z. Kalbarczyk and R. K. IyerS. Chen, K. Pattabiraman, Z. Kalbarczyk and R. K. Iyer

Center for Reliable and High-Performance Computing Center for Reliable and High-Performance Computing

University of Illinois at Urbana-ChampaignUniversity of Illinois at Urbana-Champaign

Page 2: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Our Previous Work on Security Our Previous Work on Security Vulnerability AnalysisVulnerability Analysis

Appears in DSN 2003. Appears in DSN 2003. Analyzed Analyzed CERTCERT and and BugtraqBugtraq reports and the reports and the

corresponding application source code.corresponding application source code. Developed a state machine representation Developed a state machine representation

approach to decompose security vulnerabilities to approach to decompose security vulnerabilities to a series of primitive operations, each indicating a a series of primitive operations, each indicating a simple predicate.simple predicate.

The analyzed vulnerabilities include Stack The analyzed vulnerabilities include Stack Overflow, Heap Corruption, Integer Overflow, Overflow, Heap Corruption, Integer Overflow, Format String Vulnerability, and others.Format String Vulnerability, and others.

Page 3: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Use the func pointer. Malicious code executed.

Use the negative integer as an array index to corrupt a func pointer

Get a very large integer, which is converted into a negative

Sendmail Signed Integer Overflow Sendmail Signed Integer Overflow (Bugtraq #3163)(Bugtraq #3163)

addr_setuid unchanged

tTvect[x]=i

addr_setuid changed

Execute code referred by addr_setuid

convert str_i and str_x to integer i and x

( integer represented by str_x) > 231

x 100

x > 100

?

Execute MCode

get text strings str_x and str_i

?

x < 0 or x > 100

0 x 100

Function pointer is corrupted

Load the function pointer

( integer represented

by str_x) 2 31

pFSM1

pFSM2

pFSM3

Page 4: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Current WorkCurrent Work

Page 5: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

MotivationMotivation Our analysis on CERT advisories showsOur analysis on CERT advisories shows

– Many vulnerabilities (Many vulnerabilities ( 66%) 66%) due to incorrect pointer due to incorrect pointer dereferencesdereferences

– A significant portion of A significant portion of vulnerabilities (vulnerabilities ( 33.6%) due 33.6%) due to errors in library functions or to errors in library functions or incorrect invocations of library incorrect invocations of library functionsfunctions

Format String 7%

Globbing2%

Heap Corruption

8%

Integer Overflow

6%

Buffer Overflow

44%

Other33%

Motivating questions Motivating questions – What is the common characteristic among most security What is the common characteristic among most security

vulnerabilities? vulnerabilities?

– How to develop a generic reasoning approach to find a wide How to develop a generic reasoning approach to find a wide spectrum of security vulnerabilities?spectrum of security vulnerabilities?

Page 6: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Formal Analysis of Pointer TaintednessFormal Analysis of Pointer Taintedness Pointer Taintedness: a pointer value, including a return : a pointer value, including a return

address, is derived directly or indirectly from user input. address, is derived directly or indirectly from user input. (formally defined using equational logic) (formally defined using equational logic)

It provides a unifying perspective for reasoning about a It provides a unifying perspective for reasoning about a significant number of security vulnerabilities.significant number of security vulnerabilities.

The notion of pointer taintedness enables:The notion of pointer taintedness enables:– Static analysis: reasoning about the possibility of pointer taintedness Static analysis: reasoning about the possibility of pointer taintedness

by source code analysis; by source code analysis; – Runtime checking: inserting assertions in object code to check Runtime checking: inserting assertions in object code to check

pointer taintedness at runtime; pointer taintedness at runtime; – Hardware architecture-based support to detect pointer taintedness.Hardware architecture-based support to detect pointer taintedness.

Current focus: extraction of security specifications of library Current focus: extraction of security specifications of library functions based on pointer taintedness semantics. functions based on pointer taintedness semantics.

Page 7: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Examples Vulnerabilities Caused by Examples Vulnerabilities Caused by Pointer TaintednessPointer Taintedness

Format string vulnerability Format string vulnerability – Taint an argument pointer of functions such as Taint an argument pointer of functions such as printf, printf,

fprintf, sprintf fprintf, sprintf andand syslog. syslog. Stack buffer overflow (stack smashing)Stack buffer overflow (stack smashing)

– Taint a return address.Taint a return address. Heap corruption Heap corruption

– Taint the free-chunk doubly-linked list of the heap.Taint the free-chunk doubly-linked list of the heap. Glibc Glibc globbingglobbing vulnerabilities vulnerabilities

– User input resides in a location that is used as a pointer User input resides in a location that is used as a pointer by the parent function of by the parent function of glob().glob().

Page 8: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Stack Buffer Overflow Stack Buffer Overflow Vulnerable code: char buf[100]; strcpy(buf,user_input);

Return addrReturn addr

Frame pointerFrame pointer

buf[99]buf[99]

……

buf[1]buf[1]

buf[0]buf[0]

High

Low

Sta

ck g

row

th

buf

user_input

Return address can be tainted.

Page 9: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Format String VulnerabilityFormat String Vulnerability

In vfprintf(), if (fmt points to “%n”) then **ap = (character count)

Vulnerable code: recv(buf); printf(buf); /* should be printf(“%s”,buf) */

\xdd \xcc \xbb \xaa %d %d %d %n

……

%n%n

%d%d

%d%d

%d%d

0xaabbccdd0xaabbccdd

fmt: format string pointer

ap: argument pointer

High

Low

Sta

ck g

row

th

*ap is a tainted value.

ap: argument pointer

fmt: format string pointer

Page 10: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Heap Corruption VulnerabilityHeap Corruption VulnerabilityFree chunk A

Free chunk Bfd=Abk=C

Allocated buffer buf

Free chunk C

user

inpu

t

Vulnerable code:buf = malloc(1000);recv(sock,buf,1024);free(buf);

In free():B->fd->bk=B->bk; B->bk->fd=B->fd;

When B->fd and B->bk are tainted, the effect of free() is to write a user specified value to a user specified address.

Page 11: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Semantic Definition of Pointer TaintednessSemantic Definition of Pointer Taintedness

Page 12: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

One-Slide Intro to Equational LogicOne-Slide Intro to Equational Logic Use term rewriting to establish proofs of theorems.Use term rewriting to establish proofs of theorems. Natural number addition expressed in the Maude Natural number addition expressed in the Maude

system. system. 0 : Natural .s_ : Natural -> Natural ._+_ : Natural Natural -> Natural .

vars N M : Natural Axiom: N + 0 = N .Axiom: N + s M = s (N + M) .

(s s s 0) + (s s 0) = s ((s s s 0) + (s 0)) = s( s((s s s 0) + 0)) = s(s((s s s 0)) = s s s s s 0Intuitively, this is a proof of “3 + 2 = 5” in natural number algebra.

Page 13: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Semantics of a Memory ModelSemantics of a Memory Model• A store represents a snapshot of the memory state at a point in the program execution. • For each memory location, we can evaluate two properties: content and taintedness (true/false).• Operations on memory locations:

•The fetch operation Ftch(S,A) gives the content of the memory address A in store S•The location-taintedness operation LocT(S,A) gives the taintedness of the location A in store S

• Operations on expressions:•The evaluation operation Eval(S,E) evaluates expression E in store S•The expression-taintedness operation ExpT(S,E) computes the taintedness of expression E in store S.

Page 14: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Axioms of Axioms of EvalEval and and ExpTExpT operations operationsEval(S, I) = I // I is an integer constantEval(S, ^ E1) = Ftch(S, Eval(S,E1))Eval(S, E1 + E2) = Eval(S, E1) + Eval(S, E2)Eval(S, E1 - E2) = Eval(S, E1) - Eval(S, E2) … …ExpT (S, I) = falseExpT(S, ^ E1) = LocT(S,Eval(S,E1)) ExpT(S,E1 + E2) = ExpT(S,E1) or ExpT(S,E2)ExpT(S,E1 - E2) = ExpT(S,E1) or ExpT(S,E2)… …

E.g., is the expression (^100)–2 tainted in store S?ExpT(S, (^100)–2) = ExpT(S, (^100)) or ExpT(S, 2) = LocT(S,100) or false = LocT(S,100)

Note: ^ is the dereference operator, ^100 gives the content in the location 100

Page 15: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Semantics of Language LSemantics of Language L Extend the semantics proposed by Extend the semantics proposed by Goguen and Malcolm Goguen and Malcolm The following operations (arithmetic/logic) are defined:The following operations (arithmetic/logic) are defined:

– +, -, *, /, %, !, &&, ||, !=, ==, ……+, -, *, /, %, !, &&, ||, !=, ==, …… The following instructions are defined:The following instructions are defined:

– mov [Exp1] <- Exp2mov [Exp1] <- Exp2– branch (Condition) Labelbranch (Condition) Label – call FuncName(Exp1,Exp2,…)call FuncName(Exp1,Exp2,…)

Axioms defining Axioms defining movmov instruction semantics instruction semantics– Specify the effects of applying Specify the effects of applying movmov instruction on a store instruction on a store– Allow taintedness to propagate from Exp2 to [Exp1].Allow taintedness to propagate from Exp2 to [Exp1].

Axioms defining the semantics of Axioms defining the semantics of recvrecv (similarly, (similarly, scanfscanf, , recvfrom: recvfrom: user user input functions)input functions)– Specify the memory locations tainted by the recv call.

Page 16: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Extracting Function Specifications Extracting Function Specifications by Theorem Proverby Theorem Prover

C source code of a library function

Code in language L

Automatically translated to Language L

Critical instruction – indirect writesFor each mov [^ E1] <- E2, generate

theorems:a) E1 should not be taintedb) The mov instruction should not taint any

location outside the buffer pointed by E1

Theorem generation

Theorem prover

A set of sufficient conditions that imply the validity of the theorems. They are the security specifications of the analyzed function.

Page 17: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Example: strcpy()Example: strcpy()

char * strcpy (char * dst, char * src) { char * res;0: res =dst; while (*src!=0) {1: *dst=*src; dst++; src++; }2: *dst=0; return res;}

0: mov [res] <- ^ dst

lbl(#while#6)

branch (^ ^ src is 0) #ex#while#6

1: mov [^ dst] <- ^ ^ src

mov [dst] <- (^ dst) + 1

mov [src] <- (^ src) + 1

branch true #while#6

lbl(#ex#while#6)

2: mov [^ dst] <- 0

mov [ret] <- ^ res

Translate to Language L

a) Suppose S1 is the store before Line L1, then LocT(S1,dst) = false b) If S0 is the store before Line L0, and S2 is the store after Line L1, then

I < Eval(S0, ^dst) or Eval(S0, ^dst+dstsize) I => LocT(S2,I) = LocT(S0, I)

c) Suppose S3 is the store before Line L2, then LocT(S3,dst) = false

Theorem generation

Theorem prover

Page 18: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Specifications Suggested by Specifications Suggested by Theorem ProverTheorem Prover

Specifications that are extracted by Specifications that are extracted by the theorem proving approachthe theorem proving approach– srclensrclen <= <= dstsizedstsize– The buffers The buffers srcsrc and and dstdst do not do not

overlap in such a way that the buffer overlap in such a way that the buffer dstdst covers the string terminator of the covers the string terminator of the srcsrc string. string.

– The buffers The buffers dstdst and and srcsrc do not cover do not cover the function frame of strcpy.the function frame of strcpy.

– Initially, Initially, dst dst is not taintedis not tainted

Documented in Linux man page

Not documented

Suppose when function strcpy() is called, the Suppose when function strcpy() is called, the sizesize of of destination buffer (dst) is destination buffer (dst) is dstsizedstsize, the , the lengthlength of user of user input string (src) is input string (src) is srclensrclen

Page 19: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Other ExamplesOther Examples A simplied version of A simplied version of printf()printf()

– 55 lines of C code55 lines of C code– Four security specifications are extracted, including one Four security specifications are extracted, including one

indicating indicating format string vulnerability Function Function free()free() of a heap management system of a heap management system

– 36 lines of C code36 lines of C code– Seven security specifications are extracted, including several Seven security specifications are extracted, including several

specifications indicating specifications indicating heap corruption vulnerabilities. vulnerabilities. Socket read functions of Apache HTTPD and NULL Socket read functions of Apache HTTPD and NULL

HTTPDHTTPD– The Apache function is proved to be free of pointer taintedness.The Apache function is proved to be free of pointer taintedness.– Two (known) vulnerabilities are exposed in the theorem proving Two (known) vulnerabilities are exposed in the theorem proving

process of NULL HTTPD function. process of NULL HTTPD function.

Page 20: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

ConclusionsConclusions A common characteristic of many categories of A common characteristic of many categories of

widely exploited security vulnerabilities: pointer widely exploited security vulnerabilities: pointer taintednesstaintedness

A memory model and a language can be A memory model and a language can be formally defined using equational logic to allow formally defined using equational logic to allow reasoning of pointer taintedness.reasoning of pointer taintedness.

A theorem proving approach is developed to A theorem proving approach is developed to extract security specifications from library extract security specifications from library function code, based pointer taintedness function code, based pointer taintedness analysis.analysis.

Page 21: Formal Reasoning of Security Vulnerabilities by Pointer Taintedness Semantics

Future DirectionsFuture Directions Provide higher degree of automation on the theorem Provide higher degree of automation on the theorem

generation and theorem proving process.generation and theorem proving process. Apply the pointer taintedness analysis on a substantial Apply the pointer taintedness analysis on a substantial

number of commonly used library functions to extract number of commonly used library functions to extract their security specifications. their security specifications.

Compiler techniques for inserting “guarding code” to Compiler techniques for inserting “guarding code” to check unproved properties at runtime.check unproved properties at runtime.

Architecture supports for pointer taintedness detection. Architecture supports for pointer taintedness detection. A module working with RSE (Reliability and Security A module working with RSE (Reliability and Security Engine).Engine).