
Splint the C code static checker

Pedro Pereira Ulisses Costa

Formal Methods in Software Engineering

May 28, 2009


Outline

1 Introduction

2 Unused variables

3 Types

4 Memory management

5 Control Flow

6 Buffer sizes

7 The Ultimate Test: wu-ftpd

8 Pros and Cons

9 Conclusions


Lint for detecting anomalies in C programs

Statically checking C programs

Unused declarations

Type inconsistencies

Use before definition

Unreachable code

Ignored return values

Execution paths with no return

Infinite loops


Splint

Specification Lint and Secure Programming Lint

Annotations

Functions

Variables

Parameters

Types


Unused variables

Splint detects instances where the value of a location is used before it is defined.

Annotations can be used to describe what storage must be defined and what storage may be undefined at interface points.

Unless annotated otherwise, all storage reachable from the following is assumed to be defined before and after a function call:

global variable

parameter to a function

function return value


Undefined Parameters

Sometimes, function parameters or return values are expected to reference undefined or partially defined storage.

The out annotation denotes a pointer to storage that may be undefined

The in annotation can be used to denote a parameter that must be completely defined

 1 extern void setVal (/*@out@*/ int *x);
 2 extern int getVal (/*@in@*/ int *x);
 3 extern int mysteryVal (int *x);
 4
 5 int dumbfunc (/*@out@*/ int *x, int i) {
 6   if (i > 3)
 7     return *x;
 8   else if (i > 1)
 9     return getVal (x);
10   else if (i == 0)
11     return mysteryVal (x);
12   else {
13     setVal (x);
14     return *x;
15   }
16 }

> splint usedef.c
usedef.c:7: Value *x used before definition
usedef.c:9: Passed storage x not completely defined (*x is undefined): getVal (x)
usedef.c:11: Passed storage x not completely defined (*x is undefined): mysteryVal (x)
Finished checking --- 3 code warnings
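For contrast with dumbfunc, here is a minimal sketch of a function that honours its out parameters by defining them before any read. The function min_max and its parameter names are hypothetical and not part of the original slides; the annotations are ordinary comments to a C compiler, and the sketch has not been run through Splint.

```c
#include <stddef.h>

/* Hypothetical helper: writes the minimum and maximum of a non-empty
   array into *lo and *hi. The out annotations state that the storage
   behind lo and hi may be undefined on entry. */
void min_max(const int *values, size_t count,
             /*@out@*/ int *lo, /*@out@*/ int *hi)
{
    size_t i;

    /* Both outputs are assigned (used as lvalues) before they are read. */
    *lo = values[0];
    *hi = values[0];
    for (i = 1; i < count; i++) {
        if (values[i] < *lo) *lo = values[i];
        if (values[i] > *hi) *hi = values[i];
    }
}
```

A caller can pass the addresses of two uninitialized ints; after the call both are defined.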


Types

Strong type checking often reveals programming errors. Splint can check primitive C types more strictly and flexibly than typical compilers.

Built-in C types

Splint supports stricter checking of built-in C types. The char and enum types can be checked as distinct types, and the different numeric types can be type-checked strictly.

Characters

The primitive char type can be type-checked as a distinct type. If char is used as a distinct type, common errors involving assigning ints to chars are detected.

If the charint flag is on (+charint), char types are indistinguishable from ints.


Types - Enums

An error is reported if:

a value that is not a member of the enumeration is assigned to a variable of the enum type

an enum value is used as an operand of an arithmetic operator

If the enumint flag is on (+enumint), enum and int types may be used interchangeably.
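A small sketch of code written to stay within these rules. The type level and the function next_level are hypothetical names chosen for illustration. With enumint off, an assignment such as `level l = 7;` or an expression such as `l + 1` is the kind of use the rules above describe as an error; the version below avoids both by switching on the enumerators.

```c
typedef enum { LOW, MEDIUM, HIGH } level;

/* Hypothetical helper: steps to the next level without doing
   arithmetic on the enum value itself. HIGH saturates. */
level next_level(level l)
{
    switch (l) {
    case LOW:    return MEDIUM;
    case MEDIUM: return HIGH;
    case HIGH:   return HIGH;
    }
    return HIGH; /* not reached for valid enumerators */
}
```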


Memory management

About half the bugs in typical C programs can be attributed to memory management problems.

Some only appear sporadically

And some may only be apparent when compiled on a different platform

Splint detects many memory management errors at compile time

Using storage that may have been deallocated

Memory leaks

Returning a pointer to stack-allocated storage


Memory management - Memory Model

An object is a typed region of storage;

Some objects use a fixed amount of storage (that is allocated and deallocated by the compiler);

Other objects use dynamic memory storage that must be managed by the program.

Storage is undefined if it has not been assigned a value and defined after it has been assigned a value.

An object is completely defined if all storage that may be reached from it is defined.


Memory management - Memory Model (cont.)

What storage is reachable from an object depends on the type and value of the object.

Example

If p is a pointer to a structure, p is completely defined if the value of p is NULL, or if every field of the structure p points to is completely defined.
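The structure example can be sketched in C as follows. The type item and the function item_create are hypothetical; the point is only that the returned pointer is either NULL or points to a structure whose every field has been assigned, which matches the definition of completely defined given above.

```c
#include <stdlib.h>

typedef struct {
    int id;
    double weight;
} item;

/* Hypothetical constructor: the result is NULL, or a pointer to a
   structure in which every field has been assigned a value. */
/*@null@*/ /*@only@*/ item *item_create(int id, double weight)
{
    item *p = (item *) malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->id = id;          /* fields are used as lvalues while undefined */
    p->weight = weight;
    return p;            /* completely defined from here on */
}
```

Omitting either field assignment would leave the structure only partially defined, so reading that field later would be a use before definition.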


Memory management - Memory Model (cont.)

Left side of an assignment

When an expression is used as the left side of an assignment we say it is an lvalue;

Its location in memory is used, but not its value;

Undefined storage may be used as an lvalue since only its location is needed.

Right side of an assignment

When storage is used in any other way, we say it is used as an rvalue:

on the right side of an assignment;

as an operand to a primitive operator;

as a function parameter.

It is an anomaly to use undefined storage as an rvalue.


Memory management - Deallocation Errors

Deallocating storage when there are other live references to the same storage

Failing to deallocate storage before the last reference to it is lost

Solution

Obligation to release storage

This obligation is attached to the reference to which the storage is assigned

The only annotation is used to indicate that a reference is the only pointer to the object it points to:

/*@only@*/ /*@null@*/ void *malloc (size_t size);
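A minimal sketch of how the obligation travels with an only reference. The function copy_string is a hypothetical helper, not something from the slides: it returns fresh storage annotated only, so the caller inherits the duty to release it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a fresh copy of s, or NULL on failure.
   The only annotation hands the obligation to free to the caller. */
/*@only@*/ /*@null@*/ char *copy_string(const char *s)
{
    size_t len = strlen(s);
    char *copy = (char *) malloc(len + 1);
    if (copy == NULL) return NULL;
    memcpy(copy, s, len + 1);   /* includes the terminating '\0' */
    return copy;
}
```

The caller discharges the obligation by passing the result to free exactly once, or by handing it to another only reference.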


Memory management - Memory Leaks

1 extern /*@only@*/ int *glob;
2
3 /*@only@*/ int *f (/*@only@*/ int *x, int *y, int *z) {
4   int *m = (int *) malloc (sizeof (int));
5   glob = y;    // Memory leak
6   free (x);
7   *m = *x;     // Use after free
8   return z;    // Memory leak detected
9 }

> splint only.c
only.c:4: Only storage glob (type int *) not released before assignment: glob = y
   only.c:1: Storage glob becomes only
only.c:4: Implicitly temp storage y assigned to only: glob = y
only.c:6: Dereference of possibly null pointer m: *m
   only.c:8: Storage m may become null
only.c:6: Variable x used after being released
   only.c:5: Storage x released
only.c:7: Implicitly temp storage z returned as only: z
only.c:7: Fresh storage m not released before return
   only.c:3: Fresh storage m allocated


Memory management - Stack References

A memory error occurs if a pointer into the stack is live after the function returns

Splint detects errors involving stack references exported from a function through return values or assignments to references reachable from global variables or actual parameters

No annotations are needed to detect stack reference errors. It is clear from declarations if storage is allocated on the function stack.

 1 int *glob;
 2
 3 int *f (int **x) {
 4   int sa[2] = { 0, 1 };
 5   int loc = 3;
 6
 7   glob = &loc;
 8   *x = &sa[0];
 9   return &loc;
10 }

> splint stack.c
stack.c:9: Stack-allocated storage &loc reachable from return value: &loc
stack.c:9: Stack-allocated storage *x reachable from parameter x
   stack.c:8: Storage *x becomes stack
stack.c:9: Stack-allocated storage glob reachable from global glob
   stack.c:7: Storage glob becomes stack
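Two common ways to avoid exporting stack references, sketched under the assumption that the caller either owns the result or supplies the storage. The names make_counter and fill_pair are hypothetical and do not appear in the original example.

```c
#include <stdlib.h>

/* Option 1: the result lives on the heap, so it stays valid after the
   function returns. The caller must free it. */
/*@only@*/ /*@null@*/ int *make_counter(int start)
{
    int *p = (int *) malloc(sizeof *p);
    if (p != NULL) *p = start;
    return p;
}

/* Option 2: the caller supplies the storage, so no address of a local
   variable leaves the function. */
void fill_pair(/*@out@*/ int pair[2])
{
    pair[0] = 0;
    pair[1] = 1;
}
```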


Control Flow - Execution

Many of these checks are possible because of the extra information that is known in annotations

To avoid spurious errors it is important to know something about the behaviour of called functions

Without additional information Splint assumes that all functions return and execution continues normally


Control Flow - Execution (cont.)

The noreturn annotation is used to denote a function that never returns.

extern /*@noreturn@*/ void fatalerror (char *s);

Problem!

We also have the maynotreturn and alwaysreturns annotations, but Splint does not verify whether a function really returns: when checking the code it must assume that a function returns normally unless an annotation says otherwise.
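A sketch of how a noreturn function is typically used. The body of fatalerror and the caller checked_div are assumptions made for illustration; the slides only give the declaration. Because Splint takes the annotation on trust, it is up to the programmer to make sure the body really never returns, here by calling exit.

```c
#include <stdio.h>
#include <stdlib.h>

/* Never returns: prints the message and terminates the process. */
/*@noreturn@*/ void fatalerror(const char *s)
{
    fprintf(stderr, "fatal: %s\n", s);
    exit(EXIT_FAILURE);
}

/* Hypothetical caller: with the annotation, a checker knows that the
   division is only reached when b is non-zero. */
int checked_div(int a, int b)
{
    if (b == 0) {
        fatalerror("division by zero");
    }
    return a / b;
}
```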


Control Flow - Execution (cont.)

The noreturnwhentrue and noreturnwhenfalse annotations describe functions that never return when their first argument is true or false, respectively.

/*@noreturnwhenfalse@*/ void assert (/*@sef@*/ bool /*@alt int@*/ pred);

The sef annotation denotes a parameter as side effect free

The alt int annotation indicates that it may be either a Boolean or an integer


Control Flow - Undefined Behavior

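The order in which side effects take place in C is not entirely defined by the code.

Sequence point

Side effects are guaranteed to be complete only at a sequence point, such as:

a function call (after the arguments have been evaluated)

the end of the controlling expression of an if, while, for or do statement

after the first operand of &&, || and ?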


Control Flow - Undefined Behavior (cont.)

 1 extern int glob;
 2 extern int mystery (void);
 3 extern int modglob (void) /*@globals glob@*/ /*@modifies glob@*/;
 4 int f (int x, int y[]) {
 5   int i = x++ * x;
 6   y[i] = i++;
 7   i += modglob () * glob;
 8   i += mystery () * glob;
 9   return i;
10 }

> splint order.c +evalorderuncon
order.c:5: Expression has undefined behavior (value of right operand modified by left operand): x++ * x
order.c:6: Expression has undefined behavior (left operand uses i, modified by right operand): y[i] = i++
order.c:7: Expression has undefined behavior (value of right operand modified by left operand): modglob () * glob
order.c:8: Expression has undefined behavior (unconstrained function mystery used in left operand may set global variable glob used in right operand): mystery () * glob
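One way to remove this kind of warning is to give every side effect its own statement, so that a sequence point separates each write from the reads that depend on it. The sketch below picks a left-to-right reading of the original expressions, which is an assumption about the intended meaning. The names f_ordered, old_x and m, and the definitions of glob and modglob, are made up to keep the example self-contained.

```c
/* Illustrative definitions so the sketch compiles on its own. */
static int glob = 2;

static int modglob(void)
{
    glob += 1;          /* side effect on the global */
    return glob;
}

/* Every write is separated from the dependent reads by a statement
   boundary, so the evaluation order is fully determined. */
int f_ordered(int x, int y[])
{
    int old_x = x;
    int i, m;

    x++;
    i = old_x * x;      /* was: x++ * x      (undefined behavior) */
    y[i] = i;           /* was: y[i] = i++   (undefined behavior) */
    i++;
    m = modglob();      /* call first, then read glob */
    i += m * glob;
    return i;
}
```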


Control Flow - Likely Infinite Loops

Splint reports an error if it detects a loop that appears to be infinite. An error is reported for a loop that does not modify any value used in its condition test inside the body of the loop or in the condition test itself.

1 extern int glob1, glob2;
2 extern int f (void) /*@globals glob1@*/ /*@modifies nothing@*/;
3 extern void g (void) /*@modifies glob2@*/;
4 extern void h (void);
5
6 void upto (int x) {
7   while (x > f ()) g();
8   while (f () < 3) h();
9 }

> splint loop.c +infloopsuncon
loop.c:7: Suspected infinite loop. No value used in loop test (x, glob1) is modified by test or loop body.
loop.c:8: Suspected infinite loop. No condition values modified. Modification possible through unconstrained calls: h
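For comparison, a sketch of a loop whose body does modify a value read by its condition, which is the property this check looks for. The names remaining, pending, consume and drain are hypothetical; the globals and modifies clauses follow the style of the example above and were not verified with Splint.

```c
static int remaining = 0;

/* Reads remaining but modifies nothing. */
static int pending(void) /*@globals remaining@*/ /*@modifies nothing@*/
{
    return remaining;
}

/* Modifies the value that the loop test depends on. */
static void consume(void) /*@globals remaining@*/ /*@modifies remaining@*/
{
    remaining--;
}

/* The loop test reads remaining through pending(), and the body
   changes it through consume(), so the loop makes progress. */
int drain(int start)
{
    int steps = 0;

    remaining = start;
    while (pending() > 0) {
        consume();
        steps++;
    }
    return steps;
}
```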


Control Flow - Switches

Splint detects case statements with code that may fall through to the next case. The casebreak flag controls reporting of fall-through cases. The fallthrough annotation, written /*@fallthrough@*/ before a case, explicitly indicates that execution is meant to fall through to that case.

 1 typedef enum {
 2   YES, NO, DEFINITELY,
 3   PROBABLY, MAYBE } ynm;
 4
 5 void decide (ynm y) {
 6   switch (y) {
 7     case PROBABLY:
 8     case NO: printf ("No!");
 9     case MAYBE: printf ("Maybe");
10     /*@fallthrough@*/
11     case YES: printf ("Yes!");
12   }
13 }

> splint switch.c
switch.c:9: Fall through case (no preceding break)
switch.c:12: Missing case in switch: DEFINITELY
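A sketch of the same switch with both warnings addressed: every non-empty case ends in break, and DEFINITELY is handled. Mapping DEFINITELY to the YES branch is an assumption, as is the function name decide_text, which returns the text instead of printing it so that the result is easy to check.

```c
typedef enum { YES, NO, DEFINITELY, PROBABLY, MAYBE } ynm;

/* Hypothetical variant of decide: every enumerator is covered and no
   case with code falls through to the next one. */
const char *decide_text(ynm y)
{
    const char *text = "Unknown";

    switch (y) {
    case PROBABLY:      /* empty case: shares the NO branch */
    case NO:
        text = "No!";
        break;
    case MAYBE:
        text = "Maybe";
        break;
    case DEFINITELY:    /* assumed to mean the same as YES */
    case YES:
        text = "Yes!";
        break;
    }
    return text;
}
```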


Control Flow - Conclusion

But Splint has more!

Deep Breaks

Complete Logic


Buffer sizes

1 Buffer overflow errors are a particularly dangerous type of bug in C

2 They are responsible for half of all security attacks

3 C does not perform runtime bound checking (for performance reasons)

4 Attackers can exploit program bugs to gain full access to a machine


Buffer sizes - Checking access

Splint models blocks of memory using two properties:

maxSet

maxSet(b) denotes the highest index of buffer b that can be safely used as an lvalue. For instance, for char buffer[MAXSIZE] we have maxSet(buffer) = MAXSIZE - 1

maxRead

maxRead(b) denotes the highest index of a buffer that can be safely used as an rvalue.

When a buffer is accessed as an lvalue, Splint generates a precondition constraint involving the maxSet property

When a buffer is accessed as an rvalue, Splint generates a precondition constraint involving the maxRead property
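The two constraints can be mirrored as runtime checks, which may help to see what Splint tries to prove statically. This is only an analogy: the type int_buffer and the functions buffer_store and buffer_load are hypothetical, and Splint itself performs no runtime checking.

```c
#include <stddef.h>

#define MAXSIZE 10

typedef struct {
    int data[MAXSIZE];
    size_t used;    /* elements 0 .. used-1 have been assigned a value */
} int_buffer;

/* Write access: the index must not exceed maxSet, here MAXSIZE - 1.
   To keep the defined region contiguous it may be at most b->used. */
int buffer_store(int_buffer *b, size_t index, int value)
{
    if (index > MAXSIZE - 1 || index > b->used) return 0;
    b->data[index] = value;
    if (index == b->used) b->used++;
    return 1;
}

/* Read access: the index must not exceed maxRead, here b->used - 1. */
int buffer_load(const int_buffer *b, size_t index, /*@out@*/ int *value)
{
    if (index >= b->used) return 0;
    *value = b->data[index];
    return 1;
}
```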


Buffer sizes - Annotating Buffer Sizes

1 Function declarations may include requires and ensures clauses to specify assumptions about buffer sizes as function preconditions and postconditions

2 When a function with a requires clause is called, the call site must be checked to satisfy the constraints implied by requires

3 If the +checkpost flag is set, Splint warns if it cannot verify that a function implementation satisfies its declared postconditions


Buffer sizes - Annotating Buffer Sizes (cont.)

void /*@alt char * @*/ strcpy
  (/*@unique@*/ /*@out@*/ /*@returned@*/ char *s1, char *s2)
  /*@modifies *s1@*/
  /*@requires maxSet(s1) >= maxRead(s2) @*/
  /*@ensures maxRead(s1) == maxRead (s2) @*/;


Buffer sizes - Annotating Buffer Sizes (cont.)

void /*@alt char * @*/ strncpy
  (/*@unique@*/ /*@out@*/ /*@returned@*/ char *s1, char *s2, size_t n)
  /*@modifies *s1@*/
  /*@requires maxSet(s1) >= ( n - 1 ); @*/
  /*@ensures maxRead (s2) >= maxRead(s1) /\ maxRead (s1) <= n; @*/;
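The same clauses can be written on user functions. A minimal sketch, modelled on the strncpy declaration above: fill_bytes is a hypothetical function, and the clause syntax is copied from the library declarations rather than checked against Splint.

```c
#include <stddef.h>

/* Hypothetical function: fills the first n bytes of buf with c.
   The requires clause states that index n - 1 must be writable
   (n is assumed to be at least 1). */
void fill_bytes(/*@out@*/ char *buf, size_t n, char c)
    /*@modifies *buf@*/
    /*@requires maxSet(buf) >= (n - 1) @*/
{
    size_t i;

    for (i = 0; i < n; i++) {
        buf[i] = c;
    }
}
```

A call such as fill_bytes(b, 4, 'x') with char b[4] satisfies the precondition, since maxSet(b) is 3.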


Buffer sizes - Warnings

Bound checking is more complex than other checks done by Splint

So, memory bound warnings contain extensive information about the unresolved constraint

int buf[10];
buf[10] = 3;

setChar.c:5:4: Likely out-of-bounds store: buf[10]
   Unable to resolve constraint: requires 9 >= 10
   needed to satisfy precondition: requires maxSet(buf @ setChar.c:5:4) >= 10


Buffer sizes - Warnings (cont.)

1 void updateEnv(char * str) {
2   char * tmp;
3   tmp = getenv("MYENV");
4   if (tmp != NULL)
5     strcpy (str, tmp);
6 }

> splint bounds.c +bounds +showconstraintlocation
bounds.c:5: Possible out-of-bounds store: strcpy(str, tmp)
   Unable to resolve constraint:
   requires maxSet(str @ bounds.c:5) >= maxRead(getenv("MYENV") @ bounds.c:3)
   needed to satisfy precondition:
   requires maxSet(str @ bounds.c:5) >= maxRead(tmp @ bounds.c:5)
   derived from strcpy precondition:
   requires maxSet(<parameter 1>) >= maxRead(<parameter 2>)
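The warning arises because nothing relates the size of str to the length of the environment value. A sketch of a bounded variant, assuming the caller can pass the size of its buffer: copy_if_fits and update_env_bounded are hypothetical names, and whether Splint can discharge the constraint for this version was not checked.

```c
#include <stdlib.h>
#include <string.h>

/* Copies src into dst only if it fits, terminator included.
   Returns 1 on success and 0 if dst is too small. */
int copy_if_fits(char *dst, size_t size, const char *src)
{
    size_t len = strlen(src);

    if (len >= size) return 0;      /* needs len + 1 bytes */
    memcpy(dst, src, len + 1);
    return 1;
}

/* Bounded variant of updateEnv: str is left untouched when MYENV is
   unset or its value does not fit into size bytes. */
int update_env_bounded(char *str, size_t size)
{
    const char *tmp = getenv("MYENV");

    if (tmp == NULL) return 0;
    return copy_if_fits(str, size, tmp);
}
```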


The Ultimate Test: wu-ftpd

wu-ftpd version 2.5.0

20,000 lines of code

Took less than four seconds to check all of wu-ftpd on a 1.2-GHz Athlon machine

Splint detected the known flaws as well as finding some previously unknown flaws (!)


The Ultimate Test: wu-ftpd (cont.)

Running Splint on wu-ftpd without adding annotations produced 166 warnings for potential out-of-bounds writes

After adding 66 annotations, it produced 101 warnings: 25 of these indicated real problems and 76 were false positives


Pros and Cons

Pros

Lightweight static analysis detects software vulnerabilities

Splint definitely improves code quality

Suitable for real programs...

Cons

...although it produces many warning messages, which can lead to confusion

It won’t eliminate all security risks

Hasn’t been developed since 2007; the project needs new volunteers


Conclusions

No tool will eliminate all security risks

Lightweight static analysis tools (Splint) play an important role in identifying security vulnerabilities


Questions?
