Splint the C code static checker
Pedro Pereira, Ulisses Costa
Formal Methods in Software Engineering
May 28, 2009
Pedro Pereira, Ulisses Costa Splint the C code static checker
Outline
1 Introduction
2 Unused variables
3 Types
4 Memory management
5 Control Flow
6 Buffer sizes
7 The Ultimate Test: wu-ftpd
8 Pros and Cons
9 Conclusions
Lint for detecting anomalies in C programs
Statically checking C programs
Unused declarations
Type inconsistencies
Use before definition
Unreachable code
Ignored return values
Execution paths with no return
Infinite loops
Splint
Specification Lint and Secure Programming Lint
Annotations
Functions
Variables
Parameters
Types
Unused variables
Splint detects instances where the value of a location is used before it is defined.
Annotations can be used to describe what storage must be defined and what storage may be undefined at interface points.
Unless annotated otherwise, all storage reachable from the following is assumed to be defined before and after a function call:
global variable
parameter to a function
function return value
Undefined Parameters
Sometimes, function parameters or return values are expected to reference undefined or partially defined storage.
The out annotation denotes a pointer to storage that may be undefined.
The in annotation denotes a parameter that must be completely defined.
1 extern void setVal (/*@out@*/ int *x);
2 extern int getVal (/*@in@*/ int *x);
3 extern int mysteryVal (int *x);
4
5 int dumbfunc (/*@out@*/ int *x, int i) {
6 if (i > 3)
7 return *x;
8 else if (i > 1)
9 return getVal (x);
10 else if (i == 0)
11 return mysteryVal (x);
12 else {
13 setVal (x);
14 return *x;
15 }
16 }
> splint usedef.c
usedef.c:7: Value *x used before definition
usedef.c:9: Passed storage x not completely defined (*x is undefined): getVal (x)
usedef.c:11: Passed storage x not completely defined (*x is undefined): mysteryVal (x)
Finished checking --- 3 code warnings
Types
Strong type checking often reveals programming errors. Splint can check primitive C types more strictly and flexibly than typical compilers.
Built-in C Types
Splint supports stricter checking of built-in C types. The char and enum types can be checked as distinct types, and the different numeric types can be type-checked strictly.
Characters
The primitive char type can be type-checked as a distinct type. If char is used as a distinct type, common errors involving assigning ints to chars are detected.
If the charint flag is on (+charint), char types are indistinguishable from ints.
Types - Enums
An error is reported if:
a value that is not an enumerator member is assigned to the enum type
an enum type is used as an operand to an arithmetic operator
If the enumint flag is on, enum and int types may be used interchangeably.
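The two enum rules can be illustrated with a minimal sketch. The type color and the function next_color are hypothetical names chosen for this example, and the warnings described in the comment are what the rules above lead one to expect with enumint off, not captured Splint output.

```c
#include <assert.h>

/* Hypothetical enum, used only for illustration. */
typedef enum { RED, GREEN, BLUE } color;

/*
 * With enumint off, Splint is expected to report both of these:
 *   color c = 7;         value is not a member of the enum
 *   color d = RED + 1;   enum used as an arithmetic operand
 */

/* Step to the next colour without doing arithmetic on the enum. */
color next_color (color c)
{
  switch (c) {
    case RED:   return GREEN;
    case GREEN: return BLUE;
    case BLUE:  return RED;
  }
  assert (0);   /* not reached for a valid enumerator */
  return RED;
}
```

Writing the successor as a switch keeps every value inside the enumerated set, which is the property the stricter checking is meant to protect.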
Memory management
About half the bugs in typical C programs can be attributed to memory management problems.
Some only appear sporadically
And some may only be apparent when compiled on a different platform
Splint detects many memory management errors at compile time
Using storage that may have been deallocated
Memory leaks
Returning a pointer to stack-allocated storage
Memory management - Memory Model
An object is a typed region of storage;
Some objects use a fixed amount of storage (that is allocated and deallocated by the compiler);
Other objects use dynamic memory storage that must be managed by the program.
Storage is undefined if it has not been assigned a value, and defined after it has been assigned a value.
An object is completely defined if all storage that may be reached from it is defined.
Memory management - Memory Model (cont.)
What storage is reachable from an object depends on the type and value of the object.
Example
If p is a pointer to a structure, p is completely defined if the value of p is NULL, or if every field of the structure p points to is completely defined.
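A minimal sketch of "completely defined" for a pointer to a structure. The record type and both functions are hypothetical; only the out and in annotations come from Splint itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure, used only for illustration. */
typedef struct {
  int id;
  int count;
} record;

/* On entry *r may be undefined; on return every field is defined. */
void record_init (/*@out@*/ record *r, int id)
{
  assert (r != NULL);
  r->id = id;
  r->count = 0;
}

/* Reads every field, so *r has to be completely defined on entry. */
int record_total (/*@in@*/ const record *r)
{
  assert (r != NULL);
  return r->id + r->count;
}
```

If record_init set only one of the two fields, the structure would be partially defined and passing it to record_total would be the kind of anomaly the memory model describes.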
Memory management - Memory Model (cont.)
Left side of an assignment
When an expression is used as the left side of an assignment we say it is an lvalue;
Its location in memory is used, but not its value;
Undefined storage may be used as an lvalue since only its location is needed.
Right side of an assignment
When storage is used in any other way, we say it is used as an rvalue:
on the right side of an assignment;
as an operand to a primitive operator;
as a function parameter.
It is an anomaly to use undefined storage as an rvalue.
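The distinction can be seen in a few lines. This is only an illustrative sketch; the function name lvalue_then_rvalue is made up for the example.

```c
int lvalue_then_rvalue (void)
{
  int a;        /* storage is allocated, but its value is undefined */
  int b;

  a = 10;       /* a as an lvalue: only its location is used */
  b = a + 1;    /* a as an rvalue: allowed, because a is now defined */
  return b;
}
```

Swapping the two assignments would read a before it has a value, which is the use-before-definition anomaly.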
Memory management - Deallocation Errors
Deallocating storage when there are other live references to the same storage
Failing to deallocate storage before the last reference to it is lost
Solution
Obligation to release storage
This obligation is attached to the reference to which the storage is assigned
The only annotation is used to indicate that a reference is the only pointer to the object it points to:
/*@only@*/ /*@null@*/ void *malloc (size_t size);
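A minimal sketch of how the release obligation moves between functions. The names make_counter, drop_counter and counter_roundtrip are hypothetical; only the only and null annotations are Splint's.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns fresh storage: the caller inherits the obligation to release it. */
/*@only@*/ /*@null@*/ int *make_counter (int start)
{
  int *c = (int *) malloc (sizeof (int));
  if (c != NULL) {
    *c = start;
  }
  return c;
}

/* Takes over the only reference and meets the obligation by freeing it.
   The parameter has no null annotation, so it is expected to be non-NULL. */
void drop_counter (/*@only@*/ int *c)
{
  assert (c != NULL);
  free (c);
}

int counter_roundtrip (int start)
{
  int value;
  int *c = make_counter (start);

  if (c == NULL) {
    return -1;          /* allocation failed; nothing to release */
  }
  value = *c;
  drop_counter (c);     /* ownership passes on; c is dead after this call */
  return value;
}
```

Every path through counter_roundtrip either never receives storage or hands it to drop_counter exactly once, so no reference is lost and none is used after release.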
Memory management - Memory Leaks
1 extern /*@only@*/ int *glob;
2
3 /*@only@*/ int *f (/*@only@*/ int *x, int *y, int *z) {
4   int *m = (int *) malloc (sizeof (int));
5   glob = y;    // Memory leak
6   free (x);
7   *m = *x;     // Use after free
8   return z;    // Memory leak detected
9 }
> splint only.c
only.c:4: Only storage glob (type int *) not released before assignment: glob = y
   only.c:1: Storage glob becomes only
only.c:4: Implicitly temp storage y assigned to only: glob = y
only.c:6: Dereference of possibly null pointer m: *m
   only.c:8: Storage m may become null
only.c:6: Variable x used after being released
   only.c:5: Storage x released
only.c:7: Implicitly temp storage z returned as only: z
only.c:7: Fresh storage m not released before return
   only.c:3: Fresh storage m allocated
Memory management - Stack References
A memory error occurs if a pointer into the stack is live after the function returns
Splint detects errors involving stack references exported from a function through return values or assignments to references reachable from global variables or actual parameters
No annotations are needed to detect stack reference errors. It is clear from declarations if storage is allocated on the function stack.
1 int *glob;
2
3 int *f (int **x) {
4 int sa[2] = { 0, 1 };
5 int loc = 3;
6
7 glob = &loc;
8 *x = &sa[0];
9 return &loc;
10 }
> splint stack.c
stack.c:9: Stack-allocated storage &loc reachable from return value: &loc
stack.c:9: Stack-allocated storage *x reachable from parameter x
   stack.c:8: Storage *x becomes stack
stack.c:9: Stack-allocated storage glob reachable from global glob
   stack.c:7: Storage glob becomes stack
Control Flow - Execution
Many of these checks are possible because of the extra information that is known in annotations
To avoid spurious errors it is important to know something about the behaviour of called functions
Without additional information Splint assumes that all functions return and execution continues normally
Control Flow - Execution (cont.)
The noreturn annotation is used to denote a function that never returns.
extern /*@noreturn@*/ void fatalerror (char *s);
Problem!
The maynoreturn and alwaysreturns annotations also exist, but Splint must assume that a function returns normally when checking the code, and it does not verify whether a function really returns.
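A minimal sketch of where the noreturn annotation matters. The body of fatalerror and the function first_char are assumptions made for illustration (the slide only declares fatalerror, and with a plain char * parameter).

```c
#include <stdio.h>
#include <stdlib.h>

/* Prints a message and terminates the program; it never returns. */
/*@noreturn@*/ void fatalerror (const char *s)
{
  fprintf (stderr, "fatal: %s\n", s);
  exit (EXIT_FAILURE);
}

/* Hypothetical caller: s may be NULL on entry. */
int first_char (/*@null@*/ const char *s)
{
  if (s == NULL) {
    fatalerror ("null string");
  }
  /* Because fatalerror is annotated noreturn, execution only gets
     here when s is not NULL, so the dereference is safe. */
  return (int) s[0];
}
```

Without the annotation, a checker that assumes every call returns would see a path on which s is still NULL at the dereference and would be expected to warn about it.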
Control Flow - Execution (cont.)
To describe functions that do not return under some condition, the noreturnwhentrue and noreturnwhenfalse annotations mean that a function never returns if its first argument is true or false, respectively.
/*@noreturnwhenfalse@*/ void assert (/*@sef@*/ bool /*@alt int@*/ pred);
The sef annotation denotes a parameter as side-effect free
The alt int annotation indicates that the argument may be either a Boolean or an integer
Control Flow - Undefined Behavior
The order in which side effects take place in C is not entirely defined by the code.
Sequence points
a function call (after the arguments have been evaluated)
the end of the controlling expression of an if, while, for or do statement
after the first operand of &&, || and ?:
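A minimal sketch of code that stays within the sequence-point rules. Both function names are hypothetical, and the first function shows only one of the possible intended readings of an expression such as x++ * x.

```c
/* One defined reading of `x++ * x`: multiply the old value by itself,
   then increment. */
int square_then_increment (int *x)
{
  int old = *x;     /* read x ... */
  *x = old + 1;     /* ... then modify it in a separate full expression */
  return old * old;
}

/* && evaluates its left operand first, with a sequence point after it,
   so the bounds check is complete before a[i] is read. */
int in_range_and_nonzero (const int *a, int n, int i)
{
  return (i >= 0 && i < n) && a[i] != 0;
}
```

Splitting the read and the write into separate statements places a sequence point between them, so the result no longer depends on the compiler's evaluation order.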
Control Flow - Undefined Behavior (cont.)
1 extern int glob;
2 extern int mystery (void);
3 extern int modglob (void) /*@globals glob@*/ /*@modifies glob@*/;
4 int f (int x, int y[]) {
5 int i = x++ * x;
6 y[i] = i++;
7 i += modglob () * glob;
8 i += mystery () * glob;
9 return i;
10 }
> splint order.c +evalorderuncon
order.c:5: Expression has undefined behavior (value of right operand modified by left operand): x++ * x
order.c:6: Expression has undefined behavior (left operand uses i, modified by right operand): y[i] = i++
order.c:7: Expression has undefined behavior (value of right operand modified by left operand): modglob () * glob
order.c:8: Expression has undefined behavior (unconstrained function mystery used in left operand may set global variable glob used in right operand): mystery () * glob
Control Flow - Likely Infinite Loops
Splint reports an error if it detects a loop that appears to be infinite. An error is reported for a loop that does not modify any value used in its condition test, either inside the body of the loop or in the condition test itself.
1 extern int glob1, glob2;
2 extern int f (void) /*@globals glob1@*/ /*@modifies nothing@*/;
3 extern void g (void) /*@modifies glob2@*/;
4 extern void h (void);
5
6 void upto (int x) {
7 while (x > f ()) g();
8 while (f () < 3) h();
9 }
> splint loop.c +infloopsuncon
loop.c:7: Suspected infinite loop. No value used in loop test (x, glob1) is modified by test or loop body.
loop.c:8: Suspected infinite loop. No condition values modified. Modification possible through unconstrained calls: h
Control Flow - Switches
Splint detects case statements with code that may fall through to the next case. The casebreak flag controls reporting of fall-through cases. A /*@fallthrough@*/ comment placed before a case explicitly indicates that execution is meant to fall through to that case.
1 typedef enum {
2   YES, NO, DEFINITELY,
3   PROBABLY, MAYBE } ynm;
4
5 void decide (ynm y) {
6   switch (y) {
7     case PROBABLY:
8     case NO: printf ("No!");
9     case MAYBE: printf ("Maybe");
10      /*@fallthrough@*/
11    case YES: printf ("Yes!");
12  }
13 }
> splint switch.c
switch.c:9: Fall through case (no preceding break)
switch.c:12: Missing case in switch: DEFINITELY
Control Flow - Conclusion
But Splint has more!
Deep Breaks
Complete Logic
Buffer sizes
1 Buffer overflow errors are a particularly dangerous type of bug in C
2 They are reported to be responsible for about half of all security attacks
3 C does not perform runtime bounds checking (for performance reasons)
4 Attackers can exploit these bugs to gain full access to a machine
Buffer sizes - Checking access
Splint models blocks of memory using two properties:
maxSet
maxSet(b) denotes the highest address beyond b that can be safely used as an lvalue. For instance, for char buffer[MAXSIZE] we have maxSet(buffer) = MAXSIZE - 1
maxRead
maxRead(b) denotes the highest index of a buffer that can be safely used as an rvalue.
When a buffer is accessed as an lvalue, Splint generates a precondition constraint involving the maxSet property
When a buffer is accessed as an rvalue, Splint generates a precondition constraint involving the maxRead property
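The maxSet constraint can be mirrored at run time. The sketch below is an illustration under assumptions: checked_store is a hypothetical helper, MAXSIZE is given an arbitrary value, and the caller is assumed to pass a buffer of exactly MAXSIZE chars.

```c
#include <stddef.h>

#define MAXSIZE 8   /* arbitrary size chosen for the example */

/* Stores c at buffer[i] only when i <= maxSet(buffer) == MAXSIZE - 1.
   Assumes buffer points to an array of MAXSIZE chars.
   Returns 1 if the store happened, 0 if it would be out of bounds. */
int checked_store (char *buffer, size_t i, char c)
{
  if (i > (size_t) (MAXSIZE - 1)) {
    return 0;
  }
  buffer[i] = c;    /* lvalue use: needs maxSet(buffer) >= i */
  return 1;
}
```

Splint aims to establish the same inequality statically, so that a store such as buffer[MAXSIZE] is flagged without running the program.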
Buffer sizes - Annotating Buffer Sizes
1 Function declarations may include requires and ensures clauses to specify assumptions about buffer sizes as function preconditions and postconditions
2 When a function with a requires clause is called, the call site must be checked to satisfy the constraints implied by requires
3 If the +checkpost flag is set, Splint warns if it cannot verify that a function implementation satisfies its declared postconditions
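A minimal sketch of a user-written function carrying a requires clause. The function fill_char is hypothetical, and the clause follows the same form the annotated standard library uses; it is not a claim about how Splint would resolve the constraint for every caller (the case n == 0, for example, is not captured well by n - 1).

```c
#include <stddef.h>

/* Hypothetical function: writes c into buf[0] .. buf[n - 1].
   The requires clause states that the caller must supply a buffer
   whose highest writable index is at least n - 1. */
void fill_char (/*@out@*/ char *buf, size_t n, char c)
  /*@requires maxSet(buf) >= (n - 1) @*/
{
  size_t i;

  for (i = 0; i < n; i++) {
    buf[i] = c;
  }
}
```

A call such as fill_char (b, 4, 'z') with char b[4] satisfies the clause because maxSet(b) is 3; passing 5 for the same buffer is the kind of call site the check is meant to reject.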
Buffer sizes - Annotating Buffer Sizes (cont.)
void /*@alt char * @*/ strcpy
  (/*@unique@*/ /*@out@*/ /*@returned@*/ char *s1, char *s2)
  /*@modifies *s1@*/
  /*@requires maxSet(s1) >= maxRead(s2) @*/
  /*@ensures maxRead(s1) == maxRead (s2) @*/;
Buffer sizes - Annotating Buffer Sizes (cont.)
void /*@alt char * @*/ strncpy
  (/*@unique@*/ /*@out@*/ /*@returned@*/ char *s1, char *s2, size_t n)
  /*@modifies *s1@*/
  /*@requires maxSet(s1) >= ( n - 1 ); @*/
  /*@ensures maxRead (s2) >= maxRead(s1) /\ maxRead (s1) <= n; @*/;
Buffer sizes - Warnings
Bounds checking is more complex than other checks done by Splint
So, memory bounds warnings contain extensive information about the unresolved constraint
int buf [10];
buf [10] = 3;
setChar.c:5:4: Likely out-of-bounds store: buf [10]
    Unable to resolve constraint: requires 9 >= 10
    needed to satisfy precondition: requires maxSet(buf @ setChar.c:5:4) >= 10
Buffer sizes - Warnings (cont.)
1 void updateEnv(char * str) {
2 char * tmp;
3 tmp = getenv("MYENV");
4 if (tmp != NULL)
5 strcpy (str , tmp);
6 }
> splint bounds.c +bounds +showconstraintlocation
bounds.c:5: Possible out-of-bounds store: strcpy(str, tmp)
    Unable to resolve constraint:
    requires maxSet(str @ bounds.c:5) >= maxRead(getenv ("MYENV") @ bounds.c:3)
    needed to satisfy precondition:
    requires maxSet(str @ bounds.c:5) >= maxRead(tmp @ bounds.c:5)
    derived from strcpy precondition:
    requires maxSet(<parameter 1>) >= maxRead(<parameter 2>)
The Ultimate Test: wu-ftpd
wu-ftpd version 2.5.0
About 20,000 lines of code
Splint took less than four seconds to check all of wu-ftpd on a 1.2-GHz Athlon machine
Splint detected the known flaws and also found some previously unknown flaws (!)
The Ultimate Test: wu-ftpd (cont.)
Running Splint on wu-ftpd without adding annotations produced 166 warnings for potential out-of-bounds writes
After adding 66 annotations, it produced 101 warnings: 25 of these indicated real problems and 76 were false alarms
Pros and Cons
Pros
Lightweight static analysis detects software vulnerabilities
Splint definitely improves code quality
Suitable for real programs...
Cons
... although it produces many warning messages, which can lead to confusion
It won't eliminate all security risks
It has not been developed since 2007; the project needs new volunteers
Conclusions
No tool will eliminate all security risks
Lightweight static analysis tools (Splint) play an important role in identifying security vulnerabilities
Questions
?