Detecting Deadlock, Double-Free and Other Abuses in a Million Lines of Linux Kernel Source (SEW-30)

Post on 10-Jun-2015

DESCRIPTION

Presentation at the 30th Annual IEEE/NASA Software Engineering Workshop (SEW-30), Loyola College Graduate Center, Columbia, MD, USA, April 25, 2006. The preprint of the paper is at http://www.academia.edu/1413564/Detecting_deadlock_double-free_and_other_abuses_in_a_million_lines_of_linux_kernel_source. DOI 10.1109/SEW.2006.1.

TRANSCRIPT

(Needles in a Haystack)

Detecting Deadlock, Double-Free and Other Abuses in a Million Lines of Linux Kernel Source

Peter T. Breuer, Simon Pickin

Universidad Carlos III de Madrid

Maria Larrondo Petrie

Florida Atlantic University

Goal

• Add quality assurance from Formal Methods to the Linux kernel

– post-hoc

– capable of application by non-experts

– handle 6.5 million lines of rapidly changing C code

A little poem

"The time has come," the Walrus said,
"To talk of many things:
Of shoes - and ships - and sealing-wax
Of cabbages - and kings -
And why the sea is boiling hot -
And whether pigs have wings."

L. Carroll, The Walrus and the Carpenter

• Pigs with wings have seemed about as likely as Formal Methods in the Linux kernel!

Frightening Naïveté in FM

• "For example in p. 13 you claim that the kernel does not treat an infinite number of user request[s] between DiskTQ events. This is not completely true ..."

• BUT HERE is a fundamental problem with the paper: it ignores early work on operating system correctness. In 1969, A.N. Habermann (of Dijkstra's T.H.E. team) wrote a thesis entitled "On the harmonious cooperation of abstract machines". His results [are] at least very similar to yours. Whether they are equivalent I cannot say ...

Analysis Example: Sleep under Spinlock Hunt (SluSH - needs funding)

Challenges for FM in OS

• Programming in the large

– 6.5 Million LoC, 15 hardware architectures, hundreds or thousands of authors, hundreds of changes every day

– industrial programming "the code should be commented"

– literate programming "the comments should be coded"

– open source "the code is the commentary"

Methodology

• Apply combinatory logic to C programs (think precisely how afterwards)

• Parallel abstract interpretation of state to guide analysis

• Further interpretation of logic to generate actions

• Initial idea, that loops are evaluated "almost once", evolved into finding the loop invariant automatically

• Did not initially realise that C statements return a value (did realise that C expression evaluations affect state!)

SluSH run

Example of detected miscreant code

• snd_sb_csp_load() in sb16_csp.c

Another piece of guilty code

• Kernel 2.6.12 sound/oss/sequencer.c midi_outc()

Alan Cox owns up

How many Type II errors?

• 16 false alarms per 1000 files

• 2 real alarms

– in kernel coding, nearly any over-reporting is acceptable to authors

• There will always be over-detection. Why?

Output summarises likelihoods

How many Type I errors?

• Ideally 0!

– abstract approximation

– under-specifies, over-estimates

– if we say/see it doesn't happen, it doesn't happen

● provisos: data/code memory separation; library functions do not modify current environment; parallel thread does not rewrite local data

Example of kfree/access

• drivers/scsi/aic7xxx_old.c in kernel 2.6.3
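The kind of check behind a kfree/access alarm can be pictured as a tiny state machine over one pointer. This is only a sketch under stated assumptions, not the analyser's code: the names `pstate`, `ev`, `alarm`, `step`, `first_alarm` and `demo_kfree_then_access` are hypothetical, and the event list stands in for one straight-line path through a function.

```c
#include <stddef.h>

/* Hypothetical abstract state of one pointer: still live, or already freed. */
enum pstate { P_LIVE, P_FREED };

/* Hypothetical events seen along one path through the code. */
enum ev { EV_FREE, EV_ACCESS };

enum alarm { A_NONE, A_USE_AFTER_FREE, A_DOUBLE_FREE };

/* Advance the state by one event and report any abuse it reveals. */
enum alarm step(enum pstate *s, enum ev e)
{
    if (*s == P_FREED)
        return e == EV_FREE ? A_DOUBLE_FREE : A_USE_AFTER_FREE;
    if (e == EV_FREE)
        *s = P_FREED;
    return A_NONE;
}

/* Return the first alarm raised along the path, or A_NONE. */
enum alarm first_alarm(const enum ev *evs, size_t n)
{
    enum pstate s = P_LIVE;
    for (size_t i = 0; i < n; i++) {
        enum alarm a = step(&s, evs[i]);
        if (a != A_NONE)
            return a;
    }
    return A_NONE;
}

/* Illustrative path: access, free, then access again. */
enum alarm demo_kfree_then_access(void)
{
    const enum ev path[] = { EV_ACCESS, EV_FREE, EV_ACCESS };
    return first_alarm(path, sizeof path / sizeof path[0]);
}
```

A real analysis has to follow every path and every alias of the pointer, which is where over-reporting comes from; the sketch follows a single path only.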

Basic method - state descriptions

Three Ps...

• Program

– x = 1; if (x) ...

• Predicate description of program state

– n 1

• Perception of description

– upper[n:p]

● spincount > 0 on a sleepy node is bad!

● trigger/action system raises alarms, generates relations ...
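A minimal sketch of the "spincount > 0 on a sleepy node" rule, assuming a single straight-line path and a plain counter rather than the analyser's predicate description. The names `op`, `count_sleep_under_spinlock` and `demo_alarms` are hypothetical; the real tool works on the syntax tree and tracks an upper bound across branches.

```c
#include <stddef.h>

/* Hypothetical classification of the calls met along one path. */
enum op { OP_LOCK, OP_UNLOCK, OP_SLEEPY, OP_OTHER };

/* Count the sleepy calls that are reached while a spinlock is held. */
int count_sleep_under_spinlock(const enum op *ops, size_t n)
{
    int spincount = 0;   /* spinlocks currently held on this path */
    int alarms = 0;

    for (size_t i = 0; i < n; i++) {
        switch (ops[i]) {
        case OP_LOCK:
            spincount++;
            break;
        case OP_UNLOCK:
            if (spincount > 0)
                spincount--;
            break;
        case OP_SLEEPY:
            if (spincount > 0)   /* the trigger: sleepy node, positive count */
                alarms++;        /* the action: raise an alarm */
            break;
        case OP_OTHER:
            break;
        }
    }
    return alarms;
}

/* Illustrative path: lock, sleepy call, unlock, sleepy call. */
int demo_alarms(void)
{
    const enum op path[] = { OP_LOCK, OP_SLEEPY, OP_UNLOCK, OP_SLEEPY };
    return count_sleep_under_spinlock(path, sizeof path / sizeof path[0]);
}
```

Only the first sleepy call in the illustrative path is under the lock, so only that one is counted.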

What's new?

• Combining logic is 3-phase

● PRE

● DURING

● POST

• Logic constraints reduced to normal form (NF) on the fly

● ∪ ∩ simple constraints x < k, x > k, x = k (NP-complete)

• Traces are often joined

● p → 1 | q → 2 becomes p∪q → [1,2]
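Joining two traces can be sketched as follows, assuming (for illustration only) that a condition is a bit set of abstract cases and a value is a closed integer interval. The names `interval`, `trace`, `join` and `demo_join` are hypothetical.

```c
/* Closed integer interval [lo, hi]. */
struct interval { int lo, hi; };

/* One trace: "under condition cond the value lies in val". */
struct trace {
    unsigned cond;          /* assumed encoding: one bit per abstract case */
    struct interval val;
};

/* Join two traces: union of the conditions, hull of the value intervals. */
struct trace join(struct trace s, struct trace t)
{
    struct trace u;
    u.cond = s.cond | t.cond;
    u.val.lo = s.val.lo < t.val.lo ? s.val.lo : t.val.lo;
    u.val.hi = s.val.hi > t.val.hi ? s.val.hi : t.val.hi;
    return u;
}

/* The slide's example: p -> 1 joined with q -> 2. */
struct trace demo_join(void)
{
    struct trace p = { 0x1u, { 1, 1 } };
    struct trace q = { 0x2u, { 2, 2 } };
    return join(p, q);
}
```

The join loses the link between p and 1 and between q and 2. That loss of precision is the price of keeping the number of traces small.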

During?

• Observe process in execution

• In C means

– exceptional program exit revealing internal state

● Return, Break (continue), Goto, (interrupt)

● R B G

– "Blue box"

[Figure: blue box, labelled Pre, Post, Dur]

Blue box processing

[Figure: boxes A and B, labelled Pre, Post, Dur]

Empty Statement - NRB

• Pre → (Post, Dur)

– maintains P normally

– cannot return (F)

– cannot break (F)

Sequence - NRB

• normal : traverse A then B

• return : return from A OR traverse A then return from B

• break : break from A OR traverse A then break from B

Forever Loop - NRB

• break from body is only normal exit from while(1)

• relax p until it is invariant
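The three NRB rules above (empty statement, sequence, forever loop) can be sketched as one small evaluator. Everything here is an assumption made for illustration: a predicate is a bit set of possible spinlock counts 0..7 (7 meaning "7 or more"), statements form a toy syntax tree, and the names `pred`, `nrb`, `stmt`, `eval` and `demo_forever` are hypothetical.

```c
typedef unsigned pred;          /* bit i set: spinlock count may be i (0..7) */

struct nrb { pred n, r, b; };   /* post-conditions: normal, return, break */

enum kind { SKIP, LOCK, UNLOCK, BREAK, RETURN, SEQ, CHOICE, FOREVER };

struct stmt {
    enum kind kind;
    const struct stmt *a, *b;   /* children of SEQ and CHOICE; a is the FOREVER body */
};

struct nrb eval(const struct stmt *s, pred p)
{
    struct nrb x = { 0, 0, 0 }, y = { 0, 0, 0 }, out = { 0, 0, 0 };

    switch (s->kind) {
    case SKIP:                  /* maintains p normally; cannot return or break */
        out.n = p;
        break;
    case LOCK:                  /* count + 1, saturating at 7 */
        out.n = ((p << 1) | (p & 0x80u)) & 0xFFu;
        break;
    case UNLOCK:                /* count - 1; 0 stays 0, "7 or more" may stay 7 */
        out.n = (p >> 1) | (p & 1u) | (p & 0x80u);
        break;
    case BREAK:
        out.b = p;
        break;
    case RETURN:
        out.r = p;
        break;
    case SEQ:                   /* traverse A then B; exits of A OR exits of B */
        x = eval(s->a, p);
        y = eval(s->b, x.n);
        out.n = y.n;
        out.r = x.r | y.r;
        out.b = x.b | y.b;
        break;
    case CHOICE:                /* branch on an unknown condition: join both arms */
        x = eval(s->a, p);
        y = eval(s->b, p);
        out.n = x.n | y.n;
        out.r = x.r | y.r;
        out.b = x.b | y.b;
        break;
    case FOREVER:               /* relax p until it is invariant under the body */
        for (;;) {
            x = eval(s->a, p);
            if ((p | x.n) == p)
                break;
            p |= x.n;
        }
        out.n = x.b;            /* a break from the body is the only normal exit */
        out.r = x.r;
        break;
    }
    return out;
}

/* while (1) { lock; if (?) break; else unlock; } started with count 0 */
struct nrb demo_forever(void)
{
    static const struct stmt lock = { LOCK, 0, 0 }, unlock = { UNLOCK, 0, 0 };
    static const struct stmt brk = { BREAK, 0, 0 };
    static const struct stmt either = { CHOICE, &brk, &unlock };
    static const struct stmt body = { SEQ, &lock, &either };
    static const struct stmt loop = { FOREVER, &body, 0 };
    return eval(&loop, 1u);
}
```

In the demo the invariant at the loop head is "count is 0", and the loop's normal exit carries "count is 1", since the only way out is the break taken while the lock is held. The finite bit set guarantees that the relaxation terminates; the analyser described in the talk works with logical constraints instead.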

+ Trigger/Action engine

• Three rules propagate call graph and do other housekeeping.

• One ... says that a sleep call while the objective function is positive causes alarm output:

Using the analyser

• Call with the same parameters as the gcc compiler

At this point I should do an example run

Summary

• Have formal methods come to the hoi polloi?

• It's a step in the right direction.

– No expertise needed

– Fast

– Copes with massive amounts of code

– Sound

• Negatives

– Not good at tracking program state (cries wolf)

– Extension to new problems needs expert
