
CEN 4072 Software Testing

PPT7: Deducing errors

Slides Topics

• Discuss what deduction is and how it is used when reasoning about program runs
• Discuss how to use control flow graphs
• Discuss what control flow is and how it relates to various dependences
• Discuss what dependences are and how they can be used when slicing programs
• Discuss a bit about code smells
• Discuss some caveats about data flow and source code

What is Deduction?

Deduction is the reasoning from the general to the particular.

In software testing, we use deduction as the basis for all of our reasoning techniques (explained on the next slide).

We use those techniques to analyze code and runs of a program for bugs.

A run is one execution of a given program.

Reasoning Techniques

Reasoning techniques include:

• Experimentation: The use of multiple controlled runs of a program to refine and reject hypotheses until a precise diagnosis is isolated
• Induction: The summarization of multiple program runs into an abstraction that holds true for all considered runs
• Observation: The act of inspecting arbitrary aspects of one program run
• Deduction: The technique that uses analysis of program code (abstract) to reason about what can and cannot happen in program runs (concrete)

Hierarchy of Program Analysis Techniques


Notice that all techniques use deduction in the end.

Reasoning about runs

We use deduction as the reasoning technique to find relevant information (errors, bugs, performance) about runs without ever having to execute a program.

With deduction, the sole focus for analysis is what is in the program’s code.

Based on the analysis of the abstract code, software testers can make assumptions or conduct reasoning about program runs.


What is Control Flow?

Control flow represents the order in which code statements are executed and how each one affects the others. This allows testers to deduce how a program works and pinpoint bugs.

Control flow is often best understood when represented by a control flow graph.

In these graphs, each statement of a program is represented by a node.

These nodes are connected by edges, which represent possible execution sequences of the statements.

Finally, entry and exit nodes represent the start and end of a program or function.

Pseudocode example:

    while (i < 10)
        i++

Above is a segment of code using a common control flow pattern, the while loop.

Here is the general control flow pattern for the example above:

    while (COND)
        BODY

Notice there is no entry or exit node because this is a simple while loop, not a full function.
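One way to make the graph idea concrete is to store each node's successors in a small table. The sketch below is only an illustration for the loop while (i < 10) i++; the node names, the succ table, and has_edge are assumptions, and the ENTRY and EXIT nodes are added here only so that the graph is complete.

```c
/* Nodes of a control flow graph for:  while (i < 10) i++;  */
enum node { ENTRY, COND, BODY, EXIT, NODE_COUNT };

/* succ[n] lists the nodes that control can reach directly from n.
   Each node has at most two successors; -1 marks an unused slot. */
static const int succ[NODE_COUNT][2] = {
    [ENTRY] = { COND, -1 },   /* first, test the condition          */
    [COND]  = { BODY, EXIT }, /* true: run the body; false: leave   */
    [BODY]  = { COND, -1 },   /* after the body, test again         */
    [EXIT]  = { -1, -1 },     /* nothing follows the exit node      */
};

/* Returns 1 if the graph has an edge from -> to, otherwise 0. */
int has_edge(int from, int to)
{
    for (int k = 0; k < 2; k++)
        if (succ[from][k] == to)
            return 1;
    return 0;
}
```

The back edge from BODY to COND is what makes this a loop: has_edge(BODY, COND) holds, while has_edge(BODY, EXIT) does not, because the body can only leave the loop through the condition.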

Common Control Flow Patterns


Control flow problems

There are 4 main control flow problems:

• Jumps: A jump is a goto, which is an unconditional transfer of control. Jumping into the body of a loop or function makes reasoning about a program difficult, because the control flow graph becomes unstructured or irreducible.
• Indirect Jumps: This is a computed goto. These jumps make reasoning about control flow difficult because the goto can be followed by an arbitrary statement.
  ○ Example: goto X (X can be anywhere)
• Dynamic Dispatch: This is a very constrained form of indirect jump found in object-oriented languages. It makes reasoning difficult because the tester must be aware of all possible destinations for every method call.
  ○ Example: the call shape.draw() can lead to Rectangle.draw(), Circle.draw(), Triangle.draw(), etc., because the destination of the call depends on the object's class, which is determined at runtime.
• Exceptions: Throwing an exception can cause a function to give control back to its caller. Control may never reach the end of the function but instead be transferred directly to the caller. This must be accounted for in control flow graphs, and it is difficult to predict.
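The dynamic dispatch problem can also be seen in C, where a function pointer plays the role of a virtual method. The sketch below is an assumption-laden illustration, not code from the slides: struct shape, RECTANGLE, CIRCLE, and render are hypothetical names.

```c
/* A "virtual method": each shape carries a pointer to its own draw(). */
struct shape {
    const char *(*draw)(void);
};

static const char *draw_rectangle(void) { return "Rectangle.draw()"; }
static const char *draw_circle(void)    { return "Circle.draw()"; }

const struct shape RECTANGLE = { draw_rectangle };
const struct shape CIRCLE    = { draw_circle };

/* The target of s->draw() is known only at run time: it depends on
   which object s points to. A static analysis therefore has to
   consider every function that could be stored in the draw field. */
const char *render(const struct shape *s)
{
    return s->draw();
}
```

Calling render(&RECTANGLE) and render(&CIRCLE) goes through the same call statement but reaches two different functions, which is exactly why the control flow graph needs one edge per possible destination.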

Statements: their effect on flow

Statements have an effect on the direction of control flow. They do this in four ways, which can be either active or passive.

The Active:

○ They can Write: Changing the state of a program, such as assigning values to variables
○ They can Control: Changing the program counter, that is, determining which statement is to execute next

The Passive:

○ They can Read: A statement can be affected by state written by another statement it relies on (e.g. the statement b = a + 2 reads a)
○ They can be Executed: The execution of any statement can be controlled by another (e.g. A() calls B())

Example code

Below is some sample code for the Fibonacci sequence. We will analyze each statement and determine the effect it has on control flow in the next slide.

    #include <stdio.h>

    int fib(int n)
    {
        int f, f0 = 1, f1 = 1;

        while (n > 1) {
            n = n - 1;
            f = f0 + f1;
            f0 = f1;
            f1 = f;
        }

        return f;
    }

    int main()
    {
        int n = 9;

        while (n > 0) {
            printf("fib(%d)=%d\n", n, fib(n));
            n = n - 1;
        }

        return 0;
    }

Analysis of sample code

Effects of the fib() Statements

    #   Statement        Reads From   Writes To        Controls (statements)
    0   fib(n)                        n                1-10
    1   int f                         f
    2   f0 = 1                        f0
    3   f1 = 1                        f1
    4   while (n > 1)    n                             5-8
    5   n = n - 1        n            n
    6   f = f0 + f1      f0, f1       f
    7   f0 = f1          f1           f0
    8   f1 = f           f            f1
    9   return f         f            (return value)

Note that each statement reads or writes a variable or controls whether other statements are executed.

Dependences

• Dependence: The reliance that statements have on one another.

• There are two types of dependences:

– Data dependence: A statement's outcome influences the data read by another statement (one statement writes a variable that the other reads)

– Control dependence: Whether one statement is executed is determined by another statement

• The dashed arrows show data dependences.

• The dotted arrows show control dependences.

What is Program Slicing?

Program slice: a subset of a program's statements that can be found by following dependences from a starting statement (like a tree branch).

A tester can find defect patterns in subsections of code by using dependences.

These subsections are called slices, and the method for creating them is called program slicing.

There are two methods, forward slicing and backward slicing (both explained in the next few slides).

Forward slice

Forward Slice: A slice created by starting with an initial statement and tracing its flow to all subsequent statements that depend on it. The initial statement plus all subsequent dependent statements give the slice.

Example: Given the graph below, we want a forward slice from the initialization of A because A has been known to not create the proper funk. So, we trace the routes from A and find the declaration of the string x and the call to uptownFunc.

    if(CONDITION)
    A = 2
    C = "sup"
    string x = A + "FUNK"
    uptownFunc(B, A)

The Forward Slice We Want!

    A = 2
    string x = A + "FUNK"
    uptownFunc(B, A)

Backward slice

Backward Slice: A slice created by starting at an initial statement and tracing backwards along the dependences to the statements it depends on.

Example: Given the graph below, we want a backward slice from the initialization of C because C is not saying "sup" like it should!

    if(CONDITION)
    A = 2
    C = "sup"
    string x = A + "FUNK"
    uptownFunc(B, A)
    G = "What"

The Backward Slice We Want!

    if(CONDITION)
    C = "sup"
    G = "What"

Slice operation: example code

     1  int main() {
     2      int a, b, sum, mul;
     3      sum = 0;
     4      mul = 1;
     5      a = read();
     6      b = read();
     7      while (a <= b) {
     8          sum = sum + a;
     9          mul = mul * a;
    10          a = a + 1;
    11      }
    12      write(sum);
    13      write(mul);
    14  }

We will use this code to demonstrate forward and backward slicing some more!

Program slice examples

In blue, we have the forward slice starting with the initialization of mul; write(mul) is the final statement traced from mul.

In red, we have the backward slice stemming from the statement write(sum).
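Both slices can be reproduced with a small reachability search over the dependence graph. The sketch below is an illustration under stated assumptions, not the method used to draw the slides: statements of the sum/mul program are identified by their line numbers 1 to 13, the dependence edges are derived by hand, and dep, build_deps, and slice are hypothetical names.

```c
#include <string.h>

#define MAX_STMT 14   /* statements are numbered 1..13; index 0 is unused */

/* dep[i][j] != 0 means statement j depends on statement i
   (through a data or a control dependence). */
static unsigned char dep[MAX_STMT][MAX_STMT];

static void add_dep(int from, int to) { dep[from][to] = 1; }

/* Hand-derived dependences of the sum/mul program (an assumption of
   this sketch; a real tool would compute them from the code). */
void build_deps(void)
{
    memset(dep, 0, sizeof dep);
    add_dep(3, 8);                      /* sum = 0     -> sum = sum + a   */
    add_dep(4, 9);                      /* mul = 1     -> mul = mul * a   */
    add_dep(5, 7); add_dep(5, 8);       /* a = read()  -> every read of a */
    add_dep(5, 9); add_dep(5, 10);
    add_dep(6, 7);                      /* b = read()  -> while (a <= b)  */
    add_dep(7, 8); add_dep(7, 9);       /* the loop controls its body     */
    add_dep(7, 10);
    add_dep(8, 8); add_dep(8, 12);      /* sum -> next iteration, write   */
    add_dep(9, 9); add_dep(9, 13);      /* mul -> next iteration, write   */
    add_dep(10, 7); add_dep(10, 8);     /* a = a + 1   -> every read of a */
    add_dep(10, 9); add_dep(10, 10);
}

/* Sets in_slice[s] = 1 for every statement reachable from start.
   forward != 0 follows dependences forward, otherwise backward. */
void slice(int start, int forward, unsigned char in_slice[MAX_STMT])
{
    int stack[MAX_STMT], top = 0;

    memset(in_slice, 0, MAX_STMT);
    in_slice[start] = 1;
    stack[top++] = start;
    while (top > 0) {
        int s = stack[--top];
        for (int t = 1; t < MAX_STMT; t++) {
            int linked = forward ? dep[s][t] : dep[t][s];
            if (linked && !in_slice[t]) {
                in_slice[t] = 1;        /* mark before pushing: each   */
                stack[top++] = t;       /* statement is pushed once    */
            }
        }
    }
}
```

Under these assumed edges, slice(4, 1, s) marks statements 4, 9, and 13 (the forward slice from mul = 1), and slice(12, 0, s) marks statements 3, 5, 6, 7, 8, 10, and 12 (the backward slice from write(sum)).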

Slice operations

Like mathematical sets, we can perform operations on slices.

The three operations are:

• Chops: The intersection between a forward and a backward slice
• Backbones: The intersection between any two slices, no matter forward or backward
• Dices: The difference between two slices
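Because slices behave like sets, the three operations reduce to bitwise set operations. The following is a minimal sketch, assuming a slice is stored as a bit mask with one bit per statement number; slice_t, STMT, chop, backbone, and dice are hypothetical names chosen for this illustration.

```c
/* A slice is a set of statement numbers: bit n is set when statement n
   belongs to the slice. Statement numbers up to 15 fit in any
   unsigned int. */
typedef unsigned int slice_t;

#define STMT(n) (1u << (n))

/* Chop: the statements that lie both in a forward slice and in a
   backward slice. */
slice_t chop(slice_t forward_slice, slice_t backward_slice)
{
    return forward_slice & backward_slice;
}

/* Backbone: the statements common to any two slices. */
slice_t backbone(slice_t a, slice_t b)
{
    return a & b;
}

/* Dice: the statements of slice a that are not in slice b. */
slice_t dice(slice_t a, slice_t b)
{
    return a & ~b;
}
```

For example, if the backward slice from write(sum) is taken to be statements {3, 5, 6, 7, 8, 10, 12} and the backward slice from write(mul) to be {4, 5, 6, 7, 9, 10, 13} (both hand-derived for the 14-line sum/mul program), their backbone is {5, 6, 7, 10} and the dice of the first by the second is {3, 8, 12}.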

Chop Example:

• This example uses the previous Fibonacci program displayed as a control flow graph.

• To the left is a chop denoting all possible paths that f1 could take to influence f0 (highlighted in purple and red circles).

• This chop is the intersection of the forward slice from f0 and the backward slice from return f.

Backbone Example:

Below we have the intersection of the slices of sum and mul from the left.

Dice Example:

• Below is the difference between the two slices from the left.

• If you remove the backward slice and the forward slice for sum and mul, all you have left are the statements below.

Using code smells

We can use our dependences to deduce some common errors.

We will call these common errors code smells.

Code smells affected by the usage of variables:

• Reading Uninitialized Variables: The case in which a variable is used before initialization.
• Unused Values: The case in which a variable is written to but the value is never read.
• Unreachable Code: The case where there is a statement that isn't control dependent on any other statement.

Code smells affected by dependences specific to certain languages or runtime libraries:

• Memory Leaks: This happens in languages with no garbage collection. The programmer must deallocate dynamic memory.
• Interface Misuse: Memory is not the only resource that needs to be released. Programmers must also release things such as I/O streams, locks, sockets, etc.
• Null Pointers: Programs can accidentally attempt to access pointers having the value null.

Uninitialized variables

• Below is an example of using uninitialized variables.
• If color is any value other than RED, AMBER, or GREEN, the int variable go will remain uninitialized and later be referenced in the if statement.
• Compilers can warn about this.
• And, this will result in garbage data being used for go, which is probably not what we want.
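The slide's figure is not reproduced in this text, so the following is a hedged reconstruction of the pattern it describes rather than the original code. The function names may_go_smelly and may_go, the enum, and the choice of returning go instead of testing it in an if statement are assumptions.

```c
enum color { RED, AMBER, GREEN };

/* Smell: go is assigned only for RED, AMBER, and GREEN. For any other
   value of color, the final read of go uses an indeterminate value. */
int may_go_smelly(int color)
{
    int go;

    switch (color) {
    case RED:
    case AMBER:
        go = 0;
        break;
    case GREEN:
        go = 1;
        break;
    }
    return go;   /* uninitialized if color matched no case */
}

/* One possible fix: give go a value on every path. */
int may_go(int color)
{
    int go = 0;  /* default: do not go for unknown colors */

    switch (color) {
    case GREEN:
        go = 1;
        break;
    default:     /* RED, AMBER, and anything unexpected */
        break;
    }
    return go;
}
```

In dependence terms, the read of go in the smelly version has a path from the function entry on which no write to go occurs, which is what a checker looks for.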

Unreachable code

• An example of unreachable code is presented below.
• The second print statement, printf("w is positive\n");, will never be reached.
• This is because it is not control dependent on any other statement: the else if condition is already covered by the condition of the preceding if statement.
• This is more than likely an error because there would be no point in having code that you can never reach.
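Since the slide's figure is missing here, the sketch below reconstructs the kind of if / else if chain the bullets describe. Only the message "w is positive" comes from the slide; the function name describe, the first condition w >= 0, and the other two messages are assumptions made for illustration.

```c
/* Returns the message that the if / else-if chain would print for w. */
const char *describe(int w)
{
    if (w >= 0)
        return "w is non-negative";
    else if (w > 0)               /* never true here: w >= 0 was  */
        return "w is positive";   /* handled above (unreachable)  */
    return "w is negative";
}
```

No input can make the function return "w is positive": every w greater than zero already satisfies the first condition, so the second branch is dead code.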

Memory leaks

• Presented below is an example of code containing a potential memory leak.
• The function returns prematurely without deallocating the memory pointed to by p.
• This is a memory leak and, if it happens repeatedly, can use up all of a user's RAM.
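The slide's code is not included in this text, so the following is a sketch of the pattern it describes, not the original example. The input helpers set_input and read_int are hypothetical stand-ins for whatever routine the slide used, and read_buf_leaky and read_buf are assumed names.

```c
#include <stdlib.h>

/* Hypothetical input source: values are read one at a time from a
   caller-supplied array. */
static const int *next_input;
void set_input(const int *values) { next_input = values; }
static int read_int(void) { return *next_input++; }

/* Leaky: the early return abandons the block that p points to. */
int *read_buf_leaky(int size)
{
    int *p = malloc((size_t)size * sizeof *p);
    if (p == NULL)
        return NULL;
    for (int i = 0; i < size; i++) {
        p[i] = read_int();
        if (p[i] == 0)
            return NULL;   /* memory leak: nobody can free p now */
    }
    return p;
}

/* Fixed: release the buffer on the early-exit path as well. */
int *read_buf(int size)
{
    int *p = malloc((size_t)size * sizeof *p);
    if (p == NULL)
        return NULL;
    for (int i = 0; i < size; i++) {
        p[i] = read_int();
        if (p[i] == 0) {
            free(p);
            return NULL;
        }
    }
    return p;              /* the caller is responsible for free() */
}
```

The two functions differ only on the early-exit path: every path that leaves the function after a successful malloc must either free the block or hand it to the caller.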

NULL pointers

• In the sample code below (which is the same example from the previous slide), we will demonstrate an example of using null pointers.
• Let's say the system runs out of memory before reaching the malloc for pointer p.
• This will cause p to become a null pointer, as malloc returns null, which is assigned to p.
• This will then cause a runtime error when we try to access p[i] later on in the code.
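A minimal sketch of guarding against this, assuming the allocator is passed in as a parameter so that an out-of-memory condition can be simulated; alloc_fn, failing_alloc, and make_buf are hypothetical names, not part of the slides.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocator type, so that a failing allocator can be injected. */
typedef void *(*alloc_fn)(size_t);

/* Always fails, simulating a system that has run out of memory. */
void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/* Returns a zero-filled buffer of size ints, or NULL if the
   allocation fails. */
int *make_buf(int size, alloc_fn alloc)
{
    int *p = alloc((size_t)size * sizeof *p);

    if (p == NULL)
        return NULL;       /* without this check, p[i] below would
                              dereference a null pointer */
    for (int i = 0; i < size; i++)
        p[i] = 0;
    return p;
}
```

With the check in place, make_buf(4, failing_alloc) returns NULL instead of touching p[i], while make_buf(4, malloc) behaves as before and leaves the caller to free the buffer.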

Data flow caveats

• We can use tools to check code for common errors, but no tool is perfect.

• Tools have a false positive rate, which is the percentage of the time a tool reports a bug when in actuality there is no bug.

• We have these false positives because there are data dependences that we cannot compute precisely.

• To deal with these data dependence issues, or data flow caveats (shown in the next slide), we make approximations.

The Data Flow Caveats:

• Indirect access: Sometimes when accessing a variable, you have to approximate its location because the location is determined at runtime (for example, the array element a[i]).
• Pointers: When writing to a location referenced by a pointer, you have to know the location that the pointer points to. A strategy to deal with this is to assume the pointer can point to any object whose address is taken in the code.
• Functions: A function can be called from multiple sites, and it can be recursive. So, we can approximate dependences by using summary edges at call sites (basically a connection). The edges represent a dependence between the call site and the function in question. This, however, is imprecise, as is the nature of approximations.
• Features: Object orientation and concurrency can make computing dependences difficult.

Source code caveats

We also have source code caveats that make deduction about a given set of code a bit more complex.

The Source Code Caveats:

• Source Mismatch: The code being analyzed can be different from the code that is actually run. We must ensure they are the same.
• Macros and Preprocessors: A preprocessor is a program that manipulates code before it is fed to the compiler (e.g. it expands macros and #include directives). We must take this into account.
• Undefined Behavior: Some languages do not specify the semantics of some constructs. An example is the value range of char in the C language. Its width can vary from 8 bits to 128 bits or more. This can cause discrepancies when deducing.
• Aspects: An aspect is a piece of code that is added to specific parts of a program, which can cause arbitrary changes to code behavior. We must adjust to these arbitrary changes.
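The char example can be checked on a given platform with the constants from <limits.h>. This is a small sketch, with char_bits and char_is_signed as hypothetical helper names; it only reports what the current compiler chose, which is exactly why a deduction made on one platform may not hold on another.

```c
#include <limits.h>

/* Number of bits in a char on this platform. The C standard only
   guarantees that it is at least 8. */
int char_bits(void)
{
    return CHAR_BIT;
}

/* 1 if plain char is signed on this platform, 0 if it is unsigned.
   The choice is left to the implementation. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

An analysis that assumes, say, an 8-bit signed char should state that assumption, because the value range it reasons about is not fixed by the language.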

Tools

• FINDBUGS: This is a static checker tool, meaning it analyzes code for issues without ever running the program. It is a deductive tool used for checking Java code. It finds defect patterns in bytecode, which correspond to common coding errors. We have explored these errors in this presentation, naming them code smells.

• CODESURFER: This is another static checker tool (defined above). It is a deductive tool used for checking C, C++, and x86 machine code. It analyzes the semantics of code to detect code smells.

• Keep in mind that neither of these tools is perfect. Both will return false positives at some rate.

Sources


http://www.embedded.com/print/4418686

http://www.whyprogramsfail.com/toc.php

http://read.pudn.com/downloads154/doc/fileformat/678870/Why_Programs_Fail_-_A_Guide_to_Systematic_Debugging.pdf

http://www.grammatech.com/research/technologies/codesurfer

Why Programs Fail : A Guide to Systematic Debugging 2nd edition By Andreas Zeller
