CS 4723: Lecture 5 Test Coverage

Upload: ethelbert-daniel

Post on 08-Jan-2018

Test Coverage
- After we have done some testing, how do we know the testing is enough?
- The most straightforward measure: input coverage
  - # of inputs tested / # of possible inputs
  - Unfortunately, # of possible inputs is typically infinite
  - Not feasible, so we need approximations…

Test Coverage
- Code Coverage
- Input Combination Coverage
- Specification Coverage
- Mutation Coverage

Code Coverage
- Basic idea:
  - Bugs in code that has never been executed will not be exposed
  - So a test suite that leaves code unexecuted is definitely not sufficient
- Definition:
  - Divide the code into elements
  - Calculate the proportion of elements that are executed by the test suite

Control Flow Graph

How many test cases to achieve full statement coverage?

Statement Coverage in Practice
- Microsoft reports 80-90% statement coverage
- Safety-critical software must achieve 100% statement coverage
- Usually about 85% coverage; 100% for large systems is usually very hard

Statement Coverage: Example

Branch Coverage
- Cover the branches in a program
- A branch is considered executed when both (all) outcomes are executed
- Also called decision coverage (multiple-condition coverage is a related but stronger criterion that covers the combinations of the individual conditions in a predicate)
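As a minimal sketch of why branch coverage is stricter than statement coverage, consider the hypothetical Java method below (the class and method names are illustrative assumptions, not taken from the lecture). The single input 5 executes every statement, yet the outcome where the condition is false is never exercised, and that is where the failure hides.

```java
final class BranchCoverageDemo {
    // Hypothetical method: the fault only shows when the if-branch is skipped.
    static int scale(int x) {
        int divisor = 0;
        if (x > 0) {
            divisor = x;
        }
        return 100 / divisor; // divides by zero when x <= 0
    }

    // Reports whether scale(x) fails with a division by zero.
    static boolean failsOn(int x) {
        try {
            scale(x);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // x = 5 alone reaches 100% statement coverage and passes.
        System.out.println("scale(5) = " + scale(5));
        // x = 0 covers the false outcome of the branch and exposes the fault.
        System.out.println("scale(0) fails: " + failsOn(0));
    }
}
```

A suite containing only x = 5 would report full statement coverage; adding x = 0 is what full branch coverage demands.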

Control Flow Graph

How many test cases to achieve full branch coverage?

Branch Coverage: Example

Branch Coverage: Example
- An untested flow of data from an assignment to a use of the assigned value could hide an erroneous computation
- Even though we have 100% statement and branch coverage

Data Flow Coverage
- Cover all def-use pairs in a program
- Def: write to a variable
- Use: read of a variable
- Use u and Def d are paired when d is the direct precursor of u in a certain execution

Data Flow Coverage
- Formula: # of def-use pairs covered / # of all def-use pairs
- Not easy to locate all def-use pairs
  - Easy intra-procedurally (inside a method)
  - Very difficult inter-procedurally
  - Consider a write to a field variable in one method and a read of it in another method

Path coverage
- The strongest code coverage criterion
  - Try to cover all possible execution paths in a program
  - Covers all previous coverage criteria?
- Usually not feasible
  - Exponentially many paths in acyclic programs
  - Infinitely many paths in some programs with loops

Path coverage
- N conditions: 2^N paths
- Many are not feasible, e.g., L1L2L3L4L6
- X = 0 => L1L2L3L4L5L6
- X = -1 => L1L3L4L6
- X = -2 => L1L3L4L5L6

Control Flow Graph

How many paths? How many test cases to cover?

Path coverage, not enough

    main() {
        int x, y, z, w;
        read(x);
        read(y);
        if (x != 0)
            z = x + 10;
        else
            z = 1;
        if (y > 0)
            w = y / z;   // divides by zero when z == 0
        else
            w = 0;
    }

- Test requirements: 4 paths
- Test cases:
  - (x = 1, y = 22)
  - (x = 0, y = 10)
  - (x = 1, y = -22)
  - (x = 0, y = -10)
- We are still not exposing the fault!
- Faulty if x = -10 (with y > 0, z becomes 0 and w = y / z divides by zero)
  - Structural coverage cannot reveal this error
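The point of this slide can be checked with a small Java port of the program. This is a sketch under assumptions: read(x) and read(y) become method parameters, the method returns w, and the class and helper names are made up for illustration. One input per path passes, while x = -10 with a positive y still fails.

```java
final class PathCoverageDemo {
    // Java port of the slide's program; the two reads become parameters.
    static int compute(int x, int y) {
        int z;
        if (x != 0) {
            z = x + 10;
        } else {
            z = 1;
        }
        int w;
        if (y > 0) {
            w = y / z; // z is 0 when x == -10
        } else {
            w = 0;
        }
        return w;
    }

    // Reports whether compute(x, y) fails with a division by zero.
    static boolean fails(int x, int y) {
        try {
            compute(x, y);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // One input per path: all four paths are covered and none fails.
        int[][] pathTests = {{1, 22}, {0, 10}, {1, -22}, {0, -10}};
        for (int[] t : pathTests) {
            System.out.println("(" + t[0] + ", " + t[1] + ") fails: " + fails(t[0], t[1]));
        }
        // Same path as (1, 22), but this particular value triggers the fault.
        System.out.println("(-10, 5) fails: " + fails(-10, 5));
    }
}
```

The failing input follows a path that is already covered, which is why no structural criterion forces a tester to pick it.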

Code Coverage Questions
- Statement coverage and basic block coverage: are they the same?
- Branch coverage (cover all edges in a control flow graph): is it the same as basic block coverage?

Method coverage
- So far, all examples are intra-method
  - Quite useful in unit testing
  - It is very hard to achieve 100% statement coverage in system testing
  - Need a higher-level code element
- Method coverage
  - Similar to statements
    - Node coverage : method coverage
    - Edge coverage : method invocation coverage
    - Path coverage : stack trace coverage

Method coverage

Code coverage: summary
- Coverage of code elements and their connections
  - Node coverage: class/method/statement/predicate coverage
  - Edge coverage: branch/data flow/method invocation coverage
  - Path coverage: path/use-def chain/stack trace coverage

Code coverage: limitations
- Not enough
  - Some bugs cannot be revealed even with full path coverage
  - Cannot reveal bugs due to missing code

Code coverage: practice
- Though not perfect, code coverage is the most widely used technique for test evaluation
- Also used to measure progress made in testing
- The criteria used in practice are mainly:
  - Method coverage
  - Statement coverage
  - Branch coverage
  - Loop coverage with heuristic (0, 1, many)

Code coverage: practice
- Far from perfect
  - The commonly used criteria are the weakest; recall our examples
  - A lot of corner cases can never be found (and they are not really corner cases if they are missed only because statement coverage does not expose them)
- 100% code coverage is rarely achieved
  - Mature commercial software products are released with 85% to 90% statement coverage
  - Some commercial software products are released with around 60% statement coverage
  - Many open source projects are even lower than 50%

Input Combination Coverage
- Basic idea
  - Originates from the most straightforward idea
  - In theory, achieving 100% coverage is a proof of 100% correctness
  - In practice, feasible only on very trivial cases
- Main problems
  - Combinations are exponential
  - Possible values are infinite

Input Combination Coverage
- An example: a simple automatic sales machine
  - Accepts only a single $1 bill at a time, and all beverages cost $1
  - Coke, Sprite, Juice, Water
  - Icy or normal temperature
  - Want receipt or not
- All combinations = 4*2*2 = 16 combinations
- Trying all 16 combinations will make sure the system works correctly

Input Combination Coverage: Sales Machine Example
- Input 1: Coke, Sprite, Juice, Water
- Input 2: Normal, Icy
- Input 3: Receipt, No-Receipt

Combination Explosion
- Combinations are exponential in the number of inputs
- Consider an annual tax report system with 50 yes/no questions to generate a customized form for you
  - 2^50 combinations = about 10^15 test cases
  - Running 1000 test cases per second -> more than 30,000 years
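The arithmetic behind this estimate can be reproduced with a few lines of Java. This is only a sketch; the class and method names are illustrative assumptions, and a year is taken as 365 days.

```java
final class CombinationExplosion {
    // Number of combinations of n independent yes/no inputs: 2^n (valid for n < 63).
    static long combinations(int yesNoInputs) {
        return 1L << yesNoInputs;
    }

    // Years needed to run the given number of tests at a fixed rate, using 365-day years.
    static double yearsToRun(long tests, long testsPerSecond) {
        double seconds = (double) tests / testsPerSecond;
        return seconds / (365.0 * 24 * 60 * 60);
    }

    public static void main(String[] args) {
        long tests = combinations(50);
        System.out.println("combinations: " + tests);
        System.out.println("years at 1000 tests/s: " + yearsToRun(tests, 1000));
    }
}
```

With 50 yes/no inputs the result lands between 30,000 and 40,000 years, the same order of magnitude as the figure on the slide.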

Observation
- When there are many inputs, a relationship among inputs usually involves only a small number of inputs
- The previous example: maybe only Coke and Sprite are available icy, while the receipt choice is independent

Example of Tax Report
- Input 1: Family combined report or single report
- Input 2: Home loans or not
- Input 3: Receive gift or not
- Input 4: Age over 60 or not
- …
- Input 1 is related to all other inputs
- Other inputs are independent of each other

Studies
- A long-term study from NIST (National Institute of Standards and Technology)
  - A combination width of 4 to 6 is enough for detecting almost all errors

N-wise coverage
- Coverage of N-wise combinations of the possible values of all inputs
- Example: 2-wise combinations
  - (coke, icy), (sprite, icy), (water, icy), (juice, icy)
  - (coke, normal), (sprite, normal), …
  - (coke, receipt), (sprite, receipt), …
  - (coke, no-receipt), (sprite, no-receipt), …
  - (icy, receipt), (normal, receipt)
  - (icy, no-receipt), (normal, no-receipt)
  - 20 combinations in total
  - We had 16 3-wise combinations, now we have 20. Does it get worse?

N-wise coverage
- Note: one test case may cover multiple N-wise combinations
  - E.g., (Coke, Icy, Receipt) covers three 2-wise combinations: (Coke, Icy), (Coke, Receipt), (Icy, Receipt)
- 100% N-wise coverage will fully cover 100% (N-1)-wise coverage. Is this true?
- For k Boolean inputs
  - Full combination coverage = 2^k combinations: exponential
  - Full n-wise coverage = 2^n * k*(k-1)* … *(k-n+1)/n! combinations: polynomial; for 2-wise combinations, 2*k*(k-1)

N-wise coverage: Example
- How many test cases for 100% 2-wise coverage of our sales machine example?
  - (coke, icy, receipt) covers 3 new 2-wise combinations
  - (sprite, icy, no-receipt) covers 3 new …
  - (juice, icy, receipt) covers 2 new …
  - (water, icy, receipt) covers 2 new …
  - (coke, normal, no-receipt) covers 3 new …
  - (sprite, normal, receipt) covers 3 new …
  - (juice, normal, no-receipt) covers 2 new …
  - (water, normal, no-receipt) covers 2 new …
- 8 test cases cover all 20 2-wise combinations
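The claim that these 8 test cases cover all 20 pairs can be checked mechanically. The sketch below is an illustration only (the class name, method names, and string encoding of pairs are assumptions): it counts the 2-wise combinations implied by the three input domains and the distinct pairs that the 8 test cases actually hit.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class PairwiseCoverage {
    static final String[][] DOMAINS = {
        {"coke", "sprite", "juice", "water"},
        {"icy", "normal"},
        {"receipt", "no-receipt"}
    };

    // The 8 test cases listed on the slide.
    static final String[][] TESTS = {
        {"coke", "icy", "receipt"},
        {"sprite", "icy", "no-receipt"},
        {"juice", "icy", "receipt"},
        {"water", "icy", "receipt"},
        {"coke", "normal", "no-receipt"},
        {"sprite", "normal", "receipt"},
        {"juice", "normal", "no-receipt"},
        {"water", "normal", "no-receipt"}
    };

    // Total number of 2-wise combinations over all pairs of inputs.
    static int totalPairs(String[][] domains) {
        int total = 0;
        for (int i = 0; i < domains.length; i++) {
            for (int j = i + 1; j < domains.length; j++) {
                total += domains[i].length * domains[j].length;
            }
        }
        return total;
    }

    // Distinct 2-wise combinations covered by the tests, keyed by input position and value.
    static Set<String> coveredPairs(String[][] tests) {
        Set<String> pairs = new LinkedHashSet<>();
        for (String[] t : tests) {
            for (int i = 0; i < t.length; i++) {
                for (int j = i + 1; j < t.length; j++) {
                    pairs.add(i + "=" + t[i] + "," + j + "=" + t[j]);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(coveredPairs(TESTS).size() + " of " + totalPairs(DOMAINS) + " pairs covered");
    }
}
```

Dropping any one of the 8 rows leaves at least two pairs uncovered, since each row contributes at least two pairs that no other row contains.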

Combination Coverage in Practice
- 2-wise combination coverage is very widely used
  - Pair-wise testing
  - All-pairs testing
- Mostly used in configuration testing
  - Example: configuration of gcc
    - A lot of variables
    - Several options for each variable
  - For command line tools: add or remove an option

Input model
- What happens if an input has infinitely many possible values?
  - Integer
  - Float
  - Character
  - String
- Note: all these are actually finite, but the possible value set is so large that it is treated as infinite
- Idea: map infinite values to finite value baskets (ranges)

Input model
- Equivalence class partitioning
  - Partition the possible value set of an input into several value ranges
  - Transform numeric variables (integer, float, double, character) into enumerated variables
- Example:
  - int exam_score => {less than 0}, {0, 59}, {60, 69}, {70, 79}, {80, 89}, {90, 100}, {greater than 100}
  - char c => {a, z}, {A, Z}, {0, 9}, {other}
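A partition like the exam score example can be written as a function from the raw value to its class. This is a minimal sketch: the class name, labels, and representative values are illustrative assumptions, and the outer classes are taken to be "below 0" and "above 100" so that the ranges neither overlap nor leave gaps.

```java
final class ExamScorePartition {
    // Maps a raw score to the label of its equivalence class.
    static String partitionOf(int score) {
        if (score < 0) return "below 0";
        if (score <= 59) return "0-59";
        if (score <= 69) return "60-69";
        if (score <= 79) return "70-79";
        if (score <= 89) return "80-89";
        if (score <= 100) return "90-100";
        return "above 100";
    }

    // One hypothetical representative per class; testing these covers every partition.
    static final int[] REPRESENTATIVES = {-5, 30, 65, 75, 85, 95, 120};

    public static void main(String[] args) {
        for (int score : REPRESENTATIVES) {
            System.out.println(score + " -> " + partitionOf(score));
        }
    }
}
```

Boundary values such as 0, 59, 60, and 100 are natural extra test inputs, because mistakes in the comparison operators show up exactly there.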

Input model
- Feature extraction
  - For string and structure inputs
  - Split the possible value set by a certain feature
  - Example: String passwd => {contains space}, {no space}
  - It is possible to extract multiple features from one input
  - Example: String name
    => {capitalized first letter}, {not}
    => {contains space}, {not}
    => {length >10}, {2-10}, {1}, {0}
  - One test case may cover multiple features
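The three features of the name example can be sketched as small Java predicates. The class and method names are illustrative assumptions, and "capitalized" is interpreted here as the first character being an uppercase letter.

```java
final class NameFeatures {
    // Feature 1: does the name start with an uppercase letter?
    static boolean capitalizedFirstLetter(String name) {
        return !name.isEmpty() && Character.isUpperCase(name.charAt(0));
    }

    // Feature 2: does the name contain a space?
    static boolean containsSpace(String name) {
        return name.indexOf(' ') >= 0;
    }

    // Feature 3: length class, one of ">10", "2-10", "1", "0".
    static String lengthClass(String name) {
        int n = name.length();
        if (n == 0) return "0";
        if (n == 1) return "1";
        if (n <= 10) return "2-10";
        return ">10";
    }

    public static void main(String[] args) {
        // A single test input yields a value for every feature at once.
        String name = "Ada Lovelace";
        System.out.println(capitalizedFirstLetter(name) + ", " + containsSpace(name) + ", " + lengthClass(name));
    }
}
```

Because one input produces one value per feature, a handful of well-chosen names can cover every feature value.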

Input model
- Feature extraction: structure input
  - A word binary tree (data at all nodes are strings)
    - Depth: integer -> partition {0, 1, 1+}
    - Number of leaves: integer -> partition {0, 1, <10, 10+}
    - Root: null / not
    - A node with only a left child / not
    - A node with only a right child / not
    - Null value data on any node / not
    - Root value: string -> further feature extraction
    - Value on the leftmost leaf: string -> further feature extraction
    - …

Input model
- Infeasible feature combination?
  - Example: String name
    => {capitalized first letter}, {not}
    => {contains space}, {not}
    => {length >10}, {2-10}, {1}, {0}
  - Length = 0 ^ contains space
  - Length = 0 ^ capitalized first letter
  - Length = 1 ^ contains space ^ capitalized first letter

Input combination coverage
- Summary:
  - Try to cover the combinations of possible values of inputs
  - Exponential combinations: N-wise coverage
    - 2-wise coverage is most popular (all-pairs testing)
  - Infinite possible values
    - Input partition
    - Input feature extraction
- Coverage is usually 100% once adopted
  - It is easy to achieve, compared with code coverage
  - Models are not easy to write

Specification Coverage
- A type of input coverage
- Covers the written formal specification in the requirement document
- Example
  - When a number smaller than 0 is fed in, the system should report an error => test case: -1
- Sometimes can be a sequence of inputs
  - When you input a correct user name, a password prompt is shown; after you input the correct password, the user profile will be shown, …
    => test case: xiaoyin, xxxxx, …

Specification Coverage
- Widely used in industry
- Advantages
  - Targets the specification
  - No need for writing oracles
  - Usually can achieve 100% coverage
- Disadvantages
  - Very hard to automate
    - Can only be automated with formal specifications
  - No guarantee of completeness
  - Quality highly depends on the specification

Test coverage
- So far, covering inputs and code
- The final goal of testing
  - Find all bugs in the software
- So there should be a bug coverage
  - The coverage that best represents the adequacy of a test suite
  - 50% bug coverage = half done!
  - 100% bug coverage = done!

But it is impossible
- Bugs are unknown
  - Otherwise we would not need testing
- So we have the number of bugs found, but we do not know what to divide it by
- One possible solution
  - Estimation: 1-10 bugs per KLOC
  - Depends on the type of software and the stage of development; imprecise
  - When you find many bugs, do you think you have found all of them, or that the code is really of low quality?

Mutation coverage
- How can we know how many bugs there are in the code?
  - If only we planted those bugs!
- Mutation coverage checks the adequacy of a test suite by how many human-planted bugs it can expose

Concepts
- Mutant
  - A software version with planted bugs
  - Usually each mutant contains only one planted bug. Why?
- Mutant kill
  - Given a test suite S and a mutant m, if there is a test case t in S such that execute(original, t) != execute(m, t), we say that S kills m
  - Basically, a test suite killing a mutant means that the test suite is able to detect the planted bug represented by the mutant

Illustration

[Figure: the test cases are run against the original and against mutants 1 to n; the oracles compare each mutant's results with the original's results. Same results: the mutant survived. Different results: the mutant is killed.]

Concepts
- Mutation coverage = # of killed mutants / # of generated mutants
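The kill relation and the mutation coverage ratio can be sketched in Java as follows. This is an illustration under assumptions: programs are modeled as functions of two ints, the original is z = x*y + 1, the mutants are three hand-written variants of it, and all names are hypothetical.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;

final class MutationScore {
    static final IntBinaryOperator ORIGINAL = (x, y) -> x * y + 1;

    // Three hand-planted mutants of the original expression.
    static final List<IntBinaryOperator> MUTANTS = List.of(
        (x, y) -> x * y - 1,   // arithmetic operator replaced
        (x, y) -> x + y - 1,   // arithmetic operators replaced
        (x, y) -> x * x + 1);  // variable replaced

    // A suite kills a mutant if some test input makes it disagree with the original.
    static boolean kills(int[][] suite, IntBinaryOperator original, IntBinaryOperator mutant) {
        for (int[] t : suite) {
            if (original.applyAsInt(t[0], t[1]) != mutant.applyAsInt(t[0], t[1])) {
                return true;
            }
        }
        return false;
    }

    // Mutation coverage = killed mutants / generated mutants.
    static double score(int[][] suite, IntBinaryOperator original, List<IntBinaryOperator> mutants) {
        int killed = 0;
        for (IntBinaryOperator m : mutants) {
            if (kills(suite, original, m)) {
                killed++;
            }
        }
        return (double) killed / mutants.size();
    }

    public static void main(String[] args) {
        int[][] weak = {{2, 2}};            // x == y hides the variable replacement
        int[][] strong = {{2, 2}, {2, 3}};
        System.out.println("weak suite:   " + score(weak, ORIGINAL, MUTANTS));
        System.out.println("strong suite: " + score(strong, ORIGINAL, MUTANTS));
    }
}
```

The weak suite leaves the x*x + 1 mutant alive because its only input has x equal to y; adding an input with different values kills it.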

Mutant generation
- Traditional mutation operators
  - Statement deletion
  - Replace Boolean expression with true/false
  - Replace arithmetic operators (+, -, *, /, …)
  - Replace comparison relations (>=, ==, <=, !=)
  - Replace variables
  - …

Mutation Example: Operator

    Mutant operator                     In original   In mutant
    Statement deletion                  z=x*y+1;      (statement removed)
    Boolean expression to true/false    if (x<y)      if (true); if (false)
    Replace arithmetic operators        z=x*y+1;      z=x*y-1; z=x+y-1
    Replace comparison operators        if (x<y)      if (x<=y); if (x==y)
    Replace variables                   z=x*y+1;      z=z*y+1; z=x*x+1

Mutant generation
- Object-oriented mutation operators
  - Insert/delete overriding method
  - Add/delete “this”
  - Instantiation as child class
  - Cast to subtype
  - …

Mutation Example: Object-Oriented
Insert/delete overriding method

Original:

    class Shape {
        public void setID(String id) { this.id = id; }
        public void draw() { ... }
    }
    class Circle extends Shape {
        public void draw() { ... }
    }

Mutant with an inserted overriding method (Circle overrides setID with an empty body):

    class Shape {
        public void setID(String id) { this.id = id; }
        public void draw() { ... }
    }
    class Circle extends Shape {
        public void setID(String id) { }
        public void draw() { ... }
    }

Mutant with a deleted overriding method (Circle no longer overrides draw):

    class Shape {
        public void setID(String id) { this.id = id; }
        protected void draw() { ... }
    }
    class Circle extends Shape {
    }

Problems of mutation testing
- Large time overhead
  - Need to run the test suite over a large number of mutants
  - Causes extra burden for collecting test coverage
- Equivalent mutants
  - A mutant that will not affect the behavior of the software

Time overhead
- For n mutants, requires n times the overhead
- How to reduce time overhead?
  - Reuse execution info
  - Rule out early
    - Mutants that are not covered
    - Mutants that cannot be killed

Reduce Time Overhead

original:

    int index = read;
    while (…) {
        …;
        index++;
        if (index == 10) {
            break;
        }
    }
    return value > 0;

m1 (reuse the program states before the return statement):

    int index = read;
    while (…) {
        …;
        index++;
        if (index == 10) {
            break;
        }
    }
    return value < 0;

The two remaining mutants (labeled m2 and m3 on the slide):

Mutated branch body. If index reads 100, the mutant is not covered:

    int index = read;
    while (…) {
        …;
        index++;
        if (index == 10) {
            return true;
        }
    }
    return value > 0;

Mutated return expression. If value is not 0, nothing is changed:

    int index = read;
    while (…) {
        …;
        index++;
        if (index == 10) {
            break;
        }
    }
    return value + 1 > 0;

Equivalent mutants
- Another main problem in mutation coverage is equivalent mutants
  - A mutant is an equivalent mutant if its semantics is identical to that of the original software

    int index = 0;
    while (…) {
        …;
        index++;
        if (index == 10) {
            break;
        }
    }

    =>

    int index = 0;
    while (…) {
        …;
        index++;
        if (index >= 10) {
            break;
        }
    }
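The equivalence can be observed on a runnable version of the loop. The sketch below fills the elided parts with assumptions: the loop condition is taken to be index < limit, the loop body does nothing except the increment, and the class and method names are made up. Since index starts at 0 and grows by exactly 1, the first value that satisfies index >= 10 is 10 itself, so no input distinguishes the two versions.

```java
final class EquivalentMutantDemo {
    // Original: leaves the loop when index == 10.
    static int original(int limit) {
        int index = 0;
        while (index < limit) { // hypothetical loop condition
            index++;
            if (index == 10) {
                break;
            }
        }
        return index;
    }

    // Mutant: comparison operator replaced, == becomes >=.
    static int mutant(int limit) {
        int index = 0;
        while (index < limit) { // hypothetical loop condition
            index++;
            if (index >= 10) {
                break;
            }
        }
        return index;
    }

    // Checks that original and mutant agree for every limit from 0 to maxLimit.
    static boolean agreeUpTo(int maxLimit) {
        for (int limit = 0; limit <= maxLimit; limit++) {
            if (original(limit) != mutant(limit)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("agree on limits 0..1000: " + agreeUpTo(1000));
    }
}
```

Agreement on sampled inputs is only evidence, not a proof; deciding equivalence in general is undecidable, which is why equivalent mutants are hard to filter out automatically.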

Equivalent mutants
- Another main problem in mutation coverage is equivalent mutants
  - Equivalent mutants prevent mutation coverage from ever reaching 100%
  - So you do not know whether there are too many equivalent mutants or the test suite is not adequate

Reduce equivalent mutants
- Using compiler optimization
  - Check whether the compiled bytecode is the same as that of the original software
    - Mutating dead code
    - Mutating an unused variable
- After the mutated code, write a conditional path, and check whether the path is feasible

    //result = a + b;
    result = a - b;

    =>

    //result = a + b;
    result = a - b;
    if (a + b != a - b) {
        not equivalent;
    }

Mutation testing tools
- MILU: http://www0.cs.ucl.ac.uk/staff/Y.Jia/#tools
- MuJava: http://cs.gmu.edu/~offutt/mujava/
- Javalanche: https://github.com/david-schuler/javalanche/

Summary on all coverage measures
- Code coverage
  - Target: code
  - Adequacy: no -> 100% code coverage != no bugs
  - Approximation: data flow, branch, method/statement
  - Usability: medium (requires code for instrumentation)
  - Preparation: none
  - Overhead: low (instrumentation causes some overhead)

Summary on all coverage measures
- Input combination coverage
  - Target: inputs
  - Adequacy: yes -> 100% input coverage == no bugs
  - Approximation: n-wise coverage, input partition, input feature extraction
  - Usability: none
  - Preparation: hard (requires input mapping)
  - Overhead: none

Summary on all coverage measures
- Mutation coverage
  - Target: bugs
  - Adequacy: no -> 100% mutant coverage != no bugs
  - Approximation: mutation is already an approximation
  - Usability: medium (requires code changes for mutants)
  - Preparation: none
  - Overhead: very high (execution on instrumented mutated versions)