11/14/2014
MTAT.03.094 / Lecture 10 / © Dietmar Pfahl 2014
MTAT.03.094
Software Engineering
Lecture 10:
Verification & Validation II
Dietmar Pfahl
email: [email protected] Fall 2014
Schedule of Lectures
Week 01: Introduction to SE
Week 02: Requirements Engineering I
Week 03: Requirements Engineering II
Week 04: Analysis
Week 05: Dev. Infrastructure I
Week 06: Dev. Infrastructure II
Week 07: ICS Day / ATI Päev 2014
Week 08: Architecture and Design
Week 09: Refactoring
Week 10: Verification & Validation I
Week 11: Verification & Validation II
Week 12: Agile/Lean Methods
Week 13: Software Quality Management
Week 14: Measurement & Process Improvement
Course wrap-up, review and exam preparation
Week 15: no lecture
Week 16: no lecture
Structure of Lecture 10
• Fundamental Definitions and Concepts
• Creating Test Cases
• Manually
• Automatically
• Assessing the Quality of Test Suites
• Test Coverage
• Mutation Testing
• Static Analysis
• Quality Prediction
Error Guessing
• Exploratory testing, happy testing, ...
• Always worth including
• Can trigger failures that systematic techniques miss
• Consider
• Past failures
• Intuition
• Experience
• Brainstorming
• “What is the craziest thing we can do?”
• Lists in literature
Exploratory Testing
• Inventors:
• Cem Kaner, James Bach (1990s)
• Definition:
• “Exploratory testing is simultaneous learning, test design, and test execution.”
• Elements / Variants
• Charter: defines mission (and sometimes tactics to use)
• Example: “Check UI against Windows interface standards”
• Session-based test management: Defects + Notes + Interviews of the testers
Usability Test Types + Environment
Rubin’s Types of Usability Tests (Rubin, 1994, p. 31-46)
Exploratory test – early product development
Assessment test – most typical, either early or midway in the product development
Validation test – verification of product’s usability
Comparison test – compare two or more designs; can be used with other three types of tests
Think aloud
Observe
Audio/video recording
Testing Strategies
[Figure: Black Box Testing (requirements, input, events, output) vs. White Box Testing]
There are many possible paths!
[Figure: flow graph with if-then-else branches inside a loop (< 20x)]
5^20 (~10^14) different paths
Selective Testing
White-Box Testing
Control Flow Graph (CFG)
empty Blocks (= Nodes): 4 Edges: 4
Control Flow – Example
If (c1) then {
    if (c2) then {s1}
    s2
    while (c3) do {s3}
}
else {
    if (c4) then {
        repeat {s4} until (c5)
    }
}
[Figure: control flow graph of the code above with decision nodes c1 to c5, statement nodes s1 to s4, and edges d1 to d14]
Control Flow – Example
If (c1) then {
    CFG(c1=true)
}
else {
    CFG(c1=false)
}
[Figure: decision node c1 with outgoing edges d1 and d2 leading to the sub-graphs CFG(t) and CFG(f)]
Control Flow – Example
If (c1) then {
    CFG(if)
    s2
    CFG(while)
}
else {
    if (c4) then {
        CFG(repeat)
    }
}
[Figure: partially expanded control flow graph with nodes c1, c4, s2, the sub-graphs CFG(if), CFG(while), CFG(repeat), and edges d1, d2, d4, d6, d9, d10, d11, d14]
Overview of Control Flow Criteria
Statement (or Block) Coverage – all nodes
Decision (or Branch) Coverage – all edges
Condition Coverage
Condition/Decision Coverage
Multiple Condition Coverage
Modified Condition Decision Coverage (MC/DC)
Linearly Independent Paths
Simple Paths
Visit-Each Loop
All Paths
…
[Figure: control flow graph of the running example with decision nodes c1 to c5, statement nodes s1 to s4, and edges d1 to d14]
Life Insurance Example
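boolean AccClient(int age; gtype gender)
1: if (gender = female && age < 85)
2:     return(TRUE);
3: if (gender = male && age < 80)
4:     return(TRUE);
5: return(FALSE);
d1 = c1 && c2 (c1: gender = female, c2: age < 85)
d2 = c3 && c4 (c3: gender = male, c4: age < 80)
[Figure: control flow graph of AccClient from start node S through nodes 1 to 5 to end node E]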
Statement Coverage
• Criterion:
• Execute each statement at least once
• Tools can be used to monitor execution
• Possible concern:
• Dead code
• Example: assume that due to some previous calculations, AccClient can only be invoked with parameter value gender = female
Statement Coverage /1
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
Statement coverage: 0 % (before any test is run)
Statement Coverage /2
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
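[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
Statement coverage: 40 % (after the first test: statements 1 and 2)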
Statement Coverage /3
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
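[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
Statement coverage: 80 % (after both tests: statements 1, 2, 3 and 5)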
Statement Coverage /4
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
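[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
AccClient(25, male)->accept
Statement coverage: 100 % (the third test adds statement 4)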
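The coverage figures on the Statement Coverage slides can be reproduced with a small sketch. This is a hand-instrumented Python port of AccClient, not the output of a coverage tool; the names `acc_client`, `executed` and `statement_coverage` are assumptions made for illustration.

```python
def acc_client(age, gender, executed):
    """Python port of AccClient; `executed` collects the statement numbers that run."""
    executed.add(1)
    if gender == "female" and age < 85:
        executed.add(2)
        return True
    executed.add(3)
    if gender == "male" and age < 80:
        executed.add(4)
        return True
    executed.add(5)
    return False


def statement_coverage(tests, total_statements=5):
    """Percentage of statements executed by at least one test case."""
    executed = set()
    for age, gender in tests:
        acc_client(age, gender, executed)
    return 100.0 * len(executed) / total_statements


tests = [(83, "female"), (83, "male"), (25, "male")]
for n in range(1, len(tests) + 1):
    # Coverage after running the first n tests of the suite
    print(n, "test(s):", statement_coverage(tests[:n]), "%")
```

Running the first one, two and three tests gives the 40 %, 80 % and 100 % steps shown on the slides.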
Decision (Branch) Coverage /1
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
Decision coverage: 0 % (before any test is run)
Decision (Branch) Coverage /2
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
AccClient(25, male)->accept
Decision coverage: 25 % (d1: T)
Decision (Branch) Coverage /3
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
AccClient(25, male)->accept
Decision coverage: 75 % (d1: T/F, d2: F)
Decision (Branch) Coverage /4
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
AccClient(25, male)->accept
Decision coverage: 100 % (d1: T/F, d2: T/F)
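A minimal sketch of how the decision coverage steps can be computed, assuming a Python port of AccClient that records each (decision, outcome) pair it takes. The helper names are hypothetical and the instrumentation is done by hand.

```python
def acc_client(age, gender, outcomes):
    """Python port of AccClient; `outcomes` collects (decision, outcome) pairs."""
    d1 = gender == "female" and age < 85
    outcomes.add(("d1", d1))
    if d1:
        return True
    d2 = gender == "male" and age < 80
    outcomes.add(("d2", d2))
    if d2:
        return True
    return False


def decision_coverage(tests, decisions=("d1", "d2")):
    """Percentage of decision outcomes (true and false branch of each decision) taken."""
    outcomes = set()
    for age, gender in tests:
        acc_client(age, gender, outcomes)
    return 100.0 * len(outcomes) / (2 * len(decisions))


tests = [(83, "female"), (83, "male"), (25, "male")]
for n in range(1, len(tests) + 1):
    # Coverage after running the first n tests of the suite
    print(n, "test(s):", decision_coverage(tests[:n]), "%")
```

The three prefixes of the test suite yield 25 %, 75 % and 100 %, matching the slide sequence.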
Decision (Branch) Coverage /?
boolean AccClient(int age; gtype gender)
1: if (gender = female){
2:     if (age < 85)
3:         return(TRUE);
4:     return(FALSE);}
5: if (gender = male){
6:     if (age < 80)
7:         return(TRUE);
8:     return(FALSE);}
9: return(FALSE);
[Figure: control flow graph (nodes S, 1 to 9, E) with decisions d1 = c1, d2 = c2, d3 = c3, d4 = c4; node 9 is marked "Dead code!"]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
AccClient(25, male)->accept
Decision coverage: 75 % (d1: T/F, d2: T, d3: T, d4: T/F)
Decision (Branch) Coverage /?
boolean AccClient(int age; gtype gender)
if (gender = female){
if (age < 85)
return(TRUE);
return(FALSE);}
if (gender = male){
if (age < 80)
return(TRUE);
return(FALSE);}
return(FALSE);
[Figure: control flow graph (nodes S, 1, 2, 3, 5, 6, 7, 9, E) with decisions d1 = c1, d2 = c2, d3 = c3, d4 = c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
AccClient(25, male)->accept
Decision coverage: 100 %
Condition Coverage
• Test all conditions (in all predicate nodes)
• Each condition must be evaluated at least once (or, in the stricter variant: at least once to 'true' and once to 'false')
• A (simple) condition may contain:
• Relational operators
• Arithmetic expressions
• ...
• A predicate may contain several (simple) conditions connected via Boolean operators
If (A<10 and B>250) then …
Condition Coverage /1
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
Condition coverage: 0 % (before any test is run)
Condition Coverage /2
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
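[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
Condition coverage: 50 % (conditions evaluated at least once) or 25 % (condition outcomes covered)
After the first test: c1: T, c2: T, d1: T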
Condition Coverage /3
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
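[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
Condition coverage: 100 % (conditions evaluated at least once) or 62.5 % (condition outcomes covered)
c1: T/F, c2: T, d1: T/F
c3: T, c4: F, d2: F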
Condition Coverage /4
boolean AccClient(int age; gtype gender)
if (gender = female && age < 85)
return(TRUE);
if (gender = male && age < 80)
return(TRUE);
return(FALSE);
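[Figure: control flow graph of AccClient (nodes S, 1 to 5, E) with decisions d1 = c1 && c2 and d2 = c3 && c4]
Condition coverage: 100 % (conditions evaluated at least once) or 75 % (condition outcomes covered)
c1: T/F, c2: T, d1: T/F
c3: T, c4: F/T, d2: F/T
Test:
AccClient(83, female)->accept
AccClient(83, male)->reject
AccClient(25, male)->accept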
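The two percentages per slide correspond to two readings of condition coverage: "each condition evaluated at least once" and "each condition evaluated to both true and false". The sketch below computes both, assuming short-circuit evaluation of `&&` (an assumption about the slide's language) and using a hypothetical `cond` helper to record outcomes.

```python
def acc_client(age, gender, seen):
    """Python port of AccClient; `seen` collects (condition, outcome) pairs."""
    def cond(name, value):
        seen.add((name, value))
        return value

    # `and` short-circuits, so c2 and c4 are only recorded when c1 / c3 are true
    if cond("c1", gender == "female") and cond("c2", age < 85):
        return True
    if cond("c3", gender == "male") and cond("c4", age < 80):
        return True
    return False


def condition_coverage(tests, conditions=("c1", "c2", "c3", "c4")):
    """Return (percent of conditions evaluated, percent of condition outcomes covered)."""
    seen = set()
    for age, gender in tests:
        acc_client(age, gender, seen)
    evaluated = {name for name, _ in seen}
    return (100.0 * len(evaluated) / len(conditions),
            100.0 * len(seen) / (2 * len(conditions)))


tests = [(83, "female"), (83, "male"), (25, "male")]
for n in range(1, len(tests) + 1):
    # Both coverage readings after running the first n tests
    print(n, "test(s):", condition_coverage(tests[:n]))
```

With all three tests, c2 and c3 are never evaluated to false, which is why the stricter reading stays at 75 %.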
Advanced Condition Coverage
• Condition/Decision Coverage (C/DC)
• as DC plus: every (simple) condition in each decision is tested in each possible outcome
• Modified Condition/Decision coverage (MC/DC)
• as above plus, every (simple) condition shown to independently affect a decision outcome (by varying that condition only)
• a (simple) condition independently affects a decision when, by flipping that condition and holding all the others fixed, the decision changes
• this criterion was created at Boeing and is required for aviation software according to RTCA/DO-178B
• Multiple-Condition Coverage (M-CC)
• all possible combinations of (simple) conditions within each decision taken
CC, DC, C/DC, M-CC, MC/DC Examples
Condition:
(TF) A = 2; B = 200 (D: False)
[(FT) A = 12; B = 300 (D: False)]
Decision:
(TT) A = 2; B = 300 (D: True)
(FT) A = 12; B = 300 (D: False)
Condition/Decision:
(TT) A = 2; B = 300 (D: True)
(FF) A = 12; B = 200 (D: False)
Multiple Condition:
(TT) A = 2; B = 300 (D: True)
(FT) A = 12; B = 300 (D: False)
(TF) A = 2; B = 200 (D: False)
(FF) A = 12; B = 200 (D: False)
Modified Condition/Decision:
(TT) A = 2; B = 300 (D: True)
(FT) A = 12; B = 300 (D: False)
(TF) A = 2; B = 200 (D: False)
If (A<10 and B>250) then …
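Whether a test set meets MC/DC for the predicate above can be checked mechanically. The following is a sketch under the assumption that both conditions are always evaluated (no short-circuiting); `satisfies_mcdc` and the other names are hypothetical helpers, not part of any tool.

```python
from itertools import combinations


def conditions(a, b):
    """Truth values of the two simple conditions in `A<10 and B>250`."""
    return (a < 10, b > 250)


def decision(a, b):
    """Outcome of the whole decision."""
    return a < 10 and b > 250


def satisfies_mcdc(tests):
    """True if every condition is shown to independently affect the decision.

    A condition is 'shown' when two tests differ in that condition only
    and the decision outcome differs as well.
    """
    shown = set()
    for t1, t2 in combinations(tests, 2):
        c1, c2 = conditions(*t1), conditions(*t2)
        differing = [i for i in range(len(c1)) if c1[i] != c2[i]]
        if len(differing) == 1 and decision(*t1) != decision(*t2):
            shown.add(differing[0])
    return len(shown) == 2


mcdc_set = [(2, 300), (12, 300), (2, 200)]   # TT, FT, TF from the slide
cdc_set = [(2, 300), (12, 200)]              # TT, FF from the slide
print(satisfies_mcdc(mcdc_set), satisfies_mcdc(cdc_set))
```

The Condition/Decision set fails the check because its two tests flip both conditions at once, so neither condition is isolated.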
Independent Path Coverage
• McCabe cyclomatic complexity estimates number of test cases needed
• The number of independent paths needed to cover all simple paths at least once in a program
• Visualize by drawing a CFG
• CC = #(edges) – #(nodes) + 2
• CC = #(decisions) + 1
[Figures: control flow graphs of if-then-else, while-loop, and case-of]
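The formula CC = #(edges) – #(nodes) + 2 can be applied directly to an edge list. This sketch assumes a single connected CFG with one entry and one exit node; the node names are made up for illustration.

```python
def cyclomatic_complexity(edges):
    """McCabe's number for a connected CFG given as (source, target) pairs."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# if-then-else: one decision node, two branches, one join node
if_then_else = [("d", "then"), ("d", "else"), ("then", "join"), ("else", "join")]

# while-loop: the loop test either enters the body or leaves the loop
while_loop = [("entry", "test"), ("test", "body"), ("body", "test"), ("test", "exit")]

print(cyclomatic_complexity(if_then_else))
print(cyclomatic_complexity(while_loop))
```

Both graphs contain exactly one decision, so the result agrees with CC = #(decisions) + 1.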
Independent Paths Coverage – Example
• Independent Paths Coverage
• Requires that a minimum set of linearly independent paths through the control flow-graph be executed
• This test strategy is the rationale for McCabe’s cyclomatic number (McCabe 1976) …
• … which is equal to the number of test cases required to satisfy the strategy.
[Figure: control flow graph of the running example with edges numbered 1 to 14]
Cyclomatic Complexity = 5 + 1 = 6
Independent Paths Coverage – Example
Edges:  1  2  3  4  5  6  7  8  9 10 11 12 13 14
Path1:  1  0  0  1  0  1  0  0  1  0  0  0  0  0
Path2:  1  0  1  0  1  1  1  1  1  0  0  0  0  0
Path3:  1  0  0  1  0  1  1  1  1  0  0  0  0  0
Path4:  0  1  0  0  0  0  0  0  0  1  0  1  0  1
Path5:  0  1  0  0  0  0  0  0  0  1  0  1  1  1
Path6:  0  1  0  0  0  0  0  0  0  0  1  0  0  0
[Figure: control flow graph of the running example with edges numbered 1 to 14]
Independent Paths Coverage – Example
Edges:     1  2  3  4  5  6  7  8  9 10 11 12 13 14
Why no need to cover Path7 ???
Path7:     1  0  1  0  1  1  0  0  1  0  0  0  0  0
Because it equals Path1+Path2-Path3 !!!
Path1:     1  0  0  1  0  1  0  0  1  0  0  0  0  0
Path2:     1  0  1  0  1  1  1  1  1  0  0  0  0  0
P1+P2:     2  0  1  1  1  2  1  1  2  0  0  0  0  0
Path3:     1  0  0  1  0  1  1  1  1  0  0  0  0  0
P1+P2-P3:  1  0  1  0  1  1  0  0  1  0  0  0  0  0
[Figure: control flow graph of the running example with edges numbered 1 to 14]
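The linear dependence of Path7 can be verified by treating each path as a vector of edge traversal counts. The sketch below only re-does the arithmetic from the slide; the variable names are chosen for illustration.

```python
# Edge-count vectors for edges 1..14, copied from the slide
path1 = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
path2 = [1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
path3 = [1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
path7 = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# Path1 + Path2 - Path3, computed edge by edge
combined = [a + b - c for a, b, c in zip(path1, path2, path3)]

print(combined == path7)
```

Because Path7 is a linear combination of paths already in the basis set, it adds no new independent path.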
Control-Flow Coverage Relationships
Subsumption:
a criterion C1 subsumes another criterion C2, if any test set {T} that satisfies C1 also satisfies C2
[Figure: subsumption hierarchy over the criteria Complete Path, Linearly Indep. Path, Multiple Condition, MC/DC, C/DC, Condition, Decision, and Statement]
Data Flow Testing
• Identifies paths in the program that go
• from the assignment of a value to a variable
• to the use of such variable,
to make sure that the variable is properly used.
X:=14; ….. Y:= X-3;
Data Flow Testing – Definitions
• Def – assigned or changed
• Uses – utilized (not changed)
• C-use (Computation) e.g. right-hand side of an assignment, an index of an array, parameter of a function.
• P-use (Predicate) branching the execution flow, e.g. in an if statement, while statement, for statement.
• Example: All def-use paths (DU) requires that each DU chain is covered at least once
• Why interesting?
• E.g., consider: def-def-def (and no use) or use (without def)
Data Flow Testing – Example
[1] bool AccClient(agetype age; gndrtype gender)
[2] bool accept
[3] if(gender=female)
[4]     accept := age < 85;
[5] else
[6]     accept := age < 80;
[7] return accept
Considering age, there are two DU paths:
(a) [1]-[4]
(b) [1]-[6]
Test cases needed:
(a) AccClient() is executed and if-cond is true
(b) AccClient() is executed and if-cond is false
Test case data:
AccClient(*, female)-> *
AccClient(*, male)-> *
Data Flow Testing – Example
[1] bool AccClient(agetype age; gndrtype gender)
[2] bool accept
[3] if(gender=female)
[4]     accept := age < 85;
[5] else
[6]     accept := age < 80;
[7] return accept
Considering gender, there is one DU path:
(a) [1]-[3]
Test cases needed:
(a) AccClient() is executed
Test case data:
AccClient(*, *)-> *
Data Flow Criteria
[Figure: hierarchy of data flow criteria ordered from weaker to stronger (with a growing # tests): All defs, All p-uses, All c-uses, "All c-uses, some p-uses", "All p-uses, some c-uses", All uses, All def-use paths]
What can be automated?
1. Identify
2. Design
3. Build
4. Execute
5. Check
[Figure: the five activities placed on a scale from "Intellectual / Performed once" to "Clerical / Repeated"]
Test Automation
• Effectiveness – independent of automation
• Efficiency – can be improved by automation
• Automating a test takes 2-10 (in extreme cases up to 30) times as long as running it manually for the first time!
Test Automation
• Not only the automatic execution of test cases
• Also:
• Generation of test data (input data)
• Evaluation of tests (-> test oracle needed)
• Generation of test cases (including test oracle)
• Selection of test cases (e.g., in regression testing)
• Reporting of test results
• Measurement of test effectiveness
What to automate first?
• Most important tests
• A set of breadth tests (sample each system area overall)
• Tests for the most important functions
• Tests that are easiest to automate
• Tests that will give the quickest payback
• Tests that are run the most often
Tests...
... to automate
Tests that are run often
• Regression tests
• Important but trivial
Expensive to perform manually
• Multi-user tests
• Endurance/reliability tests
Difficult to perform manually
• Timing critical
• Complex tests
• Difficult comparisons
... not to automate
Tests that are run rarely
Tests that are not important
• will not find severe faults
Hard to recognize defects
• Usability tests
• Do the colours look nice?
Test Automation Approaches
• For input data generation:
• Random testing
• Statistical testing
• Load / stress testing
• For test case selection:
• Search-based testing
• Regression testing
• Combinatorial testing (covering arrays)
• For test case generation:
• Symbolic execution
• Model-based testing
• For test effectiveness measurement:
• Coverage measurement
• Mutation testing
Test automation promises
1. Efficient regression test
2. Run tests more often
3. Perform difficult tests (e.g. load, outcome check)
4. Better use of resources
5. Consistency and repeatability
6. Reuse of tests
7. Earlier time to market
8. Increased confidence
Common problems
1. Unrealistic expectations
2. Poor testing practice ”Automatic chaos just gives faster chaos”
3. Expected effectiveness
4. False sense of security
5. Maintenance of automatic tests
6. Technical problems (e.g. Interoperability)
7. Organizational issues
Structure of Lecture 10
• Fundamental Definitions and Concepts
• Creating Test Cases
• Manually
• Automatically
• Assessing the Quality of Test Suites
• Test Coverage
• Mutation Testing
• Static Analysis
• Quality Prediction
Code Coverage (Test Coverage)
• Measures the extent to which certain code items related to a defined test adequacy criterion have been executed (covered) by running a set of test cases (= test suite)
• Goal: Define test suites such that they cover as many (disjoint) code items as possible
Code Coverage Measure – Example
• Statement Coverage (CV_s)
• Portion of the statements tested by at least one test case.
$$CV_s = \frac{S_t}{S_p} \times 100\%$$
S_t: number of statements tested
S_p: total number of statements
Main Classes of Test Adequacy Criteria
• Control Flow Criteria:
• Statement, decision (branch), condition, and path coverage are examples of control flow criteria
• They rely solely on syntactic characteristics of the program (ignoring the semantics of the program computation)
• Data Flow Criteria:
• Require the execution of path segments that connect parts of the code that are intimately connected by the flow of data
• Other Criteria:
• Requirements ’covered’, ...
Mutation Testing
Let’s count marbles ... a lot of marbles
• Suppose we have a big bowl of marbles
• How can we estimate how many?
• I don’t want to count every marble individually
• I have a bag of 100 other marbles of the same size, but a different color
• What if I mix them?
Estimating marbles ...
• I mix 100 black marbles into the bowl
• Stir well ...
• I draw out 100 marbles at random
• 20 of them are black
• How many marbles were in the bowl to begin with?
400
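The answer of 400 follows from assuming that the share of black marbles in the sample equals their share in the whole bowl. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def estimate_original_count(seeded, drawn, seeded_in_sample):
    """Estimate how many items were in the bowl before seeding.

    Assumes the sample is representative: seeded_in_sample / drawn
    equals seeded / (original + seeded).
    """
    if seeded_in_sample == 0:
        raise ValueError("no seeded items in the sample, estimate is undefined")
    total_after_seeding = seeded * drawn / seeded_in_sample
    return total_after_seeding - seeded


# 100 black marbles mixed in, 100 drawn, 20 of them black
print(estimate_original_count(100, 100, 20))
```

Here 20 % of the sample is black, so the 100 black marbles are taken to be 20 % of 500 marbles in total, leaving 400 original ones.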
Estimating Test Suite Quality
• Now, instead of a bowl of marbles, I have a program with faults (bugs)
• I add 100 new faults
• Assume they are exactly like real faults in every way
• I make 100 copies of my program, each with one of my 100 new faults
• I run my test suite on the programs with seeded faults ...
• ... and the tests reveal 20 of the faults
• (the other 80 program copies do not fail)
• What can I infer about my test suite?
Basic Assumptions
• We’d like to judge effectiveness of a test suite in finding real faults, by measuring how well it finds seeded fake faults.
• Valid to the extent that the seeded bugs are representative of real bugs
• Not necessarily identical (e.g., black marbles are not identical to clear marbles); but the differences should not affect the selection
• E.g., if I mix metal ball bearings into the marbles, and pull them out with a magnet, I don’t learn anything about how many marbles were in the bowl
Mutation Testing Assumptions
Competent programmer hypothesis:
Programs are nearly correct
Real faults are small variations from the correct program
=> Mutants are reasonable models of real faulty programs
Coupling effect hypothesis:
Tests that find simple faults also find more complex faults
Even if mutants are not perfect representatives of real faults, a test suite that kills mutants is good at finding real faults too
Fault-Based Testing (Mutation Testing)
Terminology
Mutant – new version of the program with a small deviation (=fault) from the original version
Killed mutant – new version detected by the test suite
Live mutant – new version not detected by the test suite
Mutation Testing
• A method for evaluation of test suite effectiveness – not for designing test cases!
1. Take a program and test data generated for that program
2. Create a number of similar programs (mutants), each differing from the original in a small way
3. The original test data are then run through the mutants
4. If tests detect all changes in mutants, then the mutants are dead and the test suite adequate
Otherwise: Create more test cases and iterate 2-4 until a sufficiently high number of mutants is killed
Example Mutation Operations
• Change relational operator (<,>, …)
• Change logical operator (||, &&, …)
• Change arithmetic operator (*, +, -,…)
• Change constant name / value
• Change variable name / initialisation
• Change (or even delete) statement
• …
Example Mutants
Original:
if (a || b)
    c = a + b;
else
    c = 0;
Mutant 1 (logical operator changed):
if (a && b)
    c = a + b;
else
    c = 0;
Mutant 2 (arithmetic operator changed):
if (a || b)
    c = a * b;
else
    c = 0;
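To see how a test suite kills (or fails to kill) these two mutants, here is a sketch that ports the original and both mutants to Python functions and computes the share of killed mutants. Integer inputs are treated as truthy/falsy as in C; the function names and the two test suites are assumptions made for illustration.

```python
def original(a, b):
    return a + b if (a or b) else 0


def mutant_and(a, b):          # logical operator mutant: || replaced by &&
    return a + b if (a and b) else 0


def mutant_mul(a, b):          # arithmetic operator mutant: + replaced by *
    return a * b if (a or b) else 0


def killed(mutant, tests):
    """A mutant is killed if at least one test input yields a different result."""
    return any(mutant(a, b) != original(a, b) for a, b in tests)


def mutation_score(mutants, tests):
    """Fraction of mutants killed by the test inputs."""
    return sum(killed(m, tests) for m in mutants) / len(mutants)


mutants = [mutant_and, mutant_mul]
weak_suite = [(0, 0), (2, 2)]            # both mutants survive
strong_suite = weak_suite + [(1, 0)]     # (1, 0) distinguishes both mutants

print(mutation_score(mutants, weak_suite), mutation_score(mutants, strong_suite))
```

The input (2, 2) is a poor choice for the arithmetic mutant because 2 + 2 and 2 * 2 coincide, which is why the weak suite leaves both mutants alive.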
Types of Mutants
• Stillborn mutants: Syntactically incorrect – killed by compiler, e.g., x = a ++ b
• Trivial mutants: Killed by almost any test case
• Equivalent mutants: Always behave the same as the original program, e.g., x = a + b and x = a - (-b)
• None of the above are interesting from a mutation testing perspective
• Those mutants are interesting which behave differently than the original program, and we do not (yet) have test cases to identify them (i.e., to cover those specific changes)
Equivalent Mutants
Original 1:
if (a == 2 && b == 2)
    c = a + b;
else
    c = 0;
Equivalent mutant 1 (+ replaced by *):
if (a == 2 && b == 2)
    c = a * b;
else
    c = 0;
Original 2:
int index=0;
while (...)
{
    . . .;
    index++;
    if (index==10)
        break;
}
Equivalent mutant 2 (== replaced by >=):
int index=0;
while (...)
{
    . . .;
    index++;
    if (index>=10)
        break;
}
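For the first pair above, equivalence can be made plausible by exhaustively comparing both versions over a finite input range. This is a sketch with hypothetical names; a search over a finite range is evidence rather than a proof, although here the guard `a == 2 && b == 2` makes the equivalence obvious (2 + 2 = 2 * 2).

```python
from itertools import product


def original(a, b):
    return a + b if (a == 2 and b == 2) else 0


def mutant(a, b):              # + replaced by *
    return a * b if (a == 2 and b == 2) else 0


def first_difference(domain):
    """Return the first input pair on which the versions disagree, or None."""
    for a, b in product(domain, repeat=2):
        if original(a, b) != mutant(a, b):
            return (a, b)
    return None


print(first_difference(range(-50, 51)))
```

No test input can kill such a mutant, so it has to be excluded when interpreting the share of killed mutants.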
Program Example
nbrs = new int[range]
public int max(int[] a) {
int imax := 0;
for (int i = 1; i <= range; i++)
if a[i] > a[imax]
imax:= i;
return imax;
}
a[0] a[1] a[2] max
TC1 1 2 3 2
TC2 1 3 2 1
TC3 3 1 2 0
Program returns the index of the array element with the maximum value.
Variable Name Mutant
nbrs = new int[range]
public int max(int[] a) {
int imax := 0;
for (int i = 1; i <= range; i++)
if i > a[imax]
imax:= i;
return imax;
}
a[0] a[1] a[2] max
TC1 1 2 3 2
TC2 1 3 2 0
TC3 3 1 2 0
Relational Operator Mutant
nbrs = new int[range]
public int max(int[] a) {
int imax := 0;
for (int i = 1; i <= range; i++)
if a[i] >= a[imax]
imax:= i;
return imax;
}
a[0] a[1] a[2] max
TC1 1 2 3 2
TC2 1 3 2 1
TC3 3 1 2 0
Need a test case with two identical max entries in a[.], e.g., (1, 3, 3)
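The point of this slide can be checked with a Python port of `max` and its relational operator mutant. The port is a sketch: it iterates with `range(1, len(a))` instead of the slide's `range` variable, and the function names are assumptions.

```python
def max_index(a):
    """Index of the first maximum element (original program)."""
    imax = 0
    for i in range(1, len(a)):
        if a[i] > a[imax]:
            imax = i
    return imax


def max_index_mutant(a):
    """Relational operator mutant: > replaced by >=."""
    imax = 0
    for i in range(1, len(a)):
        if a[i] >= a[imax]:
            imax = i
    return imax


suite = [[1, 2, 3], [1, 3, 2], [3, 1, 2]]     # TC1 to TC3 from the slide
survives = all(max_index(tc) == max_index_mutant(tc) for tc in suite)
print(survives)

# The additional test case with two identical maximum entries kills the mutant
print(max_index([1, 3, 3]), max_index_mutant([1, 3, 3]))
```

With duplicates, the original returns the first maximum and the mutant the last one, so only an input such as (1, 3, 3) tells them apart.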
Variable Operator Mutant
nbrs = new int[range]
public int max(int[] a) {
int imax := 0;
for (int i = 0; i < range; i++)
if a[i] > a[imax]
imax:= i;
return imax;
}
a[0] a[1] a[2] max
TC1 1 2 3 2
TC2 1 3 2 1
TC3 3 1 2 0
Need a test case detecting wrong loop counting
Structure of Lecture 10
• Fundamental Definitions and Concepts
• Creating Test Cases
• Manually
• Automatically
• Assessing the Quality of Test Suites
• Test Coverage
• Mutation Testing
• Static Analysis
• Quality Prediction
Static Analysis
• Document Review (manual)
• Different types
• Static Code Analysis (automatic)
• Structural properties / metrics
Reviews - Terminology
• Static testing – testing without software execution
• Review – meeting to evaluate software artifact
• Inspection – formally defined review
• Walkthrough – author guided review
Reviews complement testing
Relative Cost of Faults
[Figure: relative cost of faults per development phase, rising to 200 in Maintenance]
Source: Davis, A.M., “Software Requirements: analysis and specification” (1990)
Reading Techniques
• Ad hoc
• Checklist-based
• Defect-based
• Scenario-based
• Usage-based
• Perspective-based
Perspective-based Reading
• Scenarios
• Purpose
• Decrease overlap (redundancy)
• Improve effectiveness
[Figure: reviewer perspectives Designer, Tester, and User]
Structure of Lecture 10
• Fundamental Definitions and Concepts
• Creating Test Cases
• Manually
• Automatically
• Assessing the Quality of Test Suites
• Test Coverage
• Mutation Testing
• Static Analysis
• Quality Prediction
Quality Prediction
• Based on product and process properties
• Quality = Function(Code Size | Complexity)
• Quality = Function(Code Changes)
• Quality = Function(Detected #Defects)
• Quality = Function(Test Effort)
• Based on detected defects
• Capture-Recapture Models
• Reliability Growth models
Quality def.: Undetected #Defects
Capture-Recapture – Defect Estimation
• Situation: Two inspectors are assigned to inspect the same product
• d1: #defects detected by Inspector 1
• d2: #defects detected by Inspector 2
• d12: #defects detected by both inspectors
• Nt: total #defects (detected and undetected)
• Nr: remaining #defects (undetected)
$$N_t = \frac{d_1 \cdot d_2}{d_{12}} \qquad N_r = N_t - (d_1 + d_2 - d_{12})$$
Capture-Recapture – Example
• Situation: Two inspectors are assigned to inspect the same product
• d1: 50 defects detected by Inspector 1
• d2: 40 defects detected by Inspector 2
• d12: 20 defects detected by both inspectors
• Nt: total defects (detected and undetected)
• Nr: remaining defects (undetected)
$$N_t = \frac{d_1 \cdot d_2}{d_{12}} = \frac{50 \cdot 40}{20} = 100 \qquad N_r = 100 - (50 + 40 - 20) = 30$$
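The two-inspector estimate can be written as a small function. This is a sketch of the formulas on the slide; the function name and the error handling for d12 = 0 are assumptions.

```python
def capture_recapture(d1, d2, d12):
    """Return (estimated total defects Nt, estimated remaining defects Nr).

    d1, d2: defects found by inspector 1 and 2; d12: defects found by both.
    """
    if d12 == 0:
        raise ValueError("no overlap between inspectors, estimate is undefined")
    total = d1 * d2 / d12
    remaining = total - (d1 + d2 - d12)
    return total, remaining


print(capture_recapture(50, 40, 20))
```

The term d1 + d2 - d12 is the number of distinct defects found so far, so Nr is what the estimate says is still hidden.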
Advanced Capture-Recapture Models
• Four basic models used for inspections
• Degree of freedom
• Prerequisites for all models
• All reviewers work independently of each other
• It is not allowed to inject or remove faults during inspection
Advanced Capture-Recapture Models
Probability of defect being found is equal across ...
Model   Defect   Reviewer   Estimator
M0      Yes      Yes        Maximum-likelihood
Mt      Yes      No         Maximum-likelihood, Chao’s estimator
Mh      No       Yes        Jackknife, Chao’s estimator
Mth     No       No         Chao’s estimator
Mt Model
Maximum-likelihood:
• Mt = total marked animals (=faults) at the start of the t'th sampling interval
• Ct = total number of animals (=faults) sampled during interval t
• Rt = number of recaptures in the sample Ct
• An approximation of the maximum likelihood estimate of population size (N) is: SUM(Ct*Mt)/SUM(Rt)
First resampling:
M1=50 (first inspector)
C1=40 (second inspector)
R1=20
N=40*50/20=100
Second resampling:
M2=70 (first and second inspector)
C2=40 (third inspector)
R2=30
N=(40*50+40*70)/(20+30)=4800/50=96
Third resampling:
M3=80
C3=30 (fourth inspector)
R3=30
N=(2000+2800+30*80)/(20+30+30)=7200/80=90
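The three resampling steps can be reproduced by accumulating SUM(Ct*Mt) and SUM(Rt) sample by sample. A minimal sketch, with a hypothetical function name and the data taken from the slide:

```python
def mt_estimates(samples):
    """Running population estimates SUM(Ct*Mt)/SUM(Rt).

    samples: list of (M_t, C_t, R_t) tuples, one per resampling.
    """
    estimates = []
    numerator = 0
    denominator = 0
    for marked, sampled, recaptured in samples:
        numerator += sampled * marked
        denominator += recaptured
        estimates.append(numerator / denominator)
    return estimates


# (M_t, C_t, R_t) for the first, second and third resampling
samples = [(50, 40, 20), (70, 40, 30), (80, 30, 30)]
print(mt_estimates(samples))
```

Each additional inspector refines the estimate, which drops from 100 to 96 to 90 in this example.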
Reliability Growth Models
• To predict the probability of future failure occurrence based on past (observed) failure occurrence
• Can be used to estimate
• the number of residual (remaining) faults,
• the time until the next failure occurs, or
• the remaining test time until a reliability objective is achieved
• Application typically during system test
Reliability Growth Models (RGMs)
Purpose:
Stop testing when
a certain percentage (90%, 95%, 99%, 99.9%, …) of estimated total number of failures has been reached
a certain failure rate has been reached
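[Figure: Cumulative #Failures (m) plotted against Test Intensity (t) (CPU time, test effort, test days, calendar days, …); the curve approaches the estimated n0 (100%), with the 95% level marked]
$$m(t) = \int_0^t \lambda_0 \, e^{-\frac{\lambda_0}{n_0} t} \, dt$$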
Model Selection
Many different RGMs have been proposed (>100)
To choose a reliability model, perform the following steps:
1. Collect failure data
2. Examine data (failure data vs. test time/effort)
3. Select a set of candidate models
4. Estimate model parameters for each candidate model
Least squares method
Maximum likelihood method
5. Customize model using the estimated parameters
6. Compare models with goodness-of-fit test and select the best
7. Make reliability predictions with selected model(s)
Structure of Lecture 10
• Fundamental Definitions and Concepts
• Creating Test Cases
• Manually
• Automatically
• Assessing the Quality of Test Suites
• Test Coverage
• Mutation Testing
• Static Analysis
• Quality Prediction
More topics: - Test Management - Test Measurement - Test Tools - ...
Next Lecture
• Date/Time:
• Friday, 21-Nov, 14:15-16:00
• Topic:
• Agile / Lean Methods
• For you to do:
• Evaluation of Lab Task 5 – Go to Labs!
• Continue working on Lab Task 6