CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al


Page 1: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431:Testing

Some material from Bruegge, Dutoit, Meyer et al

Page 2: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 3: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Testing Truism

• Untested systems will not work

• Why?

• Requirements not correct

• Misunderstood requirements

• Coding errors

• Miscommunication

Page 4: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Edsger W. Dijkstra, in 1970

Program testing can be used to show the presence of bugs, but never to show their absence!

• It is impractical or impossible to exhaustively test all possible executions of a program

• It is important to choose tests wisely

Page 5: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Increasing System Reliability

• Fault avoidance
  • Detect faults statically, without relying on executing the system
  • Includes development methodologies, configuration management, verification
• Fault detection
  • Debugging, testing
  • Controlled (and uncontrolled) experiments during the development process to identify erroneous states and their underlying faults before system release
• Fault tolerance
  • Assume that the system can be released with faults and that failures can be dealt with
  • E.g., redundant subsystems, majority wins
  • For a somewhat extreme approach, see Martin Rinard: Acceptability-Oriented Computing, Failure-Oblivious Computing

Page 6: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Fault Avoidance and Detection

• Static Analysis
  • Hand execution: reading the source code
  • Walk-through (informal presentation to others)
  • Code inspection (formal presentation to others)
  • Automated tools checking for
    • Syntactic and semantic errors
    • Departures from coding standards
• Dynamic Analysis
  • Black-box testing (test the input/output behavior)
  • White-box testing (test the internal logic of the subsystem or class)
  • Data-structure based testing (data types determine test cases)

Page 7: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Terminology

test component: part of the system isolated for testing

test case: a set of inputs and expected results that exercises a test component (with the purpose of causing failures or detecting faults)

test stub: a partial implementation of a component on which a test component depends

test driver: a partial implementation of a component that depends on a test component

fault: a design or coding mistake that may cause abnormal behavior

erroneous state: a manifestation of a fault during execution; caused by one or more faults and can lead to a failure

failure: a deviation between the observed and the specified behavior

• When the exact meaning is not important, faults, erroneous states, and failures are commonly called errors, defects, or bugs

Page 8: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 9: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

What is This?

• A failure?

• An error?

• A fault?

• We need to describe specified behavior first!

• Specification: “A track shall support a moving train”

Page 10: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Erroneous State (“Error”)

Page 11: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Fault

• Possible algorithmic fault: Compass shows wrong reading

• Or: Wrong usage of compass

• Or: Communication problems between teams

Page 12: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Mechanical Fault

Page 13: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Modular Redundancy

Page 14: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Declaring the Bug as a Feature

Page 15: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Patching

Page 16: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Testing

Page 17: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 18: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Typical Test Categorization

• Unit testing

• Integration testing

• System testing

• Reliability testing

• Stress testing

Page 19: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Unit Testing

• Test each module individually
• Choose data based on knowledge of the source code
  • "White-box" testing
  • Desirable to try to cover all branches of a program
• Heuristics: choose input data (a small sketch follows below)
  1. Well within the acceptable input range
  2. Well outside the acceptable input range
  3. At or near the boundary
• Usually performed by the programmer implementing the module
• Purchased components should be unit tested too
• Goal: the component or subsystem is correctly implemented and carries out the intended functionality
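As a rough sketch of how these heuristics can be written down as unit tests, consider a Java analogue of the ACCOUNT example used later in these slides (JUnit 4 syntax; the Account class here is an illustrative stand-in, not code from the slides):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class AccountWithdrawTest {

    // Hypothetical class under test, inlined to keep the sketch self-contained.
    static class Account {
        int balance;
        Account(int balance) { this.balance = balance; }
        void withdraw(int sum) { if (balance >= sum) { balance -= sum; } }
    }

    @Test
    public void sumWellWithinRange() {        // heuristic 1: well within the acceptable range
        Account a = new Account(100);
        a.withdraw(40);
        assertEquals(60, a.balance);
    }

    @Test
    public void sumWellOutsideRange() {       // heuristic 2: well outside the acceptable range
        Account a = new Account(100);
        a.withdraw(1000);                     // more than the balance: nothing is withdrawn
        assertEquals(100, a.balance);
    }

    @Test
    public void sumAtTheBoundary() {          // heuristic 3: at the boundary
        Account a = new Account(100);
        a.withdraw(100);                      // exactly the balance
        assertEquals(0, a.balance);
    }
}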

Page 20: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Integration Testing

• Testing collections of subsystems together
• Eventually testing the entire system
• Usually carried out by developers
• Goal: test interfaces between subsystems
• Integration testing can start early
  • Stubs for modules that have not yet been implemented
  • Agile development ethos

Page 21: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

System Testing

• The entire system is tested
• Software and hardware together
• Black-box methodology
• Robust testing
  • Science of selecting test cases to maximize coverage
• Carried out by developers, but likely a separate testing group
• Goal: determine if the system meets its requirements (functional and nonfunctional)

Page 22: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Reliability Testing

• Run with the same data repeatedly
• Finding timing problems
• Finding undesired consequences of changes
  • Regression testing
• Fully automated test suites to run regression tests repeatedly

Page 23: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Stress Testing

• How the system performs under stress
  • More than maximum anticipated loads
  • No load at all
  • Load fluctuating from very high to very low
• How the system performs under exceptional situations
  • Longer than anticipated run times
  • Loss of a device, such as a disk or sensor
  • Exceeding (physical) resource limits (memory, files)
  • Backup/restore

Page 24: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Acceptance Testing

• Evaluates the system delivered by developers

• Carried out by/with the client

• May involve executing typical transactions on site on a trial basis

• Goal: Enable the customer to decide whether to accept a product

Page 25: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Verification vs. Validation

• Validation: “Are you building the right thing?”

• Verification: “Are you building it right?”

• Acceptance testing about validation, other testing about verification

Page 26: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Another Test Categorization – By Intent

• Regression testing
  • Retest a previously tested element after changes
  • Goal is to assess whether changes have (re)introduced faults
• Mutation testing
  • Introduce faults to assess test quality

Page 27: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Categorization by Process Phase

• Unit testing: implementation
• Integration testing: subsystem integration
• System testing: system integration
• Acceptance testing: deployment
• Regression testing: maintenance

V-Model

Page 28: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 29: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Goal – Partition Testing

• Cannot test all possible input data
• Idea: for each test, partition the input data into equivalence classes such that
  • the test fails for all elements in the equivalence class, or
  • the test succeeds for all elements in the equivalence class
• If this succeeds: one input from each equivalence class suffices
• No way to know if the partition is correct (likely it is not)
• Heuristics: could partition data like this (a small sketch follows below)
  • Clearly good values
  • Clearly bad values
  • Values just inside the boundary
  • Values just outside the boundary
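A minimal sketch of such a partition (hypothetical routine, not from the slides): a routine that accepts percentage values from 0 to 100, tested with one representative value per equivalence class (JUnit 4):

import org.junit.Test;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

public class PercentagePartitionTest {

    // Hypothetical routine under test: accepts percentages 0..100, rejects everything else.
    static boolean isValidPercentage(int p) { return p >= 0 && p <= 100; }

    @Test
    public void oneRepresentativeValuePerEquivalenceClass() {
        assertTrue(isValidPercentage(50));      // clearly good value
        assertFalse(isValidPercentage(1000));   // clearly bad value
        assertTrue(isValidPercentage(0));       // just inside the lower boundary
        assertTrue(isValidPercentage(100));     // just inside the upper boundary
        assertFalse(isValidPercentage(-1));     // just outside the lower boundary
        assertFalse(isValidPercentage(101));    // just outside the upper boundary
    }
}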

Page 30: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Choosing Values From Equivalence Classes

• Each Choice (EC):
  • For every equivalence class c, at least one test case must use a value from c
• All Combinations (AC):
  • For every combination ec of equivalence classes, at least one test case must use a set of values from ec
  • Obviously more extensive, but may be unrealistic
  • Think, e.g., of testing a compiler (all combinations of all features)

Page 31: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Example Partitioning

• Date-related program
  • Month: 28, 29, 30, 31 days
  • Year:
    • Leap
    • Standard non-leap
    • Special non-leap (x100)
    • Special leap (x400)
  • Month-to-month transition
  • Year-to-year transition
  • Time zone/date line locations
• All combinations: some do not make sense

Page 32: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

About Partition Testing

• Applicable to all levels of testing
  • Unit, class, integration, system
• Black box
  • Based only on the input space, not the implementation
• A natural and attractive idea, applied by many (most) testers
• No rigorous basis for assessing effectiveness, as there is generally no way of being certain that the partition corresponds to reality

Page 33: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 34: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Test Automation

• Testing is time consuming
• Should be automated as much as possible
  • At a minimum, regression tests should be run repeatedly and automatically
• Many tools exist to help
  • E.g., automating test execution with "xUnit" tools
  • http://en.wikipedia.org/wiki/XUnit
• It is possible to automate more than just test execution

Page 35: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Test Automation

• Generation
  • Test inputs
  • Selection of test data
  • Test driver code
• Execution
  • Running the test code
  • Recovering from failures
• Evaluation
  • Oracle: classify pass/no pass
  • Other info about results
• Test quality estimation
  • Coverage measures
  • Other test quality measures
  • Feedback to test data generator
• Management
  • Save tests for regression testing
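A tiny sketch of what automating the generation, execution, and evaluation steps can look like (hypothetical example with random input generation and a trivial oracle; not from the slides):

import java.util.Random;

public class RandomizedAbsCheck {

    // Hypothetical function under test.
    static int myAbs(int x) { return x < 0 ? -x : x; }

    public static void main(String[] args) {
        Random random = new Random(42);                    // generation: random test inputs
        int failures = 0;
        for (int i = 0; i < 1000; i++) {
            int x = random.nextInt(2001) - 1000;           // inputs in [-1000, 1000]
            int result = myAbs(x);                         // execution: run the code under test
            boolean pass = result >= 0 && (result == x || result == -x);  // evaluation: a simple oracle
            if (!pass) {
                failures++;
                System.out.println("FAIL for input " + x + ": got " + result);
            }
        }
        System.out.println(failures + " failures out of 1000 generated test cases");
    }
}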

Page 36: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Automated Widely

• Generation
  • Test inputs
  • Selection of test data
  • Test driver code
• Execution
  • Running the test code
  • Recovering from failures
• Evaluation
  • Oracle: classify pass/no pass
  • Other info about results
• Test quality estimation
  • Coverage measures
  • Other test quality measures
  • Feedback to test data generator
• Management
  • Save tests for regression testing

Page 37: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Difficult to Automate

• Generation
  • Test inputs
  • Selection of test data
  • Test driver code
• Execution
  • Running the test code
  • Recovering from failures
• Evaluation
  • Oracle: classify pass/no pass
  • Other info about results
• Test quality estimation
  • Coverage measures
  • Other test quality measures
  • Feedback to test data generator
• Management
  • Save tests for regression testing

Page 38: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 39: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

What to Unit Test?

• Mission critical: test or die

• Complex: test or suffer

• Everything non-trivial: test or waste time

• Everything trivial: test == waste of time

Page 40: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Code Coverage Metrics

• Take a critical view

• E.g.: Java getters and setters usually trivial

• Not testing them results in a low code coverage metric (< 50%)

• But coverage metrics can point out poorly covered parts of the code

• Example: error handling

Page 41: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

xUnit

• cppunit — C++

• JUnit — Java

• NUnit — .NET

• SUnit — Smalltalk
  • This was the first unit testing library

• pyUnit — Python

• vbUnit — Visual Basic

• . . .
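All of these share the same basic pattern: a fixture set up before each test method, test methods that exercise the component, and assertions that act as the oracle. A minimal JUnit 4 sketch (illustrative names, not from the slides):

import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

public class StackTest {

    private java.util.Deque<Integer> stack;

    @Before
    public void setUp() {                    // fixture: runs before every test method
        stack = new java.util.ArrayDeque<>();
    }

    @Test
    public void newStackIsEmpty() {
        assertTrue(stack.isEmpty());
    }

    @Test
    public void pushThenPopReturnsSameElement() {
        stack.push(7);
        assertEquals(Integer.valueOf(7), stack.pop());
    }
}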

Page 42: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 43: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Mock Objects

• It is often (practically) impossible to include the real objects used in the full application in test cases
• To test code that depends on such objects, one often uses mock objects instead
• A mock object simulates some part of the behavior of another object or objects
• Useful in situations where the real objects
  • could provide non-deterministic data
  • have states that are hard to reproduce (e.g., the result of interactive use of the software, or erroneous cases)
  • have functionality that has not yet been implemented
  • are slow to produce results
  • require set-up/tear-down that takes a lot of work and/or time

Page 44: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Knowing What is Being Tested

• Assume a failed test involves two classes/data types

• Who to blame?

• One class’ defect can cause the other class to fail

• Essentially, this is not unit testing, but rather integration testing

• “Mocking” one class makes it clear which class to blame for failures

Page 45: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Mocks and Assigning Blame in Integration Testing

• Direct integration testing• Code + database• My code + your code

• Integration testing with mock objects• Code + mock database• My code + mock your code• Mock my code + your code

• Mocks help to make it clear what the system under test (SUT) is

Page 46: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Mocks and Assigning Blame in Integration Testing

• Direct integration testing: blame assignment unclear
  • Code + database
  • My code + your code

• Integration testing with mock objects: blame assignment clear
  • Code + mock database
  • My code + mock of your code
  • Mock of my code + your code

• Mocks help to make it clear what the system under test (SUT) is

• As with scientific experiments, only change the variable being measured, control others

Page 47: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Terminology

Dummy object: passed around, never used. Used for filling parameter lists, etc.

Fake object: a working implementation, but somehow simplified, e.g., uses an in-memory database instead of a real one

Stub: provides canned answers to calls made during a test, but cannot respond to anything outside what it is programmed for

Mock object: mimics some of the behavior of the real object, for example, dealing with sequences of calls

• The definitions overlap a bit and are ambiguous, but the terms are in use, and it is good to know their meaning, even if imprecise
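As a sketch of the difference between a stub and a mock, consider a hypothetical RateService dependency (hand-rolled, no mocking framework; all names here are illustrative, not from the slides):

// The dependency that is hard to use in tests (e.g., a slow remote service).
interface RateService {
    double rateFor(String currency);
}

// The code under test.
class PriceCalculator {
    private final RateService rates;
    PriceCalculator(RateService rates) { this.rates = rates; }
    double priceIn(String currency, double basePrice) {
        return basePrice * rates.rateFor(currency);
    }
}

// Stub: provides a canned answer, nothing more.
class FixedRateStub implements RateService {
    public double rateFor(String currency) { return 2.0; }
}

// Mock: also records how it was called, so the test can verify the interaction.
class RecordingRateMock implements RateService {
    String lastCurrency;
    int calls;
    public double rateFor(String currency) {
        lastCurrency = currency;
        calls++;
        return 2.0;
    }
}

public class PriceCalculatorTest {
    @org.junit.Test
    public void usesTheRateReturnedByTheService() {
        RecordingRateMock mock = new RecordingRateMock();
        PriceCalculator calculator = new PriceCalculator(mock);
        org.junit.Assert.assertEquals(20.0, calculator.priceIn("EUR", 10.0), 1e-9);
        org.junit.Assert.assertEquals(1, mock.calls);           // verify the interaction, not just the result
        org.junit.Assert.assertEquals("EUR", mock.lastCurrency);
    }
}

A test against FixedRateStub can only check the computed price; the mock additionally lets the test verify that rateFor was called exactly once with the expected currency.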

Page 48: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 49: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

White Box Testing

• White box: you know what is inside, i.e., the code
• Idea
  • To assess the effectiveness of a test suite, measure how much of the program it exercises
• Concretely
  • Choose a kind of program element, e.g., instructions (instruction coverage) or paths (path coverage)
  • Count how many are executed at least once
  • Report as a percentage
• A test suite that achieves 100% coverage achieves the chosen criterion. Example:
  • "This test suite achieves instruction coverage for routine r"
  • Means that for every instruction i in r, at least one test executes i

Page 50: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Coverage Criteria

• Instruction (or statement) coverage
  • Measure instructions executed
  • Disadvantage: insensitive to some control structures
• Branch coverage
  • Measure conditionals whose paths are both/all executed
• Condition coverage
  • How many atomic Boolean expressions evaluate to both true and false
• Path coverage
  • How many of the possible paths are taken
  • path == sequence of branches from routine entry to exit
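A small illustration of why instruction coverage is insensitive to some control structures (hypothetical Java fragment, not from the slides):

public class CoverageExample {
    // Hypothetical routine: an if statement without an else branch.
    static int clampToZero(int x) {
        int result = x;
        if (x < 0) {
            result = 0;
        }
        return result;
    }
    // A single test calling clampToZero(-5) executes every instruction
    // (100% instruction coverage), but it never exercises the implicit
    // "false" path of the if, so branch coverage is only 50%.
    // A second test, e.g. clampToZero(3), is needed to cover the other branch.
}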

Page 51: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Using Coverage Measures to Improve Test Suite

• Coverage-guided test suite improvement
  1. Perform coverage analysis for a given criterion
  2. If coverage < 100%, find unexercised code sections
  3. Create additional test cases to cover them
• The process can be aided by a coverage analysis tool
  1. Instrument the source code by inserting trace instructions
  2. Run the instrumented code, yielding a trace file
  3. From the trace file, the analyzer produces a coverage report
• Many tools available for many languages
  • E.g. PureCoverage, SoftwareVerify, BullseyeCoverage, xCover, govc, cppunit

Page 52: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Example: Source Code

class ACCOUNT feature

    balance: INTEGER

    withdraw (sum: INTEGER)
        do
            if balance >= sum then
                balance := balance - sum
                if balance = 0 then
                    io.put_string ("Account empty%N")
                end
            else
                io.put_string ("Less than ")
                io.put_integer (sum)
                io.put_string (" $ in account%N")
            end
        end

end

Page 53: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Instruction Coverage

class ACCOUNT feature

    balance: INTEGER

    withdraw (sum: INTEGER)
        do
            if balance >= sum then
                balance := balance - sum
                if balance = 0 then
                    io.put_string ("Account empty%N")
                end
            else
                io.put_string ("Less than ")
                io.put_integer (sum)
                io.put_string (" $ in account%N")
            end
        end

end

-- TC1:
create a
a.set_balance (100)
a.withdraw (1000)

-- TC2:
create a
a.set_balance (100)
a.withdraw (100)

Page 54: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Condition and Path Coverage

class ACCOUNT feature

    balance: INTEGER

    withdraw (sum: INTEGER)
        do
            if balance >= sum then
                balance := balance - sum
                if balance = 0 then
                    io.put_string ("Account empty%N")
                end
            else
                io.put_string ("Less than ")
                io.put_integer (sum)
                io.put_string (" $ in account%N")
            end
        end

end

-- TC1:
create a
a.set_balance (100)
a.withdraw (1000)

-- TC2:
create a
a.set_balance (100)
a.withdraw (100)

-- TC3:
create a
a.set_balance (100)
a.withdraw (99)

Page 55: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 56: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Dataflow Oriented Testing

• Focuses on how variables are defined, modified, and accessed throughout the run of the program
• Goal
  • Execute certain paths between a definition of a variable in the code and certain uses of that variable

Page 57: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Motivation

• Instruction coverage: all nodes in the control flow graph
• Branch coverage: all edges in the control flow graph
• Neither tests interactions
• Could use path coverage, but it often leads to an impractical number of test cases
  • Only a few paths uncover additional faults
• Need to distinguish "important" paths
• Intuition: statements interact through dataflow
  • A value computed in one statement is used in another
  • A faulty value in a variable is revealed only when it is used; other paths can be ignored

Page 58: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Access Related Defect Candidates

• Dataflow information also helps to find suspicious access-related issues
  • Variable defined but never used/referenced
  • Variable used but never defined/initialized
  • Variable defined twice before use
• Many of these are detected by static analysis tools
  • E.g. PCLint, Coverity, Klocwork, CodeSonar, Cppcheck

Page 59: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Kinds of Accesses to a Variable

• Definition (def): changing the value of a variable
  • Initialization, assignment
• Use: reading the value of a variable (without changing it)
  • Computational use (c-use): use the variable in a computation
  • Predicative use (p-use): use the variable in a predicate
• Kill: any operation that causes the variable to be deallocated or undefined, so it is no longer usable
• Examples:
  a = b * c        -- c-use of b; c-use of c; def of a
  if (x > 0) ...   -- p-use of x

Page 60: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Characterizing Paths in a Dataflow Graph

• For a path or sub-path p and a variable v:
• def-clear for v
  • No definition of v occurs in p
• DU-path for v
  • p starts with a definition of v
  • Except for this first node, p is def-clear for v
  • v encounters either a c-use in the last node or a p-use along the last edge of p
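A small sketch of these definitions in ordinary code (hypothetical Java fragment, not from the slides):

public class DataflowExample {
    static int example(int a, int b) {
        int x = a + b;        // def of x (c-uses of a and b)
        if (x > 0) {          // p-use of x
            return x * 2;     // c-use of x
        }
        return 0;
    }
    // The sub-path from the definition of x to "return x * 2" is a DU-path for x:
    // it begins at the def of x, is def-clear for x after that first node,
    // and ends in a node containing a c-use of x.
}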

Page 61: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Example: Control Flow Graph for withdraw

class ACCOUNT feature

    balance: INTEGER

    withdraw (sum: INTEGER)
        do
            if balance >= sum then
                balance := balance - sum
                if balance = 0 then
                    io.put_string ("Account empty%N")
                end
            else
                io.put_string ("Less than ")
                io.put_integer (sum)
                io.put_string (" $ in account%N")
            end
        end

end

Page 62: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Data Flow Graph for sum in withdraw

Page 63: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Data Flow Graph for balance in withdraw

Page 64: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Some Coverage Criteria

all-defs execute at least one def-clear sub-path between every definition of every variable and at least one reachable use of that variable

all-p-uses execute at least one def-clear sub-path from every definition of every variable to every reachable p-use of that variable

all-c-uses execute at least one def-clear sub-path from every definition of every variable to every reachable c-use of the respective variable

Page 65: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Some Coverage Criteria

all-c-uses/some-p-uses apply all-c-uses; then if any definition of a variable is not covered, use p-use

all-p-uses/some-c-uses symmetrical to all-c-uses/ some-p-uses

all-uses execute at least one def-clear sub-path from every definition of every variable to every reachable use of that variable

Page 66: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Dataflow Coverage Criteria for sum

• all-defs - execute at least one def-clear sub-path between every definition of every variable and at least one reachable use of that variable

(0,1)

• all-p-uses - execute at least one def-clear sub-path from every definition of every variable to every reachable p-use of that variable

(0,1)

• all-c-uses - execute at least one def-clear sub-path from every definition of every variable to every reachable c-use of the respective variable

(0,1,2); (0,1,2,3,4); (0,1,5)

Page 67: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Dataflow Coverage Criteria for sum

• all-c-uses/some-p-uses - apply all-c-uses; then if any definition of a variable is not covered, use p-use

(0,1,2); (0,1,2,3,4); (0,1,5)

• all-p-uses/some-c-uses - symmetrical to all-c-uses/ some-p-uses

(0,1)

• all-uses - execute at least one def-clear sub-path from every definition of every variable to every reachable use of that variable

(0,1); (0,1,2); (0,1,2,3,4); (0,1,5)

Page 68: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Subsumption of Dataflow Coverage Criteria

Page 69: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 70: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Who Tests the Tester?

• All tests pass. Is the software really that good?
• How does one know?
  • Is the software perfect, or is the coverage too limited?
• Test the tests
  • Intentionally introduce defects
  • If the tests find the defects, the test suite is good
  • If the injected defects are not found, the test suite is insufficient
• Have to plan defect types and locations
  • Random?
  • Weighted based on code criticality?
  • Amount of coverage?

Page 71: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Fault Injection Terminology

• Faulty versions of the program = mutants
  • Only versions not equivalent to the original program are considered to be mutants (even though the original program was faulty)
• A mutant is
  • Killed if at least one test case detects the injected fault
    • What if the injected fault is missed, but causes a test to detect a previously undetected fault in the original code?
  • Alive otherwise
• A mutation score (MS) is associated with the test set to measure its effectiveness

Page 72: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Mutation Operators

• Mutation operator
  • A rule that specifies a syntactic variation of the program text such that the modified program still compiles
  • Maybe also require that it pass static analysis
• A mutant is the result of one application of a mutation operator
• The quality of the mutation operators determines the quality of the mutation testing process

Page 73: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Mutant Examples

• Original program:

  if (a < b)
      b = b - a;
  else
      b = 0;

• Mutants (each replaces one element of the original):

  Condition:    if (a <= b)    if (a > b)    if (c < b)
  Then branch:  b = b + a;     b = x - a;
  Else branch:  b = 1;         a = 0;

Page 74: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Example Mutant Operators

• Replace an arithmetic operator by another
• Replace a relational operator by another
• Replace a logical operator by another
• Replace a variable by another
• Replace a variable (in a use position) by a constant
• Replace a number by its absolute value
• Replace a constant by another
• Replace while(...) {...} by do {...} while(...)
• Replace the condition of a test by its negation
• Replace a call to a routine by a call to another routine
• Language-specific operators: OO, AOP
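For example, applying the relational-operator rule to a hypothetical routine yields a mutant that only a boundary test kills (sketch in Java/JUnit 4; not from the slides):

import org.junit.Test;
import static org.junit.Assert.assertTrue;

public class WithdrawMutationSketch {

    // Original (hypothetical) routine.
    static boolean canWithdraw(int balance, int sum) { return balance >= sum; }

    // Mutant: relational operator ">=" replaced by ">".
    static boolean canWithdrawMutant(int balance, int sum) { return balance > sum; }

    // This boundary test kills the mutant: the original returns true,
    // while the mutant would return false for the same input.
    @Test
    public void boundaryCaseKillsTheRelationalMutant() {
        assertTrue(canWithdraw(100, 100));
    }

    // A suite that only ever uses inputs like (100, 40) leaves the mutant alive,
    // which signals that the boundary case is missing from the suite.
}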

Page 75: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Test Quality Measures

• S: a system composed of n components, denoted Ci
• di: the number of killed mutants after applying Ci's unit test sequence Ti to Ci
• mi: the total number of mutants of component Ci
• Mutation score MS for Ci and its unit test sequence Ti:
  MS(Ci, Ti) = di / mi
  (e.g., if Ti kills 8 of Ci's 10 mutants, MS = 0.8)
• Test quality TQ is defined as MS
• System Test Quality:
  STQ = (d1 + ... + dn) / (m1 + ... + mn)
• STQ is a measure of the quality of the entire test suite for the entire system

Page 76: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 77: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Testing Strategy

• Defining the process
  • Test plan
  • Input and output documents
• Who is testing?
  • Developers / special testing teams / customer
• What test levels are needed?
  • Unit, integration, system, acceptance, regression
• Order of tests?
  • Top-down, bottom-up, combination (pertains to integration testing)
• Running the tests
  • Manually
  • Use of tools
  • Automatically

Page 78: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Who Tests?

• Any significant project should have a separate QA team
  • Avoids self-delusion
  • QA requires a skill set of its own
• Unit tests: developers
• Integration tests: developers or QA team
• System test: QA team
• Acceptance test: customer + QA team

Page 79: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Testing Personnel

• Developing tests requires
  • A detailed understanding of the system
  • Application and solution domain knowledge
  • Knowledge of the testing techniques
  • Skill to apply these techniques
• Testing is done best by independent testers
  • We often develop a certain mental attitude that the program should behave in a certain way, when in fact it does not
  • Developers often stick to the data set with which the program works
  • It is often the case that every new user uncovers a new class of defects

Page 80: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Test Report Classification, Severity

• Defined in advance
• Applied to every reported failure
• Each failure is analyzed to determine whether it reflects a fault and, if so, how damaging it is
• Example classification:
  • not a fault
  • cosmetic
  • minor
  • serious
  • blocking

Page 81: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Test Report Classification, Status

• As a defect report moves through the bug database, its status changes to reflect the state of the defect and its repair

• For example:
  • registered
  • open
  • re-opened
  • corrected
  • integrated
  • delivered
  • closed
  • irreproducible
  • canceled

Page 82: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Example Responsibility Definitions

• Who runs each kind of test?

• Who is responsible for assigning severity and status?

• What is the procedure for disputing such an assignment?

• What are the consequences on the project of a defect at each severity level?

• E.g. “the product shall be accepted when two successive rounds of testing, at least one week apart, have evidenced fewer than m serious faults and no blocking faults”

Page 83: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

IEEE is Your Friend

• IEEE 829-2008
  • Standard for Software and System Test Documentation
  • http://en.wikipedia.org/wiki/IEEE_829

Page 84: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Many Issue Tracking Systems

• Pick one and use it

• Example: Traq

• See Kode Vicious column comments on the shortcomings of bug tracking systems

Page 85: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Traq System (1)

Page 86: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Traq System (2)

Page 87: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

When to Stop Testing?

• Complete testing is infeasible
  • One would need a correctness proof (usually not feasible)
• Nevertheless, at some point software testing has to be stopped and the product shipped
• The stopping time can be decided by the trade-off between time and budget
• Or testing can be stopped when an agreed-upon reliability estimate meets a set requirement
  • Set based on experience from earlier projects/releases
  • Requires accurate record keeping of defects

Page 88: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Full-Employment Theorem for Software Testing Researchers

• The Pesticide Paradox: "Every method you use to prevent or find bugs leaves a residue of subtler bugs against which those methods are ineffectual."
• Unlike real bugs, software bugs immune to current testing methodologies do not evolve and multiply by themselves
• Instead, "killing" the "easy bugs" allows the software to grow with more features and more complexity, creating new possibilities for these subtle defects to manifest themselves as failures
• Therefore, the need for new software testing technologies is guaranteed ad infinitum :)

Boris Beizer, Software Testing Techniques, 2nd edition, 1990

Page 89: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 90: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Defects Can Be Found By Inspecting Code

• Formal inspections

• Walkthroughs

• Code reading

• Pair programming

• All of these techniques have a collaborative dimension

• More than one pair of eyes staring at the code

Page 91: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Collaborative Construction

• Working on code development in close cooperation with others
• Motivation
  • Developers do not notice their own errors very easily
  • Others will not have the same blind spots
  • Thus, errors are caught more easily by other people
• Takes place during the construction process
• E.g., pair programming
  • http://en.wikipedia.org/wiki/Pair_programming

Page 92: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Benefits

• Can be more effective at finding errors than testing alone
  • 35% of errors found through low-volume beta testing
  • 55-60% of errors found by design/code inspection
• Finds errors earlier in the process
  • Reduces the time and cost of fixing them
• Provides a mentoring opportunity
  • Junior programmers learn from more senior programmers
• Creates collaborative ownership
  • No single "owner" of code
  • People can leave the team more easily, since others have seen the code
  • Wider pool of people to draw from when fixing later errors in the code

Page 93: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Code Reviews

• A method shown to be effective in finding errors
• Ratio of time spent in review vs. later testing and error correction ranges from 1:20 to 1:100
• Reduced defect correction from 40% of budget to 20%
• Maintenance costs of inspected code are 10% of those of non-inspected code
• Changes done with review: 95% correct vs. 20% without review
• Reviews cut errors by 20% to 80%
• Several others (examples from Code Complete)

Page 94: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Reviews vs. Testing

• Finds different types of problems than testing
  • Unclear error messages
  • Bad commenting
  • Hard-coded variable names
  • Repeated code patterns
• Only high-volume beta testing (and prototyping) find more errors than formal inspections
• Inspections typically take 10-15% of the budget, but usually reduce overall project cost
• Reviews can provide input to the test plan

Page 95: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Example Method: Formal Inspection, Some Rules

• Focus on detection, not correction

• Reviewers prepare ahead of time and arrive with a list of what they have discovered

• Code reviews aren’t to sit and read code

• No meeting unless everyone is prepared

• Distinct roles assigned to participants

• No deviation from roles during review

• Data is collected and fed into future reviews

• Checklists focus reviewers’ attention to common past problems

Page 96: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Roles

• Moderator
  • Keeps the review moving, handles details
  • Technically competent
• Author
  • Minor role; the design/code should speak for itself
• Reviewer(s)
  • Focus on the code, not the author
  • Find errors prior to and during the meeting
• Scribe
  • Record keeper; not the moderator, not the author
• Management
  • Not involved
• 3 people minimum, ~6 people maximum

Page 97: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Stages

• Overview
  • Authors might give an overview to reviewers to familiarize them
  • Separate from the review meeting
• Preparation
  • Reviewers work alone to scrutinize the code for errors
  • Review rate of 100-500 lines per hour for typical code
  • Reviewers can be assigned perspectives
    • API client view, user's view, etc.
• Inspection meeting
  • Reviewer paraphrases the code
    • Explains design and logic
  • Moderator keeps focus
  • Scribe records defects when found
  • No discussion of solutions
  • No more than 1 meeting per day, max 2 hours
• Possibly a "third hour meeting"
  • Outline solutions

Page 98: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 99: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Different Methods Catch Different Defects

Page 100: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Observations

• No technique >75%

• Average ~40%

• Human inspection seems more effective

• Code reading detected 80% more faults per (human) hour than testing

• However, testing catches different kinds of bugs, so still worthwhile

• V. Basili, R. Selby, "Comparing the Effectiveness of Software Testing Strategies", IEEE Trans. Software Engineering, vol. SE-13, no. 12, Dec. 1987, pp. 1278-1296

Page 101: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

How Many Defects Are There?

• Industry average: 1-25 errors per 1000 LOC for delivered software
• Microsoft's application division: ~10-20 defects / 1 kLOC during in-house testing, 0.5 / 1 kLOC in a released product (Moore 1992)
  • Code inspections, testing, collaborative development
• Space shuttle software
  • 0 defects / 500 kLOC
  • Formal systems, peer reviews, statistical testing
• Humphrey's Team Software Process
  • 0.06 defects / 1 kLOC (Weber 2003)
  • Process focuses on not creating bugs to begin with
  • http://en.wikipedia.org/wiki/Team_software_process

Page 102: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

Outline• Introduction• How to deal with faults, erroneous states, and failures• Different kinds of testing• Testing strategies• Test automation• Unit testing• Mocks etc.• Estimating quality: Coverage• Using dataflow in testing• Mutation testing• Test management• Collaborative construction• Effectiveness of different quality assurance techniques• References

Page 103: CSCE 431: Testing Some material from Bruegge, Dutoit, Meyer et al

CSCE 431 Testing

References

• Paul Ammann and Jeff Offutt, Introduction to Software Testing, Cambridge University Press, 2008

• Boris Beizer, Software Testing Techniques, 2nd edition, Van Nostrand Reinhold, 1990

• Steve McConnell, Code Complete, 2nd edition, Microsoft Press, 2004, Chapters 20-23