Posted on 15-Jan-2016
DAIMI (c) Henrik Bærbak Christensen 1
Integration Testing
Burnstein

Systems are hierarchies of levels, so testing is as well...
– Unit test: individual units
– Integration test: groups / clusters
– System test: whole system – technical
– Acceptance test: whole system – requirements
Intuition

The intuition is that we can
– test individual “atoms” of behaviour on their own (“sandboxing”) and strengthen our belief in their reliability
– test the interaction between “atoms” when combined and strengthen our belief that they interact reliably
– test that when all parts are combined into a complete system, it works reliably
  • from a technical point of view
  • from a user/requirement point of view
Interpretation

But what are the “atoms”?
– atom = indivisible part

In procedural languages:
– the procedure

In OO languages:
– method? class?

And what does “integrate” mean?
Burnstein

Burnstein defines a unit along these lines:

Unit = smallest possible testable software component

– so, what is a component?
Binder

Binder focuses on a hierarchical definition:
– a system is composed of components
– a component is a system of smaller components

(UML diagram: a Component aggregates * smaller Components)
Discussion

As testing is execution-based, software behaviour is what we ultimately study
– we execute some software and study “what it does”

Thus, I find that what we intuitively call a unit (= “atom”) must be defined in terms of the resulting behaviour:

Unit = smallest meaningful behaviour exhibited by executing software. Often associated with implementing a responsibility...
Integration testing

Integration testing deals with finding defects in the way individual parts work together.

As we use composition at almost every level in software engineering, one may argue (as does Binder) that integration testing occurs at all levels.
Class testing

Burnstein notes that if a method is considered the smallest unit of testing, then “ideally” you would have to test it in isolation from all other methods in the class
– otherwise it is integration testing

Thus you have to put it into an artificial test harness, i.e. code many of the remaining methods in the class as “stub” implementations.

Often more efficient: consider the class as the unit...
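A minimal sketch of the point above, with hypothetical class and method names: a method can be tested in isolation from its sibling methods by overriding them with stubs in a harness subclass.

```python
# Hypothetical example: testing Account.deposit in isolation from
# Account.audit_log by overriding the latter with a stub.

class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        self.audit_log(amount)   # collaborator method in the same class

    def audit_log(self, amount):
        raise NotImplementedError("pretend this talks to a remote logger")

class AccountWithStubbedLog(Account):
    """Test harness: audit_log is stubbed so deposit can be tested alone."""
    def audit_log(self, amount):
        self.logged = amount     # just record the call

account = AccountWithStubbedLog()
account.deposit(100)
assert account.balance == 100
assert account.logged == 100
```

The harness subclass is exactly the “artificial test harness” the slide mentions; taking the whole class as the unit avoids writing it.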
Relations to OO

A sound definition of what constitutes object-oriented programs is (I know there are others):

A program execution is the collective behaviour of a set of collaborating objects.

Thus, integration testing is at the heart of testing object-oriented programs.

A common source of defects is indeed complex interactions between collaborating objects, as these collaboration patterns are less visible in the code and therefore harder to get an overview of.
War Story

In a system there were hardware-near machines and operator-near machines; the latter received temperature data from the former.

We had agreed that the data was packed as an integer coded as the temperature * 10
– i.e. T = 23.4 Celsius was coded as the integer “234”

However, due to a misunderstanding, the hardware-near machine sent the data using another coding.

Thus
– both software units worked perfectly well in isolation
– when integrating we detected some really odd defects...
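The mismatch can be sketched as follows. The agreed coding is from the story; the hardware side's actual (wrong) coding is a made-up stand-in, since the slide does not say which coding it used.

```python
# Agreed protocol: pack T = 23.4 C as the integer 234 (temperature * 10).
def pack_agreed(temp_celsius):
    return int(round(temp_celsius * 10))    # 23.4 -> 234

def unpack(packed):
    return packed / 10.0                    # 234 -> 23.4

# Hypothetical stand-in for the hardware-near machine's actual coding
# (here: plain truncation to whole degrees).
def pack_actual(temp_celsius):
    return int(temp_celsius)                # 23.4 -> 23

# Each side behaves sensibly in isolation...
assert unpack(pack_agreed(23.4)) == 23.4
# ...but integrating them yields odd readings: 23.4 C arrives as 2.3 C.
assert unpack(pack_actual(23.4)) == 2.3
```

No unit test on either side can catch this; only exercising the two units together reveals the defect.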
Why integration testing?

Why test component interactions?

If we provide
– ‘sufficient’ component testing
– and ‘sufficient’ system testing
is it not OK then?
Why integration testing?

Burnstein refers to some axioms in §5.6:

Antidecomposition axiom:
– There exists a program P and component Q [of P] such that T is adequate for P, T’ is the set of vectors of values that variables can assume on entrance to Q for some t of T, and T’ is not adequate for Q
– System-scope coverage does not necessarily achieve component coverage (“you do not test the components by testing the system”)
– Why? Examples? Consequences?

Anticomposition axiom:
– There exist programs P and Q, and test set T, such that T is adequate for P, and the set of vectors of values that variables can assume on entrance to Q for inputs in T is adequate for Q, but T is not adequate for P;Q
– Adequate testing at component level is not equivalent to adequate testing of a system of components (“you do not test the system by testing the components”)
– Why? Examples?
Terminology

Test harness: auxiliary code developed to support testing of units. It consists of drivers that call the target code and stubs that represent units it calls.

(Diagram: 1. the driver calls the UUT; 2. the UUT calls the stub)
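The driver/UUT/stub roles from the diagram can be sketched like this (all class and method names are hypothetical):

```python
class SensorStub:
    """Stub: stands in for the hardware sensor the UUT would call."""
    def read_raw(self):
        return 234          # canned value: 23.4 degrees, coded * 10

class TemperatureReader:
    """The unit under test (UUT): depends on some sensor object."""
    def __init__(self, sensor):
        self.sensor = sensor

    def celsius(self):
        return self.sensor.read_raw() / 10.0

def driver():
    """Driver: auxiliary code that sets up and calls the UUT."""
    uut = TemperatureReader(SensorStub())
    return uut.celsius()

assert driver() == 23.4
```

Only `TemperatureReader` is production code; the driver and the stub are the harness and never ship.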
Exercise

Describe, from the mandatory exercise, examples of
– drivers
– stubs
Requirements
Why replace real units with drivers/stubs?
What is the premise that makes drivers/stubs interesting?
Hint: Consider code complexity...
Premise for integration testing

At any given level: a system consisting of components.

Testing interactions between components requires that these are stable
– stable = sufficiently dependable to allow integration testing
– the threshold reflects practical considerations
  • has passed unit testing
– in case of an unstable component – then what?
Lessons learned

A key lesson from the early 1970s is that incremental integration is most effective.

Integrating all components at the same time is problematic:
– debugging is difficult, as there is no idea where the defect is
– last-minute changes are necessary, but there is no time for adequate testing
– testing is often not systematic
Lessons learned

Advantages of incremental testing:
– interfaces are systematically exercised and shown stable before new, unproven interfaces are exercised
– observed failures are more likely to come from the most recently added components, making debugging more efficient
Coupling

Relating to the discussion of coupling: a measure of the strength of dependencies between two subsystems.

Which is preferable from a testing perspective?

(Diagram: high coupling vs. low coupling)
Discussion

Thus, what is good for design is also good for testing.

How fortunate!
Planning integration testing

Integration test plans must answer these questions:
– What interfaces will be the focus of integration testing?
– In what sequence will components/interfaces be exercised?
– Which stubs and drivers must be developed?
– Which test pattern/strategy should be used?
– When is integration testing considered adequate?
– Component stability?

All are of course very dependent upon particular project characteristics.
Component stability/volatility

An important aspect is of course to preserve our investment in testing. Spending staff hours testing a class that is later fundamentally changed is a waste of time. This observation is of course in opposition to our wish for early integration testing.

Postpone integration of volatile components!
Stubs

So, the doctrine says integrate and test often – but how do we do integration with components that are not stable, or not developed at all?

Stubs: partial or surrogate implementations.
Reasons for Stubs

Stubs are powerful tools for many reasons.

Make testing of certain conditions possible
– hardware simulator that outputs data that seldom occur in practice
  • example: wind directions over north from a wind sensor
– replace random behaviour with predictable behaviour
  • example: testing backgammon move validation

Make testing economical
– stub for a database that is costly to reset/set up
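The “replace random behaviour with predictable behaviour” point can be sketched as follows; the class names and the toy move-validation rule are made up, not the backgammon rules themselves.

```python
import random

class RealDie:
    """Production die: non-deterministic, so tests are not repeatable."""
    def roll(self):
        return random.randint(1, 6)

class DieStub:
    """Stub: replays a fixed sequence of rolls, so tests are repeatable."""
    def __init__(self, rolls):
        self.rolls = iter(rolls)
    def roll(self):
        return next(self.rolls)

def is_move_valid(distance, die):
    # toy rule for illustration: a move is valid if it matches the roll
    return distance == die.roll()

die = DieStub([3, 5])
assert is_move_valid(3, die) is True   # first roll is known to be 3
assert is_move_valid(5, die) is True   # second roll is known to be 5
```

With `RealDie` the same test would pass only one run in six; the stub turns a probabilistic check into a deterministic one.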
Reasons for Stubs

Often stubs are relevant to decouple interactions
– stub for a complex algorithm with a ‘simple’ answer
  • ensures that the defect does not lie in the algorithm implementation
– stub for a component in a cycle
  • allows one component in the cycle to be tested in isolation

There is no free lunch!
– stubs can be costly to produce
– ... and sometimes you chase defects in production code that are actually embedded in the stubs

Moral: keep stubs simple!
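Breaking a cycle with a stub can be sketched like this (the two mutually dependent classes are hypothetical):

```python
class Executor:
    """Calls back into Planner, so the two form a dependency cycle."""
    def __init__(self, planner):
        self.planner = planner
    def execute(self, step):
        if step > 0:
            return self.planner.plan(step - 1)
        return "done"

class Planner:
    def __init__(self, executor):
        self.executor = executor
    def plan(self, step):
        return self.executor.execute(step)

class ExecutorStub:
    """Stub: breaks the cycle so Planner can be tested in isolation."""
    def execute(self, step):
        return f"executed({step})"

planner = Planner(ExecutorStub())
assert planner.plan(3) == "executed(3)"
```

Note how simple the stub is kept: a single canned response, nothing that could itself hide a defect.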
Planning integration sequence

In which order should we integrate modules?
– A with B before (AB) with C?
– B with C before A with (BC)?

The decision is usually based upon a dependency analysis
– A (depends-on) B
Dependencies

A (depends-on) B is the super-relation of a vast number of relations between software abstractions:
– composition and aggregation
– delegation to objects / API calls
– inheritance
– global variables
– instance variable access
– objects as message parameters
– RMI / socket communication
Dependencies

Many dependencies are, however, implicit and much harder to spot:
– database access / persistent store access
– initialization sequences
– timing constraints
– file formatting / coding of untyped byte streams
Dependency analysis

Anyway: explicit dependencies often dictate the sequence of testing – i.e. in which order component interactions are tested.

Thus a very helpful tool is dependency analysis – similar to what build systems do to determine compilation order.
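A dependency analysis of this kind is essentially a topological sort of the depends-on graph, as sketched below (the graph itself is a hypothetical example):

```python
def integration_order(depends_on):
    """Return components so each appears after everything it depends on,
    giving a bottom-up integration sequence (like a build system's
    compilation order). No cycle handling: cycles need stubs to break."""
    order, visited = [], set()

    def visit(component):
        if component in visited:
            return
        visited.add(component)
        for dependency in depends_on.get(component, []):
            visit(dependency)
        order.append(component)

    for component in depends_on:
        visit(component)
    return order

# A depends on B and C; B depends on C; C is a leaf.
depends_on = {"A": ["B", "C"], "B": ["C"], "C": []}
order = integration_order(depends_on)
assert order.index("C") < order.index("B") < order.index("A")
```

Leaves come out first and roots last; reading the list backwards gives a top-down order instead.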
Example from Binder

Root level (level 0)
– a unit not used by any other unit in the cluster under test
– often there are several roots

Leaf level
– units that do not use any other units in the cluster under test

Cycles
– either tested as a ‘unit’
– or stubs introduced to break the cycle

Arrows are uses-relations.
Example from Binder
Terminology
Integration defects

Binder lists typical interface defects...

My war stories are covered by it.
Integration Strategies

Binder lists nine integration strategies, documented in pattern form, for doing integration testing.

We focus on the classic strategies:
– Big Bang (actually an anti-pattern)
– Bottom-up
– Top-down

and a costly but successful one:
– continuous integration
Big Bang

Intent
– demonstrate stability by attempting to exercise an entire system with a few test runs

Context
– bring all components together at once; all interfaces tested in one go
– usually ends in a ‘big bang’ – the system dies miserably...

Entry criteria
– all components have passed unit testing

Exit criteria
– test suite passes
Big Bang

Consequences
– If nothing happens – then what? Failure diagnosis is very difficult.
– Even if the exit criteria are met, many interface faults can still hide.
– On the plus side: if it works, no effort has been spent on writing test drivers and stubs.
– May be the course of action for
  • small systems with adequate unit testing
  • an existing system with only minor modifications
  • a system made from certified, high-quality reusable components
Bottom Up

Intent
– demonstrate system stability by adding components to the SUT in uses-dependency order, starting with the components having the fewest dependencies

Context
– stepwise verification of tightly coupled components
Bottom up

Strategy
– stage 1: leaves
– stage 2: ...
– last stage: ...
Bottom up

Entry criteria
– components pass unit tests

Exit criteria
– the interface of each subcomponent has been exercised at least once
– complete when all root-level components pass their test suites
Bottom up

Consequences
– most unit tests in OO are actually integration tests
  • BU testing implicitly takes place in a bottom-up development process

Disadvantages
– driver development cost is significant
  • but if the JUnit test cases are maintained, this cost is worthwhile!
– a fix in a lower-level component may require revisions up the chain
– interfaces are only indirectly exercised
– upper-level testing may require stubs at the bottom to test special conditions
– high-level testing comes very late in the cycle
Bottom up

Advantages
– parallel implementation and testing possible
– little need for stub writing
Top Down

Intent
– demonstrate stability by adding components to the SUT in control-hierarchy order, beginning with the top-level control objects

Strategy
– stage 1: test control objects
– stage 2: ...
– final stage: ...
Top Down

Entry criteria
– each component to be integrated has passed unit test

Exit criteria
– the interface of each component has been exercised at least once
– complete when leaf-level components pass the system-scope test suite
Top Down

Disadvantages
– a large number of stubs is necessary, which is costly
– stubs are brittle
– a fix in a lower-level component may require revisions up the chain
– difficult to get lower-level components sufficiently exercised (antidecomposition axiom)

Advantages
– low driver development costs
– early demonstration of user-related behaviour
Variations

There are many variations of these strategies
– sandwich testing
  • moving from both top and bottom
– collaboration testing
  • taking its outset in collaboration diagrams over use cases

Bottom line
– there is no substitute for being clever and utilizing the combination of techniques that is most cost-efficient for the project at hand
High-frequency integration

Binder also describes high-frequency integration, a strategy whose characteristics lie in the timing, not the ordering, of testing. It is an intrinsic part of the process pattern daily build.

Intent
– integrate new code with a stabilized baseline frequently, to prevent integration bugs from going undiscovered for long, and to prevent divergence from the baseline
High-frequency integration

Context
– a stable baseline must be present; increments are the focus of HF integration
– increment size must match the integration frequency – i.e. with daily builds, increments must be deliverable in a day
– test code is developed in parallel with the code
– testing must be automated
– software configuration management must be in place
High-frequency integration

Procedure
– revise code + test code on a private branch
– desk-check code and tests
– when all component testing passes, check in to the integration branch

Second
– the integration tester builds the system of increments
– testing uses
  • smoke tests, and as much additional testing as time permits

Any increment that breaks HFI is corrected immediately.
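The smoke-test gate in the procedure above can be sketched as a tiny runner (the test names and their bodies are placeholders, not real system checks):

```python
def run_suite(tests):
    """Run each (name, test) pair; return the names of those that failed."""
    failures = []
    for name, test in tests:
        try:
            test()
        except AssertionError:
            failures.append(name)
    return failures

def smoke_test_boot():
    assert True        # placeholder: system starts

def smoke_test_login():
    assert True        # placeholder: a basic scenario works

smoke_tests = [("boot", smoke_test_boot), ("login", smoke_test_login)]

# The gate: an increment that breaks high-frequency integration is
# corrected immediately, so the failure list must be empty to check in.
assert run_suite(smoke_tests) == []
```

Real daily-build setups delegate this to a test framework and a CI server, but the gate logic is the same: a non-empty failure list blocks the integration branch.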
High-frequency integration

Disadvantages
– automated tests must be in place
– high commitment to maintaining code as well as tests
– be aware of adequacy criteria – the suite that found the old bugs may not find the new ones

Advantages
– focus on maintaining tests is an effective bug-prevention strategy
– defects are found early; debugging is easier
– morale is high, as the system works early and keeps doing so