sagar sen caise2013final
1
Testing a Data-intensive System with Generated Data Interactions
The Norwegian Customs and Excise Case Study
Sagar Sen and Arnaud Gotlieb
Certus V&V Center, Simula Research Laboratory
2
Outline
Motivation
Problem Context
Faktum
Evaluation
3
The Heart of Norway’s E-governance: TVINN
• 30,000 declarations/day, potentially adhering to about 220,000 customs rules
• Customs rules typically accept/return declarations based on information in the declaration
• Norway was the first country in the world to use the UN’s EDIFACT brokerage standard
• 20% of Norway’s Economy (200 Billion NOK/year or 25 billion Euros/year)
• Towards a corruption-free society
4
Daily Challenges for TVINN
• Accurate Computation of Taxes
5
Daily Challenges for TVINN
• Accurate Computation of Taxes
• Preventing criminal activities such as mafia
Gross-weight > 2 x Net-weight?
6
Daily Challenges for TVINN
• Accurate Computation of Taxes
• Preventing criminal activities such as mafia
• Protecting people of Norway from imports of hazardous substances
7
Daily Challenges Translate to Testing
• Accurate Computation of Taxes
• Preventing criminal activities such as mafia
• Protecting people of Norway from imports of hazardous substances
Testing TVINN
Are customs rules complete? Can they correctly detect problems in declarations? Are there missing
rules?
8
Behind the Scenes: Testing at Toll
Atle, Katrine, Astrid, Odd
Large amounts of live data (up to 30,000 customs declarations/day)
Small team of test managers
Testing TVINN
9
Behind the Scenes: Testing at Toll
Is live data complete for testing all customs rules?
How long will it take to test with all live data?
This is a lot of data. Can I select the data relevant to detecting bugs?
10
Can we automatically synthesize small but effective test databases instead of using live data?
11
Outline
Motivation
Problem Context
Faktum
Evaluation
12
Database defined by a schema
specified by
Database
Database Schema (e.g., Norwegian Customs)
13
Modelling Test Database Configuration Space with a Feature Model
Database
Tables
Fields
Field Values
Invariants
CountryCode.CN requires Currency.CNY
CountryCode.CN requires CountryGroup.RCN
Database Configuration Space
14
Configuration to Test Database Population
Feature Model
Configuration of field values
Database
INSERT INTO Declarations (Category, Direction, CountryCode, CurrencyCode) VALUES ('FO', 'I', 'US', 'CNY');
populates
to SQL
INSERT INTO Items (OriginCountry) VALUES ('US');
...
15
Challenge
1. How to generate small test databases?
2. That satisfy test coverage criteria such as combinatorial interaction coverage?
16
Outline
Motivation
Problem Context
Faktum
Evaluation
17
Faktum
• A tool to synthesize test databases
• Input: a feature model of database variability, the database schema, and the interaction strength T (e.g., T=2 is pairwise)
• Output covers all T-wise combinatorial interactions between a set of field values
• Implemented in Java using the Alloy Analyzer API
18
Faktum
Step 1: Feature Model to Constraint Satisfaction Problem in Alloy
one sig ProductConfigurations {
    configurations : set Configuration
}
sig Configuration {
    f1: lone Category_FO,
    f2: lone Category_EN,
    ...
}
fact Invariant_Category_XOR {
    all c: Configuration | #c.f9 + #c.f10 + #c.f11 + #c.f12 + #c.f13 + #c.f14 = 1
}
Base Alloy Model
“Database field values are features in a config.”
“Invariants as facts”
“Set of configurations is set of tests”
“A configuration is a test case”
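To make the semantics of the XOR fact concrete: each field is declared "lone" (zero or one value), and the fact requires the cardinalities of a group of fields to sum to exactly 1. A minimal Python mirror of that check is sketched below. The set-based encoding and the function name exactly_one are assumptions, and only the two category values named on the slide are used, although the fact itself sums six fields.

```python
def exactly_one(config, group):
    """Mirror of the XOR fact: the cardinalities of the group's fields sum to 1."""
    return sum(1 for feature in group if feature in config) == 1

# Only the two category values named on the slide; the full group is elided there.
CATEGORY = ["Category_FO", "Category_EN"]

print(exactly_one({"Category_FO"}, CATEGORY))                 # True
print(exactly_one({"Category_FO", "Category_EN"}, CATEGORY))  # False
print(exactly_one(set(), CATEGORY))                           # False
```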
19
Step 2: Generating T-wise data interactions
FU     USD
N/P    N/P
N/P    P
P      N/P
P      P
T=2, Pairwise Interactions or Tuples
P = Present in database
N/P = Not Present in database
Faktum
Case study has 2582 pairwise tuples (interactions)
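The enumeration of T-wise tuples can be sketched as follows: choose every subset of T field values and assign each one a state, present or not present. The function name t_wise_tuples and the True/False encoding are assumptions for illustration. For 37 field values this naive enumeration yields 4 x C(37,2) = 2664 pairs, slightly more than the 2582 reported for the case study; the slides do not say which tuples are excluded.

```python
from itertools import combinations, product

def t_wise_tuples(features, t=2):
    """Enumerate every present (True) / not-present (False) assignment
    for every subset of t features."""
    tuples = []
    for subset in combinations(features, t):
        for states in product([False, True], repeat=t):
            tuples.append(dict(zip(subset, states)))
    return tuples

# The four pairwise tuples for the two field values shown on the slide.
pairs = t_wise_tuples(["FU", "USD"], t=2)
for p in pairs:
    print(p)

# Naive count for 37 field values (placeholder names v0..v36).
print(len(t_wise_tuples([f"v{i}" for i in range(37)])))  # 2664
```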
20
Faktum
Step 3: Tuples of Interactions to Alloy Predicates
FU     USD
N/P    N/P
N/P    P
P      N/P
P      P
transform to Tuple Predicates
pred tuple1 { all c: Configuration | #c.f1 = 0 and #c.f2 = 0 }
pred tuple2 { all c: Configuration | #c.f1 = 0 and #c.f2 = 1 }
pred tuple3 { all c: Configuration | #c.f1 = 1 and #c.f2 = 0 }
pred tuple4 { all c: Configuration | #c.f1 = 1 and #c.f2 = 1 }
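The transformation from a tuple to a predicate is mechanical. A minimal sketch, assuming a tuple is given as a mapping from field name to 1 (present) or 0 (not present); the function name tuple_to_predicate is hypothetical and the output only imitates the notation shown on the slide.

```python
def tuple_to_predicate(name, assignment):
    """Render one interaction tuple as Alloy predicate text.

    `assignment` maps a field name such as "f1" to 1 (present) or 0 (not present).
    """
    body = " and ".join(f"#c.{field} = {value}" for field, value in assignment.items())
    return f"pred {name} {{ all c: Configuration | {body} }}"

print(tuple_to_predicate("tuple2", {"f1": 0, "f2": 1}))
# pred tuple2 { all c: Configuration | #c.f1 = 0 and #c.f2 = 1 }
```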
21
Faktum
Step 4: Checking Tuple Validity
pred tuple1 { all c: Configuration | #c.f1 = 0 and #c.f2 = 0 }
pred tuple2 { all c: Configuration | #c.f1 = 0 and #c.f2 = 1 }
pred tuple3 { all c: Configuration | #c.f1 = 1 and #c.f2 = 0 }
pred tuple4 { all c: Configuration | #c.f1 = 1 and #c.f2 = 1 }
Tuple Predicate + Base Alloy Model -> solve -> Solution exists?
Yes: Tuple Valid! No: Tuple Invalid!
“A fully parallelizable process”
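The validity check asks whether any configuration satisfies both the invariants and the tuple. Faktum delegates this to the Alloy solver; the sketch below swaps in a brute-force search over all configurations, which is only feasible for a handful of features. The feature list and the XOR groups are assumptions, while the "requires" invariant comes from the feature model slide.

```python
from itertools import product

FEATURES = ["CountryCode.CN", "CountryCode.US", "Currency.CNY", "Currency.USD"]
REQUIRES = [("CountryCode.CN", "Currency.CNY")]       # from the invariants slide
XOR_GROUPS = [["CountryCode.CN", "CountryCode.US"],   # assumed: one country code
              ["Currency.CNY", "Currency.USD"]]       # assumed: one currency

def is_consistent(config):
    """Check the 'requires' invariants and the exactly-one-per-group constraints."""
    ok_requires = all(config[b] for a, b in REQUIRES if config[a])
    ok_xor = all(sum(config[f] for f in group) == 1 for group in XOR_GROUPS)
    return ok_requires and ok_xor

def tuple_is_valid(tuple_assignment):
    """True if some consistent configuration agrees with the tuple (brute force)."""
    for states in product([False, True], repeat=len(FEATURES)):
        config = dict(zip(FEATURES, states))
        agrees = all(config[f] == v for f, v in tuple_assignment.items())
        if agrees and is_consistent(config):
            return True
    return False

print(tuple_is_valid({"CountryCode.CN": True, "Currency.CNY": True}))  # True
print(tuple_is_valid({"CountryCode.CN": True, "Currency.USD": True}))  # False
```

Each call depends only on its own tuple, which is why the checks can run independently of one another.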
22
Faktum
Step 5: Divide and Combine Strategy
A large number of interaction tuples gives a large number of predicates.
Solving all predicates in one Alloy constraint model is not tractable.
Valid Tuple Predicates -> divide -> Tuple subsets
Each tuple subset + Base Alloy -> solve -> Configuration subset
Configuration subsets -> combine -> Configuration set
Perrouin, Sen, Baudry, Le Traon, Automated T-wise Test Generation for SPLs, ICST 2010
“One can explore different divide strategies”
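The divide-and-combine idea can be sketched in a few lines: split the valid tuples into subsets, find configurations covering each subset separately, then merge the results without duplicates. Everything here is an assumption for illustration: the three features, the invariant, the fixed-size chunking, and the greedy solve function, which stands in for a call to the Alloy solver.

```python
from itertools import combinations, product

FEATURES = ["A", "B", "C"]                     # hypothetical field values

def is_consistent(config):
    return not (config["A"] and config["B"])   # hypothetical invariant: A excludes B

ALL_CONFIGS = [dict(zip(FEATURES, s))
               for s in product([False, True], repeat=len(FEATURES))]
VALID_CONFIGS = [c for c in ALL_CONFIGS if is_consistent(c)]

def covers(config, tup):
    return all(config[f] == v for f, v in tup.items())

def solve(tuples):
    """Stand-in for one solver call: greedily pick configurations covering `tuples`."""
    remaining, chosen = list(tuples), []
    while remaining:
        best = max(VALID_CONFIGS, key=lambda c: sum(covers(c, t) for t in remaining))
        chosen.append(best)
        remaining = [t for t in remaining if not covers(best, t)]
    return chosen

def divide_and_combine(valid_tuples, chunk_size):
    chunks = [valid_tuples[i:i + chunk_size]
              for i in range(0, len(valid_tuples), chunk_size)]
    combined = []
    for chunk in chunks:                       # divide: one solve per tuple subset
        for config in solve(chunk):
            if config not in combined:         # combine: union without duplicates
                combined.append(config)
    return combined

tuples = [dict(zip(pair, s)) for pair in combinations(FEATURES, 2)
          for s in product([False, True], repeat=2)]
valid_tuples = [t for t in tuples if any(covers(c, t) for c in VALID_CONFIGS)]
configs = divide_and_combine(valid_tuples, chunk_size=4)
print(len(valid_tuples), "valid tuples covered by", len(configs), "configurations")
```

Smaller subsets keep each solve cheap but may produce more configurations overall, which is one reason different divide strategies are worth exploring.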
23
Faktum
Step 6: Configuration to SQL
Configuration of field values
Database
INSERT INTO Declarations (Category, Direction, CountryCode, CurrencyCode) VALUES ('FO', 'I', 'US', 'CNY');
populates
to SQL
Configuration Set
INSERT INTO Items...; INSERT INTO Taxes;..
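Turning a configuration into SQL amounts to rendering its field values as one INSERT per table. A minimal sketch, assuming a configuration is available as a column-to-value mapping; the function name configuration_to_insert is hypothetical and the quoting is deliberately simple.

```python
def configuration_to_insert(table, config):
    """Render one configuration (column -> value) as an INSERT statement.

    Values are quoted as strings, with single quotes doubled for escaping.
    """
    columns = ", ".join(config)
    values = ", ".join("'" + str(v).replace("'", "''") + "'" for v in config.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"

config = {"Category": "FO", "Direction": "I", "CountryCode": "US", "CurrencyCode": "CNY"}
print(configuration_to_insert("Declarations", config))
# INSERT INTO Declarations (Category, Direction, CountryCode, CurrencyCode) VALUES ('FO', 'I', 'US', 'CNY');
```

When the statements are executed directly rather than written to a script, parameterized queries would be the safer choice.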
24
Faktum
Step 7: Generating updates for other fields
Database
completes
UPDATE Declarations SET CustomerID = '2002542616', Date = '1965-3-29', Sequence = '1', Version = '1', Amount = '2982.490245', FeeAmount = '1343.471627', TransportCost = '79.0749637', ExchangeRate = '112.7416998';
Basic strategy: Unique values for keys and random generation for other fields
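The basic strategy can be sketched as a counter for key fields plus a seeded random generator for the rest. The column subset, the value ranges, and the function name complete_rows are assumptions; the slides do not describe how Faktum draws its random values.

```python
import random
from itertools import count

def complete_rows(n_rows, seed=0):
    """Generate values for the remaining fields: unique keys, random non-keys."""
    rng = random.Random(seed)          # seeded so the test data is reproducible
    next_id = count(1)
    rows = []
    for _ in range(n_rows):
        rows.append({
            "CustomerID": next(next_id),                     # key: unique by construction
            "Amount": round(rng.uniform(0, 10_000), 6),      # assumed range
            "FeeAmount": round(rng.uniform(0, 2_000), 6),    # assumed range
            "ExchangeRate": round(rng.uniform(0.1, 200), 6), # assumed range
        })
    return rows

rows = complete_rows(3)
for row in rows:
    print(row)
```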
25
Summary of Faktum’s Approach
26
Outline
Motivation
Problem Context
Faktum
Evaluation
27
Case Study: Norwegian Toll Customs
Test scenario: Imports from Brazil, China, India, and USA
28
Case Study: Norwegian Toll Customs
37 terminal field values with 2582 pairwise interactions
Faktum
935 Configurations
...
29
Scalability to Check Tuple Validity
Measured using Perf4J every 10 seconds
30
Scalability of Configuration Generation
1. 935 configurations required 6803 calls to the Alloy SAT solver
2. Divide-and-combine gives an average of 400 ms per call to solver
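A back-of-the-envelope estimate follows from these two figures, assuming the 400 ms average applies to all 6803 calls and that the calls run one after another. The slides do not state the total generation time, so the result below is an estimate, not a reported measurement.

```python
calls = 6803                   # solver calls reported on the slide
avg_seconds_per_call = 0.400   # average per call reported on the slide
configurations = 935

total_seconds = calls * avg_seconds_per_call
print(round(total_seconds, 1), "s")         # 2721.2 s
print(round(total_seconds / 60, 1), "min")  # 45.4 min
print(round(calls / configurations, 2), "calls per configuration")  # 7.28
```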
31
Conclusion and Future Work
1. We represent variability in test databases as a feature model.
2. Faktum, a tool to synthesize test databases covering T-wise combinatorial interactions between database field values.
3. We apply Faktum to generate compact test databases for the Norwegian Customs and Excise Dept.
4. Faktum is scalable due to a divide-and-combine strategy
Future Work
1. We are thinking about a better representation of database variability
2. A multi-processor parallelized implementation of Faktum for fast generation (not just scalable)
32
Thank you. Questions?