Towards Automated A/B Testing
TRANSCRIPT
Towards Automated A/B Testing
Alessandro [email protected]
Università della Svizzera Italiana, Lugano (CH)
Giordano [email protected]
Università della Svizzera Italiana, Lugano (CH)
User-intensive Applications
• Large and evolving populations of users
• Meeting user preferences is a crucial factor
• Almost impossible to design applications that accurately capture all possible and meaningful user preferences upfront.
A/B Testing: what?
• Initial (inaccurate) implementation based on the available knowledge
A/B Testing: two distinct variants of the same application are compared using live experiments.
• Run-time continuous monitoring, refinement and improvement to meet newly discovered user preferences
A/B Testing: what?
• Widely adopted in industry: “The A/B Test: Inside the Technology That’s Changing the Rules of Business”.
A/B Testing: pitfalls
1. Development and deployment of multiple variants.
2. What is a variant?
3. How many variants?
4. How to select variants?
5. How to evaluate variants?
6. When to stop?
• Still a difficult, tedious, and costly manual activity.
About this work
• Adopt SBSE methods to improve/automate the A/B Testing process
• Investigate the feasibility and draft a possible solution (GA+AOP)
• Not (yet) a polished & ready-to-use solution
A/B Testing: an optimization problem
• Features: From an abstract viewpoint a program p can be viewed as a finite set of features: F = {f1, …, fn}.
• Each feature f has an associated domain D that specifies which values are valid/allowed for f.
A/B Testing: an optimization problem
• Instantiation functions: an instantiation function associates a feature with a specific value from its domain.
• To obtain a concrete implementation of a program it is necessary to specify instantiations for all its features.
• Different instantiations yield different concrete implementations of the same abstract program.
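The definitions above can be made concrete with a small sketch. Feature names, domain values, and the class name below are invented for illustration and are not the paper's implementation: domains are represented as a map from feature name to a list of allowed values, and an instantiation picks one value per feature.

```java
import java.util.*;

// Hypothetical sketch: a program's features, each with a finite domain,
// and an instantiation function that assigns one domain value per feature.
public class Instantiation {
    static final Map<String, List<String>> DOMAINS = new LinkedHashMap<>();
    static {
        // Illustrative features (same examples as the slides use later)
        DOMAINS.put("checkOutButtonText", List.of("CheckOut", "Buy", "Buy Now!"));
        DOMAINS.put("fontSize", List.of("12", "14", "16", "18"));
    }

    // An instantiation: for every feature, choose one value from its domain.
    // Different choices yield different concrete variants of the same program.
    static Map<String, String> instantiate(Map<String, Integer> choice) {
        Map<String, String> variant = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : DOMAINS.entrySet())
            variant.put(e.getKey(), e.getValue().get(choice.get(e.getKey())));
        return variant;
    }
}
```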
A/B Testing: an optimization problem
• Variants: We call a concrete implementation of a program p a variant of p.
• Constraints: A constraint is a function Ci,j : Di → P(Dj) that, given a domain value for feature i, returns the subset of values not allowed for feature j.
• A variant is valid if it satisfies all the defined constraints.
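A constraint Ci,j can be sketched directly as a function from a value of one feature to the set of forbidden values of another; the concrete rule below (long button text forbids the smallest font) is invented purely to illustrate the validity check:

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical sketch of a constraint C(i,j): given a value of feature i,
// return the subset of feature j's domain that becomes forbidden.
// A variant is valid if none of its chosen values is forbidden.
public class Constraints {
    // Illustrative rule: the text "Buy Now!" forbids fontSize "12"
    static final Function<String, Set<String>> FORBIDDEN_FONT_SIZES =
        text -> text.equals("Buy Now!") ? Set.of("12") : Set.of();

    static boolean isValid(String buttonText, String fontSize) {
        return !FORBIDDEN_FONT_SIZES.apply(buttonText).contains(fontSize);
    }
}
```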
A/B Testing: an optimization problem
• Assessment Function: An assessment function is a function o : Vp → R, where Vp = {v1, …, vm} is the set of all possible variants of the program.
• This function assigns each variant a numeric value that indicates how good the variant is with respect to the goal of the program.
A/B Testing: an optimization problem
• Thus, A/B testing can be formulated as:
v* = argmax_{v ∈ Vp} o(v)
• Goal: exploit search algorithms to enable automated A/B testing.
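The argmax formulation can be sketched as a brute-force search over a tiny variant space. The assessment function here is a stub (in real A/B testing it would come from live-user measurements), and the two-feature layout is an assumption made only to keep the example short:

```java
import java.util.*;

// Minimal sketch of v* = argmax o(v): enumerate all variants of a
// two-feature program and keep the one with the highest assessment.
public class ArgmaxSearch {
    // Stub assessment o(v); a real one would be measured on live users.
    static double assess(List<String> variant) {
        return variant.get(0).length() + variant.get(1).length();
    }

    static List<String> best(List<List<String>> domains) {
        List<String> best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String a : domains.get(0))
            for (String b : domains.get(1)) {
                List<String> v = List.of(a, b);
                double s = assess(v);
                if (s > bestScore) { bestScore = s; best = v; }
            }
        return best;
    }
}
```

Exhaustive enumeration is only feasible for toy spaces; this is exactly why the talk turns to search algorithms next.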
Towards automated A/B Testing
• Two ingredients:
• A design-time declarative facility to specify program features
• A run-time framework in charge of automatically and iteratively exploring the solution space
Towards automated A/B Testing
Specifying Features
• Ad-hoc annotations specify the set of relevant features
• They allow developers to write a parametric program that represents all possible variants
• We rely on Aspect-Oriented Programming (AOP) to dynamically create variants from the parametric program
Primitive Type Features
@StringFeature(name="checkOutButtonText",
               values={"CheckOut", "Buy", "Buy Now!"})
String buttonText;                        // Primitive feature specification

@IntegerFeature(name="fontSize", range="12:1:18")
int textSize;                             // Primitive feature specification

Button checkOutButton = new Button();
checkOutButton.setText(buttonText);       // Primitive feature use
checkOutButton.setFontSize(textSize);     // Primitive feature use
Generic Data Type Features

@GenericFeatureInterface(name="sort",
    values={"com.example.SortByPrice", "com.example.SortByName"})
public interface AbstractSortingInterface {
    public List<Item> sort();
}

@GenericFeature
AbstractSortingInterface sortingFeature;  // ADT feature specification
sortingFeature.sort();                    // ADT feature use

public class SortByPrice implements AbstractSortingInterface {
    public List<Item> sort() {
        // Sort by price
    }
}

public class SortByName implements AbstractSortingInterface {
    public List<Item> sort() {
        // Sort by name
    }
}
Towards automated A/B Testing
Encoding
• Each feature declared by developers directly maps into a gene, while each variant maps into a chromosome.
• The assessment function, which evaluates variants on live users, corresponds directly to the fitness function, which evaluates chromosomes.
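The feature-to-gene mapping can be sketched as follows; the domains and class name are illustrative assumptions (reusing the slide's example features), not the paper's actual encoding:

```java
import java.util.*;

// Hypothetical encoding sketch: each feature becomes a gene holding an
// index into that feature's domain; a variant is a chromosome with one
// gene per feature. Decoding a chromosome yields the concrete variant.
public class Encoding {
    static final List<List<String>> DOMAINS = List.of(
        List.of("CheckOut", "Buy", "Buy Now!"),   // checkOutButtonText
        List.of("12", "14", "16", "18"));         // fontSize

    static List<String> decode(int[] chromosome) {
        List<String> variant = new ArrayList<>();
        for (int i = 0; i < chromosome.length; i++)
            variant.add(DOMAINS.get(i).get(chromosome[i]));
        return variant;
    }
}
```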
Selection
• At each iteration, selection identifies a finite number of chromosomes in the population that survive.
• Several strategies are possible.
• Selection strategies relieve developers from manually selecting variants during A/B testing.
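As one example of the "several possible strategies", the sketch below implements tournament selection; the fitness array stands in for assessment values measured on live users, and the whole class is an illustrative assumption rather than the talk's implementation:

```java
import java.util.*;

// Sketch of one selection strategy: tournament selection. Two chromosomes
// are drawn at random and the fitter one survives; repeating this fills
// the set of survivors for the next iteration.
public class Selection {
    static int tournament(double[] fitness, int a, int b) {
        return fitness[a] >= fitness[b] ? a : b;  // fitter of the two wins
    }

    static List<Integer> select(double[] fitness, int survivors, Random rnd) {
        List<Integer> chosen = new ArrayList<>();
        for (int k = 0; k < survivors; k++) {
            int a = rnd.nextInt(fitness.length);
            int b = rnd.nextInt(fitness.length);
            chosen.add(tournament(fitness, a, b));
        }
        return chosen;
    }
}
```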
Crossover & Mutation
• Crossover and mutation contribute to the generation of new variants.
• In traditional A/B testing, the process of generating new variants of a program is performed manually by the developers.
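How crossover and mutation generate new variants automatically can be sketched on the index-based chromosomes introduced above; the operators below (one-point crossover, single-gene mutation) are standard GA textbook choices, not necessarily the ones the authors used:

```java
import java.util.Random;

// Sketch of automatic variant generation: one-point crossover recombines
// two parent chromosomes, and mutation replaces one gene with another
// value drawn from that feature's domain.
public class Operators {
    static int[] crossover(int[] p1, int[] p2, int point) {
        int[] child = p1.clone();
        for (int i = point; i < p2.length; i++)
            child[i] = p2[i];                 // tail copied from parent 2
        return child;
    }

    static int[] mutate(int[] chrom, int gene, int[] domainSizes, Random rnd) {
        int[] out = chrom.clone();
        out[gene] = rnd.nextInt(domainSizes[gene]); // new value from the domain
        return out;
    }
}
```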
Some results: setup
• We consider a program with n features.
• We assume that each feature has a finite, countable domain.
• We split the users into groups.
• We adopt an assessment function that ranges from 0 (worst case) to 1000 (best case).
• We set a distance threshold t: if the distance between a feature's value and the user's favourite value exceeds t, the variant's assessment is 0.
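The simulated assessment described above can be sketched as below. The exact scoring formula within the threshold is an assumption (the slides only fix the 0–1000 range and the threshold rule), so treat this as one plausible instantiation:

```java
// Sketch of the simulated assessment: scores range from 0 (worst) to
// 1000 (best). If any feature's value is farther than threshold t from
// the user's favourite value, the variant scores 0; otherwise each
// feature contributes a share that shrinks with its distance.
// The within-threshold formula is an illustrative assumption.
public class Assessment {
    static double assess(int[] variant, int[] favourite, int t) {
        double score = 0;
        for (int i = 0; i < variant.length; i++) {
            int d = Math.abs(variant[i] - favourite[i]);
            if (d > t) return 0;              // beyond threshold: worst case
            score += (1000.0 / variant.length) * (1.0 - (double) d / (t + 1));
        }
        return score;                         // 1000 when variant == favourite
    }
}
```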
Some results
Key messages
• An automated solution is indeed possible and worth investigating.
• Heterogeneous user groups imply more complex A/B Testing.
• Intermediate variants of the program provided good assessment function values.
• Future work:
• Real testbed
• Customised mutation operators
• Full support for constraints.
Thank you
1: A/B Testing is a complex and manual activity
2: A/B Testing can be seen as an optimisation problem
3: We can write parametric programs
4: GAs can carry out A/B Testing campaigns for us