Application of Metamorphic Testing to Supervised Classifiers




Page 1: Application of Metamorphic Testing to Supervised Classifiers

Application of Metamorphic Testing to Supervised Classifiers

Xiaoyuan Xie, Tsong Yueh Chen
Swinburne University of Technology

Christian Murphy, Gail Kaiser
Columbia University

Joshua Ho
University of Sydney

Baowen Xu
Nanjing University

Page 2: Application of Metamorphic Testing to Supervised Classifiers

Background

Many applications in the field of scientific computing depend on machine learning (ML) algorithms

ML applications often do not have test oracles that indicate whether the output is correct for arbitrary input

Applications without test oracles are called “non-testable programs”

Page 3: Application of Metamorphic Testing to Supervised Classifiers

Problem Statement

Oracles may exist for a limited subset of the input domain, and gross errors (e.g. crashes) can be detected with certain inputs or techniques

However, it is difficult to detect subtle (computational) errors for arbitrary inputs

Page 4: Application of Metamorphic Testing to Supervised Classifiers

Testing ML Applications

There has been much research into applying ML techniques to software testing, but not the other way around

Reusable real-world data sets and frameworks are available for checking that an ML algorithm predicts well, but not for checking that an implementation works correctly

Page 5: Application of Metamorphic Testing to Supervised Classifiers

Observation

If there is no oracle in the general case, we cannot know the expected relationship between a particular input and its output

However, it may be possible to know relationships between a set of inputs and the corresponding set of outputs

“Metamorphic Testing” [Chen et al. ’98] is such an approach

Page 6: Application of Metamorphic Testing to Supervised Classifiers

Metamorphic Testing

An approach for creating follow-on test cases based on previous test cases

If input x produces output f(x), then the function’s “metamorphic properties” are used to guide a transformation function t, which is applied to produce a new test case input, t(x)

We can then predict the expected value of f(t(x)) based on the value of f(x) obtained from the actual execution

Page 7: Application of Metamorphic Testing to Supervised Classifiers

Metamorphic Testing without an Oracle

When a test oracle exists, we can know whether f(t(x)) is correct
  – Because we have an oracle for f(x)
  – So if f(t(x)) is as expected, then it is correct

When there is no test oracle, f(x) acts as a “pseudo-oracle” for f(t(x))
  – If f(t(x)) is as expected, it is not necessarily correct
  – However, if f(t(x)) is not as expected, either f(x) or f(t(x)) (or both) is wrong
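
The pseudo-oracle idea can be sketched in a few lines of Python. This is only an illustration, not code from the study: the helper `relation_holds` and the toy functions `drops_last` and `always_zero` are hypothetical names invented here. The relation used is "reversing a list should not change its sum".

```python
def relation_holds(f, t, expected, x):
    """Run f on x and on t(x); compare f(t(x)) with the value predicted from f(x)."""
    source_output = f(x)            # no oracle needed: we only record what f returned
    follow_up_output = f(t(x))      # output for the transformed input
    return follow_up_output == expected(source_output)


def reverse(xs):
    """Transformation t: reverse the input list."""
    return list(reversed(xs))


def unchanged(y):
    """Expected relation: the output should stay the same."""
    return y


def drops_last(xs):
    """Hypothetical defective sum that ignores the last element."""
    return sum(xs[:-1])


def always_zero(xs):
    """Hypothetical defective sum that ignores its input entirely."""
    return 0


data = [1, 2, 3]
correct_passes = relation_holds(sum, reverse, unchanged, data)
defect_detected = not relation_holds(drops_last, reverse, unchanged, data)
defect_missed = relation_holds(always_zero, reverse, unchanged, data)
```

All three flags come out true. `always_zero` satisfies the relation even though it is wrong, which is the "not necessarily correct" case, while the violation by `drops_last` shows that at least one of its two outputs must be wrong.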

Page 8: Application of Metamorphic Testing to Supervised Classifiers

Metamorphic Testing Example

Consider a program that reads a text file of test scores for students in a class, and computes the averages and the standard deviation of the averages

If we permute the values in the text file, the results should stay the same

If we multiply each score by 10, the final results should all be multiplied by 10 as well

These metamorphic properties can be used to create a “pseudo-oracle” for the application
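
A minimal sketch of the two properties above, under simplifying assumptions: the scores are held in a list rather than read from a text file, there is one score per student, and `summarize` is a hypothetical stand-in for the program under test. Results are compared with a tolerance because they are floating-point values.

```python
import math
import random
import statistics


def summarize(scores):
    """Stand-in for the program under test: mean and population standard deviation."""
    return statistics.fmean(scores), statistics.pstdev(scores)


def close(a, b):
    """Compare two result tuples element by element with a relative tolerance."""
    return all(math.isclose(x, y, rel_tol=1e-9) for x, y in zip(a, b))


scores = [72, 85, 90, 64, 88]
mean, stdev = summarize(scores)

# Property 1: permuting the input should leave both results unchanged.
shuffled = scores[:]
random.Random(1).shuffle(shuffled)
permutation_ok = close(summarize(shuffled), (mean, stdev))

# Property 2: multiplying every score by 10 should multiply both results by 10.
scaled = [10 * s for s in scores]
scaling_ok = close(summarize(scaled), (10 * mean, 10 * stdev))
```

Neither check needs to know what the correct mean or standard deviation is; each only compares two executions of the same function.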

Page 9: Application of Metamorphic Testing to Supervised Classifiers

Approach

To apply Metamorphic Testing to such ML applications, we first enumerate the metamorphic relations based on the expected behaviors of a given machine learning algorithm

We then utilize these relations to conduct metamorphic testing on the implementation

Page 10: Application of Metamorphic Testing to Supervised Classifiers

Verification & Validation

The scope of which metamorphic properties are necessary may differ between various problems in the domain

Properties that are necessary can be used for verification: “Is the implementation of the algorithm correct?”

Other properties can be used for validation: “Is the algorithm appropriate for solving this problem?”

Page 11: Application of Metamorphic Testing to Supervised Classifiers

Research Questions

What are the metamorphic properties of supervised ML classification algorithms?
  – Which can be used for verification?
  – Which can be used for validation?

Can metamorphic testing detect defects in real-world ML applications?

Page 12: Application of Metamorphic Testing to Supervised Classifiers

Machine Learning Fundamentals

Data sets consist of a number of samples, each of which has attributes and a label

In the first phase (“training”), a model is generated that attempts to generalize how attributes relate to the label

In the second phase, the model is applied to a previously unseen data set (“testing” data) with unknown labels to produce a classification of each sample

Page 13: Application of Metamorphic Testing to Supervised Classifiers

Algorithms Investigated

k-Nearest Neighbors (kNN)
  – Samples in the testing data are classified by using Euclidean distance to find the k nearest samples in the training data
  – Classification is then done by majority rule

Naïve Bayes Classifier (NBC)
  – For a given sample in the testing data, computes the probability of that sample belonging to each class, assuming conditional independence between the attributes
  – Chooses the class that is most likely
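
The kNN description above can be sketched as follows. This is a toy illustration, not the Weka implementation: the function name `knn_classify`, the data layout (a list of `(attributes, label)` pairs) and the sample values are all assumptions made here.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (attributes, label) pairs; attributes and query are
    equal-length tuples of numbers. Distance is Euclidean (math.dist).
    """
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # most_common keeps first-seen order among equal counts, so a tie goes to
    # the label of the nearer sample in this sketch.
    return votes.most_common(1)[0][0]


train = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"),
]

label_near_a = knn_classify(train, (1.1, 1.0))
label_near_b = knn_classify(train, (5.1, 5.0))
```

How ties are broken is an implementation choice; the sketch makes one such choice explicit in a comment rather than claiming it matches any particular toolkit.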

Page 14: Application of Metamorphic Testing to Supervised Classifiers

Metamorphic Relations

We identified 11 properties that we would expect all classification algorithms to have
  – Affine transformation of attributes
  – Permutation of labels or attributes
  – Addition of informative or uninformative attributes
  – Addition of classes by duplicating or re-labeling samples
  – Removal of classes or samples
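
Two of these relations can be checked against a toy classifier as a sketch. Everything here is an assumption for illustration: `knn_classify` is a hypothetical minimal kNN (ranking by squared Euclidean distance, which orders samples the same way as the distance itself), the data are random integers so that the arithmetic is exact, and the affine map is the same `a*v + b` with positive `a` for every attribute. The exact definitions of the 11 properties are not given in the slides.

```python
import random
from collections import Counter


def knn_classify(train, query, k=3):
    """Majority label among the k training samples closest to query."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    nearest = sorted(train, key=lambda s: sq_dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def predict_all(train, tests, k=3):
    return [knn_classify(train, q, k) for q in tests]


def permute_attributes(sample, order):
    """Reorder the attribute columns of one sample."""
    return tuple(sample[i] for i in order)


def affine(sample, a, b):
    """Apply v -> a*v + b to every attribute of one sample."""
    return tuple(a * v + b for v in sample)


# Randomly generated source test case: four attributes, 30 training samples.
rng = random.Random(0)
train = [(tuple(rng.randint(0, 100) for _ in range(4)), rng.choice("AB"))
         for _ in range(30)]
tests = [tuple(rng.randint(0, 100) for _ in range(4)) for _ in range(10)]
source = predict_all(train, tests)

# Follow-on test case 1: permute attributes in training and testing data alike.
order = [2, 0, 3, 1]
perm_train = [(permute_attributes(x, order), y) for x, y in train]
perm_tests = [permute_attributes(q, order) for q in tests]
permutation_ok = predict_all(perm_train, perm_tests) == source

# Follow-on test case 2: apply the same affine map to every attribute value.
aff_train = [(affine(x, 3, 7), y) for x, y in train]
aff_tests = [affine(q, 3, 7) for q in tests]
affine_ok = predict_all(aff_train, aff_tests) == source
```

Both follow-on runs are expected to reproduce the source predictions exactly. With floating-point attributes the same checks could fail through rounding alone, so a real harness would need to decide how to treat near-ties.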

Page 15: Application of Metamorphic Testing to Supervised Classifiers

Experimental Setup

Applied the approach to implementations in the Weka 3.5.7 toolkit

Initial test cases:
  – Randomly generated values
  – Four attributes (“columns”)
  – 20-50 samples (“rows”)

Metamorphic relations were applied to create 20-300 follow-on test cases

Page 16: Application of Metamorphic Testing to Supervised Classifiers

Results

            k Nearest Neighbors          Naïve Bayes Classifier
Property    Necessary?    % violated     Necessary?    % violated
0                         0                            7.4
1.1                       15.9                         0.3
1.2                       0                            0
2.1                       0                            0.6
2.2                       4.1                          0
3.1                       0                            0
3.2                       0                            0
4.1                       25.3                         0
4.2                       0                            3.9
5.1                       5.9                          5.6
5.2                       2.8                          2.8

Page 17: Application of Metamorphic Testing to Supervised Classifiers

Analysis: kNN

No necessary properties were violated

Issues related to validation:
  – Labels that are non-existent in the training data have a non-zero chance of being selected in classification
  – If two labels are equally likely, the “first” one that is listed is chosen

Page 18: Application of Metamorphic Testing to Supervised Classifiers

Analysis: Naïve Bayes

Four necessary properties were violated, indicating defects in the implementation
  – Loss of precision related to use of the “double” datatype in Java
  – Laplace Accuracy used to determine probabilities; thus, labels that did not appear in training data have non-zero probability
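
The second point can be illustrated with a small sketch. It assumes that “Laplace Accuracy” refers to the usual add-one (Laplace) estimate, (count + 1) / (n + number of labels), which the slides do not spell out; the function `class_priors` and its `laplace` flag are hypothetical names, not Weka's API.

```python
from collections import Counter
from fractions import Fraction


def class_priors(train_labels, all_labels, laplace=True):
    """Estimate the prior probability of each label from the training labels.

    With laplace=True every label gets one extra pseudo-count (assumed
    add-one estimate); with laplace=False plain relative frequencies are used.
    """
    counts = Counter(train_labels)
    n, k = len(train_labels), len(all_labels)
    if laplace:
        return {c: Fraction(counts[c] + 1, n + k) for c in all_labels}
    return {c: Fraction(counts[c], n) for c in all_labels}


train_labels = ["A", "A", "B", "A"]
all_labels = ["A", "B", "C"]      # "C" is declared but never seen in training

smoothed = class_priors(train_labels, all_labels, laplace=True)
unsmoothed = class_priors(train_labels, all_labels, laplace=False)
```

Under this assumption the unseen label "C" receives prior 1/7 with the add-one estimate and 0 without it, so a relation that expects unseen labels never to be predicted can be violated only in the first case.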

Page 19: Application of Metamorphic Testing to Supervised Classifiers

Suggestions

We suggest using the “BigDecimal” class instead of the “double” datatype

Laplace Accuracy is appropriate for the attributes but not for the labels

Use of Laplace Accuracy should be set as an option
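
The first suggestion can be motivated with a short sketch. It does not reproduce the Weka defect, whose details are not given in the slides; it only shows one plausible way binary floating point can break a permutation relation, using Python's `float` as a stand-in for Java's `double` and `decimal.Decimal` as a stand-in for `BigDecimal`.

```python
from decimal import Decimal
from itertools import permutations


def add_in_order(values):
    """Add the values left to right with plain + (no compensated summation)."""
    total = values[0]
    for v in values[1:]:
        total = total + v
    return total


literals = ["0.1", "0.2", "0.3"]

# Sum the same three numbers in every possible order.
float_sums = {add_in_order([float(s) for s in order])
              for order in permutations(literals)}
decimal_sums = {add_in_order([Decimal(s) for s in order])
                for order in permutations(literals)}

float_is_order_sensitive = len(float_sums) > 1
decimal_is_order_sensitive = len(decimal_sums) > 1
```

With binary floats the result depends on the order of addition, so permuting the input can change the output in the last digit; with decimal arithmetic all six orders give exactly 0.6. Exact arithmetic is slower, which may matter when deciding where to apply it.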

Page 20: Application of Metamorphic Testing to Supervised Classifiers

Future Work

Apply the testing approach to other domains that depend on ML, such as scientific computing

Further investigation of testing “non-testable programs”

Measure the effectiveness of the approach in empirical studies

Page 21: Application of Metamorphic Testing to Supervised Classifiers

Summary

Metamorphic testing is easy to implement and automate

We were able to devise fault-revealing properties even with just a basic understanding of the ML algorithms

Metamorphic testing can be used for both verification and validation


Page 23: Application of Metamorphic Testing to Supervised Classifiers

Related Work

Applying MT to non-testable programs in other domains

General properties for use in MT