Page 1: Search-Based Software Testing Keynote: Research ...web.eecs.umich.edu/~weimerw/ppt/weimer-sbst-keynote.pdfKeynote: Research, Challenges and Opportunities ... – Genetic Algorithm

Search-Based Software Testing Keynote:

Research, Challenges and Opportunities

Westley Weimer, University of Virginia

Meta

● I'm the sort of person who usually skips keynotes.

● I'm not sure that I have a good argument for why you shouldn't skip mine.

● I'll try to focus on informal notions, background or ideas that may not be as common …

● … rather than formal technical details.

Outline

● My Recent Research

– Preview ICSE 2012

● The Field

● Challenges and Opportunities

Testing

● A way to gain confidence that a program implementation adheres to its specification (as refined from its requirements).

● Thus, testing often finds bugs ...

Roundabout Motivation

● "Don't come back till you have him!" the Ticktockman said, very quietly, very sincerely, extremely dangerously.

● They used dogs. They used probes. They used cardioplate crossoffs. They used teepers. They used bribery. They used stiktytes. They used intimidation. They used torment. They used torture. They used finks. They used cops. They used search&seizure. They used fallaron. They used betterment incentive. They used fingerprints. They used Bertillon. They used cunning. They used guile. They used treachery. They used Raoul Mitgong, but he didn't help much. They used applied physics. They used techniques of criminology.

● And what the hell: they caught him.

– “Repent, Harlequin!” Said the Ticktockman (1965)

Catching Bugs

● "Don't come back till you have him!" the Ticktockman said, very quietly, very sincerely, extremely dangerously.

● They used dogs. They used probes. They used cardioplate crossoffs. They used teepers. They used bribery. They used stiktytes. They used intimidation. They used torment. They used torture. They used finks. They used cops. They used search&seizure. They used fallaron. They used betterment incentive. They used fingerprints. They used Bertillon. They used cunning. They used guile. They used treachery. They used Raoul Mitgong, but he didn't help much. They used applied physics. They used techniques of criminology.

● And what the hell: they caught him.

– “Repent, Harlequin!” Said the Ticktockman (1965)

[Words emphasized on this slide: probes, search, cunning, guile.]

A Big Issue: Now What?

Bug Bounties

Show Me The Money

Really?

● Tarsnap:

– 200 candidates

– 125 spelling/style

– 63 “harmless”

– 11 “minor”

– 1 “major”

● (63 + 11 + 1)/200 = 75/200 = 38% TP (true positive) rate

● $17 + 40 hours per TP

GenProg: Automated Program Repair

● Goal: Automatically repair defects in off-the-shelf, legacy software.

● Input:

– Unannotated Program Source

– Deterministic Test Cases (1+ currently fail)

● Output:

– Repaired Program (Patch)

– That passes all Test Cases

See Also

● Clearview, Jolt

– Automated fixing of deployed binaries

– Perkins, Kim, Larsen, Amarasinghe, Bachrach, Carbin, Pacheco, Sherwood, Sidiroglou, Sullivan, Wong, Zibin, Ernst, Rinard

● AutoFix-E

– Automated fixing of programs with contracts

– Wei, Pei, Furia, Silva, Buchholz, Meyer, Nordio, Zeller

● AFix

– Automated fixing of single variable atomicity violations

– Jin, Song, Zhang, Lu, Liblit

● Debroy and Wong, etc.

GenProg Approach

● Search through the space of patches (sequences of edits) until one is found that passes all test cases.

● An “Edit” is:

– Delete statement X

– Insert statement X after statement Y

– Replace statement X with statement Y

● The “Search” is:

– Genetic Algorithm
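
The three edit kinds above can be sketched as follows. This is a minimal illustration, not GenProg's actual data structures: it assumes a program is a flat list of statement strings and that every index refers to a position in the original, unpatched program. The names `Edit` and `apply_patch` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Edit:
    kind: str                     # "delete", "insert", or "replace"
    target: int                   # original index of the statement being edited
    source: Optional[int] = None  # original index of the statement copied in


def apply_patch(program: List[str], patch: List[Edit]) -> List[str]:
    """Apply a sequence of edits and return the patched statement list."""
    # slots[i] holds whatever currently occupies original position i
    # (assumption: indices always refer to the unpatched program).
    slots = [[stmt] for stmt in program]
    for edit in patch:
        if edit.kind == "delete":
            slots[edit.target] = []
        elif edit.kind == "insert":
            # insert a copy of statement `source` after statement `target`
            slots[edit.target] = slots[edit.target] + [program[edit.source]]
        elif edit.kind == "replace":
            # replace statement `target` with a copy of statement `source`
            slots[edit.target] = [program[edit.source]]
        else:
            raise ValueError(f"unknown edit kind: {edit.kind}")
    return [stmt for slot in slots for stmt in slot]


program = ["a = 1", "b = 2", "print(a)"]
patch = [Edit("delete", target=1), Edit("insert", target=2, source=0)]
patched = apply_patch(program, patch)
```

Note that every edit only deletes or copies statements that already exist in the program, which is what keeps the space of patches searchable.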

Genetic Algorithm

● Genome = sequence of edits

● Mutate = add a new edit

● Crossover = uniform

● Fitness = weighted sum of tests passed
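
A rough sketch of these four ingredients, assuming a genome is a plain list of opaque edit identifiers. The handling of parents of different lengths and the test weights are assumptions made for illustration; the slide does not specify them.

```python
import random


def mutate(genome, candidate_edits, rng):
    """Mutation adds one new edit to the end of the genome."""
    return genome + [rng.choice(candidate_edits)]


def uniform_crossover(parent_a, parent_b, rng):
    """Pick each position of the child from either parent with probability 1/2."""
    child = []
    for i in range(max(len(parent_a), len(parent_b))):
        parent = parent_a if rng.random() < 0.5 else parent_b
        if i < len(parent):  # assumption: skip the slot if the chosen parent is shorter
            child.append(parent[i])
    return child


def fitness(passed_tests, weights):
    """Weighted sum over the tests that a candidate passes."""
    return sum(weights[t] for t in passed_tests)


rng = random.Random(0)
# Hypothetical weights: the originally failing test counts for more.
weights = {"regression_1": 1.0, "regression_2": 1.0, "bug_test": 10.0}
child = uniform_crossover(["e1", "e2", "e3"], ["f1", "f2"], rng)
longer = mutate(child, ["e4", "e5"], rng)
score = fitness({"regression_1", "bug_test"}, weights)
```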

On The Shoulders of Giants

● Search Space Reduction

– Fault Localization guides search

– When mutating, apply edits to likely places

● Test Suite Prioritization

– Evaluate fitness on a sample of the tests

– Confirm a candidate repair via retest-all

– Time-Aware Test Suite Prioritization

– Impact Analysis
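
Both ideas can be sketched in a few lines. The sketch assumes that fault localization has already produced a suspiciousness weight per statement and that `run_test(candidate, test)` returns True on a pass; both are placeholders, as are all function names.

```python
import random


def pick_edit_location(suspiciousness, rng):
    """Choose a statement to edit, biased toward suspicious statements."""
    stmts = list(suspiciousness)
    weights = [suspiciousness[s] for s in stmts]
    return rng.choices(stmts, weights=weights, k=1)[0]


def sampled_fitness(candidate, tests, run_test, sample_size, rng):
    """Cheap fitness estimate: count passes on a random sample of the tests."""
    sample = rng.sample(tests, min(sample_size, len(tests)))
    return sum(1 for t in sample if run_test(candidate, t))


def confirm_repair(candidate, tests, run_test):
    """Retest-all: a candidate is a repair only if it passes every test."""
    return all(run_test(candidate, t) for t in tests)
```

The sample keeps each fitness evaluation cheap during the search, while retest-all guards against accepting a candidate that only looked good on the sample.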

Parallelism

● Can evaluate in parallel

– One candidate on test X and test Y

– Separate candidates

– Separate runs of the entire process

● Reduce time-to-first-fix

– With many runs in parallel

– Use public cloud computing infrastructure

– Directly measure the cost ($) of each fix
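
As one illustration of the "separate candidates" level, the sketch below scores several candidates concurrently with a thread pool. It assumes `run_test` mostly waits on an external process (as real test runs usually do), so threads are adequate; the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_all(candidates, tests, run_test, workers=4):
    """Return the number of passing tests for each candidate, in order."""

    def score(candidate):
        return sum(1 for t in tests if run_test(candidate, t))

    # pool.map preserves the order of `candidates` in its results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score, candidates))
```

The same pattern applies one level down (one candidate, many tests) or one level up (many independent runs of the whole search).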

Results that Generalize

● Straw Program Repair Evaluation

– I took my 10 favorite programs, identified a bug in each one, and voila, our technique fixed all of those bugs.

● Research Question

– What fraction of defects can these techniques actually repair?

Systematic Benchmark Selection

● Intuition: Take the last 100 bugs from project X. How many can we fix?

● Process for each Program X:

– Take all tests from latest version of X.

– Find all compile-able, run-able versions A,B such that B passes tests that A does not

● Consider top programs X from Sourceforge, Google Code, Fedora SRPM, etc.

● Fix all algorithm parameters before finding or inspecting benchmarks
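
The version-pair step can be sketched like this. The sketch assumes the history is given oldest first, that a version which does not compile or run is marked with `None`, and that A and B are consecutive viable versions; the slide does not say whether the pairs must be consecutive, so that part is an assumption.

```python
def find_bug_fix_pairs(history):
    """history: list of (version_id, set of passing tests or None), oldest first.

    Returns (A, B, newly_passing_tests) for consecutive viable versions
    where B passes at least one test that A does not.
    """
    viable = [(v, passed) for v, passed in history if passed is not None]
    pairs = []
    for (a, pass_a), (b, pass_b) in zip(viable, viable[1:]):
        newly_passing = pass_b - pass_a
        if newly_passing:
            pairs.append((a, b, newly_passing))
    return pairs


# Hypothetical history: r2 does not build, r3 fixes the bug exposed by t2.
history = [
    ("r1", {"t1"}),
    ("r2", None),
    ("r3", {"t1", "t2"}),
    ("r4", {"t1", "t2"}),
]
pairs = find_bug_fix_pairs(history)
```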

From So Many To So Few

● Opaque or non-automated GUI testing

– Firefox, Eclipse, OpenOffice

● Inaccessible or small version control histories

– bash, cvs, openssh

● Few viable versions for recent tests

– valgrind

● Require incompatible automake, libtool

– gmp

● Non-deterministic tests ...

105 bugs, 5 MLOC, 10k Tests

● Bugs severe enough to merit checked-in tests

● Bugs 3/5 or higher developer-reported severity

10x, 10x, 10x

55/105 Repaired for $8 Each

● $403 for all 105 trials, leading to 55 repairs
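
The arithmetic behind the headline can be checked directly from the three numbers on the slide. Dividing the total by the successful repairs charges the failed trials to the successes; the result is a little over $7, which the slide title presumably rounds up to $8. The function name is illustrative.

```python
def cost_summary(total_cost, trials, repairs):
    """Spread the total cloud bill over all trials and over successful repairs."""
    return {
        "repair_rate": repairs / trials,
        "cost_per_trial": total_cost / trials,
        # Failed trials still cost money, so they are charged to the successes.
        "cost_per_repair": total_cost / repairs,
    }


summary = cost_summary(total_cost=403.0, trials=105, repairs=55)
```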

Public Comparisons

● JBoss issue tracking:

– Median 5.0 hours, mean 15.3 hours

● Tarsnap.com

– $17 per non-trivial repair

● IBM

– $25 per defect during coding

– (rising at build, QA, post-release, etc.)

● One of the php bugs we fixed has an associated security CVE

Preprint, Code, Benchmarks, VMs

http://genprog.cs.virginia.edu

Outline

● My Recent Research

– Preview ICSE 2012

● The Field

● Challenges and Opportunities

What is SBST?

● Search-Based Software Testing

● Well, “Software Testing” is well-established

● So what's “Search-Based”?

● Let's ask a luminary in the field.

Authoritative Answer

Andreas Zeller. Keynote, Search-Based Software Engineering, 10 Sept 2011.

SBST Abstract Tag Cloud, 2008-2011

Are We Pining for the Fjords?

[Chart: International Workshop on Search-Based Software Testing. x-axis: year, 2008 to 2012; y-axis: number of accepted papers, 0 to 12.]

SE Fields Using “Search-Based”

● Yuanyuan Zhang, SEBASE Repository: 1022 SE+Search Publications

Search-Based Software Testing

● Is there still a strong case for a separate SBST identity (from SBSE and Software Testing)?

– It's not clear to me that there is.

● Search-Based approaches are more respectable.

● There are good scientific and engineering challenges associated with making “mainstream” SE arguments.

Challenges and Opportunities

● Benchmark Selection

● Oracles and Specifications

● Extracting Human Intent

● Leveraging Cloud Computing & Crowdsourcing

● “Fix” Localization

● Embracing the Problem

● Lifting Input Assumptions & Sensitivity

Benchmark Selection

● The TRIANGLE program and the small SIEMENS benchmarks are not adequate.

● Augment them with real-world programs (e.g., open source), larger programs from SIR, or programs from other repositories (iBugs, etc.).

● Opportunity: If your algorithm really scales with the “tests” or “test coverage” and not the subject LOC, this is a free way for you to look more impressive while convincing a general SE audience.

Benchmark Selection

● Jia and Harman, An Analysis and Survey of the Development of Mutation Testing, IEEE TSE 2010.

– 350 papers and theses, 1977 - 2009

– Table IX: Programs Used In Empirical Studies

– Only 3 used a program of size 100,000 LOC

– Median size < ~1000 LOC

● Mutation Testing != SBST, but ...

– SBST Abs #: “LOC” 0, “lines” 0, “thousand” 0, “million” 0, etc. General SE will use LOC.

● Challenge yourself: +1 order of magnitude
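
A tally like the “SBST Abs #” counts can be reproduced with a short script. The sketch below counts how many abstracts mention each term; whether the slides count abstracts or individual occurrences is not stated, so that choice is an assumption, as is treating a trailing `*` (as in `docume*` or `locali*`) as a prefix wildcard. The sample abstracts are invented.

```python
import re


def abstract_counts(abstracts, terms):
    """Count the abstracts that mention each term (case-insensitive, whole words)."""
    counts = {}
    for term in terms:
        if term.endswith("*"):
            # trailing "*" means: any word starting with this prefix
            pattern = r"\b" + re.escape(term[:-1]) + r"\w*"
        else:
            pattern = r"\b" + re.escape(term) + r"\b"
        regex = re.compile(pattern, re.IGNORECASE)
        counts[term] = sum(1 for text in abstracts if regex.search(text))
    return counts


# Invented abstracts, for illustration only.
abstracts = [
    "We evaluate on 3 programs of 10 thousand lines.",
    "Cost of oracle generation.",
    "Documented oracles reduce cost.",
]
counts = abstract_counts(abstracts, ["LOC", "lines", "thousand", "oracle", "docume*"])
```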

Oracles and Specifications

● Test input generation is no longer sufficient. Test oracle generation is akin to specification mining and anomaly intrusion detection.

● There are now a number of great projects for test input generation (CUTE, AUSTIN, PEX, DART, etc.).

– These work great with implicit or universal specifications (e.g., “don't segfault”).

● SBST Abs #: “oracle” 2, “specification” 3, “requirements” 1.
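
To make the “implicit specification” point concrete, here is a toy sketch in which the only oracle is “the call must not raise”. It uses plain random input generation as a stand-in and is not how CUTE, AUSTIN, PEX, or DART work; all names are hypothetical.

```python
import random


def find_crash(func, gen_input, trials, rng):
    """Return (input, exception) for the first crashing input, or None."""
    for _ in range(trials):
        x = gen_input(rng)
        try:
            func(x)
        except Exception as exc:  # implicit oracle: any uncaught exception is a failure
            return x, exc
    return None


def percent_share(n):
    """Toy function under test: fails when n == 0."""
    return 100 // n


found = find_crash(percent_share, lambda rng: rng.randint(-2, 2), 1000, random.Random(0))
```

Such an oracle can say that the program crashed, but it cannot say whether a non-crashing answer such as `percent_share(3) == 33` is what the requirements intended, which is where oracle generation comes in.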

Specification Mining

● Ill-named task in SE/PL:

– Given a program's source code and an indicative workload, output some partial correctness specifications.

● Analogy: Learn the rules of English from high school student essays.

● Reasonable post-DAIKON examples:

– Ammons and Bodik, Mining Specifications, POPL 2002.

– Forrest et al., A Sense of Self for Unix Processes, IEEE Security and Privacy 1996.
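
As a toy illustration of the task, the sketch below mines two-event rules of the form “every a is eventually followed by b” from execution traces. It is a deliberately simple stand-in and does not reproduce the techniques of the cited papers; the traces and function names are invented.

```python
from itertools import permutations


def always_followed_by(trace, a, b):
    """True if every occurrence of `a` in the trace has a later `b`."""
    if a not in trace:
        return True
    last_a = len(trace) - 1 - trace[::-1].index(a)
    return b in trace[last_a + 1:]


def mine_followed_by(traces):
    """Return all pairs (a, b) whose rule holds on every observed trace."""
    events = sorted({event for trace in traces for event in trace})
    return [
        (a, b)
        for a, b in permutations(events, 2)
        if all(always_followed_by(trace, a, b) for trace in traces)
    ]


# Invented workload: three runs of a file-handling routine.
traces = [
    ["open", "read", "close"],
    ["open", "close"],
    ["open", "read", "read", "close"],
]
rules = mine_followed_by(traces)
```

The mined rules are only as good as the workload: a trace in which the program forgets to close a file would silently remove the rule, which is the point of the essay analogy above.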

Human Intent

● Non-executable artifacts should play a larger role in search-based software testing.

● Imagine that you are trying to localize a fault, learn a specification, or prioritize a test suite, and you have two pieces of code:

– One highly readable, rarely-touched, written by an expert developer, full of comments.

– Another less readable, often churned, written by a novice, full of duplicate code.

Measuring Human Intent

● Halstead and Software Science may be poor choices, but other options are available.

● Example: Le Goues et al., Measuring Code Quality to Improve Specification Mining, IEEE TSE 2011. Reduces FP rate by 10x.

● SBST Abs #: docume* 0, comment 0, human 2.

● Measuring intent or quality involves humans but is much cheaper than a full-blown human study: instead, sift historical data to get human judgments “for free”.
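
Two of the cheapest signals of this kind, churn and comment density, can be sketched as below. These are illustrative proxies only and are not the metrics used in the cited paper; the sketch assumes the commit log is already available as lists of touched file paths and that comments start with `#`.

```python
from collections import Counter


def churn(commits):
    """Count how many commits touched each file."""
    counts = Counter()
    for files in commits:
        counts.update(set(files))  # count a file once per commit
    return counts


def comment_density(source):
    """Fraction of non-blank lines that are comment lines (assumes '#' comments)."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    return sum(1 for line in lines if line.startswith("#")) / len(lines)


# Hypothetical commit log and source text.
touched = churn([["a.c", "b.c"], ["a.c"], ["a.c", "a.c"]])
density = comment_density("# doc\nx = 1\n\n# more\ny = 2\n")
```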

Cloud Computing

● Cloud computing (or similar) should be used to put a monetary value on research costs or benefits, where applicable.

● For some tasks (test suite reduction or prioritization) this may not be necessary, but for others (fault localization, bug repair) this seems increasingly relevant.

– Many layers to “total effort saved”

– Challenge yourself: be precise about one.

● SBST Abs #: cloud 0, dollar 0, money 0, cost 8

Crowdsourcing

● Crowdsourcing should be used as an efficient way to gather human study data.

● Services like Amazon's Mechanical Turk are increasingly used in SE human studies.

● Care must be taken to control or account for expertise and “gaming the system”.

● Example: Fry et al., A Human Study of Fault Localization Accuracy, ICSM 2010. (200+)

● SBST Abs #: “human s” 0, “irb” 0, “crowd” 0, human 2.

“Fix” Localization

● We should use tests to detect possible fixes, not just to detect faults.

● Explicit personal bias on this one.

● Information used for test prioritization, test reduction, test generation, etc., could be used to say “If you are failing test X, you should look at Y” more often than it currently is.

● SBST Abs #: fix 1!, locali* 0, repair 2.
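
One simple way to get “if you are failing test X, look at Y” is to index past fixes by the test that was failing. The sketch below assumes such a history of (failing test, files changed by the fix) records exists; the names and the example records are hypothetical.

```python
from collections import Counter, defaultdict


def build_fix_index(history):
    """history: iterable of (failing_test, files_changed_by_the_fix)."""
    index = defaultdict(Counter)
    for test, files in history:
        index[test].update(files)
    return index


def suggest(index, failing_test, k=3):
    """Return up to k files most often changed when this test was fixed before."""
    ranked = index.get(failing_test, Counter()).most_common(k)
    return [path for path, _ in ranked]


# Hypothetical fix history.
history = [
    ("test_parse", ["parser.c", "lexer.c"]),
    ("test_parse", ["parser.c"]),
    ("test_io", ["io.c"]),
]
index = build_fix_index(history)
```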

Input Assumptions & Sensitivity

● We should try to reduce (or measure the impact of) the assumptions of our projects.

● If you are a grad student looking for a project, removing an assumption made by another project is a reasonable first step.

– Example: “Assumes single-threaded code” or “Assumes tracing from requirements to tests.” Increases utility, carves out niche.

● If not: Sensitivity Analysis. (Abs #: sens.* 0)

– Example: If the tracing is off by 10%, is the reduction off by 20%? 2000%?
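
The question in that example can be asked mechanically: perturb one input by 10% and report how far the output moves. The sketch below does this for a single numeric input; the function name and the two toy models are assumptions, standing in for a real technique and its real input.

```python
def relative_change(f, x, perturbation=0.10):
    """Relative change in f's output when input x is increased by `perturbation`."""
    base = f(x)
    perturbed = f(x * (1.0 + perturbation))
    return abs(perturbed - base) / abs(base)


# Toy models: a linear one passes a 10% input error straight through,
# while a quadratic one roughly doubles it.
linear = relative_change(lambda x: 2 * x, 10.0)
quadratic = relative_change(lambda x: x ** 2, 10.0)
```

A result near 0.10 means the technique degrades gracefully; a result of 20 would be the "2000%" case from the slide.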

Conclusion

● Automated Program Repair

– We can repair 55/105 bugs in 5 MLOC with 10,000+ tests for $8 each, on average.

● Search-Based Software Testing venue

– Time to go mainstream?

● Challenges and Opportunities

– Benchmarks, Specifications, Intent, Cloud Computing, Crowdsourcing, Fix Localization, Assumptions and Sensitivity