
Brian de Alwis and Gail Murphy, Dept. of Computer Science, University of British Columbia, Canada

Presented at the International Conference on Software Engineering (ICSE 2008)

Class: CISC 879, Oct 2, 2008
Giriprasad Sridhara

Motivation

Software maintenance is hard (60 to 90% of development costs)

Scenario: A developer is maintaining auction bidding software written by another programmer

Now she is interested in knowing…
  How is “remove auction” implemented?
  Which method(s) call RemoveAuction?
  Who last changed RemoveAuction and why did he change it?
  What other methods in this class are used (addAuction…)?

Conceptual Queries

Feature Location

Answering Conceptual Queries

Conceptual queries can be answered by:
  Looking at version history
  Browsing the call graph
  Checking the inheritance hierarchy
  …

But this is:
  Tedious (in the best case)
  Prone to data overload and disorientation (in the worst case)

Net effect: the programmer is more likely to introduce buggy code

Three problems with using existing tools for answering conceptual queries: Map & scope

Map the query to existing concrete queries and scope the results to those of interest

Example:
  Conceptual query: Where is this exception thrown?
  Mapping: Manually search for references to the exception
  Scoping: Manually examine the results to find the actual throw

Three problems with using existing tools for answering conceptual queries: Compose

To answer a single conceptual query:
  May need to make multiple concrete queries
  And compose the results

Example:
  Conceptual query: Where is this interface created?
  Step 1: Find concrete classes implementing the interface
  Step 2: Find locations in the code where the implementing classes are created

Three problems with using existing tools for answering conceptual queries: Integrate and reason

To answer a single conceptual query:
  Integrate across different sources of information
  Reason across different sources of information

Tools have potentially different internal representations of an element

Example:
  Conceptual query: When is this interface method called (during a run)?
  Step 1: Static information: Find classes implementing the interface
  Step 2: Dynamic information: Look for calls on implementing classes in execution traces
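
A minimal sketch of the two steps above, in Python. The class names (BidList, ArrayBidList, ...), the dictionary of implementers, and the trace format of (caller, callee) pairs are all hypothetical, chosen only for illustration; real static and dynamic tools use their own representations, which is exactly the integration problem described here.

```python
# Step 1 (static information): which classes implement the interface?
# Hypothetical data: interface name -> set of implementing classes.
static_implements = {"BidList": {"ArrayBidList", "LinkedBidList"}}

# Step 2 (dynamic information): an execution trace as (caller, callee) pairs.
# Hypothetical format; real trace tools record far more detail.
trace = [
    ("Main.main", "ArrayBidList.add"),
    ("Main.main", "Auction.close"),
    ("BidServlet.doPost", "ArrayBidList.add"),
]

def calls_during_run(interface, method, implements, trace):
    """Return trace entries that call `method` on any class implementing `interface`."""
    targets = {f"{cls}.{method}" for cls in implements.get(interface, ())}
    return [(caller, callee) for (caller, callee) in trace if callee in targets]

hits = calls_during_run("BidList", "add", static_implements, trace)
for caller, callee in hits:
    print(caller, "->", callee)
```

Done by hand, the programmer performs the same join by reading two tool outputs side by side and matching names across them.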

Wish I could get answers to all my questions easily!

Problem Statement

Define a model that supports integration of different sources of information about a program.

The model should enable the results of concrete queries from separate tools to be brought together to directly answer many of a programmer’s conceptual queries

Show that the model is practical by implementing a proof of concept tool

[Figure: Static, Dynamic, and Evolution information sources, and their combination (Static + Dynamic + Evolution)]

State of the art

Integrate different tools
  Wasserman (90)
  Drawback: Assumes direct correspondence between artifacts

Cross-artifact search
  GSEE (Favre 05)
  SEXTANT (Schafer et al. 06)
  Drawback: Assumes direct correspondence between artifacts

Query languages
  CodeQuest (Hajiyev and de Moor 06)
  JQuery (Janzen and de Volder 03)
  SCA (Paul and Prakash 96)
  Drawback: No support for correspondences between elements

Proposed Approach

Integrate the different sources of information

Develop a model that supports composition and integration of different sources of program information

Form a single queryable knowledge base that can answer conceptual queries

Contribution

Theory: Development of a model for answering 36 different conceptual queries
  Conceptual queries have been obtained from:
    Prior research (Sillito et al. 06, Voinea et al. 07)
    Blogs
    Experience

Practice: Ferret, an implementation of the model

Theory: The Sphere Model

Spheres: different sources of program information

Examples:
  Static Java relations in the source
  Revision history
  Dynamic execution trace

Theory: Formal Definition of a Sphere

A sphere S is a tuple, S = <E, L, R>
  E = set of elements in the source
  L = set of relation labels existing between elements
  R ⊆ L × E × E

Example sphere types: Static, Dynamic, Evolution, Eclipse Plug-in
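
One way to picture the tuple S = <E, L, R> is as a small data structure. The sketch below is illustrative Python, not Ferret's actual representation; the `Sphere` class, its `query` helper, and the element names are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sphere:
    """A sphere S = <E, L, R>, with R a subset of L x E x E."""
    elements: frozenset   # E: elements in the source
    labels: frozenset     # L: relation labels
    relations: frozenset  # R: (label, source, target) triples

    def __post_init__(self):
        # Enforce R ⊆ L × E × E.
        for label, src, dst in self.relations:
            if label not in self.labels:
                raise ValueError(f"unknown label: {label}")
            if src not in self.elements or dst not in self.elements:
                raise ValueError(f"unknown element in {(label, src, dst)}")

    def query(self, label, source):
        """Return every target t such that (label, source, t) is in R."""
        return {t for (l, s, t) in self.relations if l == label and s == source}

# Hypothetical static sphere for the auction scenario.
static = Sphere(
    elements=frozenset({"BidServlet.doPost", "AuctionManager.removeAuction"}),
    labels=frozenset({"calls"}),
    relations=frozenset({("calls", "BidServlet.doPost", "AuctionManager.removeAuction")}),
)
print(static.query("calls", "BidServlet.doPost"))
```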

Theory: Example of Sphere elements and relations

Elements

  Element type       Elements
  Static             Classes, methods, fields…
  Dynamic            Classes, methods
  Evolution          Revisions, transactions
  Eclipse plug-in    Extension points…

Relations

  Relation type      Relations
  Static             Implements, calls, instantiates…
  Dynamic            Calls, was-invoked, was-instantiated…
  Evolution          Modified-by
  Eclipse plug-in    Depends, extends…

Theory: Composing Sphere Relations

Conceptual query: “Which of the implementations of this interface were actually instantiated in this last run?”

Insight: Composing static information with dynamic information allows a tool to answer such a conceptual query.

Composition of sphere S1 by S2 under a composition function f: S1 ∘f S2 = (E1 ∪ E2, L1 ∪ L2, f(R1, R2))
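
A minimal sketch of the composition formula and of the query above, in Python. The `Sphere` tuple, the element names, and the reading of a was-instantiated triple as (label, class, creating method) are assumptions for illustration only.

```python
from collections import namedtuple

# Hypothetical stand-in for a sphere <E, L, R>; R holds (label, source, target) triples.
Sphere = namedtuple("Sphere", "elements labels relations")

def compose(s1, s2, f):
    """Compose two spheres: (E1 ∪ E2, L1 ∪ L2, f(R1, R2))."""
    return Sphere(s1.elements | s2.elements,
                  s1.labels | s2.labels,
                  f(s1.relations, s2.relations))

static = Sphere(
    elements={"BidList", "ArrayBidList", "LinkedBidList"},
    labels={"implements"},
    relations={("implements", "ArrayBidList", "BidList"),
               ("implements", "LinkedBidList", "BidList")},
)
dynamic = Sphere(
    elements={"ArrayBidList", "Main.main"},
    labels={"was-instantiated"},
    # Assumed reading: ArrayBidList was instantiated by Main.main in the last run.
    relations={("was-instantiated", "ArrayBidList", "Main.main")},
)

# Here f is plain set union of the two relation sets.
combined = compose(static, dynamic, lambda r1, r2: r1 | r2)

def instantiated_implementations(sphere, interface):
    """Implementations of `interface` that also have a was-instantiated relation."""
    impls = {s for (l, s, t) in sphere.relations
             if l == "implements" and t == interface}
    created = {s for (l, s, t) in sphere.relations if l == "was-instantiated"}
    return impls & created

print(instantiated_implementations(combined, "BidList"))  # {'ArrayBidList'}
```

Neither sphere alone can answer the query: the static sphere knows the implementers, the dynamic sphere knows what was created.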

Theory: Composition functions

For the relations R1 and R2 in the spheres S1 and S2:

Union
  Includes relations from both spheres involved in the composition
  Suppose 5 methods m1, m2, m3, m4, m5 have calls in a program to a method m
  Suppose two different runs of the program record the following calls:

  Run 1 relations     Run 2 relations     Union relations
  (calls, m1, m)      (calls, m3, m)      (calls, m1, m)
  (calls, m2, m)      (calls, m4, m)      (calls, m2, m)
                                          (calls, m3, m)
                                          (calls, m4, m)
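
The union example can be sketched with Python sets of (label, source, target) triples. This is an illustrative sketch; the function name `union` is an assumption, not Ferret's API.

```python
def union(r1, r2):
    """Union composition: keep every relation from both spheres."""
    return r1 | r2

# Calls to m observed in two different runs, as in the tables above.
run1 = {("calls", "m1", "m"), ("calls", "m2", "m")}
run2 = {("calls", "m3", "m"), ("calls", "m4", "m")}

combined = union(run1, run2)
callers = sorted(src for (label, src, dst) in combined
                 if label == "calls" and dst == "m")
print(callers)  # ['m1', 'm2', 'm3', 'm4']
```

Note that m5 never appears: it calls m in the source, but neither run exercised that call.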



Replacement
  Relations of R1 with a label from R2 are removed and replaced with relations from R2
  Suppose 5 methods m1, m2, m3, m4, m5 have calls in a program to a method m

  R1 relations        R2 relations        Result
  (calls, m1, m)      (calls, m1, m)      (calls, m1, m)
  (calls, m2, m)      (calls, m3, m)      (calls, m3, m)
  (calls, m3, m)                          Other relations
  (calls, m4, m)
  (calls, m5, m)
  Other relations
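
A minimal Python sketch of replacement as described above, assuming relations are (label, source, target) triples. The `declares` triple is a hypothetical stand-in for the "other relations" in the example.

```python
def replacement(r1, r2):
    """Replacement composition: drop every R1 relation whose label occurs in R2,
    then add all relations of R2."""
    labels_in_r2 = {label for (label, _src, _dst) in r2}
    kept = {rel for rel in r1 if rel[0] not in labels_in_r2}
    return kept | r2

# R1: five static calls to m, plus one hypothetical "other" relation.
r1 = {("calls", f"m{i}", "m") for i in range(1, 6)} | {("declares", "C", "m")}
# R2: the calls actually observed.
r2 = {("calls", "m1", "m"), ("calls", "m3", "m")}

for rel in sorted(replacement(r1, r2)):
    print(rel)
```

The `calls` relations now reflect only R2, while relations with labels that R2 does not mention pass through unchanged.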



Transformation
  Joins relations of R1 by a subset of R2 with a particular label l_r of R2
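
The slide gives no example for transformation, so the sketch below shows only one possible reading: each R1 relation (l, a, b) is joined with the R2 relations (l_r, b, c) to yield (l, a, c). The function, the `implemented-by` label, and the element names are assumptions, and the paper's precise definition may differ.

```python
def transformation(r1, r2, join_label):
    """One reading of transformation: map the target of each R1 relation
    through the R2 relations labelled `join_label`."""
    bridge = {(src, dst) for (label, src, dst) in r2 if label == join_label}
    return {(label, a, c)
            for (label, a, b) in r1
            for (b2, c) in bridge
            if b == b2}

# Hypothetical data: a call to an interface method, mapped to its implementations.
r1 = {("calls", "Client.run", "Shape.draw")}
r2 = {("implemented-by", "Shape.draw", "Circle.draw"),
      ("implemented-by", "Shape.draw", "Square.draw"),
      ("modified-by", "Shape.draw", "rev42")}

for rel in sorted(transformation(r1, r2, "implemented-by")):
    print(rel)
```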

Practice: Ferret Tool

Practice: Ferret implementation

Ferret implements 4 spheres

  Sphere                 Implementation
  Static Java            Eclipse JDT
  Dynamic Java           Eclipse Test and Performance Tools (TPTP)
  Software evolution     Kenyon
  Plug-in development    Eclipse PDE

Practice: Ferret implementation

Ferret implements 36 conceptual queries

Examples:

  Category        Query
  Inter-class     What calls this method?
  Inter-class     Where is the value of this field retrieved?
  Intra-class     What methods does this method call?
  Inheritance     What classes implement this interface?
  Declarations    What are all the fields declared by this type?
  Evolution       Who has changed this element and when?

Practice: Realization of a Conceptual query

Conceptual query: relational operators over relation names

Example: the conceptual query “What instantiates this type?” is implementors ∘ instantiators
  Implementors relation
    Takes an input, say some type T
    Returns all concrete classes implementing T
  Instantiators relation
    Takes as input all concrete classes implementing T
    Returns all methods instantiating such a class C
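
A minimal sketch of this realization in Python, chaining the two relations over a set of (label, source, target) triples. The function names mirror the relation names on the slide, but the code and the sample data are hypothetical and not Ferret's implementation.

```python
def implementors(relations, type_name):
    """Concrete classes implementing the given type T."""
    return {src for (label, src, dst) in relations
            if label == "implements" and dst == type_name}

def instantiators(relations, classes):
    """Methods that instantiate any class in `classes`."""
    return {src for (label, src, dst) in relations
            if label == "instantiates" and dst in classes}

def what_instantiates(relations, type_name):
    """'What instantiates this type?': feed implementors into instantiators."""
    return instantiators(relations, implementors(relations, type_name))

# Hypothetical relations for the auction scenario.
relations = {
    ("implements", "ArrayBidList", "BidList"),
    ("implements", "LinkedBidList", "BidList"),
    ("instantiates", "AuctionManager.addAuction", "ArrayBidList"),
    ("instantiates", "Main.main", "Auction"),
}
print(what_instantiates(relations, "BidList"))  # {'AuctionManager.addAuction'}
```

`Main.main` is excluded because the class it creates does not implement `BidList`.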

Evaluation

Evaluate the tool
Evaluate the underlying model

Two types of evaluation:
  Benchmarking
  Study of tool usage by real-world programmers

Evaluation (benchmarking)

Question: What is Ferret’s querying performance?
  Configuration 1: Ferret uses only static information.
    How does it compare with a normal static Java tool?
  Configuration 2: Ferret uses only static and dynamic information.
    Is the time taken for a query through Ferret < the time taken by a programmer to use different existing tools and combine the results?

Setup:
  Average timings for Ferret invocation on the ArgoUML project
  Select certain types and methods to trigger Ferret

Evaluation Results (benchmarking)

Ferret benchmark timings in seconds:
  First three rows represent Ferret performance for types
  Last three rows represent Ferret performance for methods/fields

Conclusion: Timings are faster than the time a developer would need when using multiple tools and combining the results.

Evaluation: Field Study Questions

Question: Are the 36 queries implemented by Ferret useful to real-world programmers?

Specific questions:
  Which conceptual queries implemented in Ferret are useful to programmers?
  Is the composition of static and dynamic program information, which have some overlap in their concrete queries, useful?
  Are there features of Ferret that programmers find particularly useful?

Evaluation: Field Study Setup

Two day diary study with four Java programmers (P1-P4) working on their own code base

Each programmer used Ferret instrumented to record queries used by the programmers

Spheres used in Ferret:
  Static (JDT)
  Dynamic (Eclipse TPTP)
  Plug-in (PDE)
  Could not configure the Evolution sphere (Kenyon)

Two programmers used integration of Static and PDE spheres

One programmer used integration of Static, PDE and Dynamic Java Sphere

Interview with developers at the end

Evaluation: Field Study Results

Programmers found Ferret useful!

Comparatively fewer Eclipse searches were used
Authors conclude that this shows Ferret satisfied the programmers’ needs for contextual queries

Frequently used conceptual queries:

  Conceptual query                               Percentage usage
  What calls this method?                        28
  Where is the value of this field retrieved?    8
  What methods does this method call?            8
  What types does this type reference?           6.5
  What fields does this method access?           6

Conclusions

Problem Statement: Define a model that supports integration of different sources of information about a program to easily answer conceptual queries.

Contribution: Introduced the sphere model for conceptual queries

Problem Statement: Determine if the model is practical.

Contribution: As proof of concept, implemented the sphere model for 36 conceptual queries in the tool Ferret.

Conclusions

Evaluation:
  Is the performance of the tool (time) acceptable?
    YES! Measured by timing on the ArgoUML project.
  Do real-world programmers find Ferret useful?
    YES! Field study done with 4 real-world Java programmers.

Implication:
  For many of the conceptual queries used by programmers, we now have an easy way of getting answers
  No need to struggle across multiple tools and their outputs

Future work

Presentation issues
Extending Ferret for other conceptual queries
In theory, which conceptual queries can the sphere model support?

Class Discussion

All opinions expressed regarding this paper are my own. They do not necessarily reflect the views of the instructor.

Overall:
  The concept of conceptual queries is good (i.e., the background work of Sillito et al.)
  Motivating examples for the tool are weak (especially the map and scope example in the introduction)

Evaluation is particularly weak:
  Need more rigorous benchmark tests
  Benchmarking runs claim that performance w.r.t. time is comparable to the static tools offered by Eclipse
  But in the field study, programmer P1 used Eclipse instead of Ferret, as he said he did not want to wait for Ferret
  Is Ferret slow?

Class Discussion

Evaluation, field study: need a more rigorous study
  Basically only 3 spheres were used (the version information sphere, Kenyon, could not be configured)
  Only one programmer used 3 spheres
  Effectively studied with only 2 spheres (static and PDE)
  Not sure how necessary the PDE sphere was
  So probably 2 programmers needed only the static sphere

How easy is it to add more information spheres to Ferret?
