icpc
Patricia Jablonski and Daqing Hou, Clarkson University ~ Potsdam, NY USA
Friday July 2, 2010 ~ ICPC 2010
Source code is copied and pasted for reuse and then it is modified to fit a specific task
The correspondence (similarity) relationship between the clones can be useful information during modification and debugging
Having to manually identify and consistently change clones can be difficult over time
Cloning increases source code maintenance effort and can lead to undetected errors in the code
Proactive tracking of copy-and-paste clones upon creation (using abstract syntax trees)
Features of CnP tested in this user study
◦ CnP clone visualization
Shows the programmer where related clones are
◦ CReN identifier renaming
Assists the programmer in renaming identifiers consistently within clones
◦ LexId substring renaming
Assists the programmer in renaming substrings consistently within clones
Programmers often make small modifications to pasted code (such as to its identifier names and literal constants) so that it fits its task
CReN is an Eclipse plug-in that uses CnP’s clone tracking and visualization, and groups instances of the same identifier within a clone
All instances of the same identifier are renamed together when any one instance is edited by the programmer
This helps prevent inconsistencies (errors)
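The renaming behavior described above can be sketched as follows. This is only an illustration, not CReN's implementation: the slides say CnP tracks clones with abstract syntax trees, whereas this sketch substitutes a plain-text whole-word match restricted to a character range. The class name `CReNSketch`, the method `renameInClone`, and the sample lines are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: text-based stand-in for CReN's AST-based renaming.
class CReNSketch {
    /**
     * Renames every whole-identifier occurrence of oldName to newName,
     * but only inside the clone region [cloneStart, cloneEnd) of source.
     * Code outside the clone is left untouched.
     */
    static String renameInClone(String source, int cloneStart, int cloneEnd,
                                String oldName, String newName) {
        String before = source.substring(0, cloneStart);
        String clone = source.substring(cloneStart, cloneEnd);
        String after = source.substring(cloneEnd);
        // \b keeps "rSlider" from matching inside a longer name such as "brSlider2".
        Pattern wholeIdentifier = Pattern.compile("\\b" + Pattern.quote(oldName) + "\\b");
        String renamed = wholeIdentifier.matcher(clone)
                .replaceAll(Matcher.quoteReplacement(newName));
        return before + renamed + after;
    }

    public static void main(String[] args) {
        // Assumed sample code: the first line is the original, the rest is the pasted clone.
        String original = "rSlider.setMaximum(255);\n";
        String pasted = "rSlider.setMaximum(255);\nrPanel.add(rSlider);\n";
        String source = original + pasted;
        // Editing one instance in the clone renames all instances in that clone.
        System.out.print(renameInClone(source, original.length(), source.length(),
                "rSlider", "bSlider"));
    }
}
```

Restricting the rename to the clone's range is what separates this from a file-wide Find & Replace: the original `rSlider` line outside the clone stays as it was.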
LexId is a separate Eclipse plug-in (part of CnP) that adds onto CReN’s functionality
LexId groups and renames together common substrings between different identifiers within a clone
◦ Recall that CReN renames instances of the same whole identifier name within a clone
LexId focuses on inferring the lexical patterns
LexId divides identifiers into substrings (standard Java CamelCase is supported)
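A minimal sketch of the substring idea, assuming a simple CamelCase splitter in which every uppercase letter starts a new substring. The slides do not describe LexId's actual splitting or pattern inference rules, so `LexIdSketch`, `splitCamelCase`, and `renameToken` are hypothetical names, and runs of capitals (for example "UI") are not handled specially here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of CamelCase splitting and substring renaming.
class LexIdSketch {
    /** Splits an identifier at each uppercase letter, e.g. "rSlider" -> ["r", "Slider"]. */
    static List<String> splitCamelCase(String identifier) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : identifier.toCharArray()) {
            if (Character.isUpperCase(c) && current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            current.append(c);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Replaces each substring equal to oldToken with newToken and rejoins the identifier. */
    static String renameToken(String identifier, String oldToken, String newToken) {
        StringBuilder out = new StringBuilder();
        for (String token : splitCamelCase(identifier)) {
            out.append(token.equals(oldToken) ? newToken : token);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The shared substring "r" is renamed to "g" in two different identifiers.
        System.out.println(renameToken("rPanel", "r", "g"));
        System.out.println(renameToken("rSlider", "r", "g"));
    }
}
```

Because the match is on whole substrings rather than raw characters, renaming "r" leaves an identifier such as `toolPanel` unchanged.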
CnP Clone Visualization Hypotheses
◦ CnP’s clone visualization makes it faster for programmers to find software bugs in copied-and-pasted code than debugging manually or with other tools, when the cloning information is not fresh in their memories.
◦ CnP’s clone visualization makes it faster and less error-prone for programmers to make modifications to copied-and-pasted code than modifying without visualization, when the cloning information is not fresh in their memories.
CReN Identifier Renaming Hypotheses
◦ Using CReN to rename identifier instances consistently in copied-and-pasted code is quicker than performing the same task manually or with other tools.
◦ CReN prevents the inconsistent renaming errors that can otherwise occur.
LexId Substring Renaming Hypotheses
◦ Using LexId to rename substring instances consistently in copied-and-pasted code is quicker than performing the same task manually or with other tools.
◦ LexId prevents the inconsistent renaming errors that can otherwise occur.
Recruitment email was sent to all Clarkson University computer science and engineering undergraduate and graduate students
14 male subjects participated in the user study – 8 undergraduate, 6 graduate students
Knowledge of Java and Swing was required
Familiarity with IDEs, especially Eclipse, was preferred, but not necessary
Subjects had different levels of prior experience and knowledge
Subjects came one at a time to a user study lab for a session between 1 and 2 hours long
Subjects were presented with background about the problems of copy-and-paste clones, the 3 CnP features, and other tools
Subjects were shown the source code and graphical Paint program used for the tasks
Subjects were encouraged to work efficiently (with accuracy and speed)
Subjects were recorded with video/audio
Before starting a task, the current task’s description was read and questions answered
The subject could use the instruction sheet, a sheet with the identifier names labeled, and an online Swing tutorial during the study
The Paint program for the task was run for the subject to see the task’s problem visually
The subjects were told whether the current task involved CnP, and if not, about some other (optional) tools that they could use
Similar tasks were paired together
◦ Task 1 & 2 – debugging, CnP clone visualization
◦ Task 3 & 4 – modification, CnP clone visualization
◦ Task 5 & 6 – renaming, CReN identifier renaming
◦ Task 7 & 8 – renaming, LexId substring renaming
Subjects completed each of the 8 tasks once, alternating with and without the tool
◦ Odd-numbered subjects performed the first task in each pair with tool support, while even-numbered subjects performed the first task without the tool
◦ Tasks were paired rather than pairing the subjects
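The alternation can be written down as a small rule. This sketch assumes subjects are numbered from 1 and that, within each pair, a subject's second task gets the opposite condition from the first (which is implied by "alternating" but not spelled out on the slide). `TaskAssignment` and `usesTool` are hypothetical names.

```java
// Hypothetical sketch of the counterbalanced tool assignment.
class TaskAssignment {
    /**
     * Returns true if the given subject (1-based) performs the given task (1..8)
     * with tool support. Tasks are paired as (1,2), (3,4), (5,6), (7,8).
     */
    static boolean usesTool(int subject, int task) {
        boolean firstTaskInPair = task % 2 == 1;
        boolean oddSubject = subject % 2 == 1;
        // Odd subjects: tool on the first task of each pair.
        // Even subjects: tool on the second task of each pair (assumed).
        return firstTaskInPair == oddSubject;
    }

    public static void main(String[] args) {
        for (int subject = 1; subject <= 2; subject++) {
            StringBuilder row = new StringBuilder("Subject " + subject + ":");
            for (int task = 1; task <= 8; task++) {
                row.append(usesTool(subject, task) ? " T" : " -");
            }
            System.out.println(row);
        }
    }
}
```

Under this rule every subject works each pair once with and once without the tool, so each subject serves as their own comparison within a pair.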
For subjects who had CnP support for a task, the clone groups were already highlighted with different colors in PaintWindow.java
5 clone groups were treated as copy-and-paste clones:
◦ Red (r), green (g), blue (b), thickness (t) slider/panel
◦ Color panel and thickness panel
◦ Tool panel and clear/undo panel
◦ The UI constraints for each of the four panels
◦ The declaration of the two change listeners
Subjects without CnP could use CCFinderX
Task 1: Moving the blue slider does not change the pixel color.
rSlider should be bSlider (on line 120)
Task 2: Moving the thickness slider does not change the pixel thickness.
colorChangeListener should be thicknessChangeListener (on line 142)
Task 3: Add a titled border to colorPanel and to thicknessPanel.
Task 4: Add color to the label of each color slider – red, green, and blue.
Task 5: Rename colorPanel to thicknessPanel and rPanel to tPanel within the clone.
Task 6: Rename toolPanel to clearUndoPanel, pencilButton to clearButton, and eraserButton to undoButton within the clone.
Task 7: Rename rPanel to gPanel and rSlider to gSlider in the green slider clone (shown), and rPanel to bPanel and rSlider to bSlider in the blue slider clone.
Task 8: Rename bPanel to tPanel and bSlider to tSlider in the thickness slider clone.
The time (in minutes) to complete each pair of tasks.
Statistical hypothesis testing on the paired time data.
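The transcript does not name the statistical test that was used. As one common choice for paired data, the sketch below computes a paired t statistic from per-subject time differences; this is an assumption for illustration, and a nonparametric test such as the Wilcoxon signed-rank test would be another reasonable option. `PairedTimes` and `pairedT` are hypothetical names, and the numbers in `main` are placeholders, not results from the study.

```java
// Hypothetical sketch: paired t statistic on per-subject completion times.
class PairedTimes {
    /**
     * Returns t = mean(d) / (sd(d) / sqrt(n)), where d[i] = withoutTool[i] - withTool[i].
     * A positive t means tasks took longer without the tool.
     */
    static double pairedT(double[] withTool, double[] withoutTool) {
        if (withTool.length != withoutTool.length || withTool.length < 2) {
            throw new IllegalArgumentException("need two equal-length samples with n >= 2");
        }
        int n = withTool.length;
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            mean += withoutTool[i] - withTool[i];
        }
        mean /= n;
        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            double deviation = (withoutTool[i] - withTool[i]) - mean;
            sumSquares += deviation * deviation;
        }
        // Sample standard deviation of the differences (n - 1 in the denominator).
        double sd = Math.sqrt(sumSquares / (n - 1));
        return mean / (sd / Math.sqrt(n));
    }

    public static void main(String[] args) {
        // Placeholder values in minutes, for illustration only.
        double[] withTool = {1.0, 2.0, 3.0};
        double[] withoutTool = {2.0, 4.0, 6.0};
        System.out.println(pairedT(withTool, withoutTool));
    }
}
```

The statistic would then be compared against a t distribution with n - 1 degrees of freedom; if all differences are identical the standard deviation is zero and the result is not finite.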
Correct states when running the program or when finished.
Number of subjects who used each location and inspection method for debugging and modification tasks.
Number of times each renaming method was used for renaming tasks.
Confounding factors for clone visualization
◦ Clone visualization is not forced on the user
◦ Subjects would have produced correct solutions to Task 3 if they had made use of cloning information
◦ Varying levels of subjects’ prior experience
Threats to validity
◦ Some subjects had more knowledge/experience
◦ Tasks close to real-world GUI programming tasks
Tool design
◦ Need to further improve the clone visualization
◦ Need to tell programmers exactly what is renamed
Clone visualization
◦ Some other tools use colored bars, markers, rulers
◦ Many others use separate views that programmers need to invoke or relatively complex graphs that they need to learn and understand
Related tools
◦ Find & Replace
◦ Rename Refactoring
◦ Linked Renaming
◦ Rename Type Refactoring
◦ Vaci
User study tested CnP’s clone visualization and renaming features (CReN and LexId)
◦ Renaming with CReN and with LexId was statistically significantly quicker than renaming without them on similar tasks
◦ CReN and LexId both help prevent inconsistent renaming that otherwise happens
◦ Since CnP’s clone visualization was not forced on the user, some subjects may not have used it even when it was present during a particular debugging or modification task
Cloning information may be more useful to people without significant prior knowledge/experience