Lase: Locating and Applying Systematic Edits by Learning from Examples
Na Meng* Miryung Kim* Kathryn S. McKinley*+
The University of Texas at Austin* Microsoft Research+
Motivating Scenario
Pat needs to update database transaction code to prevent SQL injection attacks.
[Figure: three methods and their updated versions: A_old → A_new, B_old → B_new, C_old → C_new]
Systematic Editing
• Similar but not identical changes to multiple contexts
• Manual, tedious, and error-prone
• Source transformation tools require describing edits in a formal language [CHP91, ER02, LR95]
• Bug fixing tools handle simple stylized code changes [WNGF09, JZDLL12, SMS13]
• Sydit does not find edit locations automatically when applying systematic edits [MKM11]
Lase Workflow
[Figure: workflow. The user selects examples (A_old → A_new, B_old → B_new); LASE selects methods and suggests edits (D_old → D_suggested, I_old → I_suggested, X_old → X_suggested, …)]
Approach Overview
Phase I. Create edit script
– Syntactic program differencing (A_old vs. A_new, B_old vs. B_new)
– Identify common edit
– Generalize identifier: Object next = e.next(); and Object next = iter.next(); become Object next = v$0.next();
– Extract context
Phase II. Find edit location (in the figure, C_old: ✗ no match; D_old: ✔ match)
Phase III. Apply edit script (D_old → D_new)
Step 1. Syntactic Program Differencing
| Operation | Definition |
| --- | --- |
| insert(u, v, k) | insert node u and position it as the (k+1)th child of node v |
| delete(u) | delete node u |
| update(u, v) | replace u with v |
| move(u, v, k) | delete u from its current position and insert u as the (k+1)th child of v |

Input: m_old, m_new
Output: Edit operations
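
The four operations can be pictured on a toy tree. The sketch below is only an illustration, not Lase's implementation: the `AstNode` class, its fields, and `render()` are hypothetical names, `update` is modeled as a plain label change, and `k` is treated as a 0-based index so that `insert(u, v, k)` makes `u` the (k+1)th child of `v`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical AST node: a label plus an ordered list of children. */
final class AstNode {
    String label;
    AstNode parent;
    final List<AstNode> children = new ArrayList<>();

    AstNode(String label) {
        this.label = label;
    }

    /** insert(u, v, k): make u the (k+1)th child of v (k is a 0-based index). */
    static void insert(AstNode u, AstNode v, int k) {
        v.children.add(k, u);
        u.parent = v;
    }

    /** delete(u): detach u from its parent. */
    static void delete(AstNode u) {
        if (u.parent != null) {
            u.parent.children.remove(u);
            u.parent = null;
        }
    }

    /** update(u, v): replace u with v, modeled here as copying v's label. */
    static void update(AstNode u, AstNode v) {
        u.label = v.label;
    }

    /** move(u, v, k): delete u from its current position, then insert it under v. */
    static void move(AstNode u, AstNode v, int k) {
        delete(u);
        insert(u, v, k);
    }

    /** Renders the subtree as label(child, child, ...). */
    String render() {
        if (children.isEmpty()) {
            return label;
        }
        StringBuilder sb = new StringBuilder(label).append('(');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(children.get(i).render());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        AstNode body = new AstNode("body");
        AstNode print = new AstNode("print");
        insert(print, body, 0);
        insert(new AstNode("decl"), body, 0); // decl becomes the first child
        AstNode loop = new AstNode("while");
        insert(loop, body, 2);
        move(print, loop, 0);                 // print moves under the loop
        System.out.println(body.render());    // prints "body(decl, while(print))"
    }
}
```

An edit script in this picture is simply an ordered list of such operations that turns the tree of m_old into the tree of m_new.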
Step 2: Identify Common Edit
• Longest Common Edit Operation Subsequence
Edit script A:
insert(Object next = e.next()…)
insert(if(next instanceof MVAction))
insert(((MVAction)next).update())
update(print(next.toString())) to …

Edit script B:
insert(Object next = iter.next()…)
update(print(next.getString())) to …
insert(if(next instanceof MVAction))
insert(((MVAction)next).update())
delete(System.out.println(…))

Common edit operations in A:
insert(Object next = e.next()…)
insert(if(next instanceof MVAction))
insert(((MVAction)next).update())

Common edit operations in B:
insert(Object next = iter.next()…)
insert(if(next instanceof MVAction))
insert(((MVAction)next).update())
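
A longest common subsequence over two operation lists can be computed with the textbook dynamic program. The sketch below is a simplification under stated assumptions: each operation is a plain string, two operations count as equal only if their strings are equal, and the differing iterator name (`e` vs. `iter`) has already been replaced by a made-up placeholder `$it`. The class and method names are hypothetical, and how Lase itself compares operations is not shown on the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CommonEdit {
    /** Textbook dynamic-programming longest common subsequence over two operation lists. */
    static List<String> longestCommonSubsequence(List<String> a, List<String> b) {
        // len[i][j] = length of the LCS of a[i..] and b[j..]
        int[][] len = new int[a.size() + 1][b.size() + 1];
        for (int i = a.size() - 1; i >= 0; i--) {
            for (int j = b.size() - 1; j >= 0; j--) {
                len[i][j] = a.get(i).equals(b.get(j))
                        ? len[i + 1][j + 1] + 1
                        : Math.max(len[i + 1][j], len[i][j + 1]);
            }
        }
        // Walk the table from the start to recover one longest subsequence.
        List<String> common = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            if (a.get(i).equals(b.get(j))) {
                common.add(a.get(i));
                i++;
                j++;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                i++;
            } else {
                j++;
            }
        }
        return common;
    }

    public static void main(String[] args) {
        List<String> scriptA = Arrays.asList(
                "insert(Object next = $it.next())",
                "insert(if (next instanceof MVAction))",
                "insert(((MVAction) next).update())",
                "update(print(next.toString()))");
        List<String> scriptB = Arrays.asList(
                "insert(Object next = $it.next())",
                "update(print(next.getString()))",
                "insert(if (next instanceof MVAction))",
                "insert(((MVAction) next).update())",
                "delete(System.out.println(...))");
        // Only the three insert operations survive; the update and delete differ.
        for (String op : longestCommonSubsequence(scriptA, scriptB)) {
            System.out.println(op);
        }
    }
}
```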
Step 3: Generalize Identifier
• Keep the original type, method, and variable names if examples agree
• Abstract identifiers if examples disagree
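| Map | Generalized name | Name in m_A | Name in m_B |
| --- | --- | --- | --- |
| Variable Map | next | next | next |
| Variable Map | v$0 | e | iter |
| Method Map | next | next | next |
| Type Map | Object | Object | Object |
| Type Map | Iterator | Iterator | Iterator |

m_A: Object next = e.next();
m_B: Object next = iter.next();
Generalized: Object next = v$0.next();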
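
The keep-or-abstract rule can be sketched on two aligned token lists. This is a minimal illustration, assuming the two statements have already been tokenized and aligned position by position; the `IdentifierGeneralizer` class is hypothetical, and it abstracts every disagreement to a `v$n` name without distinguishing the variable, method, and type maps shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class IdentifierGeneralizer {
    /** Maps each disagreeing (name in A, name in B) pair to one abstract name such as v$0. */
    private final Map<String, String> abstractNames = new LinkedHashMap<>();

    /** Generalizes two aligned token lists of equal length. */
    List<String> generalize(List<String> tokensA, List<String> tokensB) {
        if (tokensA.size() != tokensB.size()) {
            throw new IllegalArgumentException("token lists must be aligned");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < tokensA.size(); i++) {
            String a = tokensA.get(i);
            String b = tokensB.get(i);
            if (a.equals(b)) {
                result.add(a); // examples agree: keep the original name
            } else {
                // Examples disagree: reuse the abstract name for this pair, or mint a new one.
                String key = a + "\u0000" + b;
                result.add(abstractNames.computeIfAbsent(key, k -> "v$" + abstractNames.size()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        IdentifierGeneralizer g = new IdentifierGeneralizer();
        List<String> a = Arrays.asList("Object", "next", "=", "e", ".", "next", "(", ")", ";");
        List<String> b = Arrays.asList("Object", "next", "=", "iter", ".", "next", "(", ")", ";");
        System.out.println(String.join(" ", g.generalize(a, b)));
    }
}
```

Keeping the pair-to-name map on the instance means a later statement that again pairs `e` with `iter` receives the same `v$0`, which is what makes the generalized script internally consistent.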
Step 4: Extract Context
Edited statements: Object next = e.next(); (m_A) and Object next = iter.next(); (m_B)

Context in m_A:
Iterator e = fActions.values().iterator();
… …
while(e.hasNext())

Context in m_B:
Iterator iter = getActions().values().iterator();
… …
while(iter.hasNext())

Generalized context:
Iterator v$0 = u$0:FieldAccessOrMethodInvocation.values().iterator();
… …
while(v$0.hasNext())

| Map | Generalized name | Name in m_A | Name in m_B |
| --- | --- | --- | --- |
| Uncertain Map | u$0:FieldAccessOrMethodInvocation | fActions | getActions() |
| Variable Map | v$0 | e | iter |
| Method Map | values | values | values |
| Method Map | iterator | iterator | iterator |
| Method Map | hasNext | hasNext | hasNext |
| Type Map | Iterator | Iterator | Iterator |
Phase II. Find Edit Locations
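Candidate method D_old:
Iterator e = fActions.values().iterator();

Generalized context:
Iterator v$0 = u$0:FieldAccessOrMethodInvocation.values().iterator();

| Map | Generalized name | Name in m_D |
| --- | --- | --- |
| Uncertain Map | u$0:FieldAccessOrMethodInvocation | fActions |
| Variable Map | v$0 | e |
| Method Map | values | values |
| Method Map | iterator | iterator |
| Type Map | Iterator | Iterator |

[Figure: the edit script learned from A_old → A_new and B_old → B_new is matched against D_old]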
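
Matching a generalized context against a candidate method amounts to binding each abstract name to one concrete name, consistently. The sketch below is a token-level simplification and not Lase's matcher: the deck describes matching on syntax trees, whereas here every placeholder (written `v$n` or `u$n`, without the type annotation) matches exactly one token, so a `u$0` that stands for a whole expression such as `getActions()` is out of scope. `ContextMatcher` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class ContextMatcher {
    static boolean isPlaceholder(String token) {
        return token.startsWith("v$") || token.startsWith("u$");
    }

    /** Returns the placeholder bindings if the candidate matches the pattern. */
    static Optional<Map<String, String>> match(List<String> pattern, List<String> candidate) {
        if (pattern.size() != candidate.size()) {
            return Optional.empty();
        }
        Map<String, String> bindings = new LinkedHashMap<>();
        for (int i = 0; i < pattern.size(); i++) {
            String p = pattern.get(i);
            String c = candidate.get(i);
            if (isPlaceholder(p)) {
                String bound = bindings.putIfAbsent(p, c);
                if (bound != null && !bound.equals(c)) {
                    return Optional.empty(); // same placeholder, two different names
                }
            } else if (!p.equals(c)) {
                return Optional.empty();     // concrete token differs
            }
        }
        return Optional.of(bindings);
    }

    public static void main(String[] args) {
        List<String> pattern = Arrays.asList(
                "Iterator", "v$0", "=", "u$0", ".", "values", "(", ")", ".", "iterator", "(", ")", ";");
        List<String> methodD = Arrays.asList(
                "Iterator", "e", "=", "fActions", ".", "values", "(", ")", ".", "iterator", "(", ")", ";");
        // For m_D this binds v$0 to e and u$0 to fActions, as in the table above.
        System.out.println(match(pattern, methodD));
    }
}
```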
Phase III. Applying Edit Script
• Customize general edit scripts
– Identifier concretization
– Edit position concretization
• Apply the customized edit scripts
Example 1:
  Comment[] getLeadingComments(ASTNode node) {
-   if (this.leadingComments != null) {
+   if (this.leadingPtr >= 0) {
-     int[] range = (int[]) this.leadingComments.get(node);
+     int[] range = null;
+     for (int i = 0; range == null && i <= this.leadingPtr; i++) {
+       if (this.leadingNodes[i] == node)
+         range = this.leadingIndexes[i];
+     }
      if (range != null) {
        … …
        return leadingComments;
      }
    }
    return null;
  }

Example 2:
  Comment[] getTrailingComments(ASTNode node) {
-   if (this.trailingComments != null) {
+   if (this.trailingPtr >= 0) {
-     int[] range = (int[]) this.trailingComments.get(node);
+     int[] range = null;
+     for (int i = 0; range == null && i <= this.trailingPtr; i++) {
+       if (this.trailingNodes[i] == node)
+         range = this.trailingIndexes[i];
+     }
      if (range != null) {
        … …
        return trailingComments;
      }
    }
    return null;
  }

Abstract edit script:
update (if (this.v$0 != null)) to (if (this.v$1 >= 0))
insert (int[] range = null; …)
insert (for (int i = 0; range == null && i <= this.v$1; i++) …)
insert (if (this.v$2[i] == node) …)
insert (range = this.v$3[i]; …)
delete (int[] range = (int[]) this.v$0.get(node);)
Found location:
public int getExtendedEnd (ASTNode node) {
  int end = node.getStartPosition() + node.getLength();
  if (this.trailingComments != null) {
    int[] range = (int[]) this.trailingComments.get(node);
    if (range != null) {
      … …
    }
  } else {
    … …
  }
  return end - 1;
}

Suggested version:
public int getExtendedEnd (ASTNode node) {
  int end = node.getStartPosition() + node.getLength();
  if (this.v$1 >= 0) {
    int[] range = null;
    for (int i = 0; range == null && i <= this.v$1; i++) {
      if (this.v$2[i] == node)
        range = this.v$3[i];
    }
    if (range != null) {
      … …
    }
  } else {
    … …
  }
  return end - 1;
}
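
Identifier concretization can be sketched as a substitution of bound placeholders. In the suggested version above, `v$1`, `v$2`, and `v$3` remain abstract, presumably because they occur only in the inserted code and so cannot be bound from the found location; the sketch mirrors that by leaving unbound placeholders untouched. The `Concretizer` class, its regular expression, and the string-based representation are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Concretizer {
    /** Abstract names of the form v$0, v$1, ... or u$0, u$1, ... */
    private static final Pattern PLACEHOLDER = Pattern.compile("[vu]\\$\\d+");

    /** Replaces each bound placeholder with its concrete name; unbound ones stay as-is. */
    static String concretize(String abstractCode, Map<String, String> bindings) {
        Matcher m = PLACEHOLDER.matcher(abstractCode);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = bindings.getOrDefault(m.group(), m.group());
            // quoteReplacement keeps a literal '$' from being read as a group reference.
            m.appendReplacement(out, Matcher.quoteReplacement(name));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> bindings = new LinkedHashMap<>();
        bindings.put("v$0", "trailingComments"); // bound while matching getExtendedEnd
        System.out.println(concretize("int[] range = (int[]) this.v$0.get(node);", bindings));
        System.out.println(concretize("if (this.v$1 >= 0)", bindings)); // v$1 is unbound
    }
}
```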
Outline
• Phase I: Creating Abstract Edit Scripts
– Syntactic Program Diff
– Identify Common Edit
– Generalize Identifier
– Extract Context
• Phase II: Find Edit Locations
• Phase III: Apply Edit Script
• Evaluation
Test Suite
• 24 repetitive bug fixes that require multiple check-ins [Park et al., MSR 2012]
– 2 from Eclipse JDT and 22 from Eclipse SWT
– Each bug is fixed in multiple commits
– Clones of at least two lines between patches checked in at different times
– We use the first two changed methods as input examples
• 37 systematic edits that require similar changes to different methods
RQ1: Precision, Recall, and Accuracy
Precision (P): What percentage of all found locations are correctly identified?
Recall (R): What percentage of all expected locations are correctly identified?
Accuracy (A): How similar is the Lase-generated version to the developer-generated version?
On average, Lase finds edit locations with 99% precision, 89% recall, and 91% accuracy.
For three bugs, Lase suggests in total 9 edits that developers missed and later confirmed.
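Column groups: Edit Location, Operations

| Index | Bug (patches) | m_i | Σ | ✔ | P% | R% | A% | E | C | AE% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | 82429 (2) | 16 | 13 | 12 | 92 | 75 | 81 | 9 | 9 | 100 |
| 4 | 139329 (3) | 6 | 2 | 2 | 100 | 33 | 74 | 6 | 3 | 50 |
| 7 | 103863 (5) | 7 | 7 | 7 | 100 | 100 | 100 | 34 | 34 | 100 |
| 8 | 129314 (3) | 3 | 4 | 4 | 100 | 100 | 100 | 2 | 2 | 100 |
| 16 | 95409 (3) | 7 | 9 | 9 | 100 | 100 | 78 | 4 | 4 | 100 |
| 24 | 98198 (2) | 9 | 15 | 15 | 100 | 100 | 95 | 3 | 3 | 100 |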
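
Precision and recall as defined under RQ1 reduce to two ratios. The sketch below recomputes the row for index 2 under an assumed reading of the columns: m_i is the number of expected locations, the sum column is the number of found locations, and the check-mark column is the number of correct ones. That reading reproduces the reported 92% and 75% for this row, but it is an interpretation; in rows where Lase found edits the developers had missed, the expected count evidently differs from m_i. `LocationMetrics` is a hypothetical helper.

```java
final class LocationMetrics {
    /** Percentage of found locations that are correct, rounded to the nearest integer. */
    static long precisionPercent(int correct, int found) {
        return Math.round(100.0 * correct / found);
    }

    /** Percentage of expected locations that were correctly found, rounded likewise. */
    static long recallPercent(int correct, int expected) {
        return Math.round(100.0 * correct / expected);
    }

    public static void main(String[] args) {
        // Index 2, assumed reading: 16 expected, 13 found, 12 correct.
        System.out.println("P% = " + precisionPercent(12, 13)); // prints "P% = 92"
        System.out.println("R% = " + recallPercent(12, 16));    // prints "R% = 75"
    }
}
```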
RQ2: Sensitivity to number of exemplar edits
• 7 cases in the oracle data set
• Enumerate subsets of exemplar edits
| Index | # of exemplars | P% | R% | A% |
| --- | --- | --- | --- | --- |
| 4 | 1 | 100 | 17 | 100 |
| 4 | 2 | 100 | 51 | 72 |
| 4 | 3 | 100 | 82 | 67 |
| 4 | 4 | 100 | 96 | 67 |
| 4 | 5 | 100 | 100 | 67 |
| 7 | 1 | 100 | 59 | 100 |
| 7 | 2 | 100 | 83 | 100 |
| 7 | 3 | 100 | 84 | 100 |
| 7 | 4 | 100 | 88 | 100 |
| 7 | 5 | 100 | 92 | 100 |
| 7 | 6 | 100 | 96 | 100 |
| 12 | 1 | 100 | 54 | 92 |
| 12 | 2 | 78 | 90 | 85 |
| 12 | 3 | 49 | 98 | 83 |
| 12 | 4 | 31 | 100 | 82 |

As the number of exemplar edits increases:
• P does not change because exemplar edits are similar, except for case 12
• R is more sensitive to the number of exemplar edits
• R increases as a function of exemplar edits
• A decreases when exemplar edits are different
• A remains the same or increases when the exemplar edits are very similar
Conclusion
• Lase automates edit location search and program transformation application
• Lase achieves 99% precision, 89% recall, and 91% accuracy
• Future Work
– Integrate with automated compilation and testing
– Automatically detect repetitive change examples to infer program transformations
TOOL DEMO: @MARINA, 13:30 ON FRIDAY
Acknowledgement
• This work was supported in part by the National Science Foundation under grants CAREER-1149391, CCF-1117902, CCF-1043810, SHF-0910818, CCF-1018271, CCF-0811524, and a Microsoft SEIF award
Thank you!
References I
• [Meng et al. 2011] Na Meng, Miryung Kim, and Kathryn S. McKinley. Systematic editing: Generating program transformations from an example. In PLDI ’11.
• [Kamiya et al. 2002] Toshihiro Kamiya, Shinji Kusumoto, and Katsuro Inoue. CCFinder: A multilinguistic token-based code clone detection system for large scale source code. In TSE ’02.
• [Lozano et al. 2004] Antoni Lozano and Gabriel Valiente. On the maximum common embedded subtree problem for ordered trees. In C. Iliopoulos and T. Lecroq, editors, String Algorithmics, 2004.
• [Park et al. MSR 2012] J. Park, M. Kim, B. Ray, and D.-H. Bae. An empirical study of supplementary bug fixes. In MSR ’12.
References II
• [JZDLL12] G. Jin, W. Zhang, D. Deng, B. Liblit, and S. Lu. Automated concurrency bug fixing. In PLDI ’12.
• [CHP91] J. R. Cordy, C. D. Halpern, and E. Promislow. Txl: A rapid prototyping system for programming language dialects. Computer Languages, 1991.
• [G10] S. Gulwani. Dimensions in program synthesis. In PPDP ’10.
• [WNGF09] W. Weimer, T. Nguyen, C. Le Goues, and S. Forrest. Automatically finding patches using genetic programming. In ICSE ’09.
References III
• [ER02] M. Erwig and D. Ren. A rule-based language for programming software updates. In RULE ’02.
• [LR95] D. A. Ladd and J. C. Ramming. A*: A language for implementing language processors. In TSE ’95.
• [SMS13] S. Son, K. S. McKinley, and V. Shmatikov. Fix Me Up: Repairing access-control bugs in web applications. In NDSS ’13.
Step 4: Common Edit Context Extraction
• Extract all potential common context
• Refine the common context
– Consistent identifier mapping
– Embedded subtree isomorphism
– Program dependence equivalence
Step 4: Common Edit Context Extraction (1/4)
• Finding common text with clone detection (CCFinder [Kamiya et al. 2002])
Step 4: Common Edit Context Extraction (2/4)
• Identifier generalization

m_A:
Iterator e = fActions.values().iterator();
while (e.hasNext()) {

m_B:
Iterator iter = getActions().values().iterator();
while (iter.hasNext()) {

Generalized:
Iterator v$0 = u$0:FieldAccessOrMethodInvocation.values().iterator();
while (v$0.hasNext()) {

| Map | Abstract identifier | Identifier in m_A | Identifier in m_B |
| --- | --- | --- | --- |
| Uncertain Map | u$0:FieldAccessOrMethodInvocation | fActions | getActions() |
| Variable Map | v$0 | e | iter |
| Method Map | values | values | values |
| Method Map | iterator | iterator | iterator |
| Method Map | hasNext | hasNext | hasNext |
| Type Map | Iterator | Iterator | Iterator |
Step 4: Common Edit Context Extraction (3/4)
• Maximum Common Embedded Subtree Extraction (MCESE) [Lozano et al. 2004]
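[Figure: two ordered trees and their maximum common embedded subtree. The chain 1 → 2 → 3 is encoded as 1,2,3,-3,-2,-1; the tree with root 1 and children 2 and 3 is encoded as 1,2,-2,3,-3,-1; the common embedded subtree 1 → 2 is encoded as 1,2,-2,-1]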
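
The sequences on this slide are consistent with a depth-first encoding that emits a node's label on entry and its negation on exit. The sketch below reproduces the two encodings and checks that the common sequence 1,2,-2,-1 appears in order in both; it does not implement the MCESE algorithm of [Lozano et al. 2004], and `TreeEncoder`, `Tree`, and `isSubsequence` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TreeEncoder {
    /** Hypothetical ordered tree with integer labels. */
    static final class Tree {
        final int label;
        final List<Tree> children;

        Tree(int label, Tree... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Emits +label when entering a node and -label when leaving it. */
    static List<Integer> encode(Tree t) {
        List<Integer> out = new ArrayList<>();
        encodeInto(t, out);
        return out;
    }

    private static void encodeInto(Tree t, List<Integer> out) {
        out.add(t.label);
        for (Tree child : t.children) {
            encodeInto(child, out);
        }
        out.add(-t.label);
    }

    /** True if every element of sub appears in seq in the same relative order. */
    static boolean isSubsequence(List<Integer> sub, List<Integer> seq) {
        int matched = 0;
        for (Integer x : seq) {
            if (matched < sub.size() && sub.get(matched).equals(x)) {
                matched++;
            }
        }
        return matched == sub.size();
    }

    public static void main(String[] args) {
        Tree chain = new Tree(1, new Tree(2, new Tree(3)));
        Tree fork = new Tree(1, new Tree(2), new Tree(3));
        List<Integer> common = Arrays.asList(1, 2, -2, -1);
        System.out.println(encode(chain)); // prints "[1, 2, 3, -3, -2, -1]"
        System.out.println(encode(fork));  // prints "[1, 2, -2, 3, -3, -1]"
        System.out.println(isSubsequence(common, encode(chain))
                && isSubsequence(common, encode(fork))); // prints "true"
    }
}
```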
Step 4: Common Edit Context Extraction (4/4)
• Program dependence analysis
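| Map | Abstract identifier | Identifier in m_A | Identifier in m_B |
| --- | --- | --- | --- |
| Variable Map | v$0 | e | iter |
| Method Map | values | values | values |
| … | … | … | … |

[Figure: program dependences among the following statements of m_A]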
Object next = e.next();
while (e.hasNext()) {
Iterator e = fActions.values().iterator();
What if there are more than two examples?
[Figure: examples A_old → A_new, B_old → B_new, C_old → C_new, D_old → D_new; edit scripts E_AB, E_AC, E_AD, then E_ABC and E_ACD, then E_ABCD]
public void setBackgroundPattern (Pattern pattern){
  if (handle == 0) SWT.error(SWT.ERROR_GRAPHIC_DISPOSED);
  if (pattern == null) SWT.error(SWT.ERROR_NULL_ARGUMENT);
  if (pattern.isDisposed()) SWT.error(SWT.ERROR_INVALID_ARGUMENT);
  initGdip(false, false);
  if (data.gdipBrush != 0) destroyGdipBrush(data.gdipBrush);
  data.gdipBrush = Gdip.Brush_Clone(pattern.handle);
  data.backgroundPattern = pattern;
}

public void setBackgroundPattern (Pattern pattern){
  if (handle == 0) SWT.error(SWT.ERROR_GRAPHIC_DISPOSED);
  if (pattern != null && pattern.isDisposed()) SWT.error(SWT.ERROR_INVALID_ARGUMENT);
  initGdip(false, false);
  if (data.gdipBrush != 0) destroyGdipBrush(data.gdipBrush);
  if (pattern != null) {
    data.gdipBrush = Gdip.Brush_Clone(pattern.handle);
  } else {
    data.gdipBrush = 0;
  }
  data.backgroundPattern = pattern;
}