Meaningful Variable Names for Decompiled Code:
A Machine Translation Approach
Alan Jaffe, Jeremy Lacomis, Edward J. Schwartz*, Claire Le Goues, and Bogdan Vasilescu
Problem: Obfuscated Variable Names in Code

Minified JavaScript:

Original:
    function callback(error, response, body) {
      if (!error && response.statusCode == 200) {
        var info = JSON.parse(body);
        …

Minified:
    function callback(o, s, a) {
      if (!o && s.statusCode == 200) {
        var c = JSON.parse(a);
        …

Decompiled C Code:

Original:
    cp = buf;
    (void)asxTab(level + 1);
    for (n = asnContents(asn, buf, 512); n > 0; n--) {
      printf(" %02X ", *(cp++));
    }

Decompiled:
    v14 = &v15;
    asxTab(a2 + 1);
    for (v13 = asnContents(a1, &v15, 512LL); v13 > 0; --v13) {
      v9 = (unsigned char*)(v14++);
      printf(" %02X ", *v9);
    }
Minified JavaScript:
• Software is “natural” [Hindle et al., 2011].
• Use large corpora + machine learning to predict better identifier names.
• Corpora are easy to generate!
• Bavishi et al., Context2Name, 2017
• Vasilescu et al., JSNaughty, 2017
• Raychev et al., JSNice, 2015
Can we use similar strategies for decompiled code?
Statistical Machine Translation (SMT)
• Noisy channel model
• English → French: “Go do some research!” → “Va faire de la recherche!”

$$\arg\max_e P(e \mid f) = \arg\max_e \frac{P(f \mid e)\,P(e)}{P(f)} = \arg\max_e P(f \mid e)\,P(e)$$

• $P(f \mid e)$: Translation Model. Probability that f is a translation of e.
• $P(e)$: Language Model. “Fluency” of e.
• Toolkit: MOSES SMT
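The noisy channel objective above can be illustrated with a tiny decoder. This is a minimal sketch, not how Moses works internally: the candidate sentences and every probability in the two tables are made-up values chosen for illustration, and `decode` is a hypothetical helper name.

```python
import math

# Toy translation model P(f | e). All probabilities are invented for
# illustration; a real system estimates them from an aligned corpus.
TRANSLATION_MODEL = {
    ("va faire de la recherche", "go do some research"): 0.6,
    ("va faire de la recherche", "go make of the research"): 0.3,
    ("va faire de la recherche", "research some do go"): 0.6,
}

# Toy language model P(e): higher means more fluent English (also invented).
LANGUAGE_MODEL = {
    "go do some research": 0.05,
    "go make of the research": 0.001,
    "research some do go": 0.0001,
}


def decode(f, candidates):
    """Return the candidate e that maximizes P(f | e) * P(e), in log space."""
    def score(e):
        p_f_given_e = TRANSLATION_MODEL.get((f, e), 0.0)
        p_e = LANGUAGE_MODEL.get(e, 0.0)
        if p_f_given_e == 0.0 or p_e == 0.0:
            return float("-inf")
        return math.log(p_f_given_e) + math.log(p_e)

    return max(candidates, key=score)


best = decode("va faire de la recherche", list(LANGUAGE_MODEL))
print(best)
```

Two candidates tie under the translation model alone, so the language model breaks the tie in favor of the fluent sentence. The denominator P(f) is omitted because it is the same for every candidate.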
SMT Model for Natural Language
• Aligned French/English corpus → translation model
• English corpus → language model

SMT Model for Minified JavaScript
• Aligned original/minified source corpus → translation model
• Original source corpus → language model
Can we use SMT for decompiled code?
SMT Model for Decompiled Code?
• Aligned original/decompiled source corpus → translation model (nontrivial)
• Original source corpus → language model
Difficulty: Decompilation Changes Structure
• Different line count (9 lines vs. 8 lines).
• Different numbers of variables.
• Different types of loops.

Original Source:
    #include <stdio.h>
    int main() {
      int cur = 0;
      while (cur <= 9) {
        printf("%d\n", cur);
        ++cur;
      }
      return 0;
    }

Decompiled Code:
    #include <stdio.h>
    int main() {
      int v1 = 0;
      int v2;
      for (v2 = 0; v2 < 10; ++v2)
        printf("%d\n", v2);
      return v1;
    }
Decompiled Code Corpus Generation

Decompiled Code:
    #include <stdio.h>
    int main() {
      int v1 = 0;
      int v2;
      for (v2 = 0; v2 < 10; ++v2)
        printf("%d\n", v2);
      return v1;
    }

Original Code:
    #include <stdio.h>
    int main() {
      int cur = 0;
      while (cur <= 9) {
        printf("%d\n", cur);
        ++cur;
      }
      return 0;
    }

❌ The decompiled code does not align directly with the original code.

Renamed Decompiled Code:
    #include <stdio.h>
    int main() {
      int v1 = 0;
      int cur;
      for (cur = 0; cur < 10; ++cur)
        printf("%d\n", cur);
      return v1;
    }

✓ The renamed decompiled code aligns with the decompiled code: the two differ only in variable names.
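The renaming step that produces an aligned training pair can be sketched as follows. This is an illustrative simplification, assuming the mapping from decompiler names to original names (here `{"v2": "cur"}`) has already been chosen; `rename_identifiers` is a hypothetical helper, and a plain regular expression would also rewrite matches inside string literals or comments, which a real tool would avoid by working on a parsed representation.

```python
import re


def rename_identifiers(code, mapping):
    """Replace whole-word identifiers in `code` according to `mapping`."""
    if not mapping:
        return code
    # \b keeps "v2" from matching inside longer names such as "v21".
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], code)


decompiled = (
    "int main() {\n"
    "  int v1 = 0;\n"
    "  int v2;\n"
    "  for (v2 = 0; v2 < 10; ++v2)\n"
    '    printf("%d\\n", v2);\n'
    "  return v1;\n"
    "}\n"
)

renamed = rename_identifiers(decompiled, {"v2": "cur"})

# Because only names changed, the two versions align line by line.
aligned_pair = list(zip(decompiled.splitlines(), renamed.splitlines()))
for before, after in aligned_pair:
    print(f"{before:<32} | {after}")
```

Since the structure is untouched, every line of the decompiled code pairs with exactly one line of the renamed code, which is the property the direct original/decompiled pairing lacks.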
Better SMT Model for Decompiled Code
• Aligned renamed/decompiled source corpus → translation model
• Renamed source corpus → language model
Choosing Renamings

Original Code:
    #include <stdio.h>
    int main() {
      int cur = 0;
      while (cur <= 9) {
        printf("%d\n", cur);
        ++cur;
      }
      return 0;
    }

Decompiled Code:
    #include <stdio.h>
    int main() {
      int v1 = 0;
      int v2;
      for (v2 = 0; v2 < 10; ++v2)
        printf("%d\n", v2);
      return v1;
    }

Evidence that v2 in the decompiled code corresponds to cur in the original code:
• Not used as the return value.
• Used inside of a loop.
• Used in a function call.
• Same operations.

Decompiled Code after renaming v2 to cur:
    #include <stdio.h>
    int main() {
      int v1 = 0;
      int cur;
      for (cur = 0; cur < 10; ++cur)
        printf("%d\n", cur);
      return v1;
    }
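One way to picture the matching criteria above is as a comparison of usage signatures. The sketch below is an assumption-laden illustration rather than the authors' algorithm: the feature names, the hand-written signatures, the Jaccard similarity, and the 0.5 threshold are all hypothetical choices.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hand-written usage signatures for the example program (hypothetical features).
original_vars = {
    "cur": {"in_loop", "call_arg:printf", "op:++", "op:compare", "init:0"},
}
decompiled_vars = {
    "v1": {"returned", "init:0"},
    "v2": {"in_loop", "call_arg:printf", "op:++", "op:compare", "init:0"},
}


def choose_renamings(decompiled_vars, original_vars, threshold=0.5):
    """Map each decompiled variable to its most similar original variable.

    Variables whose best similarity falls below `threshold` keep their
    decompiler-assigned name and are left out of the result.
    """
    renamings = {}
    for d_name, d_sig in decompiled_vars.items():
        best_name, best_score = None, 0.0
        for o_name, o_sig in original_vars.items():
            score = jaccard(d_sig, o_sig)
            if score > best_score:
                best_name, best_score = o_name, score
        if best_name is not None and best_score >= threshold:
            renamings[d_name] = best_name
    return renamings


print(choose_renamings(decompiled_vars, original_vars))
```

Here `v1` shares only its initial value with `cur` and is used as the return value, so it stays unrenamed, while `v2` matches on every feature. This greedy version could assign the same original name to two decompiled variables; a fuller treatment would enforce a one-to-one assignment.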
System Architecture
Results and Evaluation

Original:
    my_rc base2_string(base2_handle base2_h, char* buffer, size_t buffer_size)

Decompiled:
    my_rc base2_string(base2_handle a1, char* a2, size_t a3)

Renamed Decompiled:
    my_rc base2_string(base2_handle base2_h, char* buf, size_t len)

Match categories in this example:
• Exact: base2_h → base2_h
• Approx: buffer → buf
• Not a match: buffer_size → len

Results:
• 12.7% Exact
• 16.2% Exact + Approx
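The three match categories can be sketched as a small scoring function. The slides do not define "approx", so the rule used here is an assumption made only for illustration: two names match approximately when, after lowercasing and dropping underscores, one is a prefix of the other. The `classify` helper is hypothetical, and the rates it prints describe this three-parameter example only, not the reported results.

```python
def classify(original, predicted):
    """Label a predicted name as 'exact', 'approx', or 'none'.

    ASSUMPTION: 'approx' means one normalized name is a prefix of the other.
    The actual criterion used in the evaluation is not given in the slides.
    """
    if original == predicted:
        return "exact"
    a = original.lower().replace("_", "")
    b = predicted.lower().replace("_", "")
    if a and b and (a.startswith(b) or b.startswith(a)):
        return "approx"
    return "none"


# (original name, predicted name) pairs from the base2_string example.
pairs = [("base2_h", "base2_h"), ("buffer", "buf"), ("buffer_size", "len")]

labels = [classify(o, p) for o, p in pairs]
exact_rate = labels.count("exact") / len(labels)
exact_or_approx_rate = (labels.count("exact") + labels.count("approx")) / len(labels)

print(labels)
print(f"exact: {exact_rate:.1%}, exact + approx: {exact_or_approx_rate:.1%}")
```

Under this rule `len` counts as a miss for `buffer_size` even though a human reader might find it a reasonable name, which suggests why string-based metrics can understate usefulness.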
Preliminary Investigation: Human Study
• Presented users with short snippets (<50 lines) of decompiled code and asked them to perform various maintenance tasks; answers were graded and timed:

    1 int x = 1;
    2 int y = 0;
    3 while (x <= 5) {
    4   y += 2;
    5   x += 1;
    6 }
    7 printf("%d", y);

  - What is the value of the variable y on line 7?
• For correct answers, the time to answer using our renamings was statistically significantly lower than when using the decompiler names.
Conclusion
• Questions?
• Suggestions?