helping developers help themselves: automatic ...€¦ · how do software engineers understand code...

Post on 20-Jun-2020

3 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

H E L P I N G D E V E L O P E R S H E L P T H E M S E LV E S : A U T O M A T I C D E C O M P O S I T I O N O F C O D E R E V I E W C H A N G E S

M I K E B A R N E T T, C H R I S T I A N B I R D , A N D S H U V E N D U K . L A H I R I M I C R O S O F T R E S E A R C H

J O Ã O B R U N E T F E D E R A L U N I V E R S I T Y O F C A M P I N A G R A N D E ( U F C G ) , B R A Z I L

1

2

25,000 developers 100,000 reviews per month

CodeFlow

The code review process

The Problem

“Fixed #244 by adding inmethod binders in between properties and indexers.

Also refactors expression-bodied member semantic model tests into their own file

and adds some extra tests.”  

3

Developers commit loosely related changes.

It is difficult to review unrelated changes at once.

4

It is difficult to review unrelated changes at once.

4

Literature said so “The  more  files  and  diffs  the  more  rela3vely  independent  changes”  [1]  

“Par3cipants  call  for  a  tool  that  can  automa3cally  decompose  a  change  into  separate  sub-­‐changes“  [2]

Wouldn’t it be nice?

5

Wouldn’t it be nice?

ClusterChanges!

5

6

ClusterChanges

diff regions

7

ClusterChangesStandard set of diff regions

8

ClusterChangesSome of them are syntactically and semantically related.

9

ClusterChanges

Partial program analysis

Roslyn

9

ClusterChanges

Partial program analysis

Roslyn Entities (nodes)

methods, fields, properties etc

Relationships (edges) method calls, field access etc

def-use relationship

10

The relationships

def-use relationship

use$

defini&on(

10

The relationships

use-use relationship

11

The relationships

use-use relationship

defini&on(

use( use(

11

The relationships

12

ClusterChangesnon-trivial partition non-trivial partition trivial partitions

12

ClusterChangesnon-trivial partition non-trivial partition trivial partitions

Diffs within the same method are linked together. We don’t split method across partitions.

2-step EvaluationQuantitative Qualitative

Distribu3on  of  par33ons  

1000  Bing  and  Office  reviews  

Manual  inspec3on  

             100  reviews  

Interviews:  20  developers  from  13  projects

13

Quantitative: trivial partitions

Why?

14

Quantitative: trivial partitions

Why?

14

Quantitative: trivial partitions

Why?e.g. Log Messages

14

Quantitative: non-trivial partitions

15

Quantitative: non-trivial partitions

15

Quantitative: non-trivial partitions

We found 3 false-negatives from 50 Changesets

15

Qualitative: The ground truth

16

Qualitative: The ground truth

16

Qualitative: The ground truth

RQ1: Do developers agree with our decomposition?

16

Qualitative: The ground truth

RQ1: Do developers agree with our decomposition?

16

Qualitative: The ground truth

RQ1: Do developers agree with our decomposition?

16

Qualitative: The ground truth

RQ1: Do developers agree with our decomposition?

RQ2: Can organizing a changeset using our decomposition help reviewers?

16

Firehouse interviews• Monitored review submissions

• Criteria: distance and #partitions

17

Firehouse interviews• Monitored review submissions

• Criteria: distance and #partitions

50mi%&>%80.5%km%

Rush to the scene!

17

RQ1: Do developers agree with our decomposition?

18

RQ1: Do developers agree with our decomposition?

19

RQ1: Do developers agree with our decomposition?

19

RQ1: Do developers agree with our decomposition?

Some trivial should be moved to one of non-trivial partitions.20

RQ1: Do developers agree with our decomposition?

21

RQ1: Do developers agree with our decomposition?

“These were actually two different changes and I actually split them in two different things after this review” [P7].

21

RQ1: Do developers agree with our decomposition?

“These were actually two different changes and I actually split them in two different things after this review” [P7].

“I would like tag partitions” [P14].

21

RQ1: Do developers agree with our decomposition?

22

RQ1: Do developers agree with our decomposition?

“In some sort of hypothetical perfect splitting that read my mind, there is one change in one line which was a variable changed (regular expression) that could be in a different partition. But I would not expect that, because it is difficult” [P9].

22

RQ1: Do developers agree with our decomposition?

23

RQ1: Do developers agree with our decomposition?

[P13] “There is no reason to commit unrelated changes.”

“The tool should be 95% correct or else I would not use it because it would be annoying.”

23

RQ1: Do developers agree with our decomposition?

[P13] “There is no reason to commit unrelated changes.”

“The tool should be 95% correct or else I would not use it because it would be annoying.”

[P17] “What is the why behind it?”

“If you do not have something showing why/how the partitions were created, it is difficult to see its value.”

23

RQ2: Can organizing a changeset using our decomposition help reviewers?

“For large sets it it would be very helpful"

“it is useful because allow different reviewers with different purposes to focus on what they want.”

24

RQ2: Can organizing a changeset using our decomposition help reviewers?

It would speed up the review process.

pre-commit usage

"If I had a way to run this tool before I commit, I would have even considered splitting this partition 2 into a second commit" [P4].

25

And, more importantly…

Please, can I try?

26

Developers mentioned CodeFlow and ask to use or integrate.

Microsoft internal event with developers [Techfest] • hundreds signed up to be notified • quite a few asked for the prototype • a few have joined the project and are now contributing

Related WorkTao et al. How do software engineers understand code changes? An exploratory study in industry. In FSE, 2012.

Kirinuki et al. Hey! are you committing tangled changes? In ICPC, 2014.

Herzig and Zeller: The impact of tangled code changes. In MSR, 2013.

27

Summary

28

Summary

28

Summary

28

Summary

28

Summary

28

Summary

28

Summary

28

top related