Recent Advances in Software Engineering in Microsoft Research Judith Bishop

Page 1: Recent Advances in Software Engineering in Microsoft Research Judith Bishop Microsoft Research jbishop@microsoft.com University of Nanjing, 28 May 2015

Recent Advances in Software Engineering in Microsoft Research

Judith Bishop
Microsoft Research
jbishop@microsoft.com

University of Nanjing, 28 May 2015

Hardware
• Statistics
• Trends

Maintenance
• WER, CRANE
• Testing

Prevention
• Z3
• And Friends

Education
• IntelliTest
• Code Hunt

Software runs on hardware – lots of it

Worldwide PC units for personal devices increased by 5% year over year in 1Q14, with sales of basic and utility tablets in emerging markets, plus smartphones, driving total device market growth during the quarter. (Gartner, June 2014)

Connected Devices and The Cloud

Most recent technology shift

Desktop operating system market share

Source: www.netmarketshare.com

Mobile/tablet market share

Source: www.netmarketshare.com

Market share of operating systems in the United States from January 2012 to September 2014

Not Windows

Maintenance

The Challenge for Microsoft

Microsoft ships software to 1 billion users around the world.

We want to
• fix bugs regardless of source (application or OS; software, hardware, or malware)
• prioritize bugs that affect the most users
• generalize the solution to be used by any programmer
• get the solutions out to users most efficiently
• try to prevent bugs in the first place

Debugging in the Large with WER…

[Figure: WER pipeline. Recovered labels: Minidump, !analyze; recovered counts: 5, 17, 23,450,649.]

WER's properties

• The huge database can be mined to prioritize work: fix bugs from the most (not the loudest) users.
• Correlate failures to co-located components: show when a collection of unrelated crashes all contain the same culprit (e.g. a device driver).
• Proven itself "in the wild": found and fixed 5,000 bugs in beta releases of Windows after programmers had found 100,000 with static analysis and model checking tools.

Kirk Glerum, Kinshuman Kinshumann, Steve Greenberg, Gabriel Aul, Vince Orgovan, Greg Nichols, David Grant, Gretchen Loihle, and Galen Hunt, Debugging in the (Very) Large: Ten Years of Implementation and Experience, in SOSP '09, Big Sky, MT, October 2009

Bucketing Mostly Works

One bug can hit multiple buckets
• up to 40% of error reports
• duplicate buckets must be hand triaged

Multiple bugs can hit one bucket
• up to 4% of error reports
• harder to isolate each bug

But if bucketing is wrong 44% of the time?
Solution: scale is our friend
• With billions of error reports, we can throw away a few million
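
The bucketing idea can be sketched in a few lines of Python. This is only an illustration, not WER's implementation: the bucket key (application, module, offset), the field names, and the sample reports are all assumptions made up for the example. It groups reports into buckets, ranks buckets by hit count, and tracks the cumulative share of reports covered, which is the quantity used to prioritize fixes.

```python
from collections import Counter

def bucket_key(report):
    """Hypothetical bucketing key: application, faulting module, and offset."""
    return (report["app"], report["module"], report["offset"])

def rank_buckets(reports):
    """Return (bucket, hits, cumulative share) tuples, largest bucket first."""
    counts = Counter(bucket_key(r) for r in reports)
    total = sum(counts.values())
    ranked, running = [], 0
    for bucket, hits in counts.most_common():
        running += hits
        ranked.append((bucket, hits, running / total))
    return ranked

# Made-up error reports: three distinct failure sites with different frequencies.
reports = (
    [{"app": "word", "module": "driver.sys", "offset": 0x10}] * 5
    + [{"app": "word", "module": "fonts.dll", "offset": 0x2A}] * 3
    + [{"app": "word", "module": "spell.dll", "offset": 0x07}] * 2
)

ranked = rank_buckets(reports)
for bucket, hits, share in ranked:
    print(bucket, hits, f"{share:.0%}")
```

Because the ranking depends only on counts, a modest fraction of misbucketed reports shifts the totals slightly but rarely changes which buckets come first.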

Top 20 Buckets for MS Word 2010

[Figure: relative hit count (0% to 100%) and CDF by bucket number (1 to 20), from a 3-week internal deployment to 9,000 users.]

Just 20 buckets account for 50% of all errors.
Fixing a small number of bugs will help many users.

Hardware: Processor Bug

[Figure: reports as % of peak, by day number (-9 to 19).]

WER helped fix a hardware error.
The manufacturer could have caught this earlier with WER.

WER works because …
… bucketing mostly works

Windows Error Reporting (WER)
• is the first post-mortem reporting system with automatic diagnosis
• is the largest client-server system in the world (by installs)
• helped 700 companies fix 1000s of bugs and billions of errors
• fundamentally changed software development at Microsoft

http://winqual.microsoft.com

CRANE: Risk Prediction and Change Risk Analysis
Goal: to improve hotfix quality and response time

IMPROVING TESTING PROCESSES

Release cycles impact verification process
• Testing becomes bottleneck for development.
• How much testing is enough?
• How reliable and effective are tests?
• When should we run a test?

Kim Herzig, Michaela Greiler, Jacek Czerwonka, Brendan Murphy: The Art of Testing Less without Sacrificing Code Quality, ICSE 2015.

Engineering Process

[Figure: engineering process, from the engineer's desktop to the integration process.]

System and Integration Testing

Quality gates
• Developers have to pass quality gates (no control over test selection)
• Checking system constraints: e.g. compatibility or performance
• Failures are not isolated: they involve human inspection and cause a development freeze for the corresponding branch

System and Integration Testing

Software testing is expensive
• 10k+ gates executed, 1M+ test cases
• Different branches, architectures, languages, …
• Aims to find code issues as early as possible
• Slows down product development

Research Objective

Only run effective and reliable tests
• Not every test performs equally well; this depends on the code base
• Reduce execution frequency of tests that cause false test alarms (failures due to test and infrastructure issues)

Do not sacrifice code quality
• Run every test at least once on every code change
• Eventually find all code defects; taking the risk of finding defects later is OK.

Running fewer tests increases code velocity
• We cannot run all tests on all code changes anymore.
• Identify tests that are more likely to find defects (not coverage).

[Figure: tests placed in four quadrants by effectiveness (high/low) and reliability (high/low). Quadrant labels: high cost, unknown value ($$$$); high cost, low value ($$$$); low cost, good value ($$); low cost, low value ($).]

HISTORIC TEST FAILURE PROBABILITIES

Analyzing past test runs: failure probabilities
• How often did the test fail and detect a code defect?
• How often did the test report a false test alarm?

[Figure: execution history of a quality gate over time, across successive builds.]

These probabilities depend on the execution context!
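
As a rough sketch of how such historic probabilities could drive a skip decision, the Python below estimates both probabilities from past runs in one execution context and compares the expected cost of running a test with the expected cost of skipping it. The class name, the cost parameters, and the exact form of the comparison are illustrative assumptions, not the formula from the cited paper.

```python
from dataclasses import dataclass

@dataclass
class GateHistory:
    """Past results of one test in one execution context (hypothetical record)."""
    runs: int           # how often the test was executed
    true_failures: int  # failures that exposed a real code defect
    false_alarms: int   # failures caused by test or infrastructure issues

    @property
    def p_defect(self):
        # With no history, assume the test always finds defects so it is never skipped.
        return self.true_failures / self.runs if self.runs else 1.0

    @property
    def p_false_alarm(self):
        return self.false_alarms / self.runs if self.runs else 0.0

def should_skip(history, cost_run, cost_inspect, cost_escaped):
    """Skip when the expected cost of skipping is below the expected cost of running."""
    cost_exec = cost_run + history.p_false_alarm * cost_inspect
    cost_skip = history.p_defect * cost_escaped
    return cost_skip < cost_exec

# Assumed unit costs, chosen only for illustration.
flaky = GateHistory(runs=1000, true_failures=0, false_alarms=50)
useful = GateHistory(runs=1000, true_failures=20, false_alarms=0)
print(should_skip(flaky, cost_run=1.0, cost_inspect=20.0, cost_escaped=500.0))
print(should_skip(useful, cost_run=1.0, cost_inspect=20.0, cost_escaped=500.0))
```

Keeping one history record per execution context (branch, architecture, and so on) is what lets the same test be skipped in one place and still run in another.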

Does it Pay Off?

Fewer test executions reduce cost
Taking risk increases cost

• ~11 month period, > 30 million test executions, multiple branches
• ~3 month period, > 1.2 million test executions, single branch
• ~12 month period, > 6.5 million test executions, multiple branches

Across All Products

TABLE I. SIMULATION RESULTS FOR MICROSOFT WINDOWS, OFFICE, AND DYNAMICS.

Measurement             Windows                             Office                              Dynamics
                        Rel. improvement  Cost improvement  Rel. improvement  Cost improvement  Rel. improvement  Cost improvement
Test executions         40.58%            --                34.9%             --                50.36%            --
Test time               40.31%            $1,567,607.76     40.1%             $76,509.24        47.45%            $19,979.03
Test result inspection  33.04%            $61,532.80        21.1%             $104,880.00       32.53%            $2,337,926.40
Escaped defects         0.20%             $11,970.56        8.7%              $75,326.40        13.40%            $310,159.42
Total cost balance                        $1,617,170.00                       $106,063.24                         $2,047,746.01

Results vary
• Branching structure
• Runtime of tests
• We save cost on all products

Fine-tuning possible, better results but not general
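
The total cost balance in Table I can be read as the savings in test time and result inspection minus the added cost of escaped defects. That reading is an assumption, and the short Python check below applies it to the figures listed in the table. Under it, the Windows and Dynamics totals reproduce exactly, and the Office total comes out within a dollar of the listed value.

```python
from decimal import Decimal

# Cost improvements per product, copied from Table I (US dollars).
costs = {
    "Windows": {"test_time": Decimal("1567607.76"),
                "inspection": Decimal("61532.80"),
                "escaped": Decimal("11970.56")},
    "Office": {"test_time": Decimal("76509.24"),
               "inspection": Decimal("104880.00"),
               "escaped": Decimal("75326.40")},
    "Dynamics": {"test_time": Decimal("19979.03"),
                 "inspection": Decimal("2337926.40"),
                 "escaped": Decimal("310159.42")},
}

def balance(product):
    """Savings from skipped executions minus the cost of defects found later."""
    c = costs[product]
    return c["test_time"] + c["inspection"] - c["escaped"]

for name in costs:
    print(name, balance(name))
```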

DYNAMIC & SELF-ADAPTIVE

Probabilities are dynamic (change over time)
• Skipping tests influences risk factors (of higher level branches)
• Tests re-enabled when code quality drops
• Feedback-loop between decision points

[Figure: relative test reduction rate (0% to 70%) over time for Windows 8.1, with the training period marked and the point where tests are automatically enabled again.]

Impact on Development Process

Secondary Improvements
• Machine setup: we may lower the number of machines allocated to the testing process
• Developer satisfaction: removing false test failures increases confidence in the testing process

Development speed
• Impact on development speed is hard to estimate through simulation
• Product teams invest as they believe that removing tests:
  • increases code velocity (at least lower bound)
  • avoids additional changes due to merge conflicts
  • reduces the number of required integration branches, as their main purpose is to test the product

“We used the data your team has provided to cut a bunch of bad content and are running a much leaner BVT system […] we’re panning out to scale about 4x and run in well under 2 hours” (Jason Means, Windows BVT PM)

Prevention

Continual abstraction

Automated Theorem Prover
• Won 19/21 divisions in SMT 2011 Competition
• The most influential tool paper in the first 20 years of TACAS (2014)

Z3 reasons over a combination of logical theories:
• Boolean Algebra
• Bit Vectors
• Linear Arithmetic
• Floating Point
• First-order Axioms
• Non-linear, Reals
• Algebraic Data Types
• Sets/Maps/…

Leonardo de Moura and Nikolaj Bjørner. Satisfiability modulo theories: introduction and applications. Commun. ACM, 54(9):69-77, 2011.

Automated Test Generation and Safety/Termination Checking

SAGE: Binary File Fuzzing
• Symbolic execution of x86 traces to generate new input files
• Z3 theories: bit vectors and arrays
• Fuzzing bugs found in Win7 (over 100s of file parsers): [Figure: share of bugs found by Random + Regression, All Others, and SAGE.]

Corral: Whole Program analysis
• Finds assertion violations using stratified inlining of procedures and calls to Z3
• Z3 theories: arrays, linear arithmetic, bit vectors, un-interpreted functions
• As of Windows Threshold, Corral is the program analysis engine for SDV (Static Driver Verifier)

Validating Network ACLs in the Datacenter

Problem:
• 1000s of devices
• Low level access control lists for different policies
• Updates to Edge ACL can break policies
• Complexity is “inhumane”

Education

IntelliTest in Visual Studio 2015

Available in Visual Studio since 2010 (as Pex and Smart Unit Tests)

Nikolai Tillmann, Jonathan de Halleux, Tao Xie: Transferring an automated test generation tool to practice: from Pex to Fakes and Code Digger. ASE 2014: 385-396

Working and learning for fun

• Enjoyment adds to long term retention on a task
• Discovery is a powerful driver, contrasting with direct instructions
• Gaming joins these two, and is hugely popular
• Can we add these elements to coding?

Code Hunt can!

www.codehunt.com

Code Hunt

• Is a serious programming game
• Works in C# and Java (Python coming)
• Appeals to coders wishing to hone their programming skills
• And also to students learning to code
• Code Hunt has had over 300,000 users since launching in March 2014, with around 1,000 users a day
• Stickiness (loyalty) is very high

Gameplay

1. User writes code in browser
2. Cloud analyzes code – test cases show differences

As long as there are differences: User must adapt code, repeat
When there are no more differences: User wins level!

[Figure: the secret code and the user's code are compared through generated test cases.]
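
The gameplay loop can be sketched in Python as a comparison between a hidden function and the player's attempt. Everything named here (`secret`, `attempt_1`, `attempt_2`, the input range) is made up for illustration. The real game generates its test cases with dynamic symbolic execution, whereas this sketch simply tries a small fixed range of inputs.

```python
def secret(x):
    """Hypothetical hidden puzzle the player has to rediscover."""
    return x * x + 1

def attempt_1(x):
    """First guess by the player: close, but wrong for most inputs."""
    return x + 1

def attempt_2(x):
    """Revised guess after looking at the failing test cases."""
    return x * x + 1

def differences(secret_fn, player_fn, inputs):
    """Return (input, expected, actual) for every input where the two disagree."""
    diffs = []
    for x in inputs:
        expected, actual = secret_fn(x), player_fn(x)
        if expected != actual:
            diffs.append((x, expected, actual))
    return diffs

inputs = range(-3, 4)  # stand-in for generated test cases
for attempt in (attempt_1, attempt_2):
    diffs = differences(secret, attempt, inputs)
    print(attempt.__name__, "wins level" if not diffs else f"differs on {diffs}")
```

The player never sees `secret`; only the rows returned by `differences` are shown, which is what makes the puzzle a matter of inferring behaviour from test cases.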

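Dynamic Symbolic Execution

void CoverMe(int[] a)
{
    if (a == null) return;
    if (a.Length > 0)
        if (a[0] == 1234567890)
            throw new Exception("bug");
}

[Figure: decision tree over the branches a==null, a.Length>0, and a[0]==123…, each with T and F edges.]

Loop: choose next path, solve, execute and monitor.

1. Constraints to solve: (none). Input: null. Observed constraints: a==null
2. Constraints to solve: a!=null. Input: {}. Observed constraints: a!=null && !(a.Length>0)
3. Constraints to solve: a!=null && a.Length>0. Input: {0}. Observed constraints: a!=null && a.Length>0 && a[0]!=1234567890
4. Constraints to solve: a!=null && a.Length>0 && a[0]==1234567890. Input: {123…}. Observed constraints: a!=null && a.Length>0 && a[0]==1234567890

Done: There is no path left.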
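
The "execute and monitor" half of this loop can be illustrated with a Python port of CoverMe that records the branch conditions it passes through. This is a simplified sketch: the port returns "bug" instead of throwing, and the four inputs are supplied by hand, whereas a real dynamic symbolic execution tool would obtain each new input by handing the negated path condition to a constraint solver such as Z3.

```python
def cover_me(a):
    """Python port of CoverMe that also returns the path condition it observed."""
    path = []
    if a is None:
        path.append("a==null")
        return path, None
    path.append("a!=null")
    if len(a) > 0:
        path.append("a.Length>0")
        if a[0] == 1234567890:
            path.append("a[0]==1234567890")
            return path, "bug"  # stands in for the thrown exception
        path.append("a[0]!=1234567890")
    else:
        path.append("!(a.Length>0)")
    return path, None

# Hand-picked inputs, one per path; a solver would normally derive these.
inputs = [None, [], [0], [1234567890]]
seen = {}
for a in inputs:
    path, outcome = cover_me(a)
    seen[" && ".join(path)] = outcome

for condition, outcome in seen.items():
    print(condition, "->", outcome)
```

Four inputs are enough to cover all four paths, and only the last one reaches the failing branch, which random testing would be very unlikely to hit.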

Code Hunt - the APCS (default) Zone

• Opened in March 2014
• 129 problems covering the Advanced Placement Computer Science course
• By August 2014, over 45,000 users started.

[Figure: APCS Zone, first three sectors, 45K to 1K. Players (0 to 50,000) by sector and level (1.1 to 3.8).]

Effect of difficulty on drop off in sectors 1-3

Yellow – Division; Blue – Operators; Green – Sectors

Effect of Puzzle Difficulty on Drop off: Aug 2014 and Feb 2015

[Figure: percentage drop off (-10 to 60) by level (1.1 to 3.8), series Aug and Feb-A.]

Puzzle                              Level  Aug  Feb-A
Compute -X                          1.1    17   22
Compute 4 / X                       1.6    18   21
Compute X-Y                         1.7    18   22
Compute X/Y                         1.11   32   38
Compute X%3+1                       1.13   15   18
Compute 10%X                        1.14   12   16
Construct a list of numbers 0..N-1  2.1    37   48
Construct a list of multiples of N  2.2    19   23
Compute x^y                         3.1    11   18
Compute X! the factorial of X       3.2    16   19
Compute sum of i*(i+1)/2            3.5    17   22

Towards a Course Experience

Public Data release in open source

Total Try Count  Average Try Count  Max Try Count  Total Solved Users
13374            363                1306           1581

For ImCup Sept: 257 users x 24 puzzles x approx. 10 tries = about 13,000 programs

For experimentation on how people program and reach solutions

Github.com/microsoft/code-hunt

Upcoming events

PLOOC 2015 at PLDI 2015, June 14 2015, Portland, OR, USA

CHESE 2015 at ISSTA 2015, July 14, 2015, Baltimore, MD, USA

Worldwide intern and summer school contests

Public Code Hunt Contests are over for the summer

Special ICSE attendees Contest. Register at

aka.ms/ICSE2015

Code Hunt Workshop February 2015

Summary: Code Hunt: A Game for Coding

1. Powerful and versatile platform for coding as a game
2. Unique in working from unit tests not specifications
3. Contest experience fun and robust
4. Large contest numbers with public data sets from cloud data
   • Enables testing of hypotheses and making conclusions about how players are mastering coding, and what holds them up
5. Has potential to be a teaching platform
   • collaborators needed

Websites

Game: www.codehunt.com
Project: research.microsoft.com/codehunt
Community: research.microsoft.com/codehuntcommunity
Data Release: github.com/microsoft/code-hunt
Blogs: Linked on the Project page
Office Mix: mix.office.com

Conclusions

1. Software runs on hardware and hardware is increasingly varied
2. The hardware sector that is growing (mobile) is the most tricky
3. Maintenance increases in complexity with the number of deployments
4. Addressing human factors in large maintenance teams pays off
5. Prevention is a hugely valuable aid to maintenance
6. Gaming is a way for practicing software engineering skills

Thank you! Questions?