The Psychology of C# Analysis

Eric Lippert, C# Analysis Architect, Coverity

Upload: coverity

Post on 13-Jun-2015


DESCRIPTION

Our C# expert Eric Lippert provides his take on the psychology of C# analysis, including the business case for C#, developer characteristics and analysis tools.

TRANSCRIPT

Page 1: The Psychology of C# Analysis

The Psychology of C# Analysis

Eric Lippert

C# Analysis Architect

Coverity

Page 2: The Psychology of C# Analysis

Intro

Page 3: The Psychology of C# Analysis
Page 4: The Psychology of C# Analysis

Intro

• Psychological factors in language design…

• … and compiler error messages…

• … and static analysis tools…

• … and funny pictures of cats.

Page 5: The Psychology of C# Analysis

Who is this guy?

• Compiler developer / language designer at Microsoft from 1996 through 2012
  • Visual Basic, VBScript, JScript, VS Tools for Office, C# / Roslyn

• Static analysis architect for C# at Coverity since January
  • I will use “we” totally inconsistently

• I have no formal background in static analysis
  • I take an engineering rather than academic approach

Page 6: The Psychology of C# Analysis

This guy is you, not me

Page 7: The Psychology of C# Analysis

Body

Page 8: The Psychology of C# Analysis
Page 9: The Psychology of C# Analysis

The business case for C#

Page 10: The Psychology of C# Analysis

The business case for C#

• Productive, successful professional developers who target Microsoft platforms make those platforms more attractive to Microsoft’s customers

• Original design goal was “a simple, modern, general-purpose language”
  • Any language with an 800-page specification is no longer simple, but modern and general-purpose still apply

• Understanding developer psychology is key to achieving wide adoption of any developer tool

Page 11: The Psychology of C# Analysis

Target C# Developer Characteristics

• Professionals, not amateurs

• Engineers, not hackers

• Programming experts, not line-of-business experts

• Pragmatists, not academics

• Skeptics, not true believers

• Conservatives, not radicals

Page 12: The Psychology of C# Analysis

Conservatism

Page 13: The Psychology of C# Analysis

Conservatism

• C# developers hate breaking changes imposed by tools

• Even trivial breaking changes are agonized over

• In 11 years and 6 releases C# has never added a new reserved keyword
  • New keywords are contextual so as to not be breaking

• This imposes considerable restrictions on new syntaxes

• For example, consider iterator blocks:

double yield = 123.4;

yield return yield;
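The two lines above only make sense inside an iterator block. A minimal sketch of a complete program, assuming hypothetical names `YieldDemo` and `Values`, shows that `yield` stays an ordinary identifier except directly before `return` or `break`:

```csharp
using System;
using System.Collections.Generic;

static class YieldDemo
{
    // 'yield' is contextual: it is only special immediately before
    // 'return' or 'break', so older code that used it as a variable
    // name keeps compiling.
    public static IEnumerable<double> Values()
    {
        double yield = 123.4;   // 'yield' as a local variable name
        yield return yield;     // 'yield return' statement producing that variable
    }

    static void Main()
    {
        foreach (double value in Values())
        {
            Console.WriteLine(value);
        }
    }
}
```

The sequence produced by `Values()` contains exactly one element, the value stored in the local named `yield`.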

Page 14: The Psychology of C# Analysis

Conservatism

• C# app developers also hate breaking their users
  • Facilitating versionable components was a pri 1 design goal

• Numerous seemingly-counterintuitive features actually mitigate brittle-base-class failures:

class Base { public void M(int x) { } }
class Derived : Base { public void M(double x) { } }
...
derived.M(123); // Base.M or Derived.M?
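To answer the question in the comment, here is a small runnable variant of the slide's example. The methods return strings so the chosen overload is observable, and the `OverloadDemo` wrapper is a hypothetical addition. C# overload resolution prefers an applicable method declared in the more derived class, so adding `M(int)` to a base class later does not silently redirect existing calls.

```csharp
using System;

class Base
{
    public string M(int x) { return "Base.M(int)"; }
}

class Derived : Base
{
    public string M(double x) { return "Derived.M(double)"; }
}

static class OverloadDemo
{
    public static string Call()
    {
        Derived derived = new Derived();
        // 123 converts implicitly to double, so Derived.M is applicable.
        // Methods declared in the more derived type win over base-class
        // methods, even when the base method is an exact match.
        return derived.M(123);
    }

    static void Main()
    {
        Console.WriteLine(Call()); // prints "Derived.M(double)"
    }
}
```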

Page 15: The Psychology of C# Analysis

Conservatism

Page 16: The Psychology of C# Analysis

Conservatism

C# 4.0 added dynamic dispatch to facilitate interoperability with dynamic languages and “legacy” object models

• Enormous MVP community pushback

• I will use this feature correctly but my coworkers are going to abuse it and then I’m going to have to fix their god-awful hacked-up code

• Anything that makes the compiler less capable of finding bugs is met with skepticism and resistance

• Completely redesigned based on early feedback
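The skepticism is easier to see with a concrete case. The sketch below (all names hypothetical) shows that a misspelled or missing member on a `dynamic` value compiles cleanly and only fails when the line runs:

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

static class DynamicDemo
{
    // Member lookup on a dynamic value is deferred to run time.
    public static int LengthOf(object value)
    {
        dynamic d = value;
        return d.Length;
    }

    // The compiler accepts a call to a member that does not exist;
    // the failure surfaces as an exception when the call executes.
    public static bool MissingMemberFailsAtRuntime(object value)
    {
        dynamic d = value;
        try
        {
            d.NoSuchMember();
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(LengthOf("hello"));                    // prints 5
        Console.WriteLine(MissingMemberFailsAtRuntime("hello")); // prints True
    }
}
```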

Page 17: The Psychology of C# Analysis

Error reporting psychology

FAIL

Page 18: The Psychology of C# Analysis

Error reporting psychology

• Dealing with correct code is literally the smallest problem

• “Roslyn” does syntactic analysis of broken code in the time between keystrokes; semantic analysis takes a little longer

• Error messages need to be understandable, accurate, polite and diagnostic rather than prescriptive

• Let’s take a look at some examples

Page 19: The Psychology of C# Analysis

Error reporting psychology

Page 20: The Psychology of C# Analysis

Error reporting psychology

A params parameter must be the last parameter in a formal parameter list

Is this saying:

• If there is a params parameter, it must be the last one? or

• The last parameter and only the last parameter must always be a params parameter? Or

• The last parameter must be a params parameter; if others are as well, that’s fine too?

The error is only clear if the feature is already understood
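For readers who do not already know the feature, the first reading is the correct one. A minimal sketch, with hypothetical method names, of what is and is not legal:

```csharp
using System;

static class ParamsDemo
{
    // Legal: there is a params parameter and it is the last one.
    public static int Sum(int first, params int[] rest)
    {
        int total = first;
        foreach (int r in rest)
        {
            total += r;
        }
        return total;
    }

    // Also legal: a parameter list with no params parameter at all.
    public static int Twice(int x)
    {
        return 2 * x;
    }

    // Illegal, and rejected with the message quoted above:
    // public static int Bad(params int[] xs, int last) { return last; }

    static void Main()
    {
        Console.WriteLine(Sum(1));       // 'rest' is an empty array
        Console.WriteLine(Sum(1, 2, 3)); // 'rest' is { 2, 3 }
    }
}
```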

Page 21: The Psychology of C# Analysis

Error reporting psychology

Error messages must read the mind of a developer who wrote broken code and figure out what they meant.

class C { public virtual static void M(){}}

Page 22: The Psychology of C# Analysis

Error reporting psychology

Page 23: The Psychology of C# Analysis

Error reporting psychology

Complex operator + (Complex x, Complex y) { ...

User-defined operator must be declared static and public

• This is an example of a prescriptive error done right
  • The user absolutely positively has to do this to overload an operator

• Odds that they were not trying to overload an operator are low
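Following the error's prescription gives a declaration like the one below. This is a sketch only: the slide elides the body of `Complex`, so the `Re` and `Im` fields and the `OperatorDemo` wrapper are assumptions for illustration.

```csharp
using System;

struct Complex
{
    public readonly double Re; // assumed field, not shown on the slide
    public readonly double Im; // assumed field, not shown on the slide

    public Complex(double re, double im)
    {
        Re = re;
        Im = im;
    }

    // Exactly what the error message prescribes: public and static.
    public static Complex operator +(Complex x, Complex y)
    {
        return new Complex(x.Re + y.Re, x.Im + y.Im);
    }
}

static class OperatorDemo
{
    public static Complex Add()
    {
        return new Complex(1, 2) + new Complex(3, 4);
    }

    static void Main()
    {
        Complex sum = Add();
        Console.WriteLine(sum.Re + " " + sum.Im);
    }
}
```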

Page 24: The Psychology of C# Analysis

Warnings are harder than errors

Page 25: The Psychology of C# Analysis

Warnings are harder than errors

• Must infer developer’s erroneous thoughts

• Compiler must be fast
  • This makes an opportunity for third-party tools

• Must be plausibly wrong
  • A warning for code that no one would reasonably type is unhelpful

• Must be able to eliminate warning
  • And ideally the warning should tell you how

• Must have low false positive rate
  • Encouraging developers to change correct code is harmful

• We will return to this point later

Page 26: The Psychology of C# Analysis

What do C# developers want?

Rigidly defined areas of doubt and uncertainty

• Static type checking, type safety, memory safety…

• … that can be disabled if necessary.

• A compiler that infers developer intent…

• … with predictable behavior and understandable rules

• Actionable errors when inference fails…

• …rather than muddling on through and getting it wrong

Page 27: The Psychology of C# Analysis

It hurts because it’s true

Page 28: The Psychology of C# Analysis

C# was originally called SafeC

C# throws developers into the “Pit of Success”:

• Eliminate unimportant dangerous features entirely
  • switch fall through

• Restrict dangerous features to clearly-marked unsafe code regions

• Eliminate implementation-defined behaviours
  • x = ++x + x++; is well-defined in C# …
  • …but still a bad idea.

• Define common undefined behaviours
  • Accessing an array out of bounds causes an exception

• Mandate compiler warnings
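Two of the points above can be checked directly. The sketch below uses hypothetical names. Operands are evaluated left to right, so the increment expression has exactly one result, and an out-of-range index raises an exception instead of reading stray memory.

```csharp
using System;

static class DefinedBehaviorDemo
{
    // Left to right: ++x makes x one larger and yields the new value,
    // then x++ yields that same value and increments x again; finally
    // the sum is assigned back to x.
    public static int IncrementPuzzle(int x)
    {
        x = ++x + x++;
        return x;
    }

    // Indexing past the end of an array throws rather than invoking
    // undefined behaviour.
    public static bool OutOfBoundsThrows()
    {
        int[] items = new int[3];
        int index = 3;
        try
        {
            Console.WriteLine(items[index]);
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(IncrementPuzzle(1));  // prints 4
        Console.WriteLine(OutOfBoundsThrows()); // prints True
    }
}
```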

There are numerous defects that the Coverity C/C++ analysis checkers detect which are impossible, unlikely, or already warnings in C#.

Let’s look at a few dozen. Quickly. These are all defects found by Coverity in C/C++ that are not worth checking in C#…

Page 29: The Psychology of C# Analysis

C/C++ defects inapplicable to C#:

• Local read before assignment
  • C# rejects programs that use uninitialized locals

• Uninitialized fields / arrays
  • Fields and arrays are automatically zeroed out

• Treating a pointer to a variable as a pointer to an array
  • Rare, must be marked as unsafe

• Buffer length arithmetic errors
  • Strings and arrays know their lengths; checked at runtime

• Pointer/integer/char/bool/enum type errors
  • Not inter-assignable in C# without explicit cast operators
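The first two items can be illustrated with a short sketch, assuming a hypothetical `Counter` class. Fields and array elements start at their default values, while reading an unassigned local never gets past the compiler.

```csharp
using System;

public class Counter
{
    public int Count;      // starts as 0
    public bool Enabled;   // starts as false
    public double Ratio;   // starts as 0.0
}

static class InitializationDemo
{
    public static bool FieldsAndArraysAreZeroed()
    {
        Counter c = new Counter();
        int[] data = new int[4];

        // The following would be rejected at compile time as a use of
        // an unassigned local variable:
        //     int local;
        //     Console.WriteLine(local);

        return c.Count == 0
            && !c.Enabled
            && c.Ratio == 0.0
            && data[0] == 0
            && data[3] == 0;
    }

    static void Main()
    {
        Console.WriteLine(FieldsAndArraysAreZeroed()); // prints True
    }
}
```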

Page 30: The Psychology of C# Analysis

C/C++ defects inapplicable to C#:

• Failure to consistently check error return codes
  • C# uses exceptions

• Accidental sign extension
  • Either error or warning

• Implementation-defined side effect order
  • Side effect order is well-defined

• Statement with no effect
  • Is actually a parse-time error in C#

• Accidental use of ambiguous names
  • C# requires that a simple name have a unique meaning in a block

Page 31: The Psychology of C# Analysis

C/C++ defects inapplicable to C#:

• sizeof mistakes
  • C#’s sizeof operator only takes types

• Unintentional switch fall-through
  • Is an error

• Unreachable code
  • Is a warning

• Accidental assignment or comparison of variable to itself
  • Yep, that’s a warning too

• Field never written or never read
  • Man that’s a lot of warnings

• Missing return statement
  • Is illegal

• malloc without free / free without malloc / allocator – deallocator mismatch / use after free
  • Not needed in a garbage-collected language

• Dereferencing an address that lived longer than the storage it refers to
  • References to variables may not be stored in long-term storage

• Accidental use of function pointer
  • Method group expressions can only be used in strictly limited locations

• Overriding errors
  • The language was designed to mitigate brittle base class failures by default

Page 32: The Psychology of C# Analysis

Of course the compiler is not perfect…

Page 33: The Psychology of C# Analysis

Defects common to C/C++ and C#

• Copy paste mistakes

• Expression contains variables but always has the same result

• You checked for null here, you dereferenced without checking there.

• Some infinite loops

• Dangling else and other indentation issues

• Array index out of bounds

• Integer overflow
  • checked arithmetic is off by default

• Non-memory resource leaks
  • Such as forgetting to close a file

• Stray semicolons

• Swapped arguments

• Unused return value

• Uncaught exception

• Missing or misordered critical sections
  • Including non-atomic operations inconsistently inside critical sections

• And many more!
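The integer overflow item is a good example of a defect the compiler will not catch by default. In this sketch (hypothetical names) the `unchecked` and `checked` operators are written out explicitly so the behaviour does not depend on project settings:

```csharp
using System;

static class OverflowDemo
{
    // Unchecked arithmetic silently wraps around. This is what you get
    // by default unless checked arithmetic is switched on.
    public static int AddUnchecked(int a, int b)
    {
        return unchecked(a + b);
    }

    // Checked arithmetic throws OverflowException on overflow.
    public static int AddChecked(int a, int b)
    {
        return checked(a + b);
    }

    static void Main()
    {
        Console.WriteLine(AddUnchecked(int.MaxValue, 1) == int.MinValue); // prints True

        try
        {
            AddChecked(int.MaxValue, 1);
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```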

And these are just a few that are common to C and C#; there are a whole host of defects specific to C# programs that we could find statically.

Let’s consider the psychological aspects of static analysis tools beyond the compiler.

Page 34: The Psychology of C# Analysis

Day one training at Coverity

Page 35: The Psychology of C# Analysis

Developer Adoption is Key

• Soundness is explicitly a non-goal
  • We don’t want to find all defects or even most defects

• We want every defect reported to be a customer-affecting bug

• Developers won’t adopt a product that they perceive as making their jobs harder for no customer benefit

• Our business model requires adoption to drive renewals

• How do developers – who, remember, are using C# because they like a statically-typed language – react to static analysis tools?

Page 36: The Psychology of C# Analysis

Developer psychology WRT analysis tools

Page 37: The Psychology of C# Analysis

Developer psychology WRT analysis tools

• Egotistical
  • I don’t need this tool for my code

• But my coworkers on the other hand…

• Clever management uses this trait to advantage

Page 38: The Psychology of C# Analysis

Developer psychology WRT analysis tools

Page 39: The Psychology of C# Analysis

Developer psychology WRT analysis tools

• Skeptical, conservative, dismissive
  • Resistant to change

• Quick to criticize “stupid” false positives

• The first five defects they see had better be true positives

Page 40: The Psychology of C# Analysis

Developer psychology WRT analysis tools

Page 41: The Psychology of C# Analysis

Developer psychology WRT analysis tools

• “Busy” with, you know, “real work”
  • Code annotations are unacceptable

• Analysis tool must adapt to customer’s build process

• Overnight analysis runs are acceptable – barely

Page 42: The Psychology of C# Analysis

Developer psychology WRT analysis tools

Page 43: The Psychology of C# Analysis

Developer psychology WRT analysis tools

• Any change in what defects are reported on the same code over time – a.k.a. “churn” – is the enemy

• Randomized analysis is right out, unfortunately

• Any improvement to our analysis heuristics can cause unwanted churn

• We try to keep churn below 5% on every release
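The talk does not say how churn is measured. As a rough illustration only, the sketch below assumes one plausible definition: the number of defect reports that appear or disappear between two analysis runs over the same code, divided by the number of reports in the earlier run. The names and the string defect identifiers are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class ChurnDemo
{
    // Assumed metric, not Coverity's: size of the symmetric difference
    // between the two report sets, relative to the earlier report count.
    public static double Churn(ISet<string> previous, ISet<string> current)
    {
        if (previous.Count == 0)
        {
            return current.Count == 0 ? 0.0 : 1.0;
        }

        var changed = new HashSet<string>(previous);
        changed.SymmetricExceptWith(current); // reports in exactly one of the runs
        return (double)changed.Count / previous.Count;
    }

    static void Main()
    {
        var before = new HashSet<string> { "A", "B", "C", "D" };
        var after = new HashSet<string> { "A", "B", "C", "E" };
        Console.WriteLine(Churn(before, after) == 0.5); // prints True: D vanished, E appeared
    }
}
```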

Page 44: The Psychology of C# Analysis

Developer psychology WRT analysis tools

Page 45: The Psychology of C# Analysis

Developer psychology WRT analysis tools

• Responds well to perverse incentives
  • Hard-to-understand defect reports are easy to ignore

• No downside to incorrectly triaging true positives as false positives

• Finding defects is hard; presenting evidence that prevents incorrect classification as a false positive is harder
  • Deep analysis with theorem provers can be worse than shallow analysis with cheap heuristics.

• Presenting the result is insufficient; the developer must understand the proof to fix the defect.

Page 46: The Psychology of C# Analysis

Displaying good defect messages

Page 47: The Psychology of C# Analysis

Displaying good defect messages

public void GetThing(Type type, bool includeFrobs)
{
    bool isFrob = (type != null) && typeof(IFrob).IsAssignableFrom(type);
    object instance = this.objects[this.name];
    if (instance is IFrob && includeFrobs) { [...] }
    else if (type.IsAssignableFrom(instance.GetType())) { [...] }
}

Page 48: The Psychology of C# Analysis

Displaying good defect messages

public void GetThing(Type type, bool includeFrobs)
{
    // Assuming type is null. type != null evaluated to false.
    bool isFrob = (type != null) && typeof(IFrob).IsAssignableFrom(type);
    object instance = this.objects[this.name];
    // instance is IFrob evaluated to true. includeFrobs evaluated to false.
    if (instance is IFrob && includeFrobs) { [...] }
    // Dereference after null check: dereferencing type while it is null.
    else if (type.IsAssignableFrom(instance.GetType())) { [...] }
}
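Once the path is spelled out, the fix is usually obvious. The sketch below is one possible repair, not the original code: the `Registry` class, the `Frob` type, the constructor and the string return values are all assumptions introduced to make the example runnable, since the slide elides the method bodies.

```csharp
using System;
using System.Collections.Generic;

interface IFrob { }

class Frob : IFrob { }

class Registry
{
    private readonly Dictionary<string, object> objects = new Dictionary<string, object>();
    private readonly string name;

    public Registry(string name, object instance)
    {
        this.name = name;
        this.objects[name] = instance;
    }

    // Returns a label for the branch taken so the behaviour is observable.
    public string GetThing(Type type, bool includeFrobs)
    {
        object instance = this.objects[this.name];
        if (instance is IFrob && includeFrobs)
        {
            return "frob";
        }
        // The null check now guards the dereference on this path too.
        else if (type != null && type.IsAssignableFrom(instance.GetType()))
        {
            return "assignable";
        }
        return "none";
    }
}

static class DefectFixDemo
{
    static void Main()
    {
        var registry = new Registry("thing", new Frob());
        // The path from the defect report: type is null, includeFrobs is false.
        Console.WriteLine(registry.GetThing(null, false)); // prints "none"
    }
}
```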

Page 49: The Psychology of C# Analysis

Management psychology

Page 50: The Psychology of C# Analysis

Management psychology

• The first time static analysis runs there may be thousands of errors; typical rate is one defect per thousand LOC

• Academic answer: rank heuristics

• Pragmatic answer: ignore them all
  • Simply ignore all defects in existing code

• Triage and fix defects in new code

• “Someday” get around to fixing defects in old code

• Why is this so popular?
  • Old code is in the field. It works well enough. Risk is low.

• New code is unproven. It might work, or it might not. Risk is high.
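The "ignore them all" policy amounts to taking a baseline on the first run and only surfacing what is not in it. A minimal sketch, assuming defects can be identified by stable string IDs (the names and IDs here are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BaselineDemo
{
    // Returns only the defects that were not present in the baseline run,
    // preserving the order in which the latest run reported them.
    public static List<string> NewDefects(IEnumerable<string> baseline, IEnumerable<string> latest)
    {
        var known = new HashSet<string>(baseline);
        return latest.Where(id => !known.Contains(id)).ToList();
    }

    static void Main()
    {
        var baseline = new[] { "D1", "D2" };
        var latest = new[] { "D1", "D3", "D2", "D4" };
        Console.WriteLine(string.Join(", ", NewDefects(baseline, latest))); // prints "D3, D4"
    }
}
```

The defects left in the baseline are still recorded, so the "someday" cleanup remains possible.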

Page 51: The Psychology of C# Analysis

Management psychology

Page 52: The Psychology of C# Analysis

Management psychology

• Management actually pays for the developer tools
  • And typically has no idea how to use them effectively

• Middle management has perverse incentives too
  • Time, cost and complexity are easily measured; quality is not

• “Never upgrade the static analysis tool before release”

• Worse tools are better; better tools are worse

Page 53: The Psychology of C# Analysis

Worse is better; better is worse

[Graph: Known Defects plotted against Time]

No tool improvements == Management gets bonus

Page 54: The Psychology of C# Analysis

Worse is better; better is worse

[Graph: Known Defects plotted against Time]

No tool improvements == Management gets bonus

Tool upgrades find more defects == Management gets no bonus

The fix rate is the same in these two graphs but if the tool improves faster than the fix rate, no bonus.

Page 55: The Psychology of C# Analysis

Good news

If you have a well-engineered product that:

• makes good use of theoretical and pragmatic approaches,

• finds real-world, user-affecting defects, and

• takes developer and management psychology into account

Then you can make a positive difference

Page 56: The Psychology of C# Analysis

Conclusion

Page 57: The Psychology of C# Analysis

Special thanks to Scott at BasicInstructions.net

Page 58: The Psychology of C# Analysis

Conclusion

Page 59: The Psychology of C# Analysis

Conclusion

• Theoretical static analysis techniques are awesome; we can and do use them in industry…
  • … but doing all that math is actually only one small part of shipping a static analysis product

• Understanding developer and management psychology is necessary to ensure adoption of any developer tools
  • C# was carefully designed to match a target developer mindset

• Coverity thinks about developer and manager psychology at every stage in the analysis and overall product design

• Research into better ways to present defects would be awesome

Page 60: The Psychology of C# Analysis

More information

• Learn about Coverity at www.Coverity.com

• Read “A Few Billion Lines Of Code Later”

• Find me on Twitter at @ericlippert

• Or read my C# blog at www.EricLippert.com

• Or ask me about C# at www.StackOverflow.com

Page 61: The Psychology of C# Analysis

Copyright 2013 Coverity, Inc.