In Defense of Probabilistic Static Analysis

BEN LIVSHITS

SHUVENDU LAHIRI

MICROSOFT RESEARCH


Post on 17-Dec-2015



FROM THE PEOPLE WHO BROUGHT YOU SOUNDINESS.ORG…

STATIC ANALYSIS: UNEASY TRADEOFFS

too imprecise

useless results

may not scale

does not scale

overkill for some things

possibly still too imprecise for others

WHAT IS MISSING IS ANALYSIS ELASTICITY

OUR APPROACH IS PROBABILISTIC TREATMENT

Points-to(p, v, h)

• MANY INTERPRETATIONS ARE POSSIBLE

• OUR CERTAINTY IN THE FACT BASED ON STATIC EVIDENCE SUCH AS PROGRAM STRUCTURE

• OUR CERTAINTY BASED ON RUNTIME OBSERVATIONS

• OUR CERTAINTY BASED ON PRIORS OBTAINED FROM THIS OR OTHER PROGRAMS
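One way to picture how several sources of evidence could feed a single certainty for a fact such as Points-to(p, v, h) is a Bayesian update in log-odds form. The sketch below is only an illustration under a naive independence assumption; the function names, the prior, and the likelihood ratios are all hypothetical and are not taken from the talk.

```python
import math


def logit(p: float) -> float:
    """Convert a probability into log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Convert log-odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def combine(prior: float, likelihood_ratios: list[float]) -> float:
    """Update a prior with evidence, assuming the sources are independent.

    Each likelihood ratio is P(evidence | fact holds) / P(evidence | fact does not hold).
    """
    z = logit(prior) + sum(math.log(r) for r in likelihood_ratios)
    return sigmoid(z)


# Hypothetical numbers: an uninformative prior, static evidence that favors
# the fact 4:1, and a runtime observation that disfavors it 1:2.
certainty = combine(0.5, [4.0, 0.5])
print(round(certainty, 4))
```

With these made-up inputs the combined odds are 2:1, so the certainty is 2/3. Real evidence sources are rarely independent, which is one reason a joint model over all facts may be preferable to a per-fact update like this.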

Object x = new Object();
try {
} catch(...){
  x = null;
}

if(...){ // branch direction info
  x = new Object();
}else{
  x = null;
}

$('mydiv').css('color', 'red');

BENEFITS

RESULT PRIORITIZATION

• STATIC ANALYSIS RESULTS CAN BE NATURALLY RANKED OR PRIORITIZED IN TERMS OF CERTAINTY, NEARLY A REQUIREMENT IN A SITUATION WHERE ANALYSIS USERS ARE FREQUENTLY FLOODED WITH RESULTS
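As a minimal sketch of what ranking by certainty might look like in a reporting layer, the snippet below sorts findings by an attached probability and drops those under a cutoff. The `Finding` type, the `prioritize` helper, the file locations, and the certainty values are all hypothetical.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    location: str
    message: str
    certainty: float  # probability that the finding is a true positive


def prioritize(findings: list[Finding], threshold: float = 0.0) -> list[Finding]:
    """Keep findings at or above the threshold, most certain first."""
    kept = [f for f in findings if f.certainty >= threshold]
    return sorted(kept, key=lambda f: f.certainty, reverse=True)


# Hypothetical analysis output.
findings = [
    Finding("A.java:10", "possible null dereference", 0.62),
    Finding("B.java:7", "possible null dereference", 0.91),
    Finding("C.java:3", "possible null dereference", 0.12),
]

ranked = prioritize(findings, threshold=0.5)
for f in ranked:
    print(f"{f.certainty:.2f}  {f.location}  {f.message}")
```

The threshold gives users a single knob to trade recall for a shorter report.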

ANALYSIS DEBUGGING

• PROGRAM POINTS OR EVEN STATIC ANALYSIS INFERENCE RULES AND FACTS LEADING TO IMPRECISION CAN BE IDENTIFIED WITH THE HELP OF BACKWARD PROPAGATION
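One possible reading of backward propagation is to record, for every derived fact, which rule and premises produced it, and then walk those records backward from a suspicious result to the input facts that support it. The sketch below assumes such a provenance map is available; the map contents, rule labels, and fact strings are hypothetical and only borrow relation names used elsewhere in this deck.

```python
def explain(fact: str, provenance: dict[str, list[tuple[str, list[str]]]]) -> set[str]:
    """Return the input facts that a derived fact ultimately depends on.

    provenance maps a derived fact to its derivations, each a pair
    (rule label, list of premise facts). Facts absent from the map are inputs.
    """
    seen: set[str] = set()
    inputs: set[str] = set()
    stack = [fact]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        derivations = provenance.get(current, [])
        if not derivations:
            inputs.add(current)
        for _rule, premises in derivations:
            stack.extend(premises)
    return inputs


# Hypothetical provenance for one reported error.
provenance = {
    "ERROR(z)": [("error detection", ["NULLABLE(z)", "DEREF(z)"])],
    "NULLABLE(z)": [("nullable variables", ["FLOW(z,null)", "ISNULL(null)"])],
    "FLOW(z,null)": [("conditional flow", ["FLOW(z,z)", "ASSIGNCOND(z,null)"])],
    "FLOW(z,z)": [("reflexive flow", [])],
}

print(sorted(explain("ERROR(z)", provenance)))
```

In a probabilistic setting the same walk could carry certainties along, so that low-confidence premises are surfaced first.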

MORE BENEFITS

HARD AND SOFT RULES

• IN AN EFFORT TO MAKE THEIR ANALYSIS FULLY SOUND, ANALYSIS DESIGNERS OFTEN COMBINE CERTAIN INFERENCE RULES WITH THOSE THAT COVER GENERALLY UNLIKELY CASES TO MAINTAIN SOUNDNESS

• NATURALLY BLENDING SUCH INFERENCE RULES TOGETHER, BY GIVING HIGH PROBABILITIES TO THE FORMER AND LOW PROBABILITIES TO THE LATTER, ALLOWS US TO BALANCE SOUNDNESS AND UTILITY CONSIDERATIONS

INFUSING WITH PRIORS

• END-QUALITY OF ANALYSIS RESULTS CAN OFTEN BE IMPROVED BY DOMAIN KNOWLEDGE SUCH AS INFORMATION ABOUT VARIABLE NAMING, CHECK-IN INFORMATION FROM SOURCE CONTROL REPOSITORIES, BUG FIX DATA FROM BUG REPOSITORIES, ETC.

SIMPLE ANALYSIS IN DATALOG

1. x=3;
2. y=null;
3. z=null;
4. z=x;
5. if(...){
6.   z=null;
7.   y=5;
8. }
9. w=*z

// transitive flow propagation
1. FLOW(x,z) :- FLOW(x,y), ASSIGN(y,z)
2. FLOW(a,c) :- FLOW(a,b), ASSIGNCOND(b,c)
3. FLOW(x,x).

// nullable variables
4. NULLABLE(x) :- FLOW(x,y), ISNULL(y)

// error detection
5. ERROR(a) :- ISNULL(a), DEREF(a)
6. ERROR(a) :- !ISNULL(a), NULLABLE(a), DEREF(a)
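The six rules can be evaluated with a naive bottom-up fixpoint. The sketch below is one way to do that in plain Python. The slide does not show the input facts, so the encoding here is an assumption: ASSIGN(lhs, rhs) and ASSIGNCOND(lhs, rhs) record unconditional and conditional assignments, FLOW(dst, src) reads "dst may receive its value from src", and the literal null is modeled as a pseudo-variable named "null".

```python
Pair = tuple[str, str]


def analyze(variables: set[str], assign: set[Pair], assign_cond: set[Pair],
            is_null: set[str], deref: set[str]):
    """Evaluate the six rules to a fixpoint and return (flow, nullable, error)."""
    # Rule 3: FLOW(x,x).
    flow = {(v, v) for v in variables}
    # Rules 1 and 2 have the same shape, so both edge sets are followed alike.
    edges = assign | assign_cond
    changed = True
    while changed:
        changed = False
        for a, b in list(flow):
            for src, c in edges:
                if src == b and (a, c) not in flow:
                    flow.add((a, c))
                    changed = True
    # Rule 4: NULLABLE(x) :- FLOW(x,y), ISNULL(y)
    nullable = {a for a, b in flow if b in is_null}
    # Rule 5: ERROR(a) :- ISNULL(a), DEREF(a)
    error = {a for a in deref if a in is_null}
    # Rule 6: ERROR(a) :- !ISNULL(a), NULLABLE(a), DEREF(a)
    error |= {a for a in deref if a not in is_null and a in nullable}
    return flow, nullable, error


# Assumed encoding of the nine-line program on the slide.
variables = {"x", "y", "z", "w", "null"}
assign = {("y", "null"), ("z", "null"), ("z", "x")}  # lines 2, 3, 4
assign_cond = {("z", "null")}                        # line 6, inside the if
is_null = {"null"}
deref = {"z"}                                        # line 9: w=*z

flow, nullable, error = analyze(variables, assign, assign_cond, is_null, deref)
print(sorted(nullable), sorted(error))
```

Under this encoding z is flagged through rule 6: it is not itself the null constant, but null may flow into it and it is dereferenced. Every derived fact is either present or absent, which is the rigidity that probabilistic weighting is meant to soften.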

RELAXING THE RULES

// transitive flow propagation
FLOW(x,y) ^ ASSIGN(y,z) => FLOW(x,z).
1 FLOW(a,b) ^ ASSIGNCOND(b,c) => FLOW(a,c)
FLOW(x,x).

// transitive flow propagation
FLOW(x,z) :- FLOW(x,y), ASSIGN(y,z).
FLOW(a,c) :- FLOW(a,b), ASSIGNCOND(b,c).
FLOW(x,x).

// nullable variables
FLOW(x,y) ^ ISNULL(y) => NULLABLE(x).

// nullable variables
NULLABLE(x) :- FLOW(x,y), ISNULL(y).

// error detection
ISNULL(a) ^ DEREF(a) => ERROR(a).
0.5 !ISNULL(a) ^ NULLABLE(a) ^ DEREF(a) => ERROR(a).

// error detection
ERROR(a) :- ISNULL(a), DEREF(a).
ERROR(a) :- !ISNULL(a), NULLABLE(a), DEREF(a).

// priors and shaping distributions
3 !FLOW(x,y).
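To see how a numeric weight on a rule turns into a probability, it may help to compute a marginal by hand on a tiny ground instance. Under Markov logic semantics a possible world has probability proportional to exp of the summed weights of the ground formulas it satisfies, and worlds that violate a hard formula are ruled out. The sketch below enumerates all worlds exhaustively, which only works for toy inputs and is not how a tool such as Alchemy performs inference. The `marginal` helper and the evidence values are hypothetical; the two formulas are the error-detection rules for a single variable a.

```python
import itertools
import math


def marginal(query, atoms, evidence, formulas):
    """P(query is true | evidence), by enumerating every possible world.

    formulas is a list of (weight, predicate) pairs; weight None marks a hard rule.
    """
    free = [a for a in atoms if a not in evidence]
    total = 0.0
    hit = 0.0
    for values in itertools.product([False, True], repeat=len(free)):
        world = dict(evidence)
        world.update(zip(free, values))
        score = 0.0
        feasible = True
        for weight, holds in formulas:
            satisfied = holds(world)
            if weight is None:
                if not satisfied:
                    feasible = False  # hard rule violated: world is impossible
                    break
            elif satisfied:
                score += weight
        if not feasible:
            continue
        mass = math.exp(score)
        total += mass
        if world[query]:
            hit += mass
    return hit / total


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


atoms = ["ISNULL(a)", "NULLABLE(a)", "DEREF(a)", "ERROR(a)"]
# Hypothetical evidence: a is dereferenced, is not null itself, but is nullable.
evidence = {"ISNULL(a)": False, "NULLABLE(a)": True, "DEREF(a)": True}
formulas = [
    # Hard rule: ISNULL(a) ^ DEREF(a) => ERROR(a).
    (None, lambda w: implies(w["ISNULL(a)"] and w["DEREF(a)"], w["ERROR(a)"])),
    # Soft rule, weight 0.5: !ISNULL(a) ^ NULLABLE(a) ^ DEREF(a) => ERROR(a)
    (0.5, lambda w: implies(not w["ISNULL(a)"] and w["NULLABLE(a)"] and w["DEREF(a)"],
                            w["ERROR(a)"])),
]

p_error = marginal("ERROR(a)", atoms, evidence, formulas)
print(round(p_error, 4))
```

With this evidence only ERROR(a) is free, so the result is e^0.5 / (1 + e^0.5), roughly 0.62: a weight of 0.5 makes the error somewhat more likely than not without asserting it outright. Raising the weight pushes the marginal toward 1, which is the tuning knob the relaxed rules expose.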

PROBABILISTIC INFERENCE WITH ALCHEMY

• TUNING THE RULES

• TUNING THE WEIGHTS

• SEMANTICS ARE NOT AS OBVIOUS

• SHAPING PRIORS IS NON-TRIVIAL, BUT FRUITFUL

[FIGURE: INFERENCE GRAPH OVER NODES X1, U1, W1, Z1, Z2, Z3, Y1, W3 TO W11; PROBABILITIES SHOWN: 0.616988, 0.614989, 0.567993, 0.560994, 0.544996]

CHALLENGES

• LEARNING THE WEIGHTS

  • EXPERT USERS

  • LEARNING (NEED LABELED DATASET)

• WHAT CLASS OF STATIC ANALYSIS CAN BE MADE ELASTIC?

  • DATALOG

  • ABSTRACT INTERPRETATION

  • DECISION PROCEDURE (SMT)-BASED