points-to analysis for java using annotated...

38
Java Points-to, 12/02, BGR 1 Points-to Analysis for Java Using Annotated Constraints* Dr. Barbara G. Ryder Rutgers University http://www.cs.rutgers.edu/~ ryder http://prolangs.rutgers.edu/ *Joint research with Atanas Rountev and Ana Milanova, supported by NSF-CCR 9900988.

Upload: others

Post on 18-Mar-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Java Points-to, 12/02, BGR 1

Points-to Analysis for JavaUsing Annotated Constraints*

Dr. Barbara G. RyderRutgers University

http://www.cs.rutgers.edu/~ryderhttp://prolangs.rutgers.edu/

*Joint research with Atanas Rountev and AnaMilanova, supported by NSF-CCR 9900988.

Java Points-to, 12/02, BGR 2

Outline

• PROLANGS research projects• Points-to analysis for Java• Initial constraint-based implementation,

OOPSLA’01– Empirical results

• Object-sensitive analysis, ISSTA’02 – Empirical results

• Related work• Summary

Java Points-to, 12/02, BGR 3

PROLANGS

• Research projects at boundary of ProgrammingLanguages/Compilers and Software Engineering– Algorithm design and prototyping

• Mature research projects– Pointer analysis of C programs– Side-effect analysis of C systems– Semantic software change analysis– Studies of Java exception usages– PROLANGS Analysis Framework (PAF), version

1.1 released June 1999

http://prolangs.rutgers.edu

Java Points-to, 12/02, BGR 4

Ongoing Research• Object-oriented systems (C++/Java)

– Analysis• Points-to, side-effect analyses POPL’99, OOPSLA’01, ISSTA’02,

ICSM’02

• Change impact analysis (with Frank Tip, IBM) PASTE’01

– Optimization• Profiling framework for feedback-directed optimization PLDI’01• Experience with feedback-directed optimization OOPSLA’02

• Static/dynamic analyses of application resource usage(with Rich Martin and Thu Nguyen, Rutgers)

• Analysis of program fragments FSE’99, CC’01, ICSE’03

Java Points-to, 12/02, BGR 5

Points-to Analysis for Java

• Which objects may referencevariable x point to?

• Builds a points-to graph

x = new A();y = new B();x.f = y;

x o1

o2y

f

Java Points-to, 12/02, BGR 6

Uses of Points-to Informationin Compilers

• Object read-write information– Side-effect analysis, dependenceanalysis

• Call graph construction– Devirtualization & inlining

• Synchronization removal• Stack-based object allocation

Java Points-to, 12/02, BGR 7

Uses of Points-to Informationin Software Engineering Tools

• Object read-write information– Semantic browers– Program slicers– Debuggers

• Change impact analysis tools• Testing

– Object relationships (Object relation diagrams)– Program-based coverage metrics

Java Points-to, 12/02, BGR 8

Contributions

Points-to Analyses for Java usingannotated constraints

• Initial analysis based on Andersen’sanalysis for C, OOPSLA’01– Annotations embody OO notions needed

– Maintained efficient constraint-basedimplementation

– Empirical evaluation of cost and precision

Java Points-to, 12/02, BGR 9

Contributions

• Object-sensitive analysis, ISSTA’02– Adding context sensitivity

• Parameterization framework

– Empirical evaluation demonstrates moreprecision for same cost

Java Points-to, 12/02, BGR 10

Points-to Analysis, OOPSLA’01

• Handles virtual calls– Simulates the run-time method lookup

• Models the fields of objects• Analyzes executable code

– Ignores unreachable code from libraries

Java Points-to, 12/02, BGR 11

Points-to Analysis in Action

class A { void m(X p) {..} }

class B extends A { X f; void m(X q) { this.f=q; }}

B b = new B();X x = new X();A a = b;a.m(x);

q

b o1

a

thisB.m f

x o2

A.m() not analyzed because it’s unreachable.

Java Points-to, 12/02, BGR 12

Efficient Implementation

• Constraint-based approach– Extends previous work for C pointer analysis

using BANE (UC Berkeley)• Extended constraint system and resolution rules

• Define and solve a system of annotatedset-inclusion constraints to obtain points-to sets

Java Points-to, 12/02, BGR 13

Annotated Constraints

• Form: L ⊆ a R– L and R denote set expressions– Annotation a: additional information(e.g., object fields)

• Kinds of set expressions L and R– Set variables: represent points-to sets– ref terms: represent objects– Other kinds of expressions

Java Points-to, 12/02, BGR 14

Set variables and ref terms

• Set variables represent points-to sets– For each reference variable p: VP

– For each object o: Vo

• Object o is denoted by term ref(o,Vo)

ref(o2,Vo2) ⊆ f Vo1

o1 o2f

ref(o,Vo) ⊆ VP p o

Java Points-to, 12/02, BGR 15

Example: Accessing Fields

p = new A();

q = new B();

p.f = q;

p o1

o2q

f

ref(o1,VO1) ⊆ Vp

ref(o2,VO2) ⊆ Vq

Vp ⊆ proj(ref,W)

Vq ⊆ f W

Constraint generation

Java Points-to, 12/02, BGR 16

Example: Solving Constraintsref(o1,VO1

) ⊆ Vp

ref(o2,VO2) ⊆ Vq

Vp ⊆ proj(ref,W)

Vq ⊆ f WW ⊆ VO1

Vq ⊆ f VO1

ref(o2,VO2) ⊆ f VO1

o1.f points to o2

Constraint resolution

Java Points-to, 12/02, BGR 17

Example: Virtual Calls

p.m(x); VP ⊆ m lam(Vx)

ref(o,VO) ⊆ VPreceiver object o

Vx ⊆ Vz

ref(o,VO) ⊆ Vthis(A.m)

Actual methodcalled, A.m

Java Points-to, 12/02, BGR 18

Experiments

• 23 Java programs: 14 – 677 user classes– Added the necessary library classes– Machine: 360 MHz, 512Mb SUN Ultra-60

• Cost measured in time and memory• Precision (wrt usage in client analyses and

tranformations)– Object read-write information– Call graph construction– Synchronization removal and stack allocation

Java Points-to, 12/02, BGR 19

Analysis Time

0

50

100

150

200

250

300

350

400

proxy

compress

db jb

echo

raytrace

mtrt

jtar

jlex

javacup

rabbit

jack

jflex

jess

mpegaudio

jjtree

sablecc

javac

creature

mindterm

soot

muffin

javacc

Seconds

Java Points-to, 12/02, BGR 20

Resolution of Virtual Call Sites

0

10

20

30

40

50

60

70

80

90

proxy d

b

echo

mtrt

jlex

rabbit

jflex

mpegaudio

sablecc

creature

soot

javacc

% R

es

olv

ed

Ca

ll S

ite

s

Points-to RTA

Java Points-to, 12/02, BGR 21

Thread-local new sites

0%

10%

20%

30%

40%

50%

60%

70%

% T

hre

ad

-lo

ca

l S

ite

s

Java Points-to, 12/02, BGR 22

Practical Points-to Analysesfor Java, ISSTA’02

• Existing analyses were flow- and context-insensitive extensions of C analyses

• Context insensitivity inherentlycompromises precision for object-orientedlanguages

• Goal: Introduce context sensitivity andremain practical

Java Points-to, 12/02, BGR 23

Example: Imprecision

class Y extends X { … }

class A { X f; void m(X q) {

this.f=q ; }}

A a = new A() ;a.m(new X()) ;A aa = new A() ;aa.m(new Y()) ;

o2o1a

thisA.m q

o3 aa o4f

f

f

f

Java Points-to, 12/02, BGR 24

Imprecision ofContext-insensitive Analysis

• Does not distinguish contexts forinstance methods and constructors– States of distinct objects are merged

• Common OO features and idioms– Encapsulation– Inheritance– Containers, maps and iterators

Java Points-to, 12/02, BGR 25

Object-sensitivePoints-to Analysis

• Object sensitivity– Form of context sensitivity for flow-insensitive

points-to analysis of OO languages• Object-sensitive Andersen’s analysis

– Object sensitivity applicable to other analyses• Parameterization framework

– Cost vs. precision tradeoff• Empirical evaluation

– Vs. context-insensitive OOPSLA’01 analysis

Java Points-to, 12/02, BGR 26

Details

• Instance methods and constructorsanalyzed for different contexts

• Receiver objects used as contexts• Multiple copies of reference variables

this.f=q thisA.m.f=q o1 o1

o1

Java Points-to, 12/02, BGR 27

Example: Object-sensitiveAnalysis

class A { X f; void m(X q) {

this.f=q ; }}

A a = new A() ;a.m(new X()) ;A aa = new A() ;aa.m(new Y()) ;

o2f

o1a

thisA.mo1 qA.m

o1thisA.m.f=q o1 o1

o1

this.f=q ;

o3 aa o4

o3thisA.mo3qA.m

thisA.m.f=q o3 o3

f

Java Points-to, 12/02, BGR 28

Implementation

• Implemented one instance ofparameterization framework– this, formals and return variables(effectively) replicated

– Optimized constraint-based analysisusing previous technique

– Comparison with OOPSLA’01 analysis

Java Points-to, 12/02, BGR 29

Empirical Results

• 23 Java programs: 14 – 677 user classes– Added the necessary library classes– Machine: 360 MHz, 512Mb, SUN Ultra-60

• Object Sensitive vs. OOPSLA’01 points-to• Found comparable cost with better precision

• Modification side-effect analysis• Virtual call resolution

Java Points-to, 12/02, BGR 30

Analysis Time

0

50

100

150

200

250

co

mp

ress

db

jb ech

o

raytr

ace

mtr

t

jtar

jlex

javacu

p

rab

bit

jack

jflex

jess

mp

eg

au

dio

jjtree

sab

lecc

creatu

re

min

dte

rm

so

ot

mu

ffin

javacc

Seco

nd

s

Object Sensitive OOPSLA'01

Java Points-to, 12/02, BGR 31

Side-effect Analysis:Modified Objects Per Statement

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

1 2 3 4 5 6 7 8 9 10 11 12 13 14

One Two or three Four to nine More than nine

jb jess sablecc raytrace Average

OB

JE

CT

SE

NS

ITIV

E

OO

PS

LA

’01

Java Points-to, 12/02, BGR 32

Improvement in Resolved Calls

0

10

20

30

40

50

60

proxy

compress

db

jb echo

raytrace

mtrt

jtar

jlex

javacup

rabbit

jack

jflex

jess

mpegaudio

jjtree

sablecc

javac

creature

mindterm

soot

muffin

javacc

Avg

Percent

Java Points-to, 12/02, BGR 33

Related Work

• Context-sensitive points-to analysis for OOlanguages– Grove et al. OOPSLA’97, Chatterjee et al. POPL’99,

Ruf PLDI’00, Grove-Chambers TOPLAS’01

• Context-insensitive points-to analysis for OOlanguages– Liang et al. PASTE’01

• 0Context-sensitive class analysis– Oxhoj et al. ECOOP’92, Agesen SAS’94,

Plevyak-Chien OOPSLA’94, Agesen ECOOP’95, Grove et al.OOPSLA’97, Grove-Chambers TOPLAS’01

Java Points-to, 12/02, BGR 34

Summary

• Defined two new points-to analysesfor references in OOPLs usingannotated constraints

• Context-insensitive analysis(OOPSLA’01)– Based on Andersen’s points-to for C– Practical cost and good precision wrtclient analyses and transformations

Java Points-to, 12/02, BGR 35

Summary

• Object-sensitive (context-sensitive)points-to analysis -– New kind of context sensitivity for flow-

insensitive analysis– Parameterization framework allows tunable

algorithm choice– Practical cost, comparable to OOPSLA’01

analysis– Better precision than OOPSLA’01 analysis

Java Points-to, 12/02, BGR 36

Java Points-to, 12/02, BGR 37

Number of new X() whose objectsare accessed by p.f

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

proxy

dbecho

mtrt jle

x

rabbit

jflex

mpegaudio

sablecc

creature

soot

javacc

One Two or three More than three

Java Points-to, 12/02, BGR 38

Parameterization

• Goal: tunable analysis• Multiple copies for a subset of variables

– For the other variables a single copy• Result: reduces points-to graph size and

analysis cost– At the expense of precision loss