theabstractionvs. approximationsdilemma inpointeranalysisuday/soft-copies/aadppa.pdf- frances allen,...

177
The Abstraction Vs. Approximations Dilemma in Pointer Analysis Uday Khedker (www.cse.iitb.ac.in/˜uday) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay Nov 2017

Upload: others

Post on 03-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

The Abstraction Vs. Approximations Dilemmain Pointer Analysis

Uday Khedker

(www.cse.iitb.ac.in/̃ uday)

Department of Computer Science and Engineering,

Indian Institute of Technology, Bombay

Nov 2017

Page 2: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Outline 1/43

Outline

• Disclaimer: This talk is

◮ not about accomplishments but about opinions, and hopes

◮ an idealistic view of pointer analysis(the destination we wish to reach)

Uday Khedker IIT Bombay

Page 3: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Outline 1/43

Outline

• Disclaimer: This talk is

◮ not about accomplishments but about opinions, and hopes

◮ an idealistic view of pointer analysis(the destination we wish to reach)

• Outline:

◮ Our Meanderings

◮ Some short trips

◮ Conclusions

Uday Khedker IIT Bombay

Page 4: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

Part 1

Our Meanderings

Page 5: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 2/43

Pointer Analysis Musings

• A keynote address:

“The worst thing that has happened to Computer Science is C,because it brought pointers with it . . . ”

- Frances Allen, IITK Workshop (2007)

• A couple of influential papers

◦ Which Pointer Analysis should I Use?

Michael Hind and Anthony Pioli. ISTAA 2000

◦ Pointer Analysis: Haven’t we solved this problem ?

Michael Hind PASTE

yet

2001

Uday Khedker IIT Bombay

Page 6: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 2/43

Pointer Analysis Musings

• A keynote address:

“The worst thing that has happened to Computer Science is C,because it brought pointers with it . . . ”

- Frances Allen, IITK Workshop (2007)

• A couple of influential papers

◦ Which Pointer Analysis should I Use?

Michael Hind and Anthony Pioli. ISTAA 2000

◦ Pointer Analysis: Haven’t we solved this problem ?

Michael Hind PASTE

yet

2001

Uday Khedker IIT Bombay

Page 7: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 2/43

Pointer Analysis Musings

• A keynote address:

“The worst thing that has happened to Computer Science is C,because it brought pointers with it . . . ”

- Frances Allen, IITK Workshop (2007)

• A couple of influential papers

◦ Which Pointer Analysis should I Use?

Michael Hind and Anthony Pioli. ISTAA 2000

◦ Pointer Analysis: Haven’t we solved this problem ?

Michael Hind PASTE

yet

2001

◦ 2017 . . .

Uday Khedker IIT Bombay

Page 8: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 3/43

The Mathematics of Pointer Analysis

In the most general situation

• Alias analysis is undecidable.

Landi-Ryder [POPL 1991], Landi [LOPLAS 1992],Ramalingam [TOPLAS 1994]

• Flow insensitive alias analysis is NP-hard

Horwitz [TOPLAS 1997]

• Points-to analysis is undecidable

Chakravarty [POPL 2003]

Adjust your expectations suitably to avoid disappointments!

Uday Khedker IIT Bombay

Page 9: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 4/43

So what should we expect?

To quote Hind [PASTE 2001]

Uday Khedker IIT Bombay

Page 10: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 4/43

So what should we expect?

To quote Hind [PASTE 2001]

• “Fortunately many approximations exist”

Uday Khedker IIT Bombay

Page 11: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 4/43

So what should we expect?

To quote Hind [PASTE 2001]

• “Fortunately many approximations exist”

• “Unfortunately too many approximations exist!”

Uday Khedker IIT Bombay

Page 12: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 4/43

So what should we expect?

To quote Hind [PASTE 2001]

• “Fortunately many approximations exist”

• “Unfortunately too many approximations exist!”

Engineering of pointer analysis is much more dominant than its science

Uday Khedker IIT Bombay

Page 13: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 5/43

Pointer Analysis: Engineering or Science?

• Engineering view ◮ Build quick approximations◮ The tyranny of (exclusive) OR

Precision OR Efficiency?

• Science view ◮ Build clean abstractions◮ Can we harness the Genius of AND?

Precision AND Efficiency?

Uday Khedker IIT Bombay

Page 14: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 5/43

Pointer Analysis: Engineering or Science?

• Engineering view ◮ Build quick approximations◮ The tyranny of (exclusive) OR

Precision OR Efficiency?

• Science view ◮ Build clean abstractions◮ Can we harness the Genius of AND?

Precision AND Efficiency?

• Most common trend as evidenced by publications

◮ Build acceptable approximations guided by empirical observations

◮ The notion of acceptability is often constrained by beliefs rather thanpossibilities

Uday Khedker IIT Bombay

Page 15: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 6/43

Abstraction Vs. Approximation in Static Analysis

• Static analysis needs to create abstract values that represent manyconcrete values

• Mapping concrete values to abstract values

◮ Abstraction.

Deciding which properties of the concrete values are essential What

Ease of understanding, reasoning, modelling etc. Why

◮ Approximation.

Deciding which properties of the concrete values cannot What

be represented accurately and should be summarized

Decidability, tractability, or efficiency and scalability Why

Uday Khedker IIT Bombay

Page 16: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 7/43

Abstraction Vs. Approximation in Static Analysis

• Abstractions

◮ focus on precision and conciseness of modelling◮ tell us what we can ignore without being imprecise

• Approximations

◮ focus on efficiency and scalability◮ tell us the imprecision that we have to tolerate

Uday Khedker IIT Bombay

Page 17: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 7/43

Abstraction Vs. Approximation in Static Analysis

• Abstractions

◮ focus on precision and conciseness of modelling◮ tell us what we can ignore without being imprecise

• Approximations

◮ focus on efficiency and scalability◮ tell us the imprecision that we have to tolerate

• Build clean abstractions before surrendering to the approximations

Uday Khedker IIT Bombay

Page 18: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 8/43

The Basis of My Hope

• Common belief:

• The possibility that I dream of:

• The basis of my hope:

Uday Khedker IIT Bombay

Page 19: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 8/43

The Basis of My Hope

• Common belief:

Pointer information is very large

• The possibility that I dream of:

• The basis of my hope:

Uday Khedker IIT Bombay

Page 20: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 8/43

The Basis of My Hope

• Common belief:

Pointer information is very large

• The possibility that I dream of:

Precision can reduce the size of pointer information to make it far moremanageable

• The basis of my hope:

Uday Khedker IIT Bombay

Page 21: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 8/43

The Basis of My Hope

• Common belief:

Pointer information is very large

• The possibility that I dream of:

Precision can reduce the size of pointer information to make it far moremanageable

• The basis of my hope:

At any program point, the usable pointer information is much smaller thanthe total pointer information

Current methods perform many repeated and possibly avoidablecomputations

Uday Khedker IIT Bombay

Page 22: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 9/43

Why Avoid Approximations?

• Approximations may create a vicious cycle

ApproximationImprecision

causes

Inefficiency

maycause

may seemto warrant

Uday Khedker IIT Bombay

Page 23: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 9/43

Why Avoid Approximations?

• Approximations may create a vicious cycle

ApproximationImprecision

causes

Inefficiency

maycause

may seemto warrant

• Two examples of inefficiency cause by approximations

◮ k-limited call strings may create “butterfly cycles” causing spuriousfixed point computations [Hakjoo, 2010]

◮ Imprecision in function pointer analysis overapproximates calls

may create spurious recursion in call graphs

Uday Khedker IIT Bombay

Page 24: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 10/43

Which Approximations Would I like to Avoid?

Approximation Admits

Flow insensitivity

Context insensitivity (orpartial context sensitivity)

Imprecision in call graphs

Allocation site basedheap abstraction

Uday Khedker IIT Bombay

Page 25: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 10/43

Which Approximations Would I like to Avoid?

Approximation Admits

Flow insensitivity Spurious intraprocedural paths

Context insensitivity (orpartial context sensitivity) Spurious interprocedural paths

Imprecision in call graphs Spurious call sequences

Allocation site basedheap abstraction Spurious paths in memory graph

Uday Khedker IIT Bombay

Page 26: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 11/43

The Classical Precision-Efficiency Dilemma

AbstractionRole in precision Cause of inefficiency

Distinguishes between Needs to consider

Flow sensitivity

Context sensitivity

Precise heap abstraction

Precise call structure

Uday Khedker IIT Bombay

Page 27: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 11/43

The Classical Precision-Efficiency Dilemma

AbstractionRole in precision Cause of inefficiency

Distinguishes between Needs to consider

Flow sensitivity Information at differentprogram points

Context sensitivity Information indifferent contexts

Precise heap abstraction Different heaplocations

Precise call structureIndirect calls made todifferent callees fromthe same program point

Uday Khedker IIT Bombay

Page 28: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 11/43

The Classical Precision-Efficiency Dilemma

AbstractionRole in precision Cause of inefficiency

Distinguishes between Needs to consider

Flow sensitivity Information at differentprogram points

A large number ofprogram points

Context sensitivity Information indifferent contexts

Exponentially largenumber of contexts

Precise heap abstraction Different heaplocations

Unbounded numberof heap locations

Precise call structureIndirect calls made todifferent callees fromthe same program point

Precise points-toinformation

Uday Khedker IIT Bombay

Page 29: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 12/43

Flow Insensitivity in Data Flow Analysis

• Assumption: Statements can be executed in any order.

• Instead of computing point-specific data flow information, summary dataflow information is computed.

The summary information is required to be a safe approximation ofpoint-specific information for each point.

• No data flow information is killed

If a statement kills data flow information, there is an alternate path thatexcludes the statement.

Uday Khedker IIT Bombay

Page 30: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 13/43

Flow Insensitivity in Data Flow Analysis

0 f0 0

1 f1 1

2 f2 2 3 f3 3

i fi i

m fm m

Start

0 f0 0 1 f1 1 2 f2 2 3 f3 3 . . . i fi i . . . m fm m

End

Uday Khedker IIT Bombay

Page 31: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 13/43

Flow Insensitivity in Data Flow Analysis

0 f0 0

1 f1 1

2 f2 2 3 f3 3

i fi i

m fm m

Start

0 f0 0 1 f1 1 2 f2 2 3 f3 3 . . . i fi i . . . m fm m

End

Uday Khedker IIT Bombay

Page 32: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 13/43

Flow Insensitivity in Data Flow Analysis

0 f0 0

1 f1 1

2 f2 2 3 f3 3

i fi i

m fm m

Start

0 f0 0 1 f1 1 2 f2 2 3 f3 3 . . . i fi i . . . m fm m

End

Allows arbitrary compositions of flow functions in any order

⇒ Flow insensitivity

Uday Khedker IIT Bombay

Page 33: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 13/43

Flow Insensitivity in Data Flow Analysis

0 f0 0

1 f1 1

2 f2 2 3 f3 3

i fi i

m fm m

Start

0 f0 0 1 f1 1 2 f2 2 3 f3 3 . . . i fi i . . . m fm m

End

In practice, dependent constraints are collected in a global

repository in one pass and then are solved independently

Uday Khedker IIT Bombay

Page 34: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 14/43

If I am Allowed to Nitpick . . .

• Context sensitivity should involve all of the following

[A] Full context sensitivity regardless of the call depth even in recursion[B] Ability to store data flow information parameterized by contexts at

each program point[C] Flow sensitivity at the intraprocedural level (otherwise distinct calls

to the same procedure within a procedure cannot be distinguished)

Uday Khedker IIT Bombay

Page 35: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 14/43

If I am Allowed to Nitpick . . .

• Context sensitivity should involve all of the following

[A] Full context sensitivity regardless of the call depth even in recursion[B] Ability to store data flow information parameterized by contexts at

each program point[C] Flow sensitivity at the intraprocedural level (otherwise distinct calls

to the same procedure within a procedure cannot be distinguished)

• In particular

◮ k-limiting violates [A]◮ Treating recursion as an SCC violates [A]◮ Functional approaches violate [B]◮ Object sensitivity violates [C]

Uday Khedker IIT Bombay

Page 36: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 14/43

If I am Allowed to Nitpick . . .

• Context sensitivity should involve all of the following

[A] Full context sensitivity regardless of the call depth even in recursion[B] Ability to store data flow information parameterized by contexts at

each program point[C] Flow sensitivity at the intraprocedural level (otherwise distinct calls

to the same procedure within a procedure cannot be distinguished)

• In particular

◮ k-limiting violates [A]◮ Treating recursion as an SCC violates [A]◮ Functional approaches violate [B]◮ Object sensitivity violates [C]

• Object sensitivity can be completely modelled by calling context sensitivity

◮ by a flow sensitive propagation of values representing objects, and◮ identifying a procedure by an (object, procedure) pair, and◮ identifying a context by a call site and the pairs defined as above

Uday Khedker IIT Bombay

Page 37: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 15/43

Context Sensitivity in Interprocedural Analysis

Sr

Er

Ss

Es

Ci

Ri

ci

St

Et

Cj

Rj

cj

x

x

x ′ = fr (x)

x ′

y

y

y ′ = fr (y)

y ′

fr

Uday Khedker IIT Bombay

Page 38: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 15/43

Context Sensitivity in Interprocedural Analysis

Sr

Er

Ss

Es

Ci

Ri

ci

St

Et

Cj

Rj

cj

x

x

x ′

y

y

y ′

fr

Uday Khedker IIT Bombay

Page 39: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 15/43

Context Sensitivity in Interprocedural Analysis

Sr

Er

Ss

Es

Ci

Ri

ci

St

Et

Cj

Rj

cj

x

x

x ′

y

y

y ′

fr

×

Uday Khedker IIT Bombay

Page 40: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 15/43

Context Sensitivity in Interprocedural Analysis

Sr

Er

Ss

Es

Ci

Ri

ci

St

Et

Cj

Rj

cj

x

x

x ′

y

y

y ′

fr

Uday Khedker IIT Bombay

Page 41: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 15/43

Context Sensitivity in Interprocedural Analysis

Sr

Er

Ss

Es

Ci

Ri

ci

St

Et

Cj

Rj

cj

x

x

x ′

y

y

y ′

fr

×

Uday Khedker IIT Bombay

Page 42: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 16/43

Context Sensitivity in the Presence of Recursion

Starts

Ci

Ri

Ends

call

return

stopcalling

Uday Khedker IIT Bombay

Page 43: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 16/43

Context Sensitivity in the Presence of Recursion

Starts

Ci

Ri

Ends

call

return

stopcalling

• Paths from Starts to Ends should constitutea context free language

calln · stop · returnn

• If we treat cycle of recursion as an SCC

◮ Calls and returns become jumps, and◮ paths are approximated by a regular

language

call∗ · stop · return∗

Uday Khedker IIT Bombay

Page 44: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

Uday Khedker IIT Bombay

Page 45: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

• What is the value range of a?

Uday Khedker IIT Bombay

Page 46: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)

(2, 2)

• What is the value range of a?

Uday Khedker IIT Bombay

Page 47: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)

(2, 2)

(2, 2)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

Uday Khedker IIT Bombay

Page 48: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)

(2, 2)

(2, 2)

(2, 2)

(3, 3)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

Uday Khedker IIT Bombay

Page 49: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)

(2, 2)

(2, 2)

(2, 2)

(3, 3)

(3, 3)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

Uday Khedker IIT Bombay

Page 50: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)

(2, 2)

(2, 2)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

Uday Khedker IIT Bombay

Page 51: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)

(2, 2)

(1, 2)

(2, 3)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

Uday Khedker IIT Bombay

Page 52: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)(1, 2)

(2, 3)

(2, 3)

(2, 3)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

Uday Khedker IIT Bombay

Page 53: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)

(2, 3)

(1, 3)

(2, 4)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

Uday Khedker IIT Bombay

Page 54: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)(1, 3)

(2, 4)

(2, 4)

(2, 4)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

Uday Khedker IIT Bombay

Page 55: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)

(2, 4)

(1, 4)

(2, 5)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

Uday Khedker IIT Bombay

Page 56: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

(1, 1)(1, 4)

(2, 5)

(2, 5)

(2, 5)

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

◮ Range of a at Endmain is (2, . . .)

Uday Khedker IIT Bombay

Page 57: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

◮ Range of a at Endmain is (2, . . .)

Uday Khedker IIT Bombay

Page 58: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

◮ Range of a at Endmain is (2, . . .)

• Spurious interprocedural loops

Uday Khedker IIT Bombay

Page 59: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 17/43

Context Insensitivity = Imprecision + Potential Inefficiency

Startmain

a = 1

Call P

Call P

Call P

Endmain

Startp

a = a+1

Endp

• What is the value range of a?

• Context sensitive analysis

◮ Data flow value propagated backto the current caller of P

◮ Range of a at Endmain is (3, 3)

• Context insensitive analysis

◮ Data flow value propagated backto every caller

◮ Range of a at Endmain is (2, . . .)

• Spurious interprocedural loops

• Spurious fixed point computations

Uday Khedker IIT Bombay

Page 60: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 18/43

Context Sensitivity in the Presence of Recursion

Starts

Ci

Ri

Ends

call

return

stopcalling

Uday Khedker IIT Bombay

Page 61: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 18/43

Context Sensitivity in the Presence of Recursion

Starts

Ci

Ri

Ends

call

return

stopcalling

• Paths from Starts to Ends should constitutea context free language

calln · stop · returnn

• If we treat cycle of recursion as an SCC

◮ Calls and returns become jumps, and◮ paths are approximated by a regular

language

call∗ · stop · return∗

Uday Khedker IIT Bombay

Page 62: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 19/43

Pointer Analysis: An Engineer’s Landscape

Flow

Sensitivity

Increases

Context SensitivityIncreases

FI=

FI⊆

FISSA

FSNoKill

FS

CI CSObjSens CSRecIns CS

Uday Khedker IIT Bombay

Page 63: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 19/43

Pointer Analysis: An Engineer’s Landscape

Flow

Sensitivity

Increases

Context SensitivityIncreases

FI=

FI⊆

FISSA

FSNoKill

FS

CI CSObjSens CSRecIns CS

Data Structures: BDDs, probabilistic

Uday Khedker IIT Bombay

Page 64: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 19/43

Pointer Analysis: An Engineer’s Landscape

Flow

Sensitivity

Increases

Context SensitivityIncreases

FI=

FI⊆

FISSA

FSNoKill

FS

CI CSObjSens CSRecIns CS

Data Structures: BDDs, probabilistic

Methods: parallel, on demand, randomized

Uday Khedker IIT Bombay

Page 65: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 19/43

Pointer Analysis: An Engineer’s Landscape

Flow

Sensitivity

Increases

Context SensitivityIncreases

FI=

FI⊆

FISSA

FSNoKill

FS

CI CSObjSens CSRecIns CS

Data Structures: BDDs, probabilistic

Methods: parallel, on demand, randomizedRefinement: Level-wise, bootstrapping

Uday Khedker IIT Bombay

Page 66: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 19/43

Pointer Analysis: An Engineer’s Landscape

Flow

Sensitivity

Increases

Context SensitivityIncreases

FI=

FI⊆

FISSA

FSNoKill

FS

CI CSObjSens CSRecIns CS

Crowded Area

Uday Khedker IIT Bombay

Page 67: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 19/43

Pointer Analysis: An Engineer’s Landscape

Flow

Sensitivity

Increases

Context SensitivityIncreases

FI=

FI⊆

FISSA

FSNoKill

FS

CI CSObjSens CSRecIns CS

Crowded Area

Thinly

populated

Uday Khedker IIT Bombay

Page 68: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 19/43

Pointer Analysis: An Engineer’s Landscape

Flow

Sensitivity

Increases

Context SensitivityIncreases

FI=

FI⊆

FISSA

FSNoKill

FS

CI CSObjSens CSRecIns CS

Crowded Area

Thinly

populated

That’s thecorner we are trying to

occupy :-)

Uday Khedker IIT Bombay

Page 69: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Context sensitivity(Caller sensitivity)

Precise heapabstraction

Precise call structure

Uday Khedker IIT Bombay

Page 70: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

Context sensitivity(Caller sensitivity)

Precise heapabstraction

Precise call structure

Restrict the computationonly to the usable data.Weave liveness discoveryinto the analysis

Uday Khedker IIT Bombay

Page 71: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Precise heapabstraction

Precise call structure

Postpone low levelconnections explicatedby the classicalpoints-to facts

Uday Khedker IIT Bombay

Page 72: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Value contextsMature accomplishment(CC08, SAS12, SOAP13)

Precise heapabstraction

Precise call structure

Distinguish betweencontexts by theirdata flow values andnot their call chains

Uday Khedker IIT Bombay

Page 73: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Value contextsMature accomplishment(CC08, SAS12, SOAP13)

GPG based bottom-upsummary flow functions

Mature accomplishment(SAS16)

Precise heapabstraction

Precise call structure

Avoid recomputationsfor each context.Use a higher levelabstraction of memory.

Uday Khedker IIT Bombay

Page 74: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Value contextsMature accomplishment(CC08, SAS12, SOAP13)

GPG based bottom-upsummary flow functions

Mature accomplishment(SAS16)

Precise heapabstraction

Liveness accessgraphs

Partial accomplishment(TOPLAS07)

Precise call structure

Identify the part of heapactually accessed in termsof patterns of accesses

Uday Khedker IIT Bombay

Page 75: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Value contextsMature accomplishment(CC08, SAS12, SOAP13)

GPG based bottom-upsummary flow functions

Mature accomplishment(SAS16)

Precise heapabstraction

Liveness accessgraphs

Partial accomplishment(TOPLAS07)

Access basedabstraction

Mature accomplishment(ISMM17)

Precise call structure

Distinguish between heaplocations based on howthey are accessed and nothow they are allocated

Uday Khedker IIT Bombay

Page 76: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Value contextsMature accomplishment(CC08, SAS12, SOAP13)

GPG based bottom-upsummary flow functions

Mature accomplishment(SAS16)

Precise heapabstraction

Liveness accessgraphs

Partial accomplishment(TOPLAS07)

Access basedabstraction

Mature accomplishment(ISMM17)

Precise call structureCallee sensitivity Work in progress

Call strings record callhistory. We need torecord call future also.

Uday Khedker IIT Bombay

Page 77: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Value contextsMature accomplishment(CC08, SAS12, SOAP13)

GPG based bottom-upsummary flow functions

Mature accomplishment(SAS16)

Precise heapabstraction

Liveness accessgraphs

Partial accomplishment(TOPLAS07)

Access basedabstraction

Mature accomplishment(ISMM17)

Precise call structureCallee sensitivity Work in progress

Virtual call resolution Work in progress

Make the call graph moreprecise by computing amore precise set of callees

Uday Khedker IIT Bombay

Page 78: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Our Meanderings 20/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Value contextsMature accomplishment(CC08, SAS12, SOAP13)

GPG based bottom-upsummary flow functions

Mature accomplishment(SAS16)

Precise heapabstraction

Liveness accessgraphs

Partial accomplishment(TOPLAS07)

Access basedabstraction

Mature accomplishment(ISMM17)

Precise call structureCallee sensitivity Work in progress

Virtual call resolution Work in progress

We are destined

to a long haul with no

guarantees :-)

Uday Khedker IIT Bombay

Page 79: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

Part 2

Some Short Trips

Page 80: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 21/43

In Search of Abstractions for Precision Without Inefficiency

DesiredAbstraction Enabling Abstraction Status of our work

Flowsensitivity

Joint liveness andpoints-to analysis

Partial accomplishment(SAS12)

High level abstractionof memory

Partial accomplishment(SAS16)

Context sensitivity(Caller sensitivity)

Value contextsMature accomplishment(CC08, SAS12, SOAP13)

GPG based bottom-upsummary flow functions

Mature accomplishment(SAS16)

Precise heapabstraction

Liveness accessgraphs

Partial accomplishment(TOPLAS07)

Access basedabstraction

Mature accomplishment(ISMM17)

Precise call structureCallee sensitivity Work in progress

Virtual call resolution Work in progress

Uday Khedker IIT Bombay

Page 81: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 22/43

Liveness Based Pointer Analysis: Motivation

n1

x = &y

y = &z

z = &u

n1

n2 ∗z = y n2 n3 z = y n3

n4

y = &x

use u

use x

n4

x y z u

x y z u

x y z u

x y z u

x y zx y z u

x y z u

Uday Khedker IIT Bombay

Page 82: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 22/43

Liveness Based Pointer Analysis: Motivation

n1

x = &y

y = &z

z = &u

n1

n2 ∗z = y n2 n3 z = y n3

n4

y = &x

use u

use x

n4

x y z u

x y z u

x y z u

x y z u

x y zx y z u

x y z uIs all this

information reallyuseful?

Uday Khedker IIT Bombay

Page 83: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 22/43

Liveness Based Pointer Analysis: Motivation

n1

x = &y

y = &z

z = &u

n1

n2 ∗z = y n2 n3 z = y n3

n4

y = &x

use u

use x

n4

x y z u

x y z u

x y z u

x y z u

x y zx y z u

x y z u

Uday Khedker IIT Bombay

Page 84: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 22/43

Liveness Based Pointer Analysis: Motivation

n1

x = &y

y = &z

z = &u

n1

n2 ∗z = y n2 n3 z = y n3

n4

y = &x

use u

use x

n4

x y z u

x y z u

x y z u

x y z u

x y zx y z u

x y z u

Uday Khedker IIT Bombay

Page 85: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 22/43

Liveness Based Pointer Analysis: Motivation

n1

x = &y

y = &z

z = &u

n1

n2 ∗z = y n2 n3 z = y n3

n4

y = &x

use u

use x

n4

x y z u

x y z u

x y z u

x y z u

x y zx y z u

x y z u

Uday Khedker IIT Bombay

Page 86: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 22/43

Liveness Based Pointer Analysis: Motivation

n1

x = &y

y = &z

z = &u

n1

n2 ∗z = y n2 n3 z = y n3

n4

y = &x

use u

use x

n4

x y z u

x y z u

x y z u

x y z u

x y zx y z u

x y z u

Uday Khedker IIT Bombay

Page 87: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 22/43

Liveness Based Pointer Analysis: Motivation

n1

x = &y

y = &z

z = &u

n1

n2 ∗z = y n2 n3 z = y n3

n4

y = &x

use u

use x

n4

x y z u

x y z u

x y z u

x y z u

x y zx y z u

x y z u

Uday Khedker IIT Bombay

Page 88: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 23/43

Liveness Based Points-to Analysis (SAS-2012)

• Mutual dependence of liveness and points-to information

◮ Define points-to information only for live pointers◮ For pointer indirections, define liveness information using points-to

information

• Use call strings method for full flow and context sensitivity

◮ Value based termination of call strings construction (CC-2008)

• Use strong liveness

Uday Khedker IIT Bombay

Page 89: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 24/43

Liveness Based Interprocedural Points-to Analysis: EmpiricalMeasurements

• Observations on SPEC CPU 2006 benchmarks in GCC 4.6.0

(Prashant Singh Rawat, IITB 2012)

Usable pointer information is small and sparse

No of Points-to pairs Percentable of basic blocks

0 64-96%1-4 9-25%5-8 0-10%8+ 0-4%

Uday Khedker IIT Bombay

Page 90: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 24/43

Liveness Based Interprocedural Points-to Analysis: EmpiricalMeasurements

• Observations on SPEC CPU 2006 benchmarks in GCC 4.6.0

(Prashant Singh Rawat, IITB 2012)

Usable pointer information is small and sparse

No of Points-to pairs Percentable of basic blocks

0 64-96%1-4 9-25%5-8 0-10%8+ 0-4%

Uday Khedker IIT Bombay

Page 91: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 24/43

Liveness Based Interprocedural Points-to Analysis: EmpiricalMeasurements

• Observations on SPEC CPU 2006 benchmarks in GCC 4.6.0

(Prashant Singh Rawat, IITB 2012)

Usable pointer information is small and sparse

No of Points-to pairs Percentable of basic blocks

0 64-96%1-4 9-25%5-8 0-10%8+ 0-4%

• Independently implemented and verified in

◮ LLVM (Dylan McDermott, Cambridge, 2016) and◮ GCC 4.7.2 (Priyanka Sawant, IITB, 2016)

Uday Khedker IIT Bombay

Page 92: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

ProcedureBody

Uday Khedker IIT Bombay

Page 93: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Multipleinterproceduralpaths reachingthe procedure

Uday Khedker IIT Bombay

Page 94: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Multipleinterproceduralpaths reachingthe procedure

Uday Khedker IIT Bombay

Page 95: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Multipleinterproceduralpaths reachingthe procedure

Uday Khedker IIT Bombay

Page 96: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Multipleinterproceduralpaths reachingthe procedure

Uday Khedker IIT Bombay

Page 97: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Multipleinterproceduralpaths reachingthe procedure

Uday Khedker IIT Bombay

Page 98: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values x

x ′

Uday Khedker IIT Bombay

Page 99: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values x

x ′

y

y ′

Uday Khedker IIT Bombay

Page 100: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values x

x ′

y

y ′

y

y ′

Uday Khedker IIT Bombay

Page 101: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values x

x ′

y

y ′

y

y ′

y

y ′

Uday Khedker IIT Bombay

Page 102: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values x

x ′

y

y ′

y

y ′

y

y ′

z

z ′

Uday Khedker IIT Bombay

Page 103: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values

Contexts

x

X0

x ′

y

y ′

y

y ′

y

y ′

z

z ′

Uday Khedker IIT Bombay

Page 104: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values

Contexts

x

X0

x ′

y

X1

y ′

y

X1

y ′

y

X1

y ′

z

z ′

Uday Khedker IIT Bombay

Page 105: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values

Contexts

x

X0

x ′

y

X1

y ′

y

X1

y ′

y

X1

y ′

z

X2

z ′

Uday Khedker IIT Bombay

Page 106: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values

Contexts

Call sites

x

c0

X0

x ′

y

c1

X1

y ′

y

c2

X1

y ′

y

c3

X1

y ′

z

c4

X2

z ′

Uday Khedker IIT Bombay

Page 107: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values

Contexts

Call sites

Contexts ofthe callers

x

c0

Xi

X0

x ′

y

c1

Xj

X1

y ′

y

c2

Xk

X1

y ′

y

c3

Xl

X1

y ′

z

c4

Xm

X2

z ′

Uday Khedker IIT Bombay

Page 108: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values

Contexts

Call sites

Contexts ofthe callers

x

c0

Xi

X0

x ′

y

c1

Xj

X1

y ′

y

c2

Xk

X1

y ′

y

c3

Xl

X1

y ′

z

c4

Xm

X2

z ′

Context transition graph

Xi X0c0

Uday Khedker IIT Bombay

Page 109: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values

Contexts

Call sites

Contexts ofthe callers

x

c0

Xi

X0

x ′

y

c1

Xj

X1

y ′

y

c2

Xk

X1

y ′

y

c3

Xl

X1

y ′

z

c4

Xm

X2

z ′

Context transition graph

Xi X0c0

Xj

Xk

Xl

X1

c1

c2

c3

Uday Khedker IIT Bombay

Page 110: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 25/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Start

End

Data flow values

Contexts

Call sites

Contexts ofthe callers

x

c0

Xi

X0

x ′

y

c1

Xj

X1

y ′

y

c2

Xk

X1

y ′

y

c3

Xl

X1

y ′

z

c4

Xm

X2

z ′

Context transition graph

Xi X0c0

Xj

Xk

Xl

X1

c1

c2

c3

Xm X2c4

Uday Khedker IIT Bombay

Page 111: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 26/43

Value Contexts (CC-2008, SAS-2012, SOAP-2013)

Analyze a procedure once for an input data flow value

• The number of times a procedure is analyzed reduces dramatically

• Similar to the tabulation based method of functional approach[Sharir-Pnueli, 1981]

However,

◮ Value contexts record calling contexts tooUseful for context matching across program analyses

◮ Can avoid some reprocessing even when a new input value is found

Uday Khedker IIT Bombay

Page 112: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 27/43

Empirical Observations About Value Contexts

• Reaching definitions analysis in GCC 4.2.0 (CC-2008)

Analysis of Towers of Hanoi

◮ Time brought down from 3.973× 106 ms to 2.37 ms◮ No of call strings brought down from 106+ to 8

Uday Khedker IIT Bombay

Page 113: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 27/43

Empirical Observations About Value Contexts

• Reaching definitions analysis in GCC 4.2.0 (CC-2008)

Analysis of Towers of Hanoi

◮ Time brought down from 3.973× 106 ms to 2.37 ms◮ No of call strings brought down from 106+ to 8

• Generic Interprocedural Analysis Framework in SOOT (SOAP-2013)

Empirical observations on SPECJVM98 and DaCapo 2006 benchmarks foron-the-fly call graph construction

◮ Average number of contexts per procedure lies in the range 4-25◮ Much fewer long call chains than in the default call graph

constructed using SPARKFor legnth 7, less than 50%For length 10, less than 5%

Uday Khedker IIT Bombay

Page 114: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 28/43

Classical Points-to Facts: A Low Level Abstraction ofMemory for Points-to Analysis

All variables are global

Red nodes are known named locations

x

y

f()

{

*x = y

}

Uday Khedker IIT Bombay

Page 115: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 28/43

Classical Points-to Facts: A Low Level Abstraction ofMemory for Points-to Analysis

All variables are global

Red nodes are known named locations

Blue nodes are placeholders denoting unknown locations

x

y

f()

{

*x = y

}

φ1 φ2

Uday Khedker IIT Bombay

Page 116: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 28/43

Classical Points-to Facts: A Low Level Abstraction ofMemory for Points-to Analysis

All variables are global

Red nodes are known named locations

Blue nodes are placeholders denoting unknown locations

x

y

f()

{

*x = y

}

φ1 φ2

a

Informationfrom callers

Uday Khedker IIT Bombay

Page 117: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 28/43

Classical Points-to Facts: A Low Level Abstraction ofMemory for Points-to Analysis

All variables are global

Red nodes are known named locations

Blue nodes are placeholders denoting unknown locations

x

y

f()

{

*x = y

}

φ1 φ2φ1

a

Uday Khedker IIT Bombay

Page 118: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 28/43

Classical Points-to Facts: A Low Level Abstraction ofMemory for Points-to Analysis

All variables are global

Red nodes are known named locations

Blue nodes are placeholders denoting unknown locations

x

y

f()

{

*x = y

}

φ1 φ2a

Uday Khedker IIT Bombay

Page 119: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 28/43

Classical Points-to Facts: A Low Level Abstraction ofMemory for Points-to Analysis

All variables are global

Red nodes are known named locations

Blue nodes are placeholders denoting unknown locations

x

y

f()

{

*x = y

}

φ1 φ2a

b

Informationfrom callers

Uday Khedker IIT Bombay

Page 120: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 28/43

Classical Points-to Facts: A Low Level Abstraction ofMemory for Points-to Analysis

All variables are global

Red nodes are known named locations

Blue nodes are placeholders denoting unknown locations

x

y

f()

{

*x = y

}

φ1 φ2a

bb

φ2

Uday Khedker IIT Bombay

Page 121: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 28/43

Classical Points-to Facts: A Low Level Abstraction ofMemory for Points-to Analysis

All variables are global

Red nodes are known named locations

Blue nodes are placeholders denoting unknown locations

x

y

f()

{

*x = y

}

φ1 φ2a b

Uday Khedker IIT Bombay

Page 122: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 29/43

Generalized Points-to Facts: A High Level Abstraction ofMemory for Points-to Analysis (SAS-2016)

Blue arrows are low level view of memory in terms of classical points-to facts

x

y

f()

{

*x = y

}

φ1 φ2

Uday Khedker IIT Bombay

Page 123: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 29/43

Generalized Points-to Facts: A High Level Abstraction ofMemory for Points-to Analysis (SAS-2016)

Blue arrows are low level view of memory in terms of classical points-to facts

Black arrows are high level view of memory in terms of generalized points-to facts

x

y

f()

{

*x = y

}

φ1 φ2

Uday Khedker IIT Bombay

Page 124: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 29/43

Generalized Points-to Facts: A High Level Abstraction ofMemory for Points-to Analysis (SAS-2016)

Blue arrows are low level view of memory in terms of classical points-to facts

Black arrows are high level view of memory in terms of generalized points-to facts

x

y

f()

{

*x = y

}

φ1 φ2a

Uday Khedker IIT Bombay

Page 125: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 29/43

Generalized Points-to Facts: A High Level Abstraction ofMemory for Points-to Analysis (SAS-2016)

Blue arrows are low level view of memory in terms of classical points-to facts

Black arrows are high level view of memory in terms of generalized points-to facts

x

y

f()

{

*x = y

}

φ1 φ2a

Uday Khedker IIT Bombay

Page 126: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 29/43

Generalized Points-to Facts: A High Level Abstraction ofMemory for Points-to Analysis (SAS-2016)

Blue arrows are low level view of memory in terms of classical points-to facts

Black arrows are high level view of memory in terms of generalized points-to facts

x

y

f()

{

*x = y

}

φ1 φ2a

Uday Khedker IIT Bombay

Page 127: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 29/43

Generalized Points-to Facts: A High Level Abstraction ofMemory for Points-to Analysis (SAS-2016)

Blue arrows are low level view of memory in terms of classical points-to facts

Black arrows are high level view of memory in terms of generalized points-to facts

x

y

f()

{

*x = y

}

φ1 φ2a b

Uday Khedker IIT Bombay

Page 128: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 29/43

Generalized Points-to Facts: A High Level Abstraction ofMemory for Points-to Analysis (SAS-2016)

Blue arrows are low level view of memory in terms of classical points-to facts

Black arrows are high level view of memory in terms of generalized points-to facts

x

y

f()

{

*x = y

}

φ1 φ2a b

Uday Khedker IIT Bombay

Page 129: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 29/43

Generalized Points-to Facts: A High Level Abstraction ofMemory for Points-to Analysis (SAS-2016)

Blue arrows are low level view of memory in terms of classical points-to facts

Black arrows are high level view of memory in terms of generalized points-to facts

x

y

f()

{

*x = y

}

φ1 φ2a b

Uday Khedker IIT Bombay

Page 130: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 30/43

Generalized Points-to Graphs (GPGs) for Points-to Analysis(SAS-2016)

Construction of bottom up summary flow functions using GPGs

• Issues at intraprocedural level

Flow sensitivity, strong and weak updates, efficiency using SSA form

• Issues at interprocedural level

Context sensitivity: Composition of callee’s GPGs within callers

Efficiency using bypassing of irrelevant information

• Handling advanced features

Function Pointers, Heap, Structures, Union, Arrays, Pointer Arithmetic

• Theoretical issues. Soundness and complexity

• Implementation and measurements

Using LTO framework in GCC 4.7.2 scaling to 158 KLoC

Uday Khedker IIT Bombay

Page 131: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 31/43

Heap Reference Analysis [TOPLAS 2007]

• Problem.

• Our Objectives.

• Main Challenge.

• Our Key Idea.

• Current status.

• Further Work.

Uday Khedker IIT Bombay

Page 132: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 31/43

Heap Reference Analysis [TOPLAS 2007]

• Problem. A lot of unused data remains unclaimed even in the best ofgarbage collectors. In C/C++, memory leaks is a major problem

• Our Objectives.

• Main Challenge.

• Our Key Idea.

• Current status.

• Further Work.

Uday Khedker IIT Bombay

Page 133: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 31/43

Heap Reference Analysis [TOPLAS 2007]

• Problem. A lot of unused data remains unclaimed even in the best ofgarbage collectors. In C/C++, memory leaks is a major problem

• Our Objectives. Static analysis of heap data to improve garbage collectionand plug memory leaks

• Main Challenge.

• Our Key Idea.

• Current status.

• Further Work.

Uday Khedker IIT Bombay

Page 134: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 31/43

Heap Reference Analysis [TOPLAS 2007]

• Problem. A lot of unused data remains unclaimed even in the best ofgarbage collectors. In C/C++, memory leaks is a major problem

• Our Objectives. Static analysis of heap data to improve garbage collectionand plug memory leaks

• Main Challenge. Unlike stack and static data,

◮ heap data accessible to any procedure is unbounded. Hence,◮ the mapping between object names and their addresses needs to

change at runtime

• Our Key Idea.

• Current status.

• Further Work.

Uday Khedker IIT Bombay

Page 135: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 32/43

Which Heap Memory Nodes Can be Statically Marked asLive?

If the while loop is not executed even once.

1 w = x // x points to ma

2 while (x.data < max)3 x = x.rptr4 y = x.lptr

5 z = New class of z6 y = y.lptr7 z.sum = x.data + y.data

HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

lptr

rptr

rptr

lptr

rptr

lptr

rptr

lptr

lptr

rptr

rptr

lptr

rptr

lptr

a

i

m

Uday Khedker IIT Bombay

Page 136: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 32/43

Which Heap Memory Nodes Can be Statically Marked asLive?

If the while loop is executed once.

1 w = x // x points to ma

2 while (x.data < max)3 x = x.rptr4 y = x.lptr

5 z = New class of z6 y = y.lptr7 z.sum = x.data + y.data

HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

lptr

rptr

rptr

lptr

rptr

lptr

rptr

lptr

lptr

rptr

rptr

lptr

rptr

lptr

b

fh

Uday Khedker IIT Bombay

Page 137: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 32/43

Which Heap Memory Nodes Can be Statically Marked asLive?

If the while loop is executed twice.

1 w = x // x points to ma

2 while (x.data < max)3 x = x.rptr4 y = x.lptr

5 z = New class of z6 y = y.lptr7 z.sum = x.data + y.data

HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

lptr

rptr

rptr

lptr

rptr

lptr

rptr

lptr

lptr

rptr

rptr

lptr

rptr

lptr

ce

Uday Khedker IIT Bombay

Page 138: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 33/43

Heap Reference Analysis [TOPLAS 2007]

• Problem. A lot of unused data remains unclaimed even in the best ofgarbage collectors. In C/C++, memory leaks is a major problem

• Our Objectives. Static analysis of heap data to improve garbage collectionand plug memory leaks

• Main Challenge. Unlike stack and static data,

◮ heap data accessible to any procedure is unbounded. Hence,◮ the mapping between object names and their addresses needs to

change at runtime

• Our Key Idea.

• Current status.

• Further Work.

Uday Khedker IIT Bombay

Page 139: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 33/43

Heap Reference Analysis [TOPLAS 2007]

• Problem. A lot of unused data remains unclaimed even in the best ofgarbage collectors. In C/C++, memory leaks is a major problem

• Our Objectives. Static analysis of heap data to improve garbage collectionand plug memory leaks

• Main Challenge. Unlike stack and static data,

◮ heap data accessible to any procedure is unbounded. Hence,◮ the mapping between object names and their addresses needs to

change at runtime

• Our Key Idea. Build bounded abstractions of heap data in terms of graphsand perform analysis using these graphs as data flow values

• Current status.

• Further Work.

Uday Khedker IIT Bombay

Page 140: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is not executed even once

a

i

mlptr

rptr

lptr

lptr

rptr

lptr

rptr

rptr

rptr

lptr

rptr

lptr

Uday Khedker IIT Bombay

Page 141: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is not executed even once

a

i

mlptr

rptr

lptr

lptr

rptr

lptr

rptr

rptr

rptr

lptr

rptr

lptr

Uday Khedker IIT Bombay

Page 142: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is not executed even once

a

i

mlptr

rptr

lptr

lptr

rptr

lptr

rptr

rptr

rptr

lptr

rptr

lptr

Uday Khedker IIT Bombay

Page 143: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is not executed even once

a

i

mlptr

rptr

lptr

lptr

rptr

lptr

rptr

rptr

rptr

lptr

rptr

lptr

Uday Khedker IIT Bombay

Page 144: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is not executed even once

a

i

mlptr

rptr

lptr

lptr

rptr

lptr

rptr

rptr

rptr

lptr

rptr

lptr

Uday Khedker IIT Bombay

Page 145: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is not executed even once

a

i

mlptr

rptr

lptr

lptr

rptr

lptr

rptr

rptr

rptr

lptr

rptr

Uday Khedker IIT Bombay

Page 146: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is not executed even once

a

i

mlptr

rptr

lptr

lptr

rptr

lptr

rptr

Uday Khedker IIT Bombay

Page 147: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is executed once

a

i

m

b

fhlptr

rptr

rptr

lptr rptr

lptr

rptr

lptr

Uday Khedker IIT Bombay

Page 148: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 34/43

Heap Reference Analysis: Our Solution

y = z = null

1 w = x

w = null

2 while (x.data < max)

{ x.lptr = null

3 x = x.rptr }

x.rptr = x.lptr.rptr = null

x.lptr.lptr.lptr = null

x.lptr.lptr.rptr = null

4 y = x.lptr

x.lptr = y.rptr = null

y.lptr.lptr = y.lptr.rptr = null

5 z = New class of z

z.lptr = z.rptr = null

6 y = y.lptr

y.lptr = y.rptr = null

7 z.sum = x.data + y.data

x = y = z = null HeapStack

x

z

w

y

a

p

q

b

i

c

f

g

h

d

e

j

m

k

l

n

o

rptr

lptr

While loop is executed twice

a

i

m

b

fh

ce

lptr

rptr

rptr

lptr rptr

lptr

rptr

rptr

Uday Khedker IIT Bombay

Page 149: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 35/43

Heap Reference Analysis [TOPLAS 2007]

• Problem. A lot of unused data remains unclaimed even in the best ofgarbage collectors. In C/C++, memory leaks is a major problem

• Our Objectives. Static analysis of heap allocated data to improve garbagecollection and plug memory leaks

• Main Challenge. Unlike stack and static data,

◮ heap data accessible to any procedure is unbounded. Hence,◮ the mapping between object names and their addresses needs to

change at runtime

• Our Key Idea. Build bounded abstractions of heap data in terms of graphsand perform analysis using these graphs as data flow values

• Current status.

• Further Work.

Uday Khedker IIT Bombay

Page 150: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 35/43

Heap Reference Analysis [TOPLAS 2007]

• Problem. A lot of unused data remains unclaimed even in the best ofgarbage collectors. In C/C++, memory leaks is a major problem

• Our Objectives. Static analysis of heap allocated data to improve garbagecollection and plug memory leaks

• Main Challenge. Unlike stack and static data,

◮ heap data accessible to any procedure is unbounded. Hence,◮ the mapping between object names and their addresses needs to

change at runtime

• Our Key Idea. Build bounded abstractions of heap data in terms of graphsand perform analysis using these graphs as data flow values

• Current status. Theory and prototype implementation (at theintraprocedural level) ready for Java

• Further Work.

Uday Khedker IIT Bombay

Page 151: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 35/43

Heap Reference Analysis [TOPLAS 2007]

• Problem. A lot of unused data remains unclaimed even in the best ofgarbage collectors. In C/C++, memory leaks is a major problem

• Our Objectives. Static analysis of heap allocated data to improve garbagecollection and plug memory leaks

• Main Challenge. Unlike stack and static data,

◮ heap data accessible to any procedure is unbounded. Hence,◮ the mapping between object names and their addresses needs to

change at runtime

• Our Key Idea. Build bounded abstractions of heap data in terms of graphsand perform analysis using these graphs as data flow values

• Current status. Theory and prototype implementation (at theintraprocedural level) ready for Java

• Further Work. Liveness based interprocedural alias analysis

Uday Khedker IIT Bombay

Page 152: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 36/43

Precise Construction of Call Graphs (or Constructing CalleeContexts)

• Problem. Presence of function pointers obscures the caller-calleerelationship between procedures.

◮ Significant imprecision in the result of any analysis◮ Efficiency and scalability is adversely affected

• Main Challenges.

• Research Goals.

• Additional Benefits.

Uday Khedker IIT Bombay

Page 153: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 37/43

What Does A Callee Context Mean?

call pStartmain

fp = a[i ]n1

a = [“P ′′, “Q ′′]

call ∗ fpC1

call pR1

i = i + 1n2

if i < nn3

call pEndmain

x ∗ yStartP

. . .

call pEndP

x ∗ y StartQ

. . .

call p EndQ

Uday Khedker IIT Bombay

Page 154: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 37/43

What Does A Callee Context Mean?

call pStartmain

fp = a[i ]n1

a = [“P ′′, “Q ′′]

call ∗ fpC1

call pR1

i = i + 1n2

if i < nn3

call pEndmain

x ∗ yStartP

. . .

call pEndP

x ∗ y StartQ

. . .

call p EndQ

Is x ∗ y

available?

Uday Khedker IIT Bombay

Page 155: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 37/43

What Does A Callee Context Mean?

call pStartmain

fp = a[i ]n1

a = [“P ′′, “Q ′′]

call ∗ fpC1

call pR1

i = i + 1n2

if i < nn3

call pEndmain

x ∗ yStartP

. . .

call pEndP

x ∗ y StartQ

. . .

call p EndQ

Is x ∗ y

available?

Invalid execution path!

Uday Khedker IIT Bombay

Page 156: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 37/43

What Does A Callee Context Mean?

call pStartmain

fp = a[i ]n1

a = [“P ′′, “Q ′′]

call ∗ fpC1

call pR1

i = i + 1n2

if i < nn3

call pEndmain

x ∗ yStartP

. . .

call pEndP

x ∗ y StartQ

. . .

call p EndQ

Is x ∗ y

available?

Valid execution path!

Uday Khedker IIT Bombay

Page 157: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 37/43

What Does A Callee Context Mean?

call pStartmain

fp = a[i ]n1

a = [“P ′′, “Q ′′]

call ∗ fpC1

call pR1

i = i + 1n2

if i < nn3

call pEndmain

x ∗ yStartP

. . .

call pEndP

x ∗ y StartQ

. . .

call p EndQ

Is x ∗ y

available?

Valid execution path!

Uday Khedker IIT Bombay

Page 158: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 38/43

Precise Construction of Call Graphs (or Constructing CalleeContexts)

• Problem. Presence of function pointers obscures the caller-calleerelationship between procedures.

◮ Significant imprecision in the result of any analysis◮ Efficiency and scalability is adversely affected

• Main Challenges.

• Research Goals.

• Additional Benefits.

Uday Khedker IIT Bombay

Page 159: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 38/43

Precise Construction of Call Graphs (or Constructing CalleeContexts)

• Problem. Presence of function pointers obscures the caller-calleerelationship between procedures.

◮ Significant imprecision in the result of any analysis◮ Efficiency and scalability is adversely affected

• Main Challenges. Precise and efficient interprocedural analysis of

◮ pointers, and◮ data structure hierarchy declaration and usage

• Research Goals.

• Additional Benefits.

Uday Khedker IIT Bombay

Page 160: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 38/43

Precise Construction of Call Graphs (or Constructing CalleeContexts)

• Problem. Presence of function pointers obscures the caller-calleerelationship between procedures.

◮ Significant imprecision in the result of any analysis◮ Efficiency and scalability is adversely affected

• Main Challenges. Precise and efficient interprocedural analysis of

◮ pointers, and◮ data structure hierarchy declaration and usage

• Research Goals. Order sensitive call disambiguation analysis

◮ Flow and context sensitive data structure analysis◮ Creating a mechanism to identify the exact caller to which

information should be propagated

• Additional Benefits.

Uday Khedker IIT Bombay

Page 161: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Some Short Trips 38/43

Precise Construction of Call Graphs (or Constructing CalleeContexts)

• Problem. Presence of function pointers obscures the caller-calleerelationship between procedures.

◮ Significant imprecision in the result of any analysis◮ Efficiency and scalability is adversely affected

• Main Challenges. Precise and efficient interprocedural analysis of

◮ pointers, and◮ data structure hierarchy declaration and usage

• Research Goals. Order sensitive call disambiguation analysis

◮ Flow and context sensitive data structure analysis◮ Creating a mechanism to identify the exact caller to which

information should be propagated

• Additional Benefits. Precise analysis of programs in object orientedlanguages

Uday Khedker IIT Bombay

Page 162: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

Part 3

Conclusions

Page 163: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 39/43

Observations

• Data flow propagation in real programs seems to involve a much smallersubset of all possible data flow values

In large programs that work properly, pointer usage is very disciplined and

the core information is very small!

• Earlier approaches reported inefficiency and non-scalability because theycomputed far more information than required because they

◮ did not separate the usable information from unusable information,and

◮ used low level abstractions of memory

Their focus was on

◮ approximating information to reduce the size, or◮ storing and accessing the information more efficiently

Uday Khedker IIT Bombay

Page 164: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 40/43

A Spectrum of Possible Ways of Performing Computation

exhaustivecomputation

computationrestrictedto usableinformation

avoidingredundant

computation

demand drivencomputation

MaximumComputation

MinimumComputation

EarlyComputation

LateComputation

What should be computed?

When should it be computed?

Uday Khedker IIT Bombay

Page 165: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 40/43

A Spectrum of Possible Ways of Performing Computation

exhaustivecomputation

computationrestrictedto usableinformation

avoidingredundant

computation

demand drivencomputation

MaximumComputation

MinimumComputation

EarlyComputation

LateComputation

What should be computed?

When should it be computed?

Do not compute what you don’t need!

Who defines what is needed?

Uday Khedker IIT Bombay

Page 166: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 40/43

A Spectrum of Possible Ways of Performing Computation

exhaustivecomputation

computationrestrictedto usableinformation

avoidingredundant

computation

demand drivencomputation

MaximumComputation

MinimumComputation

EarlyComputation

LateComputation

What should be computed?

When should it be computed?

Do not compute what you don’t need!

Who defines what is needed? Client

Uday Khedker IIT Bombay

Page 167: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 40/43

A Spectrum of Possible Ways of Performing Computation

exhaustivecomputation

computationrestrictedto usableinformation

avoidingredundant

computation

incrementalcomputation

splitting intopre and postcomputation

demand drivencomputation

MaximumComputation

MinimumComputation

EarlyComputation

LateComputation

What should be computed?

When should it be computed?

Do not compute what you don’t need!

Who defines what is needed? Algorithm, Data Structure

Uday Khedker IIT Bombay

Page 168: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 40/43

A Spectrum of Possible Ways of Performing Computation

exhaustivecomputation

computationrestrictedto usableinformation

avoidingredundant

computation

incrementalcomputation

splitting intopre and postcomputation

demand drivencomputation

MaximumComputation

MinimumComputation

EarlyComputation

LateComputation

What should be computed?

When should it be computed?

Do not compute what you don’t need!

Who defines what is needed? Algorithm, Data Structure

Other examples:

• Bottom up summary flow functions

• Value contexts

• Work list based methods

• BDDs

Uday Khedker IIT Bombay

Page 169: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 40/43

A Spectrum of Possible Ways of Performing Computation

exhaustivecomputation

computationrestrictedto usableinformation

avoidingredundant

computation

incrementalcomputation

splitting intopre and postcomputation

demand drivencomputation

MaximumComputation

MinimumComputation

EarlyComputation

LateComputation

What should be computed?

When should it be computed?

Do not compute what you don’t need!

Who defines what is needed? Definition of Analysis

Uday Khedker IIT Bombay

Page 170: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 40/43

A Spectrum of Possible Ways of Performing Computation

exhaustivecomputation

computationrestrictedto usableinformation

avoidingredundant

computation

incrementalcomputation

splitting intopre and postcomputation

demand drivencomputation

MaximumComputation

MinimumComputation

EarlyComputation

LateComputation

What should be computed?

When should it be computed?

Do not compute what you don’t need!

Who defines what is needed? No One!

Uday Khedker IIT Bombay

Page 171: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 40/43

A Spectrum of Possible Ways of Performing Computation

exhaustivecomputation

computationrestrictedto usableinformation

avoidingredundant

computation

incrementalcomputation

splitting intopre and postcomputation

demand drivencomputation

MaximumComputation

MinimumComputation

EarlyComputation

LateComputation

What should be computed?

When should it be computed?

Do not compute what you don’t need!

Who defines what is needed?These seem orthogonaland may be used together

Uday Khedker IIT Bombay

Page 172: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 41/43

Conclusions

• Building quick approximations and compromising on precision may not benecessary for efficiency

• Building clean abstractions to separate the necessary information fromredundant information is much more significant

Uday Khedker IIT Bombay

Page 173: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 41/43

Conclusions

• Building quick approximations and compromising on precision may not benecessary for efficiency

• Building clean abstractions to separate the necessary information fromredundant information is much more significant

Our experience of points-to analysis shows that

◮ Use of liveness reduced the pointer information . . .◮ which reduced the number of contexts required . . .◮ which reduced the liveness and pointer information . . .

Uday Khedker IIT Bombay

Page 174: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 41/43

Conclusions

• Building quick approximations and compromising on precision may not benecessary for efficiency

• Building clean abstractions to separate the necessary information fromredundant information is much more significant

Our experience of points-to analysis shows that

◮ Use of liveness reduced the pointer information . . .◮ which reduced the number of contexts required . . .◮ which reduced the liveness and pointer information . . .

This encouraged us to explore bottom summary flow functions forpoints-to analysis

◮ which reduced the number of times a procedure is processed and . . .◮ gave rise to generalized points-to facts. . .◮ which reduced the size of intermediate points-to graphs. . .

Uday Khedker IIT Bombay

Page 175: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 41/43

Conclusions

• Building quick approximations and compromising on precision may not benecessary for efficiency

• Building clean abstractions to separate the necessary information fromredundant information is much more significant

Our experience of points-to analysis shows that

◮ Use of liveness reduced the pointer information . . .◮ which reduced the number of contexts required . . .◮ which reduced the liveness and pointer information . . .

This encouraged us to explore bottom summary flow functions forpoints-to analysis

◮ which reduced the number of times a procedure is processed and . . .◮ gave rise to generalized points-to facts. . .◮ which reduced the size of intermediate points-to graphs. . .

Approximations should come after

building abstractions and not before

Uday Khedker IIT Bombay

Page 176: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 42/43

Acknowledgements

Alan Mycroft,Alefiya LightwalaAmey Karkare,Amitabha Sanyal,Avantika GuptaBageshri Sathe,Prachee Yogi,Prashant Singh Rawat,

Pritam Gharat,Priyanka Sawant,Rohan Padhye,Shubhangi Agrawal,Sudakshina Das,Swati Rathi,Vini Kanvar,Vinit Deodhar

. . . and many more

Uday Khedker IIT Bombay

Page 177: TheAbstractionVs. ApproximationsDilemma inPointerAnalysisuday/soft-copies/aadppa.pdf- Frances Allen, IITK Workshop (2007) • A couple of influential papers Which Pointer Analysis

IITD PPA Dilemma: Conclusions 43/43

Last But Not the Least

Thank You!

Uday Khedker IIT Bombay