Intraprocedural Dataflow Analysis for Software Product Lines

AOSD 2012, Mar 28, 2012

Claus Brabrand (IT University of Copenhagen; Universidade Federal de Pernambuco)
Márcio Ribeiro (Universidade Federal de Alagoas; Universidade Federal de Pernambuco)
Paulo Borba (Universidade Federal de Pernambuco)
Társis Tolêdo (Universidade Federal de Pernambuco)

< Outline >

Introduction

Software Product Lines (recap)

Dataflow Analysis (recap)

Dataflow Analyses for Software Product Lines:
feature in-sensitive (A1) vs feature sensitive (A2, A3, A4)

Results:
A1 vs A2 vs A3 vs A4 (in theory and practice)

Related Work

Conclusion

Introduction

Traditional Software Development: One program = One product (1x CAR, 1x CELL PHONE, 1x APPLICATION)

Product Line: A ”family” of products (of N ”similar” products): CARS, CELL PHONES, APPLICATIONS, obtained by customizing

SPL: (Family of Programs)

Software Product Line

Features: F = { COLOR, VIDEO }

Configurations (elements of 2^F): Ø, {Color}, {Video}, {Color, Video}

Feature Model: selects the VALID configurations (e.g.: ψFM ≡ VIDEO COLOR)

SPL (Family of Programs): customize to obtain one program per valid configuration

Software Product Line

SPL: Family of Programs

Conditional compilation:

#ifdef ( )
...
#endif

Alternatively, via Aspects (as in AOSD)

Example:

Logo logo;
...
#ifdef (VIDEO)
logo = new Logo();
#endif
...
use(logo);

*** uninitialized variable! in configurations: {Ø, {COLOR}}

Similarly for, e.g.:
■ null-pointers
■ unused variables
■ undefined variables

Analysis of SPLs

The Compilation Process:
[Figure: a program is compiled and run, producing a result or an ERROR!; ANALYZE! is applied to the program]

...and for Software Product Lines:
[Figure: customizing yields up to 2^F programs, each of which is compiled, run, and analyzed (ANALYZE!) separately]

Feature-sensitive data-flow analysis!


Dataflow Analysis

Dataflow Analysis:
1) Control-flow graph
2) Lattice (finite height)
3) Transfer functions (monotone)

Example: "sign-of-x analysis" [figure: the sign lattice L]
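The lattice and transfer functions of a sign-of-x analysis can be sketched in a few lines of Java. This is only an illustration, not the authors' implementation: the bitmask encoding and the names `SignLattice`, `join`, `leq`, `assignZero`, `inc`, and `dec` are assumptions made for this sketch.

```java
// Hypothetical encoding: a sign is a bitmask over {-, 0, +}; all names are illustrative.
final class SignLattice {
    static final int BOT = 0, NEG = 1, ZERO = 2, POS = 4, TOP = NEG | ZERO | POS;

    // Least upper bound: union of the possible signs.
    static int join(int a, int b) { return a | b; }

    // Partial order: a is below b iff every sign allowed by a is also allowed by b.
    static boolean leq(int a, int b) { return (a & ~b) == 0; }

    // Transfer function for "x = 0": x is exactly zero afterwards.
    static int assignZero(int in) { return ZERO; }

    // Transfer function for "x++" (integer overflow is ignored in this sketch).
    static int inc(int in) {
        int out = BOT;
        if ((in & NEG) != 0) out |= NEG | ZERO;   // a negative x stays negative or reaches 0
        if ((in & (ZERO | POS)) != 0) out |= POS; // zero or positive becomes positive
        return out;
    }

    // Transfer function for "x--" (mirror image of inc).
    static int dec(int in) {
        int out = BOT;
        if ((in & POS) != 0) out |= POS | ZERO;
        if ((in & (ZERO | NEG)) != 0) out |= NEG;
        return out;
    }
}
```

The lattice has finite height (at most three bits can be added), and each transfer function distributes over `join`, so it is monotone. Applying `assignZero`, `inc`, `dec` in sequence to `BOT` yields `ZERO | POS`, i.e. the value written 0/+ on the slides.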


A1 (brute force)

A1 (feature in-sensitive): N = 2^F compilations!

void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }

ψFM = A ∨ B

Each valid configuration is compiled and analyzed separately (sign of x, starting from ⊥):

c = {A}:     int x = 0;  x++;          ⊥ → 0 → +
c = {B}:     int x = 0;  x--;          ⊥ → 0 → -
c = {A,B}:   int x = 0;  x++;  x--;    ⊥ → 0 → + → 0/+
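A minimal Java sketch of the brute-force strategy, under simplifying assumptions: the program is straight-line code, each guard is a single feature name (or `null` for unconditional code), and signs are encoded as bitmasks. The class `A1BruteForce` and its methods are hypothetical names, not part of SOOT or CIDE.

```java
import java.util.*;

// Hypothetical sketch of A1: derive every product, then run a plain analysis on each.
final class A1BruteForce {
    static final int BOT = 0, NEG = 1, ZERO = 2, POS = 4;

    // Sign transfer functions for the three statements of the running example.
    static int transfer(String op, int in) {
        switch (op) {
            case "x=0": return ZERO;
            case "x++": return ((in & NEG) != 0 ? NEG | ZERO : BOT) | ((in & (ZERO | POS)) != 0 ? POS : BOT);
            case "x--": return ((in & POS) != 0 ? POS | ZERO : BOT) | ((in & (ZERO | NEG)) != 0 ? NEG : BOT);
            default: throw new IllegalArgumentException("unknown statement: " + op);
        }
    }

    static Set<String> config(String... features) {
        return new HashSet<>(Arrays.asList(features));
    }

    // "Compile" one product: keep only statements whose guard is enabled.
    // Each statement is {operation, guard}; a null guard means unconditional.
    static List<String> preprocess(String[][] program, Set<String> config) {
        List<String> product = new ArrayList<>();
        for (String[] stmt : program)
            if (stmt[1] == null || config.contains(stmt[1])) product.add(stmt[0]);
        return product;
    }

    // Plain, feature-oblivious analysis of one straight-line product.
    static int analyze(List<String> product) {
        int value = BOT;
        for (String op : product) value = transfer(op, value);
        return value;
    }

    // A1: one preprocessing step and one analysis run per valid configuration.
    static Map<Set<String>, Integer> analyzeAll(String[][] program, List<Set<String>> validConfigs) {
        Map<Set<String>, Integer> result = new LinkedHashMap<>();
        for (Set<String> c : validConfigs) result.put(c, analyze(preprocess(program, c)));
        return result;
    }
}
```

For the example method `m()` with valid configurations {A}, {B}, {A,B}, `analyzeAll` maps them to `POS`, `NEG`, and `ZERO | POS` respectively. The cost is one product derivation per configuration, which is what the feature-sensitive analyses avoid.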

A2 (consecutive)

A2 (feature sensitive!): the annotated program is analyzed once per configuration c; a statement's transfer function is applied only where c ⊢ [[φ]] holds for its annotation φ (marked ✓), otherwise the value passes through unchanged.

void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }

ψFM = A ∨ B

Statement     Annotation    c = {A}    c = {B}    c = {A,B}
(entry)                     ⊥          ⊥          ⊥
int x = 0;    [[true]]      0 ✓        0 ✓        0 ✓
x++;          [[A]]         + ✓        0          + ✓
x--;          [[B]]         +          - ✓        0/+ ✓
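The consecutive analysis can be sketched as a lifted transfer function that consults the configuration before applying a statement. As before, this is an illustration under assumptions: straight-line code, single-feature guards instead of full formulas, and hypothetical names (`A2Consecutive`, `lifted`, `analyze`).

```java
import java.util.*;

// Hypothetical sketch of A2: analyze the annotated program once per configuration,
// without preprocessing it first.
final class A2Consecutive {
    static final int BOT = 0, NEG = 1, ZERO = 2, POS = 4;

    static int transfer(String op, int in) {
        switch (op) {
            case "x=0": return ZERO;
            case "x++": return ((in & NEG) != 0 ? NEG | ZERO : BOT) | ((in & (ZERO | POS)) != 0 ? POS : BOT);
            case "x--": return ((in & POS) != 0 ? POS | ZERO : BOT) | ((in & (ZERO | NEG)) != 0 ? NEG : BOT);
            default: throw new IllegalArgumentException("unknown statement: " + op);
        }
    }

    static Set<String> config(String... features) {
        return new HashSet<>(Arrays.asList(features));
    }

    // Lifted transfer function: apply the statement only if the configuration
    // satisfies its guard; otherwise act as the identity.
    static int lifted(String[] stmt, Set<String> config, int in) {
        boolean enabled = stmt[1] == null || config.contains(stmt[1]);
        return enabled ? transfer(stmt[0], in) : in;
    }

    // Returns the sign of x after each statement for one configuration.
    static int[] analyze(String[][] program, Set<String> config) {
        int[] after = new int[program.length];
        int value = BOT;
        for (int i = 0; i < program.length; i++) {
            value = lifted(program[i], config, value);
            after[i] = value;
        }
        return after;
    }
}
```

Calling `analyze` once for each of {A}, {B}, and {A,B} reproduces the three per-configuration runs; the program itself is never rewritten.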

A3 (simultaneous)

A3 (feature sensitive!): all configurations are analyzed at once; each lattice value is a tuple with one component per valid configuration, and a statement's transfer function is applied to component c only where c ⊢ [[φ]] (✓).

void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }

ψFM = A ∨ B

∀c ∈ {{A},{B},{A,B}}:

(entry)                     ({A} = ⊥, {B} = ⊥, {A,B} = ⊥)
int x = 0;    [[true]]      ({A} = 0, {B} = 0, {A,B} = 0)
x++;          [[A]]         ({A} = +, {B} = 0, {A,B} = +)
x--;          [[B]]         ({A} = +, {B} = -, {A,B} = 0/+)
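One way to picture the simultaneous analysis is a map from configurations to lattice values that is updated component-wise in a single pass. The sketch below makes the same simplifying assumptions as before (straight-line code, single-feature guards); `A3Simultaneous` and its methods are hypothetical names.

```java
import java.util.*;

// Hypothetical sketch of A3: one pass over the program, with a lifted lattice value
// that holds one sign per valid configuration.
final class A3Simultaneous {
    static final int BOT = 0, NEG = 1, ZERO = 2, POS = 4;

    static int transfer(String op, int in) {
        switch (op) {
            case "x=0": return ZERO;
            case "x++": return ((in & NEG) != 0 ? NEG | ZERO : BOT) | ((in & (ZERO | POS)) != 0 ? POS : BOT);
            case "x--": return ((in & POS) != 0 ? POS | ZERO : BOT) | ((in & (ZERO | NEG)) != 0 ? NEG : BOT);
            default: throw new IllegalArgumentException("unknown statement: " + op);
        }
    }

    static Set<String> config(String... features) {
        return new HashSet<>(Arrays.asList(features));
    }

    // Lifted transfer function: update each component whose configuration
    // satisfies the guard; leave the other components unchanged.
    static Map<Set<String>, Integer> lifted(String[] stmt, Map<Set<String>, Integer> in) {
        Map<Set<String>, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<Set<String>, Integer> e : in.entrySet()) {
            boolean enabled = stmt[1] == null || e.getKey().contains(stmt[1]);
            out.put(e.getKey(), enabled ? transfer(stmt[0], e.getValue()) : e.getValue());
        }
        return out;
    }

    static Map<Set<String>, Integer> analyze(String[][] program, List<Set<String>> validConfigs) {
        Map<Set<String>, Integer> value = new LinkedHashMap<>();
        for (Set<String> c : validConfigs) value.put(c, BOT);
        for (String[] stmt : program) value = lifted(stmt, value);
        return value;
    }
}
```

Compared with the consecutive variant, the loop nesting is swapped: the program is traversed once, and the iteration over configurations happens inside each transfer step.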

A4 (shared)

A4 (feature sensitive!): as A3, but configurations with the same value share one entry, keyed by a formula (a set of configurations).

void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }

ψFM = A ∨ B (written ψ below):

(entry)                     ( [[ψ]] = ⊥ )
int x = 0;    [[true]]      ( [[ψ]] = 0 )
x++;          [[A]]         ( [[ψ ∧ ¬A]] = 0, [[ψ ∧ A]] = + )
x--;          [[B]]         ( [[ψ ∧ ¬A ∧ ¬B]] = 0, [[ψ ∧ A ∧ ¬B]] = +, [[ψ ∧ ¬A ∧ B]] = -, [[ψ ∧ A ∧ B]] = 0/+ )

The entry for [[ψ ∧ ¬A ∧ ¬B]] is dropped: (A ∨ B) ∧ ¬A ∧ ¬B ≡ false, i.e., invalid wrt. the feature model, ψ!

…using BDD representation! (compact + efficient)
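The sharing idea can be sketched by grouping configurations that carry the same lattice value and splitting a group only when a guard separates its members. This sketch stands in explicit sets of configurations for the BDDs mentioned on the slide, keys the groups by value (so groups with equal values merge automatically), and again assumes straight-line code with single-feature guards. `A4Shared` and its methods are hypothetical names.

```java
import java.util.*;

// Hypothetical sketch of A4: lattice values are shared between configurations.
// Each entry maps one sign value to the set of configurations that have it.
final class A4Shared {
    static final int BOT = 0, NEG = 1, ZERO = 2, POS = 4;

    static int transfer(String op, int in) {
        switch (op) {
            case "x=0": return ZERO;
            case "x++": return ((in & NEG) != 0 ? NEG | ZERO : BOT) | ((in & (ZERO | POS)) != 0 ? POS : BOT);
            case "x--": return ((in & POS) != 0 ? POS | ZERO : BOT) | ((in & (ZERO | NEG)) != 0 ? NEG : BOT);
            default: throw new IllegalArgumentException("unknown statement: " + op);
        }
    }

    static Set<String> config(String... features) {
        return new HashSet<>(Arrays.asList(features));
    }

    // Lifted transfer function: split each group by the guard, apply the transfer
    // function once per group (not once per configuration), and drop empty groups.
    static Map<Integer, Set<Set<String>>> lifted(String[] stmt, Map<Integer, Set<Set<String>>> in) {
        Map<Integer, Set<Set<String>>> out = new TreeMap<>();
        for (Map.Entry<Integer, Set<Set<String>>> entry : in.entrySet()) {
            Set<Set<String>> on = new HashSet<>(), off = new HashSet<>();
            for (Set<String> c : entry.getValue())
                (stmt[1] == null || c.contains(stmt[1]) ? on : off).add(c);
            if (!on.isEmpty())
                out.computeIfAbsent(transfer(stmt[0], entry.getKey()), k -> new HashSet<>()).addAll(on);
            if (!off.isEmpty())
                out.computeIfAbsent(entry.getKey(), k -> new HashSet<>()).addAll(off);
        }
        return out;
    }

    // Starts with a single group: every valid configuration maps to BOT.
    static Map<Integer, Set<Set<String>>> analyze(String[][] program, List<Set<String>> validConfigs) {
        Map<Integer, Set<Set<String>>> value = new TreeMap<>();
        value.put(BOT, new HashSet<>(validConfigs));
        for (String[] stmt : program) value = lifted(stmt, value);
        return value;
    }
}
```

Because only valid configurations are ever placed in a group, the case ¬A ∧ ¬B never appears, which mirrors the slide's observation that it is invalid with respect to ψ. How much is gained depends on how long configurations stay in the same group.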


Evaluation

Four (qualitatively different) SPL benchmarks
Implementation: A1, A2, A3, A4 in SOOT + CIDE
Evaluation: total time, analysis time, memory usage

Results (total time)

In theory: [formulas not preserved in the transcript; stated in terms of 2^F]

In practice (Reaching Definitions): [chart of total time per benchmark; bar labels: 6x, 8x, 14x, 3x, 5x, 3x, 1x, 1x, 1x, 2x, 2½x, 2x]

Feature sensitive (avg. gain factor): A2 (3x), A3 (4x), A4 (5x)

Results (analysis time)

In theory: [formulas not preserved in the transcript; stated in terms of 2^F]

In practice (Reaching Definitions), A2 vs A3: on average, A3 is (1.5x) faster (caching!)

TIME(A4): Depends on degree of sharing in SPL!

Results (memory usage)

In theory: [formulas not preserved in the transcript; stated in terms of 2^F]

In practice (Reaching Definitions), A2 vs A3: 6.3 : 1 on average

SPACE(A4): Depends on degree of sharing in SPL!


Related Work (DFA)

Path-sensitive DFA:
Idea of “conditionally executed statements”
Compute different analysis info along different paths (~ A2, A3, A4) to improve precision or to optimize “hot paths”
“Constant Propagation with Conditional Branches” ( Wegman and Zadeck ) TOPLAS 1991

Predicated DFA:
Guard lattice values by propositional logic predicates (~ A4), yielding “optimistic dataflow values” that are kept distinct during analysis (~ A3 and A4)
“Predicated Array Data-Flow Analysis for Run-time Parallelization” ( Moon, Hall, and Murphy ) ICS 1998

Our work: Automatically lift any DFA to SPLs (with ψFM) ⇒ feature-sensitive analysis for analyzing entire program family

Related Work (Lifting for SPLs)

Model Checking:
“Model Checking Lots of Systems: Efficient Verification of Temporal Properties in Software Product Lines” ( Classen, Heymans, Schobbens, Legay, and Raskin ) ICSE 2010
Model checks all SPLs at the same time (3.5x faster) than one by one! (similar goal, diff techniques)

Type Checking:
“Type Safety for Feature-Oriented Product Lines” ( Apel, Kastner, Grösslinger, and Lengauer ) ASE 2010
“Type-Checking Software Product Lines - A Formal Approach” ( Kastner and Apel ) ASE 2008
Type checking ↔ DFA (similar goals, diff techniques)
Our: auto lift any DFA (uninit vars, null pointers, ...)

Parsing:
“Variability-Aware Parsing in the Presence of Lexical Macros & C.C.” ( Kastner, Giarrusso, Rendel, Erdweg, Ostermann, and Berger ) OOPSLA 2011
(similar techniques, diff goal): Split and merging parsing (~A4) and also uses instrumentation

Testing:
“Reducing Combinatorics in Testing Product Lines” ( Hwan, Kim, Batory, and Khurshid ) AOSD 2011
Select relevant feature combinations for a given test case
Uses (hardwired) DFA (w/o FM) to compute reachability

Related Work (emerging interfaces)

Emerging Interfaces: Compute E.I. to flag dependencies and how an edit in one place affects feature(s) elsewhere

“Emergent Feature Modularization” ( Ribeiro, Pacheco, Teixeira, and Borba ) Onward! 2010

“EMERGO: A Tool for Improving Maintainability of Preprocessor-Based PLs” ( Ribeiro, Tolêdo, Winther, Brabrand, and Borba ) AOSD Tool Demo 2012

AOSD 2012 TOOL DEMO: “EMERGO: A Tool for Improving Maintainability of Preprocessor-Based Product Lines”, Thursday at 14:00 and Friday at 16:00



Conclusion(s)

It is possible to analyze SPLs using DFAs

We can automatically "lift" any dataflow analysis and make it feature sensitive:

A2) Consecutive

A3) Simultaneous

A4) Shared Simultaneous

A2,A3,A4 much faster (3x,4x,5x) than naive A1

A3 is (1.5x) faster than A2 (caching!)

A4 saves lots of memory vs A3 (sharing!) 6.3 : 1


< Obrigado* >

*) Thanks


BONUS SLIDES

Future Work

Explore how all this scales to INTER-procedural data-flow analysis (in progress...!)

In particular:
…relative speed of A1 vs A2 vs A3 vs A4?
…which analyses are feasible vs infeasible?

Specification: A1, A2, A3, A4

[Figure: formal specifications of A1, A2, A3, and A4; not preserved in the transcript]

Results (analysis time)

In theory: [formulas not preserved in the transcript; stated in terms of 2^F]

Nx1 ≠ 1xN?! (caching!)

In practice (Reaching Definitions), A2 vs A3: on average, A3 is (1.5x) faster

TIME(A4): Depends on degree of sharing in SPL!

A2 vs A3 (caching)

Cache misses in A2 vs A3:

Normal cache: As expected, A2 incurs more cache misses (⇒ slower!)

Full/no cache*: As hypothesized, this indeed affects A2 more than A3

i.e., A3 has better cache properties than A2

*) we flush the L2 cache by traversing an 8MB “bogus array” to invalidate the cache!

Analyzing a Program

1) Program
2) Build CFG
3) Make Equations
4) Solve equations: fixed-point computation (iteration)
5) SOLUTION (least fixed point)
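Steps 2 to 5 can be illustrated with a small round-robin solver. The sketch assumes a sign analysis of a single variable x, a CFG given as predecessor lists, and a tiny statement vocabulary ("x=0", "x++", "skip"); the class `FixedPoint` and its `solve` method are hypothetical names chosen for this example.

```java
// Hypothetical sketch: solve the dataflow equations
//   in[n] = join of out[p] over all predecessors p of n,   out[n] = f_n(in[n])
// by iterating from bottom until nothing changes (least fixed point).
final class FixedPoint {
    static final int BOT = 0, NEG = 1, ZERO = 2, POS = 4;

    static int transfer(String op, int in) {
        switch (op) {
            case "skip": return in;
            case "x=0":  return ZERO;
            case "x++":  return ((in & NEG) != 0 ? NEG | ZERO : BOT) | ((in & (ZERO | POS)) != 0 ? POS : BOT);
            default: throw new IllegalArgumentException("unknown statement: " + op);
        }
    }

    // ops[n] is the statement at CFG node n; preds[n] lists the predecessors of n.
    static int[] solve(String[] ops, int[][] preds) {
        int[] out = new int[ops.length];   // every node starts at BOT
        boolean changed = true;
        while (changed) {                  // terminates: finite lattice, monotone transfers
            changed = false;
            for (int node = 0; node < ops.length; node++) {
                int in = BOT;
                for (int p : preds[node]) in |= out[p];   // join = bitwise or
                int next = transfer(ops[node], in);
                if (next != out[node]) { out[node] = next; changed = true; }
            }
        }
        return out;
    }
}
```

As a usage example, the loop `x = 0; while (...) x++;` can be encoded with nodes 0 (`x=0`), 1 (loop head, `skip`), 2 (`x++`), and 3 (exit, `skip`), with predecessor lists `{}`, `{0, 2}`, `{1}`, `{1}`. The solver stabilizes with x being 0/+ at the loop head and at the exit.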


IFDEF normalization

Refactor "undisciplined" (lexical) ifdefs into "disciplined" (syntactic) ifdefs:

Normalize "ifdef"s (by transformation):

Feature Model (Example)

Feature Model: [feature diagram; not preserved in the transcript]

Feature set: F = {Car, Engine, 1.0, 1.4, Air}

Formula: ψFM ≡ Car ∧ Engine ∧ (1.0 ⊕ 1.4) ∧ (Air ⇒ 1.4)

Set of configurations: [[ψFM]] = { {Car, Engine, 1.0}, {Car, Engine, 1.4}, {Car, Engine, 1.4, Air} }

Note: | [[ψFM]] | = 3 < 32 = |2^F|

Example Bug from Lampiro

Lampiro SPL (IM client for XMPP protocol):

*** uninitialized variable "logo" (if feature "GLIDER" is defined)

Similar problems with: undeclared variables, unused variables, null pointers, ...

BDD (Binary Decision Diagram)

Compact and efficient representation for boolean functions (aka. sets of sets of names)

FAST: negation, conjunction, disjunction, equality!

Example: F(A,B,C) = A ∧ (B ∨ C)

[Figure: the full BDD for F (decision nodes A; B, B; C, C, C, C) next to the minimized BDD (decision nodes A, B, C)]

Formula ~ Set of Configurations

Definitions (given F, set of feature names):
f ∈ F: feature name
c ∈ 2^F: configuration (set of feature names), c ⊆ F
X ∈ 2^(2^F): set of configurations (set of sets of feature names), X ⊆ 2^F

Example ifdefs:
F = {A,B}: [[ A ∨ B ]] = { {A}, {B}, {A,B} }
F = {A,B,C}: [[ A ∧ (B ∨ C) ]] = { {A,B}, {A,C}, {A,B,C} }
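The correspondence between a formula and its set of configurations can be checked by brute-force enumeration of all subsets of F. This is a sketch for small feature sets only (the slides use BDDs for this purpose); `Denotation` and `denote` are hypothetical names, and formulas are passed as Java predicates over a configuration.

```java
import java.util.*;
import java.util.function.Predicate;

// Hypothetical sketch: compute [[formula]] as the set of all configurations
// (subsets of the feature set) for which the formula holds.
final class Denotation {
    static Set<Set<String>> denote(List<String> features, Predicate<Set<String>> formula) {
        Set<Set<String>> result = new HashSet<>();
        // Each bit pattern selects one of the 2^|F| subsets of the feature set.
        for (int bits = 0; bits < (1 << features.size()); bits++) {
            Set<String> c = new HashSet<>();
            for (int i = 0; i < features.size(); i++)
                if ((bits & (1 << i)) != 0) c.add(features.get(i));
            if (formula.test(c)) result.add(c);
        }
        return result;
    }
}
```

For instance, `denote(Arrays.asList("A", "B", "C"), c -> c.contains("A") && (c.contains("B") || c.contains("C")))` yields the three configurations {A,B}, {A,C}, and {A,B,C}. The enumeration is exponential in the number of features, which is why a compact representation matters in practice.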

Emerging Interfaces

"A Tool for Improving Maintainability of Preprocessor-based Product Lines" ( Márcio Ribeiro, Társis Tolêdo, Paulo Borba, Claus Brabrand )

CBSoft 2011: *** Best Tool Award ***

Errors

Logo logo;
#ifdef (VIDEO)
logo = new Logo();
#endif
use(logo);

*** uninitialized variable! in configurations: {Ø, {COLOR}}

Logo logo;
#ifdef (VIDEO)
logo = new Logo();
#endif
logo.use();

*** null-pointer exception! in configurations: {Ø, {COLOR}}

Logo logo;
...
#ifdef (VIDEO)
logo = new Logo();
#endif

*** unused variable! in configurations: {Ø, {COLOR}}