data o - ucl computer sciencew graphs .. 36 2.2.2 data o w analysis. 36 2.2.3 inheren t inaccuracies...

Post on 23-Jun-2020


Dataflow Minimal Slicing

by

Sebastian Danicic
BSc. Pure Mathematics (London University)
MSc. Computation (Oxford University)

A thesis submitted in partial fulfillment of the requirements of the University of North London for the degree of Doctor of Philosophy

May 20, 1999

To Aurora

Abstract

A slice of a program p with respect to a variable x is a set, S, of statements from p such that every statement that affects x is in S. S may, but need not, contain statements that do not affect x. Slicing enables large programs to be decomposed into ones which are smaller and hence potentially easier to analyse, maintain and test. It is desirable that slicing algorithms produce slices that are `as small as possible'. A statement minimal slice, S, with respect to p and x is a set which contains only statements that affect x. Statement minimal slices are known not to be computable.

Slicing algorithms traditionally work at the dataflow level, i.e. the only information used about each expression in the program being sliced is the set of variables referenced by the expression. As a result, such algorithms cannot distinguish between programs with identical structure up to expressions, where corresponding expressions reference the same set of variables. Such algorithms are thus, in effect, not working on single programs but on sets of dataflow equivalent programs.

The slices produced by these algorithms are not, however, dataflow minimal. This means that for some programs, p, the slice produced by such an algorithm with respect to x contains statements that do not affect x in any program in the dataflow equivalence class of p.

This thesis is an investigation into the question: Are dataflow minimal slices computable?

We introduce a definition of a form of dataflow minimal slice and develop an algorithm for computing it which we prove to be correct for loop free programs.

For programs, p, containing loops we prove that there exists an integer n, which we call the maximal unfolding number for p, where the dataflow minimal slice of p is the same as the dataflow minimal slice of its mth unfolding for all m ≥ n. An unfolding is, by definition, loop free and therefore its dataflow minimal slice can be computed. The problem of computing a dataflow minimal slice of p is thus reduced to the problem of finding the maximal unfolding number of p. We implement a slicing algorithm based on unfolding which repeatedly unfolds p until there are no further additions to p's slice set. This algorithm is guaranteed to produce dataflow minimal slices provided that reaching this stable state implies that p's maximal unfolding number has been reached.
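To make the phrase "working at the dataflow level" concrete, the following is a minimal sketch of a conventional backward slicer of the traditional kind described above. It is not the dataflow minimal algorithm developed in this thesis. Every expression is represented only by the set of variables it references, so any two dataflow equivalent programs receive the same slice. The class names (`Assign`, `If`, `While`), the function names and the example program are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Assign:
    label: int
    target: str
    refs: FrozenSet[str]  # variables referenced by the right-hand side


@dataclass(frozen=True)
class If:
    label: int
    refs: FrozenSet[str]  # variables referenced by the guard
    then: tuple
    orelse: tuple


@dataclass(frozen=True)
class While:
    label: int
    refs: FrozenSet[str]  # variables referenced by the guard
    body: tuple


def slice_seq(stmts, relevant):
    """Walk stmts backwards; return (labels kept, variables relevant before stmts)."""
    labels = set()
    for stmt in reversed(stmts):
        found, relevant = slice_stmt(stmt, relevant)
        labels |= found
    return labels, relevant


def slice_stmt(stmt, relevant):
    if isinstance(stmt, Assign):
        if stmt.target in relevant:
            # The target is killed; the referenced variables become relevant.
            return {stmt.label}, (relevant - {stmt.target}) | stmt.refs
        return set(), relevant
    if isinstance(stmt, If):
        l1, r1 = slice_seq(stmt.then, relevant)
        l2, r2 = slice_seq(stmt.orelse, relevant)
        if l1 or l2:
            # The guard is kept only if some statement it controls is kept.
            return l1 | l2 | {stmt.label}, r1 | r2 | stmt.refs
        return set(), relevant
    # While: grow the relevant set at the loop head until it is stable.
    labels, rel = set(), set(relevant)
    while True:
        found, before = slice_seq(stmt.body, rel)
        labels |= found
        grown = rel | before | (stmt.refs if labels else set())
        if grown == rel:
            break
        rel = grown
    if labels:
        return labels | {stmt.label}, rel
    return set(), relevant


def dataflow_slice(program, var):
    """Labels of the statements kept when slicing on var at the end of program."""
    labels, _ = slice_seq(program, {var})
    return sorted(labels)


# Hypothetical example: only the referenced-variable sets are recorded, so
# "s := s + i" and "s := s * i" would be indistinguishable to this slicer.
example = (
    Assign(1, "i", frozenset()),
    Assign(2, "s", frozenset()),
    Assign(3, "p", frozenset()),
    While(4, frozenset({"i", "n"}), (
        Assign(5, "s", frozenset({"s", "i"})),
        Assign(6, "p", frozenset({"p", "i"})),
        Assign(7, "i", frozenset({"i"})),
    )),
)

print(dataflow_slice(example, "s"))  # [1, 2, 4, 5, 7]
print(dataflow_slice(example, "p"))  # [1, 3, 4, 6, 7]
```

Because the slicer never inspects what an expression computes, it must keep a statement whenever the referenced-variable sets suggest a possible dependence. That conservatism is the source of the non-minimality that the thesis investigates.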

Acknowledgements

First and foremost, I would like to thank my main supervisor Dr Mark Harman of Goldsmiths' College, London University.

He has been the perfect supervisor both technically and practically. I was inspired by his broad knowledge and deep understanding of the subject matter and I benefited greatly from his skills in managing research. His tenacity and encouragement were unremitting throughout.

Among my most treasured experiences are the many hours each week that we spent discussing interesting issues, not only related to the subject matter of this thesis. Long may they continue!

I would like to thank my other supervisors, Professor Malcolm Munro of the University of Durham and Professor Dan Simpson of Brighton University for all their help and encouragement.

I would also like to thank the following people:

Professor Yau Jim Yip and Professor Ian Haines of the University of North London for their support and encouragement and for providing me with the conditions necessary to complete this thesis.

Dr Ross Paterson of City University, for introducing me to the delights of Debian Linux, for his beautiful Hope interpreter and for never failing to know the answer to every question I have asked.

Tom Reps of the University of Wisconsin and Tim Teitelbaum of Cornell University for suggesting that I use schemas.

Mrs. Yoga Sivagurunathan of the University of North London for producing the lovely diagrams in the Parallel Algorithm example (Section 2.3.4, page 44).

Dr John Howroyd of Goldsmiths' College, London University, for the `John Howroyd' example (Section 3.13.1, page 113).

My `room mates', Paul Fairney and Ryan Newdick who have both been exceedingly encouraging and supportive and have had to put up with months of dirty coffee cups without complaining!

Lastly, I would like to thank Aurora and Ivan both for proof reading and for their many helpful comments and suggestions.

Contents

I Dataflow Minimal Slicing ... 19
1 Introduction ... 21
  1.1 Program Slicing ... 21
    1.1.1 An Example of Slicing ... 21
  1.2 Applications of Slicing ... 22
  1.3 Dataflow analysis ... 23
  1.4 The Problem ... 23
  1.5 Examples of the Dataflow Minimality Problem ... 24
  1.6 Organisation of this Thesis ... 27
2 Slicing: Semantics and Algorithms ... 33
  2.1 Different Forms of Slice ... 33
    2.1.1 Backward vs. Forward ... 33
    2.1.2 Static vs. Dynamic ... 34
    2.1.3 Intra-procedural vs Inter-procedural ... 34
    2.1.4 Slicing Structured vs. Unstructured Programs ... 34
    2.1.5 Dataflow vs. Non-Dataflow ... 35
  2.2 Weiser's Work ... 35
    2.2.1 Control Flow Graphs ... 36
    2.2.2 Dataflow Analysis ... 36
    2.2.3 Inherent Inaccuracies in Dataflow Analysis ... 37
    2.2.4 Traditional Dependence ... 38
    2.2.5 Data Dependence ... 38

      Examples of Data Dependence ... 38
    2.2.6 Control Dependence ... 39
    2.2.7 Weiser's Algorithm ... 39
      Directly Relevant Variables ... 40
      Directly Relevant Statements ... 40
      Indirectly Relevant Variables ... 40
      Indirectly Relevant Statements ... 41
  2.3 A Parallel version of Weiser's Algorithm ... 41
    2.3.1 Process Behaviour ... 42
    2.3.2 Starting Network Communication ... 43
    2.3.3 Constructing the Slice ... 44
    2.3.4 Example Execution of the Parallel Algorithm ... 44
  2.4 Weiser Slices ... 51
      Example ... 51
    2.4.1 Slicing using Program Dependence Graphs ... 52
  2.5 The Semantics of Slicing ... 54
    2.5.1 Weiser's Semantic Definition of Valid Slices ... 55
      State Trajectories ... 55
    2.5.2 End Slicing ... 55
    2.5.3 Slicing and non-Termination ... 56
    2.5.4 The Semantics of the PDG approach ... 56
    2.5.5 Standard Semantics ... 56
      Ordering on States ... 57
      Evaluating Expressions ... 57
      Strictness of E in Standard Semantics ... 58
      Assignment Statements ... 58
      Sequences of Statements ... 58
      Conditionals ... 58
      Loops ... 59
      Meaning of Loops (Example) ... 59
      Example ... 60

    2.5.6 Lazy Semantics ... 60
    2.5.7 Statement Minimal Slices ... 62
    2.5.8 Dataflow Minimality Problem ... 63
    2.5.9 Venkatesh's Work ... 64
    2.5.10 Hausler's Work ... 66
    2.5.11 Unfolding ... 68
  2.6 Dynamic Slicing ... 69
      Agrawal and Horgan's First Algorithm ... 71
      Example ... 71
      Agrawal and Horgan's Second Algorithm ... 73
      Agrawal and Horgan's Third Algorithm ... 75
  2.7 Symbolic Execution ... 80
      Example ... 80
3 Dataflow Dependencies ... 83
  3.1 Introduction ... 83
  3.2 Minimal Slicing Algorithms ... 84
    3.2.1 Example: Statement Minimal Weiser Slices ... 85
    3.2.2 Example: Dataflow Minimal Weiser Slicing ... 85
  3.3 Discussion ... 86
  3.4 Assumptions about the Programming Language ... 87
    3.4.1 Syntax of Programs ... 87
    3.4.2 Types ... 88
    3.4.3 The Variables Referenced by an Expression ... 88
    3.4.4 Assumptions about Expressions in Programs ... 88
    3.4.5 Example ... 90
  3.5 The Variable Dependencies: VD and TVD ... 91
    3.5.1 Variable Dependence (VD) ... 92
    3.5.2 Examples of Variable Dependence ... 92
    3.5.3 Terminating Variable Dependence (TVD) ... 93
    3.5.4 Examples of Terminating Variable Dependence ... 94
  3.6 The Undecidability of VD and TVD ... 95

  3.7 Dataflow Dependence ... 95
    3.7.1 Dataflow Equivalence ... 96
    3.7.2 Example of Dataflow Equivalence ... 97
  3.8 The Dataflow Variable Dependencies: DVD and DTVD ... 98
    3.8.1 Dataflow Variable Dependence (DVD) ... 98
    3.8.2 Examples of Dataflow Variable Dependence ... 98
    3.8.3 Dataflow Terminating Variable Dependence (DTVD) ... 99
    3.8.4 A Taxonomy of Variable Dependence ... 99
  3.9 The Label Dependencies: LD and TLD ... 102
    3.9.1 Label Dependence (LD) ... 102
    3.9.2 Terminating Label Dependence (TLD) ... 103
    3.9.3 Examples ... 104
    3.9.4 A Taxonomy of Label Dependence ... 107
  3.10 The Undecidability of LD and TLD ... 108
  3.11 Schemas ... 108
    3.11.1 Syntax of Schemas ... 109
    3.11.2 Uniqueness of Labels ... 110
    3.11.3 Interpretations ... 110
  3.12 Redefining our Dataflow Dependencies in terms of Schemas ... 111
  3.13 A Comparison of Dataflow Label Dependence with Slicing ... 112
    3.13.1 The Slices produced by DTLD ... 112
    3.13.2 The Slices produced by DLD ... 114
  3.14 The Dataflow Minimality of Algorithms for DTVD etc. ... 116
    3.14.1 Example: DTVD ... 116
  3.15 Conclusion ... 117
4 The Semantics, S, of Loop-free Schemas ... 119
  4.1 Introduction ... 119
  4.2 Symbolic Values ... 119
    4.2.1 Examples of Symbolic Values ... 120
    4.2.2 Symbolic States () ... 121
    4.2.3 Symbolic Execution of a Schema ... 121

  4.3 Symbolic Execution Trees ... 122
    4.3.1 Example of a Symbolic Execution Tree ... 123
  4.4 Operations on Symbolic Execution Trees ... 125
    4.4.1 Paths ... 125
      Example ... 125
    4.4.2 The Path Function, pfun, of a symbolic execution tree, t. ... 125
    4.4.3 Simple Symbolic Execution Trees ... 127
    4.4.4 Simplification of a Symbolic Execution Tree ... 127
      Example ... 127
    4.4.5 Pruning Symbolic Execution Trees ... 128
    4.4.6 Evaluating a symbolic value δ in a Symbolic State ... 129
      Example ... 130
    4.4.7 Updating a Symbolic State in a Symbolic State ... 130
    4.4.8 Evaluating a Symbolic Execution Tree in a Symbolic State ... 130
    4.4.9 The Sequence of two Symbolic Execution Trees ... 131
  4.5 The Semantics of Loop Free Schemas ... 131
    4.5.1 Assignments ... 131
    4.5.2 Fail ... 132
    4.5.3 Skip ... 132
    4.5.4 Conditionals ... 132
    4.5.5 Statement Sequences ... 132
    4.5.6 Example ... 133
  4.6 Implementation of the Semantics of Loop Free Schemas ... 133
    4.6.1 The Abstract Syntax for Symbolic Values (Definition 4.2.1 (page 119)) ... 134
    4.6.2 The `Standard' Update Function [88] ... 134
    4.6.3 The Abstract Syntax for Schemas (Section 3.11). ... 135
    4.6.4 Symbolic States (Section 4.2.2) ... 135
    4.6.5 The Abstract Syntax for Symbolic Execution Trees. (Section 4.3) ... 135
    4.6.6 Representation of Paths (Definition 4.4.1 (page 125)) ... 135
    4.6.7 evaldelta (Definition 4.4.6 (page 129)) ... 135
    4.6.8 updatestateinstate (Definition 4.4.7 (page 130)) ... 135

    4.6.9 treeinstate (Definition 4.4.8 (page 130)) ... 136
    4.6.10 sequence (Definition 4.4.9 (page 131)) ... 136
    4.6.11 prune (Definition 4.4.4 (page 128)) ... 136
    4.6.12 simplify (Definition 4.4.5 (page 129)) ... 136
    4.6.13 The Semantic Function S (Section 4.5) ... 136
    4.6.14 The skip Rule (Definition 4.5.3 (page 132)) ... 137
    4.6.15 The Sequence Rule (Definition 4.5.5 (page 132)) ... 137
    4.6.16 The FAIL Rule (Definition 4.5.2 (page 132)) ... 137
    4.6.17 The Assignment Rule (Definition 4.5.1 (page 131)) ... 137
    4.6.18 The Conditional Rule (Definition 4.5.4 (page 132)) ... 137
    4.6.19 The Path Function (Definition 4.4.2 (page 125)) ... 137
  4.7 Conclusion ... 138
5 The Soundness and Completeness of S ... 139
  5.1 Introduction ... 139
  5.2 The Correspondence between Symbolic Execution Trees and Programs ... 140
    5.2.1 The Function evalsym ... 140
    5.2.2 The Derived State ... 142
    5.2.3 The function satisfy ... 142
    5.2.4 Differences ... 143
  5.3 Further Results ... 143
    5.3.1 The Result of Pruning a Simple Symbolic Execution Tree is Simple ... 143
    5.3.2 The Result of Simplifying a Symbolic Execution Tree is Simple ... 144
    5.3.3 A Partial Order on Paths ... 145
    5.3.4 `Smaller Path' Lemma ... 145
    5.3.5 `Pruning' Lemma ... 148
    5.3.6 `Disagreement' Lemma ... 151
    5.3.7 `No Subpaths' Lemma ... 151
    5.3.8 Joining Paths ... 153
    5.3.9 `Correspondence' Lemma ... 153
    5.3.10 Evaluating a Path in a Symbolic State ... 154
    5.3.11 The `One Path' Lemma ... 159

  5.4 The Soundness and Completeness of S ... 159
  5.5 Conclusion ... 169
6 Data and Control Dependence in Symbolic Execution Trees ... 171
  6.1 Introduction ... 171
  6.2 Computing DTLD for Loop-free Schemas ... 172
    6.2.1 Label Data Dependence ... 173
    6.2.2 Example of Label Data Dependence ... 173
    6.2.3 Label Terminating Control Dependence ... 174
    6.2.4 Examples of Label Terminating Control Dependence ... 176
      An Example with Non-termination ... 179
    6.2.5 The DTLslice of a Symbolic Execution Tree ... 182
    6.2.6 The Algorithm for DTLD ... 182
  6.3 Correctness of the Algorithm for DTLD ... 183
    6.3.1 Proof of DTLD Algorithm ... 188
  6.4 Computing DTVD for Loop-free Schemas ... 193
    6.4.1 Variable Data Dependence ... 193
    6.4.2 Variable Terminating Control Dependence ... 194
    6.4.3 The DTVslice of a Symbolic Execution Tree ... 195
    6.4.4 The Algorithm for DTVD ... 195
  6.5 The Algorithms for DLD and DVD ... 195
    6.5.1 Example ... 196
  6.6 Labels are really Variables ... 197
    6.6.1 Example ... 198
    6.6.2 Justification of Label Adding ... 199
    6.6.3 Variables and Labels Combined ... 200
    6.6.4 Computing DTVD and DTLD using the DTslice ... 201
  6.7 Implementation of DTLD for Loop-free Schemas ... 202
    6.7.1 labels (Definition 6.2.2 (page 173)) ... 202
    6.7.2 Ldatadepends (Definition 6.2.1 (page 173)) ... 202
    6.7.3 diffs (Section 5.2.4) ... 202
    6.7.4 LTcontrols (Section 6.2.3) ... 202

    6.7.5 DTLslice (Definition 6.2.4 (page 182)) ... 203
  6.8 Implementation of DTVD for Loop-free Schemas ... 203
    6.8.1 variables (Definition 6.4.2 (page 194)) ... 203
    6.8.2 Vdatadepends (Definition 6.4.1 (page 194)) ... 203
    6.8.3 VTcontrols (Definition 6.4.3 (page 195)) ... 203
    6.8.4 DTVslice (Definition 6.4.4 (page 195)) ... 204
  6.9 Implementation of DTD for Loop-free Schemas ... 204
    6.9.1 Data depends (Definition 6.6.2 (page 200)) ... 204
    6.9.2 Tcontrols (Definition 6.6.3 (page 200)) ... 205
    6.9.3 DTslice (Definition 6.6.5 (page 201)) ... 205
  6.10 Conclusion ... 205
7 Computing Dataflow Dependencies of Schemas with Loops ... 207
  7.1 Introduction ... 207
  7.2 Unfoldings ... 209
    7.2.1 Example of unfolding ... 209
  7.3 The DTVD Algorithm for Loop Schemas ... 211
    7.3.1 Example ... 212
  7.4 Dataflow Dependence of Unfoldings ... 218
    7.4.1 A Partial Ordering on Programs ... 219
    7.4.2 A Partial Ordering on Schemas ... 220
  7.5 Implementation of DTVD and DTLD for Schemas with Loops ... 225
    7.5.1 The Set of Variables Affected by a Schema ... 225
    7.5.2 A Function which checks whether two symbolic execution trees have the same DTslice ... 225
    7.5.3 Implementation of Unfolding ... 225
    7.5.4 Implementation of DTD ... 226
  7.6 Implementation of DTLD ... 226
  7.7 Implementation of DTVD ... 226
8 Conclusions ... 229

  8.1 Why do dataflow slicing algorithms, like Weiser's, produce slices that are not dataflow minimal? ... 229
  8.2 Do algorithms for producing dataflow minimal slices exist? ... 230
  8.3 The Approach ... 231
9 Future Work ... 233
  9.1 Extending the Proofs and Algorithms to DVD and DLD ... 234
  9.2 Improving Efficiency ... 234
  9.3 Experimenting with Different Definitions of Control Dependence to Obtain different dependences ... 234
  9.4 Further Applicability of Symbolic Execution Trees ... 235
    9.4.1 Dataflow Minimal Weiser Slices ... 235
    9.4.2 Programs with Procedures ... 236
    9.4.3 Dynamic Slicing ... 236
II Appendices ... 237
A Sample Outputs from the DTLD and the DTVD Algorithms ... 239
B Programs ... 255
  B.1 Complete Hope Program for DTVD and DTLD ... 255
  B.2 Auxiliary Hope Functions ... 266
C Correctness of the Parallel Algorithm ... 271
    C.1.1 Functional Networks ... 271
      Example ... 271
      Solving the Equations to Produce Slices ... 273
      Valid Implementations ... 273
    C.1.2 Correctness of the Parallel Slicing Algorithm ... 274
      Existence of Solutions ... 274
      Termination ... 274
      Correctness Proof ... 274
      Lemma 1 ... 275

      Lemma 2 ... 275
      Base Case ... 276
      Proof of 3 ... 276
      Proof of 4 ... 277
      Inductive Step ... 277
      Proof of 8 ... 278
      Proof of 9 ... 279
D Proof of the DTVD algorithm for Loop-free Schemas ... 281
    D.0.3 Proof of DTVD Algorithm ... 284

List of Figures

1.1 Program p1.1 and its slice, p'1.1 ... 22
1.2 Program p1.2 ... 24
1.3 The Control Flow Graph of p1.2 ... 25
1.4 Dataflow Minimal Slice of p1.2 ... 26
1.5 Illustration of the Dataflow Minimality Problem ... 27
1.6 End-slicing on x ... 28
1.7 End-slicing on x ... 28
1.8 End-slicing on x ... 28
2.1 Program p2.1 ... 36
2.2 G2.2: The control flow graph of p2.1 ... 37
2.3 The program to be sliced ... 44
2.4 The reverse control flow graph obtained from the initial program ... 45
2.5 Initial state of the process network ... 46
2.6 The state just before process two outputs its first message ... 47
2.7 The state just after process two has output its first message ... 48
2.8 The state after one more pass around the loop ... 49
2.9 The final state of the process network ... 50
2.10 The original program and its slice ... 50
2.11 G2.11: The augmented control flow graph of p2.1 ... 53
2.12 G2.4.1: The program dependence graph of p2.1 ... 54
2.13 Non-Termination Preservation ... 63
2.14 Original Program and Dynamic Slice w.r.t. ({p}, 10, <1>) ... 70
2.15 Example Program ... 71

2.16 PDG ... 72
2.17 Reduced PDG ... 72
2.18 Example ... 73
2.19 PDG ... 73
2.20 Reduced PDG ... 74
2.21 Example Program 3 ... 74
2.22 PDG ... 75
2.23 Example Program 7 ... 77
2.24 PDG ... 78
2.25 PDG ... 79
3.1 Program p1 ... 93
3.2 Program p2 ... 95
3.3 Program p3 ... 96
3.4 Three Dataflow Equivalent Programs ... 97
3.5 Control Flow Graph ... 98
3.6 The Four Variations of Variable Dependence ... 100
3.7 program p10 ... 101
3.8 program p9 ... 101
3.9 program p12 ... 102
3.10 x VD {y} and x LD {1} ... 104
3.11 x VD {} and x LD {1} ... 104
3.12 x VD {z} and x LD {1} ... 104
3.13 x VD {y} and x LD {2} ... 105
3.14 x VD {} and x LD {2, 3} ... 105
3.15 x VD {x, y} and x LD {1, 2} ... 105
3.16 x VD {x, y} and x LD {1, 2, 3} ... 105
3.17 x VD {y} and x TVD {y} and x LD {0, 1, 2, 3} and x TLD {0, 1, 2, 3} ... 106
3.18 x VD {y} and x TVD {y} and x LD {1, 2, 3} and x TLD {1, 2, 3} ... 106
3.19 x VD {y} and x TVD {} and x LD {1, 2, 3} and x TLD {1, 3} ... 106
3.20 x VD {x, c, i} and x TVD {x, c, i} and x LD {2, 3, 5, 6} and x TLD {2, 3, 5} ... 107
3.21 The Four Variations of Label Dependence ... 108


3.22 s3.22, the Schema corresponding to the program in Figure 3.20 (page 107) ... 109
3.23 Example ... 115
3.24 DLD Slice ... 115
4.1 The Symbolic execution of schema s4.1 ... 122
4.2 Example Symbolic Execution Tree ... 124
4.3 The Path Function of the symbolic execution tree in Figure 4.4 (page 128) ... 126
4.4 Simplified Symbolic Execution Tree ... 128
4.5 p4.5 and s4.5 ... 133
4.6 The symbolic execution tree of s4.5 ... 134
6.1 Symbolic Execution Tree ... 174
6.2 The Path Function of the symbolic execution tree in Figure 6.1 (page 174) ... 175
6.3 Data Dependence Table ... 175
6.4 p6.4 and s6.4 ... 177
6.5 The symbolic execution tree, S[[s6.4]] ... 178
6.6 The four paths of S[[s6.4]] ... 178
6.7 The Four Pairs of paths of S[[s6.4]] with different non-⊥ final values for y ... 178
6.8 The Differences of each pair of paths of S[[s6.4]] with different non-⊥ final values for y ... 179
6.9 The Four Pairs of paths of S[[s6.4]] with different non-⊥ final values for v ... 179
6.10 The Differences of each pair of paths of s6.11, with different non-⊥ final values for v ... 179
6.11 p6.11 and s6.11 ... 180
6.12 The symbolic execution tree, S[[s6.11]] ... 181
6.13 Label Data Dependence and Label Terminating Label Dependence ... 182
6.14 DTLslice ... 182
6.15 The Four Pairs of paths of S[[s6.11]] with different final values for y ... 196
6.16 The Four Pairs of paths of S[[s6.11]] with different final values for y ... 197
6.17 Adding Extra Variables for Label Dependence ... 198
6.18 Adding Extra Variables for Label Dependence ... 198
6.19 Label Dependence ... 198


7.1 S[[W0]]: The symbolic execution tree of W0 ... 212
7.2 DTslice of S[[W0]] ... 212
7.3 S[[W1]]: The symbolic execution tree of W1 ... 213
7.4 DTslice of S[[W1]] ... 213
7.5 The symbolic execution tree, S[[W2]] of W2 ... 214
7.6 S[[S; if b1(x, y) then FAIL else skip]] ... 215
7.7 The symbolic execution tree, S[[W2]] before pruning ... 216
7.8 DTslice of S[[W2]] ... 216
7.9 The Symbolic Execution Tree, S[[W3]] ... 217
7.10 DTslice of S[[W3]] ... 217
7.11 DTVD of the Loop Schema ... 218
7.12 DTLD of the Loop Schema ... 218
C.1 The functional network derived from the example program ... 272


Part I

Dataflow Minimal Slicing


Chapter 1

Introduction

1.1 Program Slicing

The underlying feature of all forms of slicing is that from a big complicated program, a smaller simpler slice is obtained. Analysis of some of the properties of the big program is thereby translated into the potentially easier problem of analysing the slice.

Program Slicing was introduced by Mark Weiser in his PhD thesis [92]. Informally, a program p is sliced with respect to a slicing criterion which is a pair (V, i), where V is a set of variables and i is a 'point' in the program. (Footnote 1: We imagine all the statements of the program to be labelled.) The slice s of p is obtained from p by deleting statements and has the property that p and s 'behave the same' with respect to the slicing criterion (V, i).

1.1.1 An Example of Slicing

Consider program p1.1 in Figure 1.1 (page 22). Slicing p1.1 with respect to the set of variables {x} at the end of the program would yield the program p'1.1. Syntactically the slice, p'1.1, has been obtained from the original, p1.1, by deleting statements. A semantic relationship exists between p1.1 and p'1.1 in the sense that in all initial states they both result in the same final value of the variable x.
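The semantic relationship between p1.1 and its slice can be checked on sample states. The following Python sketch is not part of the thesis: the names run_p11, run_slice and same_final_x, and the encoding of a state as a dictionary from variable names to integers, are assumptions made for illustration. Agreement on a finite sample only illustrates the property; it does not establish it for all initial states.

```python
# Illustrative transliteration of program p1.1 and its slice p'1.1.
# A state is a dictionary from variable names to integers (an assumption).

def run_p11(state):
    """Program p1.1: x:=y; if x=3 then begin c:=y; x:=25 end; i:=i+1."""
    s = dict(state)
    s["x"] = s["y"]
    if s["x"] == 3:
        s["c"] = s["y"]
        s["x"] = 25
    s["i"] = s["i"] + 1
    return s

def run_slice(state):
    """Slice p'1.1: x:=y; if x=3 then x:=25."""
    s = dict(state)
    s["x"] = s["y"]
    if s["x"] == 3:
        s["x"] = 25
    return s

def same_final_x(initial_states):
    """True if both programs give the same final x in every sampled state."""
    return all(run_p11(s)["x"] == run_slice(s)["x"] for s in initial_states)

# A small grid of initial states; c and i start at 0 for simplicity.
states = [{"x": x, "y": y, "c": 0, "i": 0}
          for x in range(-2, 6) for y in range(-2, 6)]
print(same_final_x(states))
```

The two functions differ only in the statements that were deleted, which mirrors the syntactic relationship between a program and its slice.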


Program p1.1:

    begin
      x:=y;
      if x=3
      then begin
        c:=y;
        x:=25
      end;
      i:=i+1
    end

Its slice, p'1.1:

    begin
      x:=y;
      if x=3
      then x:=25
    end

Figure 1.1: Program p1.1 and its slice, p'1.1

1.2 Applications of Slicing

Slicing has many applications including

• Program Comprehension [10, 31, 50, 51]
• Program Maintenance [11, 18, 20, 26, 38, 41, 40, 39, 48, 79, 85, 95]
• Program Debugging [94, 4, 84, 66, 77, 86]
• Testing [9, 13, 45, 46, 66]
• Re-engineering [75, 85] and Component Re-use [8]
• Program Integration [57]
• Software Metrics [12, 81, 79, 80, 74, 49, 76]

Tip [89] and Binkley and Gallagher [15] provide detailed surveys of the paradigms, applications and algorithms for program slicing.

A major aim of slicing is to delete as many statements as possible from a program in producing its slice. Certainly for program comprehension, maintenance, debugging and testing there is little doubt that, everything else being equal, small programs are easier to understand and maintain than larger ones. Much of the literature on program slicing is concerned with improving the algorithms for slicing both in terms of size of slice (the smaller the better) and efficiency of the slicing algorithm.


Although the ultimate goal of statement minimal slicing is known not to be computable [92], much work [47, 35, 87, 20, 26] has been done to produce more precise dependence information and more accurate slices than those produced by Weiser's slicing algorithm [92].

1.3 Dataflow analysis

Weiser's algorithm works on control flow graphs [53] rather than programs. The control flow graph of a program is a structure where each 'statement' in the program is represented by a node. See Figure 1.3 (page 25) for an example of a control flow graph. There is an arc connecting node n to node m if and only if 'execution can pass' from the statement corresponding to node n to the statement corresponding to node m. Each node of the control flow graph is annotated with two sets: the set of variables defined and the set of variables referenced by the corresponding statement. No other information about the program is recorded in the control flow graph. (Footnote 2: This definition of a control flow graph is used throughout this thesis.)

Dataflow analysis [92, 53], by definition, is the act of inferring properties about a program from its control flow graph alone. Dataflow analysis is, thus, fairly limited. We cannot, for example, tell by looking at a program's control flow graph when two expressions in the program are equal, nor can we use any form of expression simplification. All the information required to do such things has been 'abstracted away' in converting the program into a control flow graph. All the approaches [47, 35, 87, 20, 26] cited above lie outside the realm of dataflow analysis; the improvements in precision that they exhibit result from the fact that they use more information about each expression in a program than simply the referenced variables. Weiser's algorithm, on the other hand, is an example of dataflow analysis.

1.4 The Problem

Weiser [92] noticed that his algorithm is not dataflow minimal. It included what appeared to be unnecessary statements in slices. Importantly, the fact that these nodes were unnecessary could be observed using dataflow analysis alone.

In this thesis, we set out to answer the following questions:


1. Why do dataflow slicing algorithms, like Weiser's, produce slices that are not dataflow minimal?
2. Do algorithms for producing dataflow minimal slices exist?

1.5 Examples of the Dataflow Minimality Problem

An example illustrating the problem (Weiser's was more complicated) is given in Figure 1.2 (page 24).

    1   while i<0 do
        begin
    2     if c=3 then
          begin
    3       c:=4;
    4       x:=5
          end;
    5     i:=i+1
        end

Figure 1.2: Program p1.2

Using Weiser's definition [93], an end-slice of p1.2 with respect to the variable x is any program p obtained from p1.2 by statement deletion which terminates whenever p1.2 does, with the same final value for x.

The control flow graph of p1.2 is given in Figure 1.3 (page 25).
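The definition of an end-slice can be exercised on a concrete instance. The sketch below, which is an illustration and not taken from the thesis, runs p1.2 and the candidate obtained from it by deleting the statement labelled 3, and compares the final value of x over a grid of initial states. The function name run_p12 and the flag delete_node_3 are hypothetical. A finite check of this one program says nothing about other programs with the same control flow graph.

```python
# Illustrative check of the end-slice property for one candidate slice of p1.2.

def run_p12(i, c, x, delete_node_3=False):
    """Run p1.2, optionally with statement 3 (c:=4) deleted; return final x."""
    while i < 0:                  # node 1
        if c == 3:                # node 2
            if not delete_node_3:
                c = 4             # node 3
            x = 5                 # node 4
        i = i + 1                 # node 5
    return x

def agrees_on(grid):
    """True if the candidate gives the same final x as p1.2 on every state."""
    return all(run_p12(i, c, x) == run_p12(i, c, x, delete_node_3=True)
               for (i, c, x) in grid)

# Both versions terminate for every integer i, since i only ever increases.
grid = [(i, c, x) for i in range(-4, 3) for c in range(0, 6) for x in range(0, 7)]
print(agrees_on(grid))
```

Because the loop counter i is untouched by the deletion, the candidate terminates exactly when p1.2 does, which is the other half of the definition.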


[Figure: control flow graph with nodes ENTRY, 1, 2, 3, 4, 5 and EXIT, annotated as follows.]

    node 1:  def={}   ref={i}
    node 2:  def={}   ref={c}
    node 3:  def={c}  ref={}
    node 4:  def={x}  ref={}
    node 5:  def={i}  ref={i}

Figure 1.3: The Control Flow Graph of p1.2

Using Weiser's algorithm, slicing on x at the end of the program gives the whole control flow graph which, by definition, is a legal slice. It turns out that the smaller program whose control flow graph is given in Figure 1.4 (page 26) is also a slice with respect to x at the end of the program, i.e. every program whose control flow graph is the one in Figure 1.3 (page 25) will 'behave the same' with respect to the final value of x as the corresponding program whose control flow graph is the one in Figure 1.4 (page 26).

Consider the control flow graph in Figure 1.3 (page 25). The constant assignment at node 3 is executed if and only if the constant assignment at node 4 is executed. Having been assigned a constant value, the value of x cannot be further changed by the body of the loop. The initial value of c is important, but not the later assignment to it. There is no execution path from the entry node of the control flow graph to its exit node where the constant assignment at node 3 has an effect on the final value of x. Node 3, therefore, cannot affect the final value of x and thus need not be included in the slice.

[Figure: control flow graph with nodes ENTRY, 1, 2, 4, 5 and EXIT, annotated as follows.]

    node 1:  def={}   ref={i}
    node 2:  def={}   ref={c}
    node 4:  def={x}  ref={}
    node 5:  def={i}  ref={i}

Figure 1.4: Dataflow Minimal Slice of p1.2

Importantly, this argument depends not on the original program being sliced, but only on its control flow graph. We have not used any information about the expressions in the program apart from the set of variables they reference. This smaller slice is, in fact, dataflow minimal. We cannot remove any more nodes without affecting the final value of x. The comparison between the slice produced by Weiser's algorithm and the dataflow minimal one is given in Figure 1.5 (page 27).

Another interesting comparison is given in Figure 1.6 (page 28). In this example, again end-slicing on x, it turns out that the assignment to y can have no effect on the final value of x. The informal reasoning for this is that once we have done the assignment to y, the assignment to x can never be done again. The assignment to y can, therefore, have no effect
on the final value of x. Weiser's algorithm includes the assignment to y.

In Figure 1.7 (page 28), using Weiser's algorithm, slicing on x at the end of program p1.7 gives the whole program whereas slicing using Weiser's algorithm on x at the end of program p1.8 in Figure 1.8 (page 28) gives the empty program. Clearly neither of these programs can have an effect on the final value of x. They either fail to terminate or leave x unchanged so in both cases, the dataflow minimal slice is the 'empty program'.

Program p1.2:

    1   while i<0 do
        begin
    2     if c=3 then
          begin
    3       c:=4;
    4       x:=5
          end;
    5     i:=i+1
        end

Dataflow Minimal Slice:

    while i<0 do
    begin
      if c=3 then
      begin
        x:=5
      end;
      i:=i+1
    end

Weiser's Algorithm:

    while i<0 do
    begin
      if c=3 then
      begin
        c:=4;
        x:=5
      end;
      i:=i+1
    end

Figure 1.5: Illustration of the Dataflow Minimality Problem

Program p1.6:

    1   while x<z do
        begin
    2     if x=3
    3     then x:=x+y
    4     else y:=y+1
        end

Dataflow Minimal Slice:

    while x<z do
    begin
      if x=3
      then x:=x+y
    end

Weiser's Algorithm:

    while x<z do
    begin
      if x=3
      then x:=x+y
      else y:=y+1
    end

Figure 1.6: End-slicing on x

Program p1.7:

    1   while y>0
    2   do x:=x+1

Dataflow Minimal Slice: (the empty program)

Weiser's Algorithm:

    while y>0
    do x:=x+1

Figure 1.7: End-slicing on x

Program p1.8:

    1   while y>0
    2   do y:=y+1

Dataflow Minimal Slice: (the empty program)

Weiser's Algorithm: (the empty program)

Figure 1.8: End-slicing on x

1.6 Organisation of this Thesis

The rest of the thesis is organised as follows:

• Chapter 2, Slicing: Semantics and Algorithms, briefly surveys the main contributions to program slicing, focusing mainly on attempts to define the semantics preserved by slicing and related issues.

• Chapter 3, Dataflow Dependencies, introduces the dataflow minimality problem and gives reasons for the lack of dataflow minimality of algorithms that use traditional
  data and control dependence.

  Four dependence relations, all defined in terms of the semantics of programs, are introduced.

  1. VD and TVD, both binary relations between the variables of a program.
  2. LD and TLD, both binary relations between the variables and labels of a program.

  Dataflow equivalence is formally defined and schemas [44, 78] are used for representing classes of programs with the same control flow graph.

  For each of the above four dependence relations on programs, corresponding dataflow dependencies are defined:

  1. DVD and DTVD, both binary relations on the set of variables of a schema.
  2. DLD and DTLD, both binary relations between the variables and labels of a schema.

  These dataflow dependencies (two of which are a form of slicing) are defined in such a way that any algorithm for computing them must be dataflow minimal.

• In Chapter 4, The Semantics of Loop-free Schemas, the Symbolic Execution Tree, a structure used for performing dataflow analysis, is used. Symbolic execution trees are finite binary trees whose intermediate nodes are symbolic predicates and whose leaf nodes are symbolic states which map variable names to symbolic values. The semantics, S, of loop-free schemas is defined as a mapping from loop-free schemas to symbolic execution trees. The chapter ends with an implementation of S in the functional language Hope [6]. The input to this implementation is a representation of a schema s and the output is a representation of the symbolic execution tree, S[[s]]. This is the first stage in an algorithm for computing the dataflow dependencies introduced in Chapter 3.

• In Chapter 5, The Soundness and Completeness of S, it is shown how for each loop-free schema, s, the symbolic execution tree, S[[s]], characterises the set of all possible behaviours of all programs in [s]. This characterisation provided by S is both sound and complete.

  - S is complete in the following sense:
    Given a loop-free schema s, and a program p ∈ [s], and a state, σ, there exists
    exactly one path π of the symbolic execution tree, S[[s]], that corresponds to the execution of p in state σ.

  - S is sound in the following sense:
    For all paths π of the symbolic execution tree, S[[s]], there exists a program p ∈ [s], and a state, σ, such that π corresponds to the execution of p in state σ.

• In Chapter 6, Data and Control Dependence in Symbolic Execution Trees, algorithms for computing DTLD, DTVD, DLD and DVD of loop-free schemas are given. For every loop-free schema s, these algorithms are defined in terms of its symbolic execution tree, S[[s]].

  The fact that S[[s]] properly characterises [s] enables us to prove that the DTLD and DTVD algorithms for loop-free schemas are correct provided that the expression syntax of the underlying programming language is sufficiently rich.

  The algorithms for computing DLD and DVD are not proved correct.

  In order to compute each of the four dataflow dependencies of a loop-free schema s, two different versions of data dependence and four different versions of control dependence are defined. These forms of data and control dependence all operate on symbolic execution trees. Each of DTLD, DTVD, DLD and DVD is computed by applying the appropriate version of data and control dependence to S[[s]].

  We show that DLD and DTLD can be thought of as special cases of DVD and DTVD respectively. DTLD can be computed by treating the labels as variables, computing the DTVD, and then restricting the final result to just the set of labels.

  This means that, in effect, the 'label' and 'variable' versions of each dependence above can be combined into a single dependence. This simplification implies that, in fact, just one form of data dependence and two forms of control dependence are all that is required in order to compute the four dataflow dependencies introduced in Chapter 3, when applied to loop-free schemas.

• In Chapter 7, Computing Dataflow Dependencies of Schemas with Loops, to compute the dataflow dependencies for schemas with loops, initially each loop is replaced by its zeroth unfolding and the dependence of this resulting loop-free schema
  is computed. These loop-free schemas are then further unfolded and the dataflow dependencies are re-calculated. It is formally proved that this process will eventually terminate, resulting in a loop-free schema whose dataflow dependence is the same as that of the program with loops with which we started.

  Provided that we can recognise when further unfoldings will produce no further change in dependency, we have achieved dataflow minimal algorithms for computing the various dataflow dependencies introduced in this thesis.

• The thesis ends (Chapter 8, Conclusions) with a summary of our findings together with indications for future research (Chapter 9, Future Work). Due to their similarity to slicing, the applicability of the algorithms is only briefly mentioned. More of interest is the novelty of the approach and the potential improvements in the accuracy of dataflow analysis of programs that may result. With different forms of control dependence, also defined in terms of symbolic execution trees, it is probable that this approach will yield more accurate solutions to other related problems in the dataflow analysis of programs.

• Full listings of programs for DTVD and DTLD, written in the functional programming language Hope [6], and example executions are included in the appendices. These implementations and examples can be tested using a web browser (http://p69-122.unl.ac.uk/~seb).


Chapter 2

Slicing: Semantics and Algorithms

For a brief introduction to slicing and its applications, please refer to Chapter 1, Section 1.1, page 21.

2.1 Different Forms of Slice

There are many different definitions of program slices in the literature. Slices can be backward or forward [59, 91], static or dynamic [5, 43, 68, 71], intra-procedural or inter-procedural [60, 59]. Slicing has been applied to programs with arbitrary control flow (goto statements) [7, 23, 2] and even concurrent programming languages like Ada [22, 97]. In a recent form of slicing called amorphous slicing [47, 14], slices are not necessarily produced by deleting statements and may not necessarily even be made from components of the original program being sliced. Amorphous slicing is so general that it is, in effect, a form of partial evaluation [37, 33, 17]. As will be seen, many forms of slicing are special cases of dataflow analysis [53, 92], i.e. working at the level of abstraction of defined and referenced variables, whereas others [87, 47] take more detailed information about expressions into account. Some slices, for example closure slices [91], need not even be executable programs but just collections of labels corresponding to statements that affect the slicing criterion in some way.

2.1.1 Backward vs. Forward

A backward slice is the 'conventional one' [92] where it is asked:

    Which statements affect the slicing criterion?


Forward slicing [60] is the converse of this. (Footnote 1: In the main body of this thesis, only backward slicing is considered.) The question asked in forward slicing is:

    Given a particular statement in a program, which other statements are affected by this particular statement's execution?

2.1.2 Static vs. Dynamic

A static slice is the conventional one where the slice is required to agree with the program being sliced in all initial states. Dynamic slicing [5, 43, 68, 67, 70, 72] involves executing the program in a particular initial state and using trace information to construct a slice relevant to this particular initial state.

There are variants of slicing in between the two extremes of static and dynamic, where some but not all properties of the initial state are known. (Footnote 2: In the main body of this thesis, only static slicing is considered, although Section 2.6 in this chapter is an account of the dynamic slicing algorithms of Agrawal and Horgan [5].) These are known as conditioned slices [19] or constrained slices [35].

2.1.3 Intra-procedural vs. Inter-procedural

Intra-procedural slicing means slicing programs which do not have procedures whereas inter-procedural [93, 59, 60, 68] slicing tackles the more complex problem of slicing programs where procedure definitions and calls are allowed. (Footnote 3: In the main body of this thesis, only intra-procedural slicing is considered.)

2.1.4 Slicing Structured vs. Unstructured Programs

For many of these applications, particularly where maintenance problems are the primary motivation for slicing, the slicing algorithm must be capable of constructing slices from 'spaghetti' programs, written before the benefits of structured programming were fully appreciated. (Footnote 4: In the main body of this thesis, only slicing of structured programs is considered.) The archetype of this unstructured programming style is the goto statement; all forms of 'jump' statement, such as break and continue, can be regarded as special cases of the goto statement. Such programs are said to exhibit 'arbitrary control flow' and are considered to be 'unstructured'. The traditional program dependence graph approach [82] incorrectly fails
to include any goto statements in a slice. Various authors have suggested solutions to this problem [23, 2, 7, 48].

2.1.5 Dataflow vs. Non-Dataflow

In the non-dataflow analysis approach [73, 87], infeasible paths are detected by using 'cheeky rules' (defined in Section 2.2.3).

The fact that programs like:

    c:=1;
    if c>0
    then x:=25
    else x:=z

and

    c:=1;
    x:=25

are semantically equivalent can, in certain circumstances, be established automatically (although the general problem is clearly not solvable). As in the case of amorphous slicing [47], automatable or semi-automatable techniques involving the symbolic simplification of expressions will in turn lead to simpler semantically equivalent versions of the original program and hence to thinner slices than those produced by dataflow analysis alone. (Footnote 5: Such techniques lie outside the scope of this thesis.) Another non-dataflow approach is parametric program slicing [35, 36], where slices are constructed using a term-rewriting system, which can use arbitrary rewrites which preserve a property of the syntax using origin tracking [36]. The re-write rules can be cheeky (see Section 2.2.3) because they can involve information about expressions other than their referenced variable sets.

Many examples of slicing are combinations of the categories above. For example, the work of Kamkar [66] produces backward, dynamic, inter-procedural slices. Weiser's original work described backward, static, intra-procedural slicing although he also gave an algorithm for backward, static, inter-procedural slicing.

2.2 Weiser's Work

Weiser's main achievement was to produce an algorithm for producing backward static intra-procedural slices. His algorithm is an example of dataflow analysis, a central theme of this thesis. Its input is the control flow graph of the program being sliced.


2.2.1 Control Flow Graphs

A control flow graph is an abstract representation of a program. For example, consider the program in Figure 2.1 (page 36).

    1   while i<2 do
        begin
    2     if c=2 then
          begin
    3       c:=y;
    4       x:=25
          end;
    5     i:=i+1
        end

Figure 2.1: Program p2.1

The control flow graph of a program has nodes which have been labelled with the defined and referenced variables at each node. The control flow graph, G2.2, of program p2.1 in Figure 2.1 is given in Figure 2.2 (page 37).

The reason that a control flow graph is more abstract than a program is that whereas programs have expressions on the right hand side of assignments and as guards of conditionals and loops, control flow graphs have sets of variable names. This set is the set of variables referenced by the corresponding expression, i.e. the set of variable names explicitly mentioned in the corresponding expression. The defined variables are the ones occurring on the left hand side of assignment statements.

2.2.2 Dataflow Analysis

Weiser defined Dataflow Analysis [53] to be the analysis of a program's control flow graph. All we are allowed to take advantage of in such analysis are the sets of defined and referenced variables at each node of the control flow graph.
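The abstraction from statements to defined and referenced sets can be sketched mechanically. The following Python fragment is an illustration only: it writes the expressions of p2.1 in Python syntax (so c=2 becomes c == 2) and uses the standard ast module to collect the variable names mentioned in each expression. The names referenced, node_sets, P21 and CFG_SETS are assumptions made for this sketch and do not come from the thesis.

```python
import ast

def referenced(expr):
    """Variable names explicitly mentioned in a Python-syntax expression."""
    tree = ast.parse(expr, mode="eval")
    return {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}

def node_sets(kind, target, expr):
    """Return the pair (def, ref) for a predicate node or an assignment node."""
    defined = {target} if kind == "assign" else set()
    return defined, referenced(expr)

# Program p2.1: node label -> (kind, assigned variable or None, expression).
P21 = {
    1: ("predicate", None, "i < 2"),
    2: ("predicate", None, "c == 2"),
    3: ("assign", "c", "y"),
    4: ("assign", "x", "25"),
    5: ("assign", "i", "i + 1"),
}

# Everything a dataflow analyser sees about each node: two sets of names.
CFG_SETS = {label: node_sets(*node) for label, node in P21.items()}

for label in sorted(CFG_SETS):
    d, r = CFG_SETS[label]
    print(label, "def =", sorted(d), "ref =", sorted(r))
```

Once CFG_SETS has been built, the expressions themselves are no longer consulted, which is the sense in which the control flow graph is more abstract than the program.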


[Figure: control flow graph with nodes ENTRY, 1, 2, 3, 4, 5 and EXIT, annotated as follows.]

    node 1:  def={}   ref={i}
    node 2:  def={}   ref={c}
    node 3:  def={c}  ref={y}
    node 4:  def={x}  ref={}
    node 5:  def={i}  ref={i}

Figure 2.2: G2.2: The control flow graph of p2.1

2.2.3 Inherent Inaccuracies in Dataflow Analysis

Since they reference the same set of variables, dataflow analysis cannot distinguish between the expressions 2*x and x-x. All programs differing only in this way would be treated identically. The fact that x-x can be replaced by 0, an expression that no longer references x, is an example of what we call a cheeky rule since finding the sets of variables upon which expressions depend (rather than simply mention) is not computable. Applying cheeky rules is not part of dataflow analysis.

Another way of thinking of it is that dataflow analysis is performed not on programs but on control flow graphs. All the 'dataflow analyser' is presented with, therefore, are sets of variables in place of expressions. The ability to apply cheeky rules is thus removed.
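The difference between mentioning a variable and depending on it can be seen in a few lines. The sketch below is an illustration and not part of the thesis; the helper names referenced and varies_on are hypothetical. It computes the referenced set syntactically and, separately, samples the value of each expression. Sampling can show that an expression does depend on x, but a constant result on a sample is only suggestive, in line with the remark that dependence is not computable in general.

```python
import ast

def referenced(expr):
    """Variable names explicitly mentioned in a Python-syntax expression."""
    tree = ast.parse(expr, mode="eval")
    return {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}

def varies_on(expr, var, samples):
    """True if expr takes more than one value as var ranges over samples."""
    code = compile(expr, "<expr>", "eval")
    values = {eval(code, {"__builtins__": {}}, {var: v}) for v in samples}
    return len(values) > 1

samples = range(-5, 6)

# Dataflow analysis sees only the referenced sets, which coincide.
print(referenced("2*x") == referenced("x-x"))

# The values tell the two expressions apart: 2*x varies with x, x-x does not.
print(varies_on("2*x", "x", samples), varies_on("x-x", "x", samples))
```

Replacing x-x by 0 would be an application of a cheeky rule: it uses knowledge of subtraction that the referenced set does not carry.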


2.2.4 Traditional Dependence

Weiser's algorithm and most subsequent work on program dependence uses two relations [53] between the nodes of a program's control flow graph. These are

1. Data Dependence (D)
2. Control Dependence (C)

2.2.5 Data Dependence

Node n2 is data dependent on node n1 if there is a variable v referenced in n2 which is defined in n1 and there is a path from n1 to n2 with no intervening assignments to v. We write n1 D n2 to mean n2 is data dependent on n1.

Examples of Data Dependence

Consider:

n1:  x:=y;
     ...
n2:  z:=x

If there are no intervening assignments to x then n2 is data dependent on n1 since the value of x at n2 is 'affected by' the value of y at n1. Similarly, consider:

     while b do
     begin
A      ...
n2:    z:=x;
       ...
n1:    x:=y;
B      ...
     end

Again, if there are no assignments to x in the portions of code labelled A and B then n2 is data dependent on n1 since the value of x at n2 is 'affected by' the value of y at n1. This is an example of a loop carried data-dependence.
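The definition can be turned into a small search over the control flow graph. The following is a minimal sketch, assuming the graph of $p_{2.1}$ is encoded by the hypothetical tables `SUCC`, `DEF` and `REF` (edges read off Figure 2.1, labels from Figure 2.2); it is not the algorithm used in the thesis.

```python
# Assumed encoding of the control flow graph of p2.1 (EXIT omitted,
# since it neither defines nor references any variable).
SUCC = {1: [2], 2: [3, 5], 3: [4], 4: [5], 5: [1]}
DEF = {1: set(), 2: set(), 3: {"c"}, 4: {"x"}, 5: {"i"}}
REF = {1: {"i"}, 2: {"c"}, 3: {"y"}, 4: set(), 5: {"i"}}


def data_dependence(succ, defs, refs):
    """Pairs (n1, n2) such that n2 is data dependent on n1."""
    deps = set()
    for n1 in defs:
        for v in defs[n1]:
            # Walk forward from n1 along paths on which v is not reassigned.
            seen, stack = set(), list(succ.get(n1, []))
            while stack:
                n2 = stack.pop()
                if n2 in seen:
                    continue
                seen.add(n2)
                if v in refs[n2]:
                    deps.add((n1, n2))      # n2 reads the value written at n1
                if v not in defs[n2]:       # an assignment to v ends the path
                    stack.extend(succ.get(n2, []))
    return deps


D = data_dependence(SUCC, DEF, REF)
```

On this encoding the sketch reports that node 2 depends on node 3 (through c) and node 1 on node 5 (through i), both loop carried. It also reports node 5 as dependent on itself, because the path 5, 1, 2, 5 contains no other assignment to i.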


2.2.6 Control Dependence

Control Dependence is harder to define. Informally, the execution of a predicate node 'controls' the execution of other nodes in the control flow graph by determining whether or not control will definitely pass to these nodes. For each predicate node, b, the set of nodes that depend on the outcome of b in this way are termed the controlled nodes of b. More formally, control dependence is defined in terms of post-dominance:

Definition 2.2.1 (Post-dominance)
A node i is post-dominated by a node j if all paths from i to EXIT pass through j.

Definition 2.2.2
A node j is control dependent on node i if and only if
1. There exists a path $\pi$ from i to j such that for all u in $\pi$ with $u \neq i$ and $u \neq j$, u is post-dominated by j
and
2. i is not post-dominated by j.

A predicate node b is a node like nodes 1 and 2 in Figure 2.2. All predicate nodes have two arcs leading from them corresponding to true and false. Predicate node b controls n if all paths from b to the exit starting with one arc contain n and there exists a path from b to the exit starting with the other arc that does not contain n. In a block structured language, like the one being considered in this thesis, the set of nodes controlled by a predicate b can be defined syntactically.
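Definitions 2.2.1 and 2.2.2 can be checked on the graph of $p_{2.1}$ with a short script. The sketch below makes two assumptions that are not from the thesis: the edge table `SUCC` is read off Figure 2.1, and condition 1 is tested through the successors of i (j is a successor of i or post-dominates one), which should agree with the path formulation when EXIT is reachable from every node.

```python
# Assumed encoding of the control flow graph of p2.1.
SUCC = {"ENTRY": [1], 1: [2, "EXIT"], 2: [3, 5], 3: [4], 4: [5], 5: [1], "EXIT": []}


def post_dominators(succ, exit_node="EXIT"):
    """pdom[n] = nodes lying on every path from n to EXIT (n itself included)."""
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependence(succ, exit_node="EXIT"):
    """Pairs (i, j) such that j is control dependent on i."""
    pdom = post_dominators(succ, exit_node)
    deps = set()
    for i, targets in succ.items():
        for s in targets:
            for j in pdom[s]:            # j is s, or j post-dominates s
                if j not in pdom[i]:     # i is not post-dominated by j
                    deps.add((i, j))
    return deps


CD = control_dependence(SUCC)
```

For this graph the sketch finds that node 1 controls nodes 2 and 5 and that node 2 controls nodes 3 and 4.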
If we associate each assignment statement with its corresponding node in the control flow graph and each loop and conditional with the node in the control flow graph corresponding to its predicate, then the set of nodes controlled by b is simply the set of nodes corresponding, in the way just described, to the statements that appear at depth one in the abstract syntax tree of the statement whose predicate is b.

For unstructured languages, the calculation of controlled nodes can be achieved using the algorithm of Ferrante, Ottenstein and Warren [34].

2.2.7 Weiser's Algorithm

In Weiser's original thesis [92] and Tip's later exposition [89] it is shown how slices can be computed by solving a set of data and control flow equations derived directly from the control


flow graph of the program being sliced. These equations are solved using an iterative process which entails computing sets of 'relevant variables' for each node in the control flow graph.

Directly Relevant Variables

Suppose a slice is to be constructed for the slicing criterion $C = (V, n)$. First, the directly relevant variables of node i, $R^0_C(i)$, are defined inductively as follows:-

1. The set of directly relevant variables at the slice node, n, is simply the slice set, V.
2. The set of directly relevant variables at every other node i is defined in terms of the sets of directly relevant variables of all nodes j to which there is an edge from i (written $i \to_{CFG} j$) in the control flow graph. $R^0_C(i)$ contains all the variables v such that either
   (a) $v \in R^0_C(j)$ and $v \notin def(i)$, or
   (b) $v \in ref(i)$ and $def(i) \cap R^0_C(j) \neq \emptyset$.

The directly relevant variables of a node are the set of variables at that node upon which the slicing criterion is transitively data dependent.

Directly Relevant Statements

In terms of the directly relevant variables, a set of directly relevant statements $S^0_C$ is defined:-

$$S^0_C = \{\, i \mid \exists j \text{ such that } i \to_{CFG} j \text{ and } def(i) \cap R^0_C(j) \neq \emptyset \,\}$$

This completes the first iteration of Weiser's algorithm.

Indirectly Relevant Variables

The subsequent iterations of Weiser's algorithm calculate the indirectly relevant variables, $R^K_C$ where $K \geq 0$. In calculating the indirectly relevant variables, control dependence is taken into account.

$$R^{K+1}_C(i) = R^K_C(i) \cup \bigcup_{b \in B^K_C} R^0_{(ref(b),\,b)}(i)$$


where

$$B^K_C = \{\, b \mid \exists i \in S^K_C \text{ such that } b \text{ controls } i \,\}$$

$B^K_C$ is the set of all predicate nodes that control a statement in $S^K_C$.

Indirectly Relevant Statements

Adding the predicate nodes to $S^K_C$ includes further indirectly relevant statements in the slice:-

$$S^{K+1}_C = B^K_C \cup \{\, i \mid \exists j \text{ such that } i \to_{CFG} j \text{ and } def(i) \cap R^{K+1}_C(j) \neq \emptyset \,\}$$

As Tip [89] states, this process will eventually terminate since $S^K_C$ and $R^K_C$ are non-decreasing subsets of the program's (finitely many) statements and variables, respectively.

Weiser proves, in his thesis (theorem 10), that his algorithm produces slices according to his semantic definition of a slice.

2.3 A Parallel version of Weiser's Algorithm

As has just been shown, Weiser's algorithm is fairly complicated to express using conventional techniques. In this section, we show that Weiser's algorithm [92] can be expressed more elegantly using a parallel algorithm [28].

Parallel algorithms have the potential for being 'faster' than their sequential counterparts, since, as their name suggests, the work can be shared by many computing agents all executing at the same time.

The reason why parallel algorithms are of interest here, however, is not a question of improved efficiency, but one of improved 'expressibility'.

Often, problems can be expressed more generally and more clearly concurrently rather than sequentially because the problem itself may have inherent concurrent aspects. A sequential algorithm is just a 'special case' of a parallel one. Sequential notations often force the programmer to impose an unnecessarily strict order of computation on his algorithm.


The simplest example of this is demonstrated in 'sequential' versus 'parallel' assignment [32]. Suppose a programmer wants to write a program that leaves variable x having the value 7 and y the value 9. He might choose x:=7;y:=9 or y:=9;x:=7. Clearly the order of execution of these two assignment statements is immaterial. If we had a notation for expressing 'parallel assignment' then the programmer could just write (x,y):=(7,9). The fact that having such parallel constructs leads to simpler algorithms can be seen in a simple example where the programmer wishes to swap the values of variables x and y. As every programmer knows, to do this in a sequential language, a temporary variable is usually used. In the case of parallel assignment we would simply write (x,y):=(y,x).

Programs written in functional languages are another example. These languages have no concept of sequentialisation of statements, and thus are inherently concurrent.

Much work [64, 65, 55] has been done in expressing parallel algorithms using networks of communicating processes all acting concurrently. In general the behaviour of each process in the network is very simple. The algorithmic power is obtained by the interaction and co-operation of these simple processes. In [28], a similar approach to [1] is adopted, where each process can be defined by a simple function on streams of messages and the topology of the process network is defined by the way that the individual functions are composed.

Weiser's algorithm is expressed as a network of processes each of whose behaviours is very simple. Interestingly, the topology of the network is exactly the same as the topology of the control flow graph of the program being sliced. Each process in the network, therefore, corresponds to a node of the control flow graph and its behaviour can be defined very simply in terms of properties of this node.

Communication between processes is in the reverse direction of the arrows in the control flow graph. So output channels correspond to arcs entering a node and input channels correspond to arcs leaving a node.

2.3.1 Process Behaviour

Each process repeatedly sends and receives messages that are sets consisting of variable names and node numbers. The behaviour of each process, i, depends precisely on the following information, derived directly from the control flow graph of the program being sliced:-


i        The number of the corresponding node of the control flow graph
ref(i)   The set of variables referenced by i
def(i)   The set of variables defined by i
C(i)     The set of nodes controlled by i

Processes with more than one input correspond to predicate nodes. This work is concerned with side-effect free languages, so all such processes will have $def(i) = \emptyset$. Conversely, processes with only one input do not correspond to predicate nodes, and therefore, by definition, they control no other nodes and so have $C(i) = \emptyset$. The behaviour of each process, i, defined in functional notation is:-

$$F_i : set(name) \to set(name)$$

$$F_i(S) = \begin{cases} (S - def(i)) \cup ref(i) \cup \{i\} & \text{if } S \cap (def(i) \cup C(i)) \neq \emptyset \\ S & \text{otherwise} \end{cases}$$

This is interpreted as follows:-

If the input, S, to process i has any elements in common with the defined variables of i or with the controlled nodes of i then the process, i, outputs the set consisting of:-

1. all its input variables (elements of S) that it does not define,
2. all variables that it references,
3. its node number, i.

On the other hand, if S has no elements in common with the defined variables or controlled nodes of i then the process i merely outputs S.

The process i then repeats this action, waiting for the next input message.

2.3.2 Starting Network Communication

In order to construct a slice for the criterion $(V, n)$, network communication is initiated by outputting [6] the message V from process n. Messages will then be passed around the network until it eventually 'stabilises', when no new messages arise from any node.

Footnote 6: According to Weiser's original definition of a static slice [92], a slice constructed for the slicing criterion $(V, n)$ need not contain the node n. Korel and Laski [71] argue that the omission of the slice node is unfortunate; a programmer finds the absence of the slice node confusing when attempting to relate the slice to the original. In order to include the slice node in a slice constructed by the parallel slicing algorithm, process communication should be initiated by outputting, not just the set of variables V, but also the node identifier, n.

2.3.3 Constructing the Slice

Once stabilised, the slice is computed by observing what has been output by each node: node i should be included in the slice if and only if process i has output its node identifier i. The slice of a program computed by this algorithm can be found by including the set of nodes whose identifiers are input to the entry node of the control flow graph, because the entry node is reachable from every node in the reverse control flow graph and thus messages output by all nodes will eventually reach the entry process.

Alternatively, we imagine that all the nodes have lights on them. A node 'lights up' if it outputs its own identifier. The slice is simply the set of 'lit up' nodes.

Definition 2.3.1 above should be thought of as a 'specification' of process behaviour rather than an 'implementation'. The important aspect of the definition is that each process should be thought of as a function from the union of all its inputs to the union of all its outputs.

2.3.4 Example Execution of the Parallel Algorithm

As is the convention, arcs entering a node, i, represent inputs to process i, and arcs leaving i represent outputs from process i. When a process outputs a message, it shall mean that the message is output on all output channels. Let the slicing criterion be ({c},7). The program to be sliced is shown in figure 2.3.

1  a:=0;
2  while s<t do
   begin
3    if t=4 then
4      c:=t;
5    s:=2;
6    c:=t+7;
7    t:=a+4
   end

Figure 2.3: The program to be sliced

The reverse control flow graph of the program to be sliced is shown in figure 2.4:-


[Figure 2.4: The reverse control flow graph obtained from the initial program. Nodes ENTRY, 1 to 7 and EXIT, labelled: 1 def={a} ref={}; 2 def={} ref={s,t}; 3 def={} ref={t}; 4 def={c} ref={t}; 5 def={s} ref={}; 6 def={c} ref={t}; 7 def={t} ref={a}.]

The process network obtained from the reverse control flow graph is shown in figure 2.5. A slice is to be constructed for the criterion ({c},7), so process communication is initiated by outputting {c} from node 7.


[Figure 2.5: Initial state of the process network. Each process is drawn with its DEF, REF and C sets: 1 DEF={a}; 2 REF={s,t}, C={3,4,5,6,7}; 3 REF={t}, C={4}; 4 DEF={c}, REF={t}; 5 DEF={s}; 6 DEF={c}, REF={t}; 7 DEF={t}, REF={a}. The only message present is {c}, output by process 7.]

To show the progression of the state of the system, the arcs are labelled in the reverse control flow graph with the messages communicated by the relevant processes during execution. New messages communicated at each stage are labelled in bold typeface.

Processes are drawn as a box showing DEF and REF above C, where C is the set of controlled nodes, DEF is the set of defined variables and REF is the set of referenced variables of each process.
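The behaviour function $F_i$ of Section 2.3.1 is small enough to write down directly. The following is a minimal sketch, assuming messages are Python sets that mix variable names (strings) and node numbers (integers); the tables `DEF`, `REF` and `CTRL` are hypothetical names holding the process labels of Figure 2.5.

```python
# Process data for the program of Figure 2.3, transcribed from Figure 2.5.
DEF = {1: {"a"}, 2: set(), 3: set(), 4: {"c"}, 5: {"s"}, 6: {"c"}, 7: {"t"}}
REF = {1: set(), 2: {"s", "t"}, 3: {"t"}, 4: {"t"}, 5: set(), 6: {"t"}, 7: {"a"}}
CTRL = {1: set(), 2: {3, 4, 5, 6, 7}, 3: {4}, 4: set(), 5: set(), 6: set(), 7: set()}


def F(i, S):
    """Output of process i for a single input message S."""
    if S & (DEF[i] | CTRL[i]):
        # S mentions a variable that i defines or a node that i controls.
        return (S - DEF[i]) | REF[i] | {i}
    return S  # otherwise the message is passed on unchanged


m6 = F(6, {"c"})   # process 6 defines c
m5 = F(5, m6)      # process 5 neither defines t nor controls node 6
m2 = F(2, m6)      # process 2 controls node 6
m7 = F(7, m2)      # process 7 defines t
```

With the criterion ({c},7), `m6` is {t,6}, `m5` is the same message passed through, `m2` is {s,t,2,6} and `m7` is {s,a,2,6,7}.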


After receiving {c} through its input channel, process six outputs {t,6}. Processes five, four and three all eventually receive {t,6}, which they simply output because {t,6} is disjoint from the defined variables and controlled nodes of these processes.

The resulting state is shown in figure 2.6.

[Figure 2.6: The state just before process two outputs its first message. Process 7 has output {c}; processes 6, 5, 4 and 3 have each output {t,6}.]

When {t,6} is input to process two, it causes process two to output {s,t,2,6} to processes one and seven. This is an instance of a process responding to an input containing the identifier of a node that it controls. Process two will therefore output a message including its own node identifier, representing the fact that node two will also be included in the final slice.

The resulting state is shown in figure 2.7.


[Figure 2.7: The state just after process two has output its first message. In addition to the messages of figure 2.6, process 2 has output {s,t,2,6}.]

The message {s,t,2,6} passes through process one to the ENTRY node. On receiving {s,t,2,6}, process seven outputs the message {s,a,2,6,7} because it defines t. This output message passes unaffected through process six.

When process five receives {s,a,2,6,7} it outputs {a,2,5,6,7}.

The resulting state is shown in figure 2.8.


[Figure 2.8: The state after one more pass around the loop.]

Continuing process communication passes the extra messages in the network to all reachable nodes, but causes no new messages to be introduced into the system. Finally the network terminates in the state shown in figure 2.9.


[Figure 2.9: The final state of the process network. The message reaching the ENTRY node is {s,t,2,6,1,5,7}.]

From the final state of the network the slice of the original program is constructed by including those statements and predicates whose node identifiers, {1,2,5,6,7}, have reached the ENTRY node.

The original program:

1  a:=0;
2  while s<t do
   begin
3    if t=4 then
4      c:=t;
5    s:=2;
6    c:=t+7;
7    t:=a+4
   end

Its slice:

1  a:=0;
2  while s<t do
   begin
5    s:=2;
6    c:=t+7;
7    t:=a+4
   end

Figure 2.10: The original program and its slice
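The whole run can be replayed by iterating the process functions until nothing changes. The sketch below is one possible sequential simulation, not the implementation of [28]: it treats each process as a function from the union of its inputs to the union of its outputs, and the names `SUCC`, `DEF`, `REF`, `CTRL` and `parallel_slice` are hypothetical. The edge table is an assumption read off the program of Figure 2.3, with arcs to EXIT left out because EXIT sends no messages.

```python
# Control flow successors of the program of Figure 2.3 (assumed encoding).
SUCC = {1: [2], 2: [3], 3: [4, 5], 4: [5], 5: [6], 6: [7], 7: [2]}
DEF = {1: {"a"}, 2: set(), 3: set(), 4: {"c"}, 5: {"s"}, 6: {"c"}, 7: {"t"}}
REF = {1: set(), 2: {"s", "t"}, 3: {"t"}, 4: {"t"}, 5: set(), 6: {"t"}, 7: {"a"}}
CTRL = {1: set(), 2: {3, 4, 5, 6, 7}, 3: {4}, 4: set(), 5: set(), 6: set(), 7: set()}


def F(i, S):
    """Behaviour of process i on the input set S."""
    if S & (DEF[i] | CTRL[i]):
        return (S - DEF[i]) | REF[i] | {i}
    return S


def parallel_slice(V, n):
    """Nodes that 'light up' when communication starts with process n sending V."""
    out = {i: set() for i in SUCC}
    out[n] = set(V)
    changed = True
    while changed:                      # repeat until the network stabilises
        changed = False
        for i in SUCC:
            # Messages travel against the arrows: i listens to its successors.
            received = set().union(*(out[j] for j in SUCC[i]))
            produced = F(i, received)
            if not produced <= out[i]:
                out[i] |= produced
                changed = True
    return {i for i in out if i in out[i]}, out


lit, out = parallel_slice({"c"}, 7)
at_entry = {m for m in out[1] if isinstance(m, int)}   # identifiers reaching ENTRY
```

For the criterion ({c},7) both `lit` and `at_entry` come out as {1,2,5,6,7}, agreeing with Figure 2.10. Nodes 3 and 4 stay dark because node 6 redefines c before the criterion is reached.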


A proof of the equivalence of Weiser's and this algorithm is included in appendix C.

2.4 Weiser Slices

In this section we describe what Weiser's algorithm produces, in terms of a program's control flow graph, rather than how it produces it. Suppose we wish to construct the Weiser slice for program p with slicing criterion $(V, n)$. First we insert a node with referenced set V into the desired place in the control flow graph of p. For example, if we were slicing at the end of the program we would place this extra node after the exit node. Call this node $n'$. Let

$$F = D \cup C$$

i.e. F is the union of the data and control dependence relations. The Weiser slice is 'more or less' the set of nodes consisting of the transitive closure $F^*$ of F applied to node $n'$. To be precise, it is

$$(F^* \circ D) \cup D.$$

In other words it is all the maplets in the transitive closure that start with a data dependence.

Example

Suppose we wished to slice at the end of program $p_{2.1}$ in figure 2.1 with respect to variable x. We first add the node with referenced variables {x} at the appropriate place in the control flow graph of $p_{2.1}$ to get the augmented control flow graph in figure 2.11. In this case:

$$C = \{\, 2 \mapsto 1,\ 3 \mapsto 2,\ 4 \mapsto 2 \,\}$$

$$D = \{\, 1 \mapsto 5,\ 2 \mapsto 3,\ 6 \mapsto 4 \,\}$$
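The relational description can be evaluated mechanically from C and D as given. The sketch below is an illustration under one assumption: a maplet $a \mapsto b$ is encoded as the Python pair `(a, b)`. The helper names `compose`, `transitive_closure` and `image` are hypothetical.

```python
def compose(r2, r1):
    """Relational composition r2 after r1: apply r1 first, then r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}


def transitive_closure(r):
    """Smallest transitive relation containing r."""
    closure = set(r)
    while True:
        extra = compose(closure, closure) - closure
        if not extra:
            return closure
        closure |= extra


def image(r, n):
    """Everything that n is related to under r."""
    return {b for (a, b) in r if a == n}


# Maplets a -> b are written as pairs (a, b), copied from the text.
C = {(2, 1), (3, 2), (4, 2)}
D = {(1, 5), (2, 3), (6, 4)}

F_star = transitive_closure(D | C)
weiser = compose(F_star, D) | D     # (F* o D) u D
slice_at_6 = image(weiser, 6)
```

With these inputs `F_star` has 18 maplets and `slice_at_6` is {1,2,3,4,5}. Composing with D first is what restricts the result to chains that begin with a data dependence.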


Taking the transitive closure of $F = D \cup C$ gives

$$F^* = \{\, 1 \mapsto 5,\ 2 \mapsto 1,\ 2 \mapsto 2,\ 2 \mapsto 3,\ 2 \mapsto 5,\ 3 \mapsto 1,\ 3 \mapsto 2,\ 3 \mapsto 3,\ 3 \mapsto 5,\ 4 \mapsto 1,\ 4 \mapsto 2,\ 4 \mapsto 3,\ 4 \mapsto 5,\ 6 \mapsto 1,\ 6 \mapsto 2,\ 6 \mapsto 3,\ 6 \mapsto 4,\ 6 \mapsto 5 \,\}$$

and in this case

$$((F^* \circ D) \cup D)\,6 = \{1, 2, 3, 4, 5\}.$$

[Figure 2.11: $G_{2.11}$: The augmented control flow graph of $p_{2.1}$. Nodes ENTRY, 1 to 6 and EXIT, labelled: 1 def={} ref={i}; 2 def={} ref={c}; 3 def={c} ref={}; 4 def={x} ref={}; 5 def={i} ref={i}; 6 def={} ref={x}.]

2.4.1 Slicing using Program Dependence Graphs

Program dependence graphs were introduced by Ferrante et al [34]. The program dependence graph for a program, P, is a directed graph whose nodes are connected by several different kinds of edge. For example, the program dependence graph, $G_{2.4.1}$, of program $p_{2.1}$ is given in Figure 2.12. The nodes are essentially the same nodes that occur in the control flow graph. The arcs of a program dependence graph that are relevant for slicing are the control dependence arc and the data dependence arc. There is a control dependence arc from node n1 to node n2 if and only if n2 is control dependent on n1, and there is a data dependence arc from node n1 to node n2 if and only if n2 is data dependent on n1.

Ottenstein and Ottenstein [82] showed how slicing could be done using a program dependence graph. Most current implementations [3, 58] of slicing algorithms use the program dependence graph. The reason for this is one of efficiency. In effect, the process of computing the program dependence graph is performing the 'hard work' in computing the slices for n slicing criteria where n is the number of nodes. As a result of this 'pre-processing', any particular slice corresponding to one of these n criteria can be computed in linear time.

The slices produced are usually, but not always, exactly the same as those produced by Weiser's algorithm. The differences between slicing using Weiser's algorithm and program dependence graphs can be enumerated as follows:-


1. The slice produced by the program dependence graph approach is exactly the transitive closure of the union of the data dependence and control dependence relations. This means that the program dependence graph approach sometimes produces bigger slices than Weiser's algorithm.
2. Using the program dependence graph approach, slicing cannot be done with arbitrary slicing criteria. There is no concept of adding nodes at the slice point. A slice can only be constructed using existing nodes. So slicing at node n can only be done with the referenced variables of node n.

To calculate a PDG-slice at a node n we simply trace back along all these arcs. We include in the slice every node that we reach. Clearly this will simply produce

$$(D \cup C)^*\, n.$$

[Figure 2.12: $G_{2.4.1}$: The program dependence graph of $p_{2.1}$. Nodes: ENTRY, i<3, c=2, c:=y, x:=25, i:=i+1; arcs are of two kinds, data and control.]

2.5 The Semantics of Slicing

An essential issue in program slicing is to define what it means for two programs to behave the same with respect to a slicing criterion, i.e. what semantic relationship must exist between


a program and its slice in order that the slice is considered valid.

2.5.1 Weiser's Semantic Definition of Valid Slices

Weiser defined the semantic relationship that must exist between a program and its slice in terms of state trajectories:

State Trajectories

A state trajectory is a sequence of label, state pairs $(l_i, \sigma_i)$ where $\sigma_i$ represents the state immediately before executing the statement labelled $l_i$.

Definition 2.5.1 (Weiser Slices)
A slice s of a program p on a slicing criterion $c = (V, i)$ is any executable program with the following property. Whenever p halts on an input I with a state trajectory T then s also halts on input I with state trajectory $T'$ with

$$Proj_c(T) = Proj_c(T')$$

$Proj_c(T)$ is obtained first by deleting all elements of T whose label component is not i and then by restricting the state components to V [7].

2.5.2 End Slicing

When slicing at the end of the program [8], the trajectories will all be of length one (since the 'exit' statement is executed only once). This gives rise to a simplified form of slicing called end slicing.

Definition 2.5.2 (Weiser's Definition of an End-Slice)
$p'$ is an end slice of p with respect to a set of variables V if whenever p terminates so does $p'$ with the same final values of the variables in V.

Footnote 7: This is a slight simplification of the true picture since we are assuming that i is in the slice of p with respect to c. A more complicated definition involving 'nearest successors' is required if i is not in the slice.

Footnote 8: This thesis solely concerns end-slicing.
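The projection used in Definition 2.5.1 amounts to a filter followed by a restriction. The sketch below assumes a trajectory is a Python list of (label, state) pairs with states as dictionaries. The function name `proj` and both sample trajectories are made up for illustration and do not come from any program in this chapter.

```python
def proj(trajectory, V, i):
    """Proj_c(T) for c = (V, i): keep pairs labelled i, restrict states to V."""
    return [
        (label, {v: state[v] for v in V if v in state})
        for label, state in trajectory
        if label == i
    ]


# Hypothetical trajectories of a program and of a candidate slice that
# omits statement 2 and never computes y.
T = [(1, {}), (2, {"x": 1}), (3, {"x": 1, "y": 2})]
T_slice = [(1, {}), (3, {"x": 1})]

same = proj(T, {"x"}, 3) == proj(T_slice, {"x"}, 3)
```

The two trajectories differ in length and in the variables they record, yet `same` is true because they agree on x every time label 3 is reached. For end slicing each projection has exactly one element.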


2.5.3 Slicing and non-Termination

Using Weiser's definition, every program is semantically a valid slice of p, for non-terminating programs p. Weiser himself noticed that, using his algorithm, there is no guarantee that a slice will fail to halt whenever the original program fails to halt. In other words, the Weiser slice of a program may terminate in some states where the original did not. So the slices produced by Weiser's Algorithm do not preserve a projection of the standard semantics [88] of programs.

2.5.4 The Semantics of the PDG approach

Horwitz et al. [56] show that a program dependence graph (where the nodes contain the atomic statements and not just the defined and referenced variables) is an adequate structure for representing a program's execution behaviour, in the sense that two programs with the same program dependence graph have the same standard semantics. Reps and Yang [83] prove that the program dependence graph approach to slicing preserves Weiser's semantics, i.e. for any initial state where the original program terminates, the slice also terminates with the same sequence of values for each element of the slice. The converse is not true, i.e. in some states the slice may terminate when the original program does not.

Cartwright and Felleisen [21] define a lazy semantics of programs which they show is preserved by dataflow slicing algorithms like Weiser's Algorithm [92] and the program dependence graph approach [82]. Before their work is discussed, a short introduction to standard denotational semantics is required.

2.5.5 Standard Semantics

Definition 2.5.3 (state)
In denotational semantics [88], a state, $\sigma \in \Sigma$, is a mapping from variables to values from a set V.

$$\sigma \in \Sigma = variables \to V$$

For example, the function


$$\sigma = \{\, x \mapsto 5,\ y \mapsto 1,\ z \mapsto 2 \,\}$$

represents the state where variable x has the value 5, variable y has the value 1 and z the value 2.

The meaning of a program is given by a function from states to states [88].

$$\mathcal{M} : P \to \Sigma \to \Sigma$$

where P is the set of all programs. Given an initial state $\sigma$, the final state reached after executing p starting in state $\sigma$ is thus written $\mathcal{M}[\![p]\!]\sigma$. If the program p does not terminate when started in state $\sigma$, then $\mathcal{M}[\![p]\!]\sigma$ has the special value $\bot$, pronounced 'bottom'. In standard semantics, in the bottom state all variables are deemed to have the value $\bot$. So the bottom state is the function that maps every variable name to $\bot$.

The final value of variable x after executing p in initial state $\sigma$ is thus written [9] $\mathcal{M}[\![p]\!]\sigma\, x$.

Ordering on States

In the standard semantics, the ordering on states is such that two distinct non-$\bot$ states are incomparable and $\bot$ is less than every state. The reason an ordering is required is that the meaning of a loop is defined to be the least fixed point of a function. The ordering expresses the sense in which the fixed point is the least.

Evaluating Expressions

The meaning of an expression e is given by the function $\mathcal{E}$ which evaluates an expression in a state to give a value.

$$\mathcal{E} : expressions \to \Sigma \to V$$

Footnote 9: Function application associates to the left, i.e. f g h x means ((f g) h) x.


Strictness of $\mathcal{E}$ in Standard Semantics

A function is strict if it gives $\bot$ when applied to $\bot$. In standard semantics $\mathcal{E}$ is strict. In other words, evaluating every expression in the $\bot$ state will give the $\bot$ value, even expressions like 3 that do not reference any variables.

Assignment Statements

The meaning of an assignment statement is the conventional [88]

$$\mathcal{M}[\![x := e]\!] = \lambda\sigma.\ \sigma[x \leftarrow \mathcal{E}[\![e]\!]\sigma]$$

$f[x \leftarrow y]$ is the function that is the same as f except that x is mapped to y. This is known as the update operation. The new state after doing an assignment statement x:=e is the same as the old except that x gets mapped to the value of the expression e evaluated in the old state.

Sequences of Statements

The meaning of s1 followed by s2 is the composition of the meaning of s2 with the meaning of s1.

$$\mathcal{M}[\![s_1; s_2]\!] = \lambda\sigma.\ \mathcal{M}[\![s_2]\!](\mathcal{M}[\![s_1]\!]\sigma)$$

Conditionals

For conditionals, the guard is evaluated in the current state. If it is true then the result is the final state after executing the then part, otherwise it is the final state after executing the else part.

$$\mathcal{M}[\![\text{if } b \text{ then } s_1 \text{ else } s_2]\!] = \lambda\sigma.\ [\,\mathcal{E}[\![b]\!]\sigma \to \mathcal{M}[\![s_1]\!]\sigma,\ \mathcal{M}[\![s_2]\!]\sigma\,]$$

The notation $[a \to b, c]$ corresponds to the function which if a is true gives b, if a is false gives c and if a is $\bot$ gives $\bot$.
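The three clauses above translate almost symbol for symbol into higher-order functions. The following is a minimal sketch under assumptions of its own: $\bot$ is encoded as Python's `None`, a proper state is a dictionary, expressions are Python callables on states, and the names `E`, `assign`, `seq` and `cond` are hypothetical.

```python
BOTTOM = None   # assumed encoding of the bottom state and the bottom value


def E(expr):
    """Strict expression evaluation: evaluating anything in BOTTOM gives BOTTOM."""
    def run(state):
        if state is BOTTOM:
            return BOTTOM
        return expr(state)
    return run


def assign(x, expr):
    """M[[x := e]]: update x with the value of e in the old state."""
    def run(state):
        value = E(expr)(state)
        if value is BOTTOM:          # with a strict E this means state is BOTTOM
            return BOTTOM
        new_state = dict(state)
        new_state[x] = value
        return new_state
    return run


def seq(s1, s2):
    """M[[s1; s2]]: the meaning of s2 composed with the meaning of s1."""
    return lambda state: s2(s1(state))


def cond(b, s1, s2):
    """M[[if b then s1 else s2]], giving BOTTOM when the guard is BOTTOM."""
    def run(state):
        guard = E(b)(state)
        if guard is BOTTOM:
            return BOTTOM
        return s1(state) if guard else s2(state)
    return run


# x:=5; if x>3 then y:=x+1 else y:=0
prog = seq(
    assign("x", lambda s: 5),
    cond(lambda s: s["x"] > 3,
         assign("y", lambda s: s["x"] + 1),
         assign("y", lambda s: 0)),
)
final = prog({})
stuck = prog(BOTTOM)
```

Running `prog` from the empty state gives the state mapping x to 5 and y to 6, while running it from `BOTTOM` gives `BOTTOM`. Even the assignment of the constant 5 cannot escape the bottom state, which is exactly the strictness of $\mathcal{E}$.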


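Loops

The meaning of a while loop $\mathcal{M}[\![\text{while } b \text{ do } s]\!]$ is defined to be the least fixed point of the function

$$\lambda\omega.\ \lambda\sigma.\ [\,\mathcal{E}[\![b]\!]\sigma \to \omega(\mathcal{M}[\![s]\!]\sigma),\ \sigma\,]$$

which is shorthand for:-

$$\lambda\omega.\ \lambda\sigma.\ \begin{cases} \omega(\mathcal{M}[\![s]\!]\sigma) & \text{if } \mathcal{E}[\![b]\!]\sigma \\ \sigma & \text{otherwise} \end{cases}$$

The type of this function is (state $\to$ state) $\to$ (state $\to$ state). So its least fixed point is a state to state function.

Meaning of Loops (Example)

Consider how $\mathcal{M}[\![\text{while true do x:=1}]\!]$ is evaluated. By definition, it is the least fixed point of

$$F = \lambda\omega.\ \lambda\sigma.\ \begin{cases} \omega(\mathcal{M}[\![\text{x:=1}]\!]\sigma) & \text{if } \mathcal{E}[\![\text{true}]\!]\sigma \\ \sigma & \text{otherwise} \end{cases}$$

We are looking for the least $\omega$ such that

$$\omega = \lambda\sigma.\ \begin{cases} \omega(\mathcal{M}[\![\text{x:=1}]\!]\sigma) & \text{if } \mathcal{E}[\![\text{true}]\!]\sigma \\ \sigma & \text{otherwise} \end{cases}$$

$\mathcal{E}$ is strict so

$$\mathcal{E}[\![\text{true}]\!]\sigma = \begin{cases} \bot & \text{if } \sigma = \bot \\ \text{true} & \text{otherwise} \end{cases}$$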


From this it is clear that $\lambda\sigma.\bot$ is a fixed point of F and hence the least.

This means the program while true do x:=1 will fail to terminate, whatever its initial state.

Example

Now consider

$$\mathcal{M}[\![\text{while true do x:=1; x:=1}]\!].$$

Using the sequence rule this gives

$$\mathcal{M}[\![\text{x:=1}]\!](\bot)$$

which by the assignment rule gives

$$\bot[x \leftarrow \mathcal{E}[\![1]\!]\bot]$$

which is $\bot$, since $\mathcal{E}[\![1]\!]\bot = \bot$ because $\mathcal{E}$ is strict in both its arguments.

2.5.6 Lazy Semantics

Lazy semantics is a term usually applied to functional languages [42]. An interpreter that performs lazy evaluation will result in some programs terminating that would not do so if the opposite form of evaluation, called eager evaluation, were used. The reason this happens is that in lazy evaluation, when applying a function to some arguments, the arguments are only evaluated if their value is needed. In eager evaluation, on the other hand, the arguments are always evaluated before the function is applied. If evaluating an argument leads to non-termination, and this argument is not needed, then eager evaluation will lead to non-termination but lazy evaluation may not. An example is given in the functional language Hope [6].


    dec f,g : num -> num;
    g(n) <= g(n);
    f(n) <= 1;

Eager evaluation of the term f(g(0)) will produce non-termination ($\bot$) whereas lazy evaluation will produce 1.

Now g(0) produces $\bot$ in both lazy and eager evaluation. So in lazy evaluation f(g(0)) = f($\bot$), which evaluates to 1. Using lazy evaluation, function application is not strict, whereas using eager evaluation function application is strict.

Unlike in functional languages, it does not make sense to have a lazy interpreter for imperative languages, since in imperative languages we are interested in intermediate computation and not just the final result. It is, however, still possible to define a lazy semantics of imperative languages. (The standard semantics is eager.)

For imperative programs, the state in lazy semantics can map some variables to $\bot$ and others to proper (non-$\bot$) values. The ordering on states is the same, however. Evaluating expressions in a state where some of the variables get mapped to $\bot$ can produce a non-$\bot$ value if none of the variables needed to evaluate the expression are mapped to $\bot$, i.e. $\mathcal{E}$ is non-strict.

As will be shown, even in lazy semantics, an infinite loop results in the same state (namely, $\lambda x.\bot$). The difference is that in lazy semantics, subsequent assignments can cause this state to be 'recovered'.

Consider again the program while true do x:=1.

Using lazy semantics, we are looking for the least $\omega$ such that

\[ \omega = \lambda\sigma. \begin{cases} \omega(\mathcal{M}_{lazy}[\![\texttt{x:=1}]\!]\sigma) & \text{if } \mathcal{E}[\![\texttt{true}]\!]\sigma \\ \sigma & \text{otherwise} \end{cases} \]

In lazy semantics, $\mathcal{E}[\![\texttt{true}]\!]\sigma = \text{true}$ for all $\sigma$.

So we are looking for the least $\omega$ such that

\[ \omega = \lambda\sigma.\,\omega(\mathcal{M}_{lazy}[\![\texttt{x:=1}]\!]\sigma) \]


which is still $\lambda\sigma.\bot$. But now consider $\mathcal{M}_{lazy}[\![\texttt{while true do x:=1; x:=1}]\!]$. Again, as above, this gives

\[ \bot[x \mapsto \mathcal{E}[\![1]\!]\bot] \]

but because $\mathcal{E}$ is non-strict, this gives

\[ \bot[x \mapsto 1]. \]

This is the state that maps every variable to $\bot$ except x, which is mapped to 1, since $\mathcal{E}$ is non-strict in both its arguments. In other words the program has 'recovered from' the infinite loop. Projecting the lazy meaning of the program above onto the variable x gives the value 1, although in 'normal execution' the program will not terminate.

The fact that slicing preserves lazy semantics has the consequence that slicing is allowed to introduce termination. While lazy semantics is the norm for functional programming languages, it is not normally associated with the meaning of imperative programs, to which slicing is almost exclusively applied (footnote 10).

Footnote 10: Slicing has also been applied to functional style notations [96].

Consider the example program in Figure 2.13. A static slice constructed with respect to (x,3) will (conventionally) contain line 3 alone. The fact that line four will never be executed when y is initially greater than 0 is of no consequence. In the lazy semantics of this program the final value of the variable x is 1, whatever the initial state.

    1  while(y<0)
    2  do y:=y+1;
    3  x:=1;

Figure 2.13: Non-Termination Preservation

2.5.7 Statement Minimal Slices

Clearly, by definition, every program is a slice of itself and in general the slice of a program is not unique, since we can add statements of p to p' which have no effect on the slicing criterion and still have a slice. Since every program is a slice of itself, a correct but useless slicing algorithm would be one that simply performed the identity function on programs. It is clearly desirable to have an algorithm that produces slices that are 'as small as possible'. A statement minimal slice is one where as many statements as possible have been deleted.

Weiser showed that it is not possible to write an algorithm for finding statement minimal slices for arbitrary programs. He did this by showing that if we could find statement minimal slices we could also solve the halting problem. To find whether a given program p halts, simply compute the end slice of x with respect to the program q:-

    Program q:
        Program p
        ...............
        x:=1

If p does not terminate, the statement minimal end slice of program q will give the empty program, and if it does terminate it will give x:=1.

2.5.8 Dataflow Minimality Problem

This idea is central to this thesis. An account of dataflow minimality is given in Chapter 1 and is not repeated here.


2.5.9 Venkatesh's Work

The major aim of the work by Venkatesh [91] is to separate definitions of slices from the algorithms which compute them.

Venkatesh [91] introduced and claims to formally define the semantics of a variety of already existing forms of slice as well as introducing some of his own.

Like Weiser, his idea of a slice was not as a unique object. Slices are programs which preserve some projection of the semantics of the original program. Programs are all slices of themselves.

He defines a simple procedural language L with assignments, conditionals and loops (all statements being uniquely labelled). He defines a slice to be a set of labels, and defines a function 'syn(s, L)' which constructs a legal program from a program and a subset of its labels in the obvious way. Although Venkatesh does not point it out, because of the properties of control dependence, the author believes that using Weiser's Algorithm, L = syn(s, L). In other words the set of labels produced by Weiser's algorithm when applied to structured programs will be complete programs.

A set of labels L is a static backward end slice of p with respect to variable v if and only if for all states $\sigma$

\[ \mathcal{M}[\![p]\!]\sigma(v) = \mathcal{M}[\![syn(p, L)]\!]\sigma(v) \]

A Dynamic Slice is defined in terms of an initial state $\sigma_0$. He defines L to be a dynamic slice with respect to program p and variable v if and only if

\[ \mathcal{M}[\![p]\!]\sigma_0(v) = \mathcal{M}[\![syn(p, L)]\!]\sigma_0(v) \]

A static slice is therefore a special form of dynamic slice.

These definitions are somewhat inconsistent with the statement that 'program slices are only considered meaningful for terminating computations' as they clearly imply that the slice and the original program must agree even when the original program fails to terminate. Venkatesh probably meant:- A set of labels L is a static backward end slice of p with respect to variable v if and only if for all states $\sigma$

\[ \mathcal{M}[\![p]\!]\sigma \neq \bot \implies \mathcal{M}[\![p]\!]\sigma(v) = \mathcal{M}[\![syn(p, L)]\!]\sigma(v) \]
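The two conditions above can be made concrete with a small sketch. The following Python is an illustration only and not Venkatesh's formalism: it assumes a straight-line language of labelled assignments (no conditionals or loops), represents a program as a list of hypothetical `(label, variable, expression)` triples, and approximates 'for all states' by a finite list of sample states. The names `run`, `syn`, `is_dynamic_slice` and `agrees_on_states` are introduced here for illustration.

```python
def run(program, state):
    """M[[p]]: execute a straight-line program from an initial state."""
    s = dict(state)
    for _label, var, expr in program:
        s[var] = expr(s)
    return s


def syn(program, labels):
    """Keep only the statements whose label is in `labels`."""
    return [stmt for stmt in program if stmt[0] in labels]


def is_dynamic_slice(program, labels, v, s0):
    """Check M[[p]]s0(v) == M[[syn(p, L)]]s0(v) for one initial state s0."""
    return run(program, s0)[v] == run(syn(program, labels), s0)[v]


def agrees_on_states(program, labels, v, states):
    """Finite approximation of the static condition (sampled states only)."""
    return all(is_dynamic_slice(program, labels, v, s) for s in states)


# Hypothetical example program:
#   1: y := x + 1;   2: z := 0;   3: w := y * 2
p = [
    (1, "y", lambda s: s["x"] + 1),
    (2, "z", lambda s: 0),
    (3, "w", lambda s: s["y"] * 2),
]
s_a = {"x": -1, "y": 0, "z": 0, "w": 0}
s_b = {"x": 5, "y": 0, "z": 0, "w": 0}

print(is_dynamic_slice(p, {3}, "w", s_a))            # True
print(agrees_on_states(p, {3}, "w", [s_a, s_b]))     # False
print(agrees_on_states(p, {1, 3}, "w", [s_a, s_b]))  # True
```

In this toy example the label set {3} satisfies the dynamic condition for the single state `s_a` but fails once a second state is considered, whereas {1, 3} agrees on both sampled states. This mirrors the remark that a static slice is a special form of dynamic slice, while the converse need not hold.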


Interestingly, he introduces the idea of a Closure Slice. A closure slice is just a set of labels that 'have an effect' on the variable(s) of interest. He states that

'In the case of closure slices, we are not interested in combining the labels to make a semantically correct subprogram.'

Algorithms which produce closure slices presumably, unlike Weiser's algorithm, have the property that L $\neq$ syn(s, L), and the behaviour of syn(s, L) may not be the same as the behaviour of the original program (even with respect to the slice variable).

He defines closure slices in terms of contamination. A label l must be included in a closure slice with respect to variable v iff the effect of 'contaminating' the expression labelled l percolates through to affect v. Although expressed completely differently, this is the same as label dependence introduced in this thesis.

Venkatesh gives a collection of formal definitions of different types of slice. These include the Dynamic Backward Closure Slice, the Dynamic Backward Executable Slice, the Static Backward Closure Slice and the Static Backward Executable Slice, all of which, unlike Weiser's definition, are functions.

He also introduces forward slicing, where, given a variable v, we are interested in all the expressions which are affected by the initial value of v. Quasi-static slicing is also introduced. This is simply where a program and its slice must agree not on a single state as in the dynamic case, nor on all states as in the static case, but on a prefix of the input.

In the final section of his paper Venkatesh introduces two algorithms, one for dynamic and the other for static slicing. The latter he claims is equivalent to that used in [59] for inter-procedural slicing. His static slicing algorithm appears to produce the same slices as those of Weiser's algorithm. It is a reformulation of Weiser's algorithm using programs rather than control flow graphs. It has an accumulating parameter L for percolating control dependence information.

The main contribution of Venkatesh's work is that it introduces the idea that there are many different feasible semantic definitions of a slice. A major problem with the work is that although his definitions are formal, they are not projections of the standard semantics, so have very little intuitive value in terms of program behaviour. The author believes that since slicing is about projecting properties of program behaviour on to a set of variables, a slice semantics is only useful if it is defined in terms of a program's standard semantics since that


defines how programs behave when they are executed. It is not helpful to have to understand a whole new interpretation of a programming language in order to understand the meaning of a program slice.

2.5.10 Hausler's Work

Two years before Venkatesh, Hausler [52] states the same definition of a slice as Weiser, namely that a slice S of P can be obtained from P by deleting zero or more statements and that if P halts on input i with values for the variables in the slicing criterion, then so does S with the same values for these variables. Hausler, like Venkatesh, only considers end slicing. Although claiming to have given a denotational definition of a slice, he has really just written a slicing algorithm in a functional language. His algorithm is a dataflow algorithm (in the sense that it works at the level of abstraction of defined and referenced variables) and appears, like Venkatesh's, to be another formulation of Weiser's Algorithm. The strength of Hausler's work lies in the fact that he expresses a slicing algorithm without explicitly mentioning a control flow graph. His algorithm works directly on programs. He does not explicitly use data and control dependence but they are, nevertheless, encoded in his algorithm.

He uses two mutually recursive functions:

\[ \delta : P \times PV \rightarrow PV \]

(which is his version of variable dependence) and

\[ \sigma : P \times PV \rightarrow P \]

which is the function that produces the slice.

$\delta$ is defined in terms of a function 'used' which is, in fact, the referenced variables of an expression, so we shall call it 'ref'. The rules for $\delta$ and $\sigma$ are now given:-

The abstract syntax of the language he considers is of the form:-

\[ \pi ::= x := E \;\mid\; list(\pi) \;\mid\; \texttt{if } B \texttt{ then } \pi_0 \texttt{ else } \pi_1 \;\mid\; \texttt{while } B \texttt{ do } \pi \]

For statement lists, the rules for $\delta$ are as follows:-


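\[ \delta(nil, V) = V \]
\[ \delta(append(a, b), V) = \delta(a, \delta(b, V)) \]

and for single statements, the rules for $\delta$ are as follows:-

\[ \delta(x := E, V) = \begin{cases} (V - \{x\}) \cup ref(E) & \text{if } x \in V \\ V & \text{otherwise} \end{cases} \]

\[ \delta(\texttt{if } B \texttt{ then } \pi_0 \texttt{ else } \pi_1, V) = \begin{cases} ref(B) \cup \delta(\pi_0, V) \cup \delta(\pi_1, V) & \text{if } \sigma(\pi_0, V) \neq nil \text{ or } \sigma(\pi_1, V) \neq nil \\ V & \text{otherwise} \end{cases} \]

\[ \delta(\texttt{while } B \texttt{ do } \pi, V) = \bigcup_{n \in \mathbb{N}} \delta^{n}(\texttt{if } B \texttt{ then } \pi \texttt{ else } nil, V) \]

where

\[ \delta^{n+1}(\pi, V) = \delta(\pi, \delta^{n}(\pi, V)) \]

and

\[ \delta^{0}(\pi, V) = V \]

For statement lists, the rules for $\sigma$ are as follows:-

\[ \sigma(nil, V) = nil \]
\[ \sigma(append(a, b), V) = append(\sigma(a, \delta(b, V)), \sigma(b, V)) \]

and for single statements, the rules for $\sigma$ are as follows:-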


\[ \sigma(x := E, V) = \begin{cases} x := E & \text{if } x \in V \\ nil & \text{otherwise} \end{cases} \]

\[ \sigma(\texttt{if } B \texttt{ then } \pi_0 \texttt{ else } \pi_1, V) = \begin{cases} \texttt{if } B \texttt{ then } \sigma(\pi_0, V) \texttt{ else } \sigma(\pi_1, V) & \text{if } \sigma(\pi_0, V) \neq nil \text{ or } \sigma(\pi_1, V) \neq nil \\ nil & \text{otherwise} \end{cases} \]

\[ \sigma(\texttt{while } B \texttt{ do } \pi, V) = \begin{cases} \texttt{while } B \texttt{ do } \sigma(\pi, V^{*}) & \text{if } \sigma(\pi, V^{*}) \neq nil \\ nil & \text{otherwise} \end{cases} \]

where $V^{*} = \bigcup_{n \in \mathbb{N}} \delta^{n}(\texttt{if } B \texttt{ then } \pi \texttt{ else } nil, V)$.

This appears to be a reformulation of Weiser's Algorithm and is therefore not dataflow minimal. The parallel slicing algorithm [28] strongly resembles this. The accumulating parameter V corresponds exactly to the set of variables on each arc. Control flow is captured by the fact that a predicate is included if any of the statements within its body are included.

Importantly, Hausler proves that the above 'semantics' can be translated into an algorithm. The only part that was in question was the computability of

\[ \bigcup_{n \in \mathbb{N}} \delta^{n}(\pi, V) \]

2.5.11 Unfolding

In his rule for $\delta$ applied to while loops he states:-

'Recall, semantically, the while statement is equivalent to if then statement composed with a while statement, i.e.

\[ \delta([\![\texttt{while } b \texttt{ do } S]\!], V) = \delta([\![\texttt{if } b \texttt{ then } S; \texttt{while } b \texttt{ do } S]\!], V) \]

In the $\delta$ definition, the while loop is just being unraveled, into one or more if statements composed together. $\delta^{*}$ accounts for zero or more iterations of the while loop.


...

It is not practical to iterate the loop an 'unknown' number of times. If the program slicer is to be used in an effective manner, the loop must be executed in order to compute $\delta^{*}$ for that loop, only a finite number of times. Fortunately, this is possible:

...

• First, it will be shown the number of times that a while loop has to be iterated in order to find the relevant variables can be computed.
• Secondly, this value can be computed primitive recursively.
• Thirdly, an easily computable upper bound exists on the number of necessary loop iterations. In fact, this upper bound can be computed from syntactic information alone, based on the type of statements in the body of the loop.'

Hausler proves that there is no problem with this as the construct is clearly monotonic and bounded above. He states:

'At some point in the computation, there will exist an n such that $\delta^{n-1}(\pi, V) = \delta^{n}(\pi, V)$ which implies that no new transitive effects were found. $\delta^{n+1}(\pi, V)$ does not have to be computed because nothing new will be added to $\delta^{n}$.'

In the work introduced in this thesis, loops are unravelled (footnote 11) in a similar manner.

Footnote 11: We call it unfolding.

2.6 Dynamic Slicing

Dynamic slices, introduced by Korel and Laski [71], have the potential to be much smaller than static ones since, rather than having to agree in all states, a program and its dynamic slice need only behave the same in one particular given initial state. This initial state, $\sigma_0$ say, becomes the third component of the slicing criterion in dynamic slicing.

Korel and Laski [71] were the first to introduce such a dynamic definition of a slice. A dynamic slice need only preserve the effect of the original program upon the slicing criterion when supplied with input x. The dynamic paradigm is ideally suited to bug-location, because a bug is typically detected as the result of the execution of a program with respect to some specific input, rather than by static consideration of the program's properties.

Consider the example in Figure 2.14. The author of this program was hoping that it would output the product $1 \times \cdots \times n$, where n is the value input. Suppose the original program has been executed and the value entered for the variable n was 1. The value printed at the end of the execution is incorrect: it is 0 when it should be 1. To locate the bug which causes this error a dynamic slice is constructed (see Figure 2.14). The dynamic slice only identifies those statements which contribute to the value of the variable p when the input 1 is supplied to the program. Locating the bug (the faulty initialisation of p) in terms of the dynamic slice is thus easier than with either the original program or the corresponding static slice.

Original program:

     1  read(n);
     2  s:=0;
     3  p:=0;
        while (n>1)
        do begin
     6    s:=s+n;
     7    p:=p*n;
     8    n:=n-1;
        end;
    10  write(p);

Dynamic slice:

        p:=0;

Figure 2.14: Original Program and Dynamic Slice w.r.t. ({p}, 10, <1>)

This is a rather extreme example of a dynamic slice, because the input causes the while loop to be ignored. However, dynamic slicing allows an improvement in precision in several ways. Clearly statements which remain unexecuted are not included in a dynamic slice. In addition, statements which are executed and create data and control dependencies may be removed from the slice should these dependencies be subsequently 'overwritten' during the execution.

In [5], Agrawal and Horgan use an ad-hoc approach to dynamic slicing that uses a program's program dependence graph. They give four algorithms for performing dynamic slicing.


The first two produce unnecessarily large slices. The third algorithm is the main one. It produces more accurate slices than the first two algorithms. The fourth algorithm is simply a more efficient version of the third and is not discussed here. Each algorithm involves first executing the program in state $\sigma_0$, and recording its execution trace as a finite sequence of nodes of the control flow graph that were visited.

Agrawal and Horgan's First Algorithm

Example

     2  if x<0
        then begin
     3    y:=f1(x);
     4    z:=g1(x)
        end
     5  else if x=0
        then begin
     6    y:=f2(x);
     7    z:=g2(x)
        end
        else begin
     8    y:=f3(x);
     9    z:=g3(x)
        end;
    10  write(y);
    11  write(z)

Figure 2.15: Example Program

Consider the program dependence graph of this program given in Figure 2.16. (We use dotted lines to represent control dependency and solid lines to represent data dependency.)


[Figure 2.16: PDG (nodes 2 to 11)]

The standard program dependence graph approach to static slicing says that to compute the static slice with respect to variable y at line 10, we simply follow all the arrows starting at node 10. This gives {2, 3, 5, 6, 8, 10}.

Their first algorithm simply intersects the program dependence graph with the set of nodes that occur in the execution trace and then uses conventional static slicing on this reduced program dependence graph.

If we run the program in Figure 2.15 in an initial state with x = -1 we get the trace <2, 3, 4, 10, 11>.

Intersecting the program dependence graph in Figure 2.16 with {2, 3, 4, 10, 11} gives the program dependence graph in Figure 2.17.

[Figure 2.17: Reduced PDG (nodes 2, 3, 4, 10, 11)]

Slicing this program dependence graph with respect to y at node 10 gives a correct and small dynamic slice {2, 3, 10}.

As pointed out in their paper this 'naive approach' does not lead to very accurate slices. Consider the program in Figure 2.18:


    5  while i<n
       do begin
    6    z:=f1(z,y);
    7    y:=f2(y);
    8    i:=i+1
       end;
    9  write(z)

Figure 2.18: Example

If we execute the program in Figure 2.18 in a state where i = 0 and n = 1 we get the execution trace <5, 6, 7, 8, 5, 9>.

The program dependence graph for this example is given in Figure 2.19.

[Figure 2.19: PDG (nodes 5 to 9)]

So if we intersect this program dependence graph with the trace <5, 6, 7, 8, 5, 9> we get the whole program dependence graph. Static slicing this program dependence graph at node 9 with respect to z gives {5, 6, 7, 9}, so node 7 has been included unnecessarily: if the loop is executed only once, the assignment to y can have no effect on the final value of z.

Agrawal and Horgan's Second Algorithm

In Agrawal and Horgan's Second Algorithm, a reduced program dependence graph is constructed from the original program dependence graph and the trace. An arc from n1 to n2 in the program dependence graph is only made if n2 occurs before n1 in the trace.

Again, consider the program in Figure 2.18 with i = 0 and n = 1 which gives


the trace <5, 6, 7, 8, 5, 9>.

This approach yields the reduced program dependence graph in Figure 2.20.

[Figure 2.20: Reduced PDG (nodes 5 to 9)]

Slicing this program dependence graph starting at node 9 yields {5, 6, 8, 9} and thus the offending node 7 is this time not included.

Agrawal and Horgan claim this approach also produces inaccurate slices. Consider the example program in Figure 2.21.

     3  while i<n
        do begin
     4    read(x);
     5    if x<0
          then
     6      y:=f1(x)
     7    else y:=f2(x);
     8    z:=f3(y);
     9    write(z);
    10    i:=i+1
        end

Figure 2.21: Example Program 3

This gives the PDG in Figure 2.22.


[Figure 2.22: PDG (nodes 3 to 10)]

Consider the trace obtained from this program when the initial values are i = 1, n = 3 and the two values read in for x are -4 and 3. The trace produced is

<3, 4, 5, 6, 8, 9, 10, 3, 4, 5, 7, 8, 9, 10, 3>.

The approach described above leaves this program dependence graph intact. Static slicing the program dependence graph with respect to variable z at node 9 gives rise to the whole program. Node 6 has unnecessarily been included as it has no effect on the final value of z in this case.

Agrawal and Horgan's Third Algorithm

Their third algorithm uses a Dynamic Dependence Graph.

To create the dynamic dependence graph, in effect, the program being sliced is first unfolded (footnote 12) as often as necessary, determined by the length of the trace. Each repeated node is treated as a new node. The static slice of the resulting program dependence graph is then computed using the usual approach. A node is then deemed to be in the final slice if any of its instances are in the slice of the program dependence graph just described.

Footnote 12: Agrawal and Horgan do not express it in terms of unfolding.
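The idea behind the third algorithm can be sketched as follows. This is a simplified illustration and not Agrawal and Horgan's implementation: instead of building an explicit graph it treats each position in the trace as a separate node, links it to the most recent definition of every variable it references and to the most recent instance of its controlling predicate, and then traces back from the last instance of a chosen statement. The function name `dynamic_slice` and the `defs`, `uses` and `control` tables are assumptions introduced here. The tables encode the program of Figure 2.21 as read from the text, and a loop predicate is assumed to depend on earlier iterations only through its data dependence on i.

```python
def dynamic_slice(trace, defs, uses, control, criterion):
    """Return the statement labels reached from the last instance of `criterion`."""
    last_def = {}    # variable -> trace position of its latest definition
    last_seen = {}   # statement label -> trace position of its latest instance
    deps = []        # deps[k] = trace positions that position k depends on
    for k, label in enumerate(trace):
        # Data dependences: latest definitions of the referenced variables.
        d = {last_def[v] for v in uses.get(label, ()) if v in last_def}
        # Control dependence: latest instance of the controlling predicate.
        parent = control.get(label)
        if parent in last_seen:
            d.add(last_seen[parent])
        deps.append(d)
        for v in defs.get(label, ()):
            last_def[v] = k
        last_seen[label] = k
    if criterion not in last_seen:
        return set()
    work, reached = [last_seen[criterion]], set()
    while work:
        k = work.pop()
        if k not in reached:
            reached.add(k)
            work.extend(deps[k])
    # A statement is in the slice if any of its instances was reached.
    return {trace[k] for k in reached}


# Assumed encoding of the program in Figure 2.21.
defs = {4: ["x"], 6: ["y"], 7: ["y"], 8: ["z"], 10: ["i"]}
uses = {3: ["i", "n"], 5: ["x"], 6: ["x"], 7: ["x"],
        8: ["y"], 9: ["z"], 10: ["i"]}
control = {4: 3, 5: 3, 8: 3, 9: 3, 10: 3, 6: 5, 7: 5}

# Trace for i = 1, n = 3 and inputs -4, 3, -2 for x.
trace = [3, 4, 5, 6, 8, 9, 10,
         3, 4, 5, 7, 8, 9, 10,
         3, 4, 5, 6, 8, 9, 10,
         3]

print(sorted(dynamic_slice(trace, defs, uses, control, 9)))
# [3, 4, 5, 6, 8, 9, 10]
```

Under these assumptions node 7 is not reached, because the last instance of statement 8 depends on the instance of statement 6 executed in the third iteration rather than on the instance of statement 7 executed in the second.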

Consider again the example program in Figure 2.21 for the case where i = 1 and n = 3 and the successive values input for x are -4, 3, and -2. The trace of this is

<3a, 4a, 5a, 6a, 8a, 9a, 10a, 3b, 4b, 5b, 7b, 8b, 9b, 10b, 3c, 4c, 5c, 6c, 8c, 9c, 10c, 3d>

Since the loop is executed three times the loop is unfolded three times to give the program in Figure 2.23.


     3a  if i<n then begin
     4a    read(x);
     5a    if x<0 then
     6a      y:=f1(x)
           else
     7a      y:=f2(x);
     8a    z:=f3(y);
     9a    write(z);
    10a    i:=i+1;
     3b    if i<n then begin
     4b      read(x);
     5b      if x<0 then
     6b        y:=f1(x)
             else
     7b        y:=f2(x);
     8b      z:=f3(y);
     9b      write(z);
    10b      i:=i+1;
     3c      if i<n then begin
     4c        read(x);
     5c        if x<0 then
     6c          y:=f1(x)
               else
     7c          y:=f2(x);
     8c        z:=f3(y);
     9c        write(z);
    10c        i:=i+1
             end
           end
         end

Figure 2.23: Example Program 7

For each element of the trace there will be a unique node in the dynamic dependence graph.


This gives the program dependence graph in Figure 2.24.

[Figure 2.24: PDG (nodes 3a to 10a, 3b to 10b and 3c to 10c)]

This yields the reduced program dependence graph in Figure 2.25.


[Figure 2.25: PDG (nodes 3a, 4a, 5a, 6a, 8a, 9a, 10a; 3b, 4b, 5b, 7b, 8b, 9b, 10b; 3c, 4c, 5c, 6c, 8c, 9c, 10c)]

Agrawal and Horgan claim that to find the slice for a particular variable we simply find the node corresponding to the last definition of the variable and trace it back. This is only true for end slicing (footnote 13).

Footnote 13: There seems to be some confusion in their work between end and middle slicing. Strictly speaking we cannot slice with respect to z using the given program dependence graph. A dummy statement referencing z at the end of the program must first be included. Using statement 9 gives a middle slice.

In this case, slicing on z gives us node 9, which when traced back gives a slice {3, 4, 5, 6, 8, 9, 10}. As they rightly point out, this does not include node 7. Using middle slicing, however, we would have to trace back from each occurrence of the slicing criterion, and so dynamic slicing at node 9 would be the whole program including node 7.

Their fourth algorithm is a more efficient approach to producing the same dynamic slice as produced by their third algorithm. It recognises, as in [52], that programs need to be unfolded a fixed number of times (independent of the trace) to catch all necessary dependence information. A similar result is used in the main work of this thesis.

2.7 Symbolic Execution

The traditional execution model of a program is based on the concept of state, where a state is a mapping between variable names and values. The allowable values assigned to a variable v are defined by the type of v. The type of v may for example be int or char.

In symbolic execution the situation is different. The values assigned to variables in a state are symbolic expressions. Each symbolic state is thus a representation of a whole set of traditional states.

An important example of symbolic execution is in proving the correctness of programs. In order to perform program proofs using Hoare Logic [54], a form of symbolic execution is undertaken.

Example

The weakest precondition rule for assignment is:-

\[ wp(x := E, R) = R[E/x] \]

This is interpreted as:

In order to guarantee the truth of the predicate R after executing x := E we must ensure that the predicate $R[E/x]$ is true before executing x := E.


The predicate $R[E/x]$ means R with all occurrences of x replaced with the expression E. In other words, to compute $R[E/x]$, symbolically evaluate the predicate R in a symbolic state where the variable x has the symbolic value E.

Dannenberg and Ernst [29] explicitly use symbolic execution for program verification. Symbolic execution also occurs in the reduction of $\lambda$-expressions [24].

Besides proof of correctness, symbolic execution is used for assuring quality of software through testing. Symbolic execution was first used for program testing by King [69]. Here, symbolic execution is a form of static analysis, where the program is never actually executed. Instead, particular paths through the program are evaluated in detail. All the computations are performed symbolically, subject to constraints that may exist along the path because of the kind and number of conditionals that specify whether the path is executable. Howden [61] describes a symbolic testing and a symbolic execution system called DISSECT. The results of two classes of experiments in the use of symbolic execution are summarized. Several classes of program errors are defined and the reliability of symbolic testing in finding bugs is related to the classes of errors.

Huang [62] uses symbolic traces to increase error-detection capabilities of program tests and indicate the extent of their coverage. This instrumentation system generates traces automatically upon program execution.

Cimitile et al. [25] use symbolic execution in an approach to reverse engineering. With the help of theorem proving techniques they claim that they can recover the high level specification of functions from C programs. They are careful to point out that this process requires human interaction. They do not claim to have implemented their system.

Coen-Porisini et al. [27] use symbolic execution for software specialization.

Day [30] uses symbolic execution trees in the functional language Haskell [63], in a new automatic technique called symbolic functional evaluation (SFE), to evaluate semantic functions outside of a theorem proving environment. SFE produces the meaning of a specification.

These symbolic execution trees are defined as:-

    data State f a = CondS a (State f a) (State f a)
                   | Term (f a)


"A symbolic state captures a tree of possible execution paths that the machine could take."
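As a rough analogue of such a tree, the following Python sketch models a symbolic state as either a terminal value or a conditional with two sub-trees, and enumerates each path together with the conditions assumed along it. It is not Day's SFE implementation: the class names mirror the constructors `CondS` and `Term` of the Haskell declaration, but the string representation of conditions, the field names and the helper `paths` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Term:
    """A leaf: the symbolic value reached at the end of one path."""
    value: str


@dataclass
class CondS:
    """A branch on a symbolic condition with two possible continuations."""
    cond: str
    if_true: "State"
    if_false: "State"


State = Union[Term, CondS]


def paths(state: State,
          assumed: Tuple[str, ...] = ()) -> List[Tuple[Tuple[str, ...], str]]:
    """List every (path condition, terminal value) pair in the tree."""
    if isinstance(state, Term):
        return [(assumed, state.value)]
    return (paths(state.if_true, assumed + (state.cond,))
            + paths(state.if_false, assumed + ("not (" + state.cond + ")",)))


# Hypothetical tree for a three-way branch on x.
tree = CondS("x < 0", Term("f1(x)"),
             CondS("x = 0", Term("f2(x)"), Term("f3(x)")))

for condition, value in paths(tree):
    print(" and ".join(condition), "->", value)
```

Each printed line pairs one path condition with the symbolic value at the end of that path, which is the sense in which a single symbolic state stands for a whole set of traditional states.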


Chapter 3

Dataflow Dependencies

3.1 Introduction

Dataflow analysis [92, 53], by definition, is the act of inferring properties about a program from its control flow graph alone. Dataflow analysis is, thus, fairly limited. We cannot, for example, tell by looking at a program's control flow graph when two expressions in the program are equal, nor can we use any form of expression simplification. All the information required to do such things has been 'abstracted away' in converting the program into a control flow graph.

Weiser's algorithm is an example of dataflow analysis. It takes the control flow graph g of a program p as input and outputs a set of nodes, $N_g$ (a subset of the nodes of g). Since there is a one to one correspondence between the nodes of the control flow graph and the 'statements' of the corresponding program, this output uniquely determines which statements of p should be included in the slice of p. The slice, p', of p is the program derived from p and the set of nodes $N_g$ output by Weiser's algorithm.

\[
\begin{array}{ccc}
p & \xrightarrow{\ \text{semantic relationship}\ } & p' \\
{\scriptstyle s}\downarrow & & \uparrow{\scriptstyle f(p,N)} \\
g & \xrightarrow{\ \text{Slice}\ } & N_g \ (\text{set of nodes of } g)
\end{array}
\]

Clearly, using dataflow analysis, all programs with the same control flow graph will be treated identically. Weiser noticed [92, 93] that his algorithm is not dataflow minimal. (See


Section 1.5 for some examples.) For some control flow graphs, g, there exists a set of nodes, $N'_g$, which is a proper subset of $N_g$ and which has the property that for all programs p whose control flow graph is g, the required semantic relationship between p and $f(p, N'_g)$ is satisfied (where $f(p, N'_g)$ is the program derived from $N'_g$ and p).

A dataflow minimal algorithm A would be one for which no such smaller sets of nodes exist, i.e. for every control flow graph g, the set of nodes $A_g$ produced by A is such that for any proper subset N' of $A_g$, there will exist a program p whose control flow graph is g but for which the required semantic relationship between p and $f(p, N')$ is not satisfied.

Dataflow minimality is of interest because if an algorithm for program analysis can be shown to be dataflow minimal, it is guaranteed to be the most precise algorithm achievable using dataflow analysis alone. Surprisingly, no work appears to have been done to investigate whether dataflow minimal slices are indeed achievable.

This chapter introduces eight dependence relations, all defined in terms of the semantics of programs [88].

1. VD and TVD, both binary relations between the variables of a program.
2. LD and TLD, both binary relations between the variables and labels of a program.

Programs with the same control flow graph are termed dataflow equivalent. Dataflow equivalence is formally defined and schemas [44] are used for representing classes of programs with the same control flow graph.

For each of the above four dependence relations on programs, equivalent ones are defined:

1. DVD and DTVD, both binary relations between the variables of a schema.
2. DLD and DTLD, both binary relations between the variables and labels of a schema.

These 'dataflow dependencies' (two of which are a form of slicing) are defined in such a way that any algorithm for computing them must be dataflow minimal.

3.2 Minimal Slicing Algorithms

Definition 3.2.1 (Slice Preserving Algorithms)
Given an algorithm, A, whose input objects are of type I and whose output objects are of type O, and given a slice relation, a binary relation R between I and O, A is R-preserving if

Page 85: Data o - UCL Computer Sciencew Graphs .. 36 2.2.2 Data o w Analysis. 36 2.2.3 Inheren t Inaccuracies in Data o w Analysis. 37 2.2.4 T ... A Comparison of Data o w Lab el Dep endence

3.2 Minimal Sli ing Algorithms 85and only if for all i in I , R(i; A(i)) holds.De�nition 3.2.2 (Minimal Algorithms)Furthermore, given an ordering (�) on the elements of O, A is onsidered R{minimal if andonly if for all o, (o � A(i)) implies that R(i; o) does not hold.These de�nitions are essentially, the `equivalen e relation' and the `simpli ity measure'des ribed in [47℄.3.2.1 Example: Statement Minimal Weiser Sli esA statement minimal end{sli ing algorithm, with respe t to variable x has input obje ts andoutput obje ts whi h are both programs. The ordering between programs is p v q if and onlyif p is a synta ti sub{ omponent of q. The sli e relation in this ase, using Weiser's de�nitionis R1(p; q) if and only if for all states when p terminates, q terminates with the same valuefor x.3.2.2 Example: Data ow Minimal Weiser Sli ingA data ow minimal end{sli ing algorithm, with respe t to variable x has input obje ts whi hare ontrol ow graphs and output obje ts whi h are sets of nodes. The ordering in this aseis simply �. The sli e relation, in this ase, is given by:-R2(g;N) if and only if for all p whose ontrol ow graph is g, R1(p; q) where q is the program`derived from' p and N . (R1 is the sli e relation de�ned in Se tion 3.2.1.)De�nition 3.2.3 (Data ow Algorithms)A data ow algorithm is one whose inputs are ontrol ow graphs, or some representationthereof.De�nition 3.2.4 (Data ow Minimal Algorithms)We de�ne a data ow minimal algorithm to be a data ow algorithm that is minimal.Applying Weiser's algorithm to the ontrol ow graph, g, of p1:2 in Figure 1.3(page 25)gives rise to the set of nodes f1; 2; 3; 4; 5g. It has been shown (Chapter 1, Se tion 1.5) thatg and the set f1; 2; 4; 5g satisfy the sli e semanti s with respe t to end{sli ing on variablex. The fa t that f1; 2; 4; 5g $ f1; 2; 3; 4; 5g implies that Weiser's Algorithm is not data owminimal. (But it is data ow.)
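For finite node sets ordered by $\subset$, the minimality condition of Definition 3.2.2 can be illustrated by brute force. The following is only a sketch under stated assumptions: `toy_relation` is a hypothetical stand-in for the slice relation $R_2(g, \cdot)$, which is not claimed to be computable, and it simply hard-codes the assumption that a node set satisfies the slice semantics exactly when it contains $\{1, 2, 4, 5\}$. The helper names `proper_subsets` and `is_minimal` are likewise illustrative.

```python
from itertools import combinations

def proper_subsets(nodes):
    """Yield every proper subset of a finite set of nodes."""
    items = sorted(nodes)
    for size in range(len(items)):          # sizes 0 .. len-1, so never the full set
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_minimal(nodes, satisfies):
    """True if `nodes` satisfies the relation (preserving) and no proper
    subset of `nodes` satisfies it (minimal)."""
    return satisfies(nodes) and not any(satisfies(s) for s in proper_subsets(nodes))

def toy_relation(nodes):
    """Hypothetical oracle: an assumption made for illustration only."""
    return frozenset({1, 2, 4, 5}) <= frozenset(nodes)

weiser_output = frozenset({1, 2, 3, 4, 5})
smaller = frozenset({1, 2, 4, 5})

weiser_is_minimal = is_minimal(weiser_output, toy_relation)   # a smaller valid set exists
smaller_is_minimal = is_minimal(smaller, toy_relation)
```

Under this toy oracle the five-node set fails the check because a proper subset also satisfies the relation, which mirrors the argument in the text.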


3.3 Discussion

It could be argued that Weiser's algorithm produces non-dataflow-minimal slices only in `pathological' cases. Nevertheless, it is not known how to predict when this situation will occur. When it does, the impact on the size of the slice, because of the transitivity of dependence, could be enormous. A single unnecessary extra node being included in a slice will cause the inclusion of all the nodes upon which this `unnecessary' node depends.

`Traditional dependence' (see Section 2.2.4) is the transitive closure of the union of control and data dependence.

What does it mean for node $n$ to depend on node $m$ (in the traditional sense, i.e. $(D \cup C)^{*}$) in the control flow graph of $p$?

An attempt to interpret the meaning of traditional dependence semantically could, for example, wrongly but plausibly postulate that node $n_1$ depends on node $n_2$ in a control flow graph $g$ if and only if there exists a program $p$ whose control flow graph is $g$ in which the execution of the statement of $p$ corresponding to node $n_2$ `has an effect' on the statement of $p$ corresponding to node $n_1$. The phrase statement $s_2$ `has an effect' on the statement $s_1$ could be taken to mean: some execution of statement $s_2$ either affects the value of the expression in statement $s_1$ (data dependence) or affects whether a particular execution of statement $s_1$ takes place at all (control dependence)¹.

As has been shown in the previous examples, there are cases when $n_1$ `depends on' $n_2$ with respect to control flow graph $g$ but there are no programs with control flow graph $g$ in which the statement corresponding to $n_2$ has an effect on the statement corresponding to $n_1$. For example, in program $p_{1.2}$ in Figure 1.2 (page 24) node 4 depends on node 3 by transitivity, since node 2 is data dependent on node 3 and node 4 is control dependent on node 2.
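The transitivity step can be sketched in a few lines. This is an illustration only: it encodes just the two dependence edges named in the text (the full graph of $p_{1.2}$ is not reproduced here), it computes the transitive closure without the reflexive pairs, and the function name `transitive_closure` is an assumption of this sketch.

```python
def transitive_closure(edges):
    """Return the transitive closure of a set of (n, m) pairs,
    where (n, m) means 'node n depends on node m'."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # a depends on b and b depends on d, so a depends on d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

data_dep = {(2, 3)}       # node 2 is data dependent on node 3
control_dep = {(4, 2)}    # node 4 is control dependent on node 2

traditional = transitive_closure(data_dep | control_dep)
node4_on_node3 = (4, 3) in traditional
```

The pair (4, 3) appears in the closure purely because of the graph structure, with no reference to what the expressions at those nodes compute.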
There is no program in the same dataflow equivalence class as $p_{1.2}$, however, where the statement at node 3 has an effect on node 4.

What can be said semantically about traditional dependence is that if node $n_1$ is not dependent on node $n_2$ in a control flow graph $g$ then in all programs whose control flow graph is $g$, the statement corresponding to $n_1$ will not be dependent on the statement corresponding to $n_2$. Clearly, traditional dependence is not the only relation between the nodes of a control flow graph that has this property, and what has been shown in the previous section is that it is not the smallest. The dependence relation that relates all nodes of a control flow graph to each other would also produce a valid slicing algorithm, namely one that deletes nothing.

It is not sufficient to consider nodes on their own; `execution instances' of nodes must be considered. A node may get executed many times during the execution of a program: the $i$th execution of node $n$ may be dependent on the $j$th execution of node $m$ but not on the $(j+1)$th.

This chapter introduces program dependencies which do have a clear semantic interpretation. These dependencies are either between variables of a program (VD and TVD) or between variables of a program and nodes of its control flow graph (LD and TLD).

For each of these dependencies $d$ there will be a `dataflow' version of $d$ of the following general form:

x dataflow $d$-depends on y in $p$ means there exists a program $q$ in the same dataflow equivalence class as $p$ such that x $d$-depends on y in $q$.

For each dataflow dependence relation $d$, if an algorithm exists for computing it, it is guaranteed, by definition, to be dataflow minimal.

¹ This could be made more precise, but there is no point, as the statement is false!

3.4 Assumptions about the Programming Language

Programs which are considered in this thesis are conventional imperative programs containing assignments, statement sequences, conditionals and while loops, as well as a skip statement which does nothing.

3.4.1 Syntax of Programs

$\rho$ ::= skip
    | FAIL
    | x := E
    | begin $\rho_1; \cdots; \rho_n$ end
    | if B then $\rho_0$ else $\rho_1$
    | while B do $\rho$


3.4.2 Types

There are no explicit type definitions in this programming language and it is further assumed that there are only two types of expression:

1. Boolean expressions, which occur as the guards of ifs and whiles.
2. All expressions which occur on the right hand side of assignments, which we assume to be of type integer.

3.4.3 The Variables Referenced by an Expression

The set of variables ref(e) referenced by e is the set of variables that syntactically occur in the expression e. Clearly, by considering the expression $x - x$, it can be seen that if an expression `depends upon' variable x then it references x, but not necessarily vice versa.

3.4.4 Assumptions about Expressions in Programs

Definition 3.4.1 (State)
A state is a finite mapping from variables to values.

For example, the function

$$\sigma = \{\, x \mapsto 5,\ y \mapsto 1,\ z \mapsto 2 \,\}$$

represents the state where variable x has the value 5, variable y has the value 1 and z the value 2. The reason that a state function is always finite is that a program can only assign values to a finite number of variables.

Definition 3.4.2 (The Domain of a State)
The domain of $\sigma$ is the finite set of variables for which $\sigma$ is defined.

For example, for the state $\sigma$ above, $\mathrm{dom}(\sigma) = \{x, y, z\}$.


Definition 3.4.3 (State Projection)
Given any state $\sigma$, $\mathrm{proj}\ \sigma$ is the set of all states whose domains contain the domain of $\sigma$ and which agree with $\sigma$ on every variable in $\mathrm{dom}(\sigma)$. $\mathrm{proj}\ \sigma$ defines an infinite set of states. We call such a set of states a projection.

For example,

$$\sigma_1 = \{\, w \mapsto 7,\ x \mapsto 5,\ y \mapsto 1,\ z \mapsto 2 \,\}$$

is such that $\sigma_1 \in \mathrm{proj}\ \sigma$.

Lemma 3.4.1 Let e be an expression with $\mathit{ref}(e) = S$. Let $\sigma$ be a state whose domain is $S$. Then for all states $\sigma'$ in $\mathrm{proj}(\sigma)$,

$$\mathcal{E}[\![e]\!]\sigma = \mathcal{E}[\![e]\!]\sigma'.$$

This result is intuitively obvious since if a variable is not mentioned in an expression, then it cannot have an effect on its value in any state.

We use $\sigma$ in place of $\mathrm{proj}\ \sigma$, and $\mathcal{E}[\![e]\!]\sigma = z$ means that for all $\sigma'$ in $\mathrm{proj}\ \sigma$, $\mathcal{E}[\![e]\!]\sigma' = z$.

Assumption 3.4.1 (Richness of Expressions) Given any finite set of states, $\sigma_1, \cdots, \sigma_n$, all with the same domain, $S$, and any set of values $v_1, \cdots, v_n$, there is an expression e with $\mathit{ref}(e) = S$ such that for all $i \in \{1, \cdots, n\}$,

$$\mathcal{E}[\![e]\!]\sigma_i = v_i.$$

This assumption is used extensively in later proofs. It says that the expression notation of our programming language is sufficiently powerful that, given a finite set of states $\{\sigma_i\}$ and the same number of values $\{v_i\}$, we can pick an expression e such that for each $i$, the expression e evaluated in state $\sigma_i$ yields the value $v_i$.
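One possible way to construct such an expression can be sketched as follows. The sketch assumes exact rational arithmetic and non-negative integer variable values; it first maps each state to a distinct integer by a prime-power encoding and then applies Lagrange interpolation to the encoded points. The names `encode`, `interpolate` and `expression_for`, and the two illustrative states and target values, are choices made for this sketch rather than notation from the thesis.

```python
from fractions import Fraction

PRIMES = [2, 3, 5, 7, 11, 13]  # enough for up to six variables in this sketch

def encode(state, variables):
    """Map a state to an integer: the product of p_i ** state[v_i].
    Distinct states over non-negative integers get distinct codes."""
    if len(variables) > len(PRIMES):
        raise ValueError("too many variables for this sketch")
    code = 1
    for prime, var in zip(PRIMES, variables):
        code *= prime ** state[var]
    return code

def interpolate(points):
    """Return a function f with f(a_i) = c_i for every (a_i, c_i) in points,
    built with Lagrange's interpolation formula over the rationals."""
    def f(a):
        total = Fraction(0)
        for i, (a_i, c_i) in enumerate(points):
            term = Fraction(c_i)
            for j, (a_j, _) in enumerate(points):
                if i != j:
                    term *= Fraction(a - a_j, a_i - a_j)
            total += term
        return total
    return f

def expression_for(states, values, variables):
    """Build a state-to-value function that yields values[i] in states[i]."""
    codes = [encode(s, variables) for s in states]
    if len(set(codes)) != len(codes):
        raise ValueError("states must be pairwise distinct")
    f = interpolate(list(zip(codes, values)))
    return lambda state: f(encode(state, variables))

s1 = {"x": 15, "y": 31, "z": 12}
s2 = {"x": 5, "y": 1, "z": 2}
e = expression_for([s1, s2], [79, 81], ["x", "y", "z"])
```

The resulting function references exactly the variables passed in, and its value in each listed state is the requested one; its value in any other state is whatever the interpolating polynomial happens to give.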


3.4.5 Example

Let

$$\sigma_1 = \{\, x \mapsto 15,\ y \mapsto 31,\ z \mapsto 12 \,\}$$

and

$$\sigma_2 = \{\, x \mapsto 5,\ y \mapsto 1,\ z \mapsto 2 \,\}.$$

Let us, for example, try to find an expression e such that $\mathcal{E}[\![e]\!]\sigma_1 = 79$ and $\mathcal{E}[\![e]\!]\sigma_2 = 81$.

It turns out that Assumption 3.4.1 will be true provided that the arithmetic operators of our language include addition, subtraction, multiplication and division. This result is a direct consequence of Lagrange's Interpolation Formula [16, 90]:

Let $F$ be a field, let $a_0, \cdots, a_n$ be any distinct elements of $F$ and let $c_0, \cdots, c_n$ be any given elements of $F$. There exists a polynomial $f(x)$ of degree $\leq n$ such that $f(a_0) = c_0, \cdots, f(a_n) = c_n$.

This result clearly generalises to the case where the $a_i$ are states rather than simple values, since all that is required is that each state $\sigma_i$ is first mapped to a unique simple value before Lagrange's interpolation formula is applied.

For example, if the states are $m$-tuples $x_1, \ldots, x_m$ then

$$\prod_{i=1}^{m} \pi_i^{x_i},$$

where $\pi_i$ is the $i$th prime number, is guaranteed to produce unique values for each distinct state.

Put another way: given any finite state-to-value function, there is an expression in our programming language that denotes that function.

Definition 3.4.4 ($\sigma$ and $\sigma'$ differ only on x)
Let $\sigma$ and $\sigma'$ be states with the same domain $S$ and let x be a variable in $S$. Then $\sigma$ and $\sigma'$ differ only on x if and only if $\sigma\, x \neq \sigma'\, x$ and $\sigma\, z = \sigma'\, z$ for all $z \neq x$ in $S$.

Definition 3.4.5 (Affects)
Let e be an expression and let x be a variable. x affects e if and only if there exist two states $\sigma$ and $\sigma'$ differing only on x such that $\mathcal{E}[\![e]\!]\sigma \neq \mathcal{E}[\![e]\!]\sigma'$.

Lemma 3.4.2 Let e be an expression. For all $x \notin \mathit{ref}(e)$, x does not affect e.
Proof: Obvious.

Lemma 3.4.3 Let $T$ be a set of variables. There is an expression e with $\mathit{ref}(e) = T$ such that for all $x \in \mathit{ref}(e)$, x affects e.
Proof: Follows immediately from Assumption 3.4.1.

3.5 The Variable Dependencies: VD and TVD

In this section, two dependence relations on the variables of programs, VD and TVD, are defined. Informally, they both have the property that x depends on y in p if the initial value of y affects the final value of x when executing program p. The difference between the two definitions arises from two different interpretations of the word `affects'.

- In the case of TVD, a variable y is considered to affect x if and only if different values of y can cause p to terminate with different values for x.
- Using the more general VD, on the other hand, y affects x if either
  - y affects x as just described, or
  - the initial value of y affects the termination of p.

Semantically, the difference between the two is that in VD $\bot$ is considered a value, whereas in TVD $\bot$ does not count.


3.5.1 Variable Dependence (VD)

Variable x is variable dependent on variable y in program p if and only if there exist two states, $\sigma_1$ and $\sigma_2$, differing only at y, such that either

- p terminates when started in state $\sigma_1$ and when started in state $\sigma_2$ with different final values for x,

or

- p terminates when started in state $\sigma_1$ or when started in state $\sigma_2$, but not both.

Formally:

Definition 3.5.1 (Variable Dependence (VD))
Variable x is variable dependent on variable y in program p if and only if there exist two states, $\sigma_1$ and $\sigma_2$, differing only at y, such that

$$\mathcal{M}[\![p]\!]\sigma_1\, x \neq \mathcal{M}[\![p]\!]\sigma_2\, x.$$

We write x VD y in p.

In variable dependence, as opposed to terminating variable dependence (Section 3.5.3), $\bot$ is considered to be a value just like any other.

3.5.2 Examples of Variable Dependence

Example 1: x VD x in skip
Example 2: x VD y in x:=y+1
Example 3: z VD z in x:=y+17
Example 4: x VD y in z:=y;x:=z
Example 5: x VD y in z:=y;a:=z;x:=a
Example 6: x VD y in if y=1 then x:=1 else x:=2
Example 7: $\neg$(x VD y in z:=y;x:=x+z-y)
Example 8: x VD x in while y<>0 do x:=x+1
Example 9: x VD y in while y<>0 do x:=x+1
Example 10: y VD y in while y<>0 do y:=y+1

Lemma 3.5.1 For all variables x and y, x VD y in skip $\iff$ x = y.
Proof: Obvious.

Lemma 3.5.2 For all terminating programs p, for all variables y not mentioned in p, y VD y in p.
Proof: Obvious.

3.5.3 Terminating Variable Dependence (TVD)

Consider the program $p_1$ in Figure 3.1 (page 93).

    while y<>0
    do x:=x-1

Figure 3.1: Program $p_1$

Clearly, the final value of the variable x is dependent on the initial value of y (and the initial value of x). There are, however, no initial states differing only at y such that $p_1$ terminates with different final values for x. In this sense the final value of x is not dependent on the initial value of y. This observation leads to a new version of variable dependence of programs which we call Terminating Variable Dependence.

Variable x is terminating variable dependent on variable y in program p if and only if there exist two states, $\sigma_1$ and $\sigma_2$, differing only at y, such that p terminates when started in state $\sigma_1$ and when started in state $\sigma_2$ with different final values for x. Formally:

Definition 3.5.2 (TVD)
Variable x is terminating variable dependent upon y in program p if and only if there exist two states $\sigma$ and $\sigma'$, differing only at variable y, such that

$$\bot \neq \mathcal{M}[\![p]\!]\sigma\, x \neq \mathcal{M}[\![p]\!]\sigma'\, x \neq \bot.$$

We write x TVD y in p.

For terminating variable dependence, therefore, $\bot$ `does not count' as a value.

3.5.4 Examples of Terminating Variable Dependence

Example 1: x TVD x in skip
Example 2: x TVD y in x:=y+1
Example 3: x TVD y in x:=y+17
Example 4: x TVD y in z:=y;x:=z
Example 5: x TVD y in z:=y;a:=z;x:=a
Example 6: x TVD y in if y=1 then x:=1 else x:=2
Example 7: $\neg$(x TVD y in z:=y;x:=x+z-y)
Example 8: x TVD x in while y<>0 do x:=x+1
Example 9: x TVD y in while y<>0 do x:=x+1
Example 10: $\neg$(y TVD y in while y<>0 do y:=y+1)

In the examples above, if VD and TVD are compared, it can be seen that a difference occurs only in while y<>0 do y:=y+1. Variable y variable depends on y since the initial value of y can determine whether the program terminates or not. In terminating variable dependence we are only interested in terminating programs, and in these variable y always ends up with the value zero. Using TVD, therefore, y is dependent on no variables at all.

Lemma 3.5.3 x TVD y in p $\implies$ x VD y in p.
Proof: Trivial.

The converse is clearly not true.
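The two definitions can be explored on small programs with a bounded search. The sketch below is not a decision procedure: it approximates $\bot$ by exhausting a loop-iteration budget and only tries states drawn from a small finite range, so a reported dependence under VD may be an artefact of too small a budget, and a reported non-dependence only means no witness was found. The tuple encoding of programs and the names `execute`, `final_value` and `depends` are assumptions of this sketch.

```python
from itertools import product

BOTTOM = None  # stands for the non-terminating result

class OutOfFuel(Exception):
    """Raised when the loop-iteration budget is exhausted."""

def execute(stmt, state, budget):
    """Run a statement; statements are tuples tagged by their first element."""
    kind = stmt[0]
    if kind == "skip":
        return state
    if kind == "assign":
        _, var, expr = stmt
        updated = dict(state)
        updated[var] = expr(state)
        return updated
    if kind == "seq":
        for sub in stmt[1]:
            state = execute(sub, state, budget)
        return state
    if kind == "if":
        _, guard, then_branch, else_branch = stmt
        return execute(then_branch if guard(state) else else_branch, state, budget)
    if kind == "while":
        _, guard, body = stmt
        while guard(state):
            budget[0] -= 1
            if budget[0] < 0:
                raise OutOfFuel
            state = execute(body, state, budget)
        return state
    raise ValueError("unknown statement: %r" % (kind,))

def final_value(prog, state, var, fuel=100):
    """Final value of var, or BOTTOM if the budget runs out."""
    try:
        return execute(prog, state, [fuel])[var]
    except OutOfFuel:
        return BOTTOM

def depends(prog, x, y, variables, values, terminating):
    """Search for two states differing only at y that give x different
    final values; with terminating=True, runs ending in BOTTOM are ignored."""
    values = list(values)
    for combo in product(values, repeat=len(variables)):
        s1 = dict(zip(variables, combo))
        for v in values:
            if v == s1[y]:
                continue
            s2 = dict(s1, **{y: v})
            a = final_value(prog, s1, x)
            b = final_value(prog, s2, x)
            if terminating and (a is BOTTOM or b is BOTTOM):
                continue
            if a != b:
                return True
    return False

# while y<>0 do y:=y+1
loop = ("while", lambda s: s["y"] != 0, ("assign", "y", lambda s: s["y"] + 1))
# z:=y; x:=x+z-y
straight = ("seq", [("assign", "z", lambda s: s["y"]),
                    ("assign", "x", lambda s: s["x"] + s["z"] - s["y"])])

vd_loop = depends(loop, "y", "y", ["y"], range(-3, 4), terminating=False)
tvd_loop = depends(loop, "y", "y", ["y"], range(-3, 4), terminating=True)
vd_straight = depends(straight, "x", "y", ["x", "y", "z"], range(-2, 3), terminating=False)
```

On these two programs the search agrees with Example 10 and Example 7 above: a witness is found for y VD y in the loop, none for y TVD y, and none for x VD y in the straight-line program.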


3.6 The Undecidability of VD and TVD

Lemma 3.6.1 VD is undecidable.
Proof: If an algorithm could be written to decide whether x VD y in program p then we could use it to solve the halting problem [78] as follows. In order to decide whether program p halts, construct the program q given by:

    p;
    y:=z;

where y and z are variables not occurring in p. Then p halts if and only if y VD z in q.

Similarly,

Lemma 3.6.2 TVD is undecidable.
Proof: Program q also has the property that p halts if and only if y TVD z in q.

3.7 Dataflow Dependence

In this section, since dataflow analysis is our concern, the definitions of variable dependence and terminating variable dependence are recast in terms of control flow graphs rather than programs.

Clearly, if we know only the variables referenced by each expression in a program we cannot know the variable dependence of the original program, since crucial information has been abstracted away.

    x:=y;
    x:=x-y

Figure 3.2: Program $p_2$

    x:=y;
    x:=x-2*y

Figure 3.3: Program $p_3$

In program $p_2$ in Figure 3.2 (page 95) the final value of x is not dependent on the initial value of y. In program $p_3$ in Figure 3.3 (page 96), however, the final value of x clearly is dependent on the initial value of y. This is an example of two programs with the same control flow graph but with different variable dependence.

As described in Section 3.1, the programs $p_2$ and $p_3$ in Figure 3.2 (page 95) and Figure 3.3 (page 96), although different, can be considered to be in the same equivalence class, since they have the same control flow graph, i.e. $p_3$ can be obtained from $p_2$ by replacing expressions by other expressions which reference the same sets of variables. The programs $p_2$ and $p_3$ are an example of dataflow equivalent programs. We write $p_2 \cong p_3$ (see Definition 3.7.1 (page 97)).

Since, as has just been shown, it is not possible to write an algorithm for checking variable dependence, `cruder' problems will now be investigated. If we group programs together into equivalence classes, maybe there exist algorithms which can answer the question:

Is there any program q `equivalent' to p such that x depends on y in q?

If each program is put in its own equivalence class then, by Lemma 3.6.1 (page 95) and Lemma 3.6.2 (page 95), the problem is undecidable. Clearly, if we put all programs into the same equivalence class, then the problem is trivial. A main concern of this thesis is to investigate whether these problems are solvable when the equivalence is dataflow equivalence and, if so, to produce algorithms for solving them.

3.7.1 Dataflow Equivalence

In this section, the notion of dataflow equivalence is formalised. Rules for dataflow equivalence corresponding to each syntactic category of program are given. Programs are dataflow equivalent if they have the same control flow graph. Since we are just considering structured programs², it can be said that dataflow equivalent programs have identical structure up to expressions and `corresponding' expressions reference the same sets of variables. Also, the variables occurring on the left hand sides of corresponding assignment statements must be the same.

² gotos are not allowed.

    while x < y
    do if x < y
       then x:=x + y
       else y:=y+1

    while 3*x+2*y=17
    do if x>y-y
       then x:=x+1+y
       else y:=y+1

    while x+x+y=y+1
    do if x=y+x-y
       then x:=x-x+y-y
       else y:=y

Figure 3.4: Three Dataflow Equivalent Programs

Definition 3.7.1 (Dataflow Equivalence)

$$\mathtt{skip} \cong \mathtt{skip}$$

$$\frac{\mathit{ref}(e) = \mathit{ref}(e')}{v := e \;\cong\; v := e'}$$

$$\frac{s_1 \cong s_1', \;\cdots,\; s_n \cong s_n'}{\mathtt{begin}\ s_1; \cdots; s_n\ \mathtt{end} \;\cong\; \mathtt{begin}\ s_1'; \cdots; s_n'\ \mathtt{end}}$$

$$\frac{\mathit{ref}(e) = \mathit{ref}(e') \qquad s_1 \cong s_1' \qquad s_2 \cong s_2'}{\mathtt{if}\ e\ \mathtt{then}\ s_1\ \mathtt{else}\ s_2 \;\cong\; \mathtt{if}\ e'\ \mathtt{then}\ s_1'\ \mathtt{else}\ s_2'}$$

$$\frac{\mathit{ref}(e) = \mathit{ref}(e') \qquad s \cong s'}{\mathtt{while}\ e\ \mathtt{do}\ s \;\cong\; \mathtt{while}\ e'\ \mathtt{do}\ s'}$$

3.7.2 Example of Dataflow Equivalence

In Figure 3.4 (page 97) three dataflow equivalent programs are given. They all have the control flow graph given in Figure 3.5 (page 98).
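The rules of Definition 3.7.1 translate directly into a structural recursion. The following is a minimal sketch, assuming that programs are encoded as tagged tuples and that expressions are written as Python-syntax strings (so `=` becomes `==`); the variables of an expression are collected with Python's standard `ast` module. The encoding and the names `ref` and `equivalent` are assumptions of this sketch.

```python
import ast

def ref(expr):
    """Set of variable names occurring syntactically in a Python-syntax expression."""
    tree = ast.parse(expr, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

def equivalent(p, q):
    """Structural check following the rules of Definition 3.7.1."""
    if p[0] != q[0]:
        return False
    kind = p[0]
    if kind == "skip":
        return True
    if kind == "assign":      # same target variable, same referenced variables
        return p[1] == q[1] and ref(p[2]) == ref(q[2])
    if kind == "seq":
        return len(p[1]) == len(q[1]) and all(
            equivalent(a, b) for a, b in zip(p[1], q[1]))
    if kind == "if":
        return (ref(p[1]) == ref(q[1])
                and equivalent(p[2], q[2]) and equivalent(p[3], q[3]))
    if kind == "while":
        return ref(p[1]) == ref(q[1]) and equivalent(p[2], q[2])
    raise ValueError("unknown statement: %r" % (kind,))

# The three programs of Figure 3.4, transcribed into the tuple encoding.
prog_a = ("while", "x < y",
          ("if", "x < y", ("assign", "x", "x + y"), ("assign", "y", "y + 1")))
prog_b = ("while", "3*x + 2*y == 17",
          ("if", "x > y - y", ("assign", "x", "x + 1 + y"), ("assign", "y", "y + 1")))
prog_c = ("while", "x + x + y == y + 1",
          ("if", "x == y + x - y", ("assign", "x", "x - x + y - y"), ("assign", "y", "y")))

all_equivalent = equivalent(prog_a, prog_b) and equivalent(prog_b, prog_c)
not_equivalent = equivalent(("assign", "x", "y"), ("assign", "x", "5"))
```

Because the check looks only at structure, assignment targets and referenced variable sets, it treats `x - x + y - y` and `x + y` as interchangeable, which is exactly the information loss discussed above.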


[Figure 3.5: Control Flow Graph. Nodes: ENTRY; 1 (def={}, ref={x,y}); 2 (def={}, ref={x,y}); 3 (def={x}, ref={x,y}); 4 (def={y}, ref={y}); EXIT.]

3.8 The Dataflow Variable Dependencies: DVD and DTVD

The two versions of variable dependence, VD and TVD (Definition 3.5.1 (page 92) and Definition 3.5.2 (page 93)), each have dataflow versions, DVD and DTVD, which are now defined.

3.8.1 Dataflow Variable Dependence (DVD)

Variable x is dataflow variable dependent upon y in program p if and only if there exists a program q with the same control flow graph as p such that x VD y in q. Formally:

Definition 3.8.1 (DVD)
Variable x is dataflow variable dependent upon y in program p if and only if there exists $q \cong p$ such that x VD y in q.
We write x DVD y in p.

3.8.2 Examples of Dataflow Variable Dependence

Example 1: x DVD x in skip
Example 2: x DVD y in x:=y+1
Example 3: z DVD z in z:=z-z
Example 4: x DVD y in z:=y;x:=z
Example 5: x DVD y in z:=y;a:=z;x:=a
Example 6: x DVD y in if y=1 then x:=1 else x:=1
Example 7: x DVD y in z:=y;x:=x+z-y
Example 8: x DVD x in while y<>0 do x:=x+1
Example 9: x DVD y in while y<>0 do x:=x+1
Example 10: y DVD y in while y<>0 do y:=y+1

3.8.3 Dataflow Terminating Variable Dependence (DTVD)

Variable x is dataflow terminating variable dependent upon y in program p if and only if there exists a program q with the same control flow graph as p such that x TVD y in q. Formally:

Definition 3.8.2 (DTVD)
Variable x is dataflow terminating variable dependent upon y in program p if and only if there exists $q \cong p$ such that x TVD y in q.
We write x DTVD y in p.

Examples of Dataflow Terminating Variable Dependence are given in the next section.

3.8.4 A Taxonomy of Variable Dependence

So far, four dependencies have been introduced:

- VD together with its dataflow counterpart, DVD, and
- TVD together with its dataflow counterpart, DTVD.

These four different variable dependencies can be categorised as follows:

Page 100: Data o - UCL Computer Sciencew Graphs .. 36 2.2.2 Data o w Analysis. 36 2.2.3 Inheren t Inaccuracies in Data o w Analysis. 37 2.2.4 T ... A Comparison of Data o w Lab el Dep endence

VD   = (non-terminating, normal)
TVD  = (terminating, normal)
DVD  = (non-terminating, dataflow)
DTVD = (terminating, dataflow)

The definition of each can be tabulated as follows:

x VD y in p      $\exists\, \sigma$ and $\sigma'$ differing only at y such that $\mathcal{M}[\![p]\!]\sigma\, x \neq \mathcal{M}[\![p]\!]\sigma'\, x$
x TVD y in p     $\exists\, \sigma$ and $\sigma'$ differing only at y such that $\bot \neq \mathcal{M}[\![p]\!]\sigma\, x \neq \mathcal{M}[\![p]\!]\sigma'\, x \neq \bot$
x DVD y in p     $\exists\, q \cong p$ such that x VD y in q
x DTVD y in p    $\exists\, q \cong p$ such that x TVD y in q

Figure 3.6: The Four Variations of Variable Dependence

For each pair of definitions, we give an example program where the respective dependencies are different.

1. VD $\neq$ DVD.
In program $p_2$, Figure 3.2 (page 95), $\neg$(x VD y) but x DVD y.

2. VD $\neq$ TVD.
Consider program $p_{10}$ in Figure 3.7 (page 101). If we start in a state where y is negative, then the value for x in the final state is $\bot$. However, if we start in a state where y is non-negative, then the final value of x is zero, so x VD y. But for all states where the program terminates the final value of x is zero, independent of the value of y, so $\neg$(x TVD y).

    while y<>0
    do y:=y-1;
    x:=y

Figure 3.7: Program $p_{10}$

3. TVD $\neq$ DTVD.
Again, consider program $p_{10}$ in Figure 3.7 (page 101). Consider also the dataflow equivalent program $p_9$ in Figure 3.8 (page 101).

    while y>0
    do y:=y-1;
    x:=y

Figure 3.8: Program $p_9$

In program $p_9$, x TVD y, so by definition, since $p_9$ and $p_{10}$ are dataflow equivalent, we have x DTVD y in $p_{10}$. As we showed earlier, in $p_{10}$, $\neg$(x TVD y).

4. VD $\neq$ DTVD.
Consider again program $p_2$ in Figure 3.2 (page 95). $\neg$(x VD y) but x DTVD y.

5. TVD $\neq$ DVD.
Consider again program $p_2$ in Figure 3.2 (page 95). $\neg$(x TVD y) but x DVD y.

6. DVD $\neq$ DTVD.
Consider program $p_{12}$ in Figure 3.9 (page 102). $\neg$(x DTVD y) since for all programs q dataflow equivalent to $p_{12}$, if q terminates, the final value of x will not depend on the value of y in the initial state. On the other hand, x DVD y since the initial value of y affects termination of $p_{12}$.


    while y>0
    do y:=y-1;
    x:=7

Figure 3.9: Program $p_{12}$

3.9 The Label Dependencies: LD and TLD

A form of dependence more closely related to slicing than variable dependence is now introduced. First, programs are labelled³ with names corresponding to the node identifiers in their control flow graph. Informally, a variable x is label dependent on label l if and only if changing the expression at label l to another one that references the same set of variables can affect the final value of x. This is similar to the idea of contamination of expressions [91]. The `slice' on x produced by label dependence will be the set of all the labels whose expressions can affect x in this way. It is a closure slice [91] as it is a collection of labels that do not necessarily make up a complete program. As in the case of variable dependence, there are two variants: terminating and non-terminating. Again, these arise from the two different interpretations of the word `affects'. (See Section 3.5, page 91 for an explanation of this issue.)

3.9.1 Label Dependence (LD)

Definition 3.9.1 (LD)
Variable x is label dependent on label l in p if and only if there exists a program $p'$ dataflow equivalent to p, differing from p only at label l, and a state, $\sigma$, such that

$$\mathcal{M}[\![p]\!]\sigma\, x \neq \mathcal{M}[\![p']\!]\sigma\, x.$$

We write x LD l in p.

For example, consider the program in Figure 3.10 (page 104). The program 1: x:=y+1 is dataflow equivalent to it and differs only at the expression labelled 1. Clearly the final values of x will be different for the two programs starting in any state where y is defined.

³ We do not think of these labels as part of the language, but as comments.


3.9.2 Terminating Label Dependence (TLD)

Definition 3.9.2 (TLD)
Variable x is terminating label dependent on label l in p if and only if there exists a program $p'$ dataflow equivalent to p, differing from p only at label l, and a state, $\sigma$, such that

$$\bot \neq \mathcal{M}[\![p]\!]\sigma\, x \neq \mathcal{M}[\![p']\!]\sigma\, x \neq \bot.$$

We write x TLD l in p.

As in the case of variable dependence, we have:

Lemma 3.9.1 x TLD l in p $\implies$ x LD l in p.
Proof: Trivial.

In Figure 3.18 (page 106) and Figure 3.19 (page 106) we have examples of two programs with the same LD but different TLD. The only way the program in Figure 3.19 (page 106) can terminate is with x = 0. The expression at label 2 cannot affect this. It can only affect the termination conditions of the program.

The dataflow variants of these dependencies are produced in exactly the same way as the dataflow variants of variable dependence. Variable x is dataflow label dependent on label l in p means there exist two programs q and $q'$, both dataflow equivalent to p, which differ only at l, such that there exists a state $\sigma$ where q and $q'$ `behave differently' with respect to x when started in state $\sigma$.

Definition 3.9.3 (DLD)
Variable x is dataflow label dependent upon l in program p if and only if there exists $q \cong p$ such that x LD l in q.
We write x DLD l in p.

Definition 3.9.4 (DTLD)
Variable x is dataflow terminating label dependent upon l in program p if and only if there exists $q \cong p$ such that x TLD l in q.
We write x DTLD l in p.
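For straight-line programs, which always terminate so that LD and TLD coincide, the definition of label dependence can be probed with a very small sketch. It only tests one candidate replacement expression at a time over a handful of states, so it can confirm a dependence by exhibiting a witness but cannot establish non-dependence. The caller is assumed to supply a replacement that references the same variables as the original expression. The list encoding and the names `run` and `differs_at` are assumptions of this sketch.

```python
def run(program, state):
    """Execute labelled assignments (label, variable, expression) in order."""
    state = dict(state)
    for _label, var, expr in program:
        state[var] = expr(state)
    return state

def differs_at(program, label, new_expr, x, states):
    """True if replacing the expression at `label` by `new_expr` changes
    the final value of x in at least one of the given states."""
    variant = [(l, v, new_expr if l == label else e) for l, v, e in program]
    return any(run(program, s)[x] != run(variant, s)[x] for s in states)

# The two-statement program  1: z:=y;  2: x:=y+4
labelled = [(1, "z", lambda s: s["y"]),
            (2, "x", lambda s: s["y"] + 4)]
states = [{"x": 0, "y": y, "z": 0} for y in range(-2, 3)]

# Changing label 2 to x:=y+5 changes the final x: a witness for x LD 2.
witness_label_2 = differs_at(labelled, 2, lambda s: s["y"] + 5, "x", states)
# Changing label 1 to z:=y*7 leaves the final x unchanged in these states.
witness_label_1 = differs_at(labelled, 1, lambda s: s["y"] * 7, "x", states)
```

Handling loops would need a termination budget and a distinction between LD and TLD, which this sketch deliberately leaves out.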


    1: x:=y

Figure 3.10: x VD {y} and x LD {1}

    1: x:=5

Figure 3.11: x VD {} and x LD {1}

3.9.3 Examples

Figures 3.10 to 3.20 show examples comparing the various dependencies. The interesting examples involve loops. Notice the subtle difference between Figure 3.18 (page 106) and Figure 3.19 (page 106). In Figure 3.18, if we start in a state where y is negative then the program will terminate with x having the initial value of y. In Figure 3.19, however, in all states where the program terminates x will have the value 0. The initial value of y only affects termination and not the final value of x according to Definition 3.5.1. In Figure 3.19, therefore, the VD and the TVD are different, since this value is independent of the initial value of y. Similarly, the assignment y:=y-1 has no effect on the final value of x. Even if this had said y:=y+79, the only final value of x could be zero. The same thing happens in the program in Figure 3.20 (page 107). Label 6 has no effect on the final value of x in terminating programs but it can affect the termination of the program. It therefore occurs in LD but not in TLD.

    1: x:=z

Figure 3.12: x VD {z} and x LD {1}


    1: z:=y;
    2: x:=y+4

Figure 3.13: x VD {y} and x LD {2}

    1: if y=2
    2: then x:=25
    3: else x:=25

Figure 3.14: x VD {} and x LD {2, 3}

    1: if y=2
    2: then x:=17

Figure 3.15: x VD {x, y} and x LD {1, 2}

    1: z:=y;
    2: if z=2
    3: then x:=17

Figure 3.16: x VD {x, y} and x LD {1, 2, 3}


    0: x:=0;
    1: while y>0
       do
       begin
    2:   x:=x+1;
    3:   y:=y-1
       end

Figure 3.17: x VD {y} and x TVD {y} and x LD {0, 1, 2, 3} and x TLD {0, 1, 2, 3}

    1: while y>0
    2: do y:=y-1;
    3: x:=y

Figure 3.18: x VD {y} and x TVD {y} and x LD {1, 2, 3} and x TLD {1, 2, 3}

    1: while y <> 0
    2: do y:=y-1;
    3: x:=y

Figure 3.19: x VD {y} and x TVD {} and x LD {1, 2, 3} and x TLD {1, 3}


    2: while i<3
       do
       begin
    3:   if c=2
         then
         begin
    4:     c:=y;
    5:     x:=25
         end
    6:   i:=i+1
       end

Figure 3.20: x VD {x, c, i} and x TVD {x, c, i} and x LD {2, 3, 5, 6} and x TLD {2, 3, 5}

3.9.4 A Taxonomy of Label Dependence

So far, four dependencies have been introduced:

- LD together with its dataflow counterpart, DLD, and
- TLD together with its dataflow counterpart, DTLD.

These four different label dependencies can be categorised as follows:

LD   = (non-terminating, normal)
TLD  = (terminating, normal)
DLD  = (non-terminating, dataflow)
DTLD = (terminating, dataflow)


x LD l in p: there exist p′, dataflow equivalent to p and differing from p only at l, and a state σ such that
$$\mathcal{M}[\![p]\!]\,\sigma\,x \neq \mathcal{M}[\![p']\!]\,\sigma\,x$$

x TLD l in p: there exist p′, dataflow equivalent to p and differing from p only at l, and a state σ such that
$$\bot \neq \mathcal{M}[\![p]\!]\,\sigma\,x \neq \mathcal{M}[\![p']\!]\,\sigma\,x \neq \bot$$

x DLD l in p: there exists q, dataflow equivalent to p, such that x LD l in q

x DTLD l in p: there exists q, dataflow equivalent to p, such that x TLD l in q

Figure 3.21: The Four Variations of Label Dependence

3.10 The Undecidability of LD and TLD

Lemma 3.10.1 LD is undecidable.

Proof: As in the case of variable dependence, if an algorithm could be written to decide whether x LD l in program p then we could use it to solve the halting problem [78] as follows. In order to decide whether program p halts, construct the program q given by:

p;
l: y:=5;

Then p halts if and only if y LD l in q.

Similarly,

Lemma 3.10.2 TLD is undecidable.

Proof: Program q also has the property that p halts if and only if y TLD l in q.

3.11 Schemas

Before our new dependencies are further investigated, notation for representing dataflow equivalence classes of programs is introduced (this notation is very similar to that used in schemes [44]). The only difference between a program and a schema is that where the former has expressions, the latter has labelled sets of variables. We call these labelled sets of variables Symbolic Expressions.

3.11.1 Syntax of Schemas

$$s ::= \mathtt{skip} \mid x := L(\mathcal{P}(V)) \mid \mathtt{begin}\ s_1; \cdots; s_n\ \mathtt{end} \mid \mathtt{if}\ L(\mathcal{P}(V))\ \mathtt{then}\ s_0\ \mathtt{else}\ s_1 \mid \mathtt{while}\ L(\mathcal{P}(V))\ \mathtt{do}\ s \mid \mathtt{FAIL}$$

A schema is a notation that can be used for representing dataflow equivalence classes of programs. The use of FAIL is explained later. For example, the simple schema s3.22 in Figure 3.22 represents the dataflow equivalence class containing the program in Figure 3.20.

while f1(i) do
begin
  if f2(c) then
  begin
    c := f3(y);
    x := f4()
  end;
  i := f5(i)
end

Figure 3.22: s3.22, the schema corresponding to the program in Figure 3.20.
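One way to see what it means for the program of Figure 3.20 to belong to the class represented by s3.22 is to check that each expression references exactly the variables listed in the corresponding symbolic expression. The sketch below is illustrative only: it assumes the control structure already matches, writes the program's expressions in Python syntax (so `c=2` becomes `c == 2`), and uses hypothetical names (`refs`, `is_interpretation`, the two dictionaries) that do not come from the thesis.

```python
import ast

def refs(expr):
    """Set of variable names referenced by an expression (Python syntax)."""
    tree = ast.parse(expr, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

# Schema s3.22: each label is paired with its set of referenced variables.
SCHEMA_3_22 = {"f1": {"i"}, "f2": {"c"}, "f3": {"y"}, "f4": set(), "f5": {"i"}}

# The expressions of the program in Figure 3.20, keyed by the schema label
# that occupies the same position.
PROGRAM_3_20 = {"f1": "i < 3", "f2": "c == 2", "f3": "y", "f4": "25", "f5": "i + 1"}

def is_interpretation(program, schema):
    """True when every expression references exactly the schema's variables."""
    if program.keys() != schema.keys():
        return False
    return all(refs(program[label]) == schema[label] for label in schema)

ok = is_interpretation(PROGRAM_3_20, SCHEMA_3_22)

# Replacing the constant 25 by an expression that mentions y leaves the class.
other = dict(PROGRAM_3_20, f4="y + 1")
not_ok = is_interpretation(other, SCHEMA_3_22)
```

Only the referenced-variable sets are compared, which mirrors the point that dataflow analysis knows nothing else about an expression.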


3.11.2 Uniqueness of Labels

In the theory of schemes [44] the same symbolic expression can occur at different places. Different occurrences of the same term correspond to programs with the same expression occurring in different places. Although, in practice, the same expression can occur in different places in a program, taking advantage of it is not allowable in dataflow analysis, where all that is known about each expression is its set of referenced variables. The fact that two expressions are the same is invisible to an agent that performs dataflow analysis. In performing dataflow analysis, therefore, we imagine all programs have first been translated into schemas where there is no repetition of labels. As will be seen in Chapter 7, however, as a result of unfolding schemas, repetitions of the same labels do need to be considered.

The theory introduced in this thesis does allow schemas with repetitions of labels. Unlike schemas with no repetition of labels, schemas with repeated labels do not represent dataflow equivalence classes of programs, since allowing repetition means that the classes of programs represented by schemas are not disjoint. For example, the program x:=x+y; x:=x+y is represented both by the schema x:=f(x,y); x:=f(x,y), which represents all programs consisting of two assignments to variable x of the same expression referencing x and y, and also by the schema x:=f(x,y); x:=g(x,y), which represents all programs consisting of two assignments to variable x of any expression referencing x and y. (Here the expression may or may not be the same in the two assignments.)

Definition 3.11.1 (The class of Programs represented by a Schema)
We write [s] for the class of programs represented by s.

Lemma 3.11.1 If each label of a schema s occurs only once in s, then [s] is a dataflow equivalence class.

Proof: Follows from the definition of dataflow equivalence, Definition 3.7.1.

3.11.3 Interpretations

Every expression in a conventional programming language is a syntactic representation of an expression function, $\mathcal{E}[\![E]\!]$, which is a function from states to values [88]. In a schema, a symbolic expression is a representation of the set of all possible expression functions that could occur in its place in any dataflow equivalent program. The label represents the name of the function, and the set of variables represents those that are referenced in the corresponding expression in the program.

Given a schema s, borrowing the terminology of Greibach [44], a program p in the class represented by s is called an interpretation of s. We can think of a program, therefore, as a schema s together with a function f, say, from the labels of s to the expressions of p, such that for all symbolic expressions l(V) of s, ref(f(l)) = V.

We will sometimes abuse notation and associate a program p in [s] with such a function and write 'p(l)' for the expression of p that corresponds to the symbolic expression labelled l in s. For example, the function which gives the correspondence between schema s3.22 and the program in Figure 3.20 is given by:

p(f1) = (i < 3)
p(f2) = (c = 2)
p(f3) = y
p(f4) = 25
p(f5) = i + 1

As in the case of the theory of schemes [44], great care has to be taken to properly define the domain of interpretation, that is, the set of allowable values for expressions. This choice, as in the case of the theory of schemes, can greatly affect fundamental properties of schemas. For our theory and our algorithms to be correct it is important that the domains are infinite; without loss of generality, we can therefore assume that the expressions on the right hand side of assignments are of type integer.

Since this thesis is about dataflow dependence, from now on, schemas rather than programs will be used as the objects to which dependence relations are applied.

3.12 Redefining our Dataflow Dependencies in terms of Schemas

In this section our four different dataflow dependence relations are recast in terms of schemas. The four dataflow dependencies are:


x DVD y in s ⟺ ∃ p ∈ [s] such that x VD y in p.
x DTVD y in s ⟺ ∃ p ∈ [s] such that x TVD y in p.
x DLD l in s ⟺ ∃ p ∈ [s] such that x LD l in p.
x DTLD l in s ⟺ ∃ p ∈ [s] such that x TLD l in p.

3.13 A Comparison of Dataflow Label Dependence with Slicing

In this section it is informally justified that DTLD is the dataflow minimal version of Venkatesh's static backward closure slice [91], and it is also claimed that DLD is the dataflow minimal version of slicing that preserves the projection, onto the variable of interest, of standard semantics (rather than lazy semantics, as in the case of Weiser's algorithm [92] and the PDG approach [82]). (Kamkar [66] introduces this slice semantics in her thesis and so we refer to slices satisfying this slice relation as 'Kamkar Slices'.)

It is claimed therefore that the slices produced by DLD are always executable and always behave identically with respect to the variable of interest. Not only is the value of the variable of interest preserved by the slice, but also the original program and the slice will always agree in terms of termination.

3.13.1 The Slices produced by DTLD

Venkatesh [91] states:

'Intuitively, a statement belongs to a slice if and only if the statement's computation contributes to the value of the variable of interest. Suppose we could selectively contaminate the computation of a statement in such a way that the contamination propagated to all variables that depended on that statement. Then we could use this as a test to check whether a statement is necessary to compute the value of a specified variable. On the other hand, a statement does not belong to a slice if its contamination does not affect the variable of interest. Contamination of the computation of such a statement must not contaminate the specified variable'


The idea of selective contamination corresponds exactly to the idea in label dependence, where the behaviour of two programs differing only at a single label is compared. In DTLD a statement (expression) is included if replacing it with a different one has an effect on the variables of interest.

There are two major differences between the slices produced by DTLD and those defined by Weiser.

The first is that DTLD does not include statements whose only effect is on the termination conditions of a loop. This means that sometimes the slice produced by DTLD will not terminate in states where the original program terminated. In Figure 6.19, for example, the DTLD does not include f5.

The second (which was only recently noticed; the example is thanks to John Howroyd) is that we can have a schema s such that ¬(z DTVD x) and ¬(z DTVD y) but there exist two states differing only on x and y with different non-⊥ final values for z. An example is

while b1(q) do
  while b2(p) do
  begin
    if b3(x)
      then z := f4()
      else p := f5();
    if b6(y)
      then begin
        x := f7();
        y := f8()
      end
      else q := f9()
  end

This schema gives rise to the following 'truth table':


b1(q) | b2(p) | b3(x) | b6(y) | Final value for z
T | T | T | T | f4() or ⊥
T | T | T | F | ⊥ (inner loop fails)
T | T | F | T | ⊥ (outer loop fails)
T | T | F | F | z or ⊥
T | F | - | - | ⊥ (outer loop fails)
F | - | - | - | z

The values in both column 3 and column 4 must be different to get a different non-⊥ value for z.

This means that, surprisingly, the initial values of a set of variables can jointly contribute to the final value of a variable even if they do not contribute individually. The problem is that individually, none of the predicates can affect the final value of z, but jointly they can. In terms of label dependence, 'DTLD slicing' on z, unlike Weiser slices, would not include statements that jointly affect z in this way.

DTLD of the above schema with respect to z gives the set of labels {b1, f4} (see Appendix A, third example).

This cannot happen for DVD and DLD, since ⊥ is considered a proper value in their case.

Proof: Suppose ¬(x DVD y) and ¬(x DVD z) but there exist two states, σ1 and σ3, say, differing only on y and z, with different final values for x. Suppose in σ1, (y, z) = (a1, b1) and in σ3, (y, z) = (a2, b2), where a1 ≠ a2 and b1 ≠ b2 and the values of all other variables agree. Let σ2 be a state such that y = a2 and every other variable has the same value in σ2 as in σ1. Then, since ¬(x DVD y), σ1 and σ2 will result in the same final value of x.

Similarly, since ¬(x DVD z), σ2 and σ3 will result in the same final value for x, because σ2 and σ3 differ only at z. By transitivity, therefore, σ1 and σ3 will produce the same final values for x, which is a contradiction.

3.13.2 The Slices produced by DLD

We claim that the slice produced by DLD is the dataflow minimal version of slicing that preserves the projection, onto the variable of interest, of standard semantics, i.e. the slices produced by DLD are always executable and always 'behave the same' with respect to the variable of interest: not only is the value of the variable of interest preserved by the slice, but also the original program and the slice will always 'agree' in terms of termination.

This claim arises from the fact that by Definition 3.9.4, l ∈ DLD x if and only if there exist two programs in [s] differing only at l whose standard semantics, $\mathcal{M}$, projected onto x is different.

We further demonstrate the plausibility of our claim with an example. Consider the schema in Figure 3.23:

while f1(i) do
begin
  y := f2(y);
  i := f3(i)
end;
z := f4()

Figure 3.23: Example

A Weiser slice with respect to z would give z := f4(), as would DTLD, whereas a slice that preserves the projection onto variable z of the standard semantics, and DLD, would give the schema in Figure 3.24.

while f1(i) do
begin
  i := f3(i)
end;
z := f4()

Figure 3.24: DLD Slice

The reason for this is that there are programs in the class of the schema in Figure 3.23 which do not terminate. These programs fail to terminate if and only if the 'corresponding programs' in the class of the schema in Figure 3.24 fail to terminate. If we remove any more statements from the schema in Figure 3.24, the corresponding programs will behave differently either with respect to the final value of z or with respect to their termination properties.

What we are claiming is now formally stated:

Claim 3.13.1 Given any schema s and variable x, the set of labels given by

{l | x DLD l in s}

makes up a syntactically valid schema s′ and, what is more, for all p ∈ [s] the corresponding p′ ∈ [s′] is such that for all states σ,
$$\mathcal{M}[\![p]\!]\,\sigma\,x = \mathcal{M}[\![p']\!]\,\sigma\,x.$$

It seems intuitively 'correct' that if a label in DLD x, that is, a label that can have an effect on x on its own, is left out of a 'potential slice', then there will be a program corresponding to the resulting schema which does not behave the same as the corresponding original program, either with respect to x or with respect to termination. The proof of this result is left for future work (see Chapter 9).

These slices are larger than those defined by Weiser because Weiser's were not required to behave the same as the original when the original failed to terminate. In general, using this approach, the 'skeletons' of all loops of the program being sliced will be included.

3.14 The Dataflow Minimality of Algorithms for DTVD etc.

For each dataflow dependency introduced in this chapter, if an algorithm exists for computing it, that algorithm will be dataflow minimal. We prove, for example, that any algorithm that produces DTVD is dataflow minimal. The proof for each of the others is identical.

3.14.1 Example: DTVD

In an algorithm A for DTVD, the input is a schema s (which is a representation of a control flow graph, so A is a dataflow algorithm) and the output is a binary relation d on variables. The ordering is ⊆ and the slice relation R is given by


$$R(s, d) \iff d = \{(x, y) \mid \exists\, p \in [s] \text{ such that } x \;\mathrm{TVD}\; y \text{ in } p\}$$

Proof: Trivial, since
$$\mathrm{DTVD}(s) = \{(x, y) \mid \exists\, p \in [s] \text{ such that } x \;\mathrm{TVD}\; y \text{ in } p\}.$$

For any schema s there is, trivially, only one set d satisfying the property R(s, d) so, trivially, any algorithm for producing it must be minimal.

3.15 Conclusion

• In this chapter, the dataflow minimality problem has been formally defined.

• We have defined four dataflow dependence relations, all acting on a control flow graph or schema. Given a control flow graph g, these dependencies, called DVD, DTVD, DLD and DTLD, have all been defined in terms of the existence of a program p whose control flow graph is g with desired properties. These desired properties have been defined in terms of the standard semantics of p.

• These dependencies, two of which are a form of slicing, are such that if algorithms for computing them exist, then these algorithms must be dataflow minimal.

In the next chapter, we introduce a structure, the Symbolic Execution Tree, which is suitable for expressing these dependencies.



Chapter 4

The Semantics, $\mathcal{S}$, of Loop-free Schemas

4.1 Introduction

The semantics, $\mathcal{S}$, of loop-free schemas is defined as a mapping from loop-free schemas to symbolic execution trees.

Symbolic execution trees are finite binary trees whose intermediate nodes are symbolic predicates and whose leaf nodes are symbolic states which map variable names to symbolic values.

The chapter ends with an implementation of $\mathcal{S}$ in the functional language Hope [6]. The input to this implementation is a representation of a schema s and the output is a representation of the symbolic execution tree, $\mathcal{S}[\![s]\!]$. This is the first stage in an algorithm for computing the dataflow dependencies introduced in Chapter 3.

4.2 Symbolic Values

Symbolic Values represent the composition of expression functions that would be required to calculate the value of a variable. They are sometimes called terms [44].


Definition 4.2.1 (Symbolic Values (Δ))
$$\Delta = V \cup (L \times \mathcal{P}\Delta) \cup \{\bot\}$$

A symbolic value is, thus, one of:

type 1: A variable.
type 2: A label and a finite set of symbolic values.
type 3: Bottom.

4.2.1 Examples of Symbolic Values

• x: A single variable. The initial value of variable x.

• f(): The label in this case is f, and the set is empty. This is the value of the variable x after the constant assignment x := f().

• f(x, y, z): The label in this case is f, and the set is {x, y, z}. This is the value of the variable x after the assignment x := f(x, y, z).

• f(x, g()): The label in this case is f, and the set is {x, g()}. This is the value of the variable x after the sequence of assignments y := g(); x := f(x, y).

• f(x, h(y, z), g(z)): The label in this case is f, and the set is {x, h(y, z), g(z)}. This is the value of the variable x after the sequence of assignments k := g(z); z := h(y, z); x := f(x, z, k).

• ⊥: The bottom symbolic value. As will be seen, the schema FAIL results in a symbolic state where all variables are mapped to this symbolic value.

• f(⊥, x): Symbolic values like this, although syntactically valid, never occur as a result of symbolic execution.

Note that a symbolic expression is a special type of symbolic value, where the set component consists just of variables (like the first three examples above).
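The three kinds of symbolic value can be modelled directly as a small data type. The sketch below is one possible encoding and is not the thesis's Hope implementation: a plain string stands for a variable, the hypothetical `App` tuple for a label applied to a finite set of symbolic values, and `None` for bottom. The helper names are likewise assumptions made for illustration.

```python
from typing import NamedTuple

class App(NamedTuple):
    """Type 2: a label together with a finite set of symbolic values."""
    label: str
    args: frozenset

BOTTOM = None  # type 3; a plain str stands for a variable (type 1)

def app(label, *args):
    """Convenience constructor: app('f', 'x', app('g')) builds f(x, g())."""
    return App(label, frozenset(args))

def variables(v):
    """Variables mentioned anywhere inside a symbolic value."""
    if v is BOTTOM:
        return set()
    if isinstance(v, str):
        return {v}
    return set().union(*(variables(a) for a in v.args))

def labels(v):
    """Labels occurring anywhere inside a symbolic value."""
    if v is BOTTOM or isinstance(v, str):
        return set()
    return {v.label}.union(*(labels(a) for a in v.args))

def is_symbolic_expression(v):
    """A symbolic expression is a label applied to a set of variables only."""
    return isinstance(v, App) and all(isinstance(a, str) for a in v.args)

# f(x, h(y, z), g(z)) from the examples above.
v = app("f", "x", app("h", "y", "z"), app("g", "z"))
```

Because the arguments form a set, `app("f", "x", app("g"))` and `app("f", app("g"), "x")` compare equal in this encoding, which matches the definition's use of a set rather than a sequence.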


4.2.2 Symbolic States (Γ)

As in conventional semantics [88], a state maps variables to values, only in the case of symbolic states, the values are symbolic values.

Definition 4.2.2 (Symbolic States)
A symbolic state γ ∈ Γ is a total function from V to Δ.
$$\Gamma = [V \to \Delta]$$

The reason that symbolic states are total is that every variable that is not assigned to in a program is mapped to itself (even variables that are not mentioned at all). This is allowable since variables are symbolic values. It reflects the fact that if a variable is not mentioned in a program fragment, then its final value after executing the fragment will depend on its initial value and nothing else.

4.2.3 Symbolic Execution of a Schema

As an example of how the symbolic state changes as a result of an assignment, consider the simple program p4.1, consisting of a sequence of assignments, and its corresponding schema, s4.1:

p4.1:
x:=21;
y:=x+5;
z:=x+y

s4.1:
x := f1();
y := f2(x);
z := f3(x, y)

In Figure 4.1, we show the steps in the symbolic execution of the sequence of assignments in s4.1.

The initial state maps every variable to itself. As in conventional semantics [88], execution of an assignment x:=f(V) (where V is a set of variables) involves updating the current state to reflect the fact that the new value of x is the result of evaluating f(V) in the current state.
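The update rule just described can be traced on s4.1 in a few lines. This is a hedged sketch rather than the thesis's implementation: it covers only sequences of assignments, ignores bottom, and the names `App`, `State`, `assign` and `show` are hypothetical. Totality of the state is modelled by letting any variable that has not been assigned map to itself.

```python
from typing import NamedTuple

class App(NamedTuple):
    """A label applied to a finite set of symbolic values."""
    label: str
    args: frozenset

class State(dict):
    """Total symbolic state: a variable never assigned to maps to itself."""
    def __missing__(self, var):
        return var

def assign(state, x, label, used):
    """Execute x := label(used): each used variable is replaced by its
    current symbolic value, and x is updated in a copy of the state."""
    new = State(state)
    new[x] = App(label, frozenset(state[v] for v in used))
    return new

def show(v):
    """Render a symbolic value, e.g. f3(f1(), f2(f1()))."""
    if isinstance(v, str):
        return v
    return f"{v.label}({', '.join(sorted(show(a) for a in v.args))})"

# Schema s4.1: x := f1(); y := f2(x); z := f3(x, y)
s41 = [("x", "f1", []), ("y", "f2", ["x"]), ("z", "f3", ["x", "y"])]

state = State()  # the initial state maps every variable to itself
for target, label, used in s41:
    state = assign(state, target, label, used)

final_z = show(state["z"])
```

After the loop, `state["w"]` for an unmentioned variable `w` is still `"w"`, reflecting that its final value depends only on its initial value.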


Schema s4.1 | val of x | val of y | val of z
(initial state) | x | y | z
x := f1(); | f1() | y | z
y := f2(x); | f1() | f2(f1()) | z
z := f3(x, y) | f1() | f2(f1()) | f3(f1(), f2(f1()))

Figure 4.1: The Symbolic execution of schema s4.1.

To evaluate f(V) in a symbolic state γ we simply replace each element of V by its value in γ. (This is 'evaldelta'; see Definition 4.4.6.) After the assignment statement x := f1(), the value of x in the state is thus updated to the symbolic value f1(). This reflects the fact that now x depends upon the expression function f1, which must be a constant function since its set of mentioned variables is empty. After the assignment y := f2(x), the value of y is updated to the symbolic value f2(f1()). The variable x in the expression on the right hand side of this assignment to y is replaced by its current value. This shows that y now depends upon the expression functions f1 and f2. Notice that z, at this stage, has not been changed, so it still depends only upon its initial value. Finally, after the assignment z := f3(x, y), to find the value for z we replace the variables x and y in the right hand side of the assignment by their current symbolic values to give the symbolic value f3(f1(), f2(f1())), which shows that z now depends on all three expression functions f1, f2 and f3. In the final symbolic state, all variables in s4.1 are mapped to symbolic values that mention no variables. This shows that after executing any program p in [s4.1], all the final values of the variables of p will be independent of the initial values of any variable. For schemas consisting of sequences of assignments, it will be noticed that the variables mentioned in the current symbolic value, δ, of variable v represent the set of variables upon which v is currently dataflow variable dependent. Similarly, the set of labels occurring in δ corresponds to the set of labels upon which v is currently dataflow label dependent.

4.3 Symbolic Execution Trees

A Symbolic Execution Tree is a binary tree similar to the symbolic execution tree used by Day [30]. The leaves are symbolic states and the intermediate nodes are symbolic values corresponding to the execution of the predicates of conditionals. The left subtree corresponds to the 'further execution' that occurs if the predicate evaluates to true and the right subtree corresponds to the 'further execution' that occurs if the predicate evaluates to false.

Definition 4.3.1 (Symbolic Execution Trees)
A Symbolic Execution Tree (SET) is an object of type:
$$\mathrm{SET} = \Gamma \cup (\mathrm{SET} \times \Delta \times \mathrm{SET})$$

A symbolic execution tree is thus a binary tree whose intermediate nodes are symbolic values and whose leaf nodes are symbolic states.

4.3.1 Example of a Symbolic Execution Tree

In Figure 4.2, an example symbolic execution tree is given.


[Figure 4.2: Example Symbolic Execution Tree. The root is the predicate symbolic value b1(x,y). The other intermediate nodes are b2(x) (occurring twice), b1(f(x,y),y), b2(f(x,y)), b1(f(f(x,y),y),y), b1(f(x,y),g(y)), b1(x,g(y)), b1(x,g(g(y))) and b1(f(x,g(y)),g(y)). The leaves are symbolic states, including id and bottom.]

(It corresponds to schema W2 defined in Section 7.2.1.) The intermediate nodes are all symbolic values and the leaf nodes are all final states mapping variables to symbolic values. The value of each variable in these leaf node states is the symbolic value corresponding to the sequence of assignments that would have to be executed to reach that final state. Each intermediate node represents the symbolic execution of a predicate in the schema. We call these nodes predicate symbolic values. The outermost label of a predicate symbolic value will be the label of the predicate in the schema whose execution this predicate symbolic value represents.

There may be more than one different execution path all leading to the execution of a particular predicate b(x), say. There will be a different occurrence of a predicate symbolic value whose outermost label is b for each execution path that leads to it.

Variables that are not shown in the domain of symbolic states in these diagrams are assumed to be mapped to themselves. There are two 'special' states: the state, id, which maps every variable to itself, and the state, ⊥, where every variable gets mapped to the ⊥ symbolic value. (Later it will be shown that the ⊥ symbolic state arises from the schema FAIL that never terminates.)

4.4 Operations on Symbolic Execution Trees

In this section, we define some operations on symbolic execution trees needed for the semantics of loop-free schemas (Section 4.5).

4.4.1 Paths

Definition 4.4.1 (Paths)
Given a symbolic execution tree, t, a path, π, is a disjoint pair (π_T, π_F) of sets of symbolic values representing a legal path from the root of t to a leaf symbolic state γ. π_T and π_F represent the sets of predicate symbolic values that had to be true and false respectively in order to arrive at γ.

Example
The first two columns in Figure 4.3 give the values of π_T and π_F respectively for each path of the symbolic execution tree given in Figure 4.4.

4.4.2 The Path Function, pfun, of a symbolic execution tree, t

The path function, pfun(t), of t is the function whose domain is the set of paths π in t. For each such π, pfun(t)(π) is the symbolic state occurring at the leaf of the tree at the end of π. The range of pfun(t) is, thus, the set of all leaves of t. See Figure 4.3 for an example. It only makes sense to find the path function of simple symbolic execution trees (see Section 4.4.3).


True Symbolic Predicates | False Symbolic Predicates | Final State
{b1(x,y), b2(x), b1(f(x,y),y), b2(f(x,y)), b1(f(f(x,y),y),y)} | ∅ | ⊥
{b1(x,y), b2(x), b1(f(x,y),y), b2(f(x,y))} | {b1(f(f(x,y),y),y)} | x ↦ f(f(x,y),y)
{b1(x,y), b2(x), b1(f(x,y),y), b1(f(x,y),g(y))} | {b2(f(x,y))} | ⊥
{b1(x,y), b2(x), b1(f(x,y),y)} | {b2(f(x,y)), b1(f(x,y),g(y))} | x ↦ f(x,y), y ↦ g(y)
{b1(x,y), b2(x)} | {b1(f(x,y),y)} | x ↦ f(x,y)
{b1(x,y), b1(x,g(y)), b1(x,g(g(y)))} | {b2(x)} | ⊥
{b1(x,y), b1(x,g(y))} | {b2(x), b1(x,g(g(y)))} | y ↦ g(g(y))
{b1(x,y)} | {b2(x), b1(x,g(y))} | y ↦ g(y)
∅ | {b1(x,y)} | id

Figure 4.3: The Path Function of the symbolic execution tree in Figure 4.4

Definition 4.4.2 (pfun : SET → (path ⇀ Γ))

• If t is a leaf, then
$$\mathit{pfun}\; t = \{(\emptyset, \emptyset) \mapsto t\}$$

• If t is of the form (t1, r, t2), then
$$\mathit{pfun}\,(t_1, r, t_2) = \mathit{addleft}(r, \mathit{pfun}\; t_1) \cup \mathit{addright}(r, \mathit{pfun}\; t_2)$$
where
$$\mathit{addleft}, \mathit{addright} : (\Delta \times (\mathit{path} \rightharpoonup \Gamma)) \to (\mathit{path} \rightharpoonup \Gamma)$$
$$\mathit{addleft}(r, f) = \bigcup_{((\pi_1, \pi_2) \mapsto \gamma) \in f} \{(\pi_1 \cup \{r\}, \pi_2) \mapsto \gamma\}$$
and
$$\mathit{addright}(r, f) = \bigcup_{((\pi_1, \pi_2) \mapsto \gamma) \in f} \{(\pi_1, \pi_2 \cup \{r\}) \mapsto \gamma\}$$

The path function corresponding to the symbolic execution tree in Figure 4.4 is given in Figure 4.3.
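Definition 4.4.2 translates almost line for line into a recursive function. The following is an illustrative sketch only: trees are encoded as nested tuples `(t1, r, t2)`, predicate symbolic values and leaf states are abbreviated to strings, and the tree literal is an assumption reconstructed from the rows of Figure 4.3 (reading the false set of the fifth row as {b1(f(x,y),y)}), not a copy of the original diagram.

```python
def pfun(t):
    """Map each path (true set, false set) of a simple tree to its leaf."""
    if not isinstance(t, tuple):  # a leaf state
        return {(frozenset(), frozenset()): t}
    t1, r, t2 = t
    paths = {}
    for (pi_t, pi_f), leaf in pfun(t1).items():  # addleft: r was true
        paths[(pi_t | {r}, pi_f)] = leaf
    for (pi_t, pi_f), leaf in pfun(t2).items():  # addright: r was false
        paths[(pi_t, pi_f | {r})] = leaf
    return paths

BOT = "bottom"

# Assumed shape of the simplified tree: left subtree = predicate true.
tree = (
    (
        (
            (
                (BOT, "b1(f(f(x,y),y),y)", "x->f(f(x,y),y)"),
                "b2(f(x,y))",
                (BOT, "b1(f(x,y),g(y))", "x->f(x,y), y->g(y)"),
            ),
            "b1(f(x,y),y)",
            "x->f(x,y)",
        ),
        "b2(x)",
        (
            (BOT, "b1(x,g(g(y)))", "y->g(g(y))"),
            "b1(x,g(y))",
            "y->g(y)",
        ),
    ),
    "b1(x,y)",
    "id",
)

paths = pfun(tree)
```

Each key of `paths` is a pair of frozensets, so a row of the table can be looked up directly, for example `paths[(frozenset(), frozenset({"b1(x,y)"}))]`.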


4.4.3 Simple Symbolic Execution Trees

A simple symbolic execution tree is one where no path contains repetitions of symbolic values.

Definition 4.4.3 (Simple Symbolic Execution Trees)
Formally, a leaf t is simple, and the tree (t1, r, t2) is simple if and only if t1 and t2 are both simple and r is not a node of t1 and r is not a node of t2.

4.4.4 Simplification of a Symbolic Execution Tree

In the semantics of loop-free schemas which follows, the meaning of the sequence of two schemas s1; s2 is defined, unsurprisingly, in terms of the meanings of s1 and s2. In the sequence function (Definition 4.4.9), every leaf state γ in the meaning of s1 is replaced by the symbolic execution tree corresponding to the meaning of s2 evaluated in state γ using the treeinstate function (see Definition 4.4.8). The symbolic execution tree constructed in this way may contain 'impossible paths'. An impossible path is one where π_T and π_F are not disjoint. To remove these impossible paths, we always simplify newly created symbolic execution trees.

Example
The symbolic execution tree in Figure 4.2 is not simple, as there is more than one occurrence of the symbolic predicate b2(x). When simplified, the 'lower' occurrence of b2(x) is removed, as is everything to the left of this lower occurrence. The simplified symbolic execution tree is shown in Figure 4.4.
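The effect of simplification on a repeated predicate can be sketched with a small recursive function that mirrors the pruning definition given later in this chapter (Definition 4.4.4). Everything in the block is an assumption made for illustration: trees are nested tuples `(t1, r, t2)`, nodes and leaves are abbreviated to strings, and the example tree is only a fragment modelled on the right-hand part of Figure 4.2, with a placeholder leaf on the left.

```python
def prune(pi_t, pi_f, t):
    """Remove every node already decided by the path (pi_t, pi_f)."""
    if not isinstance(t, tuple):  # a leaf is left unchanged
        return t
    t1, r, t2 = t
    if r in pi_t:   # r is already known to be true: keep the left subtree
        return prune(pi_t, pi_f, t1)
    if r in pi_f:   # r is already known to be false: keep the right subtree
        return prune(pi_t, pi_f, t2)
    return (prune(pi_t | {r}, pi_f, t1), r, prune(pi_t, pi_f | {r}, t2))

def simplify(t):
    """Simplify a tree by pruning it with respect to the empty path."""
    return prune(frozenset(), frozenset(), t)

# b2(x) is tested again below its own false branch, which creates an
# impossible path on which b2(x) would be both false and true.
non_simple = (
    "left-subtree-placeholder",
    "b2(x)",
    (
        (
            ("bottom", "b1(f(x,g(y)),g(y))", "x->f(x,g(y)), y->g(y)"),
            "b2(x)",  # the 'lower' occurrence
            ("bottom", "b1(x,g(g(y)))", "y->g(g(y))"),
        ),
        "b1(x,g(y))",
        "y->g(y)",
    ),
)

simple = simplify(non_simple)
```

In the result, the lower b2(x) and its left subtree have gone, and b1(x,g(y)) leads directly to the b1(x,g(g(y))) test, which is the behaviour described in the example above.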


[Figure 4.4: Simplified Symbolic Execution Tree. The tree of Figure 4.2 with the lower occurrence of b2(x), and the subtree to its left, removed. The intermediate nodes are b1(x,y), b2(x), b1(f(x,y),y), b2(f(x,y)), b1(f(f(x,y),y),y), b1(f(x,y),g(y)), b1(x,g(y)) and b1(x,g(g(y))); the paths and leaf states are tabulated in Figure 4.3.]

4.4.5 Pruning Symbolic Execution Trees

Simplification is defined in terms of pruning. A symbolic execution tree, t, is pruned with respect to a path (π_T, π_F). Pruning a symbolic execution tree with respect to a path (π_T, π_F) results in a tree which is both simple and contains none of the nodes in π_T ∪ π_F (Lemma 5.3.1). Pruning is defined recursively as follows:


4.4 Operations on Symboli Exe ution Trees 129De�nition 4.4.4 (prune : path ! SET ! SET)If t is a leaf, prune(�T ; �F )(t) = tand if t is of the form (t1; r; t2) then,prune(�T ; �F )(t) = prune(�T ; �F )t1 if r 2 �Tprune(�T ; �F )(t) = prune(�T ; �F )t2 if r 2 �Fand if r =2 (�T [ �F ) thenprune(�T ; �F )(t)=(prune(frg [ �T ; �F )t1; r; prune(�T ; frg [ �F )t2)A symboli exe ution tree is, thus, simpli�ed by pruning it with respe t to the `empty'path:De�nition 4.4.5 (simplify : SET ! SET ! SET)simplify = prune(;; ;)4.4.6 Evaluating a symboli value Æ in a Symboli State The fun tion, evaldelta , is equivalent to the E fun tion in standard semanti s. Like, E , whi htakes expressions and states, evaldelta takes the `symboli equivalents': symboli values andsymboli states.De�nition 4.4.6 (evaldelta : ! �! �)If Æ is a variable x, then evaldelta x = x

If $\delta$ is of the form $f(S)$, then
\[ \mathit{evaldelta}\ \Sigma\ \delta = f\Big(\bigcup_{s \in S} \mathit{evaldelta}\ \Sigma\ s\Big) \]
$\mathit{evaldelta}$ is strict in both its arguments:-
\[ \mathit{evaldelta}\ \Sigma\ \bot = \bot = \mathit{evaldelta}\ \bot\ \delta \]

Example
See Section 4.2.3, which shows $\mathit{evaldelta}$ being applied.

4.4.7 Updating a Symbolic State in a Symbolic State

Given two symbolic states $\Sigma_1$ and $\Sigma_2$, updating $\Sigma_2$ in state $\Sigma_1$ means producing the state which maps each variable, $v$, to the result of evaluating the symbolic value $\Sigma_2\ v$ in $\Sigma_1$.

Definition 4.4.7 ($\mathit{updatestateinstate} : \Sigma \to \Sigma \to \Sigma$)
\[ \mathit{updatestateinstate}\ \Sigma_1\ \Sigma_2\ x = \mathit{evaldelta}\ \Sigma_1\ (\Sigma_2\ x) \]
$\mathit{updatestateinstate}$ is strict in both its arguments:-
\[ \mathit{updatestateinstate}\ \Sigma\ \bot = \bot = \mathit{updatestateinstate}\ \bot\ \Sigma \]

4.4.8 Evaluating a Symbolic Execution Tree in a Symbolic State

To evaluate $\mathit{treeinstate}\ \Sigma\ t$, we symbolically evaluate each node of $t$ in state $\Sigma$. The intermediate nodes are symbolic values, so we use $\mathit{evaldelta}$ for these, and the leaf nodes are states, so we use $\mathit{updatestateinstate}$ for these.

Definition 4.4.8 ($\mathit{treeinstate} : \Sigma \to \mathit{SET} \to \mathit{SET}$)
If $t$ is a leaf, then $\mathit{treeinstate}\ \Sigma\ t = \mathit{updatestateinstate}\ \Sigma\ t$
and if $t$ is of the form $(t_1, r, t_2)$, then
\[ \mathit{treeinstate}\ \Sigma\ t = (\mathit{treeinstate}\ \Sigma\ t_1,\ \mathit{evaldelta}\ \Sigma\ r,\ \mathit{treeinstate}\ \Sigma\ t_2) \]

4.4.9 The Sequence of Two Symbolic Execution Trees

Given two symbolic execution trees, $t$ and $t'$, we now define the meaning of $t; t'$, i.e. $t$ followed by $t'$. To compute $t; t'$ we replace each leaf node $\Sigma$ of $t$ by $\mathit{treeinstate}\ \Sigma\ t'$.

Definition 4.4.9 ($\mathit{sequence} : \mathit{SET} \to \mathit{SET} \to \mathit{SET}$)
If $t$ is a leaf, $\mathit{sequence}\ t\ t' = \mathit{treeinstate}\ t\ t'$, and if $t$ is of the form $(t_1, r, t_2)$ then
\[ \mathit{sequence}\ t\ t' = (\mathit{sequence}\ t_1\ t',\ r,\ \mathit{sequence}\ t_2\ t'). \]

4.5 The Semantics of Loop Free Schemas

We are now in a position to define the semantic function $\mathcal{S}$, which maps schemas which do not contain while loops to symbolic execution trees. In effect, $\mathcal{S}$ is an algorithm which translates schemas into symbolic execution trees. Later, the resulting symbolic execution tree will be further analysed to produce the various dataflow dependencies introduced in Chapter 3.

In the following chapter (Chapter 6), the algorithm will be proved correct for loop free schemas.

4.5.1 Assignments

Definition 4.5.1 (Assignment)
$\mathcal{S}[\![x := f(V)]\!]$ is the symbolic execution tree consisting of the single leaf state:-
\[
\mathcal{S}[\![x := f(V)]\!] = \lambda z.
\begin{cases}
z & \text{if } z \neq x \\
f(V) & \text{if } z = x
\end{cases}
\]

4.5.2 Fail

Definition 4.5.2 (FAIL)
FAIL is a schema that represents programs which never terminate in any state. $\mathcal{S}[\![\mathrm{FAIL}]\!]$ is thus the symbolic execution tree consisting of the single leaf $\bot$.
\[ \mathcal{S}[\![\mathrm{FAIL}]\!] = \bot \]

4.5.3 Skip

Definition 4.5.3 (skip)
skip is a schema that represents a program which does nothing.
\[ \mathcal{S}[\![\mathrm{skip}]\!] = \lambda v. v \]

4.5.4 Conditionals

Definition 4.5.4 (Conditionals)
\[ \mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!] = \mathit{simplify}\ (\mathcal{S}[\![s_1]\!],\ f(V),\ \mathcal{S}[\![s_2]\!]) \]
$\mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]$ is thus the symbolic execution tree whose root is the symbolic value, $f(V)$, and whose left and right subtrees are the meanings of the then and the else branches respectively.

4.5.5 Statement Sequences

Definition 4.5.5 (Statement Sequences)
\[ \mathcal{S}[\![s_1; s_2]\!] = \mathit{simplify}\ (\mathit{sequence}\ \mathcal{S}[\![s_1]\!]\ \mathcal{S}[\![s_2]\!]) \]
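Definitions 4.5.1 to 4.5.5 can be read as a small recursive interpreter. The following Python sketch is one possible rendering, not the thesis implementation (which is written in Hope). The encoding is an assumption made purely for illustration: a variable is a string, a compound symbolic value is a pair (label, frozenset of values), bottom is None, a symbolic state is a dict in which absent variables map to themselves, and a tree is either a leaf state or a triple (t1, r, t2). The helper f and the statement tags "ass", "if" and "fail" are hypothetical names.

```python
# Assumed encoding (illustrative only):
#   symbolic value: str (variable) | (label, frozenset of values) | None (bottom)
#   symbolic state: dict, absent variables map to themselves | None (bottom)
#   tree:           a leaf state | a node (t1, r, t2)

def f(label, *args):
    """Build the compound symbolic value label(args)."""
    return (label, frozenset(args))

def is_node(t):
    return isinstance(t, tuple)

def evaldelta(state, delta):
    """Definition 4.4.6: evaluate a symbolic value in a symbolic state."""
    if state is None or delta is None:
        return None
    if isinstance(delta, str):
        return state.get(delta, delta)
    label, args = delta
    return (label, frozenset(evaldelta(state, d) for d in args))

def updatestateinstate(s1, s2):
    """Definition 4.4.7: each v maps to evaldelta s1 (s2 v)."""
    if s1 is None or s2 is None:
        return None
    return {v: evaldelta(s1, s2.get(v, v)) for v in set(s1) | set(s2)}

def treeinstate(state, t):
    """Definition 4.4.8: evaluate every node and leaf of t in state."""
    if is_node(t):
        t1, r, t2 = t
        return (treeinstate(state, t1), evaldelta(state, r), treeinstate(state, t2))
    return updatestateinstate(state, t)

def sequence(t, t_next):
    """Definition 4.4.9: replace each leaf of t by t_next evaluated in that leaf."""
    if is_node(t):
        t1, r, t2 = t
        return (sequence(t1, t_next), r, sequence(t2, t_next))
    return treeinstate(t, t_next)

def prune(path, t):
    """Definition 4.4.4: drop branches contradicting the predicates in path."""
    true_set, false_set = path
    if not is_node(t):
        return t
    t1, r, t2 = t
    if r in true_set:
        return prune(path, t1)
    if r in false_set:
        return prune(path, t2)
    return (prune((true_set | {r}, false_set), t1), r,
            prune((true_set, false_set | {r}), t2))

def simplify(t):
    """Definition 4.4.5: prune with respect to the empty path."""
    return prune((frozenset(), frozenset()), t)

def meaning(stmt):
    """Definitions 4.5.1, 4.5.2 and 4.5.4 for a single statement."""
    if stmt[0] == "fail":
        return None                      # the single leaf bottom
    if stmt[0] == "ass":
        _, x, e = stmt
        return {x: e}                    # identity everywhere except x
    _, e, then_part, else_part = stmt
    return simplify((meaningl(then_part), e, meaningl(else_part)))

def meaningl(stmts):
    """Definitions 4.5.3 and 4.5.5: the empty list plays the role of skip."""
    if not stmts:
        return {}
    return simplify(sequence(meaning(stmts[0]), meaningl(stmts[1:])))

# if f1(i) then begin c := f2(k); if f3(c) then c := f4(y) else c := f5(c);
#                     i := f6(i) end else skip
schema = [("if", f("f1", "i"),
           [("ass", "c", f("f2", "k")),
            ("if", f("f3", "c"),
             [("ass", "c", f("f4", "y"))],
             [("ass", "c", f("f5", "c"))]),
            ("ass", "i", f("f6", "i"))],
           [])]
tree = meaningl(schema)
```

In this sketch the root of the resulting tree is f1(i), the inner predicate is f3(f2(k)), and the else branch is the identity state. The false leaf under f3 maps c to f5(f2(k)), because c already holds f2(k) when f5 is applied.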

4.5.6 Example

(The sequence of assignments in Figure 4.1 (page 122) was really translated to a symbolic execution tree consisting of a single leaf node containing the final state shown in Figure 4.1 (page 122).)

Consider the program $p_{4.5}$, and its corresponding schema $s_{4.5}$, in Figure 4.5 (page 133). The symbolic execution tree corresponding to $s_{4.5}$ has the structure shown in Figure 4.6 (page 134).

    if i<3
    then begin
        c:=k;
        if c=2 then c:=y else c:=c+1;
        i:=i+1
    end
    else skip

    if f1(i)
    then begin
        c := f2(k);
        if f3(c) then c := f4(y) else c := f5(c);
        i := f6(i)
    end
    else skip

Figure 4.5: $p_{4.5}$ and $s_{4.5}$

Nodes A and B are symbolic values corresponding to the evaluation of the symbolic predicates f1(i) and f3(c) respectively. The leaves C, D and E of the tree correspond to symbolic states produced by symbolically executing the sequences of assignments along that path as described in the previous section. These final states are dependent on the values of the two symbolic predicates (left branch = true, right branch = false).

4.6 Implementation of the Semantics of Loop Free Schemas

In this section, the semantics described in this chapter is translated into an algorithm in the functional programming language Hope [6]. The input to the algorithm (given by the function meaningl below) is a representation of a schema in abstract syntax (the parser is not included, as it does not contribute to the understanding of the algorithm) and the output is a
representation of the corresponding symbolic execution tree. delta is the Hope datatype representing symbolic values.

Since Hope is a functional language, it will be noticed that the program definitions are only superficially different from the mathematical ones introduced in this chapter. This means that the inductive proofs in terms of the mathematical definitions, occurring in later chapters, will also serve as proofs of the program itself.

[Figure 4.6: The symbolic execution tree of $s_{4.5}$. The tree layout was lost in extraction; the labels suggest the following reading. Root (A) is f1(i). Its left subtree is node (B), f3(f2(k)), with leaves (D): i -> f6(i), c -> f4(y) and (E): i -> f6(i), c -> f5(c). Its right subtree is leaf (C): id.]

4.6.1 The Abstract Syntax for Symbolic Values (Definition 4.2.1 (page 119))

    type name == list(char);
    data delta == va name ++ complex (name # (set delta)) ++ botdelta;

4.6.2 The `Standard' Update Function [88]

    update: (alpha -> beta) -> alpha -> beta -> (alpha -> beta);
    update f x y z <= if z=x
                      then y
                      else f z;

4.6.3 The Abstract Syntax for Schemas (Section 3.11)

    data statement == FAIL ++
                      ass(name X delta) ++
                      ife(delta X (list statement) X (list statement)) ++
                      while(delta X (list statement));

4.6.4 Symbolic States (Section 4.2.2)

    data state == ok(name -> delta) ++ botstate;

4.6.5 The Abstract Syntax for Symbolic Execution Trees (Section 4.3)

    data SET == leaf state ++ node(SET X delta X SET);

4.6.6 Representation of Paths (Definition 4.4.1 (page 125))

    type path == set delta X set delta;

4.6.7 evaldelta (Definition 4.4.6 (page 129))

    evaldelta: state -> delta -> delta;
    evaldelta botstate x <= botdelta;
    evaldelta (ok sigma) botdelta <= botdelta;
    evaldelta (ok sigma) (va x) <= sigma x;
    evaldelta (ok sigma) (complex (f,S)) <= complex(f,mapset1(evaldelta (ok sigma) ,S));

4.6.8 updatestateinstate (Definition 4.4.7 (page 130))

    updatestateinstate:state -> state -> state;
    updatestateinstate (ok st1) (ok st2) <= ok((evaldelta (ok st1) o st2));
    updatestateinstate x y <= botstate;
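To make the composition in updatestateinstate concrete, the following worked evaluation applies Definitions 4.4.6 and 4.4.7 to two small symbolic states. The states $\Sigma_1$ and $\Sigma_2$ are illustrative choices (they correspond to executing c := f2(k) and then c := f5(c); i := f6(i)) and are not taken from the text; both are assumed to be the identity on all other variables.

```latex
\begin{align*}
\Sigma_1 &= \{c \mapsto f_2(k)\}, \qquad
\Sigma_2 = \{c \mapsto f_5(c),\ i \mapsto f_6(i)\} \\
\mathit{updatestateinstate}\ \Sigma_1\ \Sigma_2\ c
  &= \mathit{evaldelta}\ \Sigma_1\ (f_5(c))
   = f_5(\mathit{evaldelta}\ \Sigma_1\ c)
   = f_5(f_2(k)) \\
\mathit{updatestateinstate}\ \Sigma_1\ \Sigma_2\ i
  &= \mathit{evaldelta}\ \Sigma_1\ (f_6(i))
   = f_6(\mathit{evaldelta}\ \Sigma_1\ i)
   = f_6(i) \\
\mathit{updatestateinstate}\ \Sigma_1\ \Sigma_2\ k
  &= \mathit{evaldelta}\ \Sigma_1\ k
   = k
\end{align*}
```

The second state's expressions are thus rewritten in terms of the variables as they stood before the first state was applied, which is what sequencing two assignments requires.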

4.6.9 treeinstate (Definition 4.4.8 (page 130))

    treeinstate: SET -> state -> SET;
    treeinstate (leaf sigma') sigma <= leaf (updatestateinstate sigma sigma');
    treeinstate (node(t1,r,t2)) sigma <=
        node (treeinstate t1 sigma, evaldelta sigma r, treeinstate t2 sigma);

4.6.10 sequence (Definition 4.4.9 (page 131))

    sequence:SET -> SET -> SET;
    sequence (leaf sigma) t' <= treeinstate t' sigma;
    sequence (node(t1,r,t2)) t' <= node(sequence t1 t',r,sequence t2 t');

4.6.11 prune (Definition 4.4.4 (page 128))

    prune: path -> SET -> SET;
    prune (l,m) (leaf x) <= leaf x;
    prune (l,m) (node(b1,r,b2)) <=
        if (r isin l)
        then prune (l,m) b1
        else if (r isin m)
        then prune (l,m) b2
        else node(prune (r & l,m) b1, r, prune (l,r & m) b2);

x & y is Hope notation for the set $\{x\} \cup y$.

4.6.12 simplify (Definition 4.4.5 (page 129))

    simplify: SET -> SET;
    simplify <= prune(empty,empty);

4.6.13 The Semantic Function $\mathcal{S}$ (Section 4.5)

    meaning:statement -> SET;
    meaningl:list(statement) -> SET;

`meaning' is the semantic function $\mathcal{S}$ (Section 4.5).
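The effect of prune and simplify is easiest to see on a tree in which the same predicate is tested twice along one path. The Python sketch below mirrors the Hope definitions of Sections 4.6.11 and 4.6.12 under an assumed encoding (leaves and predicates are plain strings, nodes are triples); it is an illustration only, not part of the thesis code.

```python
def is_node(t):
    # Assumed encoding: a node is a triple (t1, r, t2); anything else is a leaf.
    return isinstance(t, tuple)

def prune(path, t):
    """Remove branches that contradict the predicates recorded in path."""
    true_set, false_set = path
    if not is_node(t):
        return t
    t1, r, t2 = t
    if r in true_set:       # r is already known to be true: keep the left branch
        return prune(path, t1)
    if r in false_set:      # r is already known to be false: keep the right branch
        return prune(path, t2)
    return (prune((true_set | {r}, false_set), t1), r,
            prune((true_set, false_set | {r}), t2))

def simplify(t):
    """Prune with respect to the empty path."""
    return prune((frozenset(), frozenset()), t)

# "p" is tested again below the true branch of "p"; leaf "B" is unreachable.
repeated_true = (("A", "p", "B"), "p", ("C", "q", "D"))
# "q" is tested again below the false branch of "q"; leaf "C" is unreachable.
repeated_false = (("A", "p", "B"), "q", ("C", "q", "D"))

simplified_true = simplify(repeated_true)
simplified_false = simplify(repeated_false)
```

In both cases the inner test whose outcome is already determined disappears together with the unreachable leaf.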

4.6.14 The skip Rule (Definition 4.5.3 (page 132))

    meaningl nil <= leaf (ok va);

4.6.15 The Sequence Rule (Definition 4.5.5 (page 132))

    meaningl (x::l) <= simplify (sequence (meaning x) (meaningl l));

4.6.16 The FAIL Rule (Definition 4.5.2 (page 132))

    meaning FAIL <= leaf botstate;

4.6.17 The Assignment Rule (Definition 4.5.1 (page 131))

    meaning (ass(x,e)) <=
        leaf(ok (update va x (evaldelta (ok va) e)));

4.6.18 The Conditional Rule (Definition 4.5.4 (page 132))

    meaning (ife(e,l1,l2)) <=
        simplify (node(meaningl l1, evaldelta (ok va) e,meaningl l2));

4.6.19 The Path Function (Definition 4.4.2 (page 125))

    singleton: alpha -> set alpha;
    singleton x <= x & empty;

    dec addleft,addright: delta X (pfun path delta) -> (pfun path delta);
    addleft(d,f) <= mapset(lambda ((a,b),c) => singleton((d & a,b),c),f);
    addright(d,f) <= mapset(lambda ((a,b),c) => singleton((a,d & b),c),f);

    applystate :name -> state -> delta;
    applystate v (ok sigma) <= sigma v;
    applystate v botstate <= botdelta;

    pathfun: SET -> name -> (pfun path delta);
    pathfun (leaf sigma) v <= singleton((empty,empty),applystate v sigma);
    pathfun (node (b1,r,b2)) v <= addleft (r,pathfun b1 v) U addright (r,pathfun b2 v);

4.7 Conclusion

The semantics, $\mathcal{S}$, of loop-free schemas has been defined as a mapping from loop-free schemas to symbolic execution trees. An implementation of $\mathcal{S}$ has been given. This is the first stage in an algorithm for computing the dataflow dependencies introduced in Chapter 3.

In order to be able to prove these algorithms correct, it will first be necessary to prove that $\mathcal{S}$ is both sound and complete. This is done in Chapter 5.
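Since the next chapter relies heavily on the path function of Section 4.6.19, a brief illustration may help. The path function maps each root-to-leaf path, recorded as a pair (predicates taken true, predicates taken false), to what is found at the leaf. The Python sketch below assumes that nodes are triples (t1, r, t2) and that anything else is a leaf. Unlike the Hope pathfun, which also takes a variable and looks it up in the leaf state, the hypothetical pfun here returns the whole leaf.

```python
def pfun(t):
    """Map each path (true_set, false_set) of tree t to the leaf it reaches."""
    if not isinstance(t, tuple):
        # A leaf is reached by the empty path.
        return {(frozenset(), frozenset()): t}
    t1, r, t2 = t
    paths = {}
    for (true_set, false_set), leaf in pfun(t1).items():
        paths[(true_set | {r}, false_set)] = leaf      # addleft: r was true
    for (true_set, false_set), leaf in pfun(t2).items():
        paths[(true_set, false_set | {r})] = leaf      # addright: r was false
    return paths

# A tree with predicate nodes "A" and "B" and leaves "C", "D" and "E";
# the shape is an illustrative choice.
tree = (("D", "B", "E"), "A", "C")
paths = pfun(tree)
```

The domain of the resulting dict is the set of paths of the tree, written dom(pfun t) in the proofs of Chapter 5.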

Chapter 5

The Soundness and Completeness of $\mathcal{S}$

5.1 Introduction

In this chapter, it is shown how, for each loop free schema, $s$, the symbolic execution tree, $\mathcal{S}[\![s]\!]$, characterises the set of all possible behaviours of all programs in $[s]$.

The reason that this is critically important is that, without it, there is no justification that it is valid to infer properties of a schema, and hence of the set of programs that the schema represents, by analysis of its symbolic execution tree.

The theory in this chapter leads to a proof that the characterisation provided by $\mathcal{S}$ is both sound and complete.

• $\mathcal{S}$ is complete in the following sense:
  Given a loop-free schema $s$, a program $p \in [s]$, and a state, $\sigma$, there exists exactly one path $\pi$, of the symbolic execution tree, $\mathcal{S}[\![s]\!]$, that corresponds to the execution of $p$ in state $\sigma$.

• $\mathcal{S}$ is sound in the following sense:
  For all paths $\pi$, of the symbolic execution tree, $\mathcal{S}[\![s]\!]$, there exists a program, $p \in [s]$, and a state, $\sigma$, such that $\pi$ corresponds to the execution of $p$ in state $\sigma$.

It is this theorem that shows that the symbolic execution tree captures exactly the right semantic information about the set of programs represented by the schema; no more and no
less. It is because of this that program analysis using symbolic execution trees may have other applications, not just the ones introduced in this thesis. It is this theorem which provides the semantic link that enables the algorithms given in Chapter 6, for computing dataflow dependencies of loop-free schemas (defined in terms of their symbolic execution trees), to be proved correct.

5.2 The Correspondence between Symbolic Execution Trees and Programs

The path $\pi$, of a symbolic execution tree, corresponds to the execution of program, $p$, in state, $\sigma$, if and only if:-

• $\pi$ is satisfied by $p$ and $\sigma$ (Definition 5.2.2 (page 143)), and

• the state derived (see Section 5.2.2) from the symbolic state at `the end of' the path $\pi$ with respect to $s$, $p$ and $\sigma$ is $\mathcal{M}[\![p]\!]\sigma$.

5.2.1 The Function evalsym

Given a program, $p$, in the class $[s]$ represented by schema, $s$, and a starting state, $\sigma$, every symbolic value, $\delta$, that can arise in the symbolic execution of $s$ corresponds to a `real' value,
\[ \mathit{evalsym}\ s\ p\ \sigma\ \delta \]
that arises in the execution of $p$, starting in state $\sigma$. The function, $\mathit{evalsym}$, is the one that computes a value (not symbolic, but real) from a symbolic value. In order to do this we need two things:

1. A program, $p$, which tells us which `real expressions' to use in place of the terms of the symbolic value that we are evaluating

2. and a state, $\sigma$, in which to evaluate the resulting `real expression'.
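These two ingredients can be made concrete with a small Python sketch that anticipates Definition 5.2.1. Everything about the encoding is an assumption made for illustration: a variable is a string, a compound symbolic value is a pair (label, frozenset of values), bottom is None, the program is a dict from labels to Python functions on a state, and assigned is a hypothetical table giving the variable assigned by each labelled expression. The sample expressions are those of the program $p_{4.5}$ of Section 4.5.6.

```python
def evalsym(assigned, program, sigma, delta):
    """Compute the real value denoted by the symbolic value delta."""
    if delta is None:                    # bottom evaluates to bottom
        return None
    if isinstance(delta, str):           # a variable: look it up in the state
        return sigma[delta]
    label, args = delta
    # Build the local state in which the real expression is evaluated:
    # each argument is bound to the variable it is associated with.
    local = {}
    for d in args:
        var = d if isinstance(d, str) else assigned[d[0]]
        local[var] = evalsym(assigned, program, sigma, d)
    return program[label](local)

# Real expressions of the example program, indexed by schema label.
program = {
    "f1": lambda st: st["i"] < 3,
    "f2": lambda st: st["k"],
    "f3": lambda st: st["c"] == 2,
    "f4": lambda st: st["y"],
    "f5": lambda st: st["c"] + 1,
    "f6": lambda st: st["i"] + 1,
}
# Variable assigned by each labelled assignment expression (hypothetical table).
assigned = {"f2": "c", "f4": "c", "f5": "c", "f6": "i"}
sigma = {"i": 0, "k": 2, "y": 7, "c": 0}

f2_k = ("f2", frozenset({"k"}))
predicate = ("f3", frozenset({f2_k}))    # f3(f2(k)): is c = 2 after c := k?
value = ("f5", frozenset({f2_k}))        # f5(f2(k)): c + 1 after c := k

predicate_result = evalsym(assigned, program, sigma, predicate)
value_result = evalsym(assigned, program, sigma, value)
```

With k = 2 in the starting state, the symbolic predicate f3(f2(k)) evaluates to true and the symbolic value f5(f2(k)) to 3, irrespective of the initial value of c.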

Including the schema, $s$, as a parameter to $\mathit{evalsym}$ is not strictly necessary.

For each symbolic expression, $f_i(V_i)$, in $s$ there is a corresponding expression, $e_i$, in $p$ (see Section 3.11.3). Since $p \in [s]$, $e_i$ is not affected by any variable outside $V_i$. So $\mathcal{E}[\![e_i]\!]\sigma$ can be computed if we know the values in $\sigma$ of all the variables in $V_i$.

Non-trivial symbolic values consist of a label and a set of symbolic values. As described in Section 3.11.3, the label refers to the expression function in a program in $[s]$. The set of symbolic values inside a non-trivial symbolic value corresponds to the values that are passed to the expression function. How do we know which symbolic value in this set is associated with which variable? This is not a problem, since labels are unique. If the outermost label of a symbolic value, $f_i$, occurs as the label of an expression in an assignment to $x$, say, then, by uniqueness (Section 3.11.2), it cannot occur as the outermost label of an assignment to any other variable. (It is for this reason that we can define symbolic values in terms of sets rather than lists of symbolic values.) All symbolic values whose outermost label is $f_i$ must therefore be associated with the variable $x$, and only the variable $x$. Given a schema, $s$, and symbolic value $f_i(S)$ we define $\mathit{varof}(s, f_i(S))$ to be the variable associated with $f_i$ as just defined. For example, in the schema in Figure 3.22 (page 109), $\mathit{varof}(s, f_3(y)) = c$.

For a trivial symbolic value, i.e. a variable $v$, $\mathit{varof}(s, v) = v$.

The evaluation of a symbolic value in terms of a schema $s$ and an initial state $\sigma$ can thus be defined as follows:

Definition 5.2.1 (evalsym)
The function, $\mathit{evalsym}$, has the type given by:-
\[ \mathit{evalsym} : \mathit{Schemas} \to \mathit{Programs} \to \mathit{States} \to \mathit{Symbolic\ Values} \to \mathit{Values} \]

• If $x$ is a variable, $\mathit{evalsym}\ s\ p\ \sigma\ x = \sigma x$;
• for compound symbolic values, $f_i(S)$,
\[ \mathit{evalsym}\ s\ p\ \sigma\ f_i(S) = \mathcal{E}[\![p_{f_i}]\!] \bigcup_{\delta \in S} \{\mathit{varof}(s, \delta) \mapsto (\mathit{evalsym}\ s\ p\ \sigma\ \delta)\}, \]
  where the $p_{f_i}$ are the expressions in program $p$ corresponding to the $f_i$ in $s$

• and for the $\bot$ symbolic value, $\mathit{evalsym}\ s\ p\ \sigma\ \bot = \bot$.

$\bigcup_{\delta \in S} \{\mathit{varof}(s, \delta) \mapsto (\mathit{evalsym}\ s\ p\ \sigma\ \delta)\}$ defines the state function in which to evaluate the expression $p_{f_i}$. (Here we are representing a state function in the standard set theoretic manner, as a set of variable $\mapsto$ value pairs.)

It should be noted that $\mathit{varof}$ is a well defined function, since for any symbolic value $f_i(S)$ that can arise,
\[ \delta_1 \in S \text{ and } \delta_2 \in S \text{ and } \delta_1 \neq \delta_2 \implies \mathit{varof}(s, \delta_1) \neq \mathit{varof}(s, \delta_2). \]
We call $\mathit{evalsym}\ s\ p\ \sigma\ \delta$ the derived value of the symbolic value $\delta$ with respect to $(s, p, \sigma)$.

5.2.2 The Derived State

Given a program, $p$, in the class $[s]$ represented by schema, $s$, and a starting state $\sigma$, every symbolic state, $\Sigma$, that can arise in the symbolic execution of $s$ thus corresponds to a `real' state,
\[ \lambda v.\ \mathit{evalsym}\ s\ p\ \sigma\ (\Sigma\ v). \]
We call this the derived state of the symbolic state $\Sigma$ with respect to $(s, p, \sigma)$.

5.2.3 The Function satisfy

A program $p$ satisfies a path, $\pi$, in a state $\sigma$ if and only if executing $p$ in state $\sigma$ `gives rise' to the path $\pi$. Formally:

Definition 5.2.2 (satisfy)
Given a schema $s$, a path $\pi = (\pi_t, \pi_f)$, a program $p \in [s]$ and a state $\sigma$,
\[ \mathit{satisfy}\ s\ p\ \sigma\ \pi \iff \text{for all } \delta \in \pi_t,\ \mathit{evalsym}\ s\ p\ \sigma\ \delta = \mathit{true} \text{ and for all } \delta \in \pi_f,\ \mathit{evalsym}\ s\ p\ \sigma\ \delta = \mathit{false}. \]

The definition of `satisfy' is extended to symbolic execution trees as follows:

Definition 5.2.3 (Satisfying a Symbolic Execution Tree)
Given a schema $s$, a path $\pi = (\pi_t, \pi_f)$, a program $p \in [s]$, a symbolic execution tree, $t$, and a state $\sigma$,
\[ \mathit{satisfy}\ t\ p\ \sigma\ \pi \iff \pi \in \mathrm{dom}(\mathit{pfun}\ t) \text{ and } \mathit{satisfy}\ s\ p\ \sigma\ \pi. \]

5.2.4 Differences

Given two paths, the differences between them are simply the set of symbolic values that are true in one path and false in the other.

Definition 5.2.4 (diffs)
Let $(\pi_1, \pi_2)$ and $(\pi'_1, \pi'_2)$ be paths.
\[ \mathit{diffs}((\pi_1, \pi_2), (\pi'_1, \pi'_2)) = (\pi_1 \cap \pi'_2) \cup (\pi_2 \cap \pi'_1). \]

5.3 Further Results

The results of this section are mainly technical confirmations that the definitions in Chapter 4 are correct. They are all needed in the proof of the main theorem of this chapter, Theorem 5.4.1 (page 160).

5.3.1 The Result of Pruning a Simple Symbolic Execution Tree is Simple

It is now proved that pruning a simple tree with respect to a path $\pi$ results in a tree that is still simple and that none of the elements of $\pi$ are in the pruned tree.

Lemma 5.3.1 If $t$ is simple then
1. $\mathit{prune}(\pi_t, \pi_f)\ t$ is simple and
2. no element of $\pi_t \cup \pi_f$ is a node of $\mathit{prune}(\pi_t, \pi_f)\ t$.

Proof: By induction on the depth of $t$.
Base Case: If $t$ is a leaf, then $\mathit{prune}(\pi_t, \pi_f)\ t = t$ and $t$ is simple.
Let $t = (t_1, r, t_2)$ be simple. Consider $\mathit{prune}(\pi_t, \pi_f)\ (t_1, r, t_2)$.
If $r \in \pi_t$ then, by Definition 4.4.4 (page 128), $\mathit{prune}(\pi_t, \pi_f)\ (t_1, r, t_2) = \mathit{prune}(\pi_t, \pi_f)\ t_1$, which is simple by induction hypothesis.
Similarly, if $r \in \pi_f$ then, by Definition 4.4.4 (page 128), $\mathit{prune}(\pi_t, \pi_f)\ (t_1, r, t_2) = \mathit{prune}(\pi_t, \pi_f)\ t_2$, which is simple by induction hypothesis.
If $r \notin \pi_t \cup \pi_f$, then, by Definition 4.4.4 (page 128),
\[ \mathit{prune}(\pi_t, \pi_f)\ (t_1, r, t_2) = (\mathit{prune}(\{r\} \cup \pi_t, \pi_f)\ t_1,\ r,\ \mathit{prune}(\pi_t, \{r\} \cup \pi_f)\ t_2). \]
But, by induction hypothesis, $\mathit{prune}(\{r\} \cup \pi_t, \pi_f)\ t_1$ and $\mathit{prune}(\pi_t, \{r\} \cup \pi_f)\ t_2$ are simple and do not contain $\pi_t \cup \pi_f \cup \{r\}$, so by Definition 4.4.3 (page 127), $(\mathit{prune}(\{r\} \cup \pi_t, \pi_f)\ t_1,\ r,\ \mathit{prune}(\pi_t, \{r\} \cup \pi_f)\ t_2)$ is simple and does not contain $\pi_t \cup \pi_f$.

5.3.2 The Result of Simplifying a Symbolic Execution Tree is Simple

We now prove that simplifying a symbolic execution tree results in a simple symbolic execution tree.

Theorem 5.3.1 For all symbolic execution trees, $t$, $(\mathit{simplify}\ t)$ is simple.

Proof: By induction on the depth of $t$.
Base Case: If $t$ is a leaf, $(\mathit{simplify}\ t) = t$, which is simple.
Let $t = (t_1, r, t_2)$.
\[ \mathit{simplify}(t_1, r, t_2) = \mathit{prune}(\emptyset, \emptyset)(t_1, r, t_2) = (\mathit{prune}(\{r\}, \emptyset)\ t_1,\ r,\ \mathit{prune}(\emptyset, \{r\})\ t_2). \]
By Lemma 5.3.1 (page 144) and the induction hypothesis, $\mathit{prune}(\{r\}, \emptyset)\ t_1$ and $\mathit{prune}(\emptyset, \{r\})\ t_2$
are simple and do not contain $r$. So $(\mathit{prune}(\{r\}, \emptyset)\ t_1,\ r,\ \mathit{prune}(\emptyset, \{r\})\ t_2)$ is simple as required, by induction.

5.3.3 A Partial Order on Paths

We now define what it means for one path to be `less' than another.

Definition 5.3.1 ($\pi \sqsubseteq \pi'$)
\[ \pi \sqsubseteq \pi' \iff \pi_t \subseteq \pi'_t \text{ and } \pi_f \subseteq \pi'_f. \]

Informally, $\pi \sqsubseteq \pi'$ means that

• every symbolic predicate that is true in $\pi$ is also true in $\pi'$ and

• every symbolic predicate that is false in $\pi$ is also false in $\pi'$.

In other words, $\pi$ can be obtained from $\pi'$ by deleting symbolic predicates. Clearly, given a symbolic execution tree, $t$, $\sqsubseteq$ defines a partial order on the set of all paths of $t$.

5.3.4 `Smaller Path' Lemma

The next lemma shows that every path of a pruned symbolic execution tree is less than some path of the unpruned tree, i.e. each path of $\mathit{prune}\ \pi\ t$ can be obtained by deleting elements from some path of $t$.

Lemma 5.3.2 Let $t$ be a simple symbolic execution tree. For all paths, $\pi$,
\[ \pi' \in \mathrm{dom}(\mathit{pfun}(\mathit{prune}\ \pi\ t)) \implies \exists \pi'' \in \mathrm{dom}(\mathit{pfun}\ t) \text{ such that } \pi' \sqsubseteq \pi'' \text{ and } \mathit{pfun}(\mathit{prune}\ \pi\ t)\ \pi' = \mathit{pfun}\ t\ \pi''. \]

Proof: Induction on the depth of $t$.
Base Case: If $t$ is a leaf then the result follows trivially.
Now let $t = (t_1, r, t_2)$ and $\pi' \in \mathrm{dom}(\mathit{pfun}(\mathit{prune}\ \pi\ (t_1, r, t_2)))$.
Case 1: If $r \in \pi_t$ then, by Definition 4.4.4 (page 128),
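$\mathit{prune}\ \pi\ (t_1, r, t_2) = \mathit{prune}\ \pi\ t_1$.
Therefore $\pi' \in \mathrm{dom}(\mathit{pfun}(\mathit{prune}\ \pi\ t_1))$.
By induction hypothesis, $\exists \pi'' \in \mathrm{dom}(\mathit{pfun}\ t_1)$ such that $\pi' \sqsubseteq \pi''$ and $\mathit{pfun}(\mathit{prune}\ \pi\ t_1)\ \pi' = \mathit{pfun}\ t_1\ \pi''$.
Put $\pi''' = (\pi''_t \cup \{r\}, \pi''_f)$. Then $\pi''' \in \mathrm{dom}(\mathit{pfun}\ (t_1, r, t_2))$. Also $\pi' \sqsubseteq \pi'''$ and, since $r \in \pi_t$,
\[ \mathit{pfun}(\mathit{prune}\ \pi\ (t_1, r, t_2))\ \pi' = \mathit{pfun}(\mathit{prune}\ \pi\ t_1)\ \pi' = \mathit{pfun}\ t_1\ \pi'' = \mathit{pfun}(t_1, r, t_2)\ \pi''' \]
as required.
Case 2: If $r \in \pi_f$. Symmetrical proof.
Case 3: If $r \notin \pi$ then, by Definition 4.4.4 (page 128),
\[ \mathit{prune}\ \pi\ (t_1, r, t_2) = (\mathit{prune}(\pi_t \cup \{r\}, \pi_f)\ t_1,\ r,\ \mathit{prune}(\pi_t, \pi_f \cup \{r\})\ t_2). \]
Therefore, by Definition 4.4.2 (page 125),
\[ \mathit{pfun}(\mathit{prune}\ \pi\ (t_1, r, t_2)) = \mathit{addleft}(r, \mathit{pfun}(\mathit{prune}(\pi_t \cup \{r\}, \pi_f)\ t_1)) \cup \mathit{addright}(r, \mathit{pfun}(\mathit{prune}(\pi_t, \pi_f \cup \{r\})\ t_2)). \]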

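Suppose, first, that $\pi' \in \mathrm{dom}(\mathit{addleft}(r, \mathit{pfun}(\mathit{prune}(\pi_t \cup \{r\}, \pi_f)\ t_1)))$.
Then $\pi' = (\pi''_t \cup \{r\}, \pi''_f)$ for some $\pi'' \in \mathrm{dom}(\mathit{pfun}(\mathit{prune}(\pi_t \cup \{r\}, \pi_f)\ t_1))$.
By induction hypothesis, $\exists \pi''' \in \mathrm{dom}(\mathit{pfun}\ t_1)$ such that $\pi'' \sqsubseteq \pi'''$ and $\mathit{pfun}(\mathit{prune}(\pi_t \cup \{r\}, \pi_f)\ t_1)\ \pi'' = \mathit{pfun}\ t_1\ \pi'''$.
But $\pi' \sqsubseteq (\pi'''_t \cup \{r\}, \pi'''_f)$.
By Definition 4.4.2 (page 125), $(\pi'''_t \cup \{r\}, \pi'''_f) \in \mathrm{dom}(\mathit{pfun}\ (t_1, r, t_2))$.
Therefore
\[ \mathit{pfun}(\mathit{prune}\ \pi\ (t_1, r, t_2))\ \pi' = \mathit{pfun}(\mathit{prune}(\pi_t \cup \{r\}, \pi_f)\ t_1)\ \pi'' = \mathit{pfun}\ t_1\ \pi''' = \mathit{pfun}(t_1, r, t_2)\ (\pi'''_t \cup \{r\}, \pi'''_f) \]
as required.
Again, by symmetry, we omit the proof of the case $\pi' \in \mathrm{dom}(\mathit{addright}(r, \mathit{pfun}(\mathit{prune}(\pi_t, \pi_f \cup \{r\})\ t_2)))$.
This completes the proof of Lemma 5.3.2 (page 145).

Lemma 5.3.3 $\pi' \sqsubseteq \pi \implies \mathit{diffs}(\pi, \pi') = \emptyset$
Proof: trivial.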
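The ordering of Definition 5.3.1 and the differences of Definition 5.2.4 are simple set operations, and Lemma 5.3.3 can be checked on a small instance. The Python sketch below assumes a path is a pair of frozensets (true predicates, false predicates) whose two components are disjoint; the names leq and diffs and the sample predicates are illustrative.

```python
def diffs(path_a, path_b):
    """Predicates that are true in one path and false in the other."""
    (a_true, a_false), (b_true, b_false) = path_a, path_b
    return (a_true & b_false) | (a_false & b_true)

def leq(path_a, path_b):
    """path_a is below path_b: componentwise set inclusion."""
    return path_a[0] <= path_b[0] and path_a[1] <= path_b[1]

smaller = (frozenset({"p"}), frozenset({"q"}))
larger = (frozenset({"p", "r"}), frozenset({"q"}))
opposite = (frozenset({"q"}), frozenset({"p"}))

no_differences = diffs(larger, smaller)
all_differences = diffs(smaller, opposite)
```

Here smaller lies below larger and the two have no differences, as the lemma predicts, whereas smaller and opposite disagree on both p and q.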

5.3.5 `Pruning' Lemma

The next lemma states that, given two paths $\pi$ and $\pi'$ with no differences and a simple tree $t$ with $\pi$ a path of $t$, if $t$ is pruned with respect to $\pi'$, then the path $(\pi_t - \pi'_t, \pi_f - \pi'_f)$ will be a path of the resulting tree and, what is more, the symbolic state at the end of this path in the pruned tree will be the same symbolic state that occurs at the end of path $\pi$ in the original tree $t$.

Lemma 5.3.4 Let $t$ be a simple symbolic execution tree. For all paths $\pi$ in $\mathrm{dom}(\mathit{pfun}\ t)$, for all paths $\pi'$ with $\mathit{diffs}(\pi, \pi') = \emptyset$,
\[ (\pi_t - \pi'_t, \pi_f - \pi'_f) \in \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ t)) \]
and
\[ \mathit{pfun}\ t\ \pi = \mathit{pfun}\ (\mathit{prune}\ \pi'\ t)\ (\pi_t - \pi'_t, \pi_f - \pi'_f). \]

Proof: Induction on the depth of $t$.
Base Case: trivial.
Induction Hypothesis: Let $t = (t_1, r, t_2)$ be simple. Assume that for all paths $\pi$ in $\mathrm{dom}(\mathit{pfun}\ t_1)$, for all paths $\pi'$ with $\mathit{diffs}(\pi, \pi') = \emptyset$,
\[ (\pi_t - \pi'_t, \pi_f - \pi'_f) \in \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)) \text{ and } \mathit{pfun}\ t_1\ \pi = \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)\ (\pi_t - \pi'_t, \pi_f - \pi'_f) \]
and for all paths $\pi$ in $\mathrm{dom}(\mathit{pfun}\ t_2)$, for all paths $\pi'$ with $\mathit{diffs}(\pi, \pi') = \emptyset$,
\[ (\pi_t - \pi'_t, \pi_f - \pi'_f) \in \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ t_2)) \text{ and } \mathit{pfun}\ t_2\ \pi = \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_2)\ (\pi_t - \pi'_t, \pi_f - \pi'_f). \]

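Let $\pi \in \mathrm{dom}(\mathit{pfun}\ (t_1, r, t_2))$ and $\pi'$ be a path such that $\mathit{diffs}(\pi, \pi') = \emptyset$.
We must show
\[ (\pi_t - \pi'_t, \pi_f - \pi'_f) \in \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))) \]
and
\[ \mathit{pfun}\ (t_1, r, t_2)\ \pi = \mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))\ (\pi_t - \pi'_t, \pi_f - \pi'_f). \]
By Definition 4.4.2 (page 125), $\pi \in \mathrm{dom}(\mathit{addleft}(r, \mathit{pfun}\ t_1)) \cup \mathrm{dom}(\mathit{addright}(r, \mathit{pfun}\ t_2))$.
Suppose first that $\pi \in \mathrm{dom}(\mathit{addleft}(r, \mathit{pfun}\ t_1))$.
So $\pi = (\pi''_t \cup \{r\}, \pi''_f)$ for some $\pi''$ in $\mathrm{dom}(\mathit{pfun}\ t_1)$ with $r \notin \pi''$, since $t$ is simple.
There are two possibilities for $\pi'$:
1. Either $r \in \pi'_t$ or
2. $r \notin \pi'$ ($r$ cannot be in $\pi'_f$ since $\mathit{diffs}(\pi, \pi') = \emptyset$).
First, suppose $r \in \pi'_t$. Then by Definition 4.4.4 (page 128),
\[ \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))) = \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)). \]
By induction hypothesis, since $\pi'' \in \mathrm{dom}(\mathit{pfun}\ t_1)$ and $\mathit{diffs}(\pi'', \pi') = \emptyset$,
\[ (\pi''_t - \pi'_t, \pi''_f - \pi'_f) \in \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)) = \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))) \]
and
\[ (\mathit{pfun}\ t_1)\ \pi'' = (\mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1))\ (\pi''_t - \pi'_t, \pi''_f - \pi'_f). \]
But since $r \in \pi'_t$, $(\pi''_t - \pi'_t, \pi''_f - \pi'_f) = (\pi_t - \pi'_t, \pi_f - \pi'_f)$.
Therefore $(\pi_t - \pi'_t, \pi_f - \pi'_f) \in \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2)))$ as required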
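and, by Definition 4.4.2 (page 125), since $r \in \pi_t$,
\begin{align*}
\mathit{pfun}\ (t_1, r, t_2)\ \pi &= \mathit{pfun}\ t_1\ \pi'' \\
&= \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)\ (\pi''_t - \pi'_t, \pi''_f - \pi'_f) && \text{(by induction hypothesis)} \\
&= \mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))\ (\pi''_t - \pi'_t, \pi''_f - \pi'_f) && \text{(since } r \in \pi'_t\text{)} \\
&= \mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))\ (\pi_t - \pi'_t, \pi_f - \pi'_f) && \text{(since } (\pi''_t - \pi'_t, \pi''_f - \pi'_f) = (\pi_t - \pi'_t, \pi_f - \pi'_f)\text{)}
\end{align*}
as required.

Now, secondly, suppose $r \notin \pi'$. Then by Definition 4.4.4 (page 128),
\[ \mathit{prune}\ \pi'\ (t_1, r, t_2) = (\mathit{prune}(\pi'_t \cup \{r\}, \pi'_f)\ t_1,\ r,\ \mathit{prune}(\pi'_t, \pi'_f \cup \{r\})\ t_2) = (\mathit{prune}\ \pi'\ t_1,\ r,\ \mathit{prune}\ \pi'\ t_2) \]
since $(t_1, r, t_2)$ is simple. So
\[ \mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2)) = \mathit{addleft}(r, \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)) \cup \mathit{addright}(r, \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_2)). \]
So, since we are assuming that $r \in \pi_t$,
\[ \mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))\ \pi = \mathit{addleft}(r, \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1))\ \pi = \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)\ \pi''. \]
Clearly, $\mathit{diffs}(\pi'', \pi') = \emptyset$, so by induction hypothesis, since $\pi'' \in \mathrm{dom}(\mathit{pfun}\ t_1)$,
\[ (\pi''_t - \pi'_t, \pi''_f - \pi'_f) \in \mathrm{dom}(\mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)) \]
and
\[ \mathit{pfun}\ t_1\ \pi'' = \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)\ (\pi''_t - \pi'_t, \pi''_f - \pi'_f). \]
But $(\pi_t - \pi'_t, \pi_f - \pi'_f) = (\pi''_t \cup \{r\} - \pi'_t, \pi''_f - \pi'_f) \in \mathrm{dom}\ \mathit{addleft}(r, \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1))$ and $\mathrm{dom}\ \mathit{addleft}(r, \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)) \subseteq \mathrm{dom}\ \mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))$, so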
$(\pi_t - \pi'_t, \pi_f - \pi'_f) \in \mathrm{dom}\ \mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))$ as required.
Also,
\begin{align*}
\mathit{pfun}(t_1, r, t_2)\ \pi &= \mathit{addleft}(r, \mathit{pfun}\ t_1)\ \pi && \text{(since } r \in \pi_t\text{)} \\
&= (\mathit{pfun}\ t_1)\ \pi'' && \text{(by definition of } \mathit{addleft}\text{)} \\
&= \mathit{pfun}\ (\mathit{prune}\ \pi'\ t_1)\ (\pi''_t - \pi'_t, \pi''_f - \pi'_f) && \text{(by induction hypothesis, see above)} \\
&= \mathit{pfun}\ (\mathit{prune}\ \pi'\ (t_1, r, t_2))\ (\pi_t - \pi'_t, \pi_f - \pi'_f) && \text{(by Definition 4.4.4 (page 128), since } r \in \pi_t - \pi'_t\text{)}
\end{align*}
as required.

In this part of the proof, we assumed first that $\pi \in \mathrm{dom}(\mathit{addleft}(r, \mathit{pfun}\ t_1))$. We should now prove the whole theorem again, this time with the assumption that $\pi \in \mathrm{dom}(\mathit{addright}(r, \mathit{pfun}\ t_2))$. An appeal to symmetry, however, allows this part of the proof to be omitted.

5.3.6 `Disagreement' Lemma

The next lemma states that two distinct paths of the same symbolic execution tree must `disagree' somewhere. In other words, their set of differences is not empty.

Lemma 5.3.5 Let $t$ be a simple symbolic execution tree. Then for all paths $\pi, \pi'$ in $\mathrm{dom}(\mathit{pfun}\ t)$,
\[ \mathit{diffs}(\pi, \pi') = \emptyset \iff \pi = \pi' \]
Proof: Induction on the depth of $t$ (very straightforward so omitted).

5.3.7 `No Subpaths' Lemma

The next lemma states that in a simple symbolic execution tree there do not exist distinct paths $\pi$ and $\pi'$ with $\pi' \sqsubseteq \pi$.

Corollary 5.3.1 Let $t$ be a simple symbolic execution tree. Then
\[ \pi' \sqsubseteq \pi \text{ and } \pi' \in \mathrm{dom}(\mathit{pfun}\ t) \text{ and } \pi \in \mathrm{dom}(\mathit{pfun}\ t) \implies \pi = \pi' \]

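Proof: By Lemma 5.3.3 (page 147), $\pi' \sqsubseteq \pi \implies \mathit{diffs}(\pi, \pi') = \emptyset$, which, by Lemma 5.3.5 (page 151), implies that $\pi' = \pi$.

Lemma 5.3.6 Let $t$ be a simple symbolic execution tree and $\pi$ a path. Then for all $\pi'$ in $\mathrm{dom}(\mathit{pfun}(\mathit{prune}\ \pi\ t))$, $\mathit{diffs}(\pi, \pi') = \emptyset$.

Proof: Induction on the depth of $t$; the base case is trivial.
Let $(t_1, r, t_2)$ be simple. Let $\pi$ be a path, and let $\pi'$ be in $\mathrm{dom}(\mathit{pfun}(\mathit{prune}\ \pi\ (t_1, r, t_2)))$.
Case 1: $r \notin \pi$.
\[ \mathit{pfun}\ (\mathit{prune}\ \pi\ (t_1, r, t_2)) = \mathit{addleft}(r, \mathit{pfun}\ (\mathit{prune}\ \pi\ t_1)) \cup \mathit{addright}(r, \mathit{pfun}\ (\mathit{prune}\ \pi\ t_2)) \]
so
\[ \mathrm{dom}(\mathit{pfun}(\mathit{prune}\ \pi\ (t_1, r, t_2))) = \bigcup_{(\pi''_t, \pi''_f) \in \mathrm{dom}\ \mathit{pfun}\ (\mathit{prune}\ \pi\ t_1) \cup \mathrm{dom}\ \mathit{pfun}\ (\mathit{prune}\ \pi\ t_2)} \{(\pi''_t \cup \{r\}, \pi''_f)\} \cup \{(\pi''_t, \pi''_f \cup \{r\})\}. \]
By induction hypothesis, $\mathit{diffs}(\pi'', \pi) = \emptyset$.
Since $r \notin \pi$, $\mathit{diffs}((\pi''_t \cup \{r\}, \pi''_f), \pi) = \emptyset$ and $\mathit{diffs}((\pi''_t, \pi''_f \cup \{r\}), \pi) = \emptyset$.
So for all $\pi' \in \mathrm{dom}(\mathit{pfun}(\mathit{prune}\ \pi\ (t_1, r, t_2)))$, $\mathit{diffs}(\pi, \pi') = \emptyset$ as required.
Case 2: $r \in \pi_t$.
By Definition 4.4.2 (page 125), $\mathit{pfun}\ (\mathit{prune}\ \pi\ (t_1, r, t_2)) = \mathit{pfun}\ (\mathit{prune}\ \pi\ t_1)$, so the result follows immediately by induction hypothesis.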

Case 3: $r \in \pi_f$ similarly.

5.3.8 Joining Paths

An operator $\sqcup$ for `joining' paths is now defined.

Definition 5.3.2 Let $\pi$ and $\pi'$ be paths.
\[ \pi \sqcup \pi' = (\pi_t \cup \pi'_t, \pi_f \cup \pi'_f) \]

Lemma 5.3.7 $\mathit{diffs}(\pi, \pi') = \emptyset \iff \pi \sqcup \pi'$ is a path.
Proof: trivial.

5.3.9 `Correspondence' Lemma

Lemma 5.3.8
• Let $\sigma$ be a state.
• Let $s$ be a loop-free schema.
• Let $\Sigma$ be a symbolic state obtained from $s$.
• Let $\delta$ be a symbolic value obtained from $s$.
• Let $p \in [s]$.
• Let $\sigma'$ be the state derived from $\Sigma$ with respect to $(s, p, \sigma)$.
Then the value derived from $\delta$ with respect to $(s, p, \sigma')$ is equal to the value derived from $\mathit{evaldelta}\ \Sigma\ \delta$ with respect to $(s, p, \sigma)$, i.e.
\[ \mathit{evalsym}\ s\ p\ \sigma'\ \delta = \mathit{evalsym}\ s\ p\ \sigma\ (\mathit{evaldelta}\ \Sigma\ \delta) \]

Proof: Induction on the depth of $\delta$.
Base case: $\delta$ is a variable.
L.H.S. =
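$\mathit{evalsym}\ s\ p\ (\lambda z.\ \mathit{evalsym}\ s\ p\ \sigma\ (\Sigma\ z))\ \delta$
$= \mathit{evalsym}\ s\ p\ \sigma\ (\Sigma\ \delta)$ (by Definition 5.2.1 (page 141))
$= \mathit{evalsym}\ s\ p\ \sigma\ (\mathit{evaldelta}\ \Sigma\ \delta)$ (by Definition 4.4.6 (page 129))
= R.H.S. as required.

If $\delta$ is of the form $f(S)$, then
L.H.S. $= \mathit{evalsym}\ s\ p\ (\lambda z.\ \mathit{evalsym}\ s\ p\ \sigma\ (\Sigma\ z))\ f(S)$
$= \mathcal{E}[\![p_f]\!] \bigcup_{\delta \in S} \{\mathit{varof}(s, \delta) \mapsto (\mathit{evalsym}\ s\ p\ (\lambda z.\ \mathit{evalsym}\ s\ p\ \sigma\ (\Sigma\ z))\ \delta)\}$ (by Definition 5.2.1 (page 141))
$= \mathcal{E}[\![p_f]\!] \bigcup_{\delta \in S} \{\mathit{varof}(s, \delta) \mapsto (\mathit{evalsym}\ s\ p\ \sigma\ (\mathit{evaldelta}\ \Sigma\ \delta))\}$ by induction hypothesis.
R.H.S. $= \mathit{evalsym}\ s\ p\ \sigma\ (\mathit{evaldelta}\ \Sigma\ f(S))$
$= \mathit{evalsym}\ s\ p\ \sigma\ (f(\bigcup_{\delta \in S} \mathit{evaldelta}\ \Sigma\ \delta))$ (by Definition 4.4.6 (page 129))
$= \mathcal{E}[\![p_f]\!] \bigcup_{\delta \in S} \{\mathit{varof}(s, \delta) \mapsto (\mathit{evalsym}\ s\ p\ \sigma\ (\mathit{evaldelta}\ \Sigma\ \delta))\}$ (by Definition 5.2.1 (page 141))
= L.H.S. as required.

5.3.10 Evaluating a Path in a Symbolic State

To evaluate a path $\pi$ in a symbolic state $\Sigma$ we simply evaluate each symbolic predicate of $\pi$ in $\Sigma$.

Definition 5.3.3 ($\mathit{pathinstate} : \Sigma \to \mathit{path} \to \mathit{path}$)
\[ \mathit{pathinstate}\ \Sigma\ (\pi_1, \pi_2) = \Big(\bigcup_{\delta \in \pi_1} \mathit{evaldelta}\ \Sigma\ \delta,\ \bigcup_{\delta \in \pi_2} \mathit{evaldelta}\ \Sigma\ \delta\Big) \]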
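Definition 5.3.3 amounts to mapping evaldelta over both components of a path. The Python sketch below illustrates this under an assumed encoding (a variable is a string, a compound symbolic value is a pair (label, frozenset of values), bottom is None, and a symbolic state is a dict in which absent variables map to themselves). The sample state and path are illustrative.

```python
def evaldelta(state, delta):
    """Evaluate a symbolic value in a symbolic state (strict in both arguments)."""
    if state is None or delta is None:
        return None
    if isinstance(delta, str):
        return state.get(delta, delta)
    label, args = delta
    return (label, frozenset(evaldelta(state, d) for d in args))

def pathinstate(state, path):
    """Evaluate every symbolic predicate of a path in a symbolic state."""
    true_set, false_set = path
    return (frozenset(evaldelta(state, d) for d in true_set),
            frozenset(evaldelta(state, d) for d in false_set))

f2_k = ("f2", frozenset({"k"}))
state = {"c": f2_k}                              # the state after c := f2(k)
path = (frozenset({("f3", frozenset({"c"}))}),   # f3(c) taken true
        frozenset({("f1", frozenset({"i"}))}))   # f1(i) taken false

evaluated = pathinstate(state, path)
```

The predicate f3(c) becomes f3(f2(k)) because the state binds c, while f1(i) is unchanged because the state is the identity on i.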

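Lemma 5.3.9 Let $t_1$ and $t_2$ be simple symbolic execution trees and let $\pi_1 \in \mathit{dom}\ (\mathit{pfun}\ t_1)$ and $\pi_2 \in \mathit{dom}\ (\mathit{pfun}\ t_2)$ be paths such that
$$\mathit{diffs}(\pi_1,\ \mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \pi_2) = \emptyset$$
then for all variables $z$,
$$\mathit{evaldelta}\ (\mathit{pfun}\ t_1\ \pi_1)\ (\mathit{pfun}\ t_2\ \pi_2\ z) = \mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ t_2)\ (\mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \pi_2)\ z.$$

Proof: Induction on the depth of $t_2$.

Base case: $t_2$ is a leaf $\gamma$, in which case $\pi_2$ must be $(\emptyset, \emptyset)$. Therefore

LHS
$= \mathit{evaldelta}\ (\mathit{pfun}\ t_1\ \pi_1)\ (\mathit{pfun}\ t_2\ \pi_2\ z)$
$= \mathit{evaldelta}\ (\mathit{pfun}\ t_1\ \pi_1)\ (\gamma\ z)$ since $t_2$ is a leaf

and

RHS
$= \mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ t_2)\ (\mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \pi_2)\ z$
$= \mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \gamma)\ (\emptyset, \emptyset)\ z$ (by Definition 5.3.3 (page 154))
$= \mathit{pfun}\ (\mathit{updatestateinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \gamma)\ (\emptyset, \emptyset)\ z$ (by Definition 4.4.8 (page 130))
$= \mathit{updatestateinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \gamma\ z$ (by Definition 4.4.2 (page 125))
$= \mathit{evaldelta}\ (\mathit{pfun}\ t_1\ \pi_1)\ (\gamma\ z)$ (by Definition 4.4.7 (page 130))
as required.

Now let $t_2 = (t_L, r, t_R)$.

LHS
$= \mathit{evaldelta}\ (\mathit{pfun}\ t_1\ \pi_1)\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_2\ z)$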


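$= \mathit{evaldelta}\ (\mathit{pfun}\ t_1\ \pi_1)\ ((\mathit{addleft}(r, \mathit{pfun}\ t_L) \cup \mathit{addright}(r, \mathit{pfun}\ t_R))\ \pi_2\ z)$

Suppose first that $\pi_2 \in \mathit{dom}\ \mathit{addleft}(r, \mathit{pfun}\ t_L)$. Then

LHS
$= \mathit{evaldelta}\ (\mathit{pfun}\ t_1\ \pi_1)\ (\mathit{addleft}(r, \mathit{pfun}\ t_L)\ \pi_2\ z)$
$= \mathit{evaldelta}\ (\mathit{pfun}\ t_1\ \pi_1)\ ((\mathit{pfun}\ t_L)\ (\pi_2 - (\{r\}, \emptyset))\ z).$

Clearly, $\mathit{diffs}(\pi_1,\ \mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ (\pi_2 - (\{r\}, \emptyset))) = \emptyset$.

So, by induction hypothesis,

LHS
$= \mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ t_L)\ (\mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ (\pi_2 - (\{r\}, \emptyset)))\ z$
$= \mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ (t_L, r, t_R))\ (\mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \pi_2)\ z$ (by Definition 4.4.2 (page 125))
as required.

The proof when $\pi_2 \in \mathit{dom}\ \mathit{addright}(r, \mathit{pfun}\ t_R)$ is symmetrical and so is omitted.

Lemma 5.3.10 If $t$ is simple so is $\mathit{treeinstate}\ \gamma\ t$.

Proof: trivial induction on the depth of $t$.

Lemma 5.3.11 If $t$ is simple,
$$\pi \in \mathit{dom}\ (\mathit{pfun}\ t) \implies \mathit{pathinstate}\ \gamma\ \pi \in \mathit{dom}\ (\mathit{treeinstate}\ \gamma\ t).$$

Proof: trivial induction on the depth of $t$.

Lemma 5.3.12 Let $t_1$ and $t_2$ be simple symbolic execution trees and let $\pi_1 \in \mathit{dom}\ (\mathit{pfun}\ t_1)$ and $\pi_2 \in \mathit{dom}\ (\mathit{pfun}\ t_2)$ be paths such that
$$\mathit{diffs}(\pi_1,\ \mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \pi_2) = \emptyset$$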


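then for all variables $z$,
$$\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \pi_2 \in \mathit{dom}\ \mathit{simplify}\ (\mathit{sequence}\ t_1\ t_2)$$
and
$$\mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ t_2)\ (\mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \pi_2)\ z = \mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ t_1\ t_2))\ (\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ t_1\ \pi_1)\ \pi_2)\ z.$$

Proof: by induction on the depth of $t_1$.

Base case: $t_1$ is a leaf $\gamma$.

First we must show
$(\emptyset, \emptyset) \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ \gamma\ (\emptyset, \emptyset))\ \pi_2 \in \mathit{dom}\ \mathit{simplify}\ (\mathit{sequence}\ \gamma\ t_2)$,
i.e. $\mathit{pathinstate}\ \gamma\ \pi_2 \in \mathit{dom}\ \mathit{simplify}\ (\mathit{sequence}\ \gamma\ t_2)$,
i.e. $\mathit{pathinstate}\ \gamma\ \pi_2 \in \mathit{dom}\ \mathit{simplify}\ (\mathit{treeinstate}\ \gamma\ t_2)$,
which follows immediately from Lemma 5.3.10 (page 156) and Lemma 5.3.11 (page 156).

We must now show
$$\mathit{pfun}\ (\mathit{treeinstate}\ \gamma\ t_2)\ (\mathit{pathinstate}\ \gamma\ \pi_2)\ z = \mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ \gamma\ t_2))\ (\mathit{pathinstate}\ \gamma\ \pi_2)\ z.$$

RHS
$= \mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ \gamma\ t_2))\ (\mathit{pathinstate}\ \gamma\ \pi_2)\ z$
$= \mathit{pfun}\ (\mathit{simplify}\ (\mathit{treeinstate}\ \gamma\ t_2))\ (\mathit{pathinstate}\ \gamma\ \pi_2)\ z$ (by Definition 4.4.9 (page 131))
$= \mathit{pfun}\ (\mathit{treeinstate}\ \gamma\ t_2)\ (\mathit{pathinstate}\ \gamma\ \pi_2)\ z$ (by Lemma 5.3.10 (page 156))
= LHS.

This completes the proof of the base case.

Now let $t_1 = (t_L, r, t_R)$.

Let $\pi_1 \in \mathit{dom}\ (\mathit{pfun}\ (t_L, r, t_R))$ and $\pi_2 \in \mathit{dom}\ (\mathit{pfun}\ t_2)$ be paths such that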


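$\mathit{diffs}(\pi_1,\ \mathit{pathinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ \pi_2) = \emptyset$.

We must show that

1. $\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ \pi_2 \in \mathit{dom}\ \mathit{simplify}\ (\mathit{sequence}\ (t_L, r, t_R)\ t_2)$

and

2. for all variables $z$,
$\mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ t_2)\ (\mathit{pathinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ \pi_2)\ z$
$= \mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ (t_L, r, t_R)\ t_2))\ (\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ \pi_2)\ z.$

As in previous proofs, suppose first that $\pi_1 \in \mathit{dom}\ (\mathit{addleft}(r, \mathit{pfun}\ t_L))$.

Then $\pi_1 - (\{r\}, \emptyset) \in \mathit{dom}\ (\mathit{pfun}\ t_L)$.

So by induction hypothesis,
$$(\pi_1 - (\{r\}, \emptyset)) \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ t_L\ (\pi_1 - (\{r\}, \emptyset)))\ \pi_2 \in \mathit{dom}\ \mathit{simplify}\ (\mathit{sequence}\ t_L\ t_2).$$

Therefore

1. $\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ \pi_2 \in \mathit{dom}\ \mathit{simplify}\ (\mathit{sequence}\ (t_L, r, t_R)\ t_2)$ as required

and

2. by induction hypothesis, for all variables $z$,
$\mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ t_L\ (\pi_1 - (\{r\}, \emptyset)))\ t_2)\ (\mathit{pathinstate}\ (\mathit{pfun}\ t_L\ (\pi_1 - (\{r\}, \emptyset)))\ \pi_2)\ z$
$= \mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ t_L\ t_2))\ ((\pi_1 - (\{r\}, \emptyset)) \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ t_L\ (\pi_1 - (\{r\}, \emptyset)))\ \pi_2)\ z.$

Therefore for all variables $z$,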


$\mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ t_2)\ (\mathit{pathinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ \pi_2)\ z$
$= \mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ t_L\ (\pi_1 - (\{r\}, \emptyset)))\ t_2)\ (\mathit{pathinstate}\ (\mathit{pfun}\ t_L\ (\pi_1 - (\{r\}, \emptyset)))\ \pi_2)\ z$ (by Definition 4.4.2 (page 125))
$= \mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ t_L\ t_2))\ ((\pi_1 - (\{r\}, \emptyset)) \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ t_L\ (\pi_1 - (\{r\}, \emptyset)))\ \pi_2)\ z$ (by induction hypothesis)
$= \mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ (t_L, r, t_R)\ t_2))\ (\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ (t_L, r, t_R)\ \pi_1)\ \pi_2)\ z$ (by Definition 4.4.2 (page 125))
as required.

Again, we appeal to symmetry in order to justify the omission of the case where $\pi_1 \in \mathit{dom}\ (\mathit{addright}(r, \mathit{pfun}\ t_R))$.

5.3.11 The `One Path' Lemma

Lemma 5.3.13 Given a schema $s$, a program $p \in [s]$, a state $\sigma$, and a simple symbolic execution tree, $t$, all of whose non-leaf nodes are obtained from $s$, there is exactly one path $\pi$ of $t$ such that $\mathit{satisfy}\ t\ p\ \sigma\ \pi$.

Proof: trivial.

5.4 The Soundness and Completeness of $\mathcal{S}$

We are now in a position to prove the main theorem of this chapter: that $\mathcal{S}$ is sound and complete.

1. Complete, in the sense that for all loop free schemas $s$, for all programs $p$ in $[s]$, and for all states $\sigma$, there is exactly one path in the symbolic execution tree of $s$ that is `satisfied' in state $\sigma$ with respect to $p$ and the state at the end of this path `corresponds' to $\mathcal{M}[\![p]\!]\sigma$ (i.e. the normal denotational meaning of $p$ in $\sigma$).

and

2. sound, in the sense that for all paths $\pi$ of the symbolic execution tree of $s$ there is a $p$ in $[s]$ and a state $\sigma$ such that $\pi$ is `satisfied' in state $\sigma$ with respect to $p$ and the state


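at the end of the path again corresponds to $\mathcal{M}[\![p]\!]\sigma$.

Theorem 5.4.1 (Soundness and Completeness of $\mathcal{S}$) Given a loop-free schema $s$, a program $p \in [s]$, and a state $\sigma$, there exists exactly one path $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s]\!])$ such that $\mathit{satisfy}\ s\ p\ \sigma\ \pi$ and in this case, for all $z$,
$$\mathcal{M}[\![p]\!]\sigma\ z = \mathit{evalsym}\ s\ p\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s]\!]\ \pi\ z).$$
Conversely, for all $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s]\!])$ there exists a program $p \in [s]$ and a state $\sigma$ such that $\mathit{satisfy}\ s\ p\ \sigma\ \pi$ and for all $z$,
$$\mathcal{M}[\![p]\!]\sigma\ z = \mathit{evalsym}\ s\ p\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s]\!]\ \pi\ z).$$

Proof: by induction on the structure of $s$.

1. $s$ is skip.
By Definition 4.5.3 (page 132), $\mathcal{S}[\![\text{skip}]\!]$ is the symbolic execution tree consisting of the single leaf node $\lambda v.v$. Therefore
$$\mathit{pfun}\ \mathcal{S}[\![\text{skip}]\!] = \{(\emptyset, \emptyset) \mapsto \lambda v.v\}$$
For all $z$,
LHS
$= \mathcal{M}[\![\text{skip}]\!]\sigma\ z$
$= \sigma\ z$ (by definition of $\mathcal{M}$)
$= \mathit{evalsym}\ \text{skip}\ \text{skip}\ \sigma\ z$ (by Definition 5.2.1 (page 141))
$= \mathit{evalsym}\ \text{skip}\ \text{skip}\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![\text{skip}]\!]\ (\emptyset, \emptyset)\ z)$ (by Definition 4.5.3 (page 132))
= RHS.

2. $s$ is FAIL.
By Definition 4.5.2 (page 132), $\mathcal{S}[\![\text{FAIL}]\!]$ is the symbolic execution tree consisting of the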


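single leaf node $\bot$. Therefore
$$\mathit{pfun}\ \mathcal{S}[\![\text{FAIL}]\!] = \{(\emptyset, \emptyset) \mapsto \bot\}.$$
For all $z$,
LHS
$= \mathcal{M}[\![\text{FAIL}]\!]\sigma\ z$
$= \bot$ (by definition of $\mathcal{M}$)
$= \mathit{evalsym}\ \text{FAIL}\ \text{FAIL}\ \sigma\ \bot$ (by Definition 5.2.1 (page 141))
$= \mathit{evalsym}\ \text{FAIL}\ \text{FAIL}\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![\text{FAIL}]\!]\ (\emptyset, \emptyset)\ z)$ (by Definition 4.5.2 (page 132))
= RHS.

3. $s$ is a single assignment statement $x := f(V)$.
Then for all $p$ in $[s]$, $p$ is a single assignment of the form $x := E$ where $E$ is an expression with $\mathit{ref}\ E = V$.
$$\mathcal{M}[\![p]\!]\sigma\ z = \begin{cases} \sigma\ z & \text{if } z \neq x \\ \mathcal{E}[\![E]\!]\sigma & \text{if } z = x \end{cases}$$
By Definition 4.5.1 (page 131), $\mathcal{S}[\![x := f(V)]\!]$ is the symbolic execution tree consisting of the single leaf state:
$$\mathcal{S}[\![x := f(V)]\!] = \lambda z. \begin{cases} z & \text{if } z \neq x \\ f(V) & \text{if } z = x \end{cases}$$
So, by Definition 4.4.2 (page 125),
$$\mathit{pfun}\ \mathcal{S}[\![s]\!] = \{(\emptyset, \emptyset) \mapsto \mathcal{S}[\![x := f(V)]\!]\}.$$
Therefore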


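$\mathit{evalsym}\ s\ p\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s]\!]\ (\emptyset, \emptyset)\ z)$
$= \mathit{evalsym}\ s\ p\ \sigma\ (\mathcal{S}[\![x := f(V)]\!]\ z)$
$= \begin{cases} \mathit{evalsym}\ s\ p\ \sigma\ z & \text{if } z \neq x \\ \mathit{evalsym}\ s\ p\ \sigma\ f(V) & \text{if } z = x \end{cases}$ (by Definition 4.5.1 (page 131))
$= \begin{cases} \sigma\ z & \text{if } z \neq x \\ \mathcal{E}[\![E]\!] \bigcup_{\delta \in V} \mathit{varof}(s, \delta) \mapsto (\mathit{evalsym}\ s\ p\ \sigma\ \delta) & \text{if } z = x \end{cases}$ (by Definition 5.2.1 (page 141))
$= \begin{cases} \sigma\ z & \text{if } z \neq x \\ \mathcal{E}[\![E]\!] \bigcup_{v \in V} v \mapsto \sigma\ v & \text{if } z = x \end{cases}$ (by Definition 5.2.1 (page 141))
which by Lemma 3.4.1 (page 89) gives
$\begin{cases} \sigma\ z & \text{if } z \neq x \\ \mathcal{E}[\![E]\!]\sigma & \text{if } z = x \end{cases}$
$= \mathcal{M}[\![p]\!]\sigma\ z$ as required.

4. $s$ is the conditional statement schema: if $f(V)$ then $s_1$ else $s_2$.
Assume the theorem is true for schemas $s_1$ and $s_2$.
Let $p \in [\text{if } f(V) \text{ then } s_1 \text{ else } s_2]$ and let $\sigma$ be a state.
Since $p \in [\text{if } f(V) \text{ then } s_1 \text{ else } s_2]$, $p$ is of the form $[\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]$, where
(a) $\mathit{ref}\ E = V$,
(b) $p_1 \in [s_1]$ and
(c) $p_2 \in [s_2]$.

We must show that there exists exactly one path $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!])$ such that $\mathit{satisfy}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ \pi$ and in this case, for all $z$,
$\mathcal{M}[\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\sigma\ z$
$= \mathit{evalsym}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ \pi\ z).$

By induction hypothesis, for all programs $p \in [s_1]$ and for all states $\sigma$ there exists exactly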


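one path $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!])$ such that $\mathit{satisfy}\ s_1\ p\ \sigma\ \pi$ and in this case, for all $z$,
$$\mathcal{M}[\![p]\!]\sigma\ z = \mathit{evalsym}\ s_1\ p\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi\ z).$$
Similarly, for all programs $p \in [s_2]$ and for all states $\sigma$ there exists exactly one path $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s_2]\!])$ such that $\mathit{satisfy}\ s_2\ p\ \sigma\ \pi$ and in this case, for all $z$,
$$\mathcal{M}[\![p]\!]\sigma\ z = \mathit{evalsym}\ s_2\ p\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_2]\!]\ \pi\ z).$$

Let $\sigma$ be a state for which $[\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]$ is defined. If $\mathcal{E}[\![E]\!]\sigma = \text{true}$, then
$$\mathcal{M}[\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\sigma = \mathcal{M}[\![p_1]\!]\sigma.$$
By induction hypothesis therefore, there exists exactly one path $(\pi_t, \pi_f) \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!])$ such that $\mathit{satisfy}\ s_1\ p_1\ \sigma\ (\pi_t, \pi_f)$ and in this case, for all $z$,
$$\mathcal{M}[\![p_1]\!]\sigma\ z = \mathit{evalsym}\ s_1\ p_1\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ (\pi_t, \pi_f)\ z).$$
Clearly, $f(V) \notin \pi_f$ since $\mathcal{E}[\![E]\!]\sigma = \text{true}$, so by Lemma 5.3.4 (page 148),
$$(\pi_t - \{f(V)\}, \pi_f) \in \mathit{dom}\ (\mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!]))$$
but, by Definition 4.5.4 (page 132),
$\mathit{pfun}\ (\mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!])$
$= \mathit{pfun}\ (\mathit{simplify}\ (\mathcal{S}[\![s_1]\!], f(V), \mathcal{S}[\![s_2]\!]))$
$= \mathit{pfun}\ (\mathit{prune}(\emptyset, \emptyset)\ (\mathcal{S}[\![s_1]\!], f(V), \mathcal{S}[\![s_2]\!]))$ (by Definition 4.4.5 (page 129))
$= \mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!],\ f(V),\ \mathit{prune}(\emptyset, \{f(V)\})\ \mathcal{S}[\![s_2]\!])$ (by Definition 4.4.4 (page 128))
$= \mathit{addleft}(f(V), \mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!])) \cup \mathit{addright}(f(V), \mathit{pfun}\ (\mathit{prune}(\emptyset, \{f(V)\})\ \mathcal{S}[\![s_2]\!]))$ (by Definition 4.4.2 (page 125)).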


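Therefore $(\pi_t \cup \{f(V)\}, \pi_f) \in \mathit{dom}\ (\mathit{pfun}\ (\mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]))$
and
$\mathit{satisfy}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\pi_t \cup \{f(V)\}, \pi_f)$.

By Definition 4.5.4 (page 132),
$\mathit{evalsym}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ (\pi_t \cup \{f(V)\}, \pi_f)\ z)$
$= \mathit{evalsym}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\mathit{pfun}\ (\mathit{simplify}\ (\mathcal{S}[\![s_1]\!], f(V), \mathcal{S}[\![s_2]\!]))\ (\pi_t \cup \{f(V)\}, \pi_f)\ z)$
$= \mathit{evalsym}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!],\ f(V),\ \mathit{prune}(\emptyset, \{f(V)\})\ \mathcal{S}[\![s_2]\!])\ (\pi_t \cup \{f(V)\}, \pi_f)\ z)$ (by Definition 4.4.5 (page 129))
$= \mathit{evalsym}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!])\ (\pi_t - \{f(V)\}, \pi_f)\ z)$ (by Definition 4.4.2 (page 125))
$= \mathit{evalsym}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ (\pi_t, \pi_f)\ z)$ (by Lemma 5.3.4 (page 148), using the fact that $f(V) \notin \pi_f$)
$= \mathit{evalsym}\ [\![s_1]\!]\ [\![p_1]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ (\pi_t, \pi_f)\ z)$
$= \mathcal{M}[\![p_1]\!]\sigma\ z$ (by induction hypothesis)
$= \mathcal{M}[\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\sigma\ z$ since $\mathcal{E}[\![E]\!]\sigma = \text{true}$, as required.

Similarly, if $\mathcal{E}[\![E]\!]\sigma = \text{false}$ (by symmetry).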


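We now prove the converse, namely,
for all $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!])$
there exists a program $p = [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!] \in [\text{if } f(V) \text{ then } s_1 \text{ else } s_2]$,
and a state $\sigma$ such that
$\mathit{satisfy}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ \pi$
and
for all $z$,
$\mathcal{M}[\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\sigma\ z$
$= \mathit{evalsym}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ \pi\ z).$

Proof:
Let $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!])$.
Then by Definition 4.5.4 (page 132),
$\pi \in \mathit{dom}\ (\mathit{pfun}\ (\mathit{simplify}\ (\mathcal{S}[\![s_1]\!], f(V), \mathcal{S}[\![s_2]\!])))$
$= \mathit{dom}\ (\mathit{pfun}\ (\mathit{prune}(\emptyset, \emptyset)\ (\mathcal{S}[\![s_1]\!], f(V), \mathcal{S}[\![s_2]\!])))$ (by Definition 4.4.5 (page 129))
$= \mathit{dom}\ (\mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!],\ f(V),\ \mathit{prune}(\emptyset, \{f(V)\})\ \mathcal{S}[\![s_2]\!]))$ (by Definition 4.4.4 (page 128))
$= \mathit{dom}\ (\mathit{addleft}(f(V), \mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!])) \cup \mathit{addright}(f(V), \mathit{pfun}\ (\mathit{prune}(\emptyset, \{f(V)\})\ \mathcal{S}[\![s_2]\!])))$ (by Definition 4.4.2 (page 125)).

Suppose first that
$\pi \in \mathit{dom}\ (\mathit{addleft}(f(V), \mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!])))$.

Case 1: $f(V)$ is not a node of $\mathcal{S}[\![s_1]\!]$.
In which case $\pi \in \mathit{dom}\ (\mathit{addleft}(f(V), \mathit{pfun}\ (\mathcal{S}[\![s_1]\!])))$.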


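So $\pi = (\pi'_t \cup \{f(V)\}, \pi'_f)$ for some path $\pi' \in \mathit{dom}\ (\mathit{pfun}\ (\mathcal{S}[\![s_1]\!]))$ that does not contain $f(V)$. By induction hypothesis, therefore, there exists a program $p_1 \in [s_1]$ and a state $\sigma$ such that $\mathit{satisfy}\ s_1\ p_1\ \sigma\ \pi'$
and
for all $z$,
$$\mathcal{M}[\![p_1]\!]\sigma\ z = \mathit{evalsym}\ s_1\ p_1\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi'\ z).$$
By Assumption 3.4.1, there is an expression $E$ which references the variables $V$ such that $\mathcal{E}[\![E]\!]\sigma = \text{true}$. Therefore
$\mathit{satisfy}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ \pi$
and
for all $z$,
$\mathcal{M}[\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\sigma\ z$
$= \mathit{evalsym}\ [\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ [\![\text{if } E \text{ then } p_1 \text{ else } p_2]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![\text{if } f(V) \text{ then } s_1 \text{ else } s_2]\!]\ \pi\ z).$

Case 2: $f(V)$ is a node of $\mathcal{S}[\![s_1]\!]$.
By induction hypothesis and Lemma 5.3.2 (page 145),
for all paths $\pi'$ in $\mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!])$ there exists a program $p_1 \in [s_1]$ and a state $\sigma$ such that $\mathit{satisfy}\ s_1\ p_1\ \sigma\ \pi'$
and
for all $z$,
$$\mathcal{M}[\![p_1]\!]\sigma\ z = \mathit{evalsym}\ s_1\ p_1\ \sigma\ (\mathit{pfun}\ (\mathit{prune}(\{f(V)\}, \emptyset)\ \mathcal{S}[\![s_1]\!])\ \pi'\ z).$$
Again, by Assumption 3.4.1, choose an expression $E$ which references the variables $V$ such that $\mathcal{E}[\![E]\!]\sigma = \text{true}$ and the result follows immediately.

Exactly as in Lemma 5.3.4 (page 148), we appeal to symmetry in order to allow ourselves the luxury of omitting the proof of the case when
$\pi \in \mathit{dom}\ (\mathit{addright}(f(V), \mathit{pfun}\ (\mathit{prune}(\emptyset, \{f(V)\})\ \mathcal{S}[\![s_2]\!])))$.

This completes the proof for conditionals.

5. Let $s$ be the sequence $[\![s_1; s_2]\!]$. We first must prove that given a program $p_1; p_2$ in $[s_1; s_2]$ and a state $\sigma$, there exists exactly one path $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s_1; s_2]\!])$ such that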


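$\mathit{satisfy}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ \pi$ and in this case, for all $z$,
$$\mathcal{M}[\![p_1; p_2]\!]\sigma\ z = \mathit{evalsym}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1; s_2]\!]\ \pi\ z),$$
i.e. by definition of $\mathcal{M}$ and Definition 4.5.5 (page 132), we must show that there exists a path $\pi$ in $\mathit{dom}\ (\mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ \mathcal{S}[\![s_1]\!]\ \mathcal{S}[\![s_2]\!])))$ such that
$\mathit{satisfy}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ \pi$
and in this case, for all $z$,
$$\mathcal{M}[\![p_2]\!](\mathcal{M}[\![p_1]\!]\sigma)\ z = \mathit{evalsym}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ (\mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ \mathcal{S}[\![s_1]\!]\ \mathcal{S}[\![s_2]\!]))\ \pi\ z).$$

Proof: By induction hypothesis, there exists exactly one path $\pi_1$ in $\mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!])$ such that $\mathit{satisfy}\ [\![s_1]\!]\ [\![p_1]\!]\ \sigma\ \pi_1$ and in this case, for all $z$,
$$\mathcal{M}[\![p_1]\!]\sigma\ z = \mathit{evalsym}\ [\![s_1]\!]\ [\![p_1]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1\ z),$$
and by induction hypothesis there exists exactly one path $\pi_2$ in $\mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s_2]\!])$ such that
$\mathit{satisfy}\ [\![s_2]\!]\ [\![p_2]\!]\ (\mathcal{M}[\![p_1]\!]\sigma)\ \pi_2$
and for all $z$,
$$\mathcal{M}[\![p_2]\!](\mathcal{M}[\![p_1]\!]\sigma)\ z = \mathit{evalsym}\ [\![s_2]\!]\ [\![p_2]\!]\ (\mathcal{M}[\![p_1]\!]\sigma)\ (\mathit{pfun}\ \mathcal{S}[\![s_2]\!]\ \pi_2\ z).$$
Therefore there exists a path $\pi_1$ in $\mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!])$ such that
$\mathcal{M}[\![p_2]\!](\mathcal{M}[\![p_1]\!]\sigma)\ z$
$= \mathit{evalsym}\ [\![s_2]\!]\ [\![p_2]\!]\ (\lambda x.\,\mathit{evalsym}\ [\![s_1]\!]\ [\![p_1]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1\ x))\ (\mathit{pfun}\ \mathcal{S}[\![s_2]\!]\ \pi_2\ z).$

Therefore $\mathcal{M}[\![p_2]\!](\mathcal{M}[\![p_1]\!]\sigma)\ z$ =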


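$\mathit{evalsym}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ (\lambda x.\,\mathit{evalsym}\ [\![s_1]\!]\ [\![p_1]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1\ x))\ (\mathit{pfun}\ \mathcal{S}[\![s_2]\!]\ \pi_2\ z)$
(since $[\![s_1; s_2]\!]$ is a valid schema and $[\![p_1; p_2]\!]$ is a valid program)
$= \mathit{evalsym}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ (\mathit{evaldelta}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ (\mathit{pfun}\ \mathcal{S}[\![s_2]\!]\ \pi_2\ z))$
(by Lemma 5.3.8 (page 153)).

By Lemma 5.3.8 (page 153),
$$\mathit{diffs}(\pi_1,\ \mathit{pathinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \pi_2) = \emptyset.$$
Therefore, by Lemma 5.3.9 (page 155),
$\mathcal{M}[\![p_2]\!](\mathcal{M}[\![p_1]\!]\sigma)\ z$
$= \mathit{evalsym}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ (\mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \mathcal{S}[\![s_2]\!])\ (\mathit{pathinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \pi_2)\ z).$

But by Lemma 5.3.12 (page 156),
$$\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \pi_2 \in \mathit{dom}\ \mathit{simplify}\ (\mathit{sequence}\ \mathcal{S}[\![s_1]\!]\ \mathcal{S}[\![s_2]\!])$$
and
$\mathit{pfun}\ (\mathit{treeinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \mathcal{S}[\![s_2]\!])\ (\mathit{pathinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \pi_2)\ z$
$= \mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ \mathcal{S}[\![s_1]\!]\ \mathcal{S}[\![s_2]\!]))\ (\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \pi_2)\ z.$

Also by Lemma 5.3.8 (page 153),
$\mathit{satisfy}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ (\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \pi_2)$.

Therefore
$\mathcal{M}[\![p_2]\!](\mathcal{M}[\![p_1]\!]\sigma)\ z$
$= \mathit{evalsym}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ (\mathit{pfun}\ (\mathit{simplify}\ (\mathit{sequence}\ \mathcal{S}[\![s_1]\!]\ \mathcal{S}[\![s_2]\!]))\ (\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \pi_2)\ z)$
$= \mathit{evalsym}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ (\mathit{pfun}\ (\mathcal{S}[\![s_1; s_2]\!])\ (\pi_1 \sqcup \mathit{pathinstate}\ (\mathit{pfun}\ \mathcal{S}[\![s_1]\!]\ \pi_1)\ \pi_2)\ z)$
by Definition 4.5.5 (page 132), as required.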


We must now prove the converse. Namely, for all $\pi \in \mathit{dom}\ (\mathit{pfun}\ \mathcal{S}[\![s_1; s_2]\!])$ there exists a program $p_1; p_2 \in [s_1; s_2]$ and a state $\sigma$ such that
$\mathit{satisfy}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ \pi$
and
for all $z$,
$$\mathcal{M}[\![p_1; p_2]\!]\sigma\ z = \mathit{evalsym}\ [\![s_1; s_2]\!]\ [\![p_1; p_2]\!]\ \sigma\ (\mathit{pfun}\ \mathcal{S}[\![s_1; s_2]\!]\ \pi\ z).$$

Proof: Corollary 6.3.1 (page 185) states: for any finite set of predicate symbolic values $\delta_i$ obtained from $s$ and any set of values $v_i$ of the right type, there exists a state $\sigma$ and a program $p \in [s]$ such that $\mathit{evalsym}\ s\ p\ \sigma\ \delta_i = v_i$. The result then follows immediately from the previous part (of which this is the converse).

5.5 Conclusion

The theory in this chapter has led to a proof that the semantics $\mathcal{S}$ of loop-free schemas, introduced in Chapter 4, is both sound and complete.

It is this theorem which provides the essential semantic interpretation that is required in order to justify the algorithms that will be given in Chapter 6 for computing the various dataflow dependencies (Chapter 3) of loop-free schemas.




Chapter 6

Data and Control Dependence in Symbolic Execution Trees

6.1 Introduction

Algorithms for computing DTLD, DTVD, DLD and DVD of loop-free schemas are given. For every loop-free schema $s$, these algorithms are defined in terms of its symbolic execution tree, $\mathcal{S}[\![s]\!]$.

The fact that $\mathcal{S}[\![s]\!]$ properly characterises $[s]$ enables us to prove that the DTLD and DTVD algorithms for loop-free schemas are correct provided that the expression syntax of the underlying programming language is sufficiently rich.

The algorithms for computing DLD and DVD are not proved correct.

In order to compute each of the four dataflow dependencies of a loop-free schema $s$, two different versions of data dependence and four different versions of control dependence are defined. These forms of data and control dependence all operate on symbolic execution trees. Each of DTLD, DTVD, DLD and DVD is computed by applying the appropriate version of data and control dependence to $\mathcal{S}[\![s]\!]$.


        Data Dependence used                          Control Dependence used
DVD     Vdatadepends (Definition 6.4.1, page 194)     Vcontrols (Definition 6.5.2, page 196)
DTVD    Vdatadepends                                  VTcontrols (Definition 6.4.3, page 195)
DLD     Ldatadepends (Definition 6.2.1, page 173)     Lcontrols (Definition 6.5.1, page 196)
DTLD    Ldatadepends                                  LTcontrols (Definition 6.2.3, page 176)

In Section 6.6, we show that DLD and DTLD can be thought of as special cases of DVD and DTVD respectively. DTLD can be computed by treating the labels as variables, computing the DTVD, and then intersecting the final result with the set of all labels.

This means that, in effect, the `label' and `variable' versions of each dependence above can be combined into a single dependence. This simplification implies that, in fact, just one form of data dependence and two forms of control dependence [1] are all that is required in order to compute the four dataflow dependencies introduced in Chapter 3, when applied to loop-free schemas.

        Data Dependence used                          Control Dependence used
DVD     datadepends (Definition 6.6.2, page 200)      controls (Definition 6.6.4, page 200)
DLD     datadepends                                   controls
DTVD    datadepends                                   Tcontrols (Definition 6.6.3, page 200)
DTLD    datadepends                                   Tcontrols

We now fully define and prove correct the algorithm for one of the dataflow dependencies, DTLD.

6.2 Computing DTLD for Loop-free Schemas

As has just been stated, we require a version of data dependence, which we call label data dependence, and a version of control dependence, which we call label terminating control dependence, both operations on symbolic execution trees. These are now defined formally and examples given.

[1] The non-terminating version (controls) that we call Control Dependence and the terminating version (Tcontrols) that we call Terminating Control Dependence.


6.2.1 Label Data Dependence

Let $t$ be a symbolic execution tree. Intuitively, in each leaf symbolic state, the symbolic value of each variable corresponds to the sequence of assignments that would have to be executed to reach that final state. The set of labels upon which variable $v$ is label dependent is precisely the set of labels corresponding to these assignments. Formally,

Variable $v$ data depends on label $l$ if and only if there exists $\sigma$ in the range of $\mathit{pfun}(t)$ such that $l \in \mathit{labels}(\sigma\ v)$. (See Definition 6.2.2 (page 173) for the definition of $\mathit{labels}$.)

In the example in Figure 4.6 (page 134), $c$ is dataflow label dependent on the set $\{f_4, f_5\}$ and $i$ on $\{f_6\}$. We write $\mathit{Ldatadepends}(t)(x)$ for the set of labels upon which $x$ is label data dependent in $t$.

Definition 6.2.1 (Label Data Dependence)
$$\mathit{Ldatadepends}(t)(x) = \bigcup_{\sigma \in \mathit{range}(\mathit{pfun}(t))} \mathit{labels}(\sigma\ x)$$

The function $\mathit{labels}$, which returns the set of labels mentioned in a symbolic value, is formally defined for the three types of symbolic value as follows:

Definition 6.2.2 (labels)
$$\mathit{labels}(v) = \emptyset$$
$$\mathit{labels}(f(S)) = \{f\} \cup \bigcup_{d \in S} \mathit{labels}(d)$$
$$\mathit{labels}(\bot) = \emptyset$$

6.2.2 Example of Label Data Dependence

Consider, again, the symbolic execution tree in Figure 6.1 (page 174). Its path function is given in Figure 6.2 (page 175). The label data dependence is given in Figure 6.3 (page 175).
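Definitions 6.2.1 and 6.2.2 can be sketched in Python as follows. This is a hedged illustration and not code from the thesis: symbolic values are assumed to be encoded as variable names or `(label, args)` tuples, bottom is represented by `None`, a leaf symbolic state is a dictionary in which unlisted variables map to themselves, and `FINAL_STATES` is a hand transcription of the final states listed in the path function of Figure 6.2.

```python
BOTTOM = None   # assumed stand-in for the bottom symbolic value/state

def labels(value):
    """Set of labels mentioned in a symbolic value (Definition 6.2.2)."""
    if value is BOTTOM or isinstance(value, str):   # bottom or a variable
        return set()
    label, args = value                             # an application f(S)
    return {label}.union(*(labels(arg) for arg in args))

def ldatadepends(final_states, var):
    """Union of labels(state var) over all final states (Definition 6.2.1)."""
    result = set()
    for state in final_states:
        value = BOTTOM if state is BOTTOM else state.get(var, var)
        result |= labels(value)
    return result

# Symbolic values used by the final states of Figure 6.2 (transcribed by hand).
FXY = ("f", ("x", "y"))
GY = ("g", ("y",))
FINAL_STATES = [
    BOTTOM,
    {"x": ("f", (FXY, "y"))},
    BOTTOM,
    {"x": FXY, "y": GY},
    {"x": FXY},
    BOTTOM,
    {"y": ("g", (GY,))},
    {"y": GY},
    {},                       # the identity state
]
print(ldatadepends(FINAL_STATES, "x"), ldatadepends(FINAL_STATES, "y"))
```

Collecting the labels of every final symbolic value of `x` yields only `f`, and of `y` only `g`, which agrees with the data dependence table of Figure 6.3.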


To find the labels upon which variable $v$ label data depends, simply collect together the labels of all the symbolic final values of $v$.

[Figure 6.1: Symbolic Execution Tree. The tree is rooted at the predicate b1(x,y); its internal nodes are the predicates b2(x), b1(f(x,y),y), b1(x,g(y)), b1(x,g(g(y))), b1(f(x,y),g(y)), b1(f(f(x,y),y),y) and b2(f(x,y)), and its leaves are the final states listed in Figure 6.2.]

If $v$ is label data dependent on $l$ in $s$, then for some program $p$ in $[s]$, in a given state $\sigma$ there are an infinite number of possible values for $v$ that can be obtained by replacing the expression at $l$ by another and executing the resulting program in the same state $\sigma$.

6.2.3 Label Terminating Control Dependence

Intuitively and informally, $v$ is label terminating control dependent on label $l$ in schema $s$ if and only if there is a predicate that depends on $l$, which `affects' the final value of $v$.


True Symbolic Predicates                                             False Symbolic Predicates        Final State
{b1(x,y), b2(x), b1(f(x,y),y), b2(f(x,y)), b1(f(f(x,y),y),y)}        ∅                                ⊥
{b1(x,y), b2(x), b1(f(x,y),y), b2(f(x,y))}                           {b1(f(f(x,y),y),y)}              x ↦ f(f(x,y),y)
{b1(x,y), b2(x), b1(f(x,y),y), b1(f(x,y),g(y))}                      {b2(f(x,y))}                     ⊥
{b1(x,y), b2(x), b1(f(x,y),y)}                                       {b2(f(x,y)), b1(f(x,y),g(y))}    x ↦ f(x,y), y ↦ g(y)
{b1(x,y), b2(x)}                                                     {b1(f(x,y),g(y))}                x ↦ f(x,y)
{b1(x,y), b1(x,g(y)), b1(x,g(g(y)))}                                 {b2(x)}                          ⊥
{b1(x,y), b1(x,g(y))}                                                {b2(x), b1(x,g(g(y)))}           y ↦ g(g(y))
{b1(x,y)}                                                            {b2(x), b1(x,g(y))}              y ↦ g(y)
∅                                                                    {b1(x,y)}                        id

Figure 6.2: The Path Function of the symbolic execution tree in Figure 6.1 (page 174)

Variable    Labels on which it data depends
x           f
y           g

Figure 6.3: Data Dependence Table

If $v$ is Label Terminating Control Dependent on $l$ in schema $s$ but not Label Data Dependent on $l$, then given any program $p$ in $[s]$, in a given state $\sigma$, there are only a finite number of possible values for $v$ that can be obtained by replacing the expression at $l$ by another and executing the resulting program in the same state $\sigma$. This is because changing the expression at $l$ does not introduce any new potential final symbolic states, but only potentially affects which one is reached.

Informally, to calculate the set of labels in a symbolic execution tree upon which variable $v$ is label terminating control dependent the following must be done:

For each pair of non-bottom states where $v$ has a different final value, work out the set of differences (it cannot be empty, by the `disagreement lemma': Lemma 5.3.5 (page 151)) of the two paths that lead to these two final states, i.e. the predicate symbolic values that are true in one path and false in the other. Label $l$ is included in the set of labels upon which $v$ is label terminating control dependent if and only if $l$ occurs as a label in all the symbolic values in this set of differences.

Let $t$ be a symbolic execution tree.
Variable $x$ label terminating control depends on label $l$ in $t$ if and only if there exist two paths $\pi$ and $\pi'$ in the domain of $\mathit{pfun}(t)$ with $\bot \neq \mathit{pfun}(t)(\pi)\,x \neq \mathit{pfun}(t)(\pi')\,x \neq \bot$ such that for all $e \in \mathit{diffs}(\pi, \pi')$, $l \in \mathit{labels}(e)$. (See Definition 5.2.4 (page 143) for the definition of $\mathit{diffs}$.)

We write $\mathit{LTcontrols}(t)(v)$ for the set of labels upon which $v$ is label terminating control dependent in $t$.

Definition 6.2.3 (LTcontrols)
$$\mathit{LTcontrols}(t)(v) = \bigcup_{\{(\pi, \pi') \,\mid\, \bot \neq \mathit{pfun}(t)(\pi)\,v \neq \mathit{pfun}(t)(\pi')\,v \neq \bot\}} \left( \bigcap_{\delta \in \mathit{diffs}(\pi, \pi')} \mathit{labels}\ \delta \right)$$

6.2.4 Examples of Label Terminating Control Dependence

Consider the program $p_{6.4}$, and its corresponding schema $s_{6.4}$, in Figure 6.4 (page 177). The symbolic execution tree corresponding to $s_{6.4}$, $\mathcal{S}[\![s_{6.4}]\!]$, has structure as shown in Figure 6.5 (page 178). $\mathcal{S}[\![s_{6.4}]\!]$ has four paths. These are shown in Figure 6.6 (page 178). There are four pairs of paths in $\mathcal{S}[\![s_{6.4}]\!]$ with different, non-$\bot$, final values for $y$. These are shown in Figure 6.7 (page 178). By Definition 6.2.3 (page 176), for each pair of paths $(\pi_i, \pi_j)$ with different final values for $y$, the value of
$$\bigcap_{\delta \in \mathit{diffs}(\pi_i, \pi_j)} \mathit{labels}\ \delta$$
must be calculated. These values are shown in Figure 6.8 (page 179). By Definition 6.2.3 (page 176), the set of labels upon which variable $y$ is label terminating control dependent is the union of the sets in Figure 6.8 (page 179), namely
$$\{b_2\} \cup \emptyset \cup \{b_2\} \cup \emptyset$$
which is $\{b_2\}$. This shows that the predicate $b_2(z)$ in $s_{6.4}$ has an effect on the final value of $y$ in $s_{6.4}$, so $b_2$ controls $y$. It also shows that the other predicate in $s_{6.4}$, $b_1(x, z)$, has no effect on the final value for $y$. This also shows that in $s_{6.4}$, the final value of $y$ is not controlled by the initial value of $x$ but only by the initial value of $z$.
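The calculation for $y$ can be reproduced with a small Python sketch of Definition 6.2.3. Several things here are assumptions made for illustration: the encoding of symbolic values as `(label, args)` tuples, the use of `None` for bottom, and in particular the reading of `diffs` as `the predicates true in one path and false in the other', which follows the informal description in Section 6.2.3 rather than Definition 5.2.4 itself (not reproduced in this chapter). `PFUN_S64` is a hand transcription of the four paths of Figure 6.6 with the leaf states of Figure 6.5.

```python
from itertools import combinations

BOTTOM = None   # assumed stand-in for a bottom final state

def labels(value):
    """Labels mentioned in a symbolic value (Definition 6.2.2)."""
    if value is BOTTOM or isinstance(value, str):
        return set()
    label, args = value
    return {label}.union(*(labels(arg) for arg in args))

def diffs(path_a, path_b):
    """Predicates that are true in one path and false in the other."""
    return (path_a[0] & path_b[1]) | (path_a[1] & path_b[0])

def ltcontrols(path_function, var):
    """Labels on which var is label terminating control dependent."""
    result = set()
    for (pa, sa), (pb, sb) in combinations(path_function.items(), 2):
        if sa is BOTTOM or sb is BOTTOM:
            continue                                 # only non-bottom states
        if sa.get(var, var) == sb.get(var, var):
            continue                                 # same final value
        differences = diffs(pa, pb)
        if differences:   # non-empty for distinct paths of the same tree
            result |= set.intersection(*(labels(d) for d in differences))
    return result

A = ("b1", ("x", "z"))
B = ("b2", ("z",))
F2, F4, F5 = ("f2", ("k",)), ("f4", ()), ("f5", ())
PFUN_S64 = {
    (frozenset({A, B}), frozenset()): {"v": F2, "y": F4},
    (frozenset({A}), frozenset({B})): {"v": F2, "y": F5},
    (frozenset({B}), frozenset({A})): {"y": F4},
    (frozenset(), frozenset({A, B})): {"y": F5},
}
print(ltcontrols(PFUN_S64, "y"))
```

For `y` the sketch prints a set containing only `b2`, matching the union computed from Figure 6.8. Taking the intersection over each difference set is what discards the pairs whose paths disagree on both predicates.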


    p6.4:                 s6.4:
    if x<z                if b1(x, z)
    then v:=k;            then v := f2(k);
    if z<3                if b2(z)
    then y:=7             then y := f4()
    else y:=8             else y := f5()

Figure 6.4: $p_{6.4}$ and $s_{6.4}$

Now consider the variable $v$ in schema $s_{6.4}$ in Figure 6.4 (page 177). There are four pairs of paths in $\mathcal{S}[\![s_{6.4}]\!]$ with different, non-$\bot$, final values for $v$. These are shown in Figure 6.9 (page 179). Again, by Definition 6.2.3 (page 176), for each pair of paths $(\pi_i, \pi_j)$ with different non-$\bot$ final values for $v$, the value of
$$\bigcap_{\delta \in \mathit{diffs}(\pi_i, \pi_j)} \mathit{labels}\ \delta$$
must be calculated. These values are shown in Figure 6.10 (page 179). Taking the pairs in the order of Figure 6.9, the differences $\{A\}$ contribute $\mathit{labels}(A) = \{b_1\}$ and the differences $\{A, B\}$ contribute $\mathit{labels}(A) \cap \mathit{labels}(B) = \emptyset$. By Definition 6.2.3 (page 176), the set of labels upon which variable $v$ is label terminating control dependent is therefore
$$\{b_1\} \cup \emptyset \cup \{b_1\} \cup \emptyset$$
which is $\{b_1\}$. This shows that the predicate $b_1(x, z)$ in $s_{6.4}$ has an effect on the final value of $v$ in $s_{6.4}$, so $b_1$ controls $v$, whereas $b_2(z)$ has no effect on the final value of $v$.

This also shows that in $s_{6.4}$, the final value of $v$ is controlled by the initial values of $x$ and $z$.

    (A) b1(x,z)
        true:  (B) b2(z)
                   true:  v -> f2(k), y -> f4()
                   false: v -> f2(k), y -> f5()
        false: (B) b2(z)
                   true:  y -> f4()
                   false: y -> f5()

Figure 6.5: The symbolic execution tree, $\mathcal{S}[\![s_{6.4}]\!]$

Path    True Predicates    False Predicates
π1      {A, B}             {}
π2      {A}                {B}
π3      {B}                {A}
π4      {}                 {A, B}

Figure 6.6: The four paths of $\mathcal{S}[\![s_{6.4}]\!]$

Pair of paths    Differences
π1, π2           {B}
π1, π4           {A, B}
π3, π4           {B}
π2, π3           {A, B}

Figure 6.7: The four pairs of paths of $\mathcal{S}[\![s_{6.4}]\!]$ with different non-$\bot$ final values for $y$

$\bigcap_{\delta \in \mathit{diffs}(\pi_1, \pi_2)} \mathit{labels}\ \delta = \mathit{labels}(B) = \{b_2\}$
$\bigcap_{\delta \in \mathit{diffs}(\pi_1, \pi_4)} \mathit{labels}\ \delta = \mathit{labels}(A) \cap \mathit{labels}(B) = \emptyset$
$\bigcap_{\delta \in \mathit{diffs}(\pi_3, \pi_4)} \mathit{labels}\ \delta = \mathit{labels}(B) = \{b_2\}$
$\bigcap_{\delta \in \mathit{diffs}(\pi_2, \pi_3)} \mathit{labels}\ \delta = \mathit{labels}(A) \cap \mathit{labels}(B) = \emptyset$

Figure 6.8: The differences of each pair of paths of $\mathcal{S}[\![s_{6.4}]\!]$ with different non-$\bot$ final values for $y$

Pair of paths    Differences
π1, π3           {A}
π1, π4           {A, B}
π2, π4           {A}
π2, π3           {A, B}

Figure 6.9: The four pairs of paths of $\mathcal{S}[\![s_{6.4}]\!]$ with different non-$\bot$ final values for $v$

An Example with Non-termination

Consider the program $p_{6.11}$, and its corresponding schema $s_{6.11}$, in Figure 6.11 (page 180). The symbolic execution tree corresponding to $s_{6.11}$, $\mathcal{S}[\![s_{6.11}]\!]$, has structure as shown in Figure 6.12 (page 181). There are no paths in $\mathcal{S}[\![s_{6.11}]\!]$ with different non-$\bot$ final values for $y$. This means there are no labels upon which $y$ is label terminating control dependent. There are no terminating programs in $[s_{6.11}]$ where the choice of which `direction is taken' can make any difference to the final value of $y$. Put another way, two programs in the same dataflow equivalence class differing only at the expression labelled $b_1$ cannot both terminate with different [3] final values for $y$. Similarly for $b_2$.

For variable $v$, on the other hand, there are two paths in $\mathcal{S}[\![s_{6.11}]\!]$ with different final values for $v$. These are $\pi_1$ and $\pi_3$. (Labelling the paths of $\mathcal{S}[\![s_{6.11}]\!]$ from left to right $\pi_1, \ldots, \pi_4$.) Using Definition 6.2.3 (page 176), it can be seen that the set of labels upon which variable $v$ is label terminating control dependent is
$$\bigcap_{\delta \in \mathit{diffs}(\pi_1, \pi_3)} \mathit{labels}\ \delta = \mathit{labels}(A) = \{b_1\}$$
In this case, therefore, there are two programs in $[s_{6.11}]$ differing only at label $b_1$, which in some initial state both terminate with different final values for $v$.

$\bigcap_{\delta \in \mathit{diffs}(\pi_1, \pi_3)} \mathit{labels}\ \delta = \mathit{labels}(A) = \{b_1\}$

Figure 6.10: The differences of each pair of paths of $s_{6.11}$ with different non-$\bot$ final values for $v$

    p6.11:                s6.11:
    if x<z                if b1(x, z)
    then v:=k;            then v := f2(k);
    if z<3                if b2(z)
    then y:=7             then y := f4()
    else FAIL             else FAIL

Figure 6.11: $p_{6.11}$ and $s_{6.11}$

In the example in Figure 4.6 (page 134), there are three paths, namely ABD, ABE and AC. Variable $c$ has three different symbolic values at the ends of each of these three paths.

[3] This should remind the reader of the definition of DTLD.


Figure 6.12: The symbolic execution tree, $S[[s_{6.11}]]$. [Tree diagram: the root predicate b1(x,z) is labelled (A); each of its two subtrees is rooted at the predicate b2(z), labelled (B); the four leaves, from left to right, are {v -> f2(k), y -> f4()}, bottom, {y -> f4()} and bottom.]

This gives three `differences' to calculate:

1. differences(ABD, ABE) = {B}, so variable c is label terminating control dependent on {f2, f3} since labels(B) = {f2, f3}.

2. differences(ABD, AC) = {A}, so variable c is label terminating control dependent on {f1} since labels(A) = {f1}.

3. differences(AC, ABE) = {A}, so variable c is label terminating control dependent on {f1} since labels(A) = {f1}.

Collecting these together, we get that c is label terminating control dependent on {f1, f2, f3}.

Variable i, on the other hand, has only two different non-$\bot$ symbolic values at the ends of the three paths (paths ABD and ABE lead to the same value). In this case, therefore, there are two differences to calculate:

1. differences(ABD, AC) = {A}, so variable i is label terminating control dependent on {f1} since labels(A) = {f1}.


2. differences(AC, ABE) = {A}, so variable i is label terminating control dependent on {f1} since labels(A) = {f1}.

This shows that i is label terminating control dependent on {f1}.

Figure 6.13: Label Data Dependence and Label Terminating Control Dependence

    Variable   Label Data Dependence   Label Terminating Control Dependence
    c          {f4, f5}                {f1, f2, f3}
    i          {f6}                    {f1}

Figure 6.14: DTLslice

    Variable   DTLslice
    c          {f1, f2, f3, f4, f5}
    i          {f1, f6}

In the example in Figure 6.1 (page 174), in order to compute the set of labels upon which the variable x is label terminating control dependent, it can be seen that the symbolic predicate values that can cause different terminating final values of x are $b_1(x, y)$, $b_2(x)$, $b_1(f(x, y), y)$ and $b_2(f(x, y))$. Variable x is thus label terminating control dependent on the set of labels $\{b_1, b_2, f\}$.

6.2.5 The DTLslice of a Symbolic Execution Tree

Definition 6.2.4 (DTLslice)
Let t be a Symbolic Execution Tree. The DTLslice for variable v in t is defined to be the union of label data dependence and label terminating control dependence of v, i.e.

    $DTLslice(t)(v) = Ldatadepends(t)(v) \cup LTcontrols(t)(v)$

The analysis of $s_{4.5}$ is given in Figure 6.13 (page 182) and Figure 6.14 (page 182).

6.2.6 The Algorithm for DTLD

In order to compute the set of labels l for which x DTLD l in loop-free schema s, s is first translated into its symbolic execution tree, $S[[s]]$, and then the DTLslice (Definition 6.2.4 (page 182))


of x with respect to $S[[s]]$ is computed.

6.3 Correctness of the Algorithm for DTLD

In this section, some results and definitions are provided. These are needed in order to prove the main correctness theorem of this chapter.

Definition 6.3.1 (varof)
Since labels are unique, if the outermost label $f_i$ of a symbolic value occurs as the label of an expression in an assignment to x, say, then by uniqueness, it cannot occur as the outermost label of an assignment to any other variable. All symbolic values whose outermost label is $f_i$ must therefore be associated with the variable x and only the variable x. Given a schema s and symbolic value $f_i(S)$ we define $varof\ s\ f_i(S)$ to be the variable associated with $f_i$ as just defined.
For a symbolic value that is the variable v, $varof\ s\ v = v$.

Definition 6.3.2 (obtained from)
1. For all variables v and schemas s, the symbolic value that is the variable v is obtained from schema s.
2. The symbolic value $f(S)$ is obtained from the schema s if and only if
   (a) all $\delta$ in S are obtained from s and
   (b) there is a symbolic expression $f(T)$ in s, such that $varof\ s$ is a one-to-one correspondence between S and T, i.e. for each variable v in T, there is a unique symbolic value $\delta$ in S such that $varof\ s\ \delta = v$. Conversely, for each $\delta$ in S there is a unique variable v in T such that $varof\ s\ \delta = v$.

Definition 6.3.3 (Assigned Symbolic Value)
Let s be a schema and $\delta = f(S)$ be a symbolic value obtained from s. $\delta$ is an assigned symbolic value if and only if f is the label of an expression occurring on the right hand side of an assignment in s. (f cannot, therefore, also be the label of a predicate, since labels are unique.)


Definition 6.3.4 (Predicate Symbolic Value)
Let s be a schema and $\delta = f(S)$ be a symbolic value obtained from s. $\delta$ is a predicate symbolic value if and only if f is the label of a predicate in s.

Theorem 6.3.1 (Outermost Labels of Predicate Symbolic Values) Given a schema s, for all predicate symbolic values $\delta$ obtained from s, the outermost label of $\delta$ does not occur as a non-outermost label of any symbolic value obtained from s.
Proof: obvious.

Theorem 6.3.2 (Outermost Labels of Predicate Symbolic Values) Given a loop-free schema s, for all predicate symbolic values $\delta$ obtained from s, for all variables x, the outermost label of $\delta$ does not occur in $pfun\ S[[s]]\ x$.
Proof: obvious.

Theorem 6.3.3 Let $\delta_i$, $i \in \{1 \cdots n\}$, be a set of distinct symbolic values obtained from s that are either variables or assigned symbolic values and let $v_i$, $i \in \{1 \cdots n\}$, be a set of n distinct integers. Then there exists a program p in $[s]$ and a state $\sigma$ such that for all $i \in \{1 \cdots n\}$, $evalsym\ s\ p\ \sigma\ \delta_i = v_i$.

Proof:
Induction on the maximum depth of the $\delta_i$.

Base Case
The $\delta_i$ are all variables. Simply pick $\sigma$ so that $\sigma\ \delta_i = v_i$.

Induction Hypothesis
Assume true for all $\delta_i$ of depth $< m$. Let $\delta_i$, $i \in \{1 \cdots n\}$, have maximum depth m. Then all the $\delta_i$ are unique, and

    $evalsym\ s\ p\ \sigma\ f_i(S) = \mathcal{E}[[p\ f_i]]\ \bigcup_{\delta \in S} varof(s, \delta) \mapsto (evalsym\ s\ p\ \sigma\ \delta)$

If $f_i$ occurs more than once as the outermost label of any other $\delta_i$ then by the induction hypothesis the states

    $\bigcup_{\delta \in S} varof(s, \delta) \mapsto (evalsym\ s\ p\ \sigma\ \delta)$


will be unique. By Assumption 3.4.1, the expressions in p corresponding to $f_i$ can be chosen as required.

Corollary 6.3.1 Let $\delta_i$, $i \in \{1 \cdots n\}$, be a set of distinct predicate symbolic values obtained from s and let $b_i$, $i \in \{1 \cdots n\}$, be a set of boolean values. Then there exists a program p in $[s]$ and a state $\sigma$ such that for all $i \in \{1 \cdots n\}$, $evalsym\ s\ p\ \sigma\ \delta_i = b_i$.
Proof: follows immediately from Assumption 3.4.1, Theorem 6.3.3 (page 184) and Theorem 6.3.2 (page 184).

Theorem 6.3.4 Given a loop-free schema s, a finite set S of symbolic values (that are not predicate symbolic values) obtained from s, and a label f, there exists a state $\sigma$ and programs p and p' differing only at f such that for all sub-symbolic values $\delta_i$, $\delta_j$ of all elements of S

    $\delta_i \neq \delta_j \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p\ \sigma\ \delta_j$
and
    $\delta_i \neq \delta_j \implies evalsym\ s\ p'\ \sigma\ \delta_i \neq evalsym\ s\ p'\ \sigma\ \delta_j$
and
    $f \in labels\ \delta_i \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p'\ \sigma\ \delta_i$
and
    $f \notin labels\ \delta_i \implies evalsym\ s\ p\ \sigma\ \delta_i = evalsym\ s\ p'\ \sigma\ \delta_i$

Proof:
Induction on the maximum depth of each element of S.

Base Case
If the maximum depth of S is zero then every element of S is a variable. Trivial.

Now assume the maximum depth of S is $N > 0$.
Consider the set T of all sub-symbolic values of all the elements of S whose depth is less than N. By the induction hypothesis, there exists a state $\sigma$ and programs p and p' differing only at f such that for all sub-symbolic values $\delta_i$, $\delta_j$ of all elements $\delta$ of T

    $\delta_i \neq \delta_j \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p\ \sigma\ \delta_j$
and


    $\delta_i \neq \delta_j \implies evalsym\ s\ p'\ \sigma\ \delta_i \neq evalsym\ s\ p'\ \sigma\ \delta_j$
and
    $f \in labels\ \delta_i \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p'\ \sigma\ \delta_i$
and
    $f \notin labels\ \delta_i \implies evalsym\ s\ p\ \sigma\ \delta_i = evalsym\ s\ p'\ \sigma\ \delta_i$

Consider all the elements $f_i(S_i)$ of S of depth N. We work through them one at a time.
Take the `first' one, $f_1(S_1)$. Then

    $\bigcup_{\delta \in S_1} (varof\ s\ \delta) \mapsto (evalsym\ s\ p\ \sigma\ \delta)$

is a state that has not occurred in the evaluation in p of any other symbolic value whose outermost label is $f_1$. Similarly

    $\bigcup_{\delta \in S_1} (varof\ s\ \delta) \mapsto (evalsym\ s\ p'\ \sigma\ \delta)$

is a state that has not occurred in the evaluation in p' of any other symbolic value whose outermost label is $f_1$.

If $f_1 \neq f$, then we want the same expression $p(f_1)$ (using the notation discussed in Section 3.11.3) in its place in both p and p'.

If $f \notin labels\ f_1(S_1)$ we require

    $evalsym\ s\ p\ \sigma\ f_1(S_1) = evalsym\ s\ p'\ \sigma\ f_1(S_1)$.

By Assumption 3.4.1, we can replace $e_1$ in p and p', if necessary, by a new expression $e_1'$ such that they agree on all previous states and also

    $evalsym\ s\ p\ \sigma\ f_1(S_1) = evalsym\ s\ p'\ \sigma\ f_1(S_1)$

and so that $evalsym\ s\ p\ \sigma\ f_i(S_i)$


is different from any other value so far encountered in evaluating symbolic values with respect to p and $evalsym\ s\ p'\ \sigma\ f_i(S_i)$ is different from any other value so far encountered in evaluating symbolic values with respect to p'.

If, on the other hand, $f \in labels\ f_1(S_1)$, we require

    $evalsym\ s\ p\ \sigma\ f_1(S_1) \neq evalsym\ s\ p'\ \sigma\ f_1(S_1)$.

By Assumption 3.4.1, we can replace $e_1$ in p and p', if necessary, by a new expression $e_1'$ such that they agree on all previous states and also

    $evalsym\ s\ p\ \sigma\ f_1(S_1) \neq evalsym\ s\ p'\ \sigma\ f_1(S_1)$

and so that $evalsym\ s\ p\ \sigma\ f_i(S_i)$ is different from any other value so far encountered in evaluating symbolic values with respect to p and $evalsym\ s\ p'\ \sigma\ f_i(S_i)$ is different from any other value so far encountered in evaluating symbolic values with respect to p'.

The final possibility is that $f_1 = f$.


In this case we require

    $evalsym\ s\ p\ \sigma\ f_1(S_1) \neq evalsym\ s\ p'\ \sigma\ f_1(S_1)$.

Again, since the states in which $p\ f$ and $p'\ f$ are to be evaluated have not been previously encountered, we can again choose new values for $p\ f$ and $p'\ f$ such that they are the same on all previous states and differ on the new ones.

Repeat this process until all symbolic values of depth N have been processed. We will then be left with two programs q and q', say, such that for all sub-symbolic values $\delta_i$, $\delta_j$ of all elements of S

    $\delta_i \neq \delta_j \implies evalsym\ s\ q\ \sigma\ \delta_i \neq evalsym\ s\ q\ \sigma\ \delta_j$
and
    $\delta_i \neq \delta_j \implies evalsym\ s\ q'\ \sigma\ \delta_i \neq evalsym\ s\ q'\ \sigma\ \delta_j$
and
    $f \in labels\ \delta_i \implies evalsym\ s\ q\ \sigma\ \delta_i \neq evalsym\ s\ q'\ \sigma\ \delta_i$
and
    $f \notin labels\ \delta_i \implies evalsym\ s\ q\ \sigma\ \delta_i = evalsym\ s\ q'\ \sigma\ \delta_i$

as required.
This completes the proof of Theorem 6.3.4 (page 185).

6.3.1 Proof of DTLD Algorithm

We are now in a position to prove the main theorem, which states that given a loop-free schema s, the set of labels upon which variable x is dataflow terminating label dependent can be computed by translating s into a symbolic execution tree t, using the semantic function S defined in Chapter 4, and then computing the DTLslice of t using the label data dependency and label terminating control dependency described in Section 6.1 of this chapter.

Theorem 6.3.5 Given a loop-free schema s,

    $l \in DTLD\ s\ x \iff l \in DTLslice\ S[[s]]\ x$


Proof:
We must show that there exist two programs p and p' in $[s]$ differing only at label l in s and a state $\sigma$ such that

    $\bot \neq \mathcal{M}[[p]]\sigma\ x \neq \mathcal{M}[[p']]\sigma\ x \neq \bot \iff l \in DTLslice\ S[[s]]\ x$.

($\Longrightarrow$)
Assume that there exist two programs p and p' in $[s]$ differing only at label l in s and a state $\sigma$ such that $\bot \neq \mathcal{M}[[p]]\sigma\ x \neq \mathcal{M}[[p']]\sigma\ x \neq \bot$. By Theorem 5.4.1 (page 160), there exist unique paths $\pi$ and $\pi'$ in $dom\ (pfun\ s)$ such that

    $evalsym\ s\ p\ \sigma\ (pfun\ s\ \pi\ x) \neq evalsym\ s\ p'\ \sigma\ (pfun\ s\ \pi'\ x)$

and such that $satisfy\ s\ p\ \sigma\ \pi$ and $satisfy\ s\ p'\ \sigma\ \pi'$.

Case 1: if $\pi = \pi'$
then $l \in labels(pfun\ s\ \pi\ x)$, since p and p' in $[s]$ differ only at label l. (Otherwise $evalsym\ s\ p\ \sigma\ (pfun\ s\ \pi\ x)$ and $evalsym\ s\ p'\ \sigma\ (pfun\ s\ \pi\ x)$ would have to be identical.)
Therefore $l \in Ldatadepends\ s\ x$ as required.

Case 2: if $\pi \neq \pi'$
then, again, if $(pfun\ s\ \pi\ x) = (pfun\ s\ \pi'\ x)$ then either $l \in labels(pfun\ s\ \pi\ x)$ or $l \in labels(pfun\ s\ \pi'\ x)$, so $l \in DTLslice\ s\ x$, as before.
Assume $(pfun\ s\ \pi\ x) \neq (pfun\ s\ \pi'\ x)$ and $l \notin labels(pfun\ s\ \pi\ x)$ and $l \notin labels(pfun\ s\ \pi'\ x)$.
For all $\delta \in diffs(\pi, \pi')$, $l \in labels\ \delta$ and therefore l is in $LTcontrols\ s\ x$ and hence in $DTLslice\ s\ x$, as required.

($\Longleftarrow$)


Assume $l \in DTLslice\ s\ x$.

Case 1: If $l \in Ldatadepends\ s\ x$
then there must exist a path $\pi = (\pi_t, \pi_f)$ such that $l \in labels(pfun\ s\ \pi\ x)$.
By Theorem 6.3.4 (page 185), we can pick a state $\sigma$ and programs p and p' differing only at l such that for all sub-symbolic values $\delta_i$, $\delta_j$ in $\pi_t \cup \pi_f \cup \{(pfun\ s\ \pi\ x)\}$

    $\delta_i \neq \delta_j \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p\ \sigma\ \delta_j$
and
    $\delta_i \neq \delta_j \implies evalsym\ s\ p'\ \sigma\ \delta_i \neq evalsym\ s\ p'\ \sigma\ \delta_j$
and
    $l \in labels\ \delta_i \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p'\ \sigma\ \delta_i$.

Since this means that the state will be different when evaluating symbolic predicates with the same outermost label, by Assumption 3.4.1, we can find values of each predicate expression such that $(satisfy\ s\ p\ \sigma\ \pi)$ and $(satisfy\ s\ p'\ \sigma\ \pi)$, but

    $evalsym\ s\ p\ \sigma\ (pfun\ s\ \pi\ x) \neq evalsym\ s\ p'\ \sigma\ (pfun\ s\ \pi\ x)$

since $l \in labels(pfun\ s\ \pi\ x)$, so by Theorem 5.4.1 (page 160),

    $\mathcal{M}[[p]]\sigma\ x \neq \mathcal{M}[[p']]\sigma\ x$.

Also, since $l \in labels(pfun\ s\ \pi\ x)$, $(pfun\ s\ \pi\ x) \neq \bot$, so

    $\bot \neq evalsym\ s\ p\ \sigma\ (pfun\ s\ \pi\ x) \neq evalsym\ s\ p'\ \sigma\ (pfun\ s\ \pi\ x) \neq \bot$.

Therefore

    $\bot \neq \mathcal{M}[[p]]\sigma\ x \neq \mathcal{M}[[p']]\sigma\ x \neq \bot$

and p and p' have been chosen to differ only at l, as required.

Case 2: If $l \in LTcontrols\ s\ x$


then there exist two paths $\pi$ and $\pi'$ with $diffs(\pi, \pi') \neq \emptyset$ and $l \in labels\ \delta$ for all $\delta \in diffs(\pi, \pi')$, such that

    $pfun\ s\ \pi\ x \neq pfun\ s\ \pi'\ x$.

We can assume that $l \notin labels(pfun\ s\ \pi\ x)$ and $l \notin labels(pfun\ s\ \pi'\ x)$, since otherwise $l \in (Ldatadepends\ s\ x)$, which we have already considered.

Case 1: l is a predicate label.
So l must be the outermost label of each element of $diffs(\pi, \pi')$.
So each element of $diffs(\pi, \pi')$ must be of the form $l(S_i)$ and none of the $S_i$ mention l.
By Theorem 6.3.4 (page 185), there exists a state $\sigma$ and a program p such that for all $\delta_i$, $\delta_j$ in

    $\{(pfun\ s\ \pi\ x)\} \cup \{(pfun\ s\ \pi'\ x)\} \cup \bigcup_{f_i(S_i) \in \pi \cup \pi'} S_i$,

    $\delta_i \neq \delta_j \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p\ \sigma\ \delta_j$.

So

    $i \neq j \implies \bigcup_{\delta \in S_i} (varof\ s\ \delta) \mapsto (evalsym\ s\ p\ \sigma\ \delta) \neq \bigcup_{\delta \in S_j} (varof\ s\ \delta) \mapsto (evalsym\ s\ p\ \sigma\ \delta)$

As in the previous proof, we can work our way through the elements of $\pi \cup \pi'$, choosing the value of the expression corresponding to the outermost label so that it gives us the required values (Assumption 3.4.1) and does not disagree with any previously encountered state, to give us two programs p and p' so that $satisfy\ s\ p\ \sigma\ \pi$ and $satisfy\ s\ p'\ \sigma\ \pi'$.
By Theorem 5.4.1 (page 160),

    $\mathcal{M}[[p]]\sigma\ x = evalsym\ s\ p\ \sigma\ (pfun\ s\ \pi\ x) \neq \bot$
and
    $\mathcal{M}[[p']]\sigma\ x = evalsym\ s\ p'\ \sigma\ (pfun\ s\ \pi'\ x) \neq \bot$.


But

    $evalsym\ s\ p\ \sigma\ (pfun\ s\ \pi\ x) \neq evalsym\ s\ p'\ \sigma\ (pfun\ s\ \pi'\ x)$

and so

    $\bot \neq \mathcal{M}[[p]]\sigma\ x \neq \mathcal{M}[[p']]\sigma\ x \neq \bot$

and p and p' differ only at l as required.

Case 2: l is not a predicate label.
Then, by Theorem 6.3.4 (page 185), there exists a state $\sigma$ and programs p and p' differing only at l such that for all $\delta_i$, $\delta_j$ in

    $\{(pfun\ s\ \pi\ x)\} \cup \{(pfun\ s\ \pi'\ x)\} \cup \bigcup_{f_i(S_i) \in \pi \cup \pi'} S_i$,

    $\delta_i \neq \delta_j \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p\ \sigma\ \delta_j$
and
    $\delta_i \neq \delta_j \implies evalsym\ s\ p'\ \sigma\ \delta_i \neq evalsym\ s\ p'\ \sigma\ \delta_j$
and
    $l \in labels\ \delta_i \implies evalsym\ s\ p\ \sigma\ \delta_i \neq evalsym\ s\ p'\ \sigma\ \delta_i$
and
    $l \notin labels\ \delta_i \implies evalsym\ s\ p\ \sigma\ \delta_i = evalsym\ s\ p'\ \sigma\ \delta_i$.

Again, as in the previous proof, we can therefore work our way through the elements of $\pi \cup \pi'$, choosing the value of the expression corresponding to the outermost label so that it gives us the required values (by Assumption 3.4.1) and does not disagree with any previously encountered state, to give us two programs p and p' so that $satisfy\ s\ p\ \sigma\ \pi$ and $satisfy\ s\ p'\ \sigma\ \pi'$.


By Theorem 5.4.1 (page 160),

    $\mathcal{M}[[p]]\sigma\ x = evalsym\ s\ p\ \sigma\ (pfun\ s\ \pi\ x)$
and
    $\mathcal{M}[[p']]\sigma\ x = evalsym\ s\ p'\ \sigma\ (pfun\ s\ \pi'\ x)$.

But

    $\bot \neq evalsym\ s\ p\ \sigma\ (pfun\ s\ \pi\ x) \neq evalsym\ s\ p'\ \sigma\ (pfun\ s\ \pi'\ x) \neq \bot$

So

    $\bot \neq \mathcal{M}[[p]]\sigma\ x \neq \mathcal{M}[[p']]\sigma\ x \neq \bot$

and p and p' differ only at l as required.

Theorem 6.3.5 shows that for all loop-free schemas s, the DTLD of s can be calculated by computing the DTLslice of $S[[s]]$.

This completes the proof of correctness of the DTLD algorithm. We now give a similar algorithm for DTVD. Due to its similarity to the previous algorithm, it is not covered in so much detail and the proof is relegated to the appendix.

6.4 Computing DTVD for Loop-free Schemas

The dataflow (terminating) variable dependence of a symbolic execution tree t is similarly computed using $pfun(t)$ (Definition 4.4.2 (page 125)). It is the union of two smaller dependencies: variable data dependence and variable terminating control dependence.

6.4.1 Variable Data Dependence

Let t be a symbolic execution tree. Informally, v variable data depends on variable x in t if there is a leaf state of t where variable v gets mapped to a symbolic value that mentions the variable x. This means that there is a path through the program where the final value of v is computed using a sequence of assignments where the final assignment to v is an expression which depends upon the initial value of x. This corresponds exactly to traditional data dependence.


Formally, variable v variable data depends on variable x if and only if there exists $\sigma$ in the range of $pfun(t)$ such that $x \in variables(\sigma\ v)$. (See Definition 6.4.2 (page 194) for the definition of variables.)

We write $Vdatadepends(t)(x)$ for the set of variables upon which x is variable data dependent in t.

Definition 6.4.1 (Variable Data Dependence)

    $Vdatadepends(t)(x) = \bigcup_{\sigma \in range(pfun(t))} variables(\sigma\ x)$

The function variables, which returns the set of variables mentioned in a symbolic value, is formally defined for the three types of symbolic value as follows:

Definition 6.4.2 (variables)

    $variables(v) = \{v\}$
    $variables(f(S)) = \bigcup_{d \in S} variables(d)$
    $variables(\bot) = \emptyset$

6.4.2 Variable Terminating Control Dependence

Let t be a symbolic execution tree. Variable x variable control depends on variable v in t if and only if there exist two paths $\pi$ and $\pi'$ in the domain of $pfun(t)$ with $\bot \neq pfun(t)(\pi)x \neq pfun(t)(\pi')x \neq \bot$ such that for all $e \in diffs(\pi, \pi')$, $v \in variables(e)$. (See Definition 5.2.4 (page 143) for the definition of diffs.) Informally, this means that we can find two states differing only on variable v such that in one state one path is chosen and in the other state the other path is chosen. Since the symbolic states at the end of these paths have different values for x, we can choose two initial states differing only on variable v where the symbolic values correspond to different `real' values for x.
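Definitions 6.4.1 and 6.4.2 can be illustrated with a minimal Python sketch (the thesis's own implementation is in Hope). The encoding is an assumption made for illustration only: a variable is a string, a symbolic value f(S) is a pair (label, tuple of arguments), bottom is None, and a variable that a leaf state does not assign is taken to map to itself. The names `variables`, `vdatadepends` and `S64_LEAVES` are hypothetical; the leaf states are those of the tree for schema s6.4.

```python
BOTTOM = None  # assumed encoding of the symbolic value "bottom"


def variables(d):
    """Set of variables mentioned in a symbolic value (Definition 6.4.2)."""
    if d is BOTTOM:
        return set()
    if isinstance(d, str):      # a symbolic value that is just a variable
        return {d}
    _label, args = d            # f(S), encoded as (label, arguments)
    out = set()
    for a in args:
        out |= variables(a)
    return out


def vdatadepends(leaf_states, x):
    """Union of variables(state x) over all leaf states (Definition 6.4.1)."""
    out = set()
    for state in leaf_states:
        # Assumption: an unassigned variable keeps its initial symbolic value.
        value = BOTTOM if state is BOTTOM else state.get(x, x)
        out |= variables(value)
    return out


# Leaf states of the symbolic execution tree of s6.4, left to right.
S64_LEAVES = [
    {"v": ("f2", ("k",)), "y": ("f4", ())},
    {"v": ("f2", ("k",)), "y": ("f5", ())},
    {"y": ("f4", ())},
    {"y": ("f5", ())},
]
```

Under these assumptions, `vdatadepends(S64_LEAVES, "v")` collects k from the leaves where v is assigned f2(k), and v itself from the leaves where v is never assigned.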


We write $VTcontrols(t)(v)$ for the set of variables upon which v is terminating control dependent in t.

Definition 6.4.3 (VTcontrols)

    $VTcontrols(t)(v) = \bigcup_{\{(\pi,\pi') \mid \bot \neq pfun(t)(\pi)v \neq pfun(t)(\pi')v \neq \bot\}} \left( \bigcap_{\delta \in diffs(\pi,\pi')} variables\ \delta \right)$

6.4.3 The DTVslice of a Symbolic Execution Tree

Definition 6.4.4 (DTVslice)
Let t be a Symbolic Execution Tree. The DTVslice for variable v in t is the union of variable data dependence and variable terminating control dependence of v, i.e.

    $DTVslice(t)(v) = Vdatadepends(t)(v) \cup VTcontrols(t)(v)$

6.4.4 The Algorithm for DTVD

In order to compute the set of variables v for which x DTVD v in loop-free schema s, s is first translated into its symbolic execution tree, $S[[s]]$, and then the DTVslice (Definition 6.4.4 (page 195)) of x with respect to $S[[s]]$ is computed.

The proof of this algorithm is given in Appendix D, page 281.

As can be seen, the algorithm for computing DTVD for loop-free schemas and its proof are almost identical to those for DTLD. One uses variables and the other uses labels. In view of the discussion that follows (Section 6.6) this is not surprising. We show that to compute DTLD we can think of labels as variables, compute the DTVD, and intersect the result with the set of labels. Labels are just a special kind of variable.

6.5 The Algorithms for DLD and DVD

We claim, but do not prove, that in order to produce algorithms for the non-terminating dependency relation DVD, and hence DLD, all that is required is a small change in the definitions of control dependence. These results are left to `future work'.


Definition 6.5.1 (Lcontrols)

    $Lcontrols(t)(v) = \bigcup_{\{(\pi,\pi') \mid pfun(t)(\pi)v \neq pfun(t)(\pi')v\}} \left( \bigcap_{\delta \in diffs(\pi,\pi')} labels\ \delta \right)$

Definition 6.5.2 (Vcontrols)

    $Vcontrols(t)(v) = \bigcup_{\{(\pi,\pi') \mid pfun(t)(\pi)v \neq pfun(t)(\pi')v\}} \left( \bigcap_{\delta \in diffs(\pi,\pi')} variables\ \delta \right)$

The only difference between terminating and non-terminating control dependency is that, in the latter, two paths are also considered to lead to different values of v if one of them leads to bottom (i.e. non-termination), whereas in terminating dependence neither path is allowed to lead to bottom. This exactly captures the difference between the terminating and non-terminating dependence defined in Chapter 3.

6.5.1 Example

Consider, again, the schema $s_{6.11}$ in Figure 6.11 (page 180). Although there were no pairs of paths with different non-$\bot$ values for y, the same is not true when the condition that the values must not be $\bot$ is dropped. In this case, there are four pairs of paths with different values for y. These are given in Figure 6.15 (page 196).

Figure 6.15: The four pairs of paths of $S[[s_{6.11}]]$ with different final values for y

    Pair                 Differences
    $\pi_1$, $\pi_2$     {B}
    $\pi_1$, $\pi_4$     {A, B}
    $\pi_2$, $\pi_3$     {A, B}
    $\pi_3$, $\pi_4$     {B}
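The contrast between the terminating and non-terminating definitions can be checked mechanically on s6.11. The following is a hedged Python sketch, not the thesis's Hope program. It assumes that a path is a pair (true predicates, false predicates), that a symbolic value f(S) is encoded as (label, tuple of arguments), that bottom is None, and that an unassigned variable maps to itself; `controls`, `final` and `S611` are illustrative names.

```python
from itertools import combinations

BOTTOM = None  # assumed encoding of non-termination


def labels(d):
    """Labels mentioned in a symbolic value."""
    if d is BOTTOM or isinstance(d, str):
        return set()
    f, args = d
    out = {f}
    for a in args:
        out |= labels(a)
    return out


def diffs(p, q):
    """Predicates taken true on one path and false on the other."""
    return (p[0] & q[1]) | (p[1] & q[0])


def final(pfun, path, v):
    """Final symbolic value of v on a path (bottom if the path fails)."""
    state = pfun[path]
    return BOTTOM if state is BOTTOM else state.get(v, v)


def controls(pfun, v, terminating):
    """LTcontrols when terminating is True, Lcontrols otherwise."""
    result = set()
    for p, q in combinations(pfun, 2):
        a, b = final(pfun, p, v), final(pfun, q, v)
        if a == b:
            continue
        if terminating and (a is BOTTOM or b is BOTTOM):
            continue
        ds = diffs(p, q)
        if ds:
            result |= set.intersection(*(labels(d) for d in ds))
    return result


A = ("b1", ("x", "z"))
B = ("b2", ("z",))
P1 = (frozenset({A, B}), frozenset())
P2 = (frozenset({A}), frozenset({B}))
P3 = (frozenset({B}), frozenset({A}))
P4 = (frozenset(), frozenset({A, B}))
S611 = {
    P1: {"v": ("f2", ("k",)), "y": ("f4", ())},
    P2: BOTTOM,
    P3: {"y": ("f4", ())},
    P4: BOTTOM,
}
```

With this encoding, `controls(S611, "y", terminating=True)` is empty, while `controls(S611, "y", terminating=False)` contains only b2, in line with the example in this section.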


Figure 6.16: The four pairs of paths of $S[[s_{6.11}]]$ with different final values for y

    $\bigcap_{\delta \in diffs(\pi_1,\pi_2)} labels\ \delta = labels(B) = \{b_2\}$
    $\bigcap_{\delta \in diffs(\pi_1,\pi_4)} labels\ \delta = labels(A) \cap labels(B) = \emptyset$
    $\bigcap_{\delta \in diffs(\pi_2,\pi_3)} labels\ \delta = labels(A) \cap labels(B) = \emptyset$
    $\bigcap_{\delta \in diffs(\pi_3,\pi_4)} labels\ \delta = labels(B) = \{b_2\}$

By Definition 6.5.1 (page 196), the set of labels upon which variable y is label control dependent is the union of the sets in Figure 6.16 (page 197), namely

    $\{b_2\} \cup \emptyset \cup \emptyset \cup \{b_2\}$

which is $\{b_2\}$. This shows that the predicate $b_2(z)$ in $s_{6.11}$ has an effect (using this definition) on the final value of y in $s_{6.11}$, since in this form of label dependence (DLD), non-termination is considered a different value from any terminating value. We have shown that although $b_2$ does not `label terminating control' y in $s_{6.11}$, it does `label control' y in $s_{6.11}$.

6.6 Labels are really Variables

In this section, we show how dataflow label dependence can be translated into dataflow variable dependence. We claim that, given a schema s, if to each expression we simply add a new unique label from some set L of labels, calculate the set of variables upon which each variable is dataflow variable dependent, and then intersect this set with L, we will get the dataflow label dependence of each variable. In other words, we can think of the outer label of each labelled expression in a schema as just another variable.


6.6.1 Example

For example, consider Figure 6.17 (page 198). Here we have augmented schema $s_{3.22}$ of Figure 3.22 (page 109). We then calculate the variable dependence of this new augmented schema to get the results shown in Figure 6.18 (page 198). The range of this relation is then restricted to just labels to give us the results shown in Figure 6.19 (page 198).

Figure 6.17: Adding Extra Variables for Label Dependence

    while f1(f1, i) do
    begin
      if f2(f2, c) then
      begin
        c := f3(f3, y);
        x := f4(f4)
      end;
      i := f5(f5, i)
    end

Figure 6.18: Adding Extra Variables for Label Dependence

    Variable   DTVD
    x          f1 i f2 y f4
    c          f1 i f2 f3
    i          f1 i f5

Figure 6.19: Label Dependence

    Variable   DTLD
    x          f1 f2 f4
    c          f1 f2 f3
    i          f1 f5

Each added variable does not occur on the left hand side of any assignment and occurs only once in each schema. The reason this works is that for any finite set of variables V, there are countably many expressions that reference V. We can think of this extra variable as the


`index' into this countable set of expressions.

6.6.2 Justification of Label Adding

We prove that these extra label variables do not interfere with the Dataflow Variable Dependence of the original schema:
Given a program p, let the original schema be s and the label added schema be s'. Then computing the Dataflow Variable Dependence of s is the same as computing the Dataflow Variable Dependence of s' and then ignoring the labels. More formally,

Theorem 6.6.1 For any program p, let the corresponding `non-label added' and `label added' schemas be called s and s' respectively. Suppose the set of all added labels is L. We claim that for all variables x, x DVD y in s $\iff$ x DVD y in s' and $y \notin L$.

Proof: Suppose s contains the expressions $e_1, \cdots, e_n$ labelled $l_1, \cdots, l_n$. So s' contains the expressions $e_1 \cup \{l_1\}, \cdots, e_n \cup \{l_n\}$. We use the following properties of expressions:

Property 6.6.1
For every expression e such that e VD S, for all values k, there exists an expression e' such that e' VD $S \cup \{x\}$ and for all states $\sigma$ with x = k, $\mathcal{E}[[e]]\sigma = \mathcal{E}[[e']]\sigma$. For example, if e = y + 1 and k = 79, then put e' = y + 80 - x. Then in all states with x = 79, e and e' are equal.

Property 6.6.2
For every expression e such that e VD S, for all values k, there exists an expression e' such that e' VD $S - \{x\}$ and for all states $\sigma$ with x = k, $\mathcal{E}[[e]]\sigma = \mathcal{E}[[e']]\sigma$. For example, if e = x + 1 and k = 79, then put e' = 80. Then in all states where x = 79, e and e' are identical.

Consider any program q in $[s]$. Suppose q contains (proper) expressions $d_1, \cdots, d_n$. Since all expressions are uniquely labelled in s', by Property 6.6.1 there exists a program q' in $[s']$ containing (proper) expressions $d_1', \cdots, d_n'$, and values $v_1, \cdots, v_n$ for $l_1, \cdots, l_n$, such that for all states $\sigma$ with $l_i = v_i$ for all i, $\mathcal{E}[[d_i]]\sigma = \mathcal{E}[[d_i']]\sigma$. Since q' cannot change the values of the $l_i$ (the $l_i$ have been chosen in that way), in all the states $\sigma$ with $l_i = v_i$ we must have $\mathcal{M}[[q]]\sigma = \mathcal{M}[[q']]\sigma$.
Therefore x DVD y in s $\implies$ x DVD y in s'.

Conversely, assume x DVD y in s' and $y \notin L$. By definition, for some q' in $[s']$, there exist two states $\sigma_1$ and $\sigma_2$ differing only on y such that $\mathcal{M}[[q']]\sigma_1 x \neq \mathcal{M}[[q']]\sigma_2 x$. Let the values of the $l_i$ in $\sigma_1$ and $\sigma_2$ be $v_i$. By Property 6.6.2, we can rewrite the expressions $d_i'$ of q' to give $d_i$,


say, so that they do not depend on $l_i$ but agree with the values of $d_i'$ for all states where $l_i = v_i$. Call the resulting program q. Clearly $\mathcal{M}[[q]]\sigma_1 x \neq \mathcal{M}[[q]]\sigma_2 x$. But $\sigma_1$ and $\sigma_2$ differ only on y and therefore x DVD y in s.

Clearly the same argument works for DTVD and DTLD. This result means that we can always work with variable dependence. Any algorithms we find for variable dependence can be adapted to apply to label dependence.

6.6.3 Variables and Labels Combined

Put another way, there is no need to distinguish between labels and variables and therefore no need to distinguish between variable data depends and label data depends, nor between variable (terminating) control depends and label (terminating) control depends. For each symbolic value $\delta$, all that is required is the set of names of $\delta$.

Definition 6.6.1 (names)
Let $\delta$ be a symbolic value.

    $names(\delta) = labels(\delta) \cup variables(\delta)$

Definition 6.6.2 (Data Dependence)

    $datadepends(t)(x) = \bigcup_{\sigma \in range(pfun(t))} names(\sigma\ x)$

Definition 6.6.3 (Tcontrols)

    $Tcontrols(t)(v) = \bigcup_{\{(\pi,\pi') \mid \bot \neq pfun(t)(\pi)v \neq pfun(t)(\pi')v \neq \bot\}} \left( \bigcap_{\delta \in diffs(\pi,\pi')} names\ \delta \right)$
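Definitions 6.6.1 to 6.6.3 treat labels and variables uniformly, which a short Python sketch can make concrete. As before, this is only an illustration under assumed encodings (variables as strings, f(S) as a pair of label and argument tuple, bottom as None, paths as pairs of predicate sets, unassigned variables mapping to themselves); `names`, `datadepends`, `tcontrols` and `S64` are hypothetical names, and the tree is the one for schema s6.4.

```python
from itertools import combinations

BOTTOM = None  # assumed encoding of non-termination


def names(d):
    """Labels and variables mentioned in a symbolic value (Definition 6.6.1)."""
    if d is BOTTOM:
        return set()
    if isinstance(d, str):
        return {d}
    label, args = d
    out = {label}
    for a in args:
        out |= names(a)
    return out


def diffs(p, q):
    """Predicates taken true on one path and false on the other."""
    return (p[0] & q[1]) | (p[1] & q[0])


def final(pfun, path, v):
    """Final symbolic value of v on a path (bottom if the path fails)."""
    state = pfun[path]
    return BOTTOM if state is BOTTOM else state.get(v, v)


def datadepends(pfun, x):
    """Names in the final value of x on any path (Definition 6.6.2)."""
    out = set()
    for path in pfun:
        out |= names(final(pfun, path, x))
    return out


def tcontrols(pfun, v):
    """Terminating control dependence over names (Definition 6.6.3)."""
    out = set()
    for p, q in combinations(pfun, 2):
        a, b = final(pfun, p, v), final(pfun, q, v)
        if a is BOTTOM or b is BOTTOM or a == b:
            continue
        ds = diffs(p, q)
        if ds:
            out |= set.intersection(*(names(d) for d in ds))
    return out


A = ("b1", ("x", "z"))
B = ("b2", ("z",))
S64 = {
    (frozenset({A, B}), frozenset()): {"v": ("f2", ("k",)), "y": ("f4", ())},
    (frozenset({A}), frozenset({B})): {"v": ("f2", ("k",)), "y": ("f5", ())},
    (frozenset({B}), frozenset({A})): {"y": ("f4", ())},
    (frozenset(), frozenset({A, B})): {"y": ("f5", ())},
}
```

Restricting `tcontrols(S64, "v")` to labels or to variables recovers the label and variable forms of terminating control dependence respectively, which is the point of working with names.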


Definition 6.6.4 (controls)

    $controls(t)(v) = \bigcup_{\{(\pi,\pi') \mid pfun(t)(\pi)v \neq pfun(t)(\pi')v\}} \left( \bigcap_{\delta \in diffs(\pi,\pi')} names\ \delta \right)$

Definition 6.6.5 (DTslice)
Let t be a Symbolic Execution Tree. The DTslice for variable v in t is the union of data dependence and terminating control dependence of v, i.e.

    $DTslice(t)(v) = datadepends(t)(v) \cup Tcontrols(t)(v)$

Definition 6.6.6 (Dslice)
Let t be a Symbolic Execution Tree. The Dslice for variable v in t is the union of data dependence and control dependence of v, i.e.

    $Dslice(t)(v) = datadepends(t)(v) \cup controls(t)(v)$

On its own, the DTslice computes DTD, the union of DTVD and DTLD:

Definition 6.6.7 (DTD)
Let s be a schema. Then x DTD y in s $\iff$ x DTLD y in s or x DTVD y in s.

The Dslice computes DD, the union of DVD and DLD:

Definition 6.6.8 (DD)
Let s be a schema. Then x DD y in s $\iff$ x DLD y in s or x DVD y in s.

6.6.4 Computing DTVD and DTLD using the DTslice

Clearly, the DTVD of the `label added' schema is the same as the DTD of the original schema. The DTVD of a schema s can thus be computed in three different ways:

• It can be computed using the DTVslice as described in Section 6.4.

• It can be computed using the DTslice and then restricting the range of the result to just variables.


• It can be computed by computing the DTVslice of the `label added' schema of s and then restricting the range of the result to just variables.

Similarly, to compute DTLD, we can simply restrict the range of the DTD to just labels, and similarly DVD and DLD can be computed using the Dslice.

6.7 Implementation of DTLD for Loop-free Schemas

In this section we give the algorithm for DTLD in the functional programming language Hope. This program refers to the functions which implemented the semantics of schemas given in Section 4.6.

6.7.1 labels (Definition 6.2.2 (page 173))

    labels: delta -> set name;
    labelsset: (set delta) -> (set name);
    labels (va x) <= empty;
    labels (complex (f,S)) <= (singleton f) U labelsset (S);
    labels botdelta <= empty;
    labelsset S <= mapset(labels,S);

6.7.2 Ldatadepends (Definition 6.2.1 (page 173))

    datadependent: (pfun path delta) -> set name ;
    datadependent <= labelsset o range;

6.7.3 diffs (Section 5.2.4)

    differences: path X path -> set delta;
    differences((p1,p1'),(p2,p2')) <= ((p1 intersect p2') U (p1' intersect p2));

6.7.4 LTcontrols (Section 6.2.3)

    allintersect: set delta -> set name;
    allintersect S <= if S = empty
                      then empty
                      else let (a,T) == choose S
                           in if (card S) = 1
                              then (labels a)
                              else (labels a) intersect (allintersect T);


    controldependent : (pfun path delta) -> set name ;
    controldependent f <=
      mapset(lambda d1 => mapset(lambda d2 =>
          if (apply f d1) = (apply f d2) or (apply f d1)=botdelta or (apply f d2)=botdelta
          then empty
          else allintersect (differences(d1,d2)) , domain f),
        domain f) ;

6.7.5 DTLslice (Definition 6.2.4 (page 182))

    dependent: (pfun path delta) -> (set name);
    dependent f <= (datadependent f) U (controldependent f);

6.8 Implementation of DTVD for Loop-free Schemas

In this section we give the algorithm for DTVD in the functional programming language Hope. This program refers to the functions which implemented the semantics of schemas given in Section 4.6.

6.8.1 variables (Definition 6.4.2 (page 194))

    variables: delta -> set name;
    variablesset: (set delta) -> (set name);
    variables (va x) <= x & empty;
    variables (complex (f,S)) <= variablesset (S);
    variables botdelta <= empty;
    variablesset S <= mapset(variables,S);

6.8.2 Vdatadepends (Definition 6.4.1 (page 194))

    Vdatadependent: (pfun path delta) -> set name ;
    Vdatadependent <= variablesset o range;

6.8.3 VTcontrols (Definition 6.4.3 (page 195))

    allintersect: set delta -> set name;
    allintersect S <= if S = empty
                      then empty
                      else let (a,T) == choose S
                           in if (card S) = 1


                              then (variables a)
                              else (variables a) intersect (allintersect T);

    VTcontroldependent : (pfun path delta) -> set name ;
    VTcontroldependent f <=
      mapset(lambda d1 => mapset(lambda d2 =>
          if (apply f d1) = (apply f d2) or (apply f d1)=botdelta or (apply f d2)=botdelta
          then empty
          else allintersect (differences(d1,d2)), domain f),
        domain f) ;

6.8.4 DTVslice (Definition 6.4.4 (page 195))

    DTVslice: (pfun path delta) -> (set name);
    DTVslice f <= (Vdatadependent f) U (VTcontroldependent f);

6.9 Implementation of DTD for Loop-free Schemas

In this section we give the algorithm for DTD in the functional programming language Hope. DTD combines DTLD and DTVD.

    labelsset S <= mapset(labels,S);
    names: delta -> set name;
    names(x) <= (labels x) U (variables x);
    nameset: (set delta) -> (set name);
    nameset S <= mapset(names,S);

6.9.1 Data depends (Definition 6.6.2 (page 200))

    datadependent: (pfun path delta) -> set name ;
    datadependent <= nameset o range;

    allintersect: set delta -> set name;
    allintersect S <= if S = empty
                      then empty
                      else let (a,T) == choose S
                           in if (card S) = 1


                              then (names a)
                              else (names a) intersect (allintersect T);

6.9.2 Tcontrols (Definition 6.6.3 (page 200))

    Tcontroldependent : (pfun path delta) -> set name ;
    Tcontroldependent f <=
      mapset(lambda d1 => mapset(lambda d2 =>
          if (apply f d1) = (apply f d2) or (apply f d1)=botdelta or (apply f d2)=botdelta
          then empty
          else allintersect (differences(d1,d2)) , domain f),
        domain f) ;

6.9.3 DTslice (Definition 6.6.5 (page 201))

    DTslice: (pfun path delta) -> (set name);
    DTslice f <= (datadependent f) U (Tcontroldependent f);

6.10 Conclusion

We have formally stated the DTLD and DTVD algorithms and proved them correct for loop-free schemas.

From now on, we take advantage of the simplification discussed in Section 6.6, which allows us to treat labels as variables.

DTD was defined as the union of DTVD and DTLD, and hence either of the smaller relations can be defined by simply restricting the range of DTD to variables and labels respectively.

In the next chapter we complete the work by extending the algorithms to handle schemas that are not necessarily loop-free.


Chapter 7
Computing Dataflow Dependencies of Schemas with Loops

7.1 Introduction

In this chapter, the algorithms introduced in Chapter 6, which compute dataflow dependencies of loop-free schemas, are extended to compute the DTVD¹ of schemas that contain loops.

The algorithm works by unfolding all the loops within a schema. A schema which has had all its loops replaced by an unfolding is loop-free and hence DTVD can be computed using the techniques described in Chapter 6.

It is proved that if a schema containing loops is unfolded sufficiently², the resulting loop-free schema will have the same DTVD as the original schema with loops from which it was derived.

The problem of computing DTVD has thus been reduced to the problem of recognising when a schema with loops has been sufficiently unfolded.

At the time of writing, unfortunately, we are not certain how to recognise when this maximal number of unfoldings has been reached. There are three possibilities.

Possibility 1 A `maximal unfolding number³' of a schema can be computed recursively from `crude' information about the structure of its abstract syntax tree.

¹ and hence, the DTLD.
² A finite number of times.
³ It need not be the least. Any one will be sufficient.


Hausler [52] proved⁴ that for every program p, a maximal number of unfoldings of p that are necessary to capture all dependence information is computable. His algorithm computes the maximal unfolding number of each compound syntactic construct in a program recursively in terms of the maximal unfolding numbers of each of its components. The maximal unfolding number for each basic syntactic construct is constant (for example, one for an assignment statement).

Possibility 2 There is some relationship between schemas such that if two successive iterations of the unfolding process give rise to schemas which are related in this way, then further iterations cannot introduce new dependencies.

Possibility 3 Recognising when a schema has been maximally unfolded is not computable.

For the dataflow dependencies introduced in this thesis, it seems very likely that the first of these is true. For example, in the case of a `tiny' loop consisting of a single assignment statement, it can be seen that after two unfoldings all the dependence information has been gathered and further unfolding cannot add any new dependencies. As in the case of Hausler's work, it is highly likely that the maximal unfolding number for each syntactic construct in the language of schemas can also be expressed as a function of the maximal unfolding numbers of each of its components.

Taking advantage of this, an algorithm for DTVD would compute a maximal unfolding number $n_i$ for each syntactic component $s_i$ of the schema and then simply unfold each component $s_i$, $n_i$ times. The methods introduced in Chapter 6 could then be used to compute the DTVD of the resulting loop-free schema.

Since, at the time of writing, we do not know how to compute the maximal unfolding numbers for⁵ dataflow dependencies, we cannot use them in our algorithms. The algorithms that we have implemented rely on the second possibility being true. The relationship that we use is that the two schemas have the same DTD (= DTVD ∪ DTLD)⁶. As is demonstrated, unfolding is monotonic, i.e. further unfolding cannot reduce the DTVD. It is also bounded above by a finite object. If it had been proved that no change in the DTD in one iteration of the unfolding process implied that further iterations could not introduce further changes to the DTVD, then the algorithm would have been proved correct.

The first and third possibilities are mutually contradictory. If the ability to recognise when a loop is maximally unfolded were not, in general, decidable, it would mean there could be no connection between the `size' of a loop and the number of iterations of it that were required before all dependence information was `gathered'. It would mean that, for some classes of programs, no such upper limit based on crude syntactic properties involving the numbers of statements would exist. In view of Hausler's work, this seems highly unlikely.

⁴ We cannot necessarily assume that his result is true in our case, since we are computing different dependencies from Hausler.
⁵ although we believe them to be the same as Hausler's.
⁶ There are probably others. Perhaps, for example, data dependence on its own is sufficient. It certainly works in all the examples that have been tested and is much more efficient.

7.2 Unfoldings

Hausler [52] defines the denotational slice of a while loop in terms of unfolding the loop as a nested conditional. A very similar approach is used in this thesis.

Definition 7.2.1 (Unfolding)
Given a schema [[while b do S]], define the sequence of schemas:

    W_0(b, S)     = if b then FAIL else skip
    W_{n+1}(b, S) = if b then S; W_n(b, S) else skip

We call W_i(b, S) the ith unfolding of [[while b do S]].

7.2.1 Example of unfolding

Consider the schema:

    while b1(x,y)
    do if b2(x)
       then x := f(x,y)
       else y := g(y)

The zeroth unfolding:

    W0 = if b1(x,y)
         then FAIL
         else skip


The first unfolding is given by:

    W1 = if b1(x,y)
         then S; W0
         else skip

where S is the body of the loop, i.e.

    W1 = if b1(x,y)
         then if b2(x)
              then x := f(x,y)
              else y := g(y);
              W0
         else skip

i.e.

    W1 = if b1(x,y)
         then if b2(x)
              then x := f(x,y)
              else y := g(y);
              if b1(x,y)
              then FAIL
              else skip
         else skip

The second unfolding,

    W2 = if b1(x,y)
         then if b2(x)
              then x := f(x,y)
              else y := g(y);
              W1
         else skip


The second unfolding, W2, is thus

    if b1(x,y)
    then if b2(x)
         then x := f(x,y)
         else y := g(y);
         if b1(x,y)
         then if b2(x)
              then x := f(x,y)
              else y := g(y);
              if b1(x,y)
              then FAIL
              else skip
         else skip
    else skip

Observation 7.2.1 To get from W_n(b, S) to W_{n+1}(b, S), the occurrence of FAIL is replaced by [[S; if b then FAIL else skip]].

7.3 The DTVD Algorithm for Loop Schemas

The algorithm handles loops by unfolding them repeatedly. Eventually a stage will be reached where the DTVD is maximal, in the sense that further unfoldings will not cause further changes in the DTVD.
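Definition 7.2.1 and Observation 7.2.1 can be illustrated with a small Python sketch. The tuple encoding of statements, the names `unfold`, `count_fail` and `count_guard`, and the use of plain strings for expressions are assumptions made here for illustration; they are not the thesis's Hope representation.

```python
def unfold(b, body, n):
    """Build W_n(b, body) following Definition 7.2.1 (hypothetical encoding)."""
    w = ("if", b, ["FAIL"], [])            # W_0 = if b then FAIL else skip
    for _ in range(n):
        w = ("if", b, body + [w], [])      # W_{k+1} = if b then S; W_k else skip
    return w


def count_fail(stmt):
    """Number of FAIL statements occurring in stmt."""
    if stmt == "FAIL":
        return 1
    if stmt[0] != "if":
        return 0
    _, _, then_part, else_part = stmt
    return sum(count_fail(s) for s in then_part + else_part)


def count_guard(stmt, b):
    """Number of conditionals in stmt whose test is the loop guard b."""
    if stmt == "FAIL" or stmt[0] != "if":
        return 0
    _, cond, then_part, else_part = stmt
    here = 1 if cond == b else 0
    return here + sum(count_guard(s, b) for s in then_part + else_part)


# The example schema of Section 7.2.1, with expressions kept as strings.
B1 = "b1(x,y)"
S = [("if", "b2(x)", [("ass", "x", "f(x,y)")], [("ass", "y", "g(y)")])]

W0 = unfold(B1, S, 0)
W1 = unfold(B1, S, 1)
W2 = unfold(B1, S, 2)
```

In this encoding every unfolding contains exactly one FAIL, which is the occurrence that Observation 7.2.1 replaces, and W_n tests the loop guard n + 1 times.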


7.3.1 Example

Consider, again, the schema

    while b1(x,y)
    do if b2(x)
       then x := f(x,y)
       else y := g(y)

Using the semantics of loop-free schemas given in Chapter 4, the zeroth unfolding,

    W0 = if b1(x,y)
         then FAIL
         else skip

has the symbolic execution tree S[[W0]] given in Figure 7.1 (page 212).

[Figure 7.1: S[[W0]]: The symbolic execution tree of W0. Root test b1(x,y); leaves id and bottom.]

The DTslice (Definition 6.6.5 (page 201)) is computed. The results are shown in Figure 7.2 (page 212): the variables x and y are not dataflow terminating dependent on any variable or label⁷.

    Variable   Data   Control
    x          ∅      ∅
    y          ∅      ∅

Figure 7.2: DTslice of S[[W0]]

⁷ If we had used non-terminating control dependence (Definition 6.6.4 (page 200)), x and y would both have depended on x, y and b1.


The loop is unfolded once more to give

    W1 = if b1(x,y)
         then if b2(x)
              then x := f(x,y)
              else y := g(y);
              if b1(x,y)
              then FAIL
              else skip
         else skip

W1 has the symbolic execution tree S[[W1]] given in Figure 7.3 (page 213).

[Figure 7.3: S[[W1]]: The symbolic execution tree of W1. Tests b1(x,y), b2(x), b1(f(x,y),y) and b1(x,g(y)); leaves id, bottom, x -> f(x,y), bottom and y -> g(y).]

Again, the DTslice is computed. The results are shown in Figure 7.4 (page 213).

    Variable   Control          Data
    x          {b1, b2, x, y}   {f, x, y}
    y          {b1, b2, x, y}   {g, y}

Figure 7.4: DTslice of S[[W1]]
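The entries of Figure 7.4 can be re-derived from Definitions 6.6.2 and 6.6.3. The Python sketch below is a hedged illustration, not the thesis's Hope program: terms are encoded as nested tuples, a path as a pair (predicates taken true, predicates taken false) in the style of the `differences` function of Section 6.7.3, `None` stands for bottom, and the helper names `fs`, `diffs`, `datadepends` and `tcontrols` are chosen here. It also assumes that an unchanged variable is recorded as mapping to itself, which does not affect the result for W1.

```python
from itertools import combinations


def names(term):
    """Labels and variables of a symbolic value; None is the undefined value."""
    if term is None:
        return frozenset()
    if isinstance(term, str):              # a variable such as "x"
        return frozenset({term})
    head, *args = term                     # an application head(args...)
    found = {head}
    for arg in args:
        found |= names(arg)
    return frozenset(found)


def diffs(p, q):
    """Predicates on which two paths took opposite branches."""
    (t1, f1), (t2, f2) = p, q
    return (t1 & f2) | (f1 & t2)


def datadepends(pfun, v):
    """Definition 6.6.2: names of every final symbolic value of v."""
    found = set()
    for state in pfun.values():
        if state is not None:
            found |= names(state[v])
    return frozenset(found)


def tcontrols(pfun, v):
    """Definition 6.6.3: terminating control dependence of v."""
    found = set()
    for p, q in combinations(pfun, 2):
        sp, sq = pfun[p], pfun[q]
        if sp is None or sq is None or sp[v] == sq[v]:
            continue
        common = None
        for d in diffs(p, q):              # intersect names over all differing predicates
            common = names(d) if common is None else common & names(d)
        if common:
            found |= common
    return frozenset(found)


def fs(*terms):
    return frozenset(terms)


# Paths of S[[W1]] (Figure 7.3), written out by hand.
b1, b2 = ("b1", "x", "y"), ("b2", "x")
b1f = ("b1", ("f", "x", "y"), "y")
b1g = ("b1", "x", ("g", "y"))
W1 = {
    (fs(), fs(b1)): {"x": "x", "y": "y"},
    (fs(b1, b2, b1f), fs()): None,
    (fs(b1, b2), fs(b1f)): {"x": ("f", "x", "y"), "y": "y"},
    (fs(b1, b1g), fs(b2)): None,
    (fs(b1), fs(b2, b1g)): {"x": "x", "y": ("g", "y")},
}
```

Under these assumptions, `tcontrols(W1, "x")` and `datadepends(W1, "x")` give the Control and Data entries for x in Figure 7.4, and likewise for y.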


The second unfolding,

    W2 = if b1(x,y)
         then if b2(x)
              then x := f(x,y)
              else y := g(y);
              if b1(x,y)
              then if b2(x)
                   then x := f(x,y)
                   else y := g(y);
                   if b1(x,y)
                   then FAIL
                   else skip
              else skip
         else skip

has the symbolic execution tree S[[W2]] given in Figure 7.5 (page 214).

[Figure 7.5: The symbolic execution tree, S[[W2]], of W2]

Notice, by Observation 7.2.1 (page 211), that to get from one unfolding to the next we replace each ⊥ by the symbolic execution tree S[[S; if b1(x,y) then FAIL else skip]], namely the symbolic execution tree given in Figure 7.6 (page 215), evaluated in the state immediately to the right of the ⊥ being replaced.

[Figure 7.6: S[[S; if b1(x,y) then FAIL else skip]]]

Notice also that, at this stage, some simplification (Section 4.4.4) has taken place. The symbolic execution tree corresponding to S[[W2]] before pruning is given in Figure 7.7 (page 216).


[Figure 7.7: The symbolic execution tree, S[[W2]], before pruning]

The subtree to the left of the lower occurrence of b2(x) has been removed since it represents impossible paths.

Again, this time using S[[W2]], the DTslice is computed. The results are shown in Figure 7.8 (page 216).

    Variable   Control             Data
    x          {b1, b2, f, x, y}   {f, x, y}
    y          {b1, b2, f, x, y}   {g, y}

Figure 7.8: DTslice of S[[W2]]

A new control dependence of x on f has emerged.


The process is repeated once more. W3 is the schema which has the symbolic execution tree S[[W3]] given in Figure 7.9 (page 217).

[Figure 7.9: The Symbolic Execution Tree, S[[W3]]]

This time we find there is no change in the DTslice (Figure 7.10 (page 217)). The schema has reached its maximal unfolding so the algorithm terminates.

    Variable   Control             Data
    x          {b1, b2, f, x, y}   {f, x, y}
    y          {b1, b2, f, x, y}   {g, y}

Figure 7.10: DTslice of S[[W3]]

Since it is loop-free, to produce the DTVD for each variable (see Figure 7.11 (page 218)), we


simply compute the DTVslice (Definition 6.4.4 (page 195)) of this maximal unfolding (S[[W3]]) (or simply restrict the range of the DTslice to just variables as described in Chapter 6).

    Variable   DTVD
    x          {x, y}
    y          {x, y}

Figure 7.11: DTVD of the Loop Schema

and to produce the DTLD for each variable (see Figure 7.12 (page 218)), we simply compute the DTLslice (Definition 6.2.4 (page 182)) of S[[W3]] (or simply restrict the range of the DTslice to just labels as described in Chapter 6).

    Variable   DTLD
    x          {b1, b2, f}
    y          {b1, b2, f, g}

Figure 7.12: DTLD of the Loop Schema

Notice that the DTLslice with respect to x does not contain g.

7.4 Dataflow Dependence of Unfoldings

The relationship between a loop schema and its ith unfolding is embodied in the following lemma. Lemma 7.4.1 states that a program represented by a loop schema and the corresponding program in the ith unfolding of the loop schema agree in all states where the loop terminates in i or fewer iterations. In all other states the latter program will not terminate.

Lemma 7.4.1 Let $p \in [\text{while } b \text{ do } S]$ and let $p'$ be the `corresponding' program in $[W_i(b, S)]$. Then for all states $\sigma$ in which $p$ terminates in $i$ or fewer iterations
$$\mathcal{M}[\![p]\!]\sigma = \mathcal{M}[\![p']\!]\sigma$$
and in all other states $\mathcal{M}[\![p']\!]\sigma = \bot$.

Corollary 7.4.1 Let [[while b do S]] be a loop schema. Let $i$ and $j$ be natural numbers with $i > j$. Let $p_i$ and $p_j$ be `corresponding' programs in its ith unfolding, $[W_i(b, S)]$, and jth unfolding, $[W_j(b, S)]$, respectively. Then for all states $\sigma$ in which $p_i$ terminates in $j$ or fewer iterations
$$\mathcal{M}[\![p_i]\!]\sigma = \mathcal{M}[\![p_j]\!]\sigma$$
and in all other states $\mathcal{M}[\![p_j]\!]\sigma = \bot$.

7.4.1 A Partial Ordering on Programs

We use the standard ordering on programs defined in denotational semantics [78, 88].

Definition 7.4.1
Given programs $p_1$ and $p_2$,
$$p_1 \sqsubseteq p_2 \iff \mathcal{M}[\![p_1]\!] \sqsubseteq \mathcal{M}[\![p_2]\!]$$
where the ordering on program meanings is defined as follows:

Definition 7.4.2
$$\mathcal{M}[\![p_1]\!] \sqsubseteq \mathcal{M}[\![p_2]\!] \iff \text{for all states } \sigma,\ \mathcal{M}[\![p_1]\!]\sigma \sqsubseteq \mathcal{M}[\![p_2]\!]\sigma$$
where the ordering on states is defined as follows:

Definition 7.4.3
$$\sigma_1 \sqsubseteq \sigma_2 \iff (\sigma_1 = \sigma_2) \text{ or } (\sigma_1 = \bot)$$

Using this ordering, $p_1 \sqsubseteq p_2$ if and only if whenever $p_1$ terminates, $p_2$ terminates in the same state.

Using the set theoretic definition of a binary relation it can be seen that:

Lemma 7.4.2 $p_1 \sqsubseteq p_2 \implies \mathrm{TVD}(p_1) \subseteq \mathrm{TVD}(p_2)$
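Lemma 7.4.1 and the ordering of Definition 7.4.3 can be checked on one concrete program. In the Python sketch below, the loop `while x < 3 do x := x + 1`, the single integer state and the function names are all hypothetical choices made for illustration; `None` stands for ⊥.

```python
def run_loop(x):
    """Meaning of the hypothetical program: while x < 3 do x := x + 1."""
    while x < 3:
        x = x + 1
    return x


def run_unfolding(i, x):
    """Meaning of the corresponding program in W_i; None stands for bottom."""
    for _ in range(i):
        if not x < 3:      # guard false: the 'else skip' branch ends the run
            return x
        x = x + 1          # guard true: run the body, then continue with W_{i-1}
    # Innermost conditional, W_0: if b then FAIL else skip
    return None if x < 3 else x


def below(a, b):
    """Definition 7.4.3: a is below b iff a equals b or a is bottom."""
    return a is None or a == b


# W_j below W_i below the loop, for every j < i and every sampled state.
states = range(-2, 6)
chain_holds = all(
    below(run_unfolding(j, x), run_unfolding(i, x))
    and below(run_unfolding(i, x), run_loop(x))
    for x in states
    for i in range(6)
    for j in range(i)
)
```

Starting from x = 0 the loop needs three iterations, so the second unfolding yields ⊥ while the third agrees with the loop. This is only a spot check on sampled states, not a proof.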


Definition 7.4.4 (Program Substitution)
Let $p$ be a program containing sub-program $p'$. Then $p[p''/p']$ is the program $p$ with $p'$ replaced by $p''$.

Lemma 7.4.3 (Program Substitution Lemma)
$$p' \sqsubseteq p'' \implies p[p'/q] \sqsubseteq p[p''/q]$$
Proof: It follows straightforwardly from the continuity [78] and strictness of all the operators in standard semantics.

Corollary 7.4.2
$$p' \sqsubseteq p'' \implies \mathrm{TVD}(p[p'/q]) \subseteq \mathrm{TVD}(p[p''/q])$$
Proof: Follows immediately from Lemma 7.4.2 (page 219).

7.4.2 A Partial Ordering on Schemas

The partial ordering on programs defined in Definition 7.4.1 (page 219) gives rise to a partial order on schemas.

Definition 7.4.5
$s_1 \sqsubseteq s_2$ if and only if for all `corresponding' pairs of programs $(p_1, p_2)$ in $[s_1] \times [s_2]$, $p_1 \sqsubseteq p_2$.

Corollary 7.4.3
$$s_1 \sqsubseteq s_2 \implies \mathrm{DTVD}(s_1) \subseteq \mathrm{DTVD}(s_2)$$
Proof: Follows immediately from Lemma 7.4.2 (page 219).

Definition 7.4.6 (Schema Substitution)
Let $s$ be a schema containing sub-schema $s'$. Then $s[s''/s']$ is the schema $s$ with $s'$ replaced by $s''$.


Lemma 7.4.4 (Schema Substitution Lemma)
$$s' \sqsubseteq s'' \implies s[s'/t] \sqsubseteq s[s''/t]$$
Proof: Follows immediately from Lemma 7.4.3 (page 220).

Corollary 7.4.4
$$s' \sqsubseteq s'' \implies \mathrm{DTVD}(s[s'/t]) \subseteq \mathrm{DTVD}(s[s''/t])$$
Proof: Follows immediately from Lemma 7.4.4 (page 221).

Corollary 7.4.5 Let [[while b do S]] be a loop schema. Let $i$ and $j$ be natural numbers with $i > j$. Then
$$W_j(b, S) \sqsubseteq W_i(b, S) \sqsubseteq [\![\text{while } b \text{ do } S]\!]$$
Proof: Follows immediately from Observation 7.2.1 (page 211), Corollary 7.4.4 (page 221), and Lemma 7.4.1 (page 218).

Lemma 7.4.5 Let $s$ be a schema that mentions the set of variables $V$ and no others. Then
$$\mathrm{DTVD}(s) \subseteq id \cup (V \times V)$$
Proof: obvious.

Lemma 7.4.6 For all loop schemas [[while b do S]],
$$n < m \implies \mathrm{DTVD}(W_n(b, S)) \subseteq \mathrm{DTVD}(W_m(b, S))$$
Proof: Follows immediately from Corollary 7.4.5 (page 221).

Lemma 7.4.7 For all loop schemas [[while b do S]], for all natural numbers $n$,
$$\mathrm{DTVD}(W_n(b, S)) \subseteq \mathrm{DTVD}([\![\text{while } b \text{ do } S]\!])$$


Proof: Follows immediately from Corollary 7.4.5 (page 221) and Corollary 7.4.3 (page 220).

Corollary 7.4.6 For all loop schemas [[while b do S]], there exists a natural number $n$ such that
$$m > n \implies \mathrm{DTVD}(W_m(b, S)) = \mathrm{DTVD}(W_n(b, S))$$
Proof: Follows immediately from the fact that DTVD is monotonic (Corollary 7.4.3 (page 220)) and `sandwiched' between $id$ and $id \cup (V \times V)$, where $V$ is the finite set of variables mentioned in [[while b do S]].

Definition 7.4.7 (Maximal Unfolding Number)
Given a loop schema [[while b do S]], we define the least natural number $n$ such that
$$m > n \implies \mathrm{DTVD}(W_m(b, S)) = \mathrm{DTVD}(W_n(b, S))$$
as the maximal unfolding number of [[while b do S]].

Definition 7.4.8 (Maximal Unfolding)
We call $W_N(b, S)$ the maximal unfolding of [[while b do S]], where $N$ is its maximal unfolding number.

We now prove that for all while loop schemas $W$, there exists an $N$ such that $W$ and its Nth unfolding, $W_N$, have the same DTVD.

Theorem 7.4.1 For all while loop schemas [[while b do S]],
$$\mathrm{DTVD}([\![\text{while } b \text{ do } S]\!]) = \mathrm{DTVD}(W_N(b, S))$$
where $N$ is the maximal unfolding number of [[while b do S]].

Proof: If x DTVD y in [[while b do S]] there must exist two states $\sigma_1$ and $\sigma_2$ differing only on y such that [[while b' do S']] terminates with different final values for x, where [[while b' do S']] is a program represented by [[while b do S]]. Since [[while b' do S']] terminates there must be some $n$ such that it terminates after $n$ iterations of the loop. There must therefore exist some program represented by its nth unfolding, $W_n(b, S)$, that behaves the same as [[while b' do S']] in states $\sigma_1$ and $\sigma_2$, i.e. x DTVD y in $W_n(b, S)$. But if x DTVD y in $W_n(b, S)$ then x DTVD y in $W_N(b, S)$, where $N$ is the maximal unfolding number of [[while b' do S']]. We have shown that
$$\mathrm{DTVD}([\![\text{while } b \text{ do } S]\!]) \subseteq \mathrm{DTVD}(W_N(b, S))$$
but, by Lemma 7.4.7 (page 221),
$$\mathrm{DTVD}(W_N(b, S)) \subseteq \mathrm{DTVD}([\![\text{while } b \text{ do } S]\!])$$
and therefore
$$\mathrm{DTVD}(W_N(b, S)) = \mathrm{DTVD}([\![\text{while } b \text{ do } S]\!])$$
as required.

We have proved that the process of unfolding a while loop schema will eventually produce a loop-free schema whose DTVD is the same as that of the original while loop schema.

Starting with a schema s containing many possibly nested while loops, consider an algorithm that works as follows:

Algorithm 7.4.1
1. Unfold every loop in s once.
2. repeat
       unfold each loop one more time
   forever

Lemma 7.4.8 After a finite number of iterations of Algorithm 7.4.1 above, the schema produced by Algorithm 7.4.1 will reach a point, $s_{max}$, where further unfoldings will never change its DTVD. We call the schema $s_{max}$ the Maximal Unfolding of s.

Proof: By Corollary 7.4.5 (page 221) and Corollary 7.4.4 (page 221), each iteration will produce a schema whose DTVD is at least as big as before, and it will never be bigger than the DTVD of the original schema s. By Lemma 7.4.5 (page 221), the DTVD of s is bounded above by a finite relation.

Lemma 7.4.9 Algorithm 7.4.1 must lead to a schema whose DTVD is the same as that of the original schema s.


Proof: From the schema substitution lemma (Lemma 7.4.4 (page 221)) and Lemma 7.4.7 (page 221), it follows that the DTVD of each unfolded schema described above will always be a subset of the DTVD of the original schema s. Therefore
$$\mathrm{DTVD}(s_{max}) \subseteq \mathrm{DTVD}(s)$$
Conversely, and again similarly to the proof of Theorem 7.4.1 (page 222), suppose x DTVD y in s. Then there exist a program p in [s] and two states, $\sigma_1$ and $\sigma_2$, differing only at y, in which p terminates with different final values for x. In the execution of p to reach this final state, for each loop $l$ in p there will be a maximum number $n_l$, say, of times it got executed (in states $\sigma_1$ and $\sigma_2$). Then by Corollary 7.4.1 (page 218), Corollary 7.4.3 (page 220) and the program substitution lemma (Lemma 7.4.3 (page 220)), if we replace each loop $l$ by its $n_l$th unfolding, this program will behave the same as p in both states $\sigma_1$ and $\sigma_2$. But by Lemma 7.4.8 (page 223), the schema t representing this program will be such that
$$\mathrm{DTVD}(t) \subseteq \mathrm{DTVD}(s_{max})$$
Therefore x DTVD y in $s_{max}$, i.e.
$$\mathrm{DTVD}(s) \subseteq \mathrm{DTVD}(s_{max})$$
Hence
$$\mathrm{DTVD}(s) = \mathrm{DTVD}(s_{max})$$
as required.

We have shown that sufficient unfoldings of a schema s containing loops would result in a loop-free schema whose DTVD is the same as that of s. Provided an algorithm performs sufficient unfoldings it will be correct.
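As a hedged aside, the counting argument behind Corollary 7.4.6 and Lemma 7.4.8 can be written out for a single loop. The set $J$ is notation introduced here for illustration, and the bound assumes that every pair unfolding can add lies in $V \times V$, which is how Lemma 7.4.5 is read here.

```latex
\[
  \mathrm{DTVD}(W_0) \subseteq \mathrm{DTVD}(W_1) \subseteq \mathrm{DTVD}(W_2)
  \subseteq \cdots \subseteq id \cup (V \times V)
\]
\[
  J = \{\, n \mid \mathrm{DTVD}(W_n) \subsetneq \mathrm{DTVD}(W_{n+1}) \,\},
  \qquad |J| \le |V \times V| = |V|^2 .
\]
```

Each index in $J$ contributes at least one new pair, so the chain can grow strictly only finitely often, which is why a maximal unfolding number exists. The bound limits how many strict steps there are, not where they occur: nothing in the lemmas above shows that a single step with $\mathrm{DTVD}(W_n) = \mathrm{DTVD}(W_{n+1})$ is the last one, and that is exactly the recognition problem left open in this chapter.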


7.5 Implementation of DTVD and DTLD for Schemas with Loops

This algorithm works by repeatedly unfolding innermost loops and computing DTD using the algorithms defined in Chapter 6 until there is no change in the DTD of the resulting unfolding. It works outwards until the schema is loop-free. When this situation is reached, the DTLD and DTVD can be computed by restricting the range of DTD to just labels and just variables respectively, as described in Section 6.6.

7.5.1 The Set of Variables Affected by a Schema

    affected: statement -> set variable;
    affectedl: (list statement) -> set variable;
    affected bottom <= empty;
    affected (ass(x,E)) <= singleton x;
    affected (ife(E,s1,s2)) <= (affectedl s1) U (affectedl s2);
    affected (while(E,s)) <= affectedl s;
    affectedl nil <= empty;
    affectedl (x::l) <= (affected x) U (affectedl l);

7.5.2 A Function which checks whether two symbolic execution trees have the same DTslice

    samedependency: (set variable)->SET->SET->bool;
    samedependency A t1 t2 <=
      makepfun A (DTslice o (pathfun t1)) = makepfun A (DTslice o (pathfun t2));

7.5.3 Implementation of Unfolding

    least: (set variable) -> delta -> (list statement) -> statement -> SET;
    least A b l s <= let (t1,t2) == (meaning s, meaning (ife(b,l<>[s],[])))
                     in if samedependency A t1 t2
                        then t2
                        else least A b l (ife(b,l<>[s],[]));
    meaning (while(b,l)) <=
      least (affected (while (b,l))) b l (ife(b,[FAIL],[]));
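The stopping rule used by `least` (unfold until two successive unfoldings have the same dependence summary) can be sketched generically in Python. The function name `unfold_until_stable`, the `max_steps` guard and the idea of passing the summary as a callable are assumptions introduced here; the toy table simply replays the Control column for x from Figures 7.2, 7.4, 7.8 and 7.10.

```python
def unfold_until_stable(summary, max_steps=50):
    """Return (n, summary(n)) for the first n with summary(n) == summary(n + 1).

    summary(n) is assumed to return the dependence summary of the n-th
    unfolding. As the thesis notes, stability after one step is believed,
    but not proved, to imply that the maximal unfolding has been reached.
    """
    previous = summary(0)
    for n in range(1, max_steps + 1):
        current = summary(n)
        if current == previous:
            return n - 1, previous
        previous = current
    raise RuntimeError("no stable unfolding found within max_steps")


# Control dependence of x in the worked example, indexed by unfolding number.
CONTROL_X = [
    frozenset(),                               # W0, Figure 7.2
    frozenset({"b1", "b2", "x", "y"}),         # W1, Figure 7.4
    frozenset({"b1", "b2", "f", "x", "y"}),    # W2, Figure 7.8
    frozenset({"b1", "b2", "f", "x", "y"}),    # W3, Figure 7.10
]


def toy_summary(n):
    """Hypothetical stand-in for computing the DTslice of the n-th unfolding."""
    return CONTROL_X[min(n, len(CONTROL_X) - 1)]


stable_n, stable_summary = unfold_until_stable(toy_summary)
```

On this toy table the iteration stops when W3 shows no change relative to W2, matching the point at which the worked example of Section 7.3.1 terminates.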


7.5.4 Implementation of DTD

    DTD: list statement -> pfun variable (set name);
    DTD l <= makepfun (affectedl l)(DTslice o (pathfun (meaningl l)));

7.6 Implementation of DTLD

We simply restrict the range of DTD to the set of labels that occur in the schema.

    alllabels: statement -> set name;
    alllabelsl: (list statement) -> set name;
    alllabels bottom <= empty;
    alllabels (ass(x,E)) <= (labels E);
    alllabels (ife(E,s1,s2)) <= (labels E) U (alllabelsl s1) U (alllabelsl s2);
    alllabels (while(E,s)) <= (labels E) U (alllabelsl s);
    alllabelsl nil <= empty;
    alllabelsl (x::l) <= (alllabels x) U (alllabelsl l);

    sortofrangerestrict: set name -> pfun name (set name) -> pfun name (set name);
    sortofrangerestrict S f <=
      if f=empty
      then empty
      else let ((a,b),g) == choose f
           in (a,b intersect S) & (sortofrangerestrict S g);

and then

    DTLD: list statement -> pfun name (set name);
    DTLD s <= sortofrangerestrict (alllabelsl s) (DTD s);

7.7 Implementation of DTVD

Similarly, we simply restrict the range of DTD to the set of variables that occur in the schema.

    allvariables: statement -> set name;
    allvariablesl: (list statement) -> set name;
    allvariables bottom <= empty;
    allvariables (ass(x,E)) <= x & (variables E);
    allvariables (ife(E,s1,s2)) <= (variables E) U (allvariablesl s1) U (allvariablesl s2);
    allvariables (while(E,s)) <= (variables E) U (allvariablesl s);
    allvariablesl nil <= empty;
    allvariablesl (x::l) <= (allvariables x) U (allvariablesl l);

and then


    DTVD: list statement -> pfun name (set name);
    DTVD s <= sortofrangerestrict (allvariablesl s) (DTD s);
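The range restriction performed by `sortofrangerestrict` can be sketched in Python and checked against the worked example. The dictionary encoding and the name `range_restrict` are assumptions made here; the DTD values are the union of the Control and Data columns of Figure 7.10.

```python
def range_restrict(allowed, dependence):
    """Keep, for every variable, only the dependence names that lie in allowed."""
    return {v: deps & allowed for v, deps in dependence.items()}


# DTD of the loop schema of Section 7.3.1, read off Figure 7.10.
DTD = {
    "x": frozenset({"b1", "b2", "f", "x", "y"}),
    "y": frozenset({"b1", "b2", "f", "g", "x", "y"}),
}
VARIABLES = frozenset({"x", "y"})
LABELS = frozenset({"b1", "b2", "f", "g"})

DTVD = range_restrict(VARIABLES, DTD)   # compare with Figure 7.11
DTLD = range_restrict(LABELS, DTD)      # compare with Figure 7.12
```

With these inputs the two restrictions agree with Figures 7.11 and 7.12; in particular the label g does not appear in the DTLD of x.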


Chapter 8
Conclusions

In this thesis, we set out to answer the following questions:

1. Why do dataflow slicing algorithms, like Weiser's, produce slices that are not dataflow minimal?
2. Do algorithms for producing dataflow minimal slices exist?

8.1 Why do dataflow slicing algorithms, like Weiser's, produce slices that are not dataflow minimal?

`Traditional' data and control dependence expresses a relationship between the nodes of a program's control flow graph [53]. If a node n1 is data or control dependent on node n2, then there exists an instance of the execution of the statement corresponding to node n2 which `affects' an instance of execution of the statement corresponding to node n1. This `affects' relationship is not transitive. Just because there exists an instance of execution of a node n1 which affects an instance of execution of node n2, and an instance of execution of node n2 which affects the final value of x, it cannot be assumed that there is an instance of execution of node n1 that affects the final value of x. This is seen in the ubiquitous example in Figure 1.2 (page 24), where

• an execution instance of node 3 affects an execution instance of node 2
• and an execution instance of node 2 affects an execution instance of node 4


• and an execution instance of node 4 affects the final value of x

but there is no execution instance of node 3 that affects the final value of x.

Slicing algorithms, for example [92, 93, 28, 82, 5], use the transitive closure of the union of data and control dependence, which by definition is transitive and will, as has just been seen, sometimes result in `extra' dependencies. The extra dependencies give rise to the undesirable inclusion of `unnecessary statements' in the slices produced by these algorithms.

8.2 Do algorithms for producing dataflow minimal slices exist?

In an attempt to answer this question, a form of dataflow minimal slice called DTVD is introduced. An algorithm for computing the DTVD of loop free program schemas is introduced and proved correct.

The algorithm is extended to handle program schemas with loops by using repeated unfolding.

An unfolding is, by definition, a loop free program schema and therefore its dataflow minimal slice can be computed using the correct algorithm mentioned above.

For program schemas, s, containing loops we prove that there exists an integer n, which we call the maximal unfolding number for s, where the dataflow minimal slice of s is the same as the DTVD of its mth unfolding for all $m \geq n$. The problem of computing DTVD of s is thus reduced to the problem of finding the maximal unfolding number of s.

We have proved that the process of repeatedly unfolding s will eventually reach the maximal unfolding. A remaining problem is that we do not currently know how to recognise when the maximal unfolding has been reached.

One possibility is to repeatedly unfold s until one `step' produces no further changes in the DTVD of s.
We believe, but have not yet proved, that reaching this stable state implies that s's maximal unfolding number has been reached.

Alternatively, as in the case of Hausler's work [52], it is quite possible that there is a maximal unfolding number for each syntactic construct in the language of schemas which can be expressed as a function of the maximal unfolding numbers of each of its components.

Taking advantage of this, an algorithm for DTVD would compute a maximal unfolding number $n_i$ for each syntactic component $s_i$ of the schema and then simply unfold each


component $s_i$, $n_i$ times and then apply the algorithm for loop free schemas.

8.3 The Approach

This thesis introduces four dependence relations

• DVD - Dataflow Variable Dependence
• DTVD - Dataflow Terminating Variable Dependence
• DLD - Dataflow Label Dependence
• DTLD - Dataflow Terminating Label Dependence

These relations are on control flow graphs and are all expressible in terms of a standard program semantics. If an algorithm exists for any of these dependencies then it will be dataflow minimal.

These dependencies are compared with slicing (Section 3.13). It is claimed that DTLD is a dataflow minimal version of Venkatesh's static backward closure slice [91] and DLD is a dataflow minimal version of slices which preserve standard semantics rather than the lazy semantics [21] preserved by slices produced by Weiser's algorithm [92, 93] and the PDG approach [82].

Programs are partitioned into dataflow equivalence classes. Each dataflow equivalence class represents the set of all programs with the same control flow graph. Dataflow equivalence classes are represented as schemas [44].

A semantics S of loop-free schemas (Chapter 4) which maps loop-free schemas into Symbolic Execution Trees is defined. S gives rise to the first stage in the algorithm for computing the dependencies. A theory of Symbolic Execution Trees is developed (Chapter 5) and it is proved that $S[[s]]$ properly characterises the set of programs represented by the loop-free schema s (Theorem 5.4.1 (page 160)).

New forms of Data and Control Dependence are defined in terms of symbolic execution trees (Chapter 6). For loop-free schemas s, algorithms for computing DTLD and DTVD of s in terms of these new forms of data and control dependence of the symbolic execution tree, $S[[s]]$, are defined and proved correct (Theorem 6.3.5 (page 188)). This proof


relies on certain assumptions about the `richness' of the expressions of the programming language being analysed (Assumption 3.4.1) as well as the soundness and completeness of S (Theorem 5.4.1 (page 160)).

Similar algorithms for computing DLD and DVD are described but not proved correct. It is proved that D(T)LD is just a special case of D(V)LD (Section 6.6).

The dataflow dependencies for schemas with loops are computed by an iterative process (Chapter 7). Initially each loop is replaced by its `zeroth unfolding' and the dataflow dependence of the resulting loop-free schema is computed. The resulting schema is further unfolded and the process is repeated. We formally prove that this process will eventually terminate, resulting in a loop-free schema whose dataflow dependence is the same as that of the program with loops with which we started.

Provided that we can recognise when further unfoldings will produce no further changes in dependency, we have achieved algorithms for computing the various minimal dataflow dependencies introduced in this thesis.
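The iterative process summarised above can be sketched as a simple loop that stops at the first unfolding whose dependence equals that of the previous one. This Python fragment is only an illustration under stated assumptions: `analyse` stands in for "unfold k times, then run the loop-free algorithm", `toy_analyse` is an invented stand-in rather than a real schema analysis, and the stopping rule is the conjectured (unproved) stability criterion, not an established result.

```python
def unfold_until_stable(analyse, max_steps=50):
    """Return (k, dependence) for the first k whose successor changes nothing.

    `analyse(k)` is assumed to give the dependence of the k-th unfolding.
    Stopping at the first repeat is the conjectured criterion only.
    """
    previous = analyse(0)
    for k in range(1, max_steps + 1):
        current = analyse(k)
        if current == previous:
            return k - 1, previous
        previous = current
    raise RuntimeError("no stable unfolding found within max_steps")


def toy_analyse(k):
    # Invented stand-in: each extra unfolding exposes one more name,
    # saturating after two unfoldings.
    chain = ["x", "y", "z"]
    return {"x": set(chain[: min(k, 2) + 1])}


stable_k, stable_dep = unfold_until_stable(toy_analyse)
```

The `max_steps` guard is a practical safeguard for the sketch; the termination result in the text concerns the unfolding process itself, not this particular stopping test.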


Chapter 9

Future Work

The first priorities for future work are

1. Further investigation into the claim that Hausler's maximal unfolding number is applicable to the dataflow dependencies of schemas.
2. Further investigation of the claim that no change in the DTD in one iteration of the unfolding process implies that further iterations cannot introduce further changes to the DTVD.

The proof of either of these claims would complete the proof of the computability of DTVD of schemas with loops.

It is also possible that simpler conditions exist for guaranteeing no further change in the DTVD as a result of unfolding. Possibilities for further investigation include:

1. Unfolding until there is no further change in just the data dependency¹.
2. Unfolding until no new `flattened symbolic states' occur: $\{x \mapsto f(f(x)),\ y \mapsto g(y, z)\}$ and $\{x \mapsto f(x),\ y \mapsto g(g(y, z), z)\}$ are two examples of the same flattened state, since the set of variables and labels in the symbolic value corresponding to each variable is the same in both states.

Alternatively, a non-constructive approach to the computability of DTVD, i.e. a proof that is not dependent on a particular algorithm, may bear fruit.

¹ This approach appears to work in the many examples so far tested.
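The notion of a flattened symbolic state in the second possibility can be made concrete with a small Python sketch. The representation is an assumption made for illustration: a variable is a string, a compound symbolic value is a tuple whose first element is the label and whose remaining elements are arguments, and the helper names `names` and `flatten` are hypothetical.

```python
def names(term):
    """Collect every label and variable occurring in a symbolic value."""
    if isinstance(term, str):          # a variable
        return {term}
    label, *args = term                # a compound value: (label, arg, ...)
    result = {label}
    for arg in args:
        result |= names(arg)
    return result


def flatten(state):
    """Map each variable to the set of names in its symbolic value."""
    return {var: names(value) for var, value in state.items()}


# The two states from the text: they differ as terms but flatten identically.
s1 = {"x": ("f", ("f", "x")), "y": ("g", "y", "z")}
s2 = {"x": ("f", "x"), "y": ("g", ("g", "y", "z"), "z")}
same_flattened_state = flatten(s1) == flatten(s2)
```

Because only finitely many names occur in a schema, there are only finitely many flattened states, which is what makes this a candidate stopping condition.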


9.1 Extending the Proofs and Algorithms to DVD and DLD

The main proofs in this thesis have all been for the terminating forms of dataflow dependence: DTVD and DTLD. In Section 6.5 on page 195 it is claimed that only small changes to the definitions of control dependence are needed to achieve the non-terminating forms DVD and DLD. Further investigation of these claims is required.

9.2 Improving Efficiency

In this thesis, the question of the existence of an algorithm for computing dataflow minimal slices was posed. The algorithms introduced here sometimes produce slices that are thinner than those produced using current approaches. Since these algorithms examine all paths through a schema they have exponential complexity and therefore do not scale up to `real world' problems.

Further work is required to investigate whether this exponential nature is a function of the problem itself or just a function of the particular approach we have used.

9.3 Experimenting with Different Definitions of Control Dependence to Obtain Different Dependences

It appears that many useful forms of dataflow dependence can be computed by applying subtle changes to the definition of control dependence. For example, the only difference in the computation of DTVD and DVD is due to such a difference. It was recently noticed² that we can have a schema s such that $\neg(x\ \mathrm{DTVD}\ y)$ and $\neg(x\ \mathrm{DTVD}\ z)$ but there exist two states differing only on x and y with different non-terminating values for z. This means that, surprisingly, the initial values of a set of variables can jointly contribute to the final value of a variable even if they do not contribute individually.

² Thanks to John Howroyd.

This leads us to require a different form of dependency where z is considered to depend both on x and y. This form of dependency is closer to slicing since in a slice we would wish to include all sets of statements which jointly contribute to the final value of a variable even


if they do not contribute individually. In this form of dependency, which we call terminating variable set dependency (TVSD), and its dataflow companion dataflow terminating variable set dependency (DTVSD), a variable depends upon a set of variables. Note that we do not need dataflow variable set dependency (DVSD), since DVSD = DVD.

Definition 9.3.1 (TVSD)
Let S be a set of variables. x TVSD S in p means there exist two states $\sigma$, $\sigma'$ differing only on S such that

$$\bot \neq \mathcal{M}[[p]]\,\sigma\,x \neq \mathcal{M}[[p]]\,\sigma'\,x \neq \bot$$

and for all proper subsets T of S, for all states $\sigma$, $\sigma'$ differing only on T

$$\bot \neq \mathcal{M}[[p]]\,\sigma\,x = \mathcal{M}[[p]]\,\sigma'\,x \neq \bot$$

The author believes that the algorithm to produce DTVSD can be obtained by a small change in the definition of control dependency. Generally, more research is required into how subtle changes in the definition of control dependency give rise to algorithms for solving different dataflow dependencies.

9.4 Further Applicability of Symbolic Execution Trees

In this section, we briefly consider other possible applications of Symbolic Execution Trees. The main body of this thesis has shown how Symbolic Execution Trees can be used to perform static slicing. Symbolic Execution Trees can, therefore, be applied to any area that currently uses static slicing.

9.4.1 Dataflow Minimal Weiser Slices

Although a variety of dependencies have been introduced in this thesis, all arguably as useful as a Weiser slice, none is identical to a Dataflow Minimal Weiser Slice (see Section 3.13). The example in Section 3.13.1 (page 113) shows that occasionally DTLD gives rise to slices which do not contain statements which arguably should be included and are included in slices produced by Weiser's Algorithm. DLD on the other hand, because, unlike a Weiser slice, it is required to have exactly the same termination conditions as the original program, contains statements not included in Weiser slices. Further investigation is required to see whether the


techniques introduced in this thesis can produce dataflow minimal slices that precisely satisfy the semantic relationship satisfied by slices produced by Weiser's algorithm.

9.4.2 Programs with Procedures

Although not yet implemented, the author believes that, as in the Parallel Algorithm [28], the Dataflow Dependence of programs with procedures can be computed by successive approximation. As a first approximation, each procedure call will be treated as FAIL. This will enable us to apply the DTLD algorithm to the body of each procedure $p_i$. The resulting symbolic execution tree of $p_i$, with necessary adjustments to handle parameter passing, can then be used in place of each call to $p_i$. This will produce a second approximation. This process is repeated until there is no further change in the DTLD of each procedure.

9.4.3 Dynamic Slicing

Dynamic Slicing [71, 5, 43] offers the potential for much thinner slices, since it is asking about dependencies pertaining to particular executions of a program. Since it captures all relevant executions, the Symbolic Execution Tree is ideal for computing such dynamic information. In dynamic slicing, the program is first executed to produce an execution history. An execution history is the sequence of nodes visited during this execution. All that we require to perform dynamic slicing is the path that was visited during a particular execution. The symbolic state at the leaf node of the symbolic execution tree corresponding to that path contains all the necessary dependence information pertaining to that particular execution sequence. If the execution sequence is longer than a path in the Symbolic Execution Tree, then, by Theorem 7.4.1, we can shorten the execution sequence (where loops have been iterated more times than necessary) without loss of dependence information.
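The lookup described for dynamic slicing, following one executed path to a leaf and reading off its symbolic state, might be sketched as follows. This is a Python illustration under assumptions, not the thesis code: a leaf is modelled as a dictionary (the symbolic state), an inner node as a tuple `(true_subtree, predicate, false_subtree)`, a path as a list of booleans, and the names `leaf_for_path` and `names` are hypothetical.

```python
def names(term):
    """Collect every label and variable occurring in a symbolic value."""
    if isinstance(term, str):
        return {term}
    label, *args = term
    return {label}.union(*(names(arg) for arg in args))


def leaf_for_path(tree, path):
    """Follow the branch decisions in `path` and return the leaf state."""
    for took_true_branch in path:
        if isinstance(tree, dict):
            raise ValueError("path is longer than this branch of the tree")
        true_subtree, _predicate, false_subtree = tree
        tree = true_subtree if took_true_branch else false_subtree
    if not isinstance(tree, dict):
        raise ValueError("path ends before reaching a leaf")
    return tree


# Tree for the sample schema: if b1(j) then i:=f2() else x:=f3()
tree = (
    {"i": ("f2",), "j": "j", "x": "x"},   # then-branch leaf
    ("b1", "j"),                          # predicate at the node
    {"i": "i", "j": "j", "x": ("f3",)},   # else-branch leaf
)

state_on_true_path = leaf_for_path(tree, [True])
names_in_i = names(state_on_true_path["i"])
```

The sketch only reads the data part of the dependence from the leaf; the predicates met along the path would also have to be taken into account for a complete dynamic slice.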


Part II

Appendices


Appendix A

Sample Outputs from the DTLD and the DTVD Algorithms

The schema comes first followed by the DTLD and then the DTVD. For example in the first one below, ("c", ["b1", "b2", "f3"]) means that variable c is dataflow terminating label dependent on each label in the set $\{b1, b2, f3\}$.

These examples can be found together with the implementation at: http://158.223.53.22/~seb/phd/

begin
while b1(i)
do begin
   if b2(c)
   then begin c:=f3(y);
        z:=f4()
        end
   else ;
   i:=f5(i)
   end
end


("DTLD", ["DTLD"])
("c", ["b1", "b2", "f3"])
("i", ["b1", "f5"])
("z", ["b1", "b2", "f4"])
("DTVD", ["DTVD"])
("c", ["c", "i", "y"])
("i", ["i"])
("z", ["c", "i", "z"])

while b1(x,y)
do if b2(x)
   then x:=f(x,y)
   else y:=g(y)

("DTLD", ["DTLD"])
("x", ["b1", "b2", "f"])
("y", ["b1", "b2", "f", "g"])
("DTVD", ["DTVD"])
("x", ["x", "y"])
("y", ["x", "y"])


!john howroyd example!
while b1(q)
do while b2(p)
   do begin
      if b3(x)
      then z:=f4()
      else p:=f5();
      if b6(y)
      then begin
           x:=f7();
           y:=f8()
           end
      else q:=f9()
      end

("DTLD", ["DTLD"])
("p", ["b1", "f5"])
("q", ["b1", "f9"])
("x", ["b1", "f7"])
("y", ["b1", "f8"])
("z", ["b1", "f4"])
("DTVD", ["DTVD"])
("p", ["p", "q"])
("q", ["q"])
("x", ["q", "x"])
("y", ["q", "y"])
("z", ["q", "z"])


if f1(i)
then while f1(i)
     do
     begin x:=f2(x);
     i:=f6(i)
     end
else x:=f4()

("DTLD", ["DTLD"])
("i", ["f1", "f6"])
("x", ["f1", "f2", "f4", "f6"])
("DTVD", ["DTVD"])
("i", ["i"])
("x", ["i", "x"])

while f1()
do z:=f2()

("DTLD", ["DTLD"])
("z", nil)
("DTVD", ["DTVD"])
("z", ["z"])

i:=f1()

("DTLD", ["DTLD"])


("i", ["f1"])
("DTVD", ["DTVD"])
("i", nil)

while b1(j)
do begin
   i:=b2();
   while b3(i)
   do begin
      z:=f4();
      i:=f5(i)
      end;
   j:=f6(j)
   end
!does the final value of z depend on 5?!
!i don't think so!

("DTLD", ["DTLD"])
("i", ["b1", "b2", "b3", "f5"])
("j", ["b1", "f6"])
("z", ["b1", "b2", "b3", "f4"])
("DTVD", ["DTVD"])
("i", ["i", "j"])
("j", ["j"])
("z", ["j", "z"])

while b1(j)


do begin
   while b2(i) do z:=f3(z);
   j:=f4(j)
   end

("DTLD", ["DTLD"])
("j", ["b1", "f4"])
("z", nil)
("DTVD", ["DTVD"])
("j", ["j"])
("z", ["z"])

begin
while b1(j)
do begin
   i:=f2();
   while b3(i)
   do begin
      z:=f4();
      i:=f5(i)
      end;
   j:=f6(j)
   end;
if b1(j)
then z:=f7()
else z:=f8()
end
!does the final value of z depend on f5?!
!i don't think so!


("DTLD", ["DTLD"])
("i", ["b1", "b3", "f2", "f5"])
("j", ["b1", "f6"])
("z", ["f8"])
("DTVD", ["DTVD"])
("i", ["i", "j"])
("j", ["j"])
("z", nil)

!this is a nice example
the algorithm is clever enough to know
that if the loop terminates then b1(j) must be false
so z does not depend on f4 but it does depend on f5.!
begin
while b1(j)
do begin
   z:=f2(z);
   j:=f3(j)
   end;
if b1(j)
then z:=f4()
else z:=f5()
end

("DTLD", ["DTLD"])
("j", ["b1", "f3"])
("z", ["f5"])


("DTVD", ["DTVD"])
("j", ["j"])
("z", nil)

if b1(i)
then while b1(i)
     do x:=f2(x)
else x:=f4()

("DTLD", ["DTLD"])
("x", ["f4"])
("DTVD", ["DTVD"])
("x", nil)

while b1(i) do x:=f2(x)

("DTLD", ["DTLD"])
("x", nil)
("DTVD", ["DTVD"])
("x", ["x"])

while b1() do x:=f2()

("DTLD", ["DTLD"])
("x", nil)


("DTVD", ["DTVD"])
("x", ["x"])

!John Howroyd's first example!
! A counter example to
z DVD {x,y} => z DVD x or z DVD y
In this example z DVD {x,y} and not (z DVD x) and not (z DVD y)
(assuming that for z DVD K in p there must exist two terminating
states s1 and s2 differing at most on variables in K,
and a program q (data flow equivalent to p), such that the final
value of z in s1 differs from that in s2.)
Using the web-published DD slicer we should get that z slice includes
3 and 4,
but as it doesn't depend on either `separately' we hypothesize that it
will fail to include these'.
!
begin
q :=f1();
while b2(q)
begin
h:=f3(x);
k:=f4(y);
z:=f5();
p:=f6();
while b7(p)


do begin
   if b8(h)
   then z:=f9()
   else z:=f11() ;
   if b13(k)
   then
   begin
   h:=f14() ;
   k:=f15()
   end
   else
   q:=f17()
   end
end
end

("DTLD", ["DTLD"])
("h", nil)
("k", ["f4"])
("p", ["f6"])
("q", ["f1"])
("z", ["f5"])
("DTVD", ["DTVD"])
("h", ["h"])
("k", ["y"])
("p", nil)
("q", nil)
("z", nil)


if b1(j)
then i:=f2()
else x:=f3()

("DTLD", ["DTLD"])
("i", ["b1", "f2"])
("x", ["b1", "f3"])
("DTVD", ["DTVD"])
("i", ["i", "j"])
("x", ["j", "x"])

if b1(i)
then
if b5(d)
then c:=f3(y)
else
else

("DTLD", ["DTLD"])
("c", ["b1", "b5", "f3"])
("DTVD", ["DTVD"])
("c", ["c", "d", "i", "y"])

if b1(c)
then x:=f4(y)


else x:=f4(y)

("DTLD", ["DTLD"])
("x", ["f4"])
("DTVD", ["DTVD"])
("x", ["y"])

begin
if b1(p) then FAIL else;
if b1(p) then else FAIL;
x:=f2(z)
end

("DTLD", ["DTLD"])
("x", nil)
("DTVD", ["DTVD"])
("x", nil)

begin
z:=f1(a,b);
while b2(p) do;
end

("DTLD", ["DTLD"])
("z", ["f1"])


("DTVD", ["DTVD"])
("z", ["a", "b"])

begin
if b1(p)
then z:=f2(k)
else;
while b1(p) do ;
end

("DTLD", ["DTLD"])
("z", nil)
("DTVD", ["DTVD"])
("z", ["z"])

while b1(i)
do begin
   if b2(c)
   then c:=f3(y)
   else z:=f4();
   i:=f5(i)
   end

("DTLD", ["DTLD"])
("c", ["b1", "b2", "f3"])


("i", ["b1", "f5"])
("z", ["b1", "b2", "f3", "f4", "f5"])
("DTVD", ["DTVD"])
("c", ["c", "i", "y"])
("i", ["i"])
("z", ["c", "i", "y", "z"])

while b1(j)
do begin
   i:=f2();
   while b3(i)
   do begin
      z:=f4(z);
      i:=f5(i)
      end;
   z:=f3(i,z);
   i:=f6(i);
   j:=f7(j)
   end

("DTLD", ["DTLD"])
("i", ["b1", "b3", "f2", "f5", "f6"])
("j", ["b1", "f7"])
("z", ["b1", "b3", "f2", "f3", "f4", "f5", "f7"])


("DTVD", ["DTVD"])
("i", ["i", "j"])
("j", ["j"])
("z", ["j", "z"])

while b1(j)
do begin
   i:=f2();
   z:=f3(i,z);
   i:=f4(i);
   j:=f5(j)
   end

("DTLD", ["DTLD"])
("i", ["b1", "f2", "f4"])
("j", ["b1", "f5"])
("z", ["b1", "f2", "f3", "f5"])
("DTVD", ["DTVD"])
("i", ["i", "j"])
("j", ["j"])
("z", ["j", "z"])


begin
if b1(i)
then x:=f2(y)
else x:=f3(z);
if b4(d)
then c:=f5(x)
else c:=f5(x)
end

("DTLD", ["DTLD"])
("c", ["b1", "f2", "f3", "f5"])
("x", ["b1", "f2", "f3"])
("DTVD", ["DTVD"])
("c", ["i", "y", "z"])
("x", ["i", "y", "z"])


Appendix B

Programs

B.1 Complete Hope Program for DTVD and DTLD

#! /usr/local/bin/hope -f
!~/newphd/programs/newboth.hop
!Here we stop unfolding loops when only the data dependency does not
!change. We keep the most unfolded - if you see what I mean.
!Much faster but is it right!?
!I feel that if the datadependency has not changed then nor will the
!control dependency in the next unfolding

uses list,lists,set,moresetops,pfun,ctype,types,settolist;

type name == list(char);

data delta == va name ++ complex (name # (set delta)) ++ botdelta;

singleton: alpha -> set alpha;
singleton x <= x & empty;

update: (alpha -> beta) -> alpha -> beta -> (alpha -> beta);
update f x y z <= if z=x
                  then y
                  else f z;


data statement ==
    FAIL ++
    ass(name X delta) ++
    ife(delta X (list statement) X (list statement)) ++
    while(delta X (list statement));

data token == sym char ++ str (list char) ++ setvar(set delta);

parsestatement: (list token) -> (list statement) # (list token);
parsestatementlist: (list token) -> (list statement) # (list token);

parsestatement((str V)::((sym ':')::((sym '=')::((str L)::((setvar S)::l)))))
    <= ([ass(V,complex(L,S))],l);

parsestatement((str "FAIL")::l) <= ([FAIL],l);

parsestatement((str "if")::((str L)::((setvar S)::((str "then")::l))))
    <= let (a,e::b) == parsestatement(l) in
       let (c,d) == parsestatement(b) in
       ([ife(complex(L,S),a,c)],d);

parsestatement((str "while")::((str L)::((setvar S)::((str do) :: l))))
    <= let (a,b) == parsestatement(l) in
       ([while(complex(L,S),a)],b);

parsestatement((str "begin")::l) <=
    let (c,m) == parsestatementlist(l)
    in if m = nil


       then error "end expected"
       else let e::d == m
            in if e=(str "end")
               then (c,d)
               else error "end expected";

parsestatement(x::l) <= (nil,x::l);
parsestatement(nil) <= (nil,nil);

parsestatementlist(x::l) <= let (a,b::c) == parsestatement(x::l)
                            in if b = (sym ';')
                               then (let (e,f) == parsestatementlist(c)
                                     in (a<>e,f))
                               else (a,b::c);

lexstring: list(char) -> list(char) # list(char);
lexstring(nil) <= (nil,nil);
lexstring(x::l) <= if isalnum x
                   then (let (c,d) == lexstring(l) in (x::c,d))
                   else (nil,x::l);

skipcomment: list(char) -> list(char);
skipcomment(nil) <= nil;
skipcomment(x::l) <= if x /= '!'
                     then skipcomment(l)
                     else l;


lexpass1: list(char) -> list(token);
lexpass1(nil) <= nil;
lexpass1(x::l) <= if x = ' ' or x = '\n'
                  then lexpass1(l)
                  else if isalnum x
                  then (let (a,b) == lexstring(x::l)
                        in str(a)::lexpass1(b))
                  else if x='!'
                  then lexpass1(skipcomment(l))
                  else
                  if (x='(') or (x=')') or (x=',') or (x ='=') or (x=':') or (x=';')
                  then (sym x)::lexpass1(l)
                  else error("illegal symbol in input");

makesetvar: list(token) -> (set delta) # list(token);

lexpass2: list(token) -> list(token);
lexpass2(nil) <= nil;
lexpass2(x::l) <= if x /= sym('(')
                  then x::lexpass2(l)
                  else let (a,e::b) == makesetvar(l)
                       in setvar(a)::lexpass2(b);

makesetvar(nil) <= (empty,nil);
makesetvar(str(x)::sym(',')::l)
    <= let (a,b) == makesetvar(l)
       in (((va x) & empty) U a,b);


makesetvar(str(x)::l) <= (((va x) & empty),l);
makesetvar(x::l) <= (empty,x::l);

parse: list(char) -> list(statement);
parse l <= let (a,b) == parsestatement(lexpass2(lexpass1(l))) in a;

lex: list(char) -> list(token);
lex l <= lexpass2(lexpass1(l));

data state == ok(name -> delta) ++ botstate;
data SET == leaf state ++ node(SET X delta X SET);
type path == set delta X set delta;

evaldelta: state -> delta -> delta;
evaldelta botstate x <= botdelta;
evaldelta (ok sigma) botdelta <= botdelta;
evaldelta (ok sigma) (va x) <= sigma x;
evaldelta (ok sigma) (complex (f,S)) <= complex(f,mapset1(evaldelta (ok sigma),S));

updatestateinstate: state -> state -> state;
updatestateinstate (ok st1) (ok st2) <= ok((evaldelta (ok st1) o st2));
updatestateinstate x y <= botstate;

treeinstate: SET -> state -> SET;
treeinstate (leaf sigma') sigma <= leaf (updatestateinstate sigma sigma');
treeinstate (node(t1,r,t2)) sigma <=


    node (treeinstate t1 sigma,evaldelta sigma r, treeinstate t2 sigma);

sequence: SET -> SET -> SET;
sequence (leaf sigma) t' <= treeinstate t' sigma;
sequence(node(t1,r,t2)) t' <= node(sequence t1 t',r,sequence t2 t');

prune: path -> SET -> SET;
prune (l,m) (leaf x) <= leaf x;
prune (l,m) (node(b1,r,b2)) <=
    if (r isin l)
    then prune (l,m) b1
    else if (r isin m)
    then prune (l,m) b2
    else node(prune (r & l,m) b1, r, prune (l,r & m) b2);

simplify: SET -> SET;
simplify <= prune(empty,empty);

meaning: statement -> SET;
meaningl: list(statement) -> SET;
meaningl nil <= leaf (ok va);
meaningl (x::l) <= simplify (sequence (meaning x) (meaningl l));
meaning FAIL <= leaf botstate;
meaning (ass(x,e)) <=
    leaf(ok (update va x (evaldelta (ok va) e)));
meaning (ife(e,l1,l2)) <=


    simplify (node(meaningl l1, evaldelta (ok va) e,meaningl l2));

dec addleft,addright: delta X (pfun path delta) -> (pfun path delta);
addleft(d,f) <= mapset(lambda ((a,b),c) => singleton((d & a,b),c),f);
addright(d,f) <= mapset(lambda ((a,b),c) => singleton((a,d & b),c),f);

applystate: name -> state -> delta;
applystate v (ok sigma) <= sigma v;
applystate v botstate <= botdelta;

pathfun: SET -> name -> (pfun path delta);
pathfun (leaf sigma) v <= singleton((empty,empty),applystate v sigma);
pathfun (node (b1,r,b2)) v <= addleft (r,pathfun b1 v) U addright (r,pathfun b2 v);

variables: delta -> set name;
variablesset: (set delta) -> (set name);
variables (va x) <= x & empty;
variables (complex (f,S)) <= variablesset (S);
variables botdelta <= empty;
variablesset S <= mapset(variables,S);

labels: delta -> set name;
labelsset: (set delta) -> (set name);
labels (va x) <= empty;
labels (complex (f,S)) <= (singleton f) U labelsset (S);
labels botdelta <= empty;


labelsset S <= mapset(labels,S);

names: delta -> set name;
names(x) <= (labels x) U (variables x);

nameset: (set delta) -> (set name);
nameset S <= mapset(names,S);

datadependent: (pfun path delta) -> set name;
datadependent <= nameset o range;

differences: path X path -> set delta;
differences((p1,p1'),(p2,p2')) <= ((p1 intersect p2') U (p1' intersect p2));

allintersect: set delta -> set name;
allintersect S <= if S = empty
                  then empty
                  else let (a,T) == choose S
                       in if (card S) = 1
                          then (names a)
                          else (names a) intersect (allintersect T);

Tcontroldependent: (pfun path delta) -> set name;
Tcontroldependent f <=
    mapset(lambda d1 => mapset(


        lambda d2 =>
            if (apply f d1) = (apply f d2) or (apply f d1)=botdelta or (apply f d2)=botdelta
            then empty
            else allintersect (differences(d1,d2)), domain f),
    domain f);

DTslice: (pfun path delta) -> (set name);
DTslice f <= (datadependent f) U (Tcontroldependent f);

affected: statement -> set name;
affectedl: (list statement) -> set name;
affected bottom <= empty;
affected (ass(x,E)) <= singleton x;
affected (ife(E,s1,s2)) <= (affectedl s1) U (affectedl s2);
affected (while(E,s)) <= affectedl s;
affectedl nil <= empty;
affectedl (x::l) <= (affected x) U (affectedl l);

samedatadependency: (set name)->SET->SET->bool;
samedatadependency A t1 t2 <=
    makepfun A (datadependent o (pathfun t1))
    =
    makepfun A (datadependent o (pathfun t2));

samedependency: (set name)->SET->SET->bool;


samedependency A t1 t2 <=
    makepfun A (DTslice o (pathfun t1))
    =
    makepfun A (DTslice o (pathfun t2));

least: (set name) -> delta -> (list statement) -> statement -> SET;
least A b l s <= let (t1,t2) == (meaning s, meaning (ife(b,l<>[s],[])))
                 in if samedatadependency A t1 t2
                    then t2
                    else least A b l (ife(b,l<>[s],[]));

!here we're stopping when datadependency is same
!change to samedependency if I like

meaning (while(b,l)) <=
    least (affected (while (b,l))) b l (ife(b,[FAIL],[]));

DTD: list statement -> pfun name (set name);
DTD l <=
    makepfun (affectedl l)
    (DTslice o (pathfun (meaningl l)));

allvariables: statement -> set name;
allvariablesl: (list statement) -> set name;
allvariables bottom <= empty;
allvariables (ass(x,E)) <= x & (variables E);
allvariables (ife(E,s1,s2)) <= (variables E) U (allvariablesl s1) U (allvariablesl s2);
allvariables (while(E,s)) <= (variables E) U (allvariablesl s);
allvariablesl nil <= empty;
allvariablesl (x::l) <= (allvariables x) U (allvariablesl l);

Page 265: Data o - UCL Computer Sciencew Graphs .. 36 2.2.2 Data o w Analysis. 36 2.2.3 Inheren t Inaccuracies in Data o w Analysis. 37 2.2.4 T ... A Comparison of Data o w Lab el Dep endence

B.1 Complete Hope Program for DTVD andDTLD 265alllabels: statement -> set name;alllabelsl: (list statement) -> set name;alllabels bottom <= empty;alllabels (ass(x,E)) <= (labels E);alllabels (ife(E,s1,s2)) <= (labels E) U (alllabelsl s1) U (alllabelsl s2);alllabels (while(E,s)) <= (labels E) U (alllabelsl s);alllabelsl nil <= empty;alllabelsl (x::l) <= (alllabels x) U (alllabelsl l);sortofrangerestri t: set name -> pfun name (set name) ->pfun name (set name);sortofrangerestri t S f <=if f=emptythen emptyelse let ((a,b),g) == hoose fin (a,b interse t S) & (sortofrangerestri t S g);DTLD: list statement -> pfun name (set name);DTLD s <= sortofrangerestri t (alllabelsl s) (DTD s);DTVD: list statement -> pfun name (set name);DTVD s <= sortofrangerestri t (allvariablesl s) (DTD s);k:list( har) -> list(name X set(name));k(l) <= [("DTLD","DTLD"& empty)℄<>(settolist (DTLD(parse l))) <>[("DTVD","DTVD"&empty)℄ <> (settolist (DTVD(parse l)));write(k input);
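The recursive helpers `affected` and `allvariables` above can be mirrored in a few lines of Python. This is only an illustrative sketch, not part of the thesis program: the tuple encoding of statements and the treatment of an expression as the plain set of variables it references are assumptions made for the example.

```python
# Hypothetical encoding of the Hope statement constructors (an assumption):
#   ("ass", x, E)        assignment of expression E to variable x
#   ("ife", E, s1, s2)   conditional with statement lists s1 and s2
#   ("while", E, body)   loop with a statement list as body
#   ("bottom",)          the undefined statement
# An expression E is modelled simply as the set of variables it references.

def variables(expr):
    return set(expr)

def affected(stmt):
    """Variables that may be assigned by stmt (mirrors `affected`)."""
    kind = stmt[0]
    if kind == "ass":
        return {stmt[1]}
    if kind == "ife":
        return affected_list(stmt[2]) | affected_list(stmt[3])
    if kind == "while":
        return affected_list(stmt[2])
    return set()  # bottom

def affected_list(stmts):
    result = set()
    for s in stmts:
        result |= affected(s)
    return result

def all_variables(stmt):
    """Variables assigned or referenced by stmt (mirrors `allvariables`)."""
    kind = stmt[0]
    if kind == "ass":
        return {stmt[1]} | variables(stmt[2])
    if kind == "ife":
        return (variables(stmt[1]) | all_variables_list(stmt[2])
                | all_variables_list(stmt[3]))
    if kind == "while":
        return variables(stmt[1]) | all_variables_list(stmt[2])
    return set()  # bottom

def all_variables_list(stmts):
    result = set()
    for s in stmts:
        result |= all_variables(s)
    return result

# A small made-up program: x := f(y); while p(x, n) do (z := g(x); if q(b) then y := h(z))
program = [
    ("ass", "x", {"y"}),
    ("while", {"x", "n"}, [
        ("ass", "z", {"x"}),
        ("ife", {"b"}, [("ass", "y", {"z"})], []),
    ]),
]
```

For this program `affected_list(program)` collects only the assigned variables, while `all_variables_list(program)` also picks up `n` and `b`, which are referenced but never assigned.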

B.2 Auxiliary Hope Functions

infix isin :4;
infix minus :4;
infix intersect:4;

dec isin : alpha # set(alpha) -> truval;
--- x isin s <= if s = empty
                then false
                else let (y,l) == choose s
                     in if x=y
                        then true
                        else x isin l;

dec intersect: (set alpha) X (set alpha) -> (set alpha);
s intersect t <= if t=empty
                 then empty
                 else let (a,b) == choose t
                      in (if (a isin s) then (a & empty) else empty) U (s intersect b);

dec minus: set (alpha) # set (alpha) -> set (alpha);
s minus t <= if s = empty
             then empty
             else let (a,v) == choose(s)
                  in if a isin t
                     then v minus t
                     else a & (v minus t);

dec mapset: (alpha -> set (beta)) # set(alpha) -> set(beta);
--- mapset(f,s) <= if s = empty
                   then empty
                   else let (x,y) == choose(s)
                        in f(x) U mapset(f,y);

dec mapset1: (alpha -> beta) # set(alpha) -> set(beta);
--- mapset1(f,s) <= if s=empty
                    then empty
                    else let (x,y) == choose(s)
                         in (f(x) & empty) U mapset1(f,y);

dec maplist: (alpha -> beta) # list(alpha) -> list(beta);
--- maplist(f,nil) <= nil;
--- maplist(f,x::l) <= f(x)::maplist(f,l);

dec maplisttoset: (alpha -> beta) # list(alpha) -> set(beta);
--- maplisttoset(f,nil) <= empty;
--- maplisttoset(f,x::l) <= (f(x) & empty) U maplisttoset(f,l);

type pfun alpha beta == set(alpha X beta);

apply: pfun alpha beta -> alpha -> beta;
apply f z <= let ((a,b),g) == choose f
             in if a=z
                then b
                else apply g z;

makepfun: set alpha -> (alpha -> beta) -> pfun alpha beta;
makepfun S f <= if S= empty
                then empty
                else let (a,T) == choose S
                     in (a,f a) & (makepfun T f);

domain: pfun alpha beta -> set alpha;
domain f <= if f=empty
            then empty
            else let ((a,b),g) == choose f
                 in a & domain g;

range: pfun alpha beta -> set beta;
range f <= if f=empty
           then empty
           else let ((a,b),g) == choose f
                in b & range g;

update: alpha -> beta -> pfun alpha beta -> pfun alpha beta;
update x y f <= if f=empty
                then (x,y) & empty
                else let ((a,b),g) == choose f
                     in if x=a then (x,y) & g
                        else (a,b) & (update x y g);

restrict: set alpha -> pfun alpha beta -> pfun alpha beta;
restrict S f <= if f=empty
                then empty
                else let ((a,b),g) == choose f
                     in if a isin S
                        then (a,b) & (restrict S g)
                        else restrict S g;

rangerestrict: set beta -> pfun alpha beta -> pfun alpha beta;
rangerestrict S f <= if f=empty
                     then empty
                     else let ((a,b),g) == choose f
                          in if b isin S
                             then (a,b) & (rangerestrict S g)
                             else rangerestrict S g;

compose: (beta -> gamma) -> pfun alpha beta -> pfun alpha gamma;
compose f g <= if g=empty
               then empty
               else let ((a,b),h) == choose g
                    in (a,f b) & (compose f h);

!the identity function with domain S.
idpfun: set alpha -> pfun alpha alpha;
idpfun S <= makepfun S id;

override: pfun alpha beta -> pfun alpha beta -> pfun alpha beta;
override f g <= (restrict ((domain f) minus (domain g)) f) U g;

dec settolist: set(alpha) -> list(alpha);
--- settolist(s) <= if s=empty
                    then []
                    else let (a,b) == choose(s)
                         in a::settolist(b);
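The `pfun` type above represents a finite partial function as a set of (argument, result) pairs. As a rough illustration of how a few of these helpers behave, here is a Python sketch that keeps the same representation using `frozenset`. The Python names mirror the Hope ones, but the code is a transliteration made for this example, not the author's implementation.

```python
def makepfun(domain_set, f):
    """Tabulate f over a finite set, giving a set of (argument, result) pairs."""
    return frozenset((a, f(a)) for a in domain_set)

def apply(pf, z):
    """Look up z; like the Hope version, this is undefined outside the domain."""
    for a, b in pf:
        if a == z:
            return b
    raise KeyError(z)

def domain(pf):
    return frozenset(a for a, _ in pf)

def restrict(s, pf):
    """Keep only the pairs whose argument lies in s."""
    return frozenset((a, b) for a, b in pf if a in s)

def update(x, y, pf):
    """Map x to y, replacing any existing pair for x."""
    return frozenset((a, b) for a, b in pf if a != x) | {(x, y)}

def override(f, g):
    """g takes precedence wherever both f and g are defined."""
    return restrict(domain(f) - domain(g), f) | g

squares = makepfun({1, 2, 3}, lambda n: n * n)
negated = makepfun({3, 4}, lambda n: -n)
combined = override(squares, negated)
```

Here `combined` is defined on `{1, 2, 3, 4}`; it agrees with `negated` on 3 and 4 and with `squares` elsewhere.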


Appendix C
Correctness of the Parallel Algorithm

C.1.1 Functional Networks

The process networks used in the parallel slicing algorithm can be defined in terms of recursion equations, not over infinite streams as in [1], but over finite sets of variable names and node identifiers.

Example

Consider the process network described in subsection 2.3.4 with each arc and node labelled as follows:-

[Figure C.1: The functional network derived from the example program. The diagram shows the nodes ENTRY, 1 to 7 and EXIT, the processes F1 to F7, and arcs labelled A to G.]

From the diagram, the following recursion equation is derived:-

$$G = F_7(F_2(F_3(F_5(F_6(G)) \cup F_4(F_5(F_6(G))))))$$

where each $F_i$ corresponds to the behaviour of the process $i$ as a function on sets as described in subsection 2.3.1, i.e.

$$F_i(S) = \text{if } S \cap (\mathrm{def}(i) \cup C(i)) \neq \emptyset \text{ then } (S \setminus \mathrm{def}(i)) \cup \mathrm{ref}(i) \cup \{i\} \text{ else } S$$

Inputs to processes are represented by arguments to the corresponding functions, $F_i$, and outputs of processes by the results of the corresponding functions. Clearly, different network topologies are achieved by composing the functions in different ways.¹ Notice that if a process has more than one input, then the argument of the corresponding function is the union of the individual inputs. In the above example there is one loop and hence only one equation. In general however, there will be an equation for each cycle in the R control flow graph.

Solving the Equations to Produce Slices

The equations, in isolation, represent the static properties of the network. A solution to the equations represents a possible labelling with a set of variable names and node identifiers of all the arcs of the R control flow graph. For each arc, this label corresponds to the union of all messages that, in a valid implementation (see subsection C.1.1), will be transmitted along the communication channel represented by that arc.

Valid Implementations

In general, of course, there are many solutions to such equations. Following [78], a valid implementation of the parallel slicing algorithm is defined as one which produces the least solution to the equations; that is, the least solution relative to the partial order of arc-wise set inclusion, i.e., using $\sqsubseteq$ to denote the partial order, for networks $L_1$ and $L_2$, $L_1 \sqsubseteq L_2$ if and only if, for every arc, the corresponding label for $L_1$ is a subset of the corresponding label for $L_2$. In order to produce the slice, the solution sought must be the least solution which contains the slice set as a subset of the label at the slice node (because the parallel slicing algorithm is initiated by the slice node outputting the slice set).

¹ The possibility of describing the network in this way suggests an implementation of the algorithm by compiling the R control flow graph into a functional program.
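The process function $F_i$ and the search for a least solution can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the def, ref and controlled-node sets given to each process are invented for the example (they are not those of the program in subsection 2.3.4), and `least_closed_set` computes the smallest set that contains the seed and is closed under the right-hand side of the recursion equation, which is one plausible reading of "least solution containing the slice set". The brute-force `is_additive` check is included because the correctness argument relies on these functions distributing over union.

```python
from itertools import combinations

def make_process(node, defs, refs, controlled=()):
    """Build F_i as a function on sets of variable names and node identifiers."""
    defs, refs = frozenset(defs), frozenset(refs)
    trigger = defs | frozenset(controlled)        # def(i) U C(i)

    def F(s):
        s = frozenset(s)
        if s & trigger:                           # S meets def(i) U C(i)
            return (s - defs) | refs | {node}
        return s
    return F

def least_closed_set(phi, seed):
    """Smallest G with seed <= G and phi(G) <= G, found by iterating upwards."""
    g = frozenset(seed)
    while True:
        bigger = g | phi(g)
        if bigger == g:
            return g
        g = bigger

# Hypothetical def/ref/controlled sets, chosen only for illustration.
F2 = make_process(2, {"v"}, {"u"})
F3 = make_process(3, {"y"}, {"z"})
F4 = make_process(4, {"z"}, {"w"})
F5 = make_process(5, set(), {"p"}, controlled={3, 4})
F6 = make_process(6, {"x"}, {"y"})
F7 = make_process(7, set(), {"q"}, controlled={2, 3, 4, 5, 6})

def phi(g):
    """Right-hand side of G = F7(F2(F3(F5(F6(G)) U F4(F5(F6(G))))))."""
    a = F5(F6(g))
    return F7(F2(F3(a | F4(a))))

G = least_closed_set(phi, {"x"})

def subsets(universe):
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_additive(F, universe):
    """Check F(A U B) == F(A) U F(B) for every pair of subsets of universe."""
    subs = subsets(universe)
    return all(F(a | b) == F(a) | F(b) for a in subs for b in subs)
```

With these made-up sets, node 2 never appears in `G`, because nothing it defines is ever requested, whereas nodes 3 to 7 all contribute their identifiers.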

In subsection C.1.1 above, since the slice set is $\{c\}$, the least solution to the recursion equation that has $\{c\} \subseteq G$ is required.

C.1.2 Correctness of the Parallel Slicing Algorithm

In this subsection the parallel slicing algorithm is proved correct, in the sense that every statement included in a Weiser slice will also be included using the parallel slicing algorithm. Some preliminary results are first stated and proved.

Existence of Solutions

It is necessary to verify that solutions to such equations exist. This follows, by Kleene's first recursion theorem [78], from the fact that the functions $F_i$ introduced in subsection C.1.1 are monotonic with respect to set inclusion; the equations can hence be solved by constructing Kleene chains.

Termination

It is also important to show that such recursion equation systems give rise to terminating computations. To do this, it must be shown that finite solutions always exist. This follows from the fact that the labelling of each arc must be contained in the finite set consisting of all node identifiers and variable names of the control flow graph.

It can easily be shown that the functions representing processes possess the additive property, i.e.

$$F(A \cup B) = F(A) \cup F(B)$$

This ensures that, in a valid implementation of the parallel slicing algorithm, each process never need output the same value more than once, ensuring that it need not output messages indefinitely.

Correctness Proof

Definition C.1.1 ($\mathrm{outputs}_{(V,i)}(n)$)
Let $(V, i)$ be a slicing criterion for a control flow graph and let $n$ be a node of the control flow graph. $\mathrm{outputs}_{(V,i)}(n)$ is defined to be the union of all messages output by node $n$ when slicing using the parallel slicing algorithm with respect to $(V, i)$.
More rigorously, $\mathrm{outputs}_{(V,i)}(n)$ is the labelling of all arcs emerging from $n$ in the least solution of the equations derived from the control flow graph where the labelling of all arcs emerging from $i$ contains $V$.

Lemma 1
Let $(V, i)$ be a slicing criterion for a control flow graph; then $V \subseteq \mathrm{outputs}_{(V,i)}(i)$.

Proof: obvious.

Lemma 2
Let $(V, i)$ be a slicing criterion for a control flow graph. If $K \subseteq \mathrm{outputs}_{(V,i)}(b)$ then for all nodes $j$ of the control flow graph, $\mathrm{outputs}_{(K,b)}(j) \subseteq \mathrm{outputs}_{(V,i)}(j)$.

Proof
Since $K \subseteq \mathrm{outputs}_{(V,i)}(b)$, the least solution containing $V$ on all arcs emerging from node $i$ contains $K$ on all arcs emerging from node $b$.
For all nodes $j$, let $\mathrm{outputs}_{(K,b)}(j) = X_j$.
So by definition, the least solution containing $K$ on all arcs emerging from $b$ has $X_j$ on all arcs emerging from $j$, for all nodes $j$.
So the least solution containing $V$ on all arcs emerging from $i$ has $X_j$ on all arcs emerging from $j$, for all nodes $j$.
So for all nodes $j$ of the control flow graph, $\mathrm{outputs}_{(K,b)}(j) \subseteq \mathrm{outputs}_{(V,i)}(j)$ as required.

The main theorem will now be proved. It is proved that every statement included in a Weiser slice will also be included in the slice obtained using the parallel slicing algorithm.
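Lemma 2 can be checked on a toy network with the following Python sketch. It rests on assumptions made for the example: `outputs` models the least solution by repeatedly applying each node's process function to the union of what it receives until nothing changes, the slice node additionally emits the slice set, and the three-node straight-line program (`1: y := z; 2: x := y; 3: slice point`) is hypothetical.

```python
def process(node, defs, refs, controlled, s):
    """The process function F_i applied to the set s of received messages."""
    trigger = frozenset(defs) | frozenset(controlled)
    if s & trigger:
        return (s - frozenset(defs)) | frozenset(refs) | {node}
    return s

def outputs(nodes, feeds, criterion):
    """Least labelling of each node's outgoing arcs for criterion (V, i).

    nodes: node -> (def set, ref set, controlled nodes)
    feeds: node -> nodes whose output this node receives (reverse control flow)
    """
    slice_vars, slice_node = criterion
    out = {n: frozenset() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n, (defs, refs, controlled) in nodes.items():
            received = frozenset().union(*(out[m] for m in feeds.get(n, ())))
            new = process(n, defs, refs, controlled, received)
            if n == slice_node:
                new |= frozenset(slice_vars)   # the slice node emits the slice set
            if new != out[n]:
                out[n] = new
                changed = True
    return out

# Hypothetical program: 1: y := z;  2: x := y;  3: slice point.
nodes = {1: ({"y"}, {"z"}, ()), 2: ({"x"}, {"y"}, ()), 3: ((), (), ())}
feeds = {2: [3], 1: [2]}               # messages travel against control flow

by_x = outputs(nodes, feeds, ({"x"}, 3))   # criterion (V, i) = ({x}, 3)
by_y = outputs(nodes, feeds, ({"y"}, 2))   # criterion (K, b) = ({y}, 2)

# Nodes that output their own identifier form the slice.
slice_nodes = {n for n in nodes if n in by_x[n]}
```

Since `{"y"}` is contained in `by_x[2]`, Lemma 2 predicts that every label in `by_y` is contained in the corresponding label in `by_x`, which is what this example exhibits.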

It shall be proved by induction that for all $n \geq 0$, for all slicing criteria $C$, for all nodes $i$ that:-

1. $R^n_C(i) \subseteq \mathrm{outputs}_C(i)$
   i.e. the label of the arc(s) from node $i$ includes the relevant variables for node $i$.
2. $j \in S^n_C \Rightarrow j \in \mathrm{outputs}_C(j)$
   i.e. if $j$ is a relevant statement then it will output its node identifier.

Base Case

First it is proved that:-

3. $\forall i,\ R^0_C(i) \subseteq \mathrm{outputs}_C(i)$
4. $\forall j,\ j \in S^0_C \Rightarrow j \in \mathrm{outputs}_C(j)$

Proof of 3

Part 3 is proved by induction on the maximum distance of $i$ from the slice node. The maximum distance from node $i$ to node $j$ is the maximum number of distinct nodes in a path from $i$ to $j$.

First, if $i$ is the slice node then by definition $R^0_C(i) = V$ ($V$ is the slice set) and by definition of the parallel slicing algorithm (subsections 2.3.2 and C.1.1), $V \subseteq \mathrm{outputs}_C(i)$.

If $i$ is not the slice node, suppose for all arcs at a maximum distance $\leq N$ from the slice node that $R^0_C(i) \subseteq \mathrm{outputs}_C(i)$.

Let $i$ be a node at a maximum distance $N+1$ from the slice node, then by definition, all nodes which input to $i$ are at a distance $\leq N$ from $i$ and inductively, it can be concluded that:-

5. The inputs to $i$ contain $\bigcup_{j \to_{RCFG} i} R^0_C(j)$

Let $v \in R^0_C(i)$, then, by definition,

either there exists a $j$ such that $j \to_{RCFG} i$ and $v \notin \mathrm{def}(i)$ and $v \in R^0_C(j)$, in which case $v$ is input to node $i$ (by 5 above) and will be output by $i$ (by definition of process behaviour (subsections C.1.1 and 2.3.1)),

or $v \in \mathrm{ref}(i)$ and there exists a $j$ such that $\mathrm{def}(i) \cap R^0_C(j) \neq \emptyset$, in which case by 5 above and again by definition of process behaviour, it follows that $i$ outputs $\mathrm{ref}(i)$ and hence $i$ outputs $v$.

Concluding that $R^0_C(i) \subseteq \mathrm{outputs}_C(i)$.

Proof of 4

By definition $S^0_C = \{i \mid \mathrm{def}(i) \cap R^0_C(j) \neq \emptyset\}$, hence

$j \in S^0_C \Rightarrow \exists k$ such that $k \to_{RCFG} j$ and $\mathrm{def}(j) \cap R^0_C(k) \neq \emptyset$
$\Rightarrow \exists k$ such that $k \to_{RCFG} j$ and $\mathrm{def}(j) \cap \mathrm{outputs}_C(k) \neq \emptyset$
$\Rightarrow j \in \mathrm{outputs}_C(j)$ by definition of process behaviour (C.1.1, 2.3.1)

This concludes the proof of the base case.

Inductive Step

Now assume

6. $\forall i, C:\ R^N_C(i) \subseteq \mathrm{outputs}_C(i)$

and

7. $\forall j, C:\ j \in S^N_C \Rightarrow j \in \mathrm{outputs}_C(j)$

It must be proved that:-

8. $\forall i, C:\ R^{N+1}_C(i) \subseteq \mathrm{outputs}_C(i)$ and
9. $\forall j, C:\ j \in S^{N+1}_C \Rightarrow j \in \mathrm{outputs}_C(j)$

Proof of 8

Weiser defines

$$R^{K+1}_C(i) = R^K_C(i) \cup \bigcup_{b \in B^K_C} R^0_{(b,\mathrm{ref}(b))}(i)$$

and

$$S^{K+1}_C = B^K_C \cup \{i \mid \exists j \text{ such that } i \to_{CFG} j \text{ and } \mathrm{def}(i) \cap R^{K+1}_C(j) \neq \emptyset\}$$

and

$$B^K_C = \{b \mid \exists i \in S^K_C \text{ such that } b \text{ controls } i\}$$

Now let $v \in R^{N+1}_C(i)$.

Assume $\exists b \in B^N_C$ such that $v \in R^0_{(b,\mathrm{ref}(b))}(i)$.

Then by (3) it follows that $v \in \mathrm{outputs}_{(b,\mathrm{ref}(b))}(i)$.

But, by definition, $b$ controls a node in $S^N_C$. So by the induction hypothesis (7) this node will have output its node identifier. $b$ will therefore receive this node identifier, and by definition of process behaviour (subsections C.1.1 and 2.3.1), $b$ will output $\mathrm{ref}(b)$,

i.e. $\mathrm{ref}(b) \subseteq \mathrm{outputs}_C(b)$
so $\mathrm{outputs}_{(b,\mathrm{ref}(b))}(i) \subseteq \mathrm{outputs}_C(i)$ by Lemma 2 (subsection C.1.2)
so $v \in \mathrm{outputs}_C(i)$

as required for proof of (8).

Proof of 9

Let $i \in S^{N+1}_C$, then $i \in B^N_C$ or $\mathrm{def}(i) \cap R^{N+1}_C(j) \neq \emptyset$ for some node $j$ inputting to $i$.

If $\mathrm{def}(i) \cap R^{N+1}_C(j) \neq \emptyset$ then $\mathrm{def}(i) \cap \mathrm{outputs}_C(j) \neq \emptyset$ for some node $j$ inputting to $i$ which, by definition of process behaviour (C.1.1 and 2.3.1), implies $i \in \mathrm{outputs}_C(i)$.

Or if $i \in B^N_C$ then $i$ controls an element, $j$ say, of $S^N_C$, but by induction hypothesis (7), $j \in \mathrm{outputs}_C(j)$; $i$ therefore receives the identifier of a node it controls and, by definition of process behaviour, $i \in \mathrm{outputs}_C(i)$.

This concludes the proof.

Appendix D
Proof of the DTVD algorithm for Loop-free Schemas

Lemma D.0.1 $y \in \mathrm{Vdatadepends}\ s\ x$ implies there exist a path $\pi$ and a program $p \in [s]$ and two states $\sigma$ and $\sigma'$ differing only at $y$ such that

$(\mathrm{satisfy}\ s\ p\ \sigma\ \pi)$ and $(\mathrm{satisfy}\ s\ p\ \sigma'\ \pi)$

and

$$\bot \neq \mathrm{evalsym}\ s\ p\ \sigma\ (\mathrm{pfun}\ s\ \pi\ x) \neq \mathrm{evalsym}\ s\ p\ \sigma'\ (\mathrm{pfun}\ s\ \pi\ x) \neq \bot.$$

Proof:
Since $y \in \mathrm{Vdatadepends}\ s\ x$, there exists a path $\pi$ such that $y \in \mathrm{variables}(\mathrm{pfun}\ s\ \pi\ x)$. We can therefore choose two states $\sigma$ and $\sigma'$ differing only at $y$ such that

$$\bot \neq \mathrm{evalsym}\ s\ p\ \sigma\ (\mathrm{pfun}\ s\ \pi\ x) \neq \mathrm{evalsym}\ s\ p\ \sigma'\ (\mathrm{pfun}\ s\ \pi\ x) \neq \bot.$$

By Assumption 3.4.1, we can find values for all the predicate functions occurring as outermost labels in $\pi$ such that $(\mathrm{satisfy}\ s\ p\ \sigma\ \pi)$ and $(\mathrm{satisfy}\ s\ p\ \sigma'\ \pi)$ as required.

Lemma D.0.2 Let $\delta_i$, $i \in \{1 \cdots n\}$, be a set of $n$ non-predicate symbolic values obtained from $s$. Let

$$V = \{v_i \mid i \in 1 \cdots n\}$$

be a set of $n$ integers; then there exists a program $p$ in $[s]$ and a state $\sigma$ such that for all $i \in \{1 \cdots n\}$, $v_i = \mathrm{evalsym}\ s\ p\ \sigma\ \delta_i$.

Proof:

Lemma D.0.3 Let $\delta_i$, $i \in \{1 \cdots n\}$, be a set of distinct non-predicate symbolic values obtained from $s$ such that for all $i \in \{1 \cdots n\}$, $y \in \mathrm{variables}\ \delta_i$. Let

$$V = \{v_i \mid i \in 1 \cdots n\} \quad \text{and} \quad V' = \{v'_i \mid i \in 1 \cdots n\}$$

be two sets of $n$ distinct integers with $V \cap V' = \emptyset$; then there exists a program $p$ in $[s]$ and two states $\sigma$ and $\sigma'$ differing only at $y$ such that for all $i \in \{1 \cdots n\}$,

$v_i = \mathrm{evalsym}\ s\ p\ \sigma\ \delta_i$ and $v'_i = \mathrm{evalsym}\ s\ p\ \sigma'\ \delta_i$.

Proof:
Induction on the maximum depth of the $\delta_i$.

Base Case
The $\delta_i$ are all variables, then $n = 1$. Simply pick any $\sigma$, $\sigma'$ such that $\sigma\ y = v_1$ and $\sigma'\ y = v'_1$.

Induction Hypothesis
Assume true for all $\delta_i$ of depth $< m$. Let $\delta_i$, $i \in \{1 \cdots k\}$, be a set of non-predicate symbolic values that have maximum depth of $m$ and are such that

Suppose there are $k$ proper sub-expressions, $\delta'$, of $\{\delta_i\}$ that mention $y$. Let

$$U = \{u_i \mid i \in 1 \cdots k\} \quad \text{and} \quad U' = \{u'_i \mid i \in 1 \cdots k\}$$

be two sets of $k$ distinct integers with $U \cap U' = \emptyset$.

By induction hypothesis, we can choose $\sigma$ and $\sigma'$ differing only at $y$ such that for all $i \in \{1 \cdots k\}$,

$u_i = \mathrm{evalsym}\ s\ p\ \sigma\ \delta'_i$ and $u'_i = \mathrm{evalsym}\ s\ p\ \sigma'\ \delta'_i$.

Let $\delta^k_i = f_i(S)$ be the elements of $\{\delta_i\}$ whose depth is $k$.

$$\mathrm{evalsym}\ s\ p\ \sigma\ f_i(S) = \mathcal{E}[[p\ f_i]] \bigcup_{\delta \in S} \mathrm{varof}(s, \delta) \mapsto (\mathrm{evalsym}\ s\ p\ \sigma\ \delta)$$

and

$$\mathrm{evalsym}\ s\ p\ \sigma'\ f_i(S) = \mathcal{E}[[p\ f_i]] \bigcup_{\delta \in S} \mathrm{varof}(s, \delta) \mapsto (\mathrm{evalsym}\ s\ p\ \sigma'\ \delta)$$

Since $y \in \mathrm{variables}\ f_i(S)$,

$\bigcup_{\delta \in S} \mathrm{varof}(s, \delta) \mapsto (\mathrm{evalsym}\ s\ p\ \sigma\ \delta)$ and $\bigcup_{\delta \in S} \mathrm{varof}(s, \delta) \mapsto (\mathrm{evalsym}\ s\ p\ \sigma'\ \delta)$

are two states that are not equal and have not occurred in the evaluation of any proper sub-expressions containing $y$.

By Assumption 3.4.1, the expressions in $p$ corresponding to $f_i$ can be chosen as required.

Lemma D.0.4 Let $\pi$ and $\pi'$ be paths such that

$$y \in \bigcap_{\delta \in \mathrm{diffs}(\pi, \pi')} (\mathrm{variables}\ \delta)$$

and $y \notin \mathrm{variables}(\mathrm{pfun}\ s\ \pi\ x)$ and $y \notin \mathrm{variables}(\mathrm{pfun}\ s\ \pi'\ x)$ and

$$\bot \neq \mathrm{pfun}\ s\ \pi\ x \neq \mathrm{pfun}\ s\ \pi'\ x \neq \bot;$$

then there exists a program $p \in [s]$ and two states $\sigma$ and $\sigma'$ differing only at $y$ such that

$(\mathrm{satisfy}\ s\ p\ \sigma\ \pi)$ and $(\mathrm{satisfy}\ s\ p\ \sigma'\ \pi')$

and

$$\bot \neq \mathrm{evalsym}\ s\ p\ \sigma\ (\mathrm{pfun}\ s\ \pi\ x) \neq \mathrm{evalsym}\ s\ p\ \sigma'\ (\mathrm{pfun}\ s\ \pi'\ x) \neq \bot.$$

Proof:
By Lemma D.0.3 (page 282), and Assumption 3.4.1, there exist two states differing only at $y$ such that $(\mathrm{satisfy}\ s\ p\ \sigma\ \pi)$ and $(\mathrm{satisfy}\ s\ p\ \sigma'\ \pi')$, and since $\bot \neq \mathrm{pfun}\ s\ \pi\ x \neq \mathrm{pfun}\ s\ \pi'\ x \neq \bot$ we can choose these states such that

$$\bot \neq \mathrm{evalsym}\ s\ p\ \sigma\ (\mathrm{pfun}\ s\ \pi\ x) \neq \mathrm{evalsym}\ s\ p\ \sigma'\ (\mathrm{pfun}\ s\ \pi'\ x) \neq \bot.$$

D.0.3 Proof of DTVD Algorithm

Theorem D.0.1 Given a loop-free schema $s$,

$$y \in \mathrm{DTVD}\ s\ x \iff y \in \mathrm{DTVslice}\ \mathcal{S}[[s]]\ x$$

Proof:
We must show that there exists a program $p$ in $[s]$ and two states $\sigma$ and $\sigma'$ differing only at $y$ in $s$ and a state $\sigma$ such that

$$\bot \neq \mathcal{M}[[p]]\sigma\ x \neq \mathcal{M}[[p]]\sigma'\ x \neq \bot \iff y \in \mathrm{Vdatadepends}\ \mathcal{S}[[s]]\ x \cup \mathrm{VTcontrols}\ \mathcal{S}[[s]]\ x$$

($\Rightarrow$)
Assume that there exists a program $p$ in $[s]$ and two states $\sigma$ and $\sigma'$ differing only at $y$ in $s$ and a state $\sigma$ such that

$$\bot \neq \mathcal{M}[[p]]\sigma\ x \neq \mathcal{M}[[p]]\sigma'\ x \neq \bot$$

By Theorem 5.4.1 (page 160), there exist unique paths $\pi$ and $\pi'$ in $\mathrm{dom}(\mathrm{pfun}\ s)$ such that

$$\mathrm{evalsym}\ s\ p\ \sigma\ (\mathrm{pfun}\ s\ \pi\ x) \neq \mathrm{evalsym}\ s\ p\ \sigma'\ (\mathrm{pfun}\ s\ \pi'\ x)$$

and such that $\mathrm{satisfy}\ s\ p\ \sigma\ \pi$ and $\mathrm{satisfy}\ s\ p\ \sigma'\ \pi'$.

Case 1: if $\pi = \pi'$ then $y \in \mathrm{variables}(\mathrm{pfun}\ s\ \pi\ x)$ since $\sigma$ and $\sigma'$ differ only at $y$. (Otherwise $\mathrm{evalsym}\ s\ p\ \sigma\ (\mathrm{pfun}\ s\ \pi\ x)$ and $\mathrm{evalsym}\ s\ p\ \sigma'\ (\mathrm{pfun}\ s\ \pi\ x)$ would have to be identical.) Therefore $y \in \mathrm{Vdatadepends}\ s\ x$ as required.

Case 2: if $\pi \neq \pi'$.
Again if $(\mathrm{pfun}\ s\ \pi\ x) = (\mathrm{pfun}\ s\ \pi'\ x)$ then $y \in \mathrm{variables}(\mathrm{pfun}\ s\ \pi\ x)$ and $y \in \mathrm{variables}(\mathrm{pfun}\ s\ \pi'\ x)$, so $y \in \mathrm{Vdatadepends}\ s\ x$ as before.
Assume $(\mathrm{pfun}\ s\ \pi\ x) \neq (\mathrm{pfun}\ s\ \pi'\ x)$ and $y \notin \mathrm{variables}(\mathrm{pfun}\ s\ \pi\ x)$ and $y \notin \mathrm{variables}(\mathrm{pfun}\ s\ \pi'\ x)$.
For all $\delta \in \mathrm{diffs}(\pi, \pi')$, $y \in \mathrm{variables}\ \delta$ and therefore $y$ is in $\mathrm{VTcontrols}\ s\ x$ as required.

($\Leftarrow$)
Assume $y \in \mathrm{Vdatadepends}\ \mathcal{S}[[s]]\ x \cup \mathrm{VTcontrols}\ \mathcal{S}[[s]]\ x$.

Case 1: $y \in \mathrm{Vdatadepends}\ s\ x$,
so there must exist a path $\pi = (\pi_t, \pi_f)$ with $(\mathrm{pfun}\ s\ \pi\ x)$ such that $y \in \mathrm{variables}(\mathrm{pfun}\ s\ \pi\ x)$.
By Lemma D.0.1 (page 281), there exist two states $\sigma$ and $\sigma'$ differing only at $y$ and a program $p \in [s]$ such that $(\mathrm{satisfy}\ s\ p\ \sigma\ \pi)$ and $(\mathrm{satisfy}\ s\ p\ \sigma'\ \pi)$ and

$$\mathrm{evalsym}\ s\ p\ \sigma\ (\mathrm{pfun}\ s\ \pi\ x) \neq \mathrm{evalsym}\ s\ p\ \sigma'\ (\mathrm{pfun}\ s\ \pi\ x)$$

as required.

Case 2: $y \in \mathrm{VTcontrols}\ s\ x$,
so there exist two paths $\pi$ and $\pi'$ with $y \in \mathrm{variables}\ \delta$ for all $\delta \in \mathrm{diffs}(\pi, \pi') \neq \emptyset$ such that

$$\mathrm{pfun}\ s\ \pi\ x \neq \mathrm{pfun}\ s\ \pi'\ x$$

We can assume that $y \notin \mathrm{variables}(\mathrm{pfun}\ s\ \pi\ x)$ and $y \notin \mathrm{variables}(\mathrm{pfun}\ s\ \pi'\ x)$ since otherwise $y \in (\mathrm{Vdatadepends}\ s\ x)$, which we have already considered. The result then follows immediately from Lemma D.0.4 (page 284).

Bibliography

[1] Abramsky, S. Reasoning about concurrent systems. In Distributed Computing (London, 1984), pp. 307–319.
[2] Agrawal, H. On slicing programs with jump statements. In ACM SIGPLAN Conference on Programming Language Design and Implementation (Orlando, Florida, June 20–24 1994), pp. 302–312. Proceedings in SIGPLAN Notices, 29(6), June 1994.
[3] Agrawal, H., DeMillo, R. A., Pan, H., Spafford, E. H., and Viravan, C. Spyder project. URL http://www.cs.purdue.edu/homes/spaf/spyder.html.
[4] Agrawal, H., DeMillo, R. A., and Spafford, E. H. Debugging with dynamic slicing and backtracking. Software – Practice and Experience 23, 6 (June 1993), 589–616.
[5] Agrawal, H., and Horgan, J. R. Dynamic program slicing. In ACM SIGPLAN Conference on Programming Language Design and Implementation (New York, June 1990), pp. 246–256.
[6] Bailey, R. Functional programming with Hope. Ellis Horwood, 1990.
[7] Ball, T., and Horwitz, S. Slicing programs with arbitrary control-flow. In 1st Conference on Automated Algorithmic Debugging (Linköping, Sweden, 1993), P. Fritzson, Ed., Springer, pp. 206–222. Also available as University of Wisconsin–Madison, technical report (in extended form), TR-1128, December, 1992.

[8] Beck, J., and Eichmann, D. Program and interface slicing for reverse engineering. In IEEE/ACM 15th Conference on Software Engineering (ICSE'93) (1993), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 509–518.
[9] Beizer, B. Software Testing Techniques. Van Nostrand Reinhold, 1990.
[10] Bell, G., and Munro, M. Using dynamic analysis to improve static analysis. In 2nd UK Workshop on Program Comprehension (Durham, UK, July 1996), M. Munro, Ed.
[11] Bennett, K., and Mortimer, R. Maintenance and abstraction of program data using formal transformations. In IEEE International Conference on Software Maintenance (1996), IEEE Computer Society Press, Los Alamitos, California, USA.
[12] Bieman, J. M., and Ott, L. M. Measuring functional cohesion. IEEE Transactions on Software Engineering 20, 8 (Aug. 1994), 644–657.
[13] Binkley, D. W. The application of program slicing to regression testing. In Journal of Information and Software Technology Special Issue on Program Slicing, M. Harman and K. Gallagher, Eds., vol. 40. Elsevier, 1998. To appear.
[14] Binkley, D. W. Computing amorphous program slices using dependence graphs and a data-flow model. In ACM Symposium on Applied Computing (The Menger, San Antonio, Texas, U.S.A., 1999), ACM Press, New York, NY, USA. To appear.
[15] Binkley, D. W., and Gallagher, K. B. Program slicing. In Advances of Computing, Volume 43, M. Zelkowitz, Ed. Academic Press, 1996, pp. 1–50.
[16] Birkhoff, G., and Mac Lane, S. A Survey of Modern Algebra. Macmillan, 1949.
[17] Bjørner, D., Ershov, A. P., and Jones, N. D. Partial evaluation and mixed computation. North-Holland, 1987.
[18] Bull, T. A transformation system for maintenance – turning theory into practice. In IEEE International Conference on Software Maintenance (ICSM'92) (1992), IEEE Computer Society Press, Los Alamitos, California, USA.

[19] Canfora, G., Cimitile, A., and De Lucia, A. Conditioned program slicing. In Journal of Information and Software Technology Special Issue on Program Slicing, M. Harman and K. Gallagher, Eds., vol. 40. Elsevier, 1998. To appear.
[20] Canfora, G., Cimitile, A., De Lucia, A., and Lucca, G. A. D. Software salvaging based on conditions. In International Conference on Software Maintenance (ICSM'96) (Victoria, Canada, Sept. 1994), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 424–433.
[21] Cartwright, R., and Felleisen, M. The semantics of program dependence. In ACM SIGPLAN Conference on Programming Language Design and Implementation (1989), pp. 13–27.
[22] Cheng, J. Slicing concurrent programs – a graph-theoretical approach. In 1st Automatic Algorithmic Debugging Conference (AADEBUG'93) (1993), P. Fritzson, Ed., pp. 223–240. Appears as Springer Lecture Notes in Computer Science vol 749.
[23] Choi, J., and Ferrante, J. Static slicing in the presence of goto statements. ACM Transactions on Programming Languages and Systems 16, 4 (July 1994), 1097–1113.
[24] Church, A. The calculi of lambda-conversion. Annals of Mathematical Studies 6 (1951).
[25] Cimitile, A., De Lucia, A., and Munro, M. Qualifying reusable functions using symbolic execution. In Proceedings of the 2nd working conference on reverse engineering (Toronto, Canada, 1995), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 178–187.
[26] Cimitile, A., De Lucia, A., and Munro, M. A specification driven slicing process for identifying reusable functions. Software maintenance: Research and Practice 8 (1996), 145–178.
[27] Coen-Porisini, A., De Paoli, F., Ghezzi, C., and Mandrioli, D. Software specialization via symbolic execution. IEEE Transactions on Software Engineering 17, 9 (Sept. 1991), 884–899.
[28] Danicic, S., Harman, M., and Sivagurunathan, Y. A parallel algorithm for static program slicing. Information Processing Letters 56, 6 (Dec. 1995), 307–313.
[29] Dannenberg, and Ernst. Formal program verification using symbolic execution. IEEE Transactions on Software Engineering 8 (Jan. 1982), 43–52.

[30] Day, N. A. A Framework for Multi-Notation, Model-Oriented Requirements Analysis. PhD thesis, Department of Computer Science, University of British Columbia, 1998.
[31] De Lucia, A., Fasolino, A. R., and Munro, M. Understanding function behaviours through program slicing. In 4th IEEE Workshop on Program Comprehension (Berlin, Germany, Mar. 1996), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 9–18.
[32] Dijkstra, E. W., and Dahl, O. A discipline of programming. Prentice Hall, 1972.
[33] Ershov, A. P. On the essence of computation. North-Holland Publishing, 1978, pp. 392–420.
[34] Ferrante, J., Ottenstein, K. J., and Warren, J. D. The program dependence graph and its use in optimization. ACM Transactions on Programming Languages and Systems 9, 3 (July 1987), 319–349.
[35] Field, J., Ramalingam, G., and Tip, F. Parametric program slicing. In 22nd ACM Symposium on Principles of Programming Languages (San Francisco, CA, 1995), pp. 379–392.
[36] Field, J., and Tip, F. Dynamic dependence in term rewriting systems and its application to program slicing. In Journal of Information and Software Technology Special Issue on Program Slicing, M. Harman and K. Gallagher, Eds., vol. 40. Elsevier, 1998. To appear.
[37] Futamura, Y. Partial evaluation of computation process – an approach to a compiler compiler. Systems, Computers, Controls 2, 5 (Aug. 1971), 721–728.
[38] Gallagher, K. B. Using Program Slicing in Software Maintenance. PhD thesis, University of Maryland, Baltimore, Maryland, December 1989.
[39] Gallagher, K. B. Using program slicing in software maintenance. PhD thesis, University of Maryland, College Park, Maryland, Jan. 1990.
[40] Gallagher, K. B. Evaluating the surgeon's assistant: Results of a pilot study. In Proceedings of the International Conference on Software Maintenance 1992 (Nov. 1992), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 236–244.

[41] Gallagher, K. B., and Lyle, J. R. Using program slicing in software maintenance. IEEE Transactions on Software Engineering 17, 8 (Aug. 1991), 751–761.
[42] Glaser, H., Hankin, C., and Till, D. Principles of Functional Programming. Prentice Hall International, 1984.
[43] Gopal, R. Dynamic program slicing based on dependence graphs. In IEEE Conference on Software Maintenance (1991), pp. 191–200.
[44] Greibach, S. A. Theory of Program Structures: Schemes, Semantics, Verification. Springer-Verlag, Lecture Notes in Computer Science 36, 1975.
[45] Gupta, R., Harrold, M. J., and Soffa, M. L. An approach to regression testing using slicing. In Proceedings of the IEEE Conference on Software Maintenance (Orlando, Florida, USA, 1992), IEEE Computer Society Press, pp. 299–308.
[46] Harman, M., and Danicic, S. Using program slicing to simplify testing. Journal of Software Testing, Verification and Reliability 5, 3 (Sept. 1995), 143–162.
[47] Harman, M., and Danicic, S. Amorphous program slicing. In 5th IEEE International Workshop on Program Comprehension (IWPC'97) (Dearborn, Michigan, USA, May 1997), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 70–79.
[48] Harman, M., and Danicic, S. A new algorithm for slicing unstructured programs. Journal of Software Maintenance 10, 6 (1998), 415–441.
[49] Harman, M., Danicic, S., Sivagurunathan, B., Jones, B., and Sivagurunathan, Y. Cohesion metrics. In 8th International Quality Week (San Francisco, May 29th – June 2nd, 1995), Paper 3-T-2, pp. 1–14.
[50] Harman, M., Danicic, S., and Sivagurunathan, Y. Program comprehension assisted by slicing and transformation. In 1st UK workshop on program comprehension (Durham University, UK, July 1995), M. Munro, Ed.

[51] Harman, M., Danicic, S., Sivagurunathan, Y., and Simpson, D. The next 700 slicing criteria. In 2nd UK workshop on program comprehension (Durham University, UK, July 1996), M. Munro, Ed.
[52] Hausler, P. A. Denotational program slicing. In 22nd Annual Hawaii International Conference on System Sciences, Volume II (Jan. 1989), pp. 486-495.
[53] Hecht, M. S. Flow Analysis of Computer Programs. Elsevier, 1977.
[54] Hoare, C. A. R. An axiomatic basis for computer programming. In Communications of the ACM (Oct. 1969), vol. 12.
[55] Hoare, C. A. R. Communicating Sequential Processes. Prentice-Hall, 1985.
[56] Horwitz, S., Prins, J., and Reps, T. On the adequacy of program dependence graphs for representing programs. In Proceedings of the 15th Annual ACM Symposium on the Principles of Programming Languages (Jan. 1988).
[57] Horwitz, S., Prins, J., and Reps, T. Integrating non-interfering versions of programs. ACM Transactions on Programming Languages and Systems 11, 3 (July 1989), 345-387.
[58] Horwitz, S., and Reps, T. Wisconsin program slicing project. URL http://www.cs.wisc.edu/wpis/html/.
[59] Horwitz, S., Reps, T., and Binkley, D. Interprocedural slicing using dependence graphs. In ACM SIGPLAN Conference on Programming Language Design and Implementation (Atlanta, Georgia, June 1988), pp. 25-46. Proceedings in SIGPLAN Notices, 23(7), pp. 35-46, 1988.
[60] Horwitz, S., Reps, T., and Binkley, D. Interprocedural slicing using dependence graphs. ACM Transactions on Programming Languages and Systems 12, 1 (1990), 26-61.
[61] Howden, W. E. Symbolic Testing and the DISSECT Symbolic Evaluation System. IEEE Trans. Softw. Eng. SE-3, 4 (July 1977), 266-278.
[62] Huang, J. C. Instrumenting Programs for Symbolic-trace Generation. Computer 13, 12 (Dec. 1980), 17-23.

[63] Hudak, P., Jones, S. P., and Wadler, P. Report on the programming language Haskell, a non-strict purely functional language (version 1.2). ACM SIGPLAN Notices 27 (May 1992).
[64] Kahn, G. The semantics of a simple language for parallel programming. In IFIP Congress 74 (1974), North-Holland, Amsterdam.
[65] Kahn, G., and MacQueen, D. B. Coroutines and networks of parallel processes. In IFIP Congress 77 (1977), North-Holland, Amsterdam, pp. 993-998.
[66] Kamkar, M. Interprocedural dynamic slicing with applications to debugging and testing. PhD Thesis, Department of Computer Science and Information Science, Linköping University, Sweden, 1993. Available as Linköping Studies in Science and Technology, Dissertations, Number 297.
[67] Kamkar, M. Application of program slicing in algorithmic debugging. In Journal of Information and Software Technology Special Issue on Program Slicing, M. Harman and K. Gallagher, Eds., vol. 40. Elsevier, 1998. to appear.
[68] Kamkar, M., Shahmehri, N., and Fritzson, P. Interprocedural dynamic slicing. In Proceedings of the 4th Conference on Programming Language Implementation and Logic Programming (1992), pp. 370-384.
[69] King, J. Symbolic execution and program testing. Communications of the ACM 19, 7 (1976), 385-394.
[70] Korel, B. Computation of dynamic slices for programs with arbitrary control flow. In 2nd International Workshop on Automated Algorithmic Debugging (AADEBUG'95) (Saint-Malo, France, May 1995), M. Ducassé, Ed.
[71] Korel, B., and Laski, J. Dynamic program slicing. Information Processing Letters 29, 3 (Oct. 1988), 155-163.
[72] Korel, B., and Rilling, J. Dynamic program slicing methods. In Journal of Information and Software Technology Special Issue on Program Slicing, M. Harman and K. Gallagher, Eds., vol. 40. Elsevier, 1998. to appear.

[73] Krinke, J., and Snelting, G. Validation of measurement software as an application of slicing and constraint solving. In Journal of Information and Software Technology Special Issue on Program Slicing, M. Harman and K. Gallagher, Eds., vol. 40. Elsevier, 1998. to appear.
[74] Lakhotia, A. Rule-based approach to computing module cohesion. In Proceedings of the 15th Conference on Software Engineering (ICSE-15) (1993), pp. 34-44.
[75] Liu, L., and Ellis, R. An approach to eliminating COMMON blocks and deriving ADTs from Fortran programs. Technical report, University of Westminster, UK, Feb. 1993.
[76] Longworth, H. D. Slice-based program metrics. Master's thesis, Michigan Technological University, 1985.
[77] Lyle, J. R., and Weiser, M. Automatic program bug location by program slicing. In 2nd International Conference on Computers and Applications (Peking, 1987), Institute of Electric and Electronic Engineers, pp. 877-882.
[78] Manna, Z. Mathematical Theory of Computation. McGraw-Hill, 1974.
[79] Ott, L. M. Using slice profiles and metrics during software maintenance. In Proceedings of the 10th Annual Software Reliability Symposium (1992), pp. 16-23.
[80] Ott, L. M., and Bieman, J. M. Program slices as an abstraction for cohesion measurement, 1998. to appear.
[81] Ott, L. M., and Thuss, J. J. Slice based metrics for estimating cohesion. In Proceedings of the IEEE-CS International Metrics Symposium (Baltimore, Maryland, USA, May 1993), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 71-81.
[82] Ottenstein, K. J., and Ottenstein, L. M. The program dependence graph in software development environments. SIGPLAN Notices 19, 5 (1984), 177-184.
[83] Reps, T., and Yang, W. The semantics of program slicing. Tech. Rep. Technical Report 777, University of Wisconsin, 1988.

[84] Shahmehri, N. Generalized algorithmic debugging. PhD Thesis, Department of Computer Science and Information Science, Linköping University, Sweden, 1991. Available as Linköping Studies in Science and Technology, Dissertations, Number 260.
[85] Simpson, D., Valentine, S. H., Mitchell, R., Liu, L., and Ellis, R. Recoup - Maintaining Fortran. ACM Fortran forum 12, 3 (Sept. 1993), 26-32.
[86] Sivagurunathan, Y., Harman, M., and Danicic, S. Slicing, I/O and the implicit state. In 3rd International Workshop on Automated Debugging (AADEBUG'97) (Linköping, Sweden, May 1997), M. Kamkar, Ed., pp. 59-65.
[87] Snelting, G. Combining slicing and constraint solving for validation of measurement software. In Static Analysis Symposium (SAS'96), LNCS 1145 (1996), pp. 332-348.
[88] Stoy, J. E. Denotational semantics: The Scott-Strachey approach to programming language theory. MIT Press, 1985. Third edition.
[89] Tip, F. A survey of program slicing techniques. Journal of Programming Languages 3, 3 (Sept. 1995), 121-189.
[90] van der Waerden, B. L. Moderne Algebra. Ungar Publishing, 1943.
[91] Venkatesh, G. A. The semantic approach to program slicing. In ACM SIGPLAN Conference on Programming Language Design and Implementation (Toronto, Canada, June 1991), pp. 26-28. Proceedings in SIGPLAN Notices, 26(6), pp. 107-119, 1991.
[92] Weiser, M. Program slices: Formal, psychological, and practical investigations of an automatic program abstraction method. PhD thesis, University of Michigan, Ann Arbor, MI, 1979.
[93] Weiser, M. Program slicing. IEEE Transactions on Software Engineering 10, 4 (1984), 352-357.
[94] Weiser, M., and Lyle, J. R. Experiments on slicing-based debugging aids. Empirical studies of programmers, Soloway and Iyengar (eds.). Molex, 1985, ch. 12, pp. 187-197.
[95] Wilde, N., and Huitt, R. Maintenance support for object-oriented programs. IEEE Transactions on Software Engineering 18, 12 (1992), 1038-1044.

[96] Woodward, M. R., and Allen, S. P. Slicing algebraic specifications. Information and Software technology 40, 2 (1998), 105-118.
[97] Zhao, J., Cheng, J., and Ushijima, K. Static slicing of concurrent object-oriented programs. In 20th IEEE Annual International Computer Software and Applications Conference (COMPSAC'96) (Seoul, Korea, August 1996), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 312-320.