lecture notes


Upload: rachit-madan

Post on 14-May-2017


Page 1: Lecture Notes

Chapter 1

Untyped Lambda Calculus

1.1 Introduction

In the mid 1960s, Landin observed that a complex programming language can be understood in terms of a tiny core language capturing the essential mechanisms of programming, together with a collection of convenient derived forms whose behavior is understood by translating them to the core. Landin's core language was the lambda calculus, a formal system in which all computations are reduced to the basic operations of function definition and application. Since the 60's, the lambda calculus has seen widespread use in the specification of programming language features, language design and implementation, and the study of type systems. Its importance arises from the fact that it can be viewed simultaneously as a simple programming language in which computations can be described and as a mathematical object about which rigorous statements can be proved.

1.2 Syntax

e ::= x          (variable)
    | λx. e      (function abstraction)
    | e1 e2      (function application)
    | (e)        (bracketed expression)

Function application is left associative (e.g., e1 e2 e3 ≡ (e1 e2) e3). The scope of λ extends as far to the right as possible (e.g., λx. x λy. x y ≡ λx. (x λy. (x y))). A variable x is said to be free in a λ-term M when x appears at some position in M where it is not bound by an enclosing lambda abstraction on x. A λ-term with no free variables is called a closed λ-term.

Let us see some example functions in λ-calculus.

1. λx. x - Identity function

2. λx. λy. x - Selection function: takes two arguments x and y and returns the first argument x

3. λf. λx. f (f x) - A higher-order function that takes a function f and a value x as arguments and applies f to x twice

Intuitively, it is clear that the expressions λx. x and λy. y describe exactly the same function - the function that, given any argument N, returns N. The fact that we use x in one case and y in the other is of no consequence. This principle is captured by the rule of renaming of bound variables (often called α-renaming). It states that we may freely replace the bound variable x by another variable y in a lambda abstraction λx. M as long as y is not among the free variables of M. The side condition is needed because, for example, we do not want to consider λx. y and λy. y to be the same function. Formally, λx. M ≡ λy. [x ↦ y]M.

Taking the above argument further, De Bruijn gave a way to remove all the variables from a closed λ-term. Let us define the De Bruijn index of a variable occurrence to be the number of λ's that separate the occurrence from its binder. If the variable occurrences are replaced with their corresponding De Bruijn indices, then we say the λ-term is in De Bruijn notation. For example, λx. λy. x y ≡ λ. λ. 1 0, λx. λx. x ≡ λ. λ. 0, and λz. λy. y ≡ λ. λ. 0. Note that if two closed λ-terms are α-renaming equivalent, then they have the same De Bruijn notation.
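The conversion to De Bruijn notation can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of terms (`("var", x)`, `("lam", x, body)`, `("app", f, a)`) and the name `to_de_bruijn` are assumptions made here, not part of the notes.

```python
# Named terms are tuples (an assumed encoding, chosen for this sketch):
#   ("var", x), ("lam", x, body), ("app", f, a)

def to_de_bruijn(term, binders=()):
    """Render a closed named term in De Bruijn notation as a string."""
    kind = term[0]
    if kind == "var":
        # binders[0] is the innermost enclosing lambda, so the position of the
        # first match counts the lambdas between the occurrence and its binder.
        return str(binders.index(term[1]))
    if kind == "lam":
        return "λ. " + to_de_bruijn(term[2], (term[1],) + binders)

    # Application: parenthesise non-variable subterms to keep output unambiguous.
    def wrap(t):
        s = to_de_bruijn(t, binders)
        return s if t[0] == "var" else "(" + s + ")"

    return wrap(term[1]) + " " + wrap(term[2])


x, y = ("var", "x"), ("var", "y")
print(to_de_bruijn(("lam", "x", ("lam", "y", ("app", x, y)))))
```

Two closed terms can then be compared for α-equivalence by comparing the strings this function returns.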

A closed λ-term is also called a combinator. Some of the standard combinators are

I = λx. x

K = λx. λy. x

S = λf. λg. λx. f x (g x)

D = λx. x x

Y = λf. (λx. f(x x)) (λx. f(x x))

Any combinator can be equivalently expressed as one which uses only S, K and I.

1.3 Semantics

In its pure form, the lambda calculus has no built-in constants or operators - no numbers, arithmetic operations, loops, records, sequencing, I/O etc. The sole means by which expressions compute is the application of functions to arguments, which is captured formally by the β-reduction rule

(λx. e1) e2 →β [x ↦ e2] e1

This rule says that an expression can be reduced by replacing some subexpression of the form (λx. M) N, called a redex, by the result of substituting the argument N for the bound variable x in the body M. For example, (λx. x y) (u v) reduces to u v y, while (λx. λy. x) z w reduces to (λy. z) w, which reduces to z. We write (λx. λy. x) z w →∗β z to show that the first expression reduces to the second by some sequence of reduction steps. For the sake of simplicity, we use → instead of →β in the sequel.

The notion of reduction gives rise to a natural definition of what it means for two expressions to be the same modulo reduction. M and N are β-convertible, written M =β N, if they are identical, or if one reduces to the other, or if they are each convertible to some third expression L. The relation =β is the reflexive, symmetric and transitive closure of →. The problem of checking whether e =β e′ is undecidable. Again, we omit β for ease of reading.

To make the notions of β-reduction and conversion completely precise, we must define substitution formally. Substituting an expression N for x in an expression consisting only of x itself yields N. Substituting N for x in an expression consisting only of a different variable z yields z. To substitute N for x in an expression consisting of an application M L, we substitute in M and L separately. To substitute N for x in an expression consisting of a lambda abstraction λz. M, we substitute in the body M: however, we do this only when z is not the same as x and z is not one of the free variables of N. The first part of this condition ensures that we do not allow nonsensical reductions like (λx. λx. x) y → λx. y, where x is replaced by y even though it is actually bound by an inner abstraction, not the one being reduced. The second prevents a similar kind of mistake where free variables of N are captured by abstractions inside a term being substituted into, resulting in reductions like (λx. λy. x) y → λy. y. Thus, strictly speaking, the substitution [x ↦ N]M is undefined for some values of N, x, and M. But it is always possible to change the names of bound variables in N and/or M so that the substitution makes sense.
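A minimal Python sketch of substitution along these lines is given below. It makes two choices that go slightly beyond the text and should be read as assumptions: when the bound variable equals x the abstraction is returned unchanged (x is shadowed), and when the bound variable is free in N it is α-renamed to a fresh name instead of leaving the substitution undefined. The tuple encoding and the names `free_vars`, `fresh` and `subst` are illustrative.

```python
import itertools

# Assumed term encoding: ("var", x), ("lam", x, body), ("app", f, a)

def free_vars(t):
    """Set of free variable names of a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def fresh(avoid):
    """First name among v0, v1, ... that is not in `avoid`."""
    return next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)

def subst(x, n, m):
    """Compute [x ↦ n]m, renaming bound variables of m when needed."""
    if m[0] == "var":
        return n if m[1] == x else m
    if m[0] == "app":
        return ("app", subst(x, n, m[1]), subst(x, n, m[2]))
    z, body = m[1], m[2]
    if z == x:
        return m  # x is rebound here, so no occurrence below refers to our x
    if z in free_vars(n):
        # α-rename z so that it cannot capture a free variable of n
        z2 = fresh(free_vars(n) | free_vars(body) | {x})
        body = subst(z, ("var", z2), body)
        z = z2
    return ("lam", z, subst(x, n, body))
```

With this definition, substituting y for x in λy. x yields λv0. y instead of the incorrect λy. y.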

A lambda-expression containing more than one redex can be reduced in more than one way, leading, in general, to different results. For example, K I (K True False) reduces (counting the application of K to both of its arguments as one step) to either I or K I True (remember the combinators K and I defined earlier). Here, we can further reduce the second result, yielding I again: but we might worry that there could be some M such that M could be reduced to either N1 or N2 in such a way that N1 and N2 could never be brought back together by reducing them further. The following theorem says that reduction in the lambda calculus is confluent, i.e.,

Theorem [Church-Rosser] If M reduces to two different expressions N1 and N2, then these further reduce to some common expression L. (In symbols: if M →∗ N1 and M →∗ N2, then there is some L such that N1 →∗ L and N2 →∗ L.)

The above property is called the diamond property. From the theorem above, we understand that →∗ satisfies the diamond property. However, note that the diamond property is not satisfied by → (one-step reduction).

A closed term without redexes is said to be in normal form. One implication of the Church-Rosser theorem is that a term has at most one normal form, independent of the reduction strategy. However, some reduction strategies might fail to find a normal form. For example, consider (λx. y) ((λy. y y) (λy. y y)). A strategy that always reduces the argument first loops forever, (λx. y) ((λy. y y) (λy. y y)) → (λx. y) ((λy. y y) (λy. y y)) → · · · , whereas reducing the outer redex yields the normal form y in one step.

A reduction strategy is a rule specifying which redex should be reduced first. There are several common reduction strategies. The normal-order reduction strategy always picks the leftmost outermost redex and reduces it, including redexes inside a lambda abstraction. A result about the normal-order reduction strategy is that if e has normal form e′, then normal-order reduction will reduce e to e′. A call-by-name reduction strategy does not evaluate the argument of a function before the call; arguments are evaluated lazily, only when they are needed. It does not reduce the expression inside a lambda abstraction. This strategy is demand driven, as expressions are not evaluated unless they are needed. A call-by-value reduction strategy evaluates the argument of a function first. It does not reduce the expression inside a lambda abstraction.

1.4 Programming in λ-calculus

Lambda calculus is expressive enough to encode Turing machines. Let us try to encode some of the basic language features.

True := λt. λf. t

False := λt. λf. f

Both of these expressions are combinators. This means that they are inert with respect to substitutions. We can define a combinator If using the above combinators with the property that If L M N reduces to M when L is True and reduces to N when L is False.

If := λl. λm. λn. l m n

The If combinator does not actually do much: If L M N means just L M N. In effect, the boolean value L itself is the conditional: it takes two arguments and chooses the first (if it is True) or the second (if it is False). For example, the expression If True M N reduces as follows:

If True M N = (λl. λm. λn. l m n) True M N

→ (λm. λn. True m n) M N

→ (λn. True M n) N

→ True M N

= (λt. λf. t) M N

→ (λf. M) N

→ M
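The boolean encoding can be tried out directly with Python lambdas. The sketch below is only illustrative: the upper-case names are chosen here, and since Python evaluates arguments eagerly, both branches are computed before the boolean selects one.

```python
# Church booleans as curried Python functions (names are illustrative).
TRUE = lambda t: lambda f: t
FALSE = lambda t: lambda f: f
IF = lambda l: lambda m: lambda n: l(m)(n)

print(IF(TRUE)("M")("N"))   # selects the first branch
print(IF(FALSE)("M")("N"))  # selects the second branch
```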

The encoding of numbers as lambda-expressions is only slightly more intricate. Define the Church numerals C0, C1, C2, · · · as follows:

C0 = λz. λs. z

C1 = λz. λs. s z

C2 = λz. λs. s (s z)

...

Cn = λz. λs. s^n z

That is, each number n is represented by a combinator Cn that takes two arguments, z and s (zero and successor), and applies n copies of s to z. As with booleans, this encoding makes numbers into active entities: the number n is represented by a function that does something n times - a kind of active unary numeral.

We can define some common arithmetic operations on Church numerals as follows:

Plus = λm. λn. λz. λs. m (n z s) s

Times = λm. λn. m C0 (Plus n)

Here, Plus is a combinator that takes two Church numerals, m and n, as arguments, and yields another Church numeral - i.e., a function that accepts arguments z and s, applies s iterated n times to z, and then applies s iterated m more times to the result. It is easy to check, for example, that Plus C2 C5 = C7. The definition of Times uses another trick: since Plus takes its arguments one at a time, applying it to just one argument n yields the function that adds n to whatever argument it is given. Passing this function as the second argument to m and zero as the first argument means "apply the function that adds n to its argument, iterated m times, to zero," i.e., "add together m copies of n".

To test whether a Church numeral is zero, we must give it a pair of arguments Z and S such that applying S to Z one or more times yields False, while not applying it at all yields True. Clearly, we can take Z to be just True. As for S, we use a function that throws away its argument and always returns False.

IsZero = λm. m True (λx. False)
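The arithmetic definitions above can be checked mechanically with Python lambdas, keeping the same argument order (z first, then s). The helpers `church` and `to_int`, which convert between Python integers and Church numerals, are assumptions added for this sketch and are not part of the encoding itself.

```python
TRUE = lambda t: lambda f: t
FALSE = lambda t: lambda f: f

C0 = lambda z: lambda s: z
PLUS = lambda m: lambda n: lambda z: lambda s: m(n(z)(s))(s)
TIMES = lambda m: lambda n: m(C0)(PLUS(n))
IS_ZERO = lambda m: m(TRUE)(lambda x: FALSE)

def church(k):
    """Build the Church numeral for k (helper for this sketch only)."""
    def numeral(z):
        def with_successor(s):
            acc = z
            for _ in range(k):
                acc = s(acc)  # apply s exactly k times
            return acc
        return with_successor
    return numeral

def to_int(n):
    """Read a Church numeral back by instantiating z = 0 and s = (+1)."""
    return n(0)(lambda x: x + 1)

print(to_int(PLUS(church(2))(church(5))))   # 7
print(to_int(TIMES(church(3))(church(4))))  # 12
```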

Other common datatypes like lists, trees, arrays, and variant records can be encoded using similar techniques. An important operator is the Fix operator, which can be used to define recursive functions.

Fix = λf. (λx. f (λy. x x y))(λx. f (λy. x x y))

Note that it has the property that Fix F → F (Fix F). Now, suppose we want to write a recursive function definition of the form F = 〈body containing F〉. The intention is that the recursive definition should be "unrolled" at the point where it occurs: for example, the definition of Factorial would intuitively be written

if n = 0 then 1
else n · (if n − 1 = 0 then 1
          else (n − 1) · (if n − 2 = 0 then 1
                          else (n − 2) · ...))

This effect can be achieved by defining G = λf. 〈body containing f〉 and F = Fix G, since then

F = Fix G
  = G (Fix G)
  = 〈body containing (Fix G)〉
  = 〈body containing 〈body containing (Fix G)〉〉
  ...
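The unrolling behaviour of Fix can be observed in Python, which, like the call-by-value strategy, evaluates arguments eagerly. The sketch below transcribes Fix literally but, as a simplifying assumption, writes the body of G with native integers and Python's conditional expression instead of Church numerals and If.

```python
# Fix = λf. (λx. f (λy. x x y)) (λx. f (λy. x x y)), transcribed literally.
FIX = lambda f: (lambda x: f(lambda y: x(x)(y)))(lambda x: f(lambda y: x(x)(y)))

# G = λfact. <body containing fact>; native ints stand in for Church numerals.
G = lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1)

factorial = FIX(G)
print(factorial(5))  # 120
```

The inner `lambda y: x(x)(y)` delays the self-application until the recursive call is actually made, which is what keeps `FIX(G)` from looping under eager evaluation.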

For example, if we define the factorial function by

Fact = λfact. λn. If (IsZero n) C1 (Times n (fact (Pred n)))

Factorial = Fix Fact

then applying Factorial to the Church numeral for 2 leads to the following calculation.

1.5 Exercises

1. Can all closed λ-terms be reduced to normal forms? Justify.

2. Reduce (λy. (λx. x) y) ((λu. u) (λv. v)) using call by name and call by value strategies.

3. Encode Pred and Subtract function in λ-calculus.


Chapter 2

Simply typed Lambda Calculus

2.1 Introduction

Simply typed lambda calculus is a refinement of untyped lambda calculus whose only type constructor is → (function type). This makes it the canonical and simplest example of a typed lambda calculus.

There are λ-expressions which cannot be reduced further and yet are not values (valid functions), for example x (λx. x). Such a term is called a stuck term. Stuck terms correspond to meaningless or erroneous programs. We would therefore like to be able to tell, without actually evaluating a term, that its evaluation will definitely not get stuck. Towards this objective we introduce typing to the lambda calculus.

2.2 Typed Arithmetic Expressions

Let us consider a simpler language of arithmetic expressions before introducing typing to the lambda calculus. The syntax is

t ::=                        terms:
      true                   constant true
      false                  constant false
      if t then t else t     conditional
      0                      constant zero
      succ t                 successor
      pred t                 predecessor
      iszero t               zero test

Evaluating a term can either result in a value or else get stuck at some stage. The values are defined to be

v ::=                        values:
      true                   true value
      false                  false value
      nv                     numeric value

nv ::=                       numeric values:
      0                      zero value
      succ nv                successor value

The operational semantics is given as

if true then t2 else t3 → t2    (E-IFTRUE)

if false then t2 else t3 → t3    (E-IFFALSE)

t1 → t1′
------------------------------------------------    (E-IF)
if t1 then t2 else t3 → if t1′ then t2 else t3

t → t′
------------------    (E-SUCC)
succ t → succ t′

t → t′
------------------    (E-PRED)
pred t → pred t′

pred 0 → 0    (E-PREDZ)

pred (succ nv) → nv    (E-PREDSUCC)

iszero 0 → true    (E-ISZEROZERO)

iszero (succ nv) → false    (E-ISZEROSUCC)

t → t′
----------------------    (E-ISZERO)
iszero t → iszero t′
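The evaluation rules translate almost line by line into a small-step evaluator. The following Python sketch assumes an encoding chosen here for illustration: the constants are the strings "true", "false" and "0", and compound terms are tuples such as ("succ", t) or ("if", t1, t2, t3). The function `step` returns None when no rule applies.

```python
def is_numeric(t):
    """nv ::= 0 | succ nv"""
    return t == "0" or (isinstance(t, tuple) and t[0] == "succ" and is_numeric(t[1]))

def is_value(t):
    return t in ("true", "false") or is_numeric(t)

def step(t):
    """Return t' such that t → t', or None if no evaluation rule applies."""
    if not isinstance(t, tuple):
        return None                      # constants do not step
    op = t[0]
    if op == "if":
        _, t1, t2, t3 = t
        if t1 == "true":
            return t2                    # E-IFTRUE
        if t1 == "false":
            return t3                    # E-IFFALSE
        t1p = step(t1)                   # E-IF
        return None if t1p is None else ("if", t1p, t2, t3)
    arg = t[1]
    if op == "pred" and arg == "0":
        return "0"                       # E-PREDZ
    if op == "pred" and is_numeric(arg):
        return arg[1]                    # E-PREDSUCC (arg is succ nv)
    if op == "iszero" and arg == "0":
        return "true"                    # E-ISZEROZERO
    if op == "iszero" and is_numeric(arg):
        return "false"                   # E-ISZEROSUCC
    argp = step(arg)                     # E-SUCC, E-PRED, E-ISZERO
    return None if argp is None else (op, argp)

def evaluate(t):
    """Apply step until no rule applies; the result is a value or a stuck term."""
    while (nxt := step(t)) is not None:
        t = nxt
    return t
```

A term is stuck exactly when `evaluate` returns something for which `is_value` is false, for example `("succ", "true")`.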

Some of the examples of bad programs are

• succ true

• iszero false

• if (succ 0) then true else false

The goal is to come up with some static verification technique to rule out such bad programs. We introduce types to that effect. We need to be able to distinguish between terms whose result will be a numerical value (since these are the only ones that should appear as arguments to pred, succ and iszero) and terms whose result will be a boolean (since only these should appear as the guard of a conditional). We introduce two types, Nat and Bool, for classifying terms in this way.

T ::= Bool | Nat


Saying that a term t has type T (written t : T) means that t evaluates to a value of the appropriate form. For example, the term if true then 0 else succ 0 has type Nat. Formally, the typing rules are

true : Bool    (T-TRUE)

false : Bool    (T-FALSE)

t1 : Bool    t2 : T    t3 : T
------------------------------    (T-IF)
if t1 then t2 else t3 : T

0 : Nat    (T-ZERO)

t : Nat
--------------    (T-SUCC)
succ t : Nat

t : Nat
--------------    (T-PRED)
pred t : Nat

t : Nat
------------------    (T-ISZERO)
iszero t : Bool

A term t is said to be well-typed if there is some type T such that t : T can be derived from the inference rules above.
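Since each syntactic form has exactly one typing rule, the rules can be read as a recursive checking function. The Python sketch below assumes well-formed input in an encoding chosen here for illustration (strings "true", "false", "0" and tuples such as ("succ", t) and ("if", t1, t2, t3)); it returns None for ill-typed terms.

```python
def type_of(t):
    """Return "Bool" or "Nat" if t is well-typed, and None otherwise."""
    if t in ("true", "false"):
        return "Bool"                            # T-TRUE, T-FALSE
    if t == "0":
        return "Nat"                             # T-ZERO
    op = t[0]
    if op == "if":                               # T-IF
        if type_of(t[1]) != "Bool":
            return None
        ty2, ty3 = type_of(t[2]), type_of(t[3])
        return ty2 if ty2 is not None and ty2 == ty3 else None
    if type_of(t[1]) != "Nat":                   # succ, pred, iszero need a Nat
        return None
    return "Bool" if op == "iszero" else "Nat"   # T-ISZERO / T-SUCC, T-PRED

print(type_of(("if", "true", "0", ("succ", "0"))))  # Nat
print(type_of(("succ", "true")))                    # None
```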

The most basic property of a type system is safety (also called soundness): well-typed terms do not get stuck. We show this in two steps, commonly known as the progress and preservation theorems.

Progress: A well-typed term is not stuck (either it is a value or it can take a step according to the operational semantics).

Preservation: If a well-typed term takes a step of evaluation, then the resulting term is also well-typed.

These properties together tell us that a well-typed term can never reach a stuck state. We will use the following lemma for the proof of the progress theorem.

Lemma 1 (Canonical Forms).

1. If v is a value of type Bool, then v is either true or false.

2. If v is a value of type Nat, then v is a numeric value.

Theorem 1 (Progress). Suppose t is a well-typed term (i.e., t : T for some T). Then either t is a value or else there is some t′ with t → t′.

We prove a stronger version of the Preservation theorem, i.e.,

Theorem 2 (Preservation). If t : T and t → t′, then t′ : T.

All the proofs use induction on the length of the derivation. Refer to the book (given in the reference) for the proofs.

2.3 Typed Lambda Calculus

We construct a type system for a language combining booleans with the primitives of the pure lambda calculus. In order to assign a type to an abstraction λx. e, we need to calculate what will happen when the abstraction is applied to some argument. The next question that arises is: how do we know what type of arguments to expect? There are two possible responses: either we can simply annotate the λ-abstraction with the intended type of its arguments, or else we can analyze the body of the abstraction to see how the argument is used and try to deduce what type it should have. For now, we choose the first alternative. Instead of just λx. e, we write λx : T. e, where the annotation on the bound variable tells us to assume that the argument will be of type T. Formally, the syntax is

e ::= x                        (variable)
    | λx : T. e                (function abstraction)
    | e1 e2                    (function application)
    | true                     (true value)
    | false                    (false value)
    | if e1 then e2 else e3    (conditional)

The values are true, false and λx : T. e. The types are defined by the following grammar.

T ::= Bool
    | T → T

The operational semantics is the same as the untyped one except for the type annotation on the argument of the function abstraction. Once we know the type of the argument to the abstraction, it is clear that the type of the function's result will be just the type of the body e, where occurrences of x in e are assumed to denote terms of type T1. This intuition is captured by the following typing rule:

x : T1 ⊢ e : T2
----------------------------
⊢ (λx : T1. e) : T1 → T2

Since terms may contain nested λ-abstractions, we need to talk about several such assumptions. This changes the typing relation from a two-place relation, t : T, to a three-place relation, Γ ⊢ e : T, where Γ is a set of assumptions about the types of the free variables in e. Formally, the typing rules are

Γ ⊢ true : Bool    (T-TRUE)

Γ ⊢ false : Bool    (T-FALSE)

Γ ⊢ e1 : Bool    Γ ⊢ e2 : T    Γ ⊢ e3 : T
------------------------------------------    (T-IF)
Γ ⊢ if e1 then e2 else e3 : T

x : T ∈ Γ
------------    (T-VAR)
Γ ⊢ x : T

Γ, x : T1 ⊢ e : T2
------------------------------    (T-ABS)
Γ ⊢ λx : T1. e : T1 → T2

Γ ⊢ e1 : T1 → T2    Γ ⊢ e2 : T1
--------------------------------    (T-APP)
Γ ⊢ e1 e2 : T2
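The rules above are syntax directed, so they can be turned into a recursive typechecker. The Python sketch below is one possible reading, under an encoding assumed here: types are "Bool" or ("->", T1, T2), terms are "true", "false", ("var", x), ("lam", x, T, e), ("app", e1, e2) and ("if", e1, e2, e3), and the context Γ is a dictionary from variable names to types.

```python
def type_of(ctx, e):
    """Return T such that ctx ⊢ e : T, or raise TypeError if there is none."""
    if e in ("true", "false"):
        return "Bool"                                    # T-TRUE, T-FALSE
    tag = e[0]
    if tag == "var":                                     # T-VAR
        if e[1] not in ctx:
            raise TypeError(f"unbound variable {e[1]}")
        return ctx[e[1]]
    if tag == "lam":                                     # T-ABS
        _, x, t1, body = e
        return ("->", t1, type_of({**ctx, x: t1}, body))
    if tag == "app":                                     # T-APP
        tf, ta = type_of(ctx, e[1]), type_of(ctx, e[2])
        if not (isinstance(tf, tuple) and tf[0] == "->" and tf[1] == ta):
            raise TypeError("argument type does not match parameter type")
        return tf[2]
    if tag == "if":                                      # T-IF
        if type_of(ctx, e[1]) != "Bool":
            raise TypeError("guard of conditional is not Bool")
        t2, t3 = type_of(ctx, e[2]), type_of(ctx, e[3])
        if t2 != t3:
            raise TypeError("branches of conditional have different types")
        return t2
    raise TypeError(f"unknown term {e!r}")

ident = ("lam", "x", "Bool", ("var", "x"))
print(type_of({}, ident))                   # ('->', 'Bool', 'Bool')
print(type_of({}, ("app", ident, "true")))  # Bool
```

The stuck term x (λx : Bool. x) is rejected in the empty context because x has no assumption in Γ.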

Refer to the book for the proof of the type-safety theorem.

2.4 Exercises

1. Prove that each well-typed term in the typed arithmetic language has at most one type and, if it has a type, there is exactly one derivation for it (Uniqueness of Types theorem).

2. Is the type system for the language of arithmetic expressions complete? Justify.

3. Prove the Uniqueness of Types theorem for simply typed lambda calculus.

Chapter 3

Type Inference

3.1 Introduction

The typechecking algorithms for the calculi we have seen so far all depend on explicit type annotations - in particular, they require that lambda abstractions be annotated with their argument types. In this lecture, we develop a more powerful type inference algorithm, capable of calculating a type for a term in which some or all of these annotations are left unspecified. Related algorithms lie at the heart of languages like ML and Haskell.

3.2 Constraint Based Typing

The approach is to introduce type variables, perform the type derivation with these type variables while collecting constraints, and then solve the constraints to infer the type. There can be many solutions to the constraints. We illustrate the approach to get the most general solution.

Let us consider an example, λf. λa. f (f a). Let us annotate it with type variables: (λf : X. λa : Y. f (f a)) : Z. One of the solutions is X ↦ Bool → Bool, Y ↦ Bool, Z ↦ (Bool → Bool) → Bool → Bool. However, the solution X ↦ (Y → Y), Z ↦ (Y → Y) → Y → Y is more general than the first solution. These solutions are called type substitutions.

Formally, a type substitution σ is a finite mapping from type variables to types. Application of a substitution to a type is defined as

σ(X) = T    if (X ↦ T) ∈ σ
σ(X) = X    if X is not in the domain of σ
σ(Bool) = Bool
σ(T1 → T2) = σ(T1) → σ(T2)

The composition of two substitutions σ and γ is defined as

σ · γ = { X ↦ σ(T)    for each (X ↦ T) ∈ γ
          X ↦ T       for each (X ↦ T) ∈ σ with X ∉ dom(γ) }

The substitution σ is said to be more general than σ′ (written as σ ⊑ σ′) if σ′ = γ · σ for some substitution γ. A substitution more general than all substitutions (solutions) is called the most general substitution.
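Application and composition of substitutions can be sketched in Python as follows. The encoding is an assumption made for illustration: type variables are ("tvar", "X"), function types are ("->", T1, T2), Bool is the string "Bool", and a substitution is a dictionary from variable names to types.

```python
def apply(sigma, ty):
    """Apply the substitution sigma to the type ty."""
    if ty == "Bool":
        return ty
    if ty[0] == "tvar":
        return sigma.get(ty[1], ty)      # unchanged if not in dom(sigma)
    return ("->", apply(sigma, ty[1]), apply(sigma, ty[2]))

def compose(sigma, gamma):
    """sigma · gamma: behaves like applying gamma first and then sigma."""
    out = {x: apply(sigma, t) for x, t in gamma.items()}
    out.update({x: t for x, t in sigma.items() if x not in gamma})
    return out

X, Y = ("tvar", "X"), ("tvar", "Y")
general = {"X": ("->", Y, Y)}            # X ↦ Y → Y
specialise = {"Y": "Bool"}               # Y ↦ Bool
print(compose(specialise, general))
```

Here `compose(specialise, general)` maps X to Bool → Bool, which shows in the sense of the definition above that `general` is more general than the all-Bool solution.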

The constraint typing relation is denoted by Γ ⊢ t : T |χ C. Informally, it can be read as "term t has type T under assumptions Γ whenever constraints C are satisfied by the type variables in T". The χ subscript is used to track the type variables introduced in each subderivation and make sure that the fresh variables created in different subderivations are actually distinct.

The constraint typing rules are

Γ ⊢ true : Bool |∅ ∅    (CT-TRUE)

Γ ⊢ false : Bool |∅ ∅    (CT-FALSE)

Γ ⊢ e1 : T1 |χ1 C1    Γ ⊢ e2 : T2 |χ2 C2    Γ ⊢ e3 : T3 |χ3 C3
χ1, χ2, χ3 mutually disjoint
---------------------------------------------------------------------------------    (CT-IF)
Γ ⊢ if e1 then e2 else e3 : T2 |χ1∪χ2∪χ3 C1 ∪ C2 ∪ C3 ∪ {T1 = Bool, T2 = T3}

x : T ∈ Γ
------------------    (CT-VAR)
Γ ⊢ x : T |∅ ∅

Γ, x : T1 ⊢ e : T2 |χ C
----------------------------------    (CT-ABS)
Γ ⊢ λx : T1. e : T1 → T2 |χ C

Γ ⊢ e1 : T1 |χ1 C1    Γ ⊢ e2 : T2 |χ2 C2
χ1 ∩ χ2 = χ1 ∩ FV(T2) = χ2 ∩ FV(T1) = ∅
X ∉ χ1, χ2, T1, T2, C1, C2, Γ
------------------------------------------------------    (CT-APP)
Γ ⊢ e1 e2 : X |χ1∪χ2∪{X} C1 ∪ C2 ∪ {T1 = T2 → X}

Consider the example λx. λy. λz. (x z) (y z). Let us annotate the bound variables with type variables, λx : X. λy : Y. λz : Z. (x z) (y z), and write S for the type of the body. We derive the constraints on these type variables as shown below. (The set of variables is left out for ease of reading.)

x : X, z : Z ⊢ x z : A | {X = Z → A}

y : Y, z : Z ⊢ y z : B | {Y = Z → B}

x : X, y : Y, z : Z ⊢ (x z) (y z) : S | {X = Z → A, Y = Z → B, A = B → S}

⊢ λx : X. λy : Y. λz : Z. (x z) (y z) : X → Y → Z → S | {X = Z → A, Y = Z → B, A = B → S}

3.3 Unification

To calculate solutions to constraint sets, we use the idea, due to Hindley (1969) and Milner (1978), of using unification (Robinson, 1971) to check that the set of solutions is nonempty and, if so, to find a most general solution, in the sense that all solutions can be generated straightforwardly from this one.

Algorithm 1: Unification Algorithm

Input: C, a finite set of constraints
Output: most general type substitution

unify(C) =
  if C = ∅ then
    ∅
  else
    let {S = T} ∪ C′ = C in
    if S = T then
      unify(C′)
    else if S = X and X ∉ FV(T) then
      unify([X ↦ T]C′) · [X ↦ T]
    else if T = X and X ∉ FV(S) then
      unify([X ↦ S]C′) · [X ↦ S]
    else if S = S1 → S2 and T = T1 → T2 then
      unify(C′ ∪ {S1 = T1, S2 = T2})
    else
      FAIL

The time complexity of this algorithm is polynomial in the number of constraints. The algorithm can be made efficient by maintaining equivalence classes and using union-find.
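A direct, unoptimised transcription of the unification algorithm into Python might look as follows. It is a sketch under assumed encodings (type variables as ("tvar", "X"), function types as ("->", T1, T2), "Bool" as a string, substitutions as dictionaries, constraints as a list of pairs), and it does not use the union-find refinement mentioned above. FAIL is modelled by raising TypeError.

```python
def is_var(t):
    return isinstance(t, tuple) and t[0] == "tvar"

def is_arrow(t):
    return isinstance(t, tuple) and t[0] == "->"

def ftv(t):
    """Free type variables of a type."""
    if is_var(t):
        return {t[1]}
    if is_arrow(t):
        return ftv(t[1]) | ftv(t[2])
    return set()

def apply(sigma, t):
    if is_var(t):
        return sigma.get(t[1], t)
    if is_arrow(t):
        return ("->", apply(sigma, t[1]), apply(sigma, t[2]))
    return t

def compose(sigma, gamma):
    """sigma · gamma: apply gamma first, then sigma."""
    out = {x: apply(sigma, t) for x, t in gamma.items()}
    out.update({x: t for x, t in sigma.items() if x not in gamma})
    return out

def unify(constraints):
    """Return a most general substitution solving the list of (S, T) pairs."""
    if not constraints:
        return {}
    (s, t), rest = constraints[0], list(constraints[1:])
    if s == t:
        return unify(rest)
    if is_var(s) and s[1] not in ftv(t):        # occurs check on X ∉ FV(T)
        one = {s[1]: t}
        return compose(unify([(apply(one, a), apply(one, b)) for a, b in rest]), one)
    if is_var(t) and t[1] not in ftv(s):
        one = {t[1]: s}
        return compose(unify([(apply(one, a), apply(one, b)) for a, b in rest]), one)
    if is_arrow(s) and is_arrow(t):
        return unify(rest + [(s[1], t[1]), (s[2], t[2])])
    raise TypeError(f"cannot unify {s} with {t}")

var = lambda name: ("tvar", name)
arrow = lambda a, b: ("->", a, b)
X, Y, Z, A, B, S = (var(n) for n in "XYZABS")

# Constraints of the form {X = Z → A, Y = Z → B, A = B → S}
sigma = unify([(X, arrow(Z, A)), (Y, arrow(Z, B)), (A, arrow(B, S))])
print(apply(sigma, arrow(X, arrow(Y, arrow(Z, S)))))
```

For these constraints the resulting substitution sends X to Z → B → S and Y to Z → B, so the printed type corresponds to (Z → B → S) → (Z → B) → Z → S.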

Refer to the book for the proofs of the soundness of constraint-based typing and of the correctness of the unification procedure, i.e., that it produces the most general solution.

Chapter 4

Extending STLC to model Programming language features

4.1 Introduction

The simply typed lambda-calculus (STLC) has enough structure to make its theoretical properties interesting, but it is not yet much of a programming language. In this chapter, we begin to close the gap with more familiar languages by introducing a number of familiar features. Note that type safety is preserved in all of these additions.

4.2 Unit Type and Sequencing

A unit type is a singleton type interpreted in the simplest possible way. Even in a purely functional language, the type Unit is not completely without interest, but its main application is in languages with side effects, such as assignments to reference cells. In such languages, it is often the side effect, not the result, of an expression that we care about. Unit is an appropriate result type for such expressions. In languages with side effects, it is often useful to evaluate two or more expressions in sequence. The sequencing notation t1; t2 has the effect of evaluating t1, throwing away its trivial result, and going on to evaluate t2. We need to add the following things to STLC to handle the unit type and sequencing.

Syntax    e ::= unit | e1; e2
Values    v ::= unit
Types     T ::= Unit

Semantics

e1 → e1′
--------------------
e1; e2 → e1′; e2

unit; e2 → e2

Typing Rules

Γ ⊢ unit : Unit

Γ ⊢ e1 : Unit    Γ ⊢ e2 : T2
------------------------------
Γ ⊢ e1; e2 : T2

4.3 Let binding

When writing a complex expression, it is often useful - both for avoiding repetition and for increasing readability - to give names to some of its subexpressions. Most languages provide one or more ways of doing this. In ML, for example, we write let x = t1 in t2 to mean "evaluate the expression t1 and bind the name x to the resulting value while evaluating t2".

Our let-binder follows ML's in choosing a call-by-value evaluation order, where the let-bound term must be fully evaluated before evaluation of the let-body can begin. The formal additions to STLC are

Syntax    e ::= let x = e1 in e2

Semantics

let x = v in e → [x ↦ v]e

e1 → e1′
------------------------------------------
let x = e1 in e2 → let x = e1′ in e2

Typing Rules

Γ ⊢ e1 : T1    Γ, x : T1 ⊢ e2 : T2
------------------------------------
Γ ⊢ let x = e1 in e2 : T2

4.4 Records and Variants

We introduce records and variants to STLC. Records and variants are aggregations of several items of possibly different types. The difference between records and variants is that in variants exactly one of the fields can be accessed.

Some examples are

Record

let x = {real = 5, imag = 6} in
square(x.real ∗ x.real + x.imag ∗ x.imag)

Variant

Addr = 〈physical : PhysicalAddr, virtual : VirtualAddr〉
let a = 〈physical = pa〉 as Addr;
getname = λa : Addr.
  case a of
    〈physical = x〉 ⇒ x.firstlast
  | 〈virtual = y〉 ⇒ y.name

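The formal additions to STLC for records are

Syntax    e ::= {li = ei (i∈1···n)} | e.l
Values    v ::= {li = vi (i∈1···n)}
Types     T ::= {li : Ti (i∈1···n)}

Semantics

{li = vi (i∈1···n)}.lj → vj

e1 → e1′
----------------
e1.l → e1′.l

ej → ej′
------------------------------------------------------------------------------------------------------------
{li = ei (i∈1···j−1), lj = ej, lk = ek (k∈j+1···n)} → {li = ei (i∈1···j−1), lj = ej′, lk = ek (k∈j+1···n)}

Typing Rules

for each i    Γ ⊢ ei : Ti
-----------------------------------------------
Γ ⊢ {li = ei (i∈1···n)} : {li : Ti (i∈1···n)}

Γ ⊢ e : {li : Ti (i∈1···n)}
-----------------------------
Γ ⊢ e.lj : Tj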

The formal additions to STLC for variants are

Syntax    e ::= 〈l = e〉 as T | case e of 〈li = xi〉 ⇒ ei (i∈1···n)
Types     T ::= 〈li : Ti (i∈1···n)〉

Semantics

case (〈lj = vj〉 as T) of 〈li = xi〉 ⇒ ei (i∈1···n) → [xj ↦ vj]ej

e → e′
----------------------------------------------------------------------------
case e of 〈li = xi〉 ⇒ ei (i∈1···n) → case e′ of 〈li = xi〉 ⇒ ei (i∈1···n)

e → e′
----------------------------------
〈li = e〉 as T → 〈li = e′〉 as T

Typing Rules

Γ ⊢ ej : Tj
--------------------------------------------------------------
Γ ⊢ 〈lj = ej〉 as 〈li : Ti (i∈1···n)〉 : 〈li : Ti (i∈1···n)〉

Γ ⊢ e : 〈li : Ti (i∈1···n)〉    for each i    Γ, xi : Ti ⊢ ei : T
------------------------------------------------------------------
Γ ⊢ case e of 〈li = xi〉 ⇒ ei (i∈1···n) : T

4.5 Subtyping

Unlike the features we have studied up to now, which could be formulated more or less orthogonally to each other, subtyping is a cross-cutting extension, interacting with most other language features in non-trivial ways. Subtyping is characteristically found in object-oriented languages and is often considered an essential feature of the object-oriented style. We do not explore this connection in great detail but study subtyping with just functions and records.

Consider an example: (λr : {x : Nat}. r.x) {x = 0, y = 1}. In this example, the type of the argument expected is {x : Nat}, but we are passing an argument of type {x : Nat, y : Nat}. The typing rules seen so far will reject this term as being bad. The idea of subtyping is to make these kinds of programs usable.

A type S is a subtype of T, written S <: T, if a term of type S can be safely used wherever a term of type T is expected. Formally,

Γ ⊢ t : S    S <: T
--------------------
Γ ⊢ t : T

The formal additions to STLC are

Syntax    T ::= Top

Typing Rules

S <: S

S <: U    U <: T
-----------------
S <: T

S <: Top

T1 <: S1    S2 <: T2
----------------------
S1 → S2 <: T1 → T2

Γ ⊢ e : S    S <: T
--------------------
Γ ⊢ e : T

{li : Ti (i∈1···n+k)} <: {li : Ti (i∈1···n)}

for each i    Si <: Ti
--------------------------------------------
{li : Si (i∈1···n)} <: {li : Ti (i∈1···n)}

{kj : Sj (j∈1···n)} is a permutation of {li : Ti (i∈1···n)}
-------------------------------------------------------------
{kj : Sj (j∈1···n)} <: {li : Ti (i∈1···n)}
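The subtyping rules can be combined into a single recursive check. The Python sketch below is one way to do this, not the only one: reflexivity and Top are handled first, the arrow rule is contravariant in the argument and covariant in the result, and the three record rules (width, depth, permutation) are folded into one test over dictionaries, so transitivity never has to be applied explicitly. The encoding (("->", S1, S2) for functions, ("rec", {label: type}) for records, strings for Top, Nat and Bool) is assumed for illustration.

```python
def is_subtype(s, t):
    """Decide S <: T for Top, base types, function types and record types."""
    if t == "Top" or s == t:
        return True                                   # S <: Top, S <: S
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0]:
        if s[0] == "->":
            # T1 <: S1 and S2 <: T2  imply  S1 → S2 <: T1 → T2
            return is_subtype(t[1], s[1]) and is_subtype(s[2], t[2])
        if s[0] == "rec":
            # Every field required by T must be present in S with a subtype;
            # extra fields in S and field order are irrelevant.
            return all(label in s[1] and is_subtype(s[1][label], ty)
                       for label, ty in t[1].items())
    return False

point2 = ("rec", {"x": "Nat", "y": "Nat"})
point1 = ("rec", {"x": "Nat"})
print(is_subtype(point2, point1))  # True
print(is_subtype(point1, point2))  # False
```

Under this check, a function of type {x : Nat} → Nat can be used where a function of type {x : Nat, y : Nat} → Top is expected.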

4.6 Pointers or References

Nearly every programming language provides some form of assignment operation that changes the contents of a previously allocated piece of storage. We can have a variable x whose value is the number 5, or a variable y whose value is a reference (or pointer) to a mutable cell whose current contents is 5, and the difference is visible to the programmer. We can add x to another number, but not assign to it. We can use y directly to assign a new value to the cell that it points to (by writing y := 84), but we cannot use it directly as an argument to plus. Instead, we must explicitly dereference it, writing !y to obtain its current contents.

The basic operations on references are allocation, dereferencing, and assignment. To allocate a reference, we use the ref operator, providing an initial value for the new cell: r = ref 5. Here r is of type Ref Nat. To read the value from the cell r, we use !r. To change the value stored in the cell, we use the assignment operator, as in r := 7. The result of this expression is unit : Unit. A simple example is

let x = ref 5 in (x := !x + 1; !x)

The formal additions to typing rules are


A more subtle aspect of the treatment of references appears when we consider how to formalize their operational semantics. One way to see why is to ask, "What should be the values of type Ref T?" The crucial observation that we need to take into account is that evaluating a ref operator should allocate some storage and the result of the operation should be a reference to this storage.

What, then, is a reference?

The run-time store in most programming language implementations is essentially just a big array of bytes. The run-time system keeps track of which parts of this array are currently in use. We can think of the store as an array of values, rather than an array of bytes, abstracting away from the different sizes of the run-time representations of different values. Furthermore, we can abstract away from the fact that references (i.e., indexes into this array) are numbers. We take references to be elements of some uninterpreted set L of store locations, and take the store to be simply a partial function from locations l to values. We use the metavariable µ to range over stores. A reference, then, is a location - an abstract index into the store. The statement e | µ → e′ | µ′ means that a term e evaluates to e′, changing the store from µ to µ′ as a side effect.

The formal additions to STLC are

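Syntax

\[ e ::= \mathsf{ref}\ e \mid e_1 := e_2 \mid\ !e \]

Values

\[ v ::= l \]

Types

\[ T ::= \mathsf{Ref}\ T \]

Semantics

\[ \frac{e_1 \mid \mu \to e_1' \mid \mu'}{!e_1 \mid \mu \to\ !e_1' \mid \mu'} \qquad \frac{\mu(l) = v}{!l \mid \mu \to v \mid \mu} \]

\[ \frac{e_1 \mid \mu \to e_1' \mid \mu'}{e_1 := e_2 \mid \mu \to e_1' := e_2 \mid \mu'} \qquad \frac{e_2 \mid \mu \to e_2' \mid \mu'}{v := e_2 \mid \mu \to v := e_2' \mid \mu'} \]

\[ l := v_2 \mid \mu \to \mathsf{unit} \mid [l \mapsto v_2]\mu \]

\[ \frac{e_1 \mid \mu \to e_1' \mid \mu'}{\mathsf{ref}\ e_1 \mid \mu \to \mathsf{ref}\ e_1' \mid \mu'} \qquad \frac{l \notin \mathit{dom}(\mu)}{\mathsf{ref}\ v \mid \mu \to l \mid (\mu, l \mapsto v)} \]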
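The evaluation rules above can be read as a small interpreter. The following Python sketch is one possible rendering and is not part of the original notes: terms are encoded as tuples, the store µ is a dictionary from integer locations to values, and the names is_value, step and evaluate are chosen here purely for illustration. Numbers are included only so that there is something to store in a cell.

```python
def is_value(e):
    # Values in this sketch: numbers, unit, and store locations.
    return e[0] in ("num", "unit", "loc")


def step(e, store):
    """Perform one step e|mu -> e'|mu' and return the pair (e', mu')."""
    tag = e[0]
    if tag == "ref":
        _, e1 = e
        if is_value(e1):
            # Pick a location that is not yet in dom(mu), then extend the store.
            l = max(store, default=-1) + 1
            new_store = dict(store)
            new_store[l] = e1
            return ("loc", l), new_store
        e1p, s = step(e1, store)
        return ("ref", e1p), s
    if tag == "deref":
        _, e1 = e
        if e1[0] == "loc":
            return store[e1[1]], store          # mu(l) = v, store unchanged
        e1p, s = step(e1, store)
        return ("deref", e1p), s
    if tag == "assign":
        _, e1, e2 = e
        if not is_value(e1):
            e1p, s = step(e1, store)
            return ("assign", e1p, e2), s
        if not is_value(e2):
            e2p, s = step(e2, store)
            return ("assign", e1, e2p), s
        if e1[0] != "loc":
            raise ValueError("stuck: assignment to a non-location")
        new_store = dict(store)
        new_store[e1[1]] = e2                   # [l -> v2] mu
        return ("unit",), new_store
    raise ValueError("stuck: no rule applies")


def evaluate(e, store=None):
    """Apply step repeatedly until the term is a value."""
    store = {} if store is None else store
    while not is_value(e):
        e, store = step(e, store)
    return e, store


# (ref 5) := 7 evaluates to unit and leaves 7 in the freshly allocated cell.
result, final_store = evaluate(("assign", ("ref", ("num", 5)), ("num", 7)))
```

The store is copied on every update so that each step returns a new µ′ without mutating µ, which mirrors the rules at the cost of efficiency.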

We have not yet described the typing rules. To find the type of a location l, we look up the current contents of l in the store and calculate the type T1 of its contents.

\[ \frac{\Gamma \vdash \mu(l) : T_1}{\Gamma \vdash l : \mathsf{Ref}\ T_1} \]

The type of the location is Ref T1. In effect, by making the type of a term depend on the store, we have changed the typing relation from a three place relation (between contexts, terms and types) to a four place relation (between contexts, stores, terms and types). Our typing rule for locations now has the form

\[ \frac{\Gamma \mid \mu \vdash \mu(l) : T_1}{\Gamma \mid \mu \vdash l : \mathsf{Ref}\ T_1} \]

However, there are two problems with this rule. First, typechecking is rather inefficient, since calculating the type of a location l involves calculating the type of the current contents v of l. If l appears many times in a term t, we will re-calculate the type of v many times in the course of constructing a typing derivation for t. Worse, if v itself contains locations, then we will have to recalculate their types each time they appear. For example, if the store contains

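\[
\begin{array}{l}
(l_1 \mapsto \lambda x{:}\mathsf{Nat}.\ 999, \\
\ l_2 \mapsto \lambda x{:}\mathsf{Nat}.\ (!l_1)\ x, \\
\ l_3 \mapsto \lambda x{:}\mathsf{Nat}.\ (!l_2)\ x, \\
\ l_4 \mapsto \lambda x{:}\mathsf{Nat}.\ (!l_3)\ x, \\
\ l_5 \mapsto \lambda x{:}\mathsf{Nat}.\ (!l_4)\ x),
\end{array}
\]

then calculating the type of l5 involves calculating those of l4, l3, l2 and l1.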

Second, the proposed typing rule for locations may not allow us to derive anything at all, if the store contains a cycle. For example, there is no finite typing derivation for the location l2 with respect to the store

\[
\begin{array}{l}
(l_1 \mapsto \lambda x{:}\mathsf{Nat}.\ (!l_2)\ x, \\
\ l_2 \mapsto \lambda x{:}\mathsf{Nat}.\ (!l_1)\ x),
\end{array}
\]

since calculating a type for l2 requires finding the type of l1, which in turn involves l2, etc. Cyclic reference structures do arise in practice (e.g., they can be used for building doubly linked lists), and we would like our type system to be able to deal with them.

Both of these problems arise from the fact that our proposed typing rule for locations requires us to recalculate the type of a location every time we mention it in a term. But this, intuitively, should not be necessary. After all, when a location is first created, we know the type of the initial value that we are storing into it. Moreover, although we may later store other values into this location, those other values will always have the same type as the initial one. These intended types can be collected together as a store typing: a finite function mapping locations to types. We’ll use Σ to range over such functions. Formally, the typing rules are

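\[ \frac{\Sigma(l) = T_1}{\Gamma \mid \Sigma \vdash l : \mathsf{Ref}\ T_1} \qquad \frac{\Gamma \mid \Sigma \vdash e_1 : T_1}{\Gamma \mid \Sigma \vdash \mathsf{ref}\ e_1 : \mathsf{Ref}\ T_1} \]

\[ \frac{\Gamma \mid \Sigma \vdash e_1 : \mathsf{Ref}\ T_1}{\Gamma \mid \Sigma \vdash\ !e_1 : T_1} \qquad \frac{\Gamma \mid \Sigma \vdash e_1 : \mathsf{Ref}\ T \quad \Gamma \mid \Sigma \vdash e_2 : T}{\Gamma \mid \Sigma \vdash e_1 := e_2 : \mathsf{Unit}} \]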
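To see how a store typing sidesteps the cycle problem, the rules above can be turned into a small type checker. The Python sketch below is an illustration written for these notes, under assumed encodings: terms and types are tuples, gamma and sigma are dictionaries, and the helper names (typeof, call_through, NAT_TO_NAT) are hypothetical. The point to notice is that the case for locations only looks up sigma and never inspects the store contents.

```python
def is_ref(t):
    return isinstance(t, tuple) and t[0] == "Ref"


def is_fun(t):
    return isinstance(t, tuple) and t[0] == "Fun"


def typeof(e, gamma, sigma):
    """Type of term e under context gamma and store typing sigma."""
    tag = e[0]
    if tag == "num":
        return "Nat"
    if tag == "var":
        return gamma[e[1]]
    if tag == "lam":                       # ("lam", x, T, body)
        _, x, t_arg, body = e
        return ("Fun", t_arg, typeof(body, {**gamma, x: t_arg}, sigma))
    if tag == "app":
        t_fun = typeof(e[1], gamma, sigma)
        t_arg = typeof(e[2], gamma, sigma)
        if not is_fun(t_fun) or t_fun[1] != t_arg:
            raise TypeError("ill-typed application")
        return t_fun[2]
    if tag == "loc":                       # Sigma(l) = T1 gives l : Ref T1
        return ("Ref", sigma[e[1]])
    if tag == "ref":
        return ("Ref", typeof(e[1], gamma, sigma))
    if tag == "deref":
        t = typeof(e[1], gamma, sigma)
        if not is_ref(t):
            raise TypeError("dereferencing a non-reference")
        return t[1]
    if tag == "assign":
        t1 = typeof(e[1], gamma, sigma)
        t2 = typeof(e[2], gamma, sigma)
        if not is_ref(t1) or t1[1] != t2:
            raise TypeError("ill-typed assignment")
        return "Unit"
    raise TypeError("unknown term")


NAT_TO_NAT = ("Fun", "Nat", "Nat")


def call_through(l):
    # Builds the term  \x : Nat. (!l) x
    return ("lam", "x", "Nat", ("app", ("deref", ("loc", l)), ("var", "x")))


# The cyclic store from the text, together with a store typing for it.
store = {1: call_through(2), 2: call_through(1)}
sigma = {1: NAT_TO_NAT, 2: NAT_TO_NAT}

loc_type = typeof(("loc", 2), {}, sigma)
contents_ok = all(typeof(v, {}, sigma) == sigma[l] for l, v in store.items())
```

Here typing the location l2 terminates immediately, and each cell's contents can be checked against sigma independently, even though the two cells refer to each other.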

4.7 Interaction: References and Subtyping

As we extend our simple calculus with subtyping toward a full-blown programming language, each new feature must be examined carefully to see how it interacts with subtyping. For example, the following typing rule is unsound:

\[ \frac{S <: T}{\mathsf{Ref}\ S <: \mathsf{Ref}\ T} \]

To see this more clearly, consider the following example.

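    let x = ref {a : 5, b : 10} in
    let bar = λy : Ref {a : Nat}. y := {a : 15} in
    bar x; (!x).b

Under this rule the program is accepted by the type system: x has type Ref {a : Nat, b : Nat}, which the rule makes a subtype of Ref {a : Nat}, so the call bar x is well typed. After the call, however, the cell holds {a : 15}, which has no field b, so the value of (!x).b cannot be calculated. Hence this typing rule is unsound. The correct typing rule requires subtyping in both directions: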

\[ \frac{S <: T \quad T <: S}{\mathsf{Ref}\ S <: \mathsf{Ref}\ T} \]
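The failure in the example of this section can be replayed at run time. The Python sketch below is only an illustration: Python performs no static checking, records are modelled as dictionaries, and the Ref class and the variable names are assumptions made here. It shows what a sound type system must rule out, namely that the cell's contents lose the field b behind the back of the code that still expects it.

```python
class Ref:
    """A mutable cell holding one value."""

    def __init__(self, value):
        self.contents = value


def bar(y):
    # bar treats y as a Ref {a : Nat}, so storing a record with only
    # the field a looks perfectly fine from inside the function.
    y.contents = {"a": 15}


x = Ref({"a": 5, "b": 10})
bar(x)

try:
    outcome = x.contents["b"]
except KeyError:
    # The field b no longer exists: evaluation is stuck.
    outcome = "stuck: field b is missing"
```

With the invariant rule, x could not be passed to bar in the first place, because {a : Nat} is not a subtype of {a : Nat, b : Nat}.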

4.8 Let Polymorphism

Consider the following lambda terms.

    let double = λf. λa. f (f (a)) in
    let a = double (λx : Nat. succ(succ(x))) 1 in
    let b = double (λx : Bool. not x) false in
    · · ·

Recall the typing rule for let

\[ \frac{\Gamma \vdash e_1 : T_1 \quad \Gamma, x : T_1 \vdash e_2 : T_2}{\Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : T_2} \]

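This rule is not enough for the example above: it gives double a single type T1 throughout the body, so double cannot be applied to a function on Nat in one place and to a function on Bool in another. Milner introduced a new typing rule to handle let polymorphism.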

\[ \frac{\Gamma \vdash e_1 : T_1 \quad \Gamma \vdash [x \mapsto e_1]e_2 : T_2}{\Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : T_2} \]
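As an illustration of what the substitution in the second premise does to the example of this section (a step worked out here, not taken from the original notes), replacing double by its definition in the body of the outer let gives

\[
\begin{array}{l}
\mathsf{let}\ a = (\lambda f.\ \lambda a.\ f\,(f\,(a)))\ (\lambda x{:}\mathsf{Nat}.\ \mathsf{succ}(\mathsf{succ}(x)))\ 1\ \mathsf{in} \\
\mathsf{let}\ b = (\lambda f.\ \lambda a.\ f\,(f\,(a)))\ (\lambda x{:}\mathsf{Bool}.\ \mathsf{not}\ x)\ \mathsf{false}\ \mathsf{in}\ \cdots
\end{array}
\]

Each copy of the definition is now typed on its own: in the first copy f can be given the type Nat → Nat, and in the second the type Bool → Bool, so the two uses no longer conflict.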

Chapter 5

Type Based Program Analysis - Pointer Analysis

5.1 Introduction

The goal of this lecture is to show that type systems can be used to specify and implement various program analyses. In this lecture, we will apply the framework of types, typing rules, typing judgements and type inference to non-standard types.

The problem we address is: given a C/C#/Java program, can we find, for each pointer in the program, a conservative approximation to the set of all locations it can point to? Analyses for this problem differ along several dimensions: they may be context sensitive, flow sensitive, field sensitive or object sensitive. In this lecture we focus on analyses that are flow, context, field and object insensitive. Section 5.2 discusses Steensgaard’s approach, and Section 5.3 discusses Andersen’s approach.

5.2 Steensgaard’s Analysis

This lecture presents a simplified version of Steensgaard’s paper; readers are encouraged to go through the paper “Points-to Analysis in Almost Linear Time”. Since Steensgaard’s approach is flow insensitive, the control structures of the program are irrelevant. We will also assume for the moment that the program does not have any procedures; they will be added later. Let us consider a syntax which is capable of handling pointers.

\[
\begin{array}{rcl}
S & ::= & x = y \\
  & \mid & x = \&y \\
  & \mid & {*}x = y \\
  & \mid & x = {*}y \\
  & \mid & x = \mathit{allocate}(n)
\end{array}
\]

The syntax for computing the addresses of variables and for dereferencing pointers is borrowed from the C programming language. All variables are assumed to have unique names. The allocate(n) expression dynamically allocates a block of memory of size n.

For the purposes of performing the points-to analysis, we define a non-standard set of types describing the store. These types have nothing to do with the types normally used in typed imperative languages (e.g., integer, float, etc.). We use types to model how storage is used in a program at runtime (a storage model). Each type describes a set of locations as well as the possible runtime contents of those locations. The non-standard types used by our points-to analysis can be described by

α = ⊥ | ref (α)

Consider the following example.

    a = &x;
    b = &y;
    if p then
        y = &z;
    else
        y = &x;
    c = &y;

The idea of the analysis is simple. We initialize the type of each variable to point to a separate location. For each assignment we unify the types on either side of the assignment. The following table shows the initial, intermediate and final types of each of the variables in the example.

    Initial types    After first 3 assignments    Finally
    a : α1 = ⊥       a : α1 = ref(α4)             a : α1 = ref(α4,6)
    b : α2 = ⊥       b : α2 = ref(α5)             b : α2 = ref(α5)
    c : α3 = ⊥       c : ⊥                        c : α3 = ref(α5)
    x : α4 = ⊥       x : ⊥                        x, z : α4,6 = ⊥
    y : α5 = ⊥       y : α5 = ref(α6)             y : α5 = ref(α4,6)
    z : α6 = ⊥       z : ⊥                        (merged with x)
    p : α7 = ⊥       p : ⊥                        p : α7 = ⊥

Typing Rules

The typing rules specify when a program is well-typed. A well-typed program is one for which the static storage graph indicated by the types is a safe (conservative) description of all possible dynamic (runtime) storage configurations.

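\[ \frac{A \vdash x : \alpha \quad A \vdash y : \alpha}{A \vdash \mathit{WT}(x = y)} \qquad \frac{A \vdash x : \mathit{ref}(\alpha) \quad A \vdash y : \alpha}{A \vdash \mathit{WT}({*}x = y)} \]

\[ \frac{A \vdash x : \mathit{ref}(\_)}{A \vdash \mathit{WT}(x = \mathit{allocate}(n))} \qquad \frac{A \vdash x : \mathit{ref}(\alpha) \quad A \vdash y : \alpha}{A \vdash \mathit{WT}(x = \&y)} \]

\[ \frac{A \vdash x : \alpha \quad A \vdash y : \mathit{ref}(\alpha)}{A \vdash \mathit{WT}(x = {*}y)} \]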

The task of performing a points-to analysis has now been reduced to the task of inferring a typing environment (A) under which a program is well-typed. More precisely, the typing environment we seek is the minimal solution to the well-typedness problem, i.e., each location type variable in the typing environment describes as few locations as possible.

The basic principle of the implementation is that we start with the assumption that all variables are described by different types (type variables) and then proceed to merge types as necessary to ensure well-typedness of different parts of the program. Merging is made fast by using union-find data structures. The efficient implementation runs in time almost linear in the number of variables in the program. The asymptotic time complexity is O(N α(N, N)), where α is the inverse Ackermann function (unrelated to the type variables α above) and α(N, N) is the cost of one union-find operation. At most N such operations are required.
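The merging procedure can be sketched in a few lines of Python. This is an illustration of the simplified rules in these notes, not the algorithm as given in Steensgaard's paper: the class name, the statement encoding ("addr", "copy", "store", "load") and the helper target are assumptions made here, allocate is left out, and union by rank is omitted for brevity, so this sketch on its own does not achieve the almost linear bound.

```python
class Steensgaard:
    def __init__(self):
        self.parent = {}    # union-find parent pointers, one node per type variable
        self.pointee = {}   # class representative -> type it refers to (absent = bottom)

    def find(self, v):
        self.parent.setdefault(v, v)
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]   # path halving
            v = self.parent[v]
        return v

    def unify(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        pa, pb = self.pointee.pop(a, None), self.pointee.pop(b, None)
        self.parent[b] = a
        if pa is not None and pb is not None:
            self.pointee[a] = pa
            self.unify(pa, pb)          # ref(pa) = ref(pb) forces pa = pb
        elif pa is not None or pb is not None:
            self.pointee[a] = pa if pa is not None else pb

    def points_to(self, x, y):
        """Impose type(x) = ref(type(y))."""
        rx = self.find(x)
        if rx in self.pointee:
            self.unify(self.pointee[rx], y)
        else:
            self.pointee[rx] = self.find(y)

    def target(self, x):
        """Representative of the type x refers to, or None for bottom."""
        p = self.pointee.get(self.find(x))
        return None if p is None else self.find(p)


def analyze(stmts):
    s = Steensgaard()
    for kind, x, y in stmts:
        if kind == "addr":        # x = &y : x : ref(alpha), y : alpha
            s.points_to(x, y)
        elif kind == "copy":      # x = y  : x and y have the same type
            s.unify(x, y)
        elif kind == "store":     # *x = y : x : ref(alpha), y : alpha
            s.points_to(x, y)
        elif kind == "load":      # x = *y : x : alpha, y : ref(alpha)
            s.points_to(y, x)
        else:
            raise ValueError("unknown statement kind: " + kind)
    return s


# The five assignments of the earlier example; control flow is ignored.
program = [("addr", "a", "x"), ("addr", "b", "y"), ("addr", "y", "z"),
           ("addr", "y", "x"), ("addr", "c", "y")]
s = analyze(program)
```

On this program the sketch places x and z in one class, and a and y both refer to that class, which agrees with the final column of the table for the example.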

Consider the following example.

    a = 4;
    x = a;
    y = a;

The typing rules above require that x, y and a be of the same type, i.e., that they point to the same locations. However, these variables are not pointers, so our typing rules impose unnecessary restrictions on them. The actual typing rule for assignments in Steensgaard’s paper is as follows.

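\[ \frac{A \vdash x : \mathit{ref}(\alpha_1) \quad A \vdash y : \mathit{ref}(\alpha_2) \quad \alpha_2 \trianglelefteq \alpha_1}{A \vdash \mathit{WT}(x = y)} \]

where \( t_1 \trianglelefteq t_2 := (t_1 = \bot \lor t_1 = t_2) \).

This rule avoids unifying the types of non-pointer variables: if y never holds a pointer, its content type α2 is ⊥ and the side condition holds without constraining α1.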

5.3 Andersen’s Analysis

In Andersen’s approach, the types are defined to be

\[ \mathit{Locs} = l_0 \mid l_1 \mid \cdots \]

\[ \alpha = \mathit{Locs} \mid \mathit{ref}(\{\alpha_1, \alpha_2, \cdots\}) \]

Typing Rules

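\[ \frac{A \vdash x : \alpha \quad A \vdash y : \beta \quad \mathit{PT}(\beta) \subseteq \mathit{PT}(\alpha)}{A \vdash \mathit{WT}(x = y)} \]

\[ \frac{A \vdash x : \mathit{ref}(X) \quad A \vdash y : \beta \quad \forall \alpha \in X.\ \mathit{PT}(\beta) \subseteq \mathit{PT}(\alpha)}{A \vdash \mathit{WT}({*}x = y)} \]

\[ \frac{A \vdash x : \mathit{ref}(X) \quad l_i \in X}{A \vdash \mathit{WT}(x = \mathit{allocate}_{l_i}(n))} \]

\[ \frac{A \vdash x : \mathit{ref}(X) \quad A \vdash y : \beta \quad \beta \in X}{A \vdash \mathit{WT}(x = \&y)} \]

\[ \frac{A \vdash x : \alpha \quad A \vdash y : \mathit{ref}(Y) \quad \forall \beta \in Y.\ \mathit{PT}(\beta) \subseteq \mathit{PT}(\alpha)}{A \vdash \mathit{WT}(x = {*}y)} \]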

Let us revisit the earlier example with Andersen’s approach.

    Initial types        After first 3 assignments    Finally
    a : α1 = ref({})     a : α1 = ref({α4})           a : α1 = ref({α4})
    b : α2 = ref({})     b : α2 = ref({α5})           b : α2 = ref({α5})
    c : α3 = ref({})     c : α3 = ref({})             c : α3 = ref({α5})
    x : α4 = ref({})     x : α4 = ref({})             x : α4 = ref({})
    y : α5 = ref({})     y : α5 = ref({α6})           y : α5 = ref({α4, α6})
    z : α6 = ref({})     z : α6 = ref({})             z : α6 = ref({})
    p : α7 = ref({})     p : α7 = ref({})             p : α7 = ref({})

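We can observe that Andersen’s algorithm is more precise than Steensgaard’s algorithm: in the final column, a points only to x, whereas Steensgaard’s analysis merges x and z, so that a may point to either. However, the time complexity of Andersen’s algorithm is O(N^3).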
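The subset constraints in the typing rules of this section can be solved by iterating to a fixed point. The Python sketch below is a deliberately naive illustration, not Andersen's original formulation: it assumes that PT of a variable's type is simply the set of variables it may point to, it reuses a hypothetical statement encoding ("addr", "copy", "store", "load"), it leaves out allocate, and it re-scans every statement until nothing changes instead of using a worklist.

```python
from collections import defaultdict


def andersen(stmts):
    """Return a map from each variable to the set of variables it may point to."""
    pts = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for kind, x, y in stmts:
            if kind == "addr":        # x = &y : y is in PT(x)
                new, targets = {y}, [x]
            elif kind == "copy":      # x = y  : PT(y) is a subset of PT(x)
                new, targets = set(pts[y]), [x]
            elif kind == "store":     # *x = y : PT(y) flows into every v in PT(x)
                new, targets = set(pts[y]), list(pts[x])
            elif kind == "load":      # x = *y : PT(v) flows into PT(x) for v in PT(y)
                new = set().union(*(pts[v] for v in list(pts[y])))
                targets = [x]
            else:
                raise ValueError("unknown statement kind: " + kind)
            for t in targets:
                if not new <= pts[t]:
                    pts[t] |= new
                    changed = True
    return pts


# The five assignments of the earlier example; control flow is ignored.
program = [("addr", "a", "x"), ("addr", "b", "y"), ("addr", "y", "z"),
           ("addr", "y", "x"), ("addr", "c", "y")]
pts = andersen(program)
```

For this program the sketch keeps x and z apart: a may point only to x, while y may point to x or z, in agreement with the final column of the table.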

References

1. “Foundational Calculi for Programming Languages”, Benjamin C. Pierce.

2. “Types and Programming Languages”, Benjamin C. Pierce
