let should not be generalized

Dimitrios Vytiniotis, Simon Peyton Jones

Microsoft Research, Cambridge

TLDI’10, Madrid, January 2010

Tom Schrijvers

K.U. Leuven

Extending ML type inference with …

Advanced types
  Generalized Algebraic Datatypes (GADTs) [Cheney & Hinze, Xi, Peyton Jones et al., Pottier & Simonet, Pottier & Regis-Gianas, …]
  Open Type Families [Schrijvers et al., ICFP 2007]
  … types indexed by some constraint domain [e.g. Kennedy’s types indexed by Units of Measure, ESOP94]

Advanced forms of constraints
  Type equalities with type families, type class constraints
  Implication constraints that arise because of pattern matching [Pottier & Regis-Gianas, Sulzmann et al.]

A question: How should we be generalizing let-bound definitions?

Why is this question relevant?

Type system decisions (such as let-generalization) affect:
  1. Implementability of type inference & checking
  2. Complexity of implementation
  3. Efficiency of implementation
  4. Programmability
  5. Predictability of type checking
  6. Backwards compatibility (lots of Haskell 98 code!)

GOAL:
  Support the advanced forms of types and constraints mentioned
  Perform well in (1 – 6)

Generalized Algebraic Datatypes

GADT data constructors introduce constraints
Pattern matching creates implication constraints that a solver must discharge

    data R a where
      Rint  :: (a ~ Int)  => R a
      Rbool :: (a ~ Bool) => R a

    create :: R a -> a
    create Rint  = 42
    create Rbool = False

Implication constraints arising from create:
  (a ~ Int) => (Int ~ a)      constraint (a ~ Int) introduced by Rint; 42 : Int; expected type: a
  (a ~ Bool) => (Bool ~ a)
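The slide's declarations can be loaded into GHCi as they stand. A minimal sketch, assuming GHC with the GADTs extension enabled; nothing beyond the slide's own names is introduced:

```haskell
{-# LANGUAGE GADTs #-}

-- Representation type from the slide: each constructor carries an
-- equality constraint on the type index a.
data R a where
  Rint  :: (a ~ Int)  => R a
  Rbool :: (a ~ Bool) => R a

-- Matching on Rint brings (a ~ Int) into scope, so the solver has to
-- discharge the implication (a ~ Int) => (Int ~ a) before it can accept
-- the literal 42 at the expected type a. The Rbool case is analogous.
create :: R a -> a
create Rint  = 42
create Rbool = False
```

In GHCi, `create Rint` is used at type Int and `create Rbool` at type Bool; the index of the argument decides the result type.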

GADTs and Generalization

    mkR :: a -> R a

    flop x = let g () = not x      -- not :: Bool -> Bool
             in case (mkR x) of
                  Rbool -> g ()
                  Rint  -> True

Constraints arising:
  x : β
  β ~ Bool                  (from not x in g)
  mkR x : R β
  β ~ Bool => … ? …         (Rbool branch)
  β ~ Int => true           (Rint branch)

Observation: (flop 43) and (flop False) can potentially reach the first or second branch
No DEAD code in the example *

GADTs and Generalization

What is the type of g?
  If spec does not allow quantification over equalities:
    () -> Bool
  If spec does allow quantification over equalities:
    () -> Bool or (β ~ Bool) => () -> Bool

    mkR :: a -> R a

Constraints arising:
  x : β
  β ~ Bool                  (from not x in g)
  mkR x : R β
  β ~ Bool => … ? …         (Rbool branch)
  β ~ Int => true           (Rint branch)

Quantifying over equalities

    data R a where
      Rint  :: (a ~ Int)  => R a
      Rbool :: (a ~ Bool) => R a

    mkR :: a -> R a

Option I: g :: β ~ Bool => () -> Bool

Option II: β = Bool, g :: () -> Bool
  With g :: () -> Bool the second branch is rejected (Bool =/= Int) … or … typeable as unreachable
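The option in which g quantifies over the equality can be written out by hand in GHC by annotating the local binding. This is a sketch under stated assumptions: the slide leaves mkR unimplemented, so the class MkR and its two instances are hypothetical stand-ins, and flop's signature is added so that the scoped type variable b can play the role of β.

```haskell
{-# LANGUAGE GADTs, ScopedTypeVariables, FlexibleContexts #-}

data R a where
  Rint  :: (a ~ Int)  => R a
  Rbool :: (a ~ Bool) => R a

-- Hypothetical stand-in for the slide's mkR :: a -> R a, which cannot be
-- written parametrically; a class picks the constructor per type.
class MkR a where
  mkR :: a -> R a

instance MkR Int where
  mkR _ = Rint

instance MkR Bool where
  mkR _ = Rbool

flop :: forall b. MkR b => b -> Bool
flop x =
  let -- g quantifies over the equality instead of unifying b with Bool.
      g :: (b ~ Bool) => () -> Bool
      g () = not x
  in case mkR x of
       Rbool -> g ()   -- the match provides b ~ Bool, which g demands
       Rint  -> True   -- the match provides b ~ Int; g is never called
```

With this annotation both branches stay reachable: `flop (43 :: Int)` takes the Rint branch and `flop False` takes the Rbool branch. Without the annotation, the equality would be solved by unification and b fixed to Bool.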

Hence, for GADTs:

We want to unify away as much as possible (type of ‘x’)
  For simpler types
  For support for some eager solving
  For shorter constraints

But we can’t unify variables bound in the environment!

If the type system allows quantification over equalities, then we must defer a lot of silly unifications as constraints

Open type families

Programmers declare type-level computations and give axiom schemes for them

    type family G a b
    type instance G b Int = b

  Axiom scheme: forall (b). G b Int ~ b

In GHC the axiom scheme definitions are open
  If G Bool γ ~ Bool we must not conclude that γ ~ Int
  [think of another consistent axiom G Bool Char ~ Bool]
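The openness argument can be made concrete by actually adding the second axiom the slide asks us to imagine. A minimal sketch, assuming GHC with TypeFamilies; the names viaInt and viaChar are hypothetical and exist only to exercise the two axioms:

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Open type family from the slide with its axiom scheme
-- forall b. G b Int ~ b.
type family G a b
type instance G b Int = b

-- The "other consistent axiom" from the slide. It does not overlap with
-- the first instance, so any module could add it later.
type instance G Bool Char = Bool

-- G Bool Int reduces to Bool by the first axiom.
viaInt :: G Bool Int -> Bool
viaInt = not

-- G Bool Char reduces to Bool by the second axiom.
viaChar :: G Bool Char -> Bool
viaChar = not
```

Both G Bool Int and G Bool Char reduce to Bool, so knowing only G Bool γ ~ Bool does not pin γ down to Int.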

Quantifying restricted constraints?

Ok, if equalities are problematic, quantify over:
  Class constraints: Eq α
  Type family constraints: F α ~ Int

Problematic: We want to rewrite as much as possible. But we must not rewrite too much. Rather delicate!

    type family G a b
    type instance G b Int = b

    flop x = let test = ... in ...

  x :: β
  Yielding constraint: G β Int ~ Int
  Yielding type: β -> β

  G β Int ~ β & G β Int ~ Int … which gives … β ~ Int

Quantifying over only class constraints?

Problematic: class constraints may include superclasses:

    class (a ~ b) => Eq a b

Even if not, it’s hard to give a complete specification

    type instance F Int b = b

    let f x = (let h y = … (yielding F α β ~ Int) … in 42, x + 42)

  x :: α
  y :: β

Left-to-right: REJECT (can’t discharge constraint)
  And it’s not right to defer the unsolvable (forall b. F α b ~ Int)

Right-to-left: ACCEPT
  Rest of typing problem determines α ~ Int and triggers the axiom!

Let generalization is not a new problem really

A. Kennedy knew about it when I was 14! [LIX RR/96/09]

    div :: forall u v. Num (u * v) -> Num u -> Num v
    weight :: Num kg
    time :: Num sec

    flop x = let y = div x in (y weight, y time)

  x :: β
  Yielding constraint: β ~ (u * v)
  Yielding type: u -> v

Solving gives β = u * v and y becomes monomorphic!

Kennedy used a clever domain-specific solution
  Constraint equivalent to: v ~ β/u
  Type: forall u. Num u -> Num (β/u)
  Which IS polymorphic

The proposal

The specification and implementation costs for generalizing local let-bound expressions are becoming high:

Do NOT Generalize Local Let-Bound Expressions

  Top-level ones do not contribute to the problems [No environment to interact with]

  Local but annotated let definitions can be polymorphic

Most Haskell 98 programs actually do not use local let polymorphism (though arguably code refactoring tools may). Results of experiments performed on Hackage are reported in the paper
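The effect of the proposal on everyday code can be sketched with GHC's MonoLocalBinds extension, which GADTs and TypeFamilies switch on. Note that this is GHC's shipped variant rather than the slide's exact rule: it still generalizes local bindings that mention no variables from the enclosing environment. The function pairBoth is a hypothetical example, not taken from the talk.

```haskell
{-# LANGUAGE MonoLocalBinds, ScopedTypeVariables #-}

-- g mentions x from the enclosing environment. Without a signature it is
-- not generalized under MonoLocalBinds, so this version is rejected,
-- because g would be used at both Bool and Char:
--
--   pairBoth x = let g y = (x, y) in (g True, g 'c')
--
-- Annotating the local binding makes it polymorphic again.
pairBoth :: forall a. a -> ((a, Bool), (a, Char))
pairBoth x =
  let g :: forall b. b -> (a, b)
      g y = (x, y)
  in (g True, g 'c')
```

The annotation is the only cost to the programmer; the top-level function pairBoth is unaffected by the restriction.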

Also in paper

Combining Let Should Not Be Generalized with the OutsideIn [ICFP09] strategy for solving implication constraints leads to LHM(X): HM(X) with Local assumptions from pattern matching

  Type system parameterized over constraint domain X
  Inference algorithm parameterized over X solver
  Soundness result, given assumptions on the X solver

… towards pluggable type systems + type inference

A challenge for type system designers

Solvers for type class and family instance constraints are inherently weak (by design) under an open world assumption

  Constraint arising: F α ~ Int
  Instance declaration: type instance F Int = Int
  Question: What is “α”?

Type system may “guess” α = Int, but algorithm can’t [or shouldn’t]

Challenge: Find a declarative specification that rejects ambiguity
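The situation can be reproduced in GHC. This is a sketch under stated assumptions: the function fromF and the value answer are hypothetical, AllowAmbiguousTypes is needed only so that the ambiguous signature is accepted at all, and the explicit type application stands in for the "guess" that the algorithm refuses to make by itself.

```haskell
{-# LANGUAGE TypeFamilies, AllowAmbiguousTypes, TypeApplications, ScopedTypeVariables #-}

type family F a
type instance F Int = Int

-- The type variable a occurs only under F, so no argument can determine it.
fromF :: forall a. (F a ~ Int) => F a -> Int
fromF n = n + 1

-- Writing just (fromF 41) leaves the wanted constraint F α ~ Int with α
-- unknown, and GHC rejects it rather than guess α = Int from the single
-- instance in scope. Supplying the choice explicitly is accepted.
answer :: Int
answer = fromF @Int 41
```

Under the open world assumption another module may later add an instance such as F Bool = Int, which is why guessing α = Int from the instances currently in scope would not be stable.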

A first step: ambiguous constraints

Constraint C is unambiguously solvable in a top-level theory T:

  usolv(T,C) iff, for some substitution θ, T shows θ(C) and T & C shows (θ)

The constraint must be solvable by a substitution derivable by massaging the constraint:

  ¬ usolv(F Int ~ Int, F α ~ Int) … because:
  (F Int ~ Int & F α ~ Int) =/=> (α ~ Int)

Similar definition by Sulzmann & Stuckey [TOPLAS2005], also recent related work by Camarao et al.

Conclusions and directions

The cost of implicit local generalization is high; we should find alternatives or not generalize implicitly

Directions and ongoing work:

A full GHC implementation that supports Haskell type classes, GADTs, type functions, and first-class polymorphism [journal submission soon]

A declarative specification that deals with ambiguity remains an open problem

Thank you for your attention
