let should not be generalized
Dimitrios Vytiniotis, Simon Peyton Jones
Microsoft Research, Cambridge
TLDI’10, Madrid, January 2010
Tom Schrijvers
K.U. Leuven
2
Extending ML type inference with …
Advanced types
  Generalized Algebraic Datatypes (GADTs) [Cheney & Hinze, Xi, Peyton Jones et al., Pottier & Simonet, Pottier & Regis-Gianas, …]
  Open Type Families [Schrijvers et al., ICFP 2007]
  … types indexed by some constraint domain [e.g. Kennedy’s types indexed by Units of Measure, ESOP94]
Advanced forms of constraints
  Type equalities with type families, type class constraints
  Implication constraints that arise because of pattern matching [Pottier & Regis-Gianas, Sulzmann et al.]
A question: How should we be generalizing let-bound definitions?
3
Why is this question relevant?
Type system decisions (such as let-generalization) affect:
  1. Implementability of type inference & checking
  2. Complexity of implementation
  3. Efficiency of implementation
  4. Programmability
  5. Predictability of type checking
  6. Backwards compatibility (lots of Haskell 98 code!)
GOAL:
  Support the advanced forms of types and constraints mentioned
  Perform well in (1 – 6)
4
Generalized Algebraic Datatypes
GADT data constructors introduce constraints
Pattern matching creates implication constraints that a solver must discharge

  data R a where
    Rint  :: (a ~ Int)  => R a
    Rbool :: (a ~ Bool) => R a

  create :: R a -> a
  create Rint  = 42
  create Rbool = False

In the Rint branch:
  Constraint introduced by Rint: a ~ Int
  42 : Int
  Expected type: a
Implication constraints to discharge:
  (a ~ Int) => (Int ~ a)
  (a ~ Bool) => (Bool ~ a)
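A minimal runnable sketch of the slide's example, assuming GHC with the GADTs extension (TypeOperators is switched on only to keep `~` warning-free on recent compilers). The `main` function is an addition for illustration and is not part of the slides.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}
module Main where

-- Each constructor carries an equality constraint on the index `a`.
data R a where
  Rint  :: (a ~ Int)  => R a
  Rbool :: (a ~ Bool) => R a

-- Matching on Rint brings (a ~ Int) into scope, so the literal 42 can be
-- returned at the expected type `a`; likewise for Rbool and False.
create :: R a -> a
create Rint  = 42
create Rbool = False

-- Illustration only: exercise both branches.
main :: IO ()
main = do
  print (create (Rint :: R Int))
  print (create (Rbool :: R Bool))
```

Each branch type checks only because the solver discharges the corresponding implication, for example (a ~ Int) => (Int ~ a) for the first equation.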
5
GADTs and Generalization

  data R a where
    Rint  :: (a ~ Int)  => R a
    Rbool :: (a ~ Bool) => R a

  mkR :: a -> R a

  flop x = let g () = not x      -- not :: Bool -> Bool
           in case (mkR x) of
                Rbool -> g ()
                Rint  -> True

x : β
not x: β ~ Bool
mkR x : R β
Rbool branch: β ~ Bool => … ? …
Rint branch: β ~ Int => true

Observation: (flop 43) and (flop False) can potentially reach first or second branch
No DEAD code in the example *
6
GADTs and Generalization
What is the type of g?
  If spec does not allow quantification over equalities:
    () -> Bool
  If spec does allow quantification over equalities:
    () -> Bool or (β ~ Bool) => () -> Bool

  data R a where
    Rint  :: (a ~ Int)  => R a
    Rbool :: (a ~ Bool) => R a

  mkR :: a -> R a

  flop x = let g () = not x      -- not :: Bool -> Bool
           in case (mkR x) of
                Rbool -> g ()
                Rint  -> True

x : β
not x: β ~ Bool
mkR x : R β
Rbool branch: β ~ Bool => … ? …
Rint branch: β ~ Int => true
7
Quantifying over equalities

  data R a where
    Rint  :: (a ~ Int)  => R a
    Rbool :: (a ~ Bool) => R a

  mkR :: a -> R a

  flop x = let g () = not x      -- not :: Bool -> Bool
           in case (mkR x) of
                Rbool -> g ()
                Rint  -> True

Option I: g :: β ~ Bool => () -> Bool
Option II: β = Bool, g :: () -> Bool
  Second branch rejected (Bool =/= Int) … or … typeable as unreachable
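Option I can be approximated in today's GHC by writing the equality in an explicit signature for g. The sketch below is a variant of the slide's flop, not the original: since the slides give no definition of mkR, it assumes the R value is passed in as an argument, and the top-level signature, the `forall`, and `main` are all additions for illustration.

```haskell
{-# LANGUAGE GADTs, ScopedTypeVariables, TypeOperators #-}
module Main where

data R a where
  Rint  :: (a ~ Int)  => R a
  Rbool :: (a ~ Bool) => R a

-- Variant of the slide's flop (assumption: takes the R value directly).
flop :: forall a. R a -> a -> Bool
flop r x =
  let -- The annotation quantifies g over the equality (a ~ Bool).
      -- The equality is only demanded where g is called.
      g :: (a ~ Bool) => () -> Bool
      g () = not x
  in case r of
       Rbool -> g ()   -- the pattern match provides (a ~ Bool)
       Rint  -> True   -- g is never called here, so nothing is demanded

main :: IO ()
main = do
  print (flop Rbool True)
  print (flop Rint 5)
```

Dropping the signature on g should make this variant fail to type check, because an unannotated local binding is not generalized over the equality and `not x` would need a ~ Bool for a rigid a.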
8
Hence, for GADTs:
We want to unify away as much as possible (type of ‘x’)
  For simpler types
  For support for some eager solving
  For shorter constraints
But we can’t unify variables bound in the environment!
  If the type system allows quantification over equalities, then we must defer a lot of silly unifications as constraints
9
Open type families
Programmers declare type-level computations

  type family G a b
  type instance G b Int = b

And give axiom schemes for them:
  forall (b). G b Int ~ b
In GHC the axiom scheme definitions are open
  If G Bool γ ~ Bool we must not conclude that γ ~ Int
  [think of another consistent axiom G Bool Char ~ Bool]
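The openness point can be checked with a small sketch, assuming GHC with TypeFamilies. The second instance is the "other consistent axiom" from the slide; the names viaInt and viaChar and the `main` function are hypothetical, added only for illustration.

```haskell
{-# LANGUAGE TypeFamilies #-}
module Main where

type family G a b
type instance G b Int = b

-- A second instance, consistent with the first (Int and Char never overlap).
-- Because such instances can be added at any time, G Bool γ ~ Bool
-- does not determine γ.
type instance G Bool Char = Bool

-- Both argument types reduce to Bool, via different instances.
viaInt :: G Bool Int -> Bool
viaInt x = x

viaChar :: G Bool Char -> Bool
viaChar x = x

main :: IO ()
main = print (viaInt True, viaChar False)
```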
10
Quantifying restricted constraints?
Ok, if equalities are problematic, quantify over:
  Class constraints: Eq α
  Type family constraints: F α ~ Int
Problematic:
  We want to rewrite as much as possible
  But we must not rewrite too much. Rather delicate!

  type family G a b
  type instance G b Int = b

  flop x = let test = ... in ...

x :: β
Yielding constraint: G β Int ~ Int
Yielding type: β -> β
G β Int ~ β & G β Int ~ Int … which gives … β ~ Int
11
Quantifying over only class constraints?
Problematic: class constraints may include superclasses:

  class (a ~ b) => Eq a b

Even if not, it’s hard to give a complete specification

  type instance F Int b = b

  let f x = (let h y = … (yielding F α β ~ Int) … in 42, x + 42)

x :: α
y :: β
Left-to-right: REJECT (can’t discharge constraint)
  And it’s not right to defer unsolvable (forall b. F α b ~ Int)
Right-to-left: ACCEPT
  Rest of typing problem determines α ~ Int and triggers axiom!
12
Let generalization: not really a new problem
A. Kennedy knew about it when I was 14! [LIX RR/96/09]

  div :: forall u v. Num (u * v) -> Num u -> Num v
  weight :: Num kg
  time :: Num sec

  flop x = let y = div x in (y weight, y time)

x :: β
Yielding constraint: β ~ (u * v)
Yielding type: u -> v
Solving gives β = u * v and y becomes monomorphic!
Kennedy used a clever domain-specific solution
  Constraint equivalent to: v ~ β/u
  Type: forall u. Num u -> Num (β/u)
  Which IS polymorphic
13
The proposal
The specification and implementation costs for generalizing local let-bound expressions are becoming high:
Do NOT Generalize Local Let-Bound Expressions
Top-level ones do not contribute to the problems [No environment to interact with]
Local but annotated let definitions can be polymorphic
Most Haskell 98 programs actually do not use local let polymorphism (though arguably code refactoring tools may). Results of experiments on Hackage are reported in the paper
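GHC exposes this policy as the MonoLocalBinds extension. The sketch below shows the "local but annotated" case; the names pairUp and tag are hypothetical and chosen only for illustration.

```haskell
{-# LANGUAGE MonoLocalBinds #-}
module Main where

pairUp :: Int -> Bool -> ((Int, Int), (Int, Bool))
pairUp x b =
  let -- tag mentions the lambda-bound x, so without this signature
      -- it would not be generalized under MonoLocalBinds.
      tag :: a -> (Int, a)
      tag y = (x, y)
  in (tag x, tag b)   -- used at a = Int and at a = Bool

main :: IO ()
main = print (pairUp 1 True)
```

Removing the signature on tag should make the two uses conflict (Int versus Bool), which is the behaviour the proposal accepts in exchange for a simpler specification.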
14
Also in paper
Combining Let Should Not Be Generalized with the OutsideIn [ICFP09] strategy for solving implication constraints leads to LHM(X):
  HM(X) with Local assumptions from pattern matching
  Type system parameterized over constraint domain X
  Inference algorithm parameterized over X solver
  Soundness result, given assumptions on the X solver
… towards pluggable type systems + type inference
15
A challenge for type system designers
Solvers for type class and family instance constraints are inherently weak (by design) under an open world assumption
  Constraint arising: F α ~ Int
  Instance declaration: type instance F Int = Int
  Question: What is “α”?
Type system may “guess” α = Int, but algorithm can’t [or shouldn’t]
Challenge: Find a declarative specification that rejects ambiguity
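A small sketch of the situation, assuming GHC with TypeFamilies. The function useF is hypothetical: it shows a program where α is fixed by an ordinary argument, so the constraint F α ~ Int is checked rather than used to guess α.

```haskell
{-# LANGUAGE TypeFamilies, TypeOperators #-}
module Main where

type family F a
type instance F Int = Int

-- The first argument determines `a`; the constraint F a ~ Int is then
-- checked against the instances, never used to invent `a`.
useF :: (F a ~ Int) => a -> F a -> Int
useF _ n = n + 1

main :: IO ()
main = print (useF (0 :: Int) 41)
```

If nothing in the program fixed `a`, the solver would be left with F α ~ Int alone and, under the open world assumption, could not soundly pick α = Int.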
16
A first step: ambiguous constraints
Constraint C is unambiguously solvable in a top-level theory T: usolv(T,C)
  iff, for some substitution θ, T shows θ(C) and T & C shows (θ)
The constraint must be solvable by a substitution derivable by massaging the constraint:
  ¬ usolv(F Int ~ Int, F α ~ Int) … because:
  (F Int ~ Int & F α ~ Int) =/=> (α ~ Int)
Similar definition by Sulzmann & Stuckey [TOPLAS2005], also recent related work by Camarao et al.
17
Conclusions and directions
The cost of implicit local generalization is high: we should find alternatives or not generalize implicitly
Directions and ongoing work:
A full GHC implementation that supports Haskell type classes, GADTs, type functions, and first-class polymorphism [journal submission soon]
A declarative specification that deals with ambiguity remains an open problem
18
Thank you for your attention