scrap your boilerplate: generic programming in haskell ralf lämmel, vrije university simon peyton...

Post on 08-Jan-2018

225 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

The problem: boilerplate code data Company = C [Dept] data Dept = D Name Manager [SubUnit] data SubUnit = PU Employee | DU Dept data Employee = E Person Salary data Person = P Name Address data Salary = S Float type Manager = Employee type Name = String type Address = String incSal :: Float -> Company -> Company

TRANSCRIPT

Scrap your boilerplate:Scrap your boilerplate:generic programming in generic programming in

HaskellHaskellRalf LRalf Läämmel, Vrije Universitymmel, Vrije University

Simon Peyton Jones, Microsoft ResearchSimon Peyton Jones, Microsoft Research

The problem: boilerplate codeThe problem: boilerplate codeCompany

Dept “Research” Dept “Production”

Manager Manager

“Fred” £10k “Bill” £15k

Employee

“Fred” £10k

Dept “Devt”

Dept “Manuf”

Find all people in tree and increase their

salary by 10%

The problem: boilerplate codeThe problem: boilerplate codedata Company = C [Dept]data Dept = D Name Manager [SubUnit]data SubUnit = PU Employee | DU Deptdata Employee = E Person Salarydata Person = P Name Addressdata Salary = S Floattype Manager = Employeetype Name = Stringtype Address = String

incSal :: Float -> Company -> Company

The problem: boilerplate codeThe problem: boilerplate codeincSal :: Float -> Company -> CompanyincSal k (C ds) = C (map (incD k) ds)

incD :: Float -> Dept -> DeptincD k (D n m us) = D n (incE k m) (map (incU k) us)

incU :: Float -> SubUnit -> SubUnitincU k (PU e) = incE k eincU k (DU d) = incD k d

incE :: Float -> Employee -> EmployeeincE k (E p s) = E p (incS k s)

incS :: Float -> Salary -> SalaryincS k (S f) = S (k*f)

Boilerplate is badBoilerplate is bad

Boilerplate is tedious to writeBoilerplate is tedious to writeBoilerplate is fragile: needs to be changed Boilerplate is fragile: needs to be changed when data type changes (“schema when data type changes (“schema evolution”)evolution”)Boilerplate obscures the key bits of codeBoilerplate obscures the key bits of code

Getting rid of boilerplateGetting rid of boilerplateUse an un-typed language, with a fixed Use an un-typed language, with a fixed collection of data typescollection of data typesConvert to a universal type and write Convert to a universal type and write (untyped) traversals over that(untyped) traversals over thatUse “reflection” to query types and Use “reflection” to query types and traverse child nodestraverse child nodes

Getting rid of boilerplateGetting rid of boilerplateGeneric (aka polytypic) programming: define Generic (aka polytypic) programming: define function by induction over the (structure of the) function by induction over the (structure of the) type of its argumenttype of its argument

PhD required. Elegant only for “totally generic” PhD required. Elegant only for “totally generic” functions (read, show, equality)functions (read, show, equality)

generic inc<t> :: Float -> t -> tinc<1> k Unit = Unit

inc<a+b> k (Inl x) = Inl (inc<a> k x)inc<a+b> k (Inr y) = Inr (inc<b> k y)

inc<a*b> k (x, y) = (inc<a> k x, inc<a> k y)

Our solutionOur solution

Generic programming for the rest of us Generic programming for the rest of us Typed languageTyped languageWorks for arbitrary data types: Works for arbitrary data types: parameterised, mutually recursive, parameterised, mutually recursive, nested...nested...No encoding to/from some other typeNo encoding to/from some other typeVery modest language supportVery modest language supportElegant application of Haskell's type Elegant application of Haskell's type classesclasses

Our solutionOur solution

incSal :: Float -> Company -> CompanyincSal k = everywhere (mkT (incS k))

incS :: Float -> Salary -> SalaryincS k (S f) = S (k*f)

Two ingredientsTwo ingredientsincSal :: Float -> Company -> CompanyincSal k = everywhere (mkT (incS k))

incS :: Float -> Salary -> SalaryincS k (S f) = S (k*f)

2. Apply a function to every node in the tree

1. Build the function to apply to every node,

from incS

member :: a -> [a] -> Boolmember x [] = Falsemember x (y:ys) | x==y = True

| otherwise = member x ys

Type classesType classes

No! member is not truly polymorphic: it does not work for any type a, only for those on which equality is defined.

member :: Eq a => a -> [a] -> Boolmember x [] = Falsemember x (y:ys) | x==y = True

| otherwise = member x ys

Type classesType classes

The class constraint "Eq a" says that member only works on types that belong to class Eq.

class Eq a where (==) :: a -> a -> Bool

instance Eq Int where (==) i1 i2 = eqInt i1 i2

instance (Eq a) => Eq [a] where (==) [] [] = True (==) (x:xs) (y:ys) = (x == y) && (xs == ys)

(==) xs ys = False

member :: Eq a => a -> [a] -> Boolmember x [] = Falsemember x (y:ys) | x==y = True

| otherwise = member x ys

Type classesType classes

data Eq a = MkEq (a->a->Bool)eq (MkEq e) = e

dEqInt :: Eq IntdEqInt = MkEq eqInt

dEqList :: Eq a -> Eq [a]dEqList (MkEq e) = MkEq el where el [] [] = True el (x:xs) (y:ys) = x `e` y && xs `el` ys el xs ys = False

member :: Eq a -> a -> [a] -> Boolmember d x [] = Falsemember d x (y:ys) | eq d x y = True

| otherwise = member d x ys

Implementing type classesImplementing type classes

Class witnessed by a “dictionary” of

methodsInstance

declarations create dictionaries

Overloaded functions take extra dictionary

parameter(s)

Ingredient 1: type extensionIngredient 1: type extension

(mkT f) is a function that (mkT f) is a function that behaves just like f on arguments whose type behaves just like f on arguments whose type

is compatible with f's, is compatible with f's, behaves like the identity function on all other behaves like the identity function on all other

argumentsarguments

So applying So applying (mkT (incS k))(mkT (incS k)) to all nodes to all nodes in the tree will do what we want.in the tree will do what we want.

Type safe castType safe cast

cast :: (Typeable a, Typeable b) => a -> Maybe b

ghci> (cast 'a') :: Maybe CharJust 'a'ghci> (cast 'a') :: Maybe BoolNothingghci> (cast True) :: Maybe BoolJust True

Type extensionType extensionmkT :: (Typeable a, Typeable b) => (a->a) -> (b->b)

mkT f = case cast f ofJust g -> gNothing -> id

ghci> (mkT not) TrueFalseghci> (mkT not) 'a''a'

Implementing castImplementing castdata TypeRepinstance Eq TypeRepmkRep :: String -> [TypeRep] -> TypeRep

class Typeable a where typeOf :: a -> TypeRep

instance Typeable Int where typeOf i = mkRep "Int" []

Guaranteed not to evaluate its

argument

An Int, perhaps

Implementing castImplementing castclass Typeable a where typeOf :: a -> TypeRep

instance (Typeable a, Typeable b) => Typeable (a,b) where

typeOf p = mkRep "(,)" [ta,tb] where

ta = typeOf (fst p)tb = typeOf (snd p)

Implementing castImplementing cast

cast :: (Typeable a, Typeable b) => a -> Maybe b

cast x = rwhere r = if typeOf x = typeOf (get r)

then Just (unsafeCoerce x) else Nothing

get :: Maybe a -> a get x = undefined

Implementing castImplementing castIn GHC: In GHC:

Typeable instances are generated Typeable instances are generated automatically by the compiler for any data automatically by the compiler for any data typetype

The definition of cast is in a libraryThe definition of cast is in a library

Then cast is soundThen cast is soundBottom line: cast is best thought of as a Bottom line: cast is best thought of as a language extension, but it is an easy one language extension, but it is an easy one to implement. All the hard work is done to implement. All the hard work is done by type classesby type classes

Two ingredientsTwo ingredientsincSal :: Float -> Company -> CompanyincSal k = everywhere (mkT (incS k))

incS :: Float -> Salary -> SalaryincS k (S f) = S (k*f)

2. Apply a function to every node in the tree

1. Build the function to apply to every node,

from incS

Ingredient 2: traversalIngredient 2: traversal

Step 1: implement one-layer traversalStep 1: implement one-layer traversalStep 2: extend one-layer traversal to Step 2: extend one-layer traversal to recursive traversal of the entire treerecursive traversal of the entire tree

One-layer traversalOne-layer traversalclass Typeable a => Data a where gmapT :: (forall b. Data b => b -> b) -> a -> a

instance Data Int where gmapT f x = x

instance (Data a,Data b) => Data (a,b) where

gmapT f (x,y) = (f x, f y)

(gmapT f x) applies f to the IMMEDIATE

CHILDREN of x

One-layer traversalOne-layer traversalclass Typeable a => Data a where gmapT :: (forall b. Data b => b -> b) -> a -> a

instance (Data a) => Data [a] where gmapT f [] = [] gmapT f (x:xs) = f x : f xs -- !!!

gmapT's argument is a polymorphic function; so gmapT has a rank-2 type

Step 2: Now traversals are easy!Step 2: Now traversals are easy!

everywhere :: Data a => (forall b. Data b => b -> b)

-> a -> a

everywhere f x = f (gmapT (everywhere f) x)

Many different traversals!Many different traversals!everywhere, everywhere'

:: Data a => (forall b. Data b => b -> b)

-> a -> a

everywhere f x = f (gmapT (everywhere f) x)-- Bottom up

everywhere' f x = gmapT (everywhere' f) (f x))-- Top down

More perspicuous typesMore perspicuous typeseverywhere :: Data a => (forall b. Data b => b -> b)

-> a -> a

everywhere :: (forall b. Data b => b -> b) -> (forall a. Data a => a -> a)

type GenericT = forall a. Data a => a -> a

everywhere :: GenericT -> GenericT

Aha!

What is "really going on"?What is "really going on"?

inc :: Data t => Float -> t -> tinc :: Data t => Float -> t -> tThe magic of type classes passes an extra The magic of type classes passes an extra argument to inc that contains:argument to inc that contains:

The function The function gmapTgmapT The function The function typeOftypeOf

A call of (A call of (mkTmkT incSincS), done at every node in ), done at every node in tree, entails a comparison of the tree, entails a comparison of the TypeRepTypeRep returned by the passed-in returned by the passed-in typeOftypeOf with a fixed with a fixed TypeRepTypeRep for for SalarySalary; this is precisely a ; this is precisely a dynamic type checkdynamic type check

Summary so farSummary so farSolution consists of:Solution consists of:

A little user-written codeA little user-written code Mechanically generated instances for Mechanically generated instances for TypeableTypeable and and DataData for each data type for each data type

A library of combinators (A library of combinators (castcast, , mkTmkT, , everywhereeverywhere, etc), etc)

Language support:Language support: castcast rank-2 typesrank-2 types

Efficiency is so-so (factor of 2-3 with no Efficiency is so-so (factor of 2-3 with no effort)effort)

Summary so farSummary so far

Robust to data type evolutionRobust to data type evolutionWorks easily for weird data typesWorks easily for weird data types

data Rose a = MkR a [Rose a]

instance (Data a) => Data (Rose a) where gmapT f (MkR x rs) = MkR (f x) (f rs)

data Flip a b = Nil | Cons a (Flip b a)-- Etc...

GeneralisationsGeneralisations

With this same language support, we can With this same language support, we can do much moredo much more

generic queriesgeneric queries generic monadic operationsgeneric monadic operations generic foldsgeneric folds generic zips (e.g. equality)generic zips (e.g. equality)

Generic queriesGeneric queriesAdd up the salaries of all the employees Add up the salaries of all the employees in the treein the tree

salaryBill :: Company -> FloatsalaryBill = everything (+) (0 `mkQ` billS)

billS :: Salary -> FloatbillS (S f) = f

2. Apply the function to every node in the tree, and

combine results with (+) 1. Build the function to apply to every node,

from billS

Type extension againType extension again

mkQ :: (Typeable a, Typeable b) => d -> (b->d) -> a -> d(d `mkQ` q) a = case cast a of

Just b -> q b Nothing -> d

ghci> (22 `mkQ` ord) 'a'97ghci> (22 `mkQ` ord) True22

Apply 'q' if its type fits, otherwise return

'd'

ord :: Char -> Int

Traversal againTraversal againclass Typeable a => Data a where gmapT :: (forall b. Data b => b -> b) -> a -> a

gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]

Apply a function to all children of this node, and

collect the results in a list

Traversal againTraversal againclass Typeable a => Data a where gmapT :: (forall b. Data b => b -> b) -> a -> a

gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]

instance Data Int where gmapQ f x = []

instance (Data a,Data b) => Data (a,b) where

gmapQ f (x,y) = f x ++ f y

The query traversalThe query traversaleverything :: Data a => (r->r->r)

-> (forall b. Data b => b -> r)

-> a -> reverything k f x = foldl k (f x) (gmapQ (everything f) x)

Note that foldr vs foldl is in the traversal, not

gmapQ

Looking for one resultLooking for one result

By making the result type be (Maybe r), By making the result type be (Maybe r), we can find the first (or last) satisfying we can find the first (or last) satisfying value [laziness]value [laziness]

findDept :: String -> Company -> Maybe Dept findDept s = everything `orElse`

(Nothing `mkQ` findD s)

findD :: String -> Dept -> Maybe DeptfindD s d@(D s' _ _) = if s==s' then Just d else Nothing

Monadic transformsMonadic transforms

Uh oh! Where do we stop?Uh oh! Where do we stop?

class Typeable a => Data a where gmapT :: (forall b. Data b => b -> b) -> a -> a

gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]

gmapM :: Monad m => (forall b. Data b => b -> m b) -> a -> m a

Where do we stop?Where do we stop?Happily, we can generalise all three gmaps into Happily, we can generalise all three gmaps into oneone

data Employee = E Person Salary

instance Data Employee where gfoldl k z (E p s) = (z E `k` p) `k` s

We can define We can define gmapTgmapT, , gmapQgmapQ, , gmapMgmapM in terms of (suitably parameterised) in terms of (suitably parameterised) gfoldlgfoldlThe type of The type of gfoldlgfoldl hurts the brain (but the definitions are all easy) hurts the brain (but the definitions are all easy)

Where do we stop?Where do we stop?class Typeable a => Data a where

gfoldl :: (forall a b. Data a => c (a -> b) -> a -> c b)

-> (forall g. g -> c g) -> a -> c a

But we still can't do show!But we still can't do show!

Want show :: Data a => a -> StringWant show :: Data a => a -> String

show :: Data a => a -> Stringshow t = ??? ++ concat (gmapQ show t)

show the children and concatenate the

resultsBut how to show the

constructor?

Add more to class DataAdd more to class Data

Very like Very like typeOftypeOf :: :: TypeableTypeable aa =>=> aa ->-> TypeRepTypeRepexcept only for data types, not functionsexcept only for data types, not functions

class Data a where toConstr :: a -> Constr

data Constr -- abstractconString :: Constr -> StringconFixity :: Constr -> Fixity

So here is showSo here is show

show :: Data a => a -> Stringshow t = conString (toConstr t) ++ concat (gmapQ show t)

Simple refinements to deal with Simple refinements to deal with parentheses, infix constructors etcparentheses, infix constructors etctoConstrtoConstr on a primitive type (like on a primitive type (like IntInt) ) yields a yields a ConstrConstr whose whose conStringconString displays the valuedisplays the value

Further generic functionsFurther generic functions read :: Data a => String -> aread :: Data a => String -> a toBin :: Data a => a -> [Bit]toBin :: Data a => a -> [Bit]fromBin :: Data a => [Bit] -> afromBin :: Data a => [Bit] -> a

testGen :: Data a => RandomGen -> atestGen :: Data a => RandomGen -> a

class Data a where toConstr :: a -> Constr fromConstr :: Constr -> a dataTypeOf :: a -> DataType

data DataType -- AbstractstringCon :: DataType -> String -> Maybe ConstrindexCon :: DataType -> Int -> ConstrdataTypeCons :: DataType -> [Constr]

ConclusionsConclusions““Simple”, elegantSimple”, elegantModest language extensionsModest language extensions

Rank-2 typesRank-2 types Auto-generation of Typeable, Data instancesAuto-generation of Typeable, Data instancesFully implemented in GHCFully implemented in GHC

Shortcomings:Shortcomings: Stop conditionsStop conditions Types are a bit uninformativeTypes are a bit uninformative

Paper: http://research.microsoft.com/~simonpj

top related