Scrap your boilerplate: generic programming in Haskell. Ralf Lämmel, Vrije University; Simon Peyton Jones, Microsoft Research.


Post on 08-Jan-2018


TRANSCRIPT

Scrap your boilerplate: generic programming in Haskell
Ralf Lämmel, Vrije University
Simon Peyton Jones, Microsoft Research

The problem: boilerplate code

[Figure: company tree. Node labels: Company; Dept Research; Dept Production; Manager; Fred 10k; Bill 15k; Employee; Fred 10k; Dept Devt; Dept Manuf]

Find all people in tree and increase their salary by 10%

The problem: boilerplate code

    data Company  = C [Dept]
    data Dept     = D Name Manager [SubUnit]
    data SubUnit  = PU Employee | DU Dept
    data Employee = E Person Salary
    data Person   = P Name Address
    data Salary   = S Float
    type Manager  = Employee
    type Name     = String
    type Address  = String

    incSal :: Float -> Company -> Company

The problem: boilerplate code

    incSal :: Float -> Company -> Company
    incSal k (C ds) = C (map (incD k) ds)

    incD :: Float -> Dept -> Dept
    incD k (D n m us) = D n (incE k m) (map (incU k) us)

    incU :: Float -> SubUnit -> SubUnit
    incU k (PU e) = PU (incE k e)
    incU k (DU d) = DU (incD k d)

    incE :: Float -> Employee -> Employee
    incE k (E p s) = E p (incS k s)

    incS :: Float -> Salary -> Salary
    incS k (S f) = S (k*f)

Boilerplate is bad
- Boilerplate is tedious to write
- Boilerplate is fragile: needs to be changed when data type changes (schema evolution)
- Boilerplate obscures the key bits of code

Getting rid of boilerplate
- Use an un-typed language, with a fixed collection of data types
- Convert to a universal type and write (untyped) traversals over that
- Use reflection to query types and traverse child nodes

Getting rid of boilerplate
- Generic (aka polytypic) programming: define function by induction over the (structure of the) type of its argument
- PhD required.
- Elegant only for totally generic functions (read, show, equality)

    generic inc :: Float -> t -> t
    inc k Unit    = Unit
    inc k (Inl x) = Inl (inc k x)
    inc k (Inr y) = Inr (inc k y)
    inc k (x, y)  = (inc k x, inc k y)

Our solution
Generic programming for the rest of us
- Typed language
- Works for arbitrary data types: parameterised, mutually recursive, nested...
- No encoding to/from some other type
- Very modest language support
- Elegant application of Haskell's type classes

Our solution

    incSal :: Float -> Company -> Company
    incSal k = everywhere (mkT (incS k))

    incS :: Float -> Salary -> Salary
    incS k (S f) = S (k*f)

Two ingredients

    incSal :: Float -> Company -> Company
    incSal k = everywhere (mkT (incS k))

    incS :: Float -> Salary -> Salary
    incS k (S f) = S (k*f)

    1. Build the function to apply to every node, from incS
    2. Apply a function to every node in the tree

Type classes

    member :: a -> [a] -> Bool
    member x []                 = False
    member x (y:ys) | x==y      = True
                    | otherwise = member x ys

No! member is not truly polymorphic: it does not work for any type a, only for those on which equality is defined.

Type classes

    member :: Eq a => a -> [a] -> Bool
    member x []                 = False
    member x (y:ys) | x==y      = True
                    | otherwise = member x ys

The class constraint "Eq a" says that member only works on types that belong to class Eq.

Type classes

    class Eq a where
      (==) :: a -> a -> Bool

    instance Eq Int where
      (==) i1 i2 = eqInt i1 i2

    instance (Eq a) => Eq [a] where
      (==) []     []     = True
      (==) (x:xs) (y:ys) = (x == y) && (xs == ys)
      (==) xs     ys     = False

    member :: Eq a => a -> [a] -> Bool
    member x []                 = False
    member x (y:ys) | x==y      = True
                    | otherwise = member x ys

Implementing type classes

    data Eq a = MkEq (a->a->Bool)
    eq (MkEq e) = e

    dEqInt :: Eq Int
    dEqInt = MkEq eqInt

    dEqList :: Eq a -> Eq [a]
    dEqList (MkEq e) = MkEq el
      where el []     []     = True
            el (x:xs) (y:ys) = x `e` y && xs `el` ys
            el xs     ys     = False

    member :: Eq a -> a -> [a] -> Bool
    member d x []                 = False
    member d x (y:ys) | eq d x y  = True
                      | otherwise = member d x ys

- Class witnessed by a dictionary of methods
- Instance declarations create dictionaries
- Overloaded functions take extra dictionary parameter(s)

Ingredient 1: type extension

(mkT f) is a function that
- behaves just like f on arguments whose type is compatible with f's
- behaves like the identity
  function on all other arguments

So applying (mkT (incS k)) to all nodes in the tree will do what we want.

Type safe cast

    cast :: (Typeable a, Typeable b) => a -> Maybe b

    ghci> (cast 'a') :: Maybe Char
    Just 'a'
    ghci> (cast 'a') :: Maybe Bool
    Nothing
    ghci> (cast True) :: Maybe Bool
    Just True

Type extension

    mkT :: (Typeable a, Typeable b) => (a->a) -> (b->b)
    mkT f = case cast f of
              Just g  -> g
              Nothing -> id

    ghci> (mkT not) True
    False
    ghci> (mkT not) 'a'
    'a'

Implementing cast

    data TypeRep
    instance Eq TypeRep
    mkRep :: String -> [TypeRep] -> TypeRep

    class Typeable a where
      typeOf :: a -> TypeRep      -- Guaranteed not to evaluate its argument

    instance Typeable Int where
      typeOf i = mkRep "Int" []   -- i: an Int, perhaps

Implementing cast

    class Typeable a where
      typeOf :: a -> TypeRep

    instance (Typeable a, Typeable b) => Typeable (a,b) where
      typeOf p = mkRep "(,)" [ta,tb]
        where ta = typeOf (fst p)
              tb = typeOf (snd p)

Implementing cast

    cast :: (Typeable a, Typeable b) => a -> Maybe b
    cast x = r
      where
        r = if typeOf x == typeOf (get r)
              then Just (unsafeCoerce x)
              else Nothing

        get :: Maybe a -> a
        get x = undefined

Implementing cast
In GHC:
- Typeable instances are generated automatically by the compiler for any data type
- The definition of cast is in a library
Then cast is sound.
Bottom line: cast is best thought of as a language extension, but it is an easy one to implement. All the hard work is done by type classes.

Two ingredients

    incSal :: Float -> Company -> Company
    incSal k = everywhere (mkT (incS k))

    incS :: Float -> Salary -> Salary
    incS k (S f) = S (k*f)

    2. Apply a function to every node in the tree
    1.
       Build the function to apply to every node, from incS

Ingredient 2: traversal
- Step 1: implement one-layer traversal
- Step 2: extend one-layer traversal to recursive traversal of the entire tree

One-layer traversal

    class Typeable a => Data a where
      gmapT :: (forall b. Data b => b -> b) -> a -> a

    instance Data Int where
      gmapT f x = x

    instance (Data a, Data b) => Data (a,b) where
      gmapT f (x,y) = (f x, f y)

(gmapT f x) applies f to the IMMEDIATE CHILDREN of x

One-layer traversal

    class Typeable a => Data a where
      gmapT :: (forall b. Data b => b -> b) -> a -> a

    instance (Data a) => Data [a] where
      gmapT f []     = []
      gmapT f (x:xs) = f x : f xs   -- !!!

gmapT's argument is a polymorphic function; so gmapT has a rank-2 type

Step 2: Now traversals are easy!

    everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
    everywhere f x = f (gmapT (everywhere f) x)

Many different traversals!

    everywhere, everywhere' :: Data a => (forall b. Data b => b -> b) -> a -> a

    everywhere  f x = f (gmapT (everywhere f) x)     -- Bottom up
    everywhere' f x = gmapT (everywhere' f) (f x)    -- Top down

More perspicuous types

    everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a

    everywhere :: (forall b. Data b => b -> b) -> (forall a. Data a => a -> a)

    type GenericT = forall a. Data a => a -> a
    everywhere :: GenericT -> GenericT

Aha!

What is "really going on"?
    inc :: Data t => Float -> t -> t

The magic of type classes passes an extra argument to inc that contains:
- The function gmapT
- The function typeOf

A call of (mkT incS), done at every node in tree, entails a comparison of the TypeRep returned by the passed-in typeOf with a fixed TypeRep for Salary; this is precisely a dynamic type check.

Summary so far
Solution consists of:
- A little user-written code
- Mechanically generated instances for Typeable and Data for each data type
- A library of combinators (cast, mkT, everywhere, etc)
- Language support:
  - cast
  - rank-2 types
Efficiency is so-so (factor of 2-3 with no effort)

Summary so far
- Robust to data type evolution
- Works easily for weird data types

    data Rose a = MkR a [Rose a]

    instance (Data a) => Data (Rose a) where
      gmapT f (MkR x rs) = MkR (f x) (f rs)

    data Flip a b = Nil | Cons a (Flip b a)
    -- Etc...

Generalisations
With this same language support, we can do much more:
- generic queries
- generic monadic operations
- generic folds
- generic zips (e.g. equality)

Generic queries
Add up the salaries of all the employees in the tree

    salaryBill :: Company -> Float
    salaryBill = everything (+) (0 `mkQ` billS)

    billS :: Salary -> Float
    billS (S f) = f

    2. Apply the function to every node in the tree, and combine results with (+)
    1. Build the function to apply to every node, from billS

Type extension again

    mkQ :: (Typeable a, Typeable b) => d -> (b->d) -> a -> d
    (d `mkQ` q) a = case cast a of
                      Just b  -> q b
                      Nothing -> d

    ghci> (22 `mkQ` ord) 'a'
    97
    ghci> (22 `mkQ` ord) True
    22

Apply 'q' if its type fits, otherwise return 'd'

    ord :: Char -> Int

Traversal again

    class Typeable a => Data a where
      gmapT :: (forall b.
                Data b => b -> b) -> a -> a
      gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]

Apply a function to all children of this node, and collect the results in a list

Traversal again

    class Typeable a => Data a where
      gmapT :: (forall b. Data b => b -> b) -> a -> a
      gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]

    instance Data Int where
      gmapQ f x = []

    instance (Data a, Data b) => Data (a,b) where
      gmapQ f (x,y) = [f x, f y]

The query traversal

    everything :: Data a => (r->r->r) -> (forall b. Data b => b -> r) -> a -> r
    everything k f x = foldl k (f x) (gmapQ (everything k f) x)

Note that foldr vs foldl is in the traversal, not gmapQ

Looking for one result
By making the result type be (Maybe r), we can find the first (or last) satisfying value [laziness]

    findDept :: String -> Company -> Maybe Dept
    findDept s = everything orElse (Nothing `mkQ` findD s)

    findD :: String -> Dept -> Maybe Dept
    findD s d@(D s' _ _) = if s==s' then Just d else Nothing

Monadic transforms
Uh oh! Where do we stop?

    class Typeable a => Data a where
      gmapT :: (forall b. Data b => b -> b) -> a -> a
      gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]
      gmapM :: Monad m => (forall b. Data b => b -> m b) -> a -> m a

Where do we stop?
Happily, we can generalise all three gmaps into one

    data Employee = E Person Salary

    instance Data Employee where
      gfoldl k z (E p s) = (z E `k` p) `k` s

- We can define gmapT, gmapQ, gmapM in terms of (suitably parameterised) gfoldl
- The type of gfoldl hurts the brain (but the definitions are all easy)

Where do we stop?

    class Typeable a => Data a where
      gfoldl :: (forall a b. Data a => c (a -> b) -> a -> c b)
             -> (forall g. g -> c g)
             -> a -> c a

But we still can't do show!
Want show :: Data a => a -> String

    show :: Data a => a -> String
    show t = ??? ++ concat (gmapQ show t)   -- show the children and concatenate the results

But how to show the constructor?
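The transformation and query machinery described so far (mkT, mkQ, everywhere, everything) can be tried without any extra packages. What follows is a minimal sketch, assuming a GHC recent enough to derive Typeable automatically. It takes cast, gmapT and gmapQ from base's Data.Data and defines the four combinators locally, following the slides rather than importing a library. The `sample` company and its salaries are invented for illustration.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE RankNTypes #-}

import Data.Data (Data, Typeable, cast, gmapQ, gmapT)

-- The company types from the slides, with derived Data instances.
data Company  = C [Dept]                 deriving (Show, Eq, Data)
data Dept     = D Name Manager [SubUnit] deriving (Show, Eq, Data)
data SubUnit  = PU Employee | DU Dept    deriving (Show, Eq, Data)
data Employee = E Person Salary          deriving (Show, Eq, Data)
data Person   = P Name Address           deriving (Show, Eq, Data)
data Salary   = S Float                  deriving (Show, Eq, Data)
type Manager  = Employee
type Name     = String
type Address  = String

-- Type extension for transformations: behave like f where the types
-- match, like the identity everywhere else.
mkT :: (Typeable a, Typeable b) => (a -> a) -> b -> b
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

-- Type extension for queries: apply q where the type fits, else return d.
mkQ :: (Typeable a, Typeable b) => d -> (b -> d) -> a -> d
mkQ d q a = case cast a of
  Just b  -> q b
  Nothing -> d

-- Bottom-up transformation of every node.
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)

-- Query every node and combine the results with k.
everything :: Data a => (r -> r -> r) -> (forall b. Data b => b -> r) -> a -> r
everything k f x = foldl k (f x) (gmapQ (everything k f) x)

incS :: Float -> Salary -> Salary
incS k (S f) = S (k * f)

incSal :: Float -> Company -> Company
incSal k = everywhere (mkT (incS k))

billS :: Salary -> Float
billS (S f) = f

salaryBill :: Company -> Float
salaryBill = everything (+) (0 `mkQ` billS)

-- Hypothetical sample data, not taken from the slides.
sample :: Company
sample =
  C [ D "Research" (E (P "Fred" "Oslo") (S 1000))
        [ PU (E (P "Bill" "Kyiv") (S 2000))
        , DU (D "Devt" (E (P "Joe" "Lima") (S 4000)) [])
        ]
    ]
```

Loaded into GHCi, `salaryBill sample` evaluates to 7000.0 and `salaryBill (incSal 2 sample)` to 14000.0. Neither traversal mentions Company, Dept, SubUnit, Employee or Person by name.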
Add more to class Data

    class Data a where
      toConstr :: a -> Constr

    data Constr      -- abstract
    conString :: Constr -> String
    conFixity :: Constr -> Fixity

Very like typeOf :: Typeable a => a -> TypeRep, except only for data types, not functions

So here is show

    show :: Data a => a -> String
    show t = conString (toConstr t) ++ concat (gmapQ show t)

- Simple refinements to deal with parentheses, infix constructors etc
- toConstr on a primitive type (like Int) yields a Constr whose conString displays the value

Further generic functions

    read :: Data a => String -> a

    toBin   :: Data a => a -> [Bit]
    fromBin :: Data a => [Bit] -> a

    testGen :: Data a => RandomGen -> a

    class Data a where
      toConstr   :: a -> Constr
      fromConstr :: Constr -> a
      dataTypeOf :: a -> DataType

    data DataType    -- Abstract
    stringCon    :: DataType -> String -> Maybe Constr
    indexCon     :: DataType -> Int -> Constr
    dataTypeCons :: DataType -> [Constr]

Conclusions
- Simple, elegant
- Modest language extensions
  - Rank-2 types
  - Auto-generation of Typeable, Data instances
- Fully implemented in GHC
- Shortcomings:
  - Stop conditions
  - Types are a bit uninformative

Paper:
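The generic show from the slides can be approximated with what base ships today. This is a sketch under two assumptions: base's Data.Data names the constructor-printing function `showConstr` (the slides call it conString), and every node is wrapped in parentheses instead of applying the "simple refinements" for infix constructors. The `Tree` type, the `gshow` name and the `example` value are hypothetical.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, gmapQ, showConstr, toConstr)

-- A small illustrative type; any type with a Data instance would do.
data Tree = Leaf Int | Node Tree Tree deriving (Data)

-- Print the constructor name, then each immediate child (recursively),
-- separated by spaces and wrapped in parentheses.
gshow :: Data a => a -> String
gshow t = "(" ++ showConstr (toConstr t) ++ concatMap (' ' :) (gmapQ gshow t) ++ ")"

example :: Tree
example = Node (Leaf 1) (Leaf 2)
```

Evaluating `gshow example` yields "(Node (Leaf (1)) (Leaf (2)))". The Int leaves print their value because toConstr on a primitive yields a constructor whose string form is that value.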
