Engineering Large Projects in Haskell: A Decade of FP at Galois

2008 Galois, Inc. All rights reserved.

Engineering Large Projects in Haskell: A Decade of Functional Programming at Galois

Don Stewart | 2009 04 20 | London HUG



DESCRIPTION

Galois has been building systems in Haskell for the past decade. This talk describes some of what we’ve learned about in-the-large, commercial Haskell programming in that time.

* When and where we use Haskell
* Correctness, productivity, scalability, maintainability
* What language features we like: types, purity, types, abstractions, types, concurrency, types!
* The Haskell toolchain: FFI, HPC, Cabal, compiler, libraries, build systems, etc.
* Being a commercial entity in a largely open source community

This talk was presented Monday 20th April at λondon HUG.

http://www.galois.com/blog/2009/04/27/engineering-large-projects-in-haskell-a-decade-of-fp-at-galois/

TRANSCRIPT

Page 1: Engineering Large Projects in Haskell: A Decade of FP at Galois

2008 Galois, Inc. All rights reserved.

Engineering Large Projects in Haskell

A Decade of Functional Programming at Galois

Don Stewart | 2009 04 20 | London HUG

Page 2

This talk made possible by...

• Aaron Tomb

• Adam Wick

• Andy Adams-Moran

• Andy Gill

• David Burke

• Dylan McNamee

• Eric Mertens

• Iavor Diatchki

• Isaac Potoczny-Jones

• Jef Bell

• Peter White

• Trevor Elliott

• Phil Weaver

• Jeff Lewis

• Joe Hurd

• Joel Stanley

• John Launchbury

• John Matthews

• Laura McKinney

• Lee Pike

• Levent Erkok

• Louis Testa

• Magnus Carlsson

• Paul Heinlein

• Sally Browning

• Thomas Nordin

• Brett Letner

• … and many others

Page 3

What does Galois do?

• Information assurance for critical systems

• Building systems that are trustworthy and secure

• Mixture of government and industry clients

• R&D with our favorite tools:

– Formal methods

– Typed functional languages

– Languages, compilers, DSLs

• Systems components: kernels, file systems, network stuff, analysis tools, user land apps, ...

• Haskell for pretty much everything

Page 4

Yes. Haskell can do that.

• Many 20 – 200k LOC Haskell projects

• Oldest projects approaching 10 years

• Teams of 1 – 6 developers at a time

• Much pair programming, whiteboards, code reviews

• 20 – 30 devs over longer project lifetime

• Have built many tools and libraries to support Haskell development on this scale

• Haskell essential to keeping clients happy with:

– Deadlines, performance(!), maintainability

Page 5

Themes

Page 6

Languages matter!

• Writing correct software is difficult!

• Programming languages vary wildly in how well they support robust, secure, safe coding practices

• Languages and tools can aid or hinder our efforts:

– Type systems

– Purity

– Modularity / compositionality

– Abstraction support

– Tools: analyses, provers, model checking

– Buggy implementations

Page 7

Detect errors early!

• Detecting problems before executing the program is critical

– Debugging is hard

– Debugging low level systems is harder

– Debugging low level critical systems is ...

• Culture of error prevention

– “How could we rule out this class of errors?”

– “How could we be more precise?”

Page 8

The toolchain matters!

• Can't build anything without a good tool chain

– Native code compiler

– Libraries, libraries, libraries

– Debugging, tracing

– Profiling, inspection

– Testing, analysis

– Open, modifiable tools

• Particularly when pushing the boundaries

Page 9

Community matters!

• Soup of ideas in a large, open research community:

– Rapid adoption of new ideas

• Support, maintenance and help

– Can't build everything we need in-house!

• Give back via:

– Workshops: CUFP, ICFP, Haskell Symposium

– Hackathons

– Industrial Haskell Group

– Open source code and infrastructure

– Teaching: papers, blogs, talks

Page 10

How Galois uses Haskell

Page 11

1. The Type System

Page 12

Page 13

Types make our lives easier

• Cheap way to verify properties

– Cheaper than theorem proving

– More assurance than testing

– Saves debugging in hostile environments

• Typical conversation:

– Engineer A: “Spec says this must never happen”

– Engineer B: “Can we enforce that in the type system?”

Page 14

Kinds of things types enforce

• Simple things:

– Correct arguments to a function

– Function f does not touch the disk

– No null pointers

– No mixing up of similar concepts, e.g. virtual / physical addresses

• Serious things:

– Information flow policies

– Correct component wiring and integration
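One way to keep virtual and physical addresses apart is a pair of newtypes. The following is a minimal sketch, not code from the talk: the names `VirtAddr`, `PhysAddr`, `kernelBase`, `translate` and `pageFrame` are hypothetical, and the fixed-offset mapping merely stands in for a real page-table walk.

```haskell
import Data.Word (Word64)

-- Hypothetical address types: same representation, distinct types.
newtype VirtAddr = VirtAddr Word64 deriving (Eq, Show)
newtype PhysAddr = PhysAddr Word64 deriving (Eq, Show)

-- Assumed fixed offset, standing in for a real page-table lookup.
kernelBase :: Word64
kernelBase = 0xC0000000

-- Translation is the only route from one address space to the other.
translate :: VirtAddr -> Maybe PhysAddr
translate (VirtAddr v)
  | v >= kernelBase = Just (PhysAddr (v - kernelBase))
  | otherwise       = Nothing

-- Accepts physical addresses only; passing a VirtAddr is rejected
-- by the type checker rather than discovered at run time.
pageFrame :: PhysAddr -> Word64
pageFrame (PhysAddr p) = p `div` 4096
```

Because `newtype` has no runtime cost, the distinction exists only at compile time: `pageFrame (VirtAddr 0)` does not type check.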

Page 15

Recent experience

First demo of a big systems project

• Six engineers

• 50k lines of code, in 5 components, developed over a number of months

• Integrated, tested, demo'd in only a week, two months ahead of schedule, 2 rungs above performance spec.

• 1 space leak, spotted and fixed on first day of testing

• 2 bugs found (typos from spec)

Page 16

Purity is fundamental

• Difficult to show safety without purity

• Code should be pure by default

• Makes large systems easier to glue:

– Pure code is “safe” by default to call

• Effects are “code smells”, and have to be treated carefully

• The world has too many impure languages: don't add to that

Page 17

Types aren't enough though

• Still not expressive enough for a lot of the properties we want to enforce

• We care a lot about sizes in types

– “Input must only be 128, 192 or 256 bits”

– “Type T should be represented with 7 bits”
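Plain Haskell can approximate the first of these constraints by enumerating the permitted sizes and validating once at the boundary. This is only a sketch under that assumption: `KeySize`, `mkKeySize`, `keyBits` and `keyBytes` are hypothetical names, the check still happens at run time, and nothing here addresses the "represented with 7 bits" requirement.

```haskell
-- Only the three permitted sizes are representable.
data KeySize = Bits128 | Bits192 | Bits256
  deriving (Eq, Show, Enum, Bounded)

-- Validate an untrusted bit count once, at the edge of the system.
mkKeySize :: Int -> Maybe KeySize
mkKeySize 128 = Just Bits128
mkKeySize 192 = Just Bits192
mkKeySize 256 = Just Bits256
mkKeySize _   = Nothing

keyBits :: KeySize -> Int
keyBits Bits128 = 128
keyBits Bits192 = 192
keyBits Bits256 = 256

-- Downstream code takes a KeySize and never has to re-validate.
keyBytes :: KeySize -> Int
keyBytes k = keyBits k `div` 8
```

The gap between this and a type that carries the size itself is the kind of expressiveness limit the slide points at.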

Page 18

Other tools in the bag

• Extended static analysis tools

• Model checking

– SAT, SMT, …

• Theorem proving

– Isabelle, Coq

• How much assurance do you need?

Page 19

2. Abstractions

Page 20

Monads

• Constantly rolling new monads

– Captures critical facts about the execution environment in the type

• Directly encodes semantics we care about

– “Computed keys are not visible outside the M component”

– “Function f has read-only access to memory”
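The "read-only access to memory" property can be sketched as a small custom monad whose only primitive is a read. Everything below is illustrative: the `Memory` model, `ReadMem`, `peekByte`, `runReadMem` and `sumBytes` are hypothetical names, and a real system would export `ReadMem` abstractly (without its constructor) so that clients cannot bypass the interface.

```haskell
import Data.Word (Word8)

-- Hypothetical memory model: an association list from address to byte.
type Memory = [(Int, Word8)]

-- A computation that may consult memory but has no way to change it.
newtype ReadMem a = ReadMem (Memory -> a)

instance Functor ReadMem where
  fmap f (ReadMem g) = ReadMem (f . g)

instance Applicative ReadMem where
  pure x = ReadMem (const x)
  ReadMem f <*> ReadMem g = ReadMem (\m -> f m (g m))

instance Monad ReadMem where
  ReadMem g >>= k = ReadMem (\m -> let ReadMem h = k (g m) in h m)

-- The only primitive: read one byte. Unmapped addresses read as 0.
peekByte :: Int -> ReadMem Word8
peekByte addr = ReadMem (maybe 0 id . lookup addr)

runReadMem :: ReadMem a -> Memory -> a
runReadMem (ReadMem g) = g

-- The type documents the effect: this can read memory and nothing else.
sumBytes :: Int -> Int -> ReadMem Int
sumBytes from to = do
  bytes <- mapM peekByte [from .. to]
  return (sum (map fromIntegral bytes))
```

For example, `runReadMem (sumBytes 0 2) [(0,1),(1,2),(2,3)]` sums the three mapped bytes.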

Page 21

Algebraic Data Types

• Every system is either an interpreter or a compiler

– Abstract syntax trees are ubiquitous

– Represent processes symbolically, via ADTs, then evaluate them in a safe (monadic) context

– Precise, concise control over possible values

– But need precise representation control
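The "represent symbolically, then evaluate in a monadic context" pattern can be sketched with a tiny expression language. The `Expr` type and `eval` function are hypothetical, and `Either String` is just one possible choice of safe context.

```haskell
-- Hypothetical expression language, represented as an ADT.
data Expr
  = Lit Int
  | Add Expr Expr
  | Div Expr Expr
  deriving (Eq, Show)

-- Evaluate in the Either monad so that failure is explicit in the type.
eval :: Expr -> Either String Int
eval (Lit n)   = Right n
eval (Add a b) = (+) <$> eval a <*> eval b
eval (Div a b) = do
  x <- eval a
  y <- eval b
  if y == 0
    then Left "division by zero"
    else Right (x `div` y)
```

Since `Expr` has exactly three constructors, the compiler can warn when an evaluator forgets a case, which is one reading of "precise, concise control over possible values".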

Page 22

Laziness

• Captures some concepts perfectly

– “A stream of 4k packets from the wire”

• Critical for control abstractions in DSLs

• Useful for prototyping:

– error "M.F.foo: not implemented"
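Both uses of laziness can be sketched in a few lines. The `Packet` type, the dummy payload and the `checksum` stub are hypothetical; real packets would of course come from the network rather than a list comprehension.

```haskell
-- Hypothetical packet: a sequence number and a 4k payload.
data Packet = Packet { seqNo :: Int, payload :: String }

-- An unbounded stream; each packet is built only when demanded.
packets :: [Packet]
packets = [Packet n (replicate 4096 'x') | n <- [0 ..]]

-- Consumers decide how much of the stream to force.
firstSeqNos :: Int -> [Int]
firstSeqNos k = map seqNo (take k packets)

-- Prototyping: an unfinished component that type checks today
-- and only fails if something actually evaluates it.
checksum :: Packet -> Int
checksum = error "Net.Packet.checksum: not implemented"
```

The rest of the program can be compiled and exercised while `checksum` is still a stub, as long as no code path forces its result.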

Page 23

Laziness

• Makes time and space reasoning harder!

– Mostly harmless in practice

– Stress testing tends to reveal retainers

– Graphical profiling knocks it dead

• Must be able to precisely enable/disable

• Be careful with exceptions and mutation

• whnf/rnf/! are your friends
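As a small illustration of controlling evaluation, the sketch below combines a strict left fold with strict constructor fields (the `!` annotations) so that a running sum and count never accumulate thunks. The `Acc` type and `mean` function are hypothetical.

```haskell
import Data.List (foldl')

-- Strict fields: both components are evaluated whenever an Acc is built.
data Acc = Acc !Double !Int

-- foldl' forces the accumulator to weak head normal form at each step;
-- the strict fields then force the sum and the count themselves.
mean :: [Double] -> Double
mean xs =
  case foldl' step (Acc 0 0) xs of
    Acc _ 0 -> 0
    Acc s n -> s / fromIntegral n
  where
    step (Acc s n) x = Acc (s + x) (n + 1)
```

With a lazy pair and plain `foldl` instead, the same computation would build a chain of unevaluated additions proportional to the length of the input.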

Page 24

Type classes

• We use type classes

– Well defined interfaces between large components (sets of modules)

– Natural code reuse

– Capture general concepts in a natural way

– Capture interface in a clear way

– Kick butt EDSLs (see Lennart's blog)
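A type class used as a component interface might look like the following sketch. The `BlockStore` class, the `ListStore` instance and `copyBlock` are all hypothetical; the point is only that client code is written against the class rather than a concrete implementation.

```haskell
-- Hypothetical storage interface between two components.
class BlockStore s where
  readBlock  :: Int -> s -> Maybe String
  writeBlock :: Int -> String -> s -> s

-- One simple in-memory implementation.
newtype ListStore = ListStore [(Int, String)]

instance BlockStore ListStore where
  readBlock i (ListStore bs) = lookup i bs
  writeBlock i b (ListStore bs) =
    ListStore ((i, b) : filter ((/= i) . fst) bs)

-- Client code sees the interface only, so it works with any instance.
copyBlock :: BlockStore s => Int -> Int -> s -> s
copyBlock from to s = maybe s (\b -> writeBlock to b s) (readBlock from s)
```

A second instance, for example one backed by a disk image, could be swapped in without touching `copyBlock`.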

Page 25

Concurrency

• forkIO rocks

– Cheap, very fast, precise threads

• MVars rock

• STM rocks (safely composable locks!)

• Result: not shy introducing concurrency when appropriate
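The three primitives can be combined in a short sketch: `forkIO` starts the workers, an `MVar` signals completion, and STM keeps a two-variable update atomic. The account-transfer scenario and the names `transfer` and `demo` are hypothetical; `Control.Concurrent.STM` comes from the `stm` package that ships with GHC.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, retry, writeTVar)
import Control.Monad (replicateM_)

-- Move n units between two balances as one atomic step.
-- If funds are short, retry blocks until the source changes.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to n = do
  balance <- readTVar from
  if balance < n
    then retry
    else do
      writeTVar from (balance - n)
      toBalance <- readTVar to
      writeTVar to (toBalance + n)

-- Ten threads shuffle units in both directions; the total is preserved.
demo :: IO (Int, Int)
demo = do
  a <- newTVarIO 1000
  b <- newTVarIO 1000
  done <- newEmptyMVar
  let worker src dst = forkIO $ do
        replicateM_ 100 (atomically (transfer src dst 1))
        putMVar done ()
  mapM_ (const (worker a b)) [1 .. 5 :: Int]
  mapM_ (const (worker b a)) [1 .. 5 :: Int]
  replicateM_ 10 (takeMVar done)
  (,) <$> readTVarIO a <*> readTVarIO b
```

No explicit locks appear anywhere, and `transfer` can be composed with other STM actions inside a single `atomically` block, which is what "safely composable" refers to.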

Page 26

3. Foreign Function Interface

Page 27

Foreign Function Interface

• The world is a messy place

• A good FFI means we can always call someone else's code if necessary

• Have to talk to weird bits of hardware and weird proof systems

• ForeignPtr is a great abstraction tool

• Must have clear API into the runtime system (hot topic at the moment)
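Calling someone else's code can be as small as one declaration. The sketch below binds the C standard library's `cos`; the Haskell-side names `c_cos` and `cosine` are hypothetical, and it does not touch `ForeignPtr` or the runtime API mentioned above.

```haskell
import Foreign.C.Types (CDouble(..))

-- Bind cos from the C math library. An "unsafe" call has less overhead
-- and is appropriate for short calls that never call back into Haskell.
foreign import ccall unsafe "math.h cos" c_cos :: CDouble -> CDouble

-- Wrap the raw binding in ordinary Haskell types.
cosine :: Double -> Double
cosine = realToFrac . c_cos . realToFrac
```

Because the C function is pure, the import can be given a pure type; bindings to code with side effects would return a result in `IO` instead.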

Page 28

4. Meta programming

Page 29

There's always boilerplate

• Abstractions get rid of a lot of repetitive code, but there's always something that's not automated

• We use a little Template Haskell

• Other generics:

– Hinze-style generics

– SYB generics

• Particularly useful for generating instance code for marshalling
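In the SYB style, a single generic function can stand in for many hand-written marshalling instances. The sketch below uses only `Data.Data` from base; the `Header` and `Frame` types and the textual output format are hypothetical and chosen purely for illustration.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, gmapQ, showConstr, toConstr)

-- Hypothetical message types; the Data instances are derived.
data Header = Header Int Bool deriving (Data)
data Frame  = Frame Header Int deriving (Data)

-- One generic encoder for every type with a Data instance:
-- print the constructor name, then recursively encode each field.
encode :: Data a => a -> String
encode x =
  case gmapQ encode x of
    []     -> showConstr (toConstr x)
    fields -> "(" ++ unwords (showConstr (toConstr x) : fields) ++ ")"
```

Adding a new message type then costs one `deriving (Data)` clause rather than a new encoder.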

Page 30

5. Performance

Page 31

Fast enough for majority of things

• Vast majority of code is fast enough

– GHC -O2 -funbox-strict-fields

– Happy with 1 – 2x C for low level code

• Last few drops get squeezed out:

– Profiling

– Low level Haskell

– Cycle-level measurement

– EDSLs to generate better code

– Calling into C

Page 32

Performance

• Really precise performance requires expertise

• Libraries are helping reify “oral traditions” about optimization

• Still a lack of clarity about performance techniques in the broader Haskell community though

Page 33

6. Debugging

Page 34

There are still bugs!

• Testing

– QuickCheck!!!

• Heap profiling

– “By type” profiling of the heap

• GHC -fhpc

– Great for finding exceptions

– Understanding what is executing

• +RTS -sstderr

– Explains what GC, threads, memory are up to

Page 35

7. Documentation

Page 36

Generating supporting artifacts

• Haddock is great for reference material

– Helps capture design in the source

– Code + types becomes self documenting

• Design documents can be partially extracted via:

– The major data and type signatures

– graphmod

– cabalgraph

– HPC analysis

Page 37

8. Libraries

Page 38

Hackage Changes Everything

• There's a library for everything, and often more than one...

• Can sit back and let mtl / monadlib / haxml / hxt fight it out :)

• Static linking → need BSD licensed code if we want to ship

• Haskell Platform to answer QA questions

Page 39

9. Shipping code

Page 40

Cabal

• I don't know how Haskell was possible before Cabal :)

• Quickly adopted Cabal/cabal-install across projects

• cabal-install:

– Simple, clean integration of internal and external components into packageable objects

Page 41

10. Conventions

Page 42

We try to ...

• -Wall police

• Consistent layout

• No tabs

• Import qualified Control.Exception

• {-# LANGUAGE … #-}

• Map exceptions into Either / Maybe
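Two of these conventions, the qualified import and mapping exceptions into `Either` / `Maybe`, can be sketched together. The function names `safeDiv` and `safeDivMaybe` are hypothetical; `try` and `evaluate` are the standard `Control.Exception` functions.

```haskell
import qualified Control.Exception as E

-- Force a pure division inside IO and surface an arithmetic
-- exception (such as divide by zero) as a Left value.
safeDiv :: Int -> Int -> IO (Either E.ArithException Int)
safeDiv x y = E.try (E.evaluate (x `div` y))

-- The same operation for callers that do not need the reason.
safeDivMaybe :: Int -> Int -> IO (Maybe Int)
safeDivMaybe x y = either (const Nothing) Just <$> safeDiv x y
```

Callers then pattern match on an ordinary value instead of installing handlers, and the possibility of failure is visible in the type signature.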

Page 43

We try to ...

• deriving Show

• Line/column for errors if you must throw

• No global mutable state

• Put type sigs in “when you're done” with the design

• Use GHCi for rapid experimentation

• Cabal by default

• Libraries by default

Page 44

11. Things that we still need

Page 45

More support for large scale programming

• Enforcing conventions across the code

• Data representation precision (emerging)

• A serious refactoring tool

• Vetted and audited libraries by experts (Haskell Platform)

• Idioms for mapping design onto types/functions/classes/monads

• Better capture your 100 module design!

Page 46