Rules of thumb for functional APIs

I’ve been trying to get to grips with what makes an API clean and pleasing to use in a functional programming language. (In my case this language has been Yeti, an attractive language that uses the Java virtual machine.)

Here are some notes, for my own reference as much as anyone’s. Some of this may be particular to Yeti, but most of it is (hopefully, if I have things right) going to be obvious stuff to any functional programmer. Please leave a comment if you have any more or better suggestions!

Avoid mutable state where practical

  • It’s easier to reason correctly about the behaviour of a series of functions that each take an object and return a new object derived from it, leaving the original one unaffected, than a series of functions that can each change some hidden state in the object and so affect the behaviour of all subsequent functions in an invisible way.
  • This is the nub of practical functional programming: pure functions—functions without hidden state—are predictable, testable, and easier to understand within a wider system.
  • Objects with mutable state should be limited to things that really do have some conceptual internal state, such as database handles or streams.
  • So where in an object-oriented language you might have a class with internal state plus a set of methods that act on it, in a functional language organise this as a module whose functions (named somewhat like methods) accept the state and return new state. The state is most likely a struct, maybe with getters but no setters.
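
As a rough illustration of the "accept state, return new state" style, here is a minimal sketch in Python rather than Yeti, since the idea carries across languages. The Account record and the deposit and withdraw functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Account:
    """Hypothetical state record: fields can be read but not reassigned."""
    owner: str
    balance: int = 0


def deposit(account, amount):
    """Return a new Account with the amount added; the argument is unchanged."""
    return replace(account, balance=account.balance + amount)


def withdraw(account, amount):
    """Return a new Account with the amount removed, or raise if overdrawn."""
    if amount > account.balance:
        raise ValueError("insufficient funds")
    return replace(account, balance=account.balance - amount)


opened = Account(owner="alice")
funded = deposit(opened, 100)
spent = withdraw(funded, 30)

# Every intermediate value is still intact and usable.
print(opened.balance, funded.balance, spent.balance)  # prints: 0 100 70
```

Because no function changes the record it was given, any of the three values can be passed on, compared, or tested without caring what happened to the others.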

Give functions names relative to their modules

  • A Yeti module is a file whose top-level code evaluates to something. Typically it contains one or more bindings (function declarations usually) which are returned within a struct at the end of the file’s top-level code.
  • (Modules can be loaded either with a plain load expression or in a binding: load my.module versus m = load my.module. In the first case a function func within the module would be referred to after loading simply as func; in the second, it would be m.func.)
  • I think it’s best to assume that everyone will load your module using the second form, and to name functions so that they make sense with the module name immediately before them. With that form you don’t need to worry about your names colliding with those in other modules or the standard library. As a programmer used to object-oriented languages, I find this helps when structuring code to avoid too much mutable state, because the module takes on the namespacing role that a class would otherwise play.
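
The same naming convention can be seen in other languages. As a loose analogue (Python, not Yeti), the standard json and pickle modules both export functions called dumps and loads, which only make sense, and only avoid colliding, when the module name comes first. The settings dictionary is made up for the example.

```python
import json
import pickle

settings = {"size": 1024, "window": "hann"}

# Both modules export functions called dumps and loads. Qualified by the
# module name, they read naturally and cannot collide with each other.
as_text = json.dumps(settings)
as_bytes = pickle.dumps(settings)

print(json.loads(as_text) == settings)     # prints: True
print(pickle.loads(as_bytes) == settings)  # prints: True
```

Here import json plays roughly the role of m = load my.module: the short, generic function names are readable precisely because the caller always sees them with their module in front.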

Distinguish between curried arguments and other ways of packaging

  • The obvious way to write a function that takes more than one argument in Yeti is using what are known as “curried” arguments, with syntax: f a b = a + b
  • This allows partial application: with two arguments f 2 3 is 5, but with only one, f 2 makes a new function that takes another argument and adds 2 to it.
  • Curried arguments are useful where callers might actually want to bind the first argument and then reuse the function. As an extreme example, one might introduce a second argument for a function that in theory only needs one, like an FFT: in a function declared fft size data, the size argument is redundant because it can be queried from the data array, but it allows the FFT tables to be precomputed when the first argument is bound. So fft 1024, with the second argument left unbound, behaves a bit like an object constructor.
  • But it’s easy to get used to the idea that this is just how multiple arguments are passed. Many functions won’t have callers that want to do partial application and won’t benefit from knowing some arguments before the rest. And the disadvantage of curried arguments is that the caller needs to remember what order they appear in.
  • So, functions that take a set of related arguments at once, and can’t benefit from knowing one argument in advance of the rest, should accept them as named values in a struct instead. It makes for a more discoverable API.
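
The two styles above can be sketched side by side. This is an illustration in Python rather than Yeti, under some loud assumptions: dft is a naive O(n²) discrete Fourier transform standing in for a real FFT, a closure stands in for a curried function, and keyword-only arguments stand in for a struct of named values. The names dft and frame_count are hypothetical.

```python
import cmath


def dft(size):
    """Bind the size first so the twiddle-factor table is built only once."""
    table = [cmath.exp(-2j * cmath.pi * k / size) for k in range(size)]

    def transform(data):
        """Naive O(n^2) transform of one block, reusing the shared table."""
        if len(data) != size:
            raise ValueError("expected %d samples" % size)
        return [
            sum(x * table[(k * n) % size] for n, x in enumerate(data))
            for k in range(size)
        ]

    return transform


def frame_count(*, samples, block_size, hop):
    """Keyword-only arguments: callers name each value, so order is irrelevant."""
    if samples < block_size:
        return 0
    return 1 + (samples - block_size) // hop


dft4 = dft(4)                   # like `fft 1024` with the data left unbound
spectrum = dft4([1, 1, 1, 1])   # a constant signal has energy in bin 0 only
print(round(abs(spectrum[0])))  # prints: 4

print(frame_count(hop=2, samples=10, block_size=4))  # prints: 4
```

Binding the size early pays off for dft because there is real work to do before the data arrives. frame_count has nothing to gain from partial application, so naming its arguments spares the caller from remembering their order.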