Although there was nothing very deep about this change or its causes, I found it interesting, in part because I had used a partly test-driven process to evolve the original API and I felt there might be a connection between that process and the resulting problems. Here are a few thoughts prompted by this change.
Passing the tests is not enough
Test-driven development is a satisfying and welcome prop. It allows you to reframe difficult questions of algorithm design in terms of easier questions about what an algorithm should produce.
But producing the right results in every test case you can think of is not enough. It’s possible to exercise almost the whole of your implementation in terms of static coverage, yet still have the wrong API.
In other words, it may be just as easy to overfit the API to the test cases as it is to overfit the test cases to the implementation.
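To make that concrete, here is a minimal sketch using a made-up string-based triple store. The `StringStore` class and its `add` and `match` methods are invented for illustration and are not the API of the library discussed here. Every line of the class is exercised and the checks pass, yet the API has no way to say whether an object is a URI or a piece of text.

```python
class StringStore:
    """Hypothetical triple store in which every node is a plain string."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        # None acts as a wildcard for that position.
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = StringStore()

# Intended meaning: the object is a URI node.
store.add("http://example.org/a", "http://example.org/homepage",
          "http://example.org/")
assert store.match(obj="http://example.org/") == [
    ("http://example.org/a", "http://example.org/homepage",
     "http://example.org/")
]

# Intended meaning: the object is a text literal that happens to look
# like a URI. The API cannot express the difference, so this statement
# is silently merged with the previous one.
store.add("http://example.org/a", "http://example.org/homepage",
          "http://example.org/")
assert len(store.match()) == 1
```

Both assertions hold and static coverage is complete, but the second statement has been lost. No test written purely in terms of these strings can notice, because the tests share the API's blind spot.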
Unit testing may be easier than API design
So, designing a good API is harder than writing tests for it. But to rephrase that more encouragingly: writing tests is easier than designing the API.
If, like me, you’re used to thinking of unit testing as requiring more effort than “just bunging together an API”, this should be a worthwhile corrective in both directions.
API design is harder than you think, but unit testing is easier. Having unit tests doesn’t make it any harder to change the API, either: maintaining tests during redesign is seldom difficult, and having tests helps to ensure the logic doesn’t get broken.
Types are not just annoying artifacts of the programming language
An unfortunate consequence of having worked with data representation systems like RDF mostly in the context of Web backends and scripting languages is a tendency to treat everything as “just a string”.
This is fine if your string has enough syntax to be able to distinguish types properly by parsing it—for example, if you represent RDF using Turtle and query it using SPARQL.
But if you break down your data model into individual node components while continuing to represent those as untyped strings, you’re going to be in trouble. You can’t get away without understanding, and somewhere making explicit, the underlying type model.
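As a rough sketch of what making the type model explicit might look like, the following assumes a node is one of the three RDF node kinds (URI, literal, or blank node) and carries that kind alongside its string value. The names `NodeType` and `Node` are hypothetical, chosen for this example only.

```python
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    # The three kinds of node in the RDF data model.
    URI = "uri"
    LITERAL = "literal"
    BLANK = "blank"


@dataclass(frozen=True)
class Node:
    """Hypothetical typed node: the kind travels with the value."""
    type: NodeType
    value: str
    datatype: str = ""  # optional datatype URI, only meaningful for literals


homepage_uri = Node(NodeType.URI, "http://example.org/")
homepage_text = Node(NodeType.LITERAL, "http://example.org/")

# Same characters, different nodes.
assert homepage_uri.value == homepage_text.value
assert homepage_uri != homepage_text
```

Because the dataclass is frozen, nodes are hashable and can be stored in sets or used as dictionary keys, and two nodes compare equal only when both kind and value agree.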
Predictability contributes to simplicity
A simpler API is not necessarily one that leads to fewer or shorter lines of code. It’s one that leads to less confusion and more certainty, and carrying around type information helps, just as precondition testing and fail-fast principles can.
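For instance, a constructor that checks its preconditions costs a few extra lines but removes a whole class of doubt for the caller. This is only a sketch: `make_uri_node` is a hypothetical helper, and its checks (a non-empty string with no whitespace) are a deliberately crude stand-in for real URI validation.

```python
def make_uri_node(value):
    """Hypothetical constructor that fails fast on implausible input."""
    if not isinstance(value, str):
        raise TypeError(f"URI must be a str, got {type(value).__name__}")
    # Crude plausibility check: URIs are non-empty and contain no whitespace.
    if not value or any(ch.isspace() for ch in value):
        raise ValueError(f"not a plausible URI: {value!r}")
    return ("uri", value)


node = make_uri_node("http://example.org/a")

try:
    make_uri_node("some descriptive text")
except ValueError as err:
    # The mistake surfaces at the call site, not later in a failed query.
    print("rejected:", err)
```

The calling code is longer than it would be with bare strings, but a caller who gets a node back knows what kind of thing it is.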
It’s probably still wrong
I’ve effectively found and fixed a bug, one that happened to be in the API rather than the implementation. But there are probably many more still to find. I need a broader population of software using the library before I can be really confident that the API works.
Of course it’s not unusual to see significant API holes in 1.0 releases of a library, and to get them tightened up for 2.0. It’s not the end of the world. But it ought to be easier and cheaper to fix these things earlier rather than later.
Now, I wonder what else is wrong…