My last post got an unprecedented amount of attention after appearing on the popular site Hacker News. It spent about 14 hours on the front page there and got almost 11,000 unique visitors that day — a fair way above my usual daily average, for this whole blog, of about a hundred.
I combed through the comments on the Hacker News post, as well as entries on lobste.rs and Reddit. Counting only comments directly about the post (rather than to other commenters), I reckoned there were:
- Nine comments responding only, or primarily, to performance differences in the language implementations;
- Six suggestions to hoist regexp construction out of the split function in the OCaml version, to make it faster: four of them included code, and they were all quite polite;
- Six objections that my Python code was un-idiomatic, un-Pythonic, or generally strange. Five of those proposed rewriting it to use the CSV parser module or pandas — good advice in general, though not really in line with the purpose of the post;
- Six code rewrites in, enthusiasms about, or defences of, F♯;
- Six comments suggesting other languages to look at (4 x Haskell; Scala; Mythryl);
- Three observations that the Python code would run faster with PyPy;
- Three people sharing my general feeling that it would be a good thing if Standard ML had more practical things going for it;
- Two people having difficulty with my use of the term “Big Data” to describe a 300MB CSV file. (Sorry. I really only ever use the term ironically, but you weren’t to know that.)
I must say, OCaml developers seem like a keen bunch. Actually the vibe I get about both OCaml and F♯, from these few comments, is a happy one. About SML it’s more a sense of distant yearning. The Python comments were a bit grumpier, but that’s reasonable given that my post didn’t exactly engage with the way things actually get done in Python.
It’s quite obviously hard to resist a speed contest. I wasn’t initially going to include the runtime measurements in the post, but I couldn’t help myself either. Still, none of the coding decisions I made when writing the test programs had anything to do with speed. (Scalability, in a simplistic way, yes: I made sure not to read the whole file into memory, and to avoid non-tail-recursive code.) Then of course more people responded about performance than anything else. And even though I can see that that way lies madness, I’d still like to find time to follow up on the question of why the SML code didn’t go as quickly as the OCaml did.
Since I wrote that post, I’ve had a piece of real work come up for which these languages are quite well suited, involving parsing and manipulating a modest amount of symbolic data with a very specialised structure. I decided to do it using Standard ML this time, and hope to find out whether there’s a point at which it becomes more impractical than delightful.