Undergraduate programming languages

I read two quite different articles about programming in academia today.

I don’t know Yossi Kreinin, and when his piece Why bad scientific code beats code following “best practices” appeared on the Hacker News front page, I guessed that I probably wouldn’t agree with it. I’m a programmer working in academia who has spent some time trying to find ways to improve the code that researchers write, and I don’t want to be told I’m wasting my time.

But on the whole I agreed with him. A totally naive programmer is going to produce messy, organic, but basically linear code that will usually be easier to understand and work with than the code of someone who has learned a bit of Java and wants to put everything in a factory class. The part I don’t agree with of course, and that I suspect he put in just for the sake of the headline, is that these really are best practices.

I think that one of the hidden goals of a project like Software Carpentry is to teach that the reason software developers appear to be “too clever”, and somehow unreachable for the scientist outside the software field, is that they are being too clever. Programming gets over-complicated; you can learn to write good code (that is, readable code) more easily than you can learn to write bad code the professional programmers’ way.

I also identified with Yossi’s line that “I claim to have repented, mostly”. Most of the code I’ve written in my career is not very good, by this standard. It’s a long path to enlightenment.

The other article was by my friend Christophe Rhodes: What is a good language to teach at undergraduate level for Computing degrees?

A computing degree is an odd thing. Computer science is the theory of how computers and programs work. Computing, as a subject, is computer science plus some stuff about how we should actually program them in order to get a job done. The two are very different.

Christophe breaks down objectives of a computing degree: Think, experiment, job, career, study, society. I’m not very familiar with the study objective, which refers to postgraduate study in a computing department (something I never did). Think refers to teaching students how to think and solve problems computationally. I think he may be missing a step, and it may be useful to separate thinking about how the computer works (“understand”?) from thinking about how the programmer can work.

There is also, of course, the question of how to avoid making your programmer feel too clever.

As an undergraduate in 1990-94, I was taught, in this order, Prolog, C, ML, 68k assembly language, and C++. I also learned some Lisp, though I can’t remember being formally taught it. Prolog, C, and the assembly language were all good bases for what I referred to as the understand objective, getting something out of the history of computing and learning about how computers work. ML was a wonderful introduction to thinking, and it’s no coincidence that I’ve recently been programming in an ML variant as an engaging alternative to work.

The hard one to evaluate is C++. It was a very poor teaching language for object-oriented programming, which is a pity because at the time we learned it, I didn’t get object-oriented programming at all. And we learned nothing about any other kind of programming from it, having already been taught three different high-level languages. So it contributed very poorly to the think objective and not at all to the understand one. It’s an awful language to learn to write clear, reliable code in, therefore bad from the society perspective, and you can (as I have) spend 20 years learning to write it, which is obviously ridiculous, hence you would think also bad from the job perspective.

But it turns out that C++ has been the most valuable job language there could be, because (a) it is resiliently portable: it turns out that write-once-run-anywhere with a virtual machine comes and goes in waves, depending on the whim of the top operating system provider of the time, but compiling to machine code seems to be eternal; (b) it is just about able to ape a number of programming paradigms, so you can get away with adopting a style without adopting another language; and (c) its complexity means that it looks good on your CV. I honestly wish, as a 20+-year C++ programmer, that we could make it so that nobody ever had to learn C++ again, but even now I don’t think that is the case.

The one goal completely unaddressed by my own degree, and almost every language I’ve learned since, is experiment. I think there are two essentials for this: a responsive interactive interpreter, and visual responsiveness, like a graphical scene environment or plotting tools. Most of the things you want to do here are probably growing up in Javascript and HTML5.

I don’t really know anything about teaching, about the actual on-the-ground business of making people learn stuff. So you shouldn’t listen to me on this next bit, I’m just daydreaming. But if I were planning a 3-year computing degree course in my head now, I think I would aim to teach, in this order:

Python, for the basics of procedural computing. It’s the cleanest language for doing satisfying simple loops and input/output transformations, and is a sound general-purpose language.
Clojure. I’ve decided now that I don’t so much care for Lisp syntax day-to-day, but a Lisp gives you so much to talk about and investigate without getting lost in the specific requirements of the language. And a Lisp on the JVM gives you more depth: advanced students could learn quite a lot about Java without ever actually being taught Java.
Some assembly language, probably ARM and preferably with a nice visible bit of circuit board on hand.
Javascript, but not aiming to teach the language so much as using a specific interactive framework and with a specific small game-development project to complete. I would probably also use Typescript rather than untyped Javascript.
Haskell. As an ML guy it was always my enemy, but Haskell is the functional language that has endured. It’s a follow-on course from Lisp.
C++. Because, as far as I can see at the moment, you pretty much have to. But please, give them modern-style pointer ownership and RAII (i.e. avoiding explicit heap allocation).
Python again, in a closing course that taught people how best to do things in an actual working environment. Testing, not being too much of a smartarse, etc.

But the odd thing about that set of languages is that, just as with my own degree, you never get a very good dedicated object-oriented language. Maybe Objective-C could replace C++; it’s probably a clearer pedagogical object language, but C++ is everywhere while Objective-C is effectively platform-specific, even if it is a very popular platform.

And be sure to teach them version control.

Oh, and don’t forget to add a double-entry book-keeping course. It’s probably more useful than the programming stuff.