Notes on Idris


Idris Book cover

In a bid to expand my programming brain by learning something about “dependent types”, I recently bought the Idris book.

(Idris is a pure functional programming language that is mostly known for supporting dependent types. Not knowing what that really meant, and seeing that this recently-published book written by the author of the language was warmly reviewed on Amazon, I saw an opportunity.)

The idea behind dependent typing is to allow types to be declared as having dependencies on information that would, in most languages, only be known at runtime. For example, a function might accept two arrays as arguments, but only work correctly if the two arrays that are actually passed to it have the same length: with dependent types, this dependency can be written into the type declaration and checked at compile time. On the face of it we can’t in general know things like the number of elements in an arbitrary array at compile time, so this seems like compelling magic. Types are, in fact, first class values, and if that raises a lot of horribly recursive-seeming questions in your mind, then this book might help.
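
To make that concrete, here is a minimal sketch of my own (not from the book), using the Vect type from Idris’s Data.Vect library, whose length is part of its type:

```idris
import Data.Vect

-- Both arguments and the result are declared to have the same length n.
-- Calling this with vectors of different lengths is a compile-time error.
addPairwise : Vect n Double -> Vect n Double -> Vect n Double
addPairwise xs ys = zipWith (+) xs ys
```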

A number of academic languages are exploring this area, but Idris is interesting because of its ambition to be useful for general-purpose programming. The book is pleasingly straightforward about this: it adopts an attitude that “this is how we program, and I’ll help you do it”. In fact, the Idris book might be the most inspiring programming book I’ve read since ML for the Working Programmer (although the two are very different, more so than their synopses would suggest). It’s not that the ideas in it are new — although they are, to me — but that it presents them so crisply as to project a tantalising image of a better practice of programming.

What the book suggests

The principles in the book — as I read it — are:

  • We always write down the type of a function before writing the function. Because the type system is so expressive, this gives the compiler, and the programmer, an awful lot of information — sometimes so much that there remains only one obvious way to satisfy the type requirements when writing the rest of the function. (The parallel with test-driven development, which can be a useful thinking aid when you aren’t sure how to address a problem, presumably inspired the title of the book.)
  • Sometimes, we don’t even implement the rest of the function yet. Idris supports a lovely feature called “holes”, which are just names you plonk down in place of code you haven’t got around to writing yet. The compiler will happily type-check the rest of the program as if the holes were really there, and then report the holes and their types to you to give you a checklist of the bits you still need to fill in.
  • Functions are pure, as in they simply convert their input values to their return values and don’t modify any other state as they go along, and are if possible total, meaning that every input has a corresponding return value and the function won’t bail out or enter an infinite loop. The compiler will actually check that your functions are total — the Halting Problem shows that this can’t be done in general, but apparently it can be done enough of the time to be useful. The prospect of having a compiler confirm that your program cannot crash is exciting even to someone used to Standard ML, where typing is generally sound but inexhaustive cases and range errors still happen.
  • Because functions are pure, all input and output goes through monads: I/O effects are encapsulated in function return types. The book gives a supremely matter-of-fact introduction to monadic I/O, without using the word monad.
  • We use an IDE which supports all this with handy shortcuts — there is an impressive Idris integration for Atom which is used extensively throughout the book. For an Emacs user like me this is slightly offputting. There is an Emacs Idris mode as well, it’s just that so much of the book is expressed in terms of keystrokes in Atom.

None of these principles refers to dependent types at all, but the strength of the type system is what makes it plausible that this could all work. Idris does a lot of evaluation at compile time, to support the type system and totality checking, so you always have the feeling of having run your program a few times before it has even compiled.
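
As a sketch of how holes look in practice (my own example, not the book’s): a name prefixed with ? stands in for a missing expression, and the whole program still type-checks around it.

```idris
import Data.Vect

-- ?restOfAppend is a hole standing in for the unwritten second case.
-- The compiler accepts the program and reports the type the hole must
-- have, length arithmetic included.
append : Vect n a -> Vect m a -> Vect (n + m) a
append [] ys = ys
append (x :: xs) ys = ?restOfAppend
```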

So, I enjoyed the book. To return to earth-based languages for a moment, it reminds me a bit of Bjarne Stroustrup’s “A Tour of C++”, which successfully sets out “modern C++” as if most of the awkward history of C++ had never happened. One could see this book as setting out a “modern Haskell” as if the actual Haskell had not happened. Idris is quite closely based on Haskell, and I think, as someone who has never been a Haskell programmer, that much of the syntax and prelude library are the same. The biggest difference is that Idris uses eager evaluation, where Haskell is lazily evaluated. (See 10 things Idris improved over Haskell for what appears to be an informed point of view.)

Then what happened?

How did I actually get on with it? After absorbing some of this, I set out to write my usual test case (see Four MLs and a Python) of reading a CSV file of numbers and adding them up. The goal is to read a text file, split each line into number columns, sum the columns across all lines, then print out a single line of comma-separated sums — and to do it in a streaming fashion without reading the whole file into memory.

Here’s my first cut at it. Note that this doesn’t actually use dependent types at all.

module Main

import Data.String

parseFields : List String -> Maybe (List Double)
parseFields strs =
  foldr (\str, acc => case (parseDouble str, acc) of
                           (Just d, Just ds) => Just (d :: ds)
                           _ => Nothing)
        (Just []) strs

parseLine : String -> Maybe (List Double)
parseLine str =
  parseFields $ Strings.split (== ',') str

sumFromFile : List Double -> File -> IO (Either String (List Double))
sumFromFile xs f =
  do False <- fEOF f
     | True => pure (Right xs)
     Right line <- fGetLine f
     | Left err => pure (Left "Failed to read line from file")
     if line == ""
     then sumFromFile xs f
     else case (xs, parseLine line) of
               ([],  Just xs2) => sumFromFile xs2 f
               (xs1, Just xs2) => if length xs1 == length xs2
                                  then sumFromFile (zipWith (+) xs1 xs2) f
                                  else pure (Left "Inconsistent-length rows")
               (_, Nothing) => pure (Left $ "Failed to parse line: " ++ line)

sumFromFileName : String -> IO (Either String (List Double))
sumFromFileName filename =
  do Right f <- openFile filename Read
     | Left err => pure (Left "Failed to open file")
     sumFromFile [] f

main : IO ()
main =
  do [_, filename] <- getArgs
     | _ => putStrLn "Exactly 1 filename must be given"
     Right result <- sumFromFileName filename
     | Left err => putStrLn err
     putStrLn (pack $ intercalate [','] $ map (unpack . show) $ result)

Not beautiful, and I had some problems getting this far. The first thing is that compile times are very, very slow. As soon as any I/O got involved, a simple program took 25 seconds or more to build. Surprisingly, almost all of the time was used by gcc, compiling the C output of the Idris compiler. The Idris type checker itself was fast enough that this at least didn’t affect the editing cycle too much.

The combination of monadic I/O and eager evaluation was also a problem for me. My test program needs to process lines from a file one at a time, without reading the whole file first. In an impure language like SML, this is easy: you can read a line in any function, without involving any I/O in the type signature of the function. In a pure language, you can only read in an I/O context. In a pure lazy language, you can read once in an I/O context and get back a lazily-evaluated list of all the lines in a file (Haskell has a prelude function called lines that does this), which makes toy examples like this simple without having to engage properly with your monad. In a pure eager language, it isn’t so simple: you have to actually “do the I/O” and permeate the I/O context through the program.

This conceptual difficulty was compounded by the book’s caginess about how to read from a file. It’s full of examples that read from the terminal, but the word “file” isn’t in the index. I have the impression this might be because the “right way” is to use a higher-level stream interface, but that interface maybe wasn’t settled enough to go into a textbook.

(As a non-Haskell programmer I find it hard to warm to the “do” syntax used as sugar for monadic I/O in Haskell and Idris. It is extremely ingenious, especially the way it allows alternation — see the lines starting with a pipe character in the above listing — as shorthand for handling error cases. But it troubles me, reminding me of working with languages that layered procedural sugar over Lisp, like the RLisp that the REDUCE algebra system was largely written in. Robert Harper once described Haskell as “the world’s best imperative programming language” and I can see where that comes from: if you spend all your time in I/O context, you don’t feel like you’re writing functional code at all.)

Anyway, let’s follow up my clumsy first attempt with a clumsy second attempt. This one makes use of what seems to be the “hello, world” of dependent typing, namely vectors whose length is part of their compile-time type. With this, I guess we can guarantee that the lowest-level functions, such as summing two vectors, are only called with arguments of the proper lengths. That doesn’t achieve much for our program, since we’re dealing directly with arbitrary-length user inputs anyway, but I can see it could be useful to bubble error handling up out of library code.

Here’s that second effort:

module Main

import Data.Vect
import Data.String

sumVects : Vect len Double -> Vect len Double -> Vect len Double
sumVects v1 v2 = 
  zipWith (+) v1 v2

parseFields : List String -> Maybe (List Double)
parseFields strs =
  foldr (\str, acc => case (parseDouble str, acc) of
                           (Just d, Just ds) => Just (d :: ds)
                           _ => Nothing)
        (Just []) strs

parseVect : String -> Maybe (len ** Vect len Double)
parseVect str =
  case parseFields $ Strings.split (== ',') str of
  Nothing => Nothing
  Just xs => Just (_ ** fromList xs)  

sumFromFile : Maybe (len ** Vect len Double) -> File -> 
              IO (Either String (len ** Vect len Double))
sumFromFile acc f =
  do False <- fEOF f
     | True => case acc of
               Nothing => pure (Right (_ ** []))
               Just v => pure (Right v)
     Right line <- fGetLine f
     | Left err => pure (Left "Failed to read line from file")
     if line == ""
     then sumFromFile acc f
     else case (acc, parseVect line) of
               (_, Nothing) => pure (Left $ "Failed to parse line: " ++ line)
               (Nothing, other) => sumFromFile other f
               (Just (len ** xs), Just (len' ** xs')) =>
                    case exactLength len xs' of
                         Nothing => pure (Left "Inconsistent-length rows")
                         Just xs' =>
                             sumFromFile (Just (len ** (sumVects xs xs'))) f

sumFromFileName : String -> IO (Either String (len ** Vect len Double))
sumFromFileName filename =
  do Right f <- openFile filename Read
     | Left err => pure (Left "Failed to open file")
     sumFromFile Nothing f

main : IO ()
main =
  do [_, filename] <- getArgs
     | _ => putStrLn "Exactly 1 filename must be given"
     Right (_ ** result) <- sumFromFileName filename
     | Left err => putStrLn err
     putStrLn (pack $ intercalate [','] $ map (unpack . show) $ toList result)

Horrible. This is complicated and laborious, at least twice as long as it should be, and makes me feel as if I’ve missed something essential about the nature of the problem. You can see that, as a programmer, I’m struggling a surprising amount here.

Both of these examples compile, run, and get the right answers. But when I tried them on a big input file, both versions took more than 8 minutes to process it — a file that each of the languages I tried out for that earlier post had processed in less than 30 seconds. A sampling profiler did not pick out any obvious single source of delay. Something strange is going on here.

All in all, not a wild success. That’s a pity, because:

Good Things

My failure to produce a nice program that worked efficiently didn’t entirely burst the bubble. I liked a lot in my first attempt to use Idris.

The language has a gloriously clean syntax. I know my examples show it in a poor light; compare instead this astonishingly simple typesafe implementation of a printf-like string formatting function, where format arguments are typechecked based on the content of the format string, something you can’t do at all in most languages. Idris tidies up a number of things from Haskell, and has a beautifully regular syntax for naming types.
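
The gist of that example, reconstructed here from my reading of the book (details may differ from the linked implementation): the format string is converted into a data structure, and a type-level function over that structure determines how many further arguments printf takes and what their types are.

```idris
data Format = Number Format | Str Format | Lit String Format | End

-- A function computing a *type* from a value: each specifier in the
-- format adds one argument of the corresponding type.
PrintfType : Format -> Type
PrintfType (Number fmt)  = Int -> PrintfType fmt
PrintfType (Str fmt)     = String -> PrintfType fmt
PrintfType (Lit str fmt) = PrintfType fmt
PrintfType End           = String

printfFmt : (fmt : Format) -> (acc : String) -> PrintfType fmt
printfFmt (Number fmt)  acc = \i => printfFmt fmt (acc ++ show i)
printfFmt (Str fmt)     acc = \s => printfFmt fmt (acc ++ s)
printfFmt (Lit str fmt) acc = printfFmt fmt (acc ++ str)
printfFmt End           acc = acc

toFormat : List Char -> Format
toFormat []                 = End
toFormat ('%' :: 'd' :: cs) = Number (toFormat cs)
toFormat ('%' :: 's' :: cs) = Str (toFormat cs)
toFormat (c :: cs)          = case toFormat cs of
                                   Lit lit fmt => Lit (strCons c lit) fmt
                                   fmt => Lit (strCons c "") fmt

printf : (fmt : String) -> PrintfType (toFormat (unpack fmt))
printf fmt = printfFmt _ ""
```

So printf "%s = %d" expects a String and then an Int, and handing it anything else is rejected at compile time.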

Idris is also nice and straightforward, compared with other strongly statically-typed languages, when it comes to things like using arithmetic operators for different kinds of number, or mechanisms like “map” over different types. I assume this is down to the pervasive use of what in Haskell are known as type classes (here called interfaces).

I didn’t run into any problems using the compiler and tools. They were easy to obtain and run, and the compiler (version 1.1.1) gives very clear error messages considering how sophisticated some of the checks it does are. The compiler generally “feels” solid.

I’m not sold on these languages being whitespace-sensitive — life’s too short to indent code by hand — though this does contribute to the tidy syntax. Idris seems almost pathologically sensitive to indentation level, and a single space too many when indenting a “do” block can leave the compiler lost. But this turned out to be less of a problem in practice than I had expected.

Will I be using Idris for anything bigger? Not yet, but I’d like to get better at it. I’m going to keep an eye on it and drop in occasionally to try another introductory problem or two. Meanwhile if you’d like to improve on anything I’ve written here, please do post below.

MIREX 2017 submissions

For the fifth year in a row, the Centre for Digital Music has submitted a number of Vamp audio analysis plugins to the MIREX evaluation for “music information retrieval” tasks. This year we submitted the same set of plugins as last year; there were no new implementations, and some of the existing ones are so old as to have celebrated their tenth birthday earlier in the year. So the goal is not to provide state-of-the-art results, but to give other methods a stable baseline for comparison and to check each year’s evaluation metrics and datasets against neighbouring years. I’ve written about this in each of the four previous years: see posts about 2016, 2015, 2014, and 2013.

Obviously, having submitted exactly the same plugins as last year, we expect basically the same results. But the other entries we’re up against will have changed, so here’s a review of how each category went.

(Note: we dropped one category this year, Audio Downbeat Estimation. Last year’s submission was not well prepared for reasons I touched on in last year’s post, and I didn’t find time to rework it.)

Structural Segmentation

Results for the four datasets are here, here, here, and here. Our results, for Segmentino from Matthias Mauch and the older QM Segmenter from Mark Levy, were the same as last year, with the caveat that the QM Segmenter uses random initialisation and so never gets exactly the same results twice.

Surprisingly, nobody else entered anything to this category this year, which seems a pity because it’s an interesting problem. This category seems to have peaked around 2012-2013.

Multiple Fundamental Frequency Estimation and Tracking

An exciting year for this mind-bogglingly difficult category, with 14 entries from ten different sets of authors and a straight fight between template decomposition methods (including our Silvet plugin, from Emmanouil Benetos’s work) and trendy convolutional neural networks. Results are here and here.

With so many entries and evaluations it’s not that easy to get a clear picture, and no single method appears to be overwhelmingly strong. There were fine results in some evaluations for CNN methods from Thickstun et al and Thomé and Ahlbäck, for Pogorelyuk and Rowley‘s very intriguing “Dynamic Mode Decomposition”, and for a few others whose abstracts are missing from the entry site and so can’t be linked to.

Silvet, with the same results as last year, does well enough to be interesting, but in most cases it isn’t troubling the best of the newer methods.

Audio Onset Detection

Bit of a puzzle here, as our two plugin submissions both got slightly different results from last year despite being unchanged implementations of deterministic methods invoked in the same way on the same data sets.

Last year saw a big expansion in the number of entries, and this year there were nearly as many. Just as last year, our old plugins did modestly, but again some of the new experiments fared a bit less well so we weren’t quite at the bottom. Results here.

Audio Beat Tracking

Same puzzle as in onset detection: while our results were basically similar to last year, they weren’t identical. The 2015 and 2016 results were identical and we would have expected the same again in 2017.

That apart, there’s little to report since last year. Results are here, here, and here.

Audio Tempo Estimation

Last year there were two entries in this category, ours and a much stronger one from Sebastian Böck. This year sees one addition, from Hendrik Schreiber and Meinard Müller, which fares creditably. The results are here.

Audio Key Detection

Two pretty successful new submissions this year, both using convolutional neural networks: one from Korzeniowski, Böck, Krebs and Widmer, and the other from Hendrik Schreiber. Our old plugin (from work by Katy Noland) does not fare tragically, but it’s clear that some other methods are getting much closer to the sort of performance one imagines should be realistic. The results are linked from here.

Intuitively, key estimation seems like the sort of problem that is interesting only so long as you don’t have enough training data. As a 24-way classification with large enough training datasets, it looks a bit mundane. The problem becomes, what does it mean for a piece of music to be in a particular key anyway? Submissions are not expected to answer that, but presumably it sets an upper bound on performance.

Audio Chord Estimation

Another increase in the number of test datasets, from 5 to 7, and a strong category again. Last year our submission Chordino (by Matthias Mauch) was beginning to trail, though it wasn’t quite at the back. This year some of the weaker submissions have not been repeated, some new entries have appeared, and Chordino is in last place for every evaluation. It’s not far behind — perceptually it’s still a pretty good algorithm — but some of the other methods are very impressive now. Here are the results.

The abstracts accompanying the two submissions from the audio information processing group at Fudan University in Shanghai (Jiang, Li and Wu and Wu, Feng and Li) are both well worth a read. The former paper refers closely to Chordino, using the same NNLS Chroma features with a new front-end. Meanwhile, the latter paper proposes a method worth remembering for dinner parties, using deep residual networks trained from MIDI-synchronised constant-Q representations of audio with a bidirectional long-short-term memory and conditional random field for labelling.


Notes from the Audio Developer Conference

I’ve spent the last couple of days at the 2017 Audio Developer Conference organised by ROLI. This is a get-together and technical conference for people who work on audio software and software-driven hardware, in practice mostly people working on music applications.

I don’t go to many conferences these days, despite working in academia. I don’t co-write many papers and I’m no longer funded by a project with a conference budget. I’ve been to a couple that we hosted ourselves at the Centre for Digital Music, but I think the last one I went to anywhere else was the 2014 Linux Audio Conference in Karlsruhe. I don’t mind this situation (I don’t like to travel away from my family anyway), I just mention it to give context for why a long-time academic employee like me should bother to write up a conference at all!


Here are my notes — on things I liked and things I didn’t — in roughly chronological order.

The venue is interesting, quite fancy, and completely new to me. (It is called CodeNode.) I’m a bit boggled that there is such a big space right in the middle of the City given over to developer events. I probably shouldn’t be boggling at that any more, but I can’t help it.
Nice furniture too.

The attendees are amazingly homogeneous. I probably wouldn’t have even noticed this, back when I was tangentially involved in the commercial audio development world, as I was part of the homogeneity. But our research group is a fair bit more diverse and I’m a bit more perceptive now. From the attendance of this event, you would conclude that 98% of audio developers are male and 90% are white people from northern Europe.
When I have been involved in organising events in academia, we have found it hard to get a speaker lineup that is as diverse as the population of potential attendees (i.e. the classic all-male panel problem). I have failed badly at this, even when trying hard — I am definitely part of the problem when it comes to conference organisation. Here, though, my perception is the other way around: the speakers are a closer reflection of what I perceive as the actual population than the attendees are.

Talks I went to:

Day 2 (i.e. the first day of the talks):

  • The future is wide: SIMD, vector classes and branchless algorithms for audio synthesis by Angus Hewlett of FXpansion (now employed by ROLI). A topic I’m interested in and he has clearly done solid work on (see here), but it quickly reached the realms of tweaks I personally am probably never going to need. The most heartening lesson I learned was that compilers are getting better and better at auto-vectorisation.
  • Exploring time-frequency space with the Gaborator by Andreas Gustafsson. I loved this. It was about computing short-time constant-Q transforms of music audio and presenting the results in an interactive way. This is well-trodden territory: I have worked on more than one implementation of a constant-Q transform myself, and on visualising the results. But I really appreciated his dedication to optimising the transform (which appears to be quicker and more invertible than my best implementation) and his imagination in rendering it (reusing the Leaflet mapping API to display time-frequency “maps”). There is a demo of this here and I like it a lot.
    So I was sitting there thinking “yes! nice work!”, but when it came to the questions, it was apparent that people didn’t really get how nice it was. I wanted to pretend to ask a question, just in order to say “I like it!”. But I didn’t, and then I never managed to work up to introducing myself to Andreas afterwards. I feel bad and I wish I had.
  • The development of Ableton Live by Friedemann Schautz. This talk could only disappoint, after its title. But I had to attend anyway. It was a broad review of observations from the development of Live 10, and although I didn’t learn much, I did like Friedemann and thought I would happily work on a team he was running.
  • The amazing usefulness of band-limited impulse trains by Stefan Stenzel of Waldorf. This was a nice old-school piece. Who can resist an impulse train? Not I.
  • Some interesting phenomena in nonlinear oscillators by André Bergner of Native Instruments. André is a compelling speaker who uses hand-drawn slides (I approve) and this was a neat mathematical talk, though I wasn’t able to stay to the end of it.

Day 3 (second and final day of talks):

  • The human in the musical loop (keynote) by Elaine Chew. Elaine is a professor in my group and I know some of her work quite well, but her keynote was exactly what I needed at this time, first thing in the morning on the second day. After a day of bits-driven talks, this was a piece about performers and listeners from someone who is technologically adept herself, and curious, but talks about music first. Elaine is also very calm, which was useful when the projector hardware gave up during her talk and stopped working for a good few minutes. I think as a result she had to hurry the closing topic (about the heartbeat project) which was a pity, as it could have been fascinating to have expanded on this a bit more.
    Some of what Elaine talked about was more than a decade old, and I think this is one of the purposes of professors: to recall, and to be able to communicate, relevant stuff that happened longer ago than any current research student remembers.
  • The new C++17, and why it is good for you by Timur Doumler. The polar opposite of Elaine’s talk, but I was now well-cushioned for it. C++17 continues down the road of simplifying the “modern-language” capabilities C++ has been acquiring since C++11. Most memorable for me are destructuring bind, guaranteed copy elision on value return, variant types, and filesystem support in the standard library.
    Destructuring bind is interesting and I’ve written about it separately.
  • The use of std::variant in realtime DSP by Ian Hobson. A 50-minute slot, for a talk about which Timur Doumler’s earlier talk had already given away the twist! (Yes you can use std::variant, it doesn’t do any heap allocation.) Ambitious. This was a most satisfying talk anyway, as it was all about performance measurements and other very concrete stuff. No mention of the Expression Problem though.
  • Reactive Extensions (Rx) in JUCE by Martin Finke. I have never used either React or JUCE so I thought this would be perfect for me. I had a question lined up: “What is JUCE?” but I didn’t dare use it. The talk was perfectly comprehensible and quite enlightening though, so my silly bit of attitude was quite misplaced. I may even end up using some of what I learned in it.


C++17 destructuring bind

I know very little about C++17, but I attended a talk about it this morning and the syntax for destructuring bind caught my attention.

This is a feature widely supported in other languages, where you assign a complex type to another complex declaration with individual names in it that match the original type, and you can then refer individually to the values that were assigned.


Python, for example:

>>> [a,b,c] = [1,2,3]
>>> a
1
>>> b
2
>>> c
3

Standard ML (is that the Ur-language here?):

> val (a, b) = (1, 2);
val a = 1: int
val b = 2: int
> val { this, that } = { that = 1, this = 2 };
val that = 1: int
val this = 2: int

In C++17 the target syntax uses square brackets:

int a[2] = {1,2};
auto [x,y] = a;

It works regardless of whether the source is a structure, array, tuple, pair, etc.

What is interesting about it, in C++, is that it appears the source structure is always indexed by declaration order, rather than by name. That is, if you write

struct { int a; int b; } x { 1, 2 };
auto [b, a] = x;

then b will be 1 and a will be 2, even though in the original structure b was 2 and a was 1.
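
Here is a compilable sketch of that behaviour (the struct and function names are mine, for illustration; requires C++17):

```cpp
#include <utility>

struct Coords { int a; int b; };

// Structured bindings match by declaration order: the first introduced
// name receives the first member, regardless of what either is called.
std::pair<int, int> bindByOrder() {
    Coords x { 1, 2 };
    auto [b, a] = x;   // b gets x.a (1), a gets x.b (2)
    return { b, a };
}
```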

In other languages, a destructuring bind of a structure with named fields is performed using matching by name rather than by index. (See the SML example above.)

This highlights something that has been building for a while about C++. Since C++11 introduced the structure initialisation syntax in which you just list structure values one by one, we have increasingly accepted referring to structure elements by declaration order rather than by name. Someone who swapped two structure elements of the same type could break an awful lot of code without the compiler noticing. This doesn’t feel very safe, for a supposedly type-safe (ish) language.

And it isn’t the way other languages work. I don’t know any other language that will happily destructure a named structure by index rather than name, or even construct one (even the structure initialisers in C, since C99, are safer).

I’d love to know whether this has affected many people in practice, or whether I’m just being a bit suspicious.


Learning to read Arabic writing: one of my better ideas

I live in London not far from Paddington, where Arabic writing is often seen:


I spent my first few years in the area a bit oblivious to this (shops are shops), but eventually I started to wonder about simple things like: are these all the same language and script, or do they just look similar? And of course: what do they say? Then two years ago I took a gamble on the notion that this might be Arabic, and signed up for Arabic evening classes.

On the first day of the class, we were all asked why we had chosen to study Arabic. Everyone else had a proper explanation – planning to study in an Arabic-speaking country, dispatched to an Arabic-speaking country for business, have a parent who speaks Arabic and want to catch up, etc. I’d like to report that I said “I want to be able to read the shop signs on Edgware Road”, but I wasn’t bold enough, so I just cited curiosity.

I kept up the classes (one evening a week) for a year. Arabic is a difficult language and I didn’t excel. I learned simple introductions, some directions, some colours, a bit of grammar, and that I can’t pronounce the letter ع any better than any other native English speaker can. I learned enough that I can now recognise the odd word when I hear people speaking Arabic, but not enough to join in, and anyway I’ve always been very self-conscious about speaking other languages. But I am now able to slowly read (and write) the alphabet.

Predictably enough, it turns out the signage in Arabic around here usually says the same thing as the Roman lettering next to it. That’s the case for most of the text in the street-view photo above, for example. That could be disappointing, but I find it rather liberating. When people put Arabic text on a sign in this country, they aren’t trying to make things weird for native-English-speaking locals, they’re trying to make it easier for everyone else.

Arabic, the language, has 400-odd million speakers worldwide. Arabic the alphabet serves up to a billion users. Besides the Arabic language, it’s used for Persian and Urdu¹, both of which are quite dissimilar to Arabic. As it turns out, most of the places near me that I was interested in are in fact Arabic-speaking, but there are quite a few Persian places as well and Urdu, being the primary language of Pakistan, is widely used in the UK too.

(I have since had it pointed out to me that, for an English speaker whose main aim is to learn to read the script, going to Persian classes would have been easier than Arabic. Persian is an Indo-European language, it’s grammatically simpler, and the language you learn in classes is a form that people actually speak, whereas the standard Arabic taught to learners here I gather is different from anything spoken on the street anywhere. I have since bought a Persian grammar book, just in case I feel inspired.)

Learning the basics of how to read Arabic gives me a feeling of delight and reassurance, as if I am poking a hole for my brain to look out and find that a previously unfamiliar slice of the world’s population is doing the same stuff as those of us who happen to be users of the Roman alphabet. I recommend it.

Notes for the clueless about the Arabic alphabet

  • It’s written and read right-to-left. This is probably the only thing I did know before I started actively learning about it.
  • It is an alphabet, not a syllabary like Japanese kana or a logographic system like Chinese writing.
  • It is very much structured as a script. Each letter could have up to four shapes (initial, middle, final, standalone) depending on how it joins to the letters around it, so that the whole word flows smoothly. I think this contributes a lot to the sense of mystery “we” have about Arabic. The Cyrillic, Hebrew, and Greek alphabets are not intrinsically any more mysterious, but they are a lot more obviously composed of letters that can be individually mapped to Roman ones.
  • Short vowel sounds are not written down at all. This is unfortunate for the learner, as it means you often can’t pronounce a word unless you already know it. There is a system for annotating them, but it’s not generally used except in the Koran and sometimes in textbooks or Wikipedia where avoiding ambiguity is paramount.
  • There are 28-odd letters, but the number depends on what you’re reading – Persian adds a few over Arabic, but I think it also has some duplicates.
  • Some letters are very distinctive; for example the only letter with two dots below it is the common ي “ya”, which generally maps to an “ee” sound. Others are quite hard to spot because you have to know the joining rules to distinguish them in the middle of a word.
  • You could transliterate any language to Arabic, just as you can transliterate anything to the Roman alphabet. The result might be awkward, but there’s no reason you can’t write English in Arabic letters and have it be just about comprehensible. I imagine there must be people who routinely do this.


¹ I know no Urdu, but I understand it’s typically written in the Arabic alphabet but with a more flowing script (Nastaliq, نستعلیق) than is typically used for modern Arabic or Persian. An interesting calligraphic distinction between languages. I first heard of Nastaliq through a fascinating article by Ali Eteraz in 2013, The Death of the Urdu Script, which lamented that it was too hard to display it on current devices. The situation has apparently improved since then.


Sonic Visualiser 3.0, at last


(See previous posts: Help test the Sonic Visualiser v3.0 beta, A second beta of Sonic Visualiser v3.0, A third beta of Sonic Visualiser v3.0, and Yes, there’s a fourth beta of Sonic Visualiser v3.0 now)

No doubt, now that the official release is out, some horrible problem or other will come to light. It wouldn’t be the first time: Sonic Visualiser v2.4 went through a beta programme before release and still had to be replaced with v2.4.1 after only a week. These things happen and that’s OK, but for now I’m feeling good about this one.