Notes from the Audio Developer Conference

I’ve spent the last couple of days at the 2017 Audio Developer Conference organised by ROLI. This is a get-together and technical conference for people who work on audio software and software-driven hardware, in practice mostly people working on music applications.

I don’t go to many conferences these days, despite working in academia. I don’t co-write many papers and I’m no longer funded by a project with a conference budget. I’ve been to a couple that we hosted ourselves at the Centre for Digital Music, but I think the last one I went to anywhere else was the 2014 Linux Audio Conference in Karlsruhe. I don’t mind this situation (I don’t like to travel away from my family anyway); I just mention it to give context for why a long-time academic employee like me should bother to write up a conference at all!


Here are my notes — on things I liked and things I didn’t — in roughly chronological order.

The venue is interesting, quite fancy, and completely new to me. (It is called CodeNode.) I’m a bit boggled that there is such a big space right in the middle of the City given over to developer events. I probably shouldn’t be boggling at that any more, but I can’t help it. Nice furniture too.

The attendees are amazingly homogeneous. I probably wouldn’t have even noticed this, back when I was tangentially involved in the commercial audio development world, as I was part of the homogeneity. But our research group is a fair bit more diverse and I’m a bit more perceptive now. From the attendance of this event, you would conclude that 98% of audio developers are male and 90% are white people from northern Europe.
When I have been involved in organising events in academia, we have found it hard to get a speaker lineup that is as diverse as the population of potential attendees (i.e. the classic all-male panel problem). I have failed badly at this, even when trying hard — I am definitely part of the problem when it comes to conference organisation. Here, though, my perception is the other way around: the speakers are a closer reflection of what I perceive as the actual population than the attendees are.

Talks I went to:

Day 2 (i.e. the first day of the talks):

  • The future is wide: SIMD, vector classes and branchless algorithms for audio synthesis by Angus Hewlett of FXpansion (now employed by ROLI). A topic I’m interested in and he has clearly done solid work on (see here), but it quickly reached the realms of tweaks I personally am probably never going to need. The most heartening lesson I learned was that compilers are getting better and better at auto-vectorisation (there’s a small sketch of what I mean after this list).
  • Exploring time-frequency space with the Gaborator by Andreas Gustafsson. I loved this. It was about computing short-time constant-Q transforms of music audio and presenting the results in an interactive way. This is well-trodden territory: I have worked on more than one implementation of a constant-Q transform myself, and on visualising the results. But I really appreciated his dedication to optimising the transform (which appears to be quicker and more invertible than my best implementation) and his imagination in rendering it (reusing the Leaflet mapping API to display time-frequency “maps”). There is a demo of this here and I like it a lot.
    So I was sitting there thinking “yes! nice work!”, but when it came to the questions, it was apparent that people didn’t really get how nice it was. I wanted to pretend to ask a question, just in order to say “I like it!”. But I didn’t, and then I never managed to work up to introducing myself to Andreas afterwards. I feel bad and I wish I had.
  • The development of Ableton Live by Friedemann Schautz. This talk could only disappoint, after its title. But I had to attend anyway. It was a broad review of observations from the development of Live 10, and although I didn’t learn much, I did like Friedemann and thought I would happily work on a team he was running.
  • The amazing usefulness of band-limited impulse trains by Stefan Stenzel of Waldorf. This was a nice old-school piece. Who can resist an impulse train? Not I.
  • Some interesting phenomena in nonlinear oscillators by André Bergner of Native Instruments. André is a compelling speaker who uses hand-drawn slides (I approve) and this was a neat mathematical talk, though I wasn’t able to stay to the end of it.
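
On the auto-vectorisation point from the SIMD talk, here is a minimal sketch of the kind of loop I mean (the function and its name are my own illustration, not anything from the talk). Compiled with optimisation enabled, a recent GCC or Clang will typically emit SIMD instructions for this without any intrinsics:

#include <cstddef>

// Multiply a buffer by a constant gain. This simple, dependency-free
// loop is a typical candidate for auto-vectorisation under -O3;
// GCC's -fopt-info-vec or Clang's -Rpass=loop-vectorize will report
// whether the compiler actually vectorised it.
void apply_gain(float *out, const float *in, float gain, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * gain;
    }
}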

Day 3 (second and final day of talks):

  • The human in the musical loop (keynote) by Elaine Chew. Elaine is a professor in my group and I know some of her work quite well, but her keynote was exactly what I needed at this time, first thing in the morning on the second day. After a day of bits-driven talks, this was a piece about performers and listeners from someone who is technologically adept herself, and curious, but talks about music first. Elaine is also very calm, which was useful when the projector hardware gave up during her talk and stopped working for a good few minutes. I think as a result she had to hurry the closing topic (about the heartbeat project), which was a pity, as it could have been fascinating to hear it expanded on a bit more.
    Some of what Elaine talked about was more than a decade old, and I think this is one of the purposes of professors: to recall, and to be able to communicate, relevant stuff that happened longer ago than any current research student remembers.
  • The new C++17, and why it is good for you by Timur Doumler. The polar opposite of Elaine’s talk, but I was now well-cushioned for it. C++17 continues down the road of simplifying the “modern-language” capabilities C++ has been acquiring since C++11. Most memorable for me are destructuring bind, guaranteed copy elision on value return, variant types, and filesystem support in the standard library.
    Destructuring bind is interesting and I’ve written about it separately (see the next post below).
  • The use of std::variant in realtime DSP by Ian Hobson. A 50-minute slot, for a talk about which Timur Doumler’s earlier talk had already given away the twist! (Yes, you can use std::variant: it doesn’t do any heap allocation. There’s a small sketch of the idea after this list.) Ambitious. This was a most satisfying talk anyway, as it was all about performance measurements and other very concrete stuff. No mention of the Expression Problem though.
  • Reactive Extensions (Rx) in JUCE by Martin Finke. I have never used either Rx or JUCE so I thought this would be perfect for me. I had a question lined up: “What is JUCE?” but I didn’t dare use it. The talk was perfectly comprehensible and quite enlightening though, so my silly bit of attitude was quite misplaced. I may even end up using some of what I learned in it.
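
On the std::variant point, here is a minimal sketch of the claim as I understood it (the Oscillator and Filter types are invented stand-ins, not anything from the talk): the variant stores its alternatives in-place alongside a discriminator, so switching the active alternative involves no heap allocation.

#include <iostream>
#include <variant>

struct Oscillator { float phase; };
struct Filter { float cutoff; };

int main()
{
    // The variant occupies roughly the size of its largest alternative
    // plus a tag; assigning a different alternative happens in-place,
    // with no new/delete involved.
    std::variant<Oscillator, Filter> v = Oscillator{ 0.0f };
    v = Filter{ 440.0f };
    std::cout << sizeof(v) << std::endl;
}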

 

C++17 destructuring bind

I know very little about C++17, but I attended a talk about it this morning and the syntax for destructuring bind caught my attention.

This is a feature widely supported in other languages: you assign a composite value to a declaration that mirrors its structure, with individual names for its parts, and you can then refer individually to the values that were assigned.

Python:

>>> [a,b,c] = [1,2,3]
>>> a
1
>>> b
2
>>> c
3
>>>

Standard ML (is that the Ur-language here?):

> val (a, b) = (1, 2);
val a = 1: int
val b = 2: int
> val { this, that } = { that = 1, this = 2 };
val that = 1: int
val this = 2: int

In C++17 the target syntax uses square brackets:

int a[2] = {1,2};
auto [x,y] = a;

It works regardless of whether the source is a structure, array, tuple, pair, etc.
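
For example (a quick sketch, using a made-up Point struct):

#include <string>
#include <tuple>
#include <utility>

struct Point { int x; int y; };

int main()
{
    Point p { 3, 4 };
    auto [px, py] = p;                    // structure: px = 3, py = 4

    std::tuple<int, std::string> t { 1, "one" };
    auto [num, name] = t;                 // tuple: num = 1, name = "one"

    std::pair<int, double> q { 2, 0.5 };
    auto [first, second] = q;             // pair: first = 2, second = 0.5
}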

What is interesting about it, in C++, is that it appears the source structure is always indexed by declaration order, rather than by name. That is, if you write

struct { int a; int b; } x { 1, 2 };
auto [b, a] = x;

then b will be 1 and a will be 2, even though in the original structure b was 2 and a was 1.

In other languages, a destructuring bind of a structure with named fields is performed using matching by name rather than by index. (See the SML example above.)

This highlights something that has been building for a while about C++. Since C++11 introduced the structure initialisation syntax in which you just list structure values one by one, we have increasingly accepted referring to structure elements by declaration order rather than by name. Someone who swapped two structure elements of the same type could break an awful lot of code without the compiler noticing. This doesn’t feel very safe, for a supposedly type-safe (ish) language.
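
Here is a sketch of that hazard (Gains is a made-up type):

struct Gains { float left; float right; };

int main()
{
    Gains g { 0.2f, 0.8f };   // today: left = 0.2, right = 0.8
    // If someone later reorders the members to
    //     struct Gains { float right; float left; };
    // this initialiser still compiles unchanged, but the two values
    // silently swap meaning: right becomes 0.2 and left 0.8.
    return g.left < g.right ? 0 : 1;
}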

And it isn’t the way other languages work. I don’t know of any other language that will happily destructure a named structure by index rather than by name, or even construct one that way (even C’s structure initialisers have had a safer, name-based form — designated initialisers — since C99).

I’d love to know whether this has affected many people in practice, or whether I’m just being a bit suspicious.

 

Learning to read Arabic writing: one of my better ideas

I live in London not far from Paddington, where Arabic writing is often seen:

[Street-view photo: shop fronts near Paddington, with signs in both Arabic and Roman lettering]

I spent my first few years in the area a bit oblivious to this (shops are shops), but eventually I started to wonder about simple things like: are these all the same language and script, or do they just look similar? And of course: what do they say? Then two years ago I took a gamble on the notion that this might be Arabic, and signed up for Arabic evening classes.

On the first day of the class, we were all asked why we had chosen to study Arabic. Everyone else had a proper explanation – planning to study in an Arabic-speaking country, dispatched to an Arabic-speaking country for business, have a parent who speaks Arabic and want to catch up, etc. I’d like to report that I said “I want to be able to read the shop signs on Edgware Road”, but I wasn’t bold enough, so I just cited curiosity.

I kept up the classes (one evening a week) for a year. Arabic is a difficult language and I didn’t excel. I learned simple introductions, some directions, some colours, a bit of grammar, and that I can’t pronounce the letter ع any better than any other native English speaker can. I learned enough that I can now recognise the odd word when I hear people speaking Arabic, but not enough to join in, and anyway I’ve always been very self-conscious about speaking other languages. But I am now able to slowly read (and write) the alphabet.

Predictably enough, it turns out the signage in Arabic around here usually says the same thing as the Roman lettering next to it. That’s the case for most of the text in the street-view photo above, for example. That could be disappointing, but I find it rather liberating. When people put Arabic text on a sign in this country, they aren’t trying to make things weird for native-English-speaking locals, they’re trying to make it easier for everyone else.

Arabic, the language, has 400-odd million speakers worldwide. Arabic, the alphabet, serves up to a billion users. Besides the Arabic language, it’s used for Persian and Urdu¹, both of which are quite dissimilar to Arabic. As it turns out, most of the places near me that I was interested in are in fact Arabic-speaking, but there are quite a few Persian places as well, and Urdu, being the primary language of Pakistan, is widely used in the UK too.

(I have since had it pointed out to me that, for an English speaker whose main aim is to learn to read the script, going to Persian classes would have been easier than Arabic. Persian is an Indo-European language, it’s grammatically simpler, and the form you learn in classes is one that people actually speak, whereas the standard Arabic taught to learners is, I gather, different from anything spoken on the street anywhere. I have since bought a Persian grammar book, just in case I feel inspired.)

Learning the basics of how to read Arabic gives me a feeling of delight and reassurance, as if I am poking a hole for my brain to look out and find that a previously unfamiliar slice of the world’s population is doing the same stuff as those of us who happen to be users of the Roman alphabet. I recommend it.

Notes for the clueless about the Arabic alphabet

  • It’s written and read right-to-left. This is probably the only thing I did know before I started actively learning about it.
  • It is an alphabet, not a syllabary like Japanese kana or a logographic system like Chinese writing.
  • It is very much a cursive script. Each letter can have up to four shapes (initial, medial, final, standalone) depending on how it joins to the letters around it, so that the whole word flows smoothly. I think this contributes a lot to the sense of mystery “we” have about Arabic. The Cyrillic, Hebrew, and Greek alphabets are not intrinsically any more mysterious, but they are a lot more obviously composed of letters that can be individually mapped to Roman ones.
  • Short vowel sounds are not written down at all. This is unfortunate for the learner, as it means you often can’t pronounce a word unless you already know it. There is a system of diacritics for annotating them, but it’s not generally used except in the Koran, and sometimes in textbooks or Wikipedia articles where avoiding ambiguity is paramount.
  • There are 28-odd letters, but the number depends on what you’re reading – Persian adds a few over Arabic, but I think it also has some duplicates.
  • Some letters are very distinctive; for example the only letter with two dots below it is the common ي “ya”, which generally maps to an “ee” sound. Others are quite hard to spot because you have to know the joining rules to distinguish them in the middle of a word.
  • You could transliterate any language to Arabic, just as you can transliterate anything to the Roman alphabet. The result might be awkward, but there’s no reason you can’t write English in Arabic letters and have it be just about comprehensible. I imagine there must be people who routinely do this.

 

¹ I know no Urdu, but I understand it’s typically written in the Arabic alphabet but with a more flowing script (Nastaliq, نستعلیق) than is typically used for modern Arabic or Persian. An interesting calligraphic distinction between languages. I first heard of Nastaliq through a fascinating article by Ali Eteraz in 2013, The Death of the Urdu Script, which lamented that it was too hard to display it on current devices. The situation has apparently improved since then.

 

Sonic Visualiser 3.0, at last

Finally!

(See previous posts: Help test the Sonic Visualiser v3.0 beta, A second beta of Sonic Visualiser v3.0, A third beta of Sonic Visualiser v3.0, and Yes, there’s a fourth beta of Sonic Visualiser v3.0 now)

No doubt, now that the official release is out, some horrible problem or other will come to light. It wouldn’t be the first time: Sonic Visualiser v2.4 went through a beta programme before release and still had to be replaced with v2.4.1 after only a week. These things happen and that’s OK, but for now I’m feeling good about this one.

 

Yes, there’s a fourth beta of Sonic Visualiser v3.0 now

Previously I wrote about the third Sonic Visualiser v3.0 beta release:

“This may well be the final beta, so if you’re seeing problems with it, please do report them while there’s still time!”

Well, some very kind people did report problems, and so that was not the final beta. A fourth one is now up for download. Here are the download URLs:

Fixes since the third beta

  • Fix a nasty crash in session I/O in the 64-bit Windows build (this is the main reason for the new beta)
  • Provide more log information about audio drivers to the debug log file
  • Fix a very occasional one-sample-too-short error in resampling audio files during load
  • Fix invisible measure tool crosshairs on spectrogram
  • Fix a possible memory leak in the spectrogram

Keep the bug reports coming!

This one really could be the final beta! So please do report any troubles you have with it. Drop me a line, post a comment below this article, or use the SourceForge bug tracker. And thank you!

 

A third beta of Sonic Visualiser v3.0

Update – 23rd Feb: We have a fourth beta now!

After a short break, we have a third beta of the forthcoming v3.0 release of Sonic Visualiser. Downloads here:

Bugs fixed, and other changes made since the second beta

  • Sonic Visualiser could hang when trying to initialise a transform that refused the first choice of initialisation parameters
  • Error handling for problems in running transforms has been improved in general
  • The Colour 3D Plot layer was sometimes pathologically slow to update
  • The “Normalise Visible Area” option in the Colour 3D Plot layer wasn’t working
  • The visual rendering style of some layers has been improved when viewed on high-resolution screens without pixel doubling
  • A new feature has snuck in, under cover of fixing a rendering offset problem in the spectrum layer: it is now possible (although cumbersome) to zoom the spectrum layer in the frequency axis
  • The process of overhauling the Help Reference documentation to properly describe the new release has begun

Let us know what else you find!

This may well be the final beta, so if you’re seeing problems with it, please do report them while there’s still time!

Drop me a line, post a comment below this article, or use the SourceForge bug tracker.

(This post is a follow-up to “Help test the Sonic Visualiser v3.0 beta” and “A second beta of Sonic Visualiser v3.0”.)