SoundSoftware tutorial at AES 53

I’ll be co-presenting the first tutorial session at the Audio Engineering Society 53rd Conference on Semantic Audio, this weekend.

(It’s the society’s 53rd Conference, and it happens to be about semantic audio. It’s not their 53rd conference about semantic audio. In fact it’s their second: that was also the theme of the AES 42nd Conference in 2011.

What is semantic audio? Good question, glad you asked. I believe it refers to extraction or estimation of any semantic material from audio, including speech recognition and music information retrieval.)

My tutorial, for the SoundSoftware project, is about developing better and more reliable software during research work. That’s a very deep subject, so at best we’ll barely hint at a few techniques during one programming exercise:

  • making readable experimental code using the IPython Notebook, and sharing code for review with colleagues and supervisors;
  • using version control software to manage revisions and escape disaster;
  • modularising and testing any code that can be used in more than one experiment;
  • packaging, publishing, and licensing code;
  • and the motivations for doing the above.
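To give a flavour of the modularise-and-test point, here's a minimal sketch of the idea: pull a shared helper out of your experiment script into a small function and ship a test alongside it. The `rms()` helper is a hypothetical example for illustration, not code from the actual tutorial.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    if not samples:
        raise ValueError("rms() requires at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def test_rms():
    # A silent signal has zero level...
    assert rms([0.0, 0.0]) == 0.0
    # ...and a full-scale square wave has unit RMS.
    assert abs(rms([1.0, -1.0, 1.0, -1.0]) - 1.0) < 1e-12

test_rms()
```

Once a helper like this lives in its own module with a test, every experiment that reuses it gets the check for free.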

We presented a session at the Digital Audio Effects (DAFx) conference in 2012 which covered much of this material in presentation style, and a tutorial at the International Society for Music Information Retrieval (ISMIR) in 2012 which featured a “live” example of test-driven development in research software. You can find videos and slides from those tutorials here. The theme of this one is similar, and I’ll be reusing some code from the ISMIR tutorial, but I hope we can make this one a bit more hands-on.


Looking at the Sonic Visualiser user survey (part 1)

Ever since Sonic Visualiser hit version 1.7 in mid-2009, it has included a survey feature to find out what its users think of it.

It waits until you’ve used it a few times. Then it pops up a dialog, just once, asking if you’d like to fill in the survey.

If you say yes, you get the survey page in your browser. If you say no, it won’t ask again—not even after an upgrade to a new version (unless you reinstall on a different machine).

This survey has been running ever since, unchanged, and has been completed over 1000 times. We’ve periodically read through the survey submissions, but we haven’t previously published any results from it. Since the survey was designed rather hastily four years ago and it’s high time we updated it, this is probably a good time to catch up on the responses before we do that.

What’s in this post

The survey had both open questions (with big text fields) and simple multiple-choice ones. This post will deal with numerical results from the simple questions.

Many of these results are pretty basic, so please don’t be disappointed if the analysis doesn’t turn out to be all that exciting. If you have any suggestions or questions, please do post a comment!

I intend to follow up by summarising the open questions in another post.

Number and distribution of responses

We have 1071 responses in total, from 6 October 2009 to 25 April 2013 (as of this analysis—the survey is still open).

However, I won’t be using all of those here. Owing to “technical problems” (and/or my incompetence) some responses from mid-2010 have been lost, so to ensure the record doesn’t have any holes in it, I’ll be limiting this post to the 821 responses from 11 October 2010 onwards. Here’s the number of responses per quarter:

Note that the most recent quarter (starting April 2013) only has three weeks’ worth of responses.

(Every chart in this post is linked to the data in text format, so click through if you’re interested in the numbers.)
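For anyone wanting to reproduce the per-quarter binning from the linked data, it amounts to something like the following sketch. The dates here are made up for illustration; they are not the real survey submissions.

```python
from collections import Counter
from datetime import date

# Hypothetical response dates -- stand-ins for the real survey data.
responses = [date(2010, 10, 11), date(2010, 11, 2), date(2011, 1, 15)]

def quarter(d):
    """Label a date with its calendar quarter, e.g. '2010Q4'."""
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"

# Tally the number of responses falling in each quarter.
counts = Counter(quarter(d) for d in responses)
```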

Who are these people?

We asked,

Which of the following best describes your position?

  • A student, researcher, or academic in music
  • A student, researcher, or academic in audio engineering, audio analysis, multimedia, or a related discipline
  • I am employed in some field that is related to my use of Sonic Visualiser
  • I use Sonic Visualiser solely for personal purposes
  • None of the above

Sonic Visualiser comes from an academic environment, and if you add up the slightly arbitrary academic subdivisions they’re close to an overall majority, but there are plenty of personal-use responses and quite a few professionals:

Approximate IP geolocation shows that most respondents come from the US and Europe. Here are the top ten countries:

But 66 countries are represented in total, and the top ten only make up 70% of responses.

Platform, browser, and software version

Windows users are most numerous, while Linux users appear to be relatively on the wane. (Their numbers aren’t actually decreasing; they just haven’t increased as much.) Neither of these surprises me, but I am surprised that Windows has been going up more than OS/X. Maybe Mac users don’t like being asked to fill in surveys.

As you might expect, academics, particularly in music, are relatively likely to be using OS/X, while a high proportion of those using SV for personal use are doing so on Windows.

Linux is overrepresented in France, which makes sense, as it is a civilised nation.

Firefox is the most common browser, but it’s been losing out here as everywhere recently. I’m a bit surprised that IE is only in third place even on Windows. I’m probably just a decade or so behind the times.

Few surprises in the breakdown of Sonic Visualiser version number. New versions take over fairly quickly after each release, but that’s to be expected because the survey only polls new installations—this doesn’t tell us anything about upgrade rates.

Linux users seem more likely to be using an old version, presumably because they often install from distribution packages.

Ease of use and general contentment

We asked,

Do you enjoy using Sonic Visualiser?

  • Yes, I do!
  • I have no strong feelings about it
  • I don’t enjoy using it, but I haven’t found any other software to replace it
  • I don’t enjoy using it, I use it because I’ve been told to (by a teacher, for example)


How easy do you find Sonic Visualiser to use?

  • I find it straightforward to use
  • Getting started was tricky, but I’m OK with it now
  • I can get things done, but it’s frustrating and I’m often caught out by unexpected behaviour
  • I can use a few features, but I don’t understand most of it
  • I don’t understand it at all

Most respondents are happy, but the results for ease of use are less satisfactory:

A great many respondents checked the “getting started was tricky” or “I don’t understand most of it” boxes. I think there is room for a simpler Sonic Visualiser. The open survey questions, to be covered in a subsequent post, might give us more ideas.

Features and plugins

We asked,

Which of the following features of Sonic Visualiser have you used? (Please select all that apply, or none.)

  • Saving and reloading complete sessions
  • Running Vamp plugins
  • Speeding up or slowing down playback
  • Annotation by tapping using the computer keyboard
  • Annotation by tapping using a MIDI keyboard
  • Data import or export using RDF formats
  • Audio alignment using the MATCH plugin
  • Editing note or region layers
  • Image layers

This isn’t a well-judged question. It has too many options and some of them are too ambiguous. In particular, “image layers” was intended to refer to layers to which external images can be attached—quite a niche feature—yet it appears as the third most popular option in the survey:

I assume this means people were (quite reasonably) interpreting “image layers” as meaning “any layers that look like images”, such as spectrograms.

Looking more closely at this, it seems that users who said they used the “image layer” feature were less likely to also report using common features such as session save/load or Vamp plugins, but more likely to report using uncommon features such as MIDI tapping or alignment.

This suggests these respondents could probably be clustered into a large group of novice users who use only the built-in analysis tools on a single audio file at a time (for whom “image layers” means spectrograms), and a smaller group who use many features and for whom, perhaps, “image layers” means layers of image type.
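The co-occurrence check behind that observation can be sketched as a simple conditional proportion: among respondents who ticked one feature, how many also ticked another. The checkbox data below is made up for illustration; the feature names are stand-ins for the survey options.

```python
def co_use_rate(responses, given, other):
    """Among respondents who ticked `given`, the fraction who also ticked `other`."""
    with_given = [r for r in responses if given in r]
    if not with_given:
        return 0.0
    return sum(other in r for r in with_given) / len(with_given)

# Made-up checkbox data: each set holds one respondent's ticked features.
responses = [
    {"image layers", "audio alignment"},
    {"image layers"},
    {"vamp plugins", "session save/load"},
    {"image layers", "vamp plugins"},
]
```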

Also worth noting is the generally low number of people reporting use of any single feature—none of the features listed here gained support from more than 50% of respondents. Yet more than 90% of respondents checked at least one box. It seems there are different sets of users starting out with quite disjoint needs.

The survey also included a record of which Vamp plugins were installed. Here are the top ten overall:


We asked whether users were familiar with any programming languages (from a fixed multiple selection list, plus “Others” box) and whether they would have any interest in developing new plugins.

I was surprised by the language familiarity question: nearly 60% of respondents checked at least one box, and over a third claimed familiarity with C or C++. That’s far more than for Python, MATLAB, Java, Javascript or PHP, but all of those have pretty good showings even so.

Even among academics in the music field, over 40% professed familiarity with some programming language and over 20% with C.

I’m not quite sure what to make of this. Perhaps Sonic Visualiser is so hard to get started with that only very technically-minded users get as far as answering a survey about it!

Some respondents mentioned further languages in the Other box; these are the ones that appeared most often:

(BASIC includes Visual Basic; Lisp variants include Scheme and Clojure.)

Having asked about programming languages, we asked:

Have you ever considered writing Vamp plugins for use in Sonic Visualiser or any other host application?

  • Yes, I have written some plugins already
  • Yes, I’m interested in the idea
  • No, I wouldn’t be technically capable
  • No, I don’t see any reason to
  • No, I’ve looked at Vamp and found the format unsatisfactory in some way

As you can see, most respondents thought they wouldn’t be technically capable, but a pretty high number did express an interest.

“What’s the difference between Mercurial and git?”

A short post I wrote over at the SoundSoftware site.

Spoiler: The answer I gave was not “Mercurial can be understood by human beings”.

Let me know if you spot any mistakes (or just want to flame, of course).

A colleague pointed out that a big problem I didn’t help with at all is how to understand the difference between git and github, which is what people are often really talking about when they talk about sharing code with git. Ironic that we all move to distributed version-control systems and then appoint a single company to run a central server for them.


Can you develop research software on an iPad?

I’ve just written up a blog article for the Software Sustainability Institute about research software development in a “post-PC” world. (Also available on my project’s own site.)

Apart from using the terms “post-PC”, “touch tablet”, “app store”, and “cloud” a disgracefully large number of times, this article sets out a problem that’s been puzzling me for a while.

We’re increasingly trying to use, for everyday computing, devices that are locked down to very limited software distribution channels. They’re locked down to a degree that would have seemed incomprehensible to many developers ten or twenty years ago. Over time, these devices are more and more going to replace PCs as the public idea of what represents a normal computer. As this happens, where will we find scientific software development and the ideals of open publication and software reuse?

I recognise that not all “post-PC” devices (there we go again) have the same terms for software distribution, and that Android in particular is more open than others. (A commenter on Twitter has already pointed out another advantage of Google’s store that I had overlooked in the article.) The “openness” of Android has been widely criticised, but I do believe that its openness in this respect matters.

Perhaps the answer, then—at least the principled answer—to the question of how to use these devices in research software development is: bin the iPad; use something more open.

But I didn’t set out to make that point, except by implication, because I’m afraid it simply won’t persuade many people. In the audio and music field I work in, Apple already provide the predominant platform across all sizes of device. If there’s one thing I do believe about this technology cycle, it’s that people choose their platform first based on appeal and evident convenience, and then afterwards wonder what else they can do with it. And that’s not wrong. The trick is how to ensure that it is possible to do something with it, and preferably something that can be shared, published, and reused. How to subvert the system, in the name of science.

Any interesting schemes out there?

Hyvästi, Sibelius

This week saw the sad news that the UK office responsible for development of the music score-writing software Sibelius is to be closed down.

Maintenance of the software will be moved elsewhere, at least according to its owners Avid, the former video-editing software company that expanded madly throughout the professional audio and video world during the 2000s and now seems to be running out of money and ideas. I’m not sure I believe that much maintenance will happen.

Sibelius is a funny one for me.

I feel fond of it, because I remember its origins in the tail-end of the British homebrew software revolution of the 80s. It was developed by British students three or four years older than me for the Archimedes computer, a machine used almost exclusively in schools. Though I had left school for university before Sibelius arrived, I was still in touch with music teachers and I remember their delight with it. It was one of the most exceptional software products ever made.

More recently, for me as a developer, the SoundSoftware workshop we organised at work had a presentation from Paul Walmsley of Sibelius which hit very close to home—I wrote about it here—as he contrasted the things he knew now about software engineering with what he knew as a PhD student and in his earlier work on Sibelius. It’s clear that the current Sibelius team are very effective software engineers, and I would be extremely interested to see Sibelius on a CV if I were hiring developers now.

There are other, minor reasons to think of it warmly. I always liked its squidgy music font with the wide note heads. I liked the way it played a tiny clip from the appropriately-numbered Sibelius symphony on startup. I liked the way Charles Lucy, when he accosted us at Linux trade shows, used to refer to the Finn brothers as if they were old English eccentrics in his mould whom he bumped into every other week. I liked the way seemingly every academic in computer music used to cite “you know, those Sibelius folks in Cambridge” as a possible model for their students’ future livelihood.

On the other hand, Sibelius hasn’t been especially good for me and mine.

When I was working on the notation editor in Rosegarden, one thing that motivated me was that so few people could afford to buy Sibelius. (The full version has always cost around £600; there are cut-down editions now, but there was nothing then unless you were registered as a student, which I haven’t been since 1994.) When Richard and I started a small company to sell products based on Rosegarden, even as I was trying (ultimately unsuccessfully) to turn Rosegarden’s notation editor into a full score editor that magically retained human performance timing, I was unable to compare my work properly against Sibelius because our shoestring company simply couldn’t afford to buy a copy. While it may be good that a potential competitor can’t afford your product, it’s probably not good that individuals with aspirations in general can’t.

Meanwhile, even as Sibelius was too expensive for personal use without piracy, it was cheap and available and competent enough to be totally disruptive to traditional music typesetting livelihoods. My cousin-in-law Michael, typesetter to the ABRSM and Boosey & Hawkes (as Barnes Music Engraving Ltd), whom I interviewed in 2004 as a professional typesetter, was seeing business disappear to bedroom typesetters with Sibelius; he left the business entirely a few years later.

Finally, after it had cut the legs off all its competition, the most frustrating thing about Sibelius is that it ended up being sold to a video production company who never seemed quite sure what to do with it, took half-measures in the new world of trivially-priced but widely-distributed mobile apps, and saw initially absurd upstarts like NoteFlight take a lot of the ground-level enthusiasm from it.

I would probably have done the same, in their position. But just as I find it hard to read poetry or prose that I could imagine having written, and hard to listen to singing that sounds like my own reedy and tentative voice, so I find it difficult to forgive companies making the same mistakes as I would have made.

(I’m afraid the formal-goodbye-in-Finnish in the title is entirely courtesy of Google—I’m sure there’s some grammar missing there, even though it is only two words. If you speak Finnish, let me know what I should have written.)

SoundSoftware 2012 Workshop

Yesterday the SoundSoftware project, which I help to run, hosted the SoundSoftware 2012 Workshop at Queen Mary. This was a one-day workshop about working practices for researchers developing software and experiences they have had in software work, with an eye to subjects of interest to audio and music researchers.

You can read about the workshop at the earlier link; I’d just like to mention two talks that I found particularly interesting: the talk from Paul Walmsley and the one that followed it, from David Gavaghan.

Paul is a long-serving senior developer in the Sibelius team at Avid (a group I’m interested in already because of my former life working on the notation editor aspects of Rosegarden: Sibelius was always the gold standard for interactive notation editing). He’s an articulate advocate of unit testing and of what might be described as a decomposition of research work in such a way as to be able to treat every “research output” (for example, presenting a paper) as an experiment demanding reproducibility and traceable provenance.

Usefully, he was able to present ideas like these as simplifying concerns, rather than as arduous reporting requirements. At one point he remarked that he could have shaved six months off his PhD if he had known about unit testing at the time—a remark that proved a popular soundbite when we quoted it through the SoundSoftware Twitter account.

(I have an “if only I’d been doing this earlier” feeling about this as well: Rosegarden now contains hundreds of thousands of lines of essentially untested code, much of which is very fragile. Paul noted that much of the Sibelius code also predates this style of working, but that they have been making progress in building up test cases from the low-level code upward.)

David Gavaghan took this theme and expanded on it, with the presentation of his biomedical project Chaste (for “cancer, heart, and soft tissue environment”). This remarkable project, from well outside the fields we usually hear about in the Centre for Digital Music, was built from scratch in a test-driven development process—which David refers to as “testfirst”. It started with a very short intensive project for a number of students, which so enthused the people involved that they voluntarily continued work on it up to its present form: half a million lines of code with almost 100% test coverage, which has proven to avoid many of the pitfalls found in other biomedical simulation software.