MIREX 2017 submissions

For the fifth year in a row, the Centre for Digital Music submitted a number of Vamp audio analysis plugins to the MIREX evaluation for “music information retrieval” tasks. We submitted the same set of plugins as last year; there were no new implementations, and some of the existing ones are so old as to have celebrated their tenth birthday earlier in the year. So the goal is not to provide state-of-the-art results, but to give other methods a stable baseline for comparison and to check each year’s evaluation metrics and datasets against neighbouring years. I’ve written about this in each of the four previous years: see posts about 2016, 2015, 2014, and 2013.

Obviously, having submitted exactly the same plugins as last year, we expect basically the same results. But the other entries we’re up against will have changed, so here’s a review of how each category went.

(Note: we dropped one category this year, Audio Downbeat Estimation. Last year’s submission was not well prepared for reasons I touched on in last year’s post, and I didn’t find time to rework it.)

Structural Segmentation

Results for the four datasets are here, here, here, and here. Our results, for Segmentino from Matthias Mauch and the older QM Segmenter from Mark Levy, were the same as last year, with the caveat that the QM Segmenter uses random initialisation and so never gets exactly the same results twice.
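To illustrate why random initialisation prevents exact repeatability, here is a toy sketch. It is not the QM Segmenter algorithm: the `toy_cluster` function, its parameters, and the data are all hypothetical. It only shows the general effect that a clustering step started from randomly chosen points is repeatable when the random generator is seeded, and not guaranteed to be otherwise.

```python
import random

def toy_cluster(values, k=2, seed=None, iterations=10):
    """Tiny 1-D k-means with randomly chosen starting centroids.

    Hypothetical illustration only; with seed=None the generator is
    seeded from the system, so the starting points vary between runs.
    """
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # random initialisation
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest centroid
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Move each centroid to the mean of its group (keep it if the group is empty)
        centroids = [
            sum(g) / len(g) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

data = [0.1, 0.2, 0.25, 0.9, 1.0, 1.1, 2.0, 2.1]

# Same seed, same starting points, identical output
print(toy_cluster(data, k=3, seed=42) == toy_cluster(data, k=3, seed=42))

# No seed: the starting points, and possibly the result, differ from run to run
print(toy_cluster(data, k=3))
```

A method like this can only be made exactly repeatable by fixing the seed, which is why small run-to-run differences are expected rather than alarming.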

Surprisingly, nobody else entered anything to this category this year, which seems a pity because it’s an interesting problem. This category seems to have peaked around 2012-2013.

Multiple Fundamental Frequency Estimation and Tracking

An exciting year for this mind-bogglingly difficult category, with 14 entries from ten different sets of authors and a straight fight between template decomposition methods (including our Silvet plugin, from Emmanouil Benetos’s work) and trendy convolutional neural networks. Results are here and here.

With so many entries and evaluations it’s not that easy to get a clear picture, and no single method appears to be overwhelmingly strong. There were fine results in some evaluations for CNN methods from Thickstun et al. and Thomé and Ahlbäck, for Pogorelyuk and Rowley’s very intriguing “Dynamic Mode Decomposition”, and for a few others whose abstracts are missing from the entry site and so can’t be linked to.

Silvet, with the same results as last year, does well enough to be interesting, but in most cases it isn’t troubling the best of the newer methods.

Audio Onset Detection

Bit of a puzzle here, as our two plugin submissions both got slightly different results from last year despite being unchanged implementations of deterministic methods invoked in the same way on the same data sets.

Last year saw a big expansion in the number of entries, and this year there were nearly as many. Just as last year, our old plugins performed modestly, but again some of the new experiments fared a bit less well, so we weren’t quite at the bottom. Results here.

Audio Beat Tracking

Same puzzle as in onset detection: while our results were basically similar to last year, they weren’t identical. The 2015 and 2016 results were identical and we would have expected the same again in 2017.

That apart, there’s little to report since last year. Results are here, here, and here.

Audio Tempo Estimation

Last year there were two entries in this category, ours and a much stronger one from Sebastian Böck. This year sees one addition, from Hendrik Schreiber and Meinard Müller, which fares creditably. The results are here.

Audio Key Detection

Two pretty successful new submissions this year, both using convolutional neural networks: one from Korzeniowski, Böck, Krebs and Widmer, and the other from Hendrik Schreiber. Our old plugin (from work by Katy Noland) does not fare tragically, but it’s clear that some other methods are getting much closer to the sort of performance one imagines should be realistic. The results are linked from here.

Intuitively, key estimation seems like the sort of problem that is interesting only so long as you don’t have enough training data. As a 24-way classification with large enough training datasets, it looks a bit mundane. The problem becomes, what does it mean for a piece of music to be in a particular key anyway? Submissions are not expected to answer that, but presumably it sets an upper bound on performance.
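To make the “24-way classification” concrete: the 24 classes are the 12 possible tonics, each in major or minor mode. A minimal sketch of that label space follows. The label spelling, the ordering, and the helper names are my own assumptions for illustration, not the format used by the evaluation.

```python
# 12 tonic pitch classes x 2 modes = 24 key classes.
# Sharps-only spelling and this ordering are assumptions for illustration.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MODES = ["major", "minor"]

# All major keys first (indices 0-11), then all minor keys (indices 12-23)
KEY_LABELS = [f"{tonic} {mode}" for mode in MODES for tonic in PITCH_CLASSES]

def key_to_index(label: str) -> int:
    """Map a label such as 'A minor' to its class index."""
    return KEY_LABELS.index(label)

def index_to_key(index: int) -> str:
    """Map a class index back to its label."""
    return KEY_LABELS[index]

print(len(KEY_LABELS))          # 24
print(key_to_index("A minor"))  # 21
print(index_to_key(0))          # C major
```

A classifier for this task only ever has to choose one of these 24 indices per piece, which is what makes the problem look mundane once there is enough labelled data.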

Audio Chord Estimation

Another increase in the number of test datasets, from 5 to 7, and a strong category again. Last year our submission Chordino (by Matthias Mauch) was beginning to trail, though it wasn’t quite at the back. This year some of the weaker submissions have not been repeated, some new entries have appeared, and Chordino is in last place for every evaluation. It’s not far behind — perceptually it’s still a pretty good algorithm — but some of the other methods are very impressive now. Here are the results.

The abstracts accompanying the two submissions from the audio information processing group at Fudan University in Shanghai (one from Jiang, Li and Wu, the other from Wu, Feng and Li) are both well worth a read. The former refers closely to Chordino, using the same NNLS Chroma features with a new front-end. Meanwhile, the latter proposes a method worth remembering for dinner parties, using deep residual networks trained from MIDI-synchronised constant-Q representations of audio, with a bidirectional long short-term memory and conditional random field for labelling.

 

Sonic Visualiser 3.0, at last

Finally!

(See previous posts: Help test the Sonic Visualiser v3.0 beta, A second beta of Sonic Visualiser v3.0, A third beta of Sonic Visualiser v3.0, and Yes, there’s a fourth beta of Sonic Visualiser v3.0 now)

No doubt, now that the official release is out, some horrible problem or other will come to light. It wouldn’t be the first time: Sonic Visualiser v2.4 went through a beta programme before release and still had to be replaced with v2.4.1 after only a week. These things happen and that’s OK, but for now I’m feeling good about this one.

 

Yes, there’s a fourth beta of Sonic Visualiser v3.0 now

Previously I wrote about the third Sonic Visualiser v3.0 beta release:

“This may well be the final beta, so if you’re seeing problems with it, please do report them while there’s still time!”

Well, some very kind people did report problems, and so that was not the final beta. A fourth one is now up for download. Here are the download URLs:

Fixes since the third beta

  • Fix a nasty crash in session I/O in the 64-bit Windows build (this is the main reason for the new beta)
  • Provide more log information about audio drivers to the debug log file
  • Fix a very occasional one-sample-too-short error in resampling audio files during load
  • Fix invisible measure tool crosshairs on spectrogram
  • Fix a possible memory leak in the spectrogram

Keep the bug reports coming!

This one really could be the final beta! So please do report any troubles you have with it. Drop me a line, post a comment below this article, or use the SourceForge bug tracker. And thank you!

 

A third beta of Sonic Visualiser v3.0

Update – 23rd Feb: We have a fourth beta now!

After a short break, we have a third beta of the forthcoming v3.0 release of Sonic Visualiser. Downloads here:

Bugs fixed, and other changes made since the second beta

  • Sonic Visualiser could hang when trying to initialise a transform that refused the first choice of initialisation parameters
  • Error handling for problems in running transforms has been improved in general
  • The Colour 3D Plot layer was sometimes pathologically slow to update
  • The “Normalise Visible Area” option in the Colour 3D Plot layer wasn’t working
  • The visual rendering style of some layers has been improved when viewed on high-resolution screens without pixel doubling
  • A new feature has snuck in, under cover of fixing a rendering offset problem in the spectrum layer: it is now possible (although cumbersome) to zoom the spectrum layer in the frequency axis
  • The process of overhauling the Help Reference documentation to properly describe the new release has begun

Let us know what else you find!

This may well be the final beta, so if you’re seeing problems with it, please do report them while there’s still time!

Drop me a line, post a comment below this article, or use the SourceForge bug tracker.

(This post is a follow-up to “Help test the Sonic Visualiser v3.0 beta” and “A second beta of Sonic Visualiser v3.0”.)

A second beta of Sonic Visualiser v3.0

Update – 9th Feb: There is now a third beta! See here for details.

Here’s a second beta release of Sonic Visualiser v3.0:

Bugs found in the first beta and fixed for the second

  • The peak-frequency spectrogram rendered the entire track into the first 1/8th of its length, and showed nothing after that. (The cause of this might make a marginally interesting technical post in its own right)
  • A similar effect was exhibited by Colour 3D Plot layers, but only at very close zoom levels
  • When the Windows build had been used to view an mp3 file, it would subsequently crash on exit
  • All platforms could hang on startup if certain plugins were installed (the Fan Chirp plugin from the Universidad de la República in Uruguay was one example, though it wasn’t the fault of the plugin)
  • The playback/record level meters were very flickery
  • The source package didn’t build on Fedora Linux

What other problems have you spotted?

Let us know! Drop me a line, post a comment below this article, or use the SourceForge bug tracker.

(This post is a follow-up to “Help test the Sonic Visualiser v3.0 beta”)

Help test the Sonic Visualiser v3.0 beta

A first beta release of Sonic Visualiser v3.0 is now available for download, and we’d love to get your feedback.

Sonic Visualiser v3.0beta1 on Windows

Sonic Visualiser is a free, open-source desktop application for close study and annotation of music audio recordings, developed in the Centre for Digital Music at Queen Mary, University of London. It’s been available for about a decade now, and v3.0 will be one of the most substantial updates it’s ever had. This should be a really good release, but we need to hear about the problems other people have with the beta versions before we can be sure of that.

Get it here

Update – 17th Jan: These are not the latest links any more: there is now a second beta! See here for details.

The first beta can be downloaded from the Sound Software code site:

There will be Linux binaries as well, but I’m still working on packaging for those. Watch this space. (Update: there is now an Ubuntu package linked above. I’d like to be making more options available, not least because I don’t actually use Ubuntu myself, but this is a start.)

Note that the beta pops up a dialog each time you run it to remind you that it’s a beta. Sorry about that, I know it might be annoying.

What’s changed

Here’s the list of changes since the last release.

Besides some new features and a lot of bug fixes, there are a few interesting internal changes:

  • Everything to do with sample indexing now uses 64-bit offsets, and it’s possible to load very long audio files that wouldn’t have worked in the previous release
  • Audio analysis plugins are now run with process separation so a misbehaving plugin should no longer be able to crash the host
  • It’s now possible to record audio as well as play it, and to select the record and playback devices in the preferences
  • The user interface now adapts fully to hi-dpi (“retina”) displays on all three platforms
  • For the first time the Windows version is natively 64-bit (if your Windows installation is, and almost all Windows installations are nowadays) — while still being able to use any 32-bit Vamp plugins you have installed
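As a rough worked example of why the move to 64-bit sample offsets matters for very long files: the sketch below assumes, purely for illustration, that the alternative is a signed 32-bit frame index (the list above doesn’t say exactly what the previous release used), and works out how long a recording each index width can address at a given sample rate.

```python
# Largest values representable by signed 32-bit and 64-bit integers
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

def max_duration_hours(max_index: int, sample_rate: int) -> float:
    """Longest recording, in hours, whose frames can all be indexed."""
    return max_index / sample_rate / 3600.0

# Assumption: a signed 32-bit frame index, at two common sample rates
print(round(max_duration_hours(INT32_MAX, 44100), 2))   # about 13.53 hours
print(round(max_duration_hours(INT32_MAX, 192000), 2))  # about 3.11 hours

# With a 64-bit index the limit is far beyond any real recording
print(max_duration_hours(INT64_MAX, 44100) > 1e9)       # True
```

Under that assumption, a file longer than about thirteen and a half hours at 44.1kHz would overflow a 32-bit frame index, and the limit arrives sooner if the index counts individual channel samples or bytes rather than frames.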

I’m quite excited about this release, so now I need to hear all your deflating reports about the things that aren’t working!

What we particularly need feedback on

  • Problems installing or running the application at all!
  • Problems running plugins that worked with a previous version
  • Problems playing or recording audio, glitches, error dialogs with complaints about audio drivers
  • Any crashes or other error dialogs
  • Any unexpectedly slow performance while showing analyses or running plugins

Note for Linux users

I mentioned above that I’m still working on packaging for Linux. That process also includes overhauling the INSTALL-file instructions, which are not quite up-to-date. If you look at the series of commands carried out in the Docker script at deploy/linux/docker/Dockerfile.ubuntu64 in the source tree, you’ll get an idea of what needs to be done to compile as things stand.

How to report problems

Use the venerable SourceForge bug tracker, or for quick reports you could just post a comment below, send me an email, tweet at me, etc.

For any problems that arise when using a specific file (audio or annotation), it’s massively helpful if you can attach an example file that exhibits the problem. In general, listing any steps to take to reproduce a bug (even if it seems to you that the bug must be so obvious that nobody could ever have missed it) is very useful indeed.

If you run into something and you’re not sure whether it’s a bug or you’re just being stupid, please do report it anyway. A program that makes you feel stupid is already wrong on some level, though I’m all too aware that Sonic Visualiser can do that sometimes because it is a bit overcomplicated in places.

Things we haven’t done yet

We had hoped to devise an easier way to obtain and install plugins in time for this release, and recent survey feedback suggested this would be a very welcome thing for many prospective users. Sadly we haven’t been able to do anything in that area yet, but I hope we may be able to soon.