This year, for the fourth year in a row, we submitted a number of Vamp audio analysis plugins published by the Centre for Digital Music to the annual MIREX evaluation. The motivation is to give other methods a baseline to compare against, to compare one year’s evaluation metrics and datasets against the next year’s, and to give our group a bit of visibility. See my posts about this process in 2015, 2014, and 2013.
Here’s a review of how we got on this year. We entered an extra category compared to last year, a makeshift entry in the audio downbeat estimation task, making this the widest range of categories we’ve covered with these plugins in MIREX so far.
Structural Segmentation
Results for the four datasets are here, here, here, and here. I don’t find the evaluations any easier to follow than I did last year, but I can see that both of our submissions (Segmentino from Matthias Mauch and the older QM Segmenter from Mark Levy) produced the same results as expected from previous years.
Segmentino actually comes across well in this year’s results, not least because the authors of last year’s best method (Thomas Grill and Jan Schlüter) didn’t submit anything this time.
Multiple Fundamental Frequency Estimation and Tracking
Results here and here. Our Silvet plugin performed much as before: reasonably well, though as usual in such a hard task, with hugely varying results from one test case to another.
Audio Onset Detection
Results here. Many more submissions than last year, which was already a broader field than the year before. Our two old plugins score the same as they did last year, but are no longer placed last, as three of the new submissions have lower scores.
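Onset detection is scored with precision, recall, and F-measure against a small timing tolerance (±50 ms is the commonly cited window). A minimal sketch of that style of scoring — not MIREX's exact matching procedure, just the idea of matching each reference onset to at most one detection inside the window:

```python
def onset_f_measure(reference, detected, tolerance=0.05):
    """Precision, recall and F-measure for onset times in seconds.

    Each reference onset may be matched to at most one detection
    within +/- tolerance seconds. Greedy first-in-window matching,
    which is a simplification of the official evaluation.
    """
    ref = sorted(reference)
    det = sorted(detected)
    used = [False] * len(det)
    matched = 0
    for r in ref:
        for i, d in enumerate(det):
            if not used[i] and abs(d - r) <= tolerance:
                used[i] = True
                matched += 1
                break
    precision = matched / len(det) if det else 0.0
    recall = matched / len(ref) if ref else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Two of three detections land within 50 ms of a reference onset:
p, r, f = onset_f_measure([0.5, 1.0, 1.5], [0.51, 1.04, 2.0])
```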
Audio Beat Tracking
Results here, here, and here. Our BeatRoot and QM Tempo Tracker are once again placed near the back. There’s little change from last year at the top, still occupied by the work of Sebastian Böck and Florian Krebs — work which the authors have, to their great credit, made available as freely-licensed, readable, and well-documented Python code in the madmom library.
Audio Tempo Estimation
Results here. Only two entries this year: our QM Tempo Tracker and a submission from Sebastian Böck based on the aforementioned madmom.
Audio Downbeat Estimation
Results here. In this category we submitted the QM Bar and Beat Tracker plugin by Matthew Davies, which has been around for a few years; it’s based on the QM Tempo Tracker with an additional downbeat estimator.
The results don’t come across very well, for varying reasons according to the dataset. The QM Bar and Beat Tracker needs to be prompted with the time signature and (following a last-minute decision to enter the category this year) I submitted a script which assumed fixed 4/4 time. This meant we knowingly threw away the Ballroom category, which was all 3/4, but the plugin was also ill-suited to several of the other categories. Not a strong submission then, but interesting to see.
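A toy illustration of what the fixed-time-signature assumption amounts to: once a beat tracker has produced beat times and you know the meter and bar phase, downbeats are just every Nth beat. The helper below is hypothetical — the plugin does its own bar-phase estimation internally — but it shows why hard-coding 4/4 forfeits 3/4 material entirely:

```python
def downbeats_from_beats(beat_times, beats_per_bar=4, first_downbeat=0):
    """Naive downbeat picker: given a known meter and bar phase, take
    every beats_per_bar-th beat starting from the first downbeat.
    (Hypothetical helper for illustration, not the plugin's code.)"""
    return beat_times[first_downbeat::beats_per_bar]

beats = [i * 0.5 for i in range(12)]        # steady 120 BPM beat grid
in_four = downbeats_from_beats(beats, 4)    # what the 4/4 script assumed
in_three = downbeats_from_beats(beats, 3)   # what 3/4 Ballroom data needed
```

With `beats_per_bar=4` applied to waltz-time material, every reported downbeat after the first falls on the wrong beat, so those test cases were lost before the plugin even ran.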
Audio Key Detection
Results here and here. Last year I lamented the lack of any other entries than ours, since the category had just gained a second (and more realistic) test dataset. So I’m delighted to see a couple of new submissions this year, including one from Gilberto Bernardes and Matthew Davies at INESC in Porto which appears to perform well.
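Key detection is usually scored with partial credit for musically related errors — the commonly quoted weights are 1.0 for an exact match, 0.5 for a key a fifth away, 0.3 for the relative major/minor, and 0.2 for the parallel major/minor. A sketch of that scheme, with the caveat that it credits fifths in both directions, which may differ from MIREX's exact convention:

```python
def key_score(estimated, reference):
    """Weighted key-detection score in the MIREX style.

    Keys are (pitch_class, mode) pairs: pitch_class 0-11 with C = 0,
    mode 'major' or 'minor'. Weights here are the commonly quoted
    values; the fifth rule is applied in both directions, which is
    an assumption rather than the official definition.
    """
    est_pc, est_mode = estimated
    ref_pc, ref_mode = reference
    if est_mode == ref_mode:
        if est_pc == ref_pc:
            return 1.0                            # exact match
        if (est_pc - ref_pc) % 12 in (5, 7):
            return 0.5                            # a fifth away
    else:
        if est_pc == ref_pc:
            return 0.2                            # parallel major/minor
        if est_mode == 'minor' and (ref_pc - est_pc) % 12 == 3:
            return 0.3                            # relative minor
        if est_mode == 'major' and (est_pc - ref_pc) % 12 == 3:
            return 0.3                            # relative major
    return 0.0

# Reference C major (0, 'major'): G major scores 0.5, A minor 0.3
```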
Audio Chord Estimation
Results here, now up to five test datasets. Last year saw a torrid time with a bug in the Chordino plugin, but this year it’s back to normal. Chordino still performs well, but in a strong category this year it’s no longer one of the top performers.