2009:Audio Beat Tracking Results
This year was the first year since MIREX 2006 that we have run the Audio Beat Tracking (ABT) task. ABT 2009 used two collections: the original McKinney Collection and the new Sapp's Mazurka Collection. Several new scoring metrics were also introduced this year, in addition to McKinney's P-Score.
McKinney Collection Information
The McKinney Collection comprises 140 files of mixed genres. Each music file has 40 ground-truth sets, across which the beats were averaged.
Sapp's Mazurka Collection Information
New this year, we used 322 files drawn from the Mazurka.org dataset put together by Craig Sapp, who also created the high-quality ground-truth files.
Understanding the Metrics
The evaluation methods were taken from the beat evaluation toolbox and are described in the following technical report:
M. E. P. Davies, N. Degara and M. D. Plumbley. "Evaluation methods for musical audio beat tracking algorithms". Technical Report C4DM-TR-09-06.
For further details on the specifics of the methods please refer to the paper. However, here is a brief summary with appropriate references:
F-measure - the standard calculation as used in onset evaluation but with a ±70ms window.
S. Dixon, "Onset detection revisited," in Proceedings of 9th International Conference on Digital Audio Effects (DAFx), Montreal, Canada, pp. 133–137, 2006.
S. Dixon, "Evaluation of the audio beat tracking system BeatRoot," Journal of New Music Research, vol. 36, no. 1, pp. 39–51, 2007.
Cemgil - beat accuracy is calculated using a Gaussian error function with 40ms standard deviation.
A. T. Cemgil, B. Kappen, P. Desain, and H. Honing, "On tempo tracking: Tempogram representation and Kalman filtering," Journal of New Music Research, vol. 28, no. 4, pp. 259–273, 2001.
Goto - binary decision of correct or incorrect tracking based on statistical properties of a beat error sequence.
M. Goto and Y. Muraoka, "Issues in evaluating beat tracking systems," in Working Notes of the IJCAI-97 Workshop on Issues in AI and Music - Evaluation and Assessment, 1997, pp. 9–16.
PScore - McKinney's impulse train cross-correlation method as used in 2006.
M. F. McKinney, D. Moelants, M. E. P. Davies, and A. Klapuri, "Evaluation of audio beat tracking and music tempo extraction algorithms," Journal of New Music Research, vol. 36, no. 1, pp. 1–16, 2007.
CMLc, CMLt, AMLc, AMLt - continuity-based evaluation methods based on the longest continuously correctly tracked section.
S. Hainsworth, "Techniques for the automated analysis of musical audio," Ph.D. dissertation, Department of Engineering, Cambridge University, 2004.
A. P. Klapuri, A. Eronen, and J. Astola, "Analysis of the meter of acoustic musical signals," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 1, pp. 342–355, 2006.
D, Dg - information-based criteria derived from analysis of a beat error histogram (note that the results are measured in bits, not percentages); see the technical report for a description.
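As a rough illustration of the F-measure and Cemgil summaries above, the following Python sketch computes both scores for one list of annotated beat times and one list of detected beat times (in seconds). It is not the beat evaluation toolbox code. The function names, the greedy one-to-one matching, and the normalisation of the Cemgil score by the mean of the two sequence lengths are assumptions made for illustration; only the ±70ms window and the 40ms standard deviation come from the descriptions above.

```python
import math


def f_measure(annotations, beats, window=0.07):
    """F-measure with a +/- `window` second tolerance (hypothetical helper).

    Each annotation may be matched by at most one detected beat.
    """
    annotations = sorted(annotations)
    beats = sorted(beats)
    i = j = hits = 0
    while i < len(annotations) and j < len(beats):
        diff = beats[j] - annotations[i]
        if abs(diff) <= window:
            hits += 1          # beat falls inside the tolerance window
            i += 1
            j += 1
        elif diff < 0:
            j += 1             # beat too early for this annotation: false positive
        else:
            i += 1             # no beat close enough: false negative
    if hits == 0:
        return 0.0
    precision = hits / len(beats)
    recall = hits / len(annotations)
    return 2 * precision * recall / (precision + recall)


def cemgil_accuracy(annotations, beats, sigma=0.04):
    """Gaussian-weighted beat accuracy (hypothetical helper).

    Each annotation contributes exp(-d^2 / (2 * sigma^2)), where d is the
    distance to its nearest detected beat. Dividing by the mean of the two
    sequence lengths (an assumption here) penalises extra detections.
    """
    if not annotations or not beats:
        return 0.0
    total = 0.0
    for a in annotations:
        nearest = min(abs(b - a) for b in beats)
        total += math.exp(-(nearest ** 2) / (2 * sigma ** 2))
    return total / (0.5 * (len(annotations) + len(beats)))


if __name__ == "__main__":
    # Made-up example times, not taken from either collection.
    truth = [0.5, 1.0, 1.5, 2.0]
    detected = [0.52, 1.0, 1.6, 2.01]
    print("F-measure:", f_measure(truth, detected))
    print("Cemgil:", cemgil_accuracy(truth, detected))
```

In this sketch the F-measure treats every beat inside the window as equally correct, while the Cemgil score decays smoothly with timing error, so the two can rank the same output differently.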
DRP1 = Matthew Davies, Andrew Robertson, Mark Plumbley (Deterministic)
DRP2 = Matthew Davies, Andrew Robertson, Mark Plumbley (Dumb)
DRP3 = Matthew Davies, Andrew Robertson, Mark Plumbley (Flexible)
DRP4 = Matthew Davies, Andrew Robertson, Mark Plumbley (Standard)
GP1 = Geoffroy Peeters (VF)
GP2 = Geoffroy Peeters (VE)
GP3 = Geoffroy Peeters (CF)
GP4 = Geoffroy Peeters (CE)
OGM1 = Joao Lobato Oliveira, Fabien Gouyon, Luis Gustavo Martins (SC)
OGM2 = Joao Lobato Oliveira, Fabien Gouyon, Luis Gustavo Martins (R)
TL = Tsung-Chi Lee
Sapp's Mazurka Collection
MIREX 2009 Audio Beat Tracking Runtime Data