2015:Multiple Fundamental Frequency Estimation & Tracking Results - MIREX Dataset

Introduction

These are the results for the 2015 running of the Multiple Fundamental Frequency Estimation and Tracking task on the MIREX dataset. For background information about this task set, please refer to the 2014:Multiple Fundamental Frequency Estimation & Tracking page.

General Legend

Sub code Submission name Abstract Contributors
BW1 doMultiF0 PDF Emmanouil Benetos, Tillman Weyde
BW2 NoteTracking1 PDF Emmanouil Benetos, Tillman Weyde
BW3 NoteTracking2 PDF Emmanouil Benetos, Tillman Weyde
CB1 Silvet1 PDF Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, and Dan Stowell
CB2 Silvet2 PDF Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, and Dan Stowell
SY1 MPE1 PDF Li Su, Yi-Hsuan Yang
SY2 MPE2 PDF Li Su, Yi-Hsuan Yang
SY3 MPE3 PDF Li Su, Yi-Hsuan Yang
SY4 MPE4 PDF Li Su, Yi-Hsuan Yang

Run Times

ID Time
BW1 0 hours, 56 minutes and 4 seconds
BW2 0 hours, 55 minutes and 23 seconds
BW3 0 hours, 42 minutes and 0 seconds
CB1 1 hour, 33 minutes and 50 seconds
CB2 0 hours, 5 minutes and 23 seconds
SY1 0 hours, 12 minutes and 19 seconds
SY2 0 hours, 16 minutes and 37 seconds
SY3 0 hours, 11 minutes and 4 seconds
SY4 0 hours, 15 minutes and 17 seconds

Task 1: Multiple Fundamental Frequency Estimation (MF0E)

MF0E Overall Summary Results

Below are the average scores across 40 test files. These files come from 3 different sources: a woodwind quintet recording of bassoon, clarinet, horn, flute and oboe (UIUC); rendered MIDI using the RWC database, donated by IRCAM; and a quartet recording of bassoon, clarinet, violin and sax donated by Dr. Bryan Pardo's Interactive Audio Lab (IAL). 20 files come from 5 sections of the woodwind recording, where each section has 4 files ranging from 2 to 5 polyphony. 12 files come from IAL, drawn from 4 different songs ranging from 2 to 4 polyphony, and 8 files come from the RWC synthesized MIDI, drawn from 2 different songs ranging from 2 to 5 polyphony.

(The overall summary results file for Task 1 is not available.)

Detailed Results

ID Precision Recall Accuracy Etot Esubs Emiss Efa
BW1 0.752 0.755 0.654 0.409 0.096 0.149 0.164
CB1 0.804 0.519 0.498 0.529 0.093 0.389 0.047
CB2 0.655 0.460 0.420 0.636 0.174 0.366 0.095
SY1 0.637 0.775 0.588 0.581 0.137 0.088 0.357
SY2 0.640 0.767 0.584 0.571 0.129 0.104 0.338
SY3 0.631 0.749 0.571 0.603 0.146 0.105 0.352
SY4 0.644 0.719 0.567 0.571 0.140 0.141 0.290
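
Etot, Esubs, Emiss and Efa decompose the total frame-level error into substitution, miss and false-alarm components; note that Etot = Esubs + Emiss + Efa in every row above. Below is a minimal sketch of how these measures are commonly computed from per-frame counts, assuming the Poliner-Ellis definitions used in the multi-F0 literature; this is an illustration, not the MIREX evaluation code, and the function name is hypothetical.

    import numpy as np

    def frame_metrics(Nref, Nsys, Ncorr):
        # Nref/Nsys/Ncorr: per-frame counts of reference F0s, system-returned
        # F0s, and correctly matched F0s (hypothetical inputs).
        Nref, Nsys, Ncorr = (np.asarray(a) for a in (Nref, Nsys, Ncorr))
        tp = Ncorr.sum()
        precision = tp / Nsys.sum()
        recall = tp / Nref.sum()
        accuracy = tp / (Nref.sum() + Nsys.sum() - tp)  # TP / (TP + FP + FN)
        # Error components, each normalized by the total reference count.
        esubs = (np.minimum(Nref, Nsys) - Ncorr).sum() / Nref.sum()
        emiss = np.maximum(Nref - Nsys, 0).sum() / Nref.sum()
        efa = np.maximum(Nsys - Nref, 0).sum() / Nref.sum()
        return precision, recall, accuracy, esubs + emiss + efa, esubs, emiss, efa

    # Example: two frames with (3, 4) reference F0s, (3, 5) returned, (2, 3) correct.
    p, r, acc, etot, esubs, emiss, efa = frame_metrics([3, 4], [3, 5], [2, 3])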

Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).
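
A minimal sketch of this chroma mapping, assuming A440-based equal temperament (an illustration, not the actual evaluation code):

    import numpy as np

    def f0_to_chroma(f0_hz):
        # Convert F0 in Hz to a fractional MIDI number, then fold to a
        # pitch class 0-11 (C = 0), discarding octave information.
        midi = 69 + 12 * np.log2(np.asarray(f0_hz, dtype=float) / 440.0)
        return np.round(midi).astype(int) % 12

    # Example: 440 Hz -> 9 (A), 261.63 Hz -> 0 (C), 220 Hz -> 9 (A again).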

ID Precision Recall Accuracy Etot Esubs Emiss Efa
BW1 0.779 0.783 0.678 0.382 0.069 0.149 0.164
CB1 0.851 0.550 0.527 0.497 0.062 0.389 0.047
CB2 0.746 0.527 0.479 0.568 0.107 0.366 0.095
SY1 0.666 0.813 0.614 0.544 0.099 0.088 0.357
SY2 0.670 0.802 0.611 0.536 0.093 0.104 0.338
SY3 0.662 0.791 0.600 0.561 0.104 0.105 0.352
SY4 0.684 0.760 0.599 0.530 0.098 0.141 0.290

Individual Results Files for Task 1

BW1= Emmanouil Benetos, Tillman Weyde
CB1= Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Dan Stowell
CB2= Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Dan Stowell
SY1= Li Su, Yi-Hsuan Yang
SY2= Li Su, Yi-Hsuan Yang
SY3= Li Su, Yi-Hsuan Yang
SY4= Li Su, Yi-Hsuan Yang

Info about the filenames

The filenames starting with part* come from the acoustic woodwind recording; the ones starting with RWC are synthesized. The legend for the instruments is:

bs = bassoon, cl = clarinet, fl = flute, hn = horn, ob = oboe, vl = violin, cel = cello, gtr = guitar, sax = saxophone, bass = electric bass guitar


Friedman tests for Multiple Fundamental Frequency Estimation (MF0E)

The Friedman test was run in MATLAB to test for significant differences among systems with respect to performance (accuracy) on individual files.
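
In the table below, each row gives a Tukey-Kramer confidence interval on the difference in mean rank between two systems; TRUE means the interval excludes zero, i.e. a significant difference. A minimal sketch of the Friedman test itself in Python (a stand-in with hypothetical random scores, not the organizers' MATLAB script):

    import numpy as np
    from scipy.stats import friedmanchisquare

    # Hypothetical stand-in data: per-file accuracies for 40 files x 7 systems
    # (columns BW1, CB1, CB2, SY1, SY2, SY3, SY4); the real values come from
    # the per-file results, which are not reproduced here.
    rng = np.random.default_rng(0)
    scores = rng.uniform(0.3, 0.8, size=(40, 7))

    # Friedman test: one group per system, one observation per test file.
    stat, p = friedmanchisquare(*scores.T)
    print(f"Friedman chi-square = {stat:.3f}, p = {p:.3g}")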

Tukey-Kramer HSD Multi-Comparison

TeamID TeamID Lowerbound Mean Upperbound Significance
BW1 SY1 -0.3235 1.1000 2.5235 FALSE
BW1 SY2 0.0515 1.4750 2.8985 TRUE
BW1 SY3 0.7015 2.1250 3.5485 TRUE
BW1 SY4 0.8765 2.3000 3.7235 TRUE
BW1 CB1 1.5265 2.9500 4.3735 TRUE
BW1 CB2 3.1515 4.5750 5.9985 TRUE
SY1 SY2 -1.0485 0.3750 1.7985 FALSE
SY1 SY3 -0.3985 1.0250 2.4485 FALSE
SY1 SY4 -0.2235 1.2000 2.6235 FALSE
SY1 CB1 0.4265 1.8500 3.2735 TRUE
SY1 CB2 2.0515 3.4750 4.8985 TRUE
SY2 SY3 -0.7735 0.6500 2.0735 FALSE
SY2 SY4 -0.5985 0.8250 2.2485 FALSE
SY2 CB1 0.0515 1.4750 2.8985 TRUE
SY2 CB2 1.6765 3.1000 4.5235 TRUE
SY3 SY4 -1.2485 0.1750 1.5985 FALSE
SY3 CB1 -0.5985 0.8250 2.2485 FALSE
SY3 CB2 1.0265 2.4500 3.8735 TRUE
SY4 CB1 -0.7735 0.6500 2.0735 FALSE
SY4 CB2 0.8515 2.2750 3.6985 TRUE
CB1 CB2 0.2015 1.6250 3.0485 TRUE

[Figure: Tukey-Kramer HSD multi-comparison of MF0E accuracy]

Task 2: Note Tracking (NT)

NT Mixed Set Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note's onset and its F0 is within ± a quarter tone of that reference note's F0; returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms of it, whichever is larger.
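
A minimal sketch of these two matching rules (an illustration of the description above, not the evaluation code itself; the note representation and function names are hypothetical):

    import math

    # A note is a (onset_sec, offset_sec, f0_hz) tuple.
    def onset_match(ref, est):
        # Onset within +-50 ms and F0 within a quarter tone (1/24 octave).
        return (abs(est[0] - ref[0]) <= 0.05
                and abs(math.log2(est[2] / ref[2])) <= 1 / 24)

    def onset_offset_match(ref, est):
        # Additionally, offset within 20% of the reference duration around
        # the reference offset, or within 50 ms, whichever is larger.
        tol = max(0.2 * (ref[1] - ref[0]), 0.05)
        return onset_match(ref, est) and abs(est[1] - ref[1]) <= tol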

A total of 34 files were used in this subtask: 16 from the woodwind recording, 8 from the IAL quartet recording, and 6 piano recordings.

Metric BW2 BW3 CB1 CB2 SY1 SY2 SY3 SY4
Ave. F-Measure Onset-Offset 0.3565 0.3110 0.3060 0.2061 0.3146 0.2939 0.2848 0.2706
Ave. F-Measure Onset Only 0.6013 0.5413 0.5032 0.3737 0.4786 0.4605 0.4616 0.4552
Ave. F-Measure Chroma 0.3744 0.3387 0.3221 0.2366 0.3358 0.3170 0.3091 0.2954
Ave. F-Measure Onset Only Chroma 0.6346 0.5948 0.5339 0.4276 0.5068 0.4902 0.4849 0.4796

Detailed Results

ID Precision Recall Ave. F-measure Ave. Overlap
BW2 0.329 0.405 0.356 0.882
BW3 0.280 0.366 0.311 0.880
CB1 0.315 0.304 0.306 0.865
CB2 0.201 0.228 0.206 0.861
SY1 0.254 0.430 0.315 0.882
SY2 0.232 0.424 0.294 0.881
SY3 0.218 0.441 0.285 0.876
SY4 0.207 0.420 0.271 0.874
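
The Ave. Overlap column reports the average overlap ratio between matched reference and returned notes; a sketch of the usual definition (an assumed formulation, not quoted from the task description):

    def overlap_ratio(ref, est):
        # Shared duration divided by total spanned duration for a matched
        # (onset_sec, offset_sec, f0_hz) reference/estimate pair; lies in
        # (0, 1] for notes that actually overlap.
        shared = min(ref[1], est[1]) - max(ref[0], est[0])
        spanned = max(ref[1], est[1]) - min(ref[0], est[0])
        return shared / spanned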

Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).

ID Precision Recall Ave. F-measure Ave. Overlap
BW2 0.344 0.426 0.374 0.880
BW3 0.303 0.401 0.339 0.880
CB1 0.331 0.320 0.322 0.860
CB2 0.229 0.265 0.237 0.858
SY1 0.272 0.458 0.336 0.877
SY2 0.251 0.455 0.317 0.875
SY3 0.236 0.478 0.309 0.873
SY4 0.227 0.456 0.295 0.871

Results Based on Onset Only

ID Precision Recall Ave. F-measure Ave. Overlap
BW2 0.566 0.667 0.601 0.720
BW3 0.499 0.618 0.541 0.715
CB1 0.527 0.491 0.503 0.721
CB2 0.376 0.401 0.374 0.677
SY1 0.394 0.639 0.479 0.745
SY2 0.372 0.642 0.460 0.739
SY3 0.358 0.703 0.462 0.694
SY4 0.354 0.691 0.455 0.685

Chroma Results Based on Onset Only

ID Precision Recall Ave. F-measure Ave. Overlap
BW2 0.596 0.705 0.635 0.699
BW3 0.546 0.683 0.595 0.690
CB1 0.559 0.522 0.534 0.701
CB2 0.427 0.464 0.428 0.652
SY1 0.417 0.676 0.507 0.715
SY2 0.396 0.682 0.490 0.699
SY3 0.375 0.739 0.485 0.671
SY4 0.373 0.728 0.480 0.654

Friedman Tests for Note Tracking

The Friedman test was run in MATLAB to test for significant differences among systems with respect to the F-measure on individual files.

Tukey-Kramer HSD Multi-Comparison for Task 2
TeamID TeamID Lowerbound Mean Upperbound Significance
BW2 BW3 -0.4467 1.3529 3.1526 FALSE
BW2 CB1 0.4797 2.2794 4.0791 TRUE
BW2 SY1 0.5974 2.3971 4.1967 TRUE
BW2 SY3 1.3180 3.1176 4.9173 TRUE
BW2 SY2 1.2297 3.0294 4.8291 TRUE
BW2 SY4 1.2886 3.0882 4.8879 TRUE
BW2 CB2 3.1709 4.9706 6.7703 TRUE
BW3 CB1 -0.8732 0.9265 2.7261 FALSE
BW3 SY1 -0.7555 1.0441 2.8438 FALSE
BW3 SY3 -0.0350 1.7647 3.5644 FALSE
BW3 SY2 -0.1232 1.6765 3.4761 FALSE
BW3 SY4 -0.0644 1.7353 3.5350 FALSE
BW3 CB2 1.8180 3.6176 5.4173 TRUE
CB1 SY1 -1.6820 0.1176 1.9173 FALSE
CB1 SY3 -0.9614 0.8382 2.6379 FALSE
CB1 SY2 -1.0497 0.7500 2.5497 FALSE
CB1 SY4 -0.9908 0.8088 2.6085 FALSE
CB1 CB2 0.8915 2.6912 4.4908 TRUE
SY1 SY3 -1.0791 0.7206 2.5203 FALSE
SY1 SY2 -1.1673 0.6324 2.4320 FALSE
SY1 SY4 -1.1085 0.6912 2.4908 FALSE
SY1 CB2 0.7739 2.5735 4.3732 TRUE
SY3 SY2 -1.8879 -0.0882 1.7114 FALSE
SY3 SY4 -1.8291 -0.0294 1.7703 FALSE
SY3 CB2 0.0533 1.8529 3.6526 TRUE
SY2 SY4 -1.7408 0.0588 1.8585 FALSE
SY2 CB2 0.1415 1.9412 3.7408 TRUE
SY4 CB2 0.0827 1.8824 3.6820 TRUE

[Figure: Tukey-Kramer HSD multi-comparison of NT F-measure]

NT Piano-Only Overall Summary Results

This subtask is evaluated in the same two ways described above for the mixed set: onset-only matching, and onset plus offset matching. The 6 piano recordings are evaluated separately for this subtask.

Metric BW2 BW3 CB1 CB2 SY1 SY2 SY3 SY4
Ave. F-Measure Onset-Offset 0.2033 0.2333 0.2387 0.1736 0.1572 0.1195 0.1926 0.1557
Ave. F-Measure Onset Only 0.6406 0.6924 0.6667 0.4941 0.4802 0.4293 0.5327 0.4875
Ave. F-Measure Chroma 0.2064 0.2390 0.2544 0.1849 0.1791 0.1496 0.2114 0.1848
Ave. F-Measure Onset Only Chroma 0.6501 0.7037 0.6772 0.5116 0.5015 0.4658 0.5482 0.5071

Detailed Results

ID Precision Recall Ave. F-measure Ave. Overlap
BW2 0.204 0.203 0.203 0.838
BW3 0.239 0.228 0.233 0.810
CB1 0.276 0.212 0.239 0.813
CB2 0.208 0.152 0.174 0.796
SY1 0.148 0.175 0.157 0.809
SY2 0.105 0.144 0.119 0.814
SY3 0.179 0.219 0.193 0.815
SY4 0.136 0.188 0.156 0.814

Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluation).

ID Precision Recall Ave. F-measure Ave. Overlap
BW2 0.207 0.206 0.206 0.834
BW3 0.245 0.234 0.239 0.814
CB1 0.293 0.227 0.254 0.801
CB2 0.220 0.162 0.185 0.795
SY1 0.166 0.204 0.179 0.795
SY2 0.130 0.185 0.150 0.791
SY3 0.195 0.243 0.211 0.798
SY4 0.159 0.230 0.185 0.792

Results Based on Onset Only

ID Precision Recall Ave. F-measure Ave. Overlap
BW2 0.640 0.643 0.641 0.557
BW3 0.704 0.683 0.692 0.559
CB1 0.743 0.611 0.667 0.586
CB2 0.559 0.450 0.494 0.575
SY1 0.438 0.562 0.480 0.540
SY2 0.368 0.540 0.429 0.533
SY3 0.476 0.643 0.533 0.568
SY4 0.411 0.630 0.487 0.559

Chroma Results Based on Onset Only

ID Precision Recall Ave. F-measure Ave. Overlap
BW2 0.651 0.652 0.650 0.548
BW3 0.716 0.694 0.704 0.556
CB1 0.755 0.620 0.677 0.586
CB2 0.578 0.467 0.512 0.576
SY1 0.456 0.589 0.502 0.533
SY2 0.398 0.587 0.466 0.519
SY3 0.489 0.664 0.548 0.567
SY4 0.427 0.656 0.507 0.555

Individual Results Files for Task 2

BW2= Emmanouil Benetos, Tillman Weyde
BW3= Emmanouil Benetos, Tillman Weyde
CB1= Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Dan Stowell
CB2= Chris Cannam, Emmanouil Benetos, Matthias Mauch, Matthew E. P. Davies, Simon Dixon, Christian Landone, Katy Noland, Dan Stowell
SY1= Li Su, Yi-Hsuan Yang
SY2= Li Su, Yi-Hsuan Yang
SY3= Li Su, Yi-Hsuan Yang
SY4= Li Su, Yi-Hsuan Yang