2016:Multiple Fundamental Frequency Estimation & Tracking Results - MIREX Dataset


Introduction

These are the results for the 2016 running of the Multiple Fundamental Frequency Estimation and Tracking task on the MIREX dataset. For background information about this task, please refer to the 2015:Multiple Fundamental Frequency Estimation & Tracking page.

General Legend

Sub code Submission name Abstract Contributors
KB1 (with bugs) Conv_Piano_Transcriptor_2016 PDF Rainer Kelz, Sebastian Böck
MM1 Sonic PDF Matija Marolt
CB1 Silvet PDF Chris Cannam, Emmanouil Benetos
CB2 Silvet Live PDF Chris Cannam, Emmanouil Benetos

Task 1: Multiple Fundamental Frequency Estimation (MF0E)

MF0E Overall Summary Results

Below are the average scores across 40 test files. These files come from three different sources: a woodwind quintet recording of bassoon, clarinet, horn, flute and oboe (UIUC); rendered MIDI from the RWC database, donated by IRCAM; and a quartet recording of bassoon, clarinet, violin and sax donated by Dr. Bryan Pardo's Interactive Audio Lab (IAL). 20 files come from 5 sections of the woodwind recording, where each section has 4 files ranging from 2-voice to 5-voice polyphony. 12 files come from IAL, drawn from 4 different songs ranging from 2-voice to 4-voice polyphony, and 8 files come from the RWC-synthesized MIDI, drawn from 2 different songs ranging from 2-voice to 5-voice polyphony.

Detailed Results

System Precision Recall Accuracy Etot Esubs Emiss Efa
CB1 0.780 0.507 0.486 0.547 0.096 0.398 0.053
CB2 0.655 0.460 0.420 0.636 0.174 0.366 0.095
DT1 0.674 0.477 0.440 0.599 0.173 0.350 0.077
MM1 0.771 0.579 0.537 0.495 0.118 0.304 0.074
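
The error columns follow, to the best of our knowledge, the standard MIREX frame-level error measures (after Poliner and Ellis), where N_ref(t), N_sys(t) and N_corr(t) are the number of reference, returned, and correctly returned F0s in frame t:

    E_{subs} = \frac{\sum_t \left[\min(N_{ref}(t), N_{sys}(t)) - N_{corr}(t)\right]}{\sum_t N_{ref}(t)}
    E_{miss} = \frac{\sum_t \max(0, N_{ref}(t) - N_{sys}(t))}{\sum_t N_{ref}(t)}
    E_{fa}   = \frac{\sum_t \max(0, N_{sys}(t) - N_{ref}(t))}{\sum_t N_{ref}(t)}
    E_{tot}  = \frac{\sum_t \left[\max(N_{ref}(t), N_{sys}(t)) - N_{corr}(t)\right]}{\sum_t N_{ref}(t)} = E_{subs} + E_{miss} + E_{fa}

The rows above are consistent with the identity Etot = Esubs + Emiss + Efa (up to rounding).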


Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).
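
A minimal sketch of the octave folding, assuming F0s in Hz and the usual A4 = 440 Hz MIDI convention (the evaluator's exact mapping is not specified on this page):

    import math

    def chroma_class(f0_hz: float) -> int:
        """Map a frequency in Hz to one of 12 pitch classes (C = 0 ... B = 11)."""
        midi = 69 + 12 * math.log2(f0_hz / 440.0)  # MIDI note number, A4 = 69
        return int(round(midi)) % 12               # fold all octaves together

    # A3 (220 Hz), A4 (440 Hz) and A5 (880 Hz) all land in the same class.
    assert chroma_class(220.0) == chroma_class(440.0) == chroma_class(880.0)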

System Precision Recall Accuracy Etot Esubs Emiss Efa
CB1 0.832 0.540 0.516 0.514 0.062 0.398 0.053
CB2 0.746 0.527 0.479 0.568 0.107 0.366 0.095
DT1 0.712 0.503 0.464 0.573 0.146 0.350 0.077
MM1 0.827 0.622 0.577 0.452 0.075 0.304 0.074


Individual Results Files for Task 1

MM1= Matija Marolt
CB1= Chris Cannam, Emmanouil Benetos
CB2= Chris Cannam, Emmanouil Benetos

Info about the filenames

Filenames starting with part* come from the acoustic woodwind recording; those starting with RWC are synthesized. The instrument abbreviations are:

bs = bassoon, cl = clarinet, fl = flute, hn = horn, ob = oboe, vl = violin, cel = cello, gtr = guitar, sax = saxophone, bass = electric bass guitar

Friedman tests for Multiple Fundamental Frequency Estimation (MF0E)

The Friedman test was run in MATLAB to test for significant differences among systems with regard to performance (accuracy) on individual files.
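
A minimal sketch of this analysis in Python (the original was run in MATLAB's friedman/multcompare; the file name and column layout below are hypothetical):

    import pandas as pd
    from scipy.stats import friedmanchisquare

    # Hypothetical input: one row per test file, one accuracy column per system.
    acc = pd.read_csv("accuracy_per_file.csv")  # columns: CB1, CB2, DT1, MM1

    # Friedman test: rank the systems within each file, then test whether
    # the mean ranks differ more than chance alone would allow.
    stat, p = friedmanchisquare(*(acc[col] for col in acc.columns))
    print(f"Friedman chi-square = {stat:.3f}, p = {p:.4g}")

In the follow-up table, a pair is marked TRUE when the Tukey-Kramer confidence interval on its mean-rank difference excludes zero.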

Tukey-Kramer HSD Multi-Comparison

TeamID TeamID Lowerbound Mean Upperbound Significance
MM1 CB1 -0.0916 0.6500 1.3916 FALSE
MM1 DT1 0.5334 1.2750 2.0166 TRUE
MM1 CB2 1.1334 1.8750 2.6166 TRUE
CB1 DT1 -0.1166 0.6250 1.3666 FALSE
CB1 CB2 0.4834 1.2250 1.9666 TRUE
DT1 CB2 -0.1416 0.6000 1.3416 FALSE


[Figure: 2016 Accuracy Per Song, Friedman Mean Ranks (Task 1)]

Task 2: Note Tracking (NT)

NT Mixed Set Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within ±50 ms of a reference note and its F0 is within ± a quarter tone of the corresponding reference note, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is also required to have an offset value within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger. A minimal sketch of this matching rule appears below.
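
The sketch assumes simple (onset, offset, F0) tuples; this layout is illustrative, not the evaluator's actual interface:

    import math

    def note_matches(ref, est, check_offset=False):
        """ref, est: (onset_sec, offset_sec, f0_hz); True if est matches ref."""
        onset_ok = abs(est[0] - ref[0]) <= 0.05                   # within +-50 ms
        pitch_ok = abs(math.log2(est[2] / ref[2])) <= 1.0 / 24.0  # +- quarter tone
        if not check_offset:
            return onset_ok and pitch_ok                          # first setup
        # Second setup: offset within 20% of the reference duration, or
        # 50 ms, whichever is larger, around the reference offset.
        tol = max(0.05, 0.2 * (ref[1] - ref[0]))
        return onset_ok and pitch_ok and abs(est[1] - ref[1]) <= tol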

A total of 34 files were used in this subtask: 16 from the woodwind recording, 8 from the IAL quartet recording, and 6 piano recordings.

CB1 CB2 DT1 MM1
Ave. F-Measure Onset-Offset 0.3045 0.2061 0.4053 0.3518
Ave. F-Measure Onset Only 0.5027 0.3734 0.7118 0.6184
Ave. F-Measure Chroma 0.3206 0.2360 0.4162 0.3701
Ave. F-Measure Onset Only Chroma 0.5340 0.4268 0.7261 0.6475


Detailed Results

System Precision Recall Ave. F-measure Ave. Overlap
CB1 0.312 0.304 0.305 0.865
CB2 0.200 0.230 0.206 0.862
DT1 0.439 0.379 0.405 0.852
MM1 0.384 0.331 0.352 0.877
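
Assuming the usual MIREX note-tracking formulation, precision, recall and F-measure are computed over matched notes, and the overlap of a matched pair compares the two notes' time extents:

    P = \frac{N_{corr}}{N_{sys}}, \qquad R = \frac{N_{corr}}{N_{ref}}, \qquad F = \frac{2PR}{P + R}

    \mathrm{overlap} = \frac{\min(\mathit{off}_{ref}, \mathit{off}_{sys}) - \max(\mathit{on}_{ref}, \mathit{on}_{sys})}{\max(\mathit{off}_{ref}, \mathit{off}_{sys}) - \min(\mathit{on}_{ref}, \mathit{on}_{sys})}

Ave. Overlap is this ratio averaged over all matched note pairs.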


Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).

System Precision Recall Ave. F-measure Ave. Overlap
CB1 0.328 0.320 0.321 0.860
CB2 0.228 0.265 0.236 0.858
DT1 0.451 0.390 0.416 0.852
MM1 0.404 0.349 0.370 0.874



Results Based on Onset Only

System Precision Recall Ave. F-measure Ave. Overlap
CB1 0.524 0.493 0.503 0.720
CB2 0.374 0.403 0.373 0.677
DT1 0.768 0.668 0.712 0.676
MM1 0.674 0.583 0.618 0.674


Chroma Results Based on Onset Only

System Precision Recall Ave. F-measure Ave. Overlap
CB1 0.557 0.524 0.534 0.700
CB2 0.424 0.465 0.427 0.652
DT1 0.784 0.681 0.726 0.671
MM1 0.706 0.611 0.648 0.661



Friedman Tests for Note Tracking

The Friedman test was run in MATLAB to test significant differences amongst systems with regard to the F-measure on individual files.

Tukey-Kramer HSD Multi-Comparison for Task 2
TeamID TeamID Lowerbound Mean Upperbound Significance
DT1 MM1 0.1074 0.9118 1.7162 TRUE
DT1 CB1 0.9897 1.7941 2.5985 TRUE
DT1 CB2 2.0191 2.8235 3.6279 TRUE
MM1 CB1 0.0780 0.8824 1.6867 TRUE
MM1 CB2 1.1074 1.9118 2.7162 TRUE
CB1 CB2 0.2250 1.0294 1.8338 TRUE


[Figure: 2016 Accuracy Per Song, Friedman Mean Ranks (Task 2, onset only)]

NT Piano-Only Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is assumed correct if its onset is within ±50 ms of a reference note and its F0 is within ± a quarter tone of the corresponding reference note, ignoring the returned offset values. In the second setup, on top of the above requirements, a correct returned note is also required to have an offset value within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger. 6 piano recordings are evaluated separately for this subtask.

CB1 CB2 DT1 KB1* MM1
Ave. F-Measure Onset-Offset 0.2378 0.1749 0.5518 0.0245 0.3376
Ave. F-Measure Onset Only 0.6674 0.4967 0.8199 0.4850 0.7537
Ave. F-Measure Chroma 0.2535 0.1862 0.5527 0.0287 0.3185
Ave. F-Measure Onset Only Chroma 0.6779 0.5142 0.8205 0.4965 0.6984


*Submissions marked with an asterisk contained bugs.

Detailed Results

System Precision Recall Ave. F-measure Ave. Overlap
CB1 0.274 0.211 0.238 0.813
CB2 0.209 0.153 0.175 0.797
DT1 0.618 0.505 0.552 0.796
KB1* 0.020 0.032 0.025 0.118
MM1 0.362 0.317 0.338 0.813


*Submissions marked with an asterisk contained bugs.

Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e., all F0s are mapped to a single octave before evaluating).

System Precision Recall Ave. F-measure Ave. Overlap
CB1 0.291 0.226 0.253 0.801
CB2 0.221 0.164 0.186 0.796
DT1 0.619 0.506 0.553 0.796
KB1* 0.024 0.037 0.029 0.121
MM1 0.341 0.301 0.318 0.807


*Submissions marked with an asterisk contained bugs.

Results Based on Onset Only

System Precision Recall Ave. F-measure Ave. Overlap
CB1 0.742 0.613 0.667 0.585
CB2 0.560 0.454 0.497 0.576
DT1 0.912 0.756 0.820 0.644
KB1* 0.560 0.458 0.485 0.120
MM1 0.794 0.722 0.754 0.613


*Submissions marked with an asterisk contained bugs.

Chroma Results Based on Onset Only

System Precision Recall Ave. F-measure Ave. Overlap
CB1 0.753 0.623 0.678 0.585
CB2 0.579 0.470 0.514 0.577
DT1 0.913 0.756 0.821 0.641
KB1* 0.573 0.469 0.496 0.118
MM1 0.734 0.671 0.698 0.606


*Submissions marked with an asterisk contained bugs.

Individual Results Files for Task 2

KB1= Rainer Kelz, Sebastian Böck
MM1= Matija Marolt
CB1= Chris Cannam, Emmanouil Benetos
CB2= Chris Cannam, Emmanouil Benetos