2011:Multiple Fundamental Frequency Estimation & Tracking Results

From MIREX Wiki
Revision as of 14:55, 15 November 2011 by MertBay (talk | contribs) (Runtimes)

Introduction

These are the results for the 2011 running of the Multiple Fundamental Frequency Estimation and Tracking task. For background information about this task set please refer to the 2011:Multiple Fundamental Frequency Estimation & Tracking page.

General Legend

Sub code Submission name Abstract Contributors
BD1 BenetosDixon MultiF0 PDF Emmanouil Benetos, Simon Dixon
BD2 BenetosDixon NoteTracking1 PDF Emmanouil Benetos, Simon Dixon
BD3 BenetosDixon NoteTracking2 PDF Emmanouil Benetos, Simon Dixon
KD1 Karin Dressler PDF Karin Dressler
LYC1 LYC PDF Cheng-Te Lee, Yi-Hsuan Yang, Homer Chen
RFF1 AlgorithmUsingPianoSamples PDF Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
RFF2 MultiTimbralInternalSamples PDF Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
YR1 MultiF0_Yeh_IRCAM f0FrmFile PDF Chunghsin Yeh, Axel Roebel
YR2 MultiF0_Yeh_IRCAM f0TrkFile PDF Chunghsin Yeh, Axel Roebel
YR3 MultiF0_Yeh_IRCAM f0FrmFile -c PDF Chunghsin Yeh, Axel Roebel
YR4 MultiF0_Yeh_IRCAM f0TrkFile -c PDF Chunghsin Yeh, Axel Roebel


Task 1: Multiple Fundamental Frequency Estimation (MF0E)

MF0E Overall Summary Results

Below are the average scores across 40 test files. The files come from three sources: a woodwind quintet recording of bassoon, clarinet, horn, flute and oboe (UIUC); MIDI renderings of the RWC database donated by IRCAM; and a quartet recording of bassoon, clarinet, violin and sax donated by Dr. Bryan Pardo's Interactive Audio Lab (IAL). 20 files come from 5 sections of the woodwind recording, each section contributing 4 files ranging from 2-voice to 5-voice polyphony. 12 files come from 4 different IAL songs, ranging from 2-voice to 4-voice polyphony, and 8 files come from 2 different RWC synthesized MIDI songs, ranging from 2-voice to 5-voice polyphony.

BD1 KD1 LYC1 RFF1 RFF2 YR1 YR2 YR3 YR4
Accuracy 0.574 0.634 0.474 0.492 0.485 0.662 0.683 0.653 0.678
Accuracy Chroma 0.629 0.664 0.557 0.55 0.542 0.689 0.702 0.681 0.696

download these results as csv

Detailed Results

Precision Recall Accuracy Etot Esubs Emiss Efa
BD1 0.637 0.683 0.574 0.530 0.204 0.113 0.213
KD1 0.850 0.657 0.634 0.384 0.083 0.261 0.041
LYC1 0.555 0.593 0.474 0.711 0.243 0.164 0.304
RFF1 0.627 0.570 0.492 0.592 0.193 0.238 0.161
RFF2 0.567 0.602 0.485 0.657 0.217 0.181 0.259
YR1 0.732 0.800 0.662 0.433 0.094 0.106 0.233
YR2 0.733 0.840 0.683 0.416 0.083 0.076 0.256
YR3 0.714 0.804 0.653 0.458 0.100 0.096 0.263
YR4 0.724 0.843 0.678 0.429 0.084 0.073 0.271

download these results as csv
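The measures in the table above can be sketched as follows. This is an illustrative implementation, not the official evaluator: the greedy per-frame matching, the 50-cent (quarter-tone) tolerance, and the function names are assumptions, but the error decomposition (Esubs + Emiss + Efa = Etot, all normalized by the number of reference F0s) follows the standard frame-level formulation used for this task.

```python
import math

def cents(f1, f2):
    """Absolute distance between two frequencies, in cents (hundredths of a semitone)."""
    return abs(1200.0 * math.log2(f1 / f2))

def frame_metrics(ref_frames, est_frames, tol_cents=50.0):
    """Frame-level precision/recall/accuracy and error decomposition.

    ref_frames / est_frames: one list of F0s (in Hz) per analysis frame.
    A returned F0 counts as correct if it lies within tol_cents (a quarter
    tone = 50 cents) of a still-unmatched reference F0 in the same frame.
    """
    n_ref = n_sys = n_corr = subs = miss = fa = 0
    for ref, est in zip(ref_frames, est_frames):
        used = [False] * len(ref)
        matched = 0
        for f in est:                      # greedy matching within the frame
            for i, r in enumerate(ref):
                if not used[i] and cents(f, r) <= tol_cents:
                    used[i] = True
                    matched += 1
                    break
        n_ref += len(ref)
        n_sys += len(est)
        n_corr += matched
        subs += min(len(ref), len(est)) - matched   # substitution errors
        miss += max(0, len(ref) - len(est))         # missed reference F0s
        fa += max(0, len(est) - len(ref))           # false alarms
    return {
        "Precision": n_corr / n_sys,
        "Recall": n_corr / n_ref,
        "Accuracy": n_corr / (n_ref + n_sys - n_corr),  # TP / (TP + FP + FN)
        "Esubs": subs / n_ref,
        "Emiss": miss / n_ref,
        "Efa": fa / n_ref,
        "Etot": (subs + miss + fa) / n_ref,
    }

# Two frames: frame 1 has refs {220, 440} and one near-correct return;
# frame 2 has ref {220} and one wrong return (a substitution).
m = frame_metrics([[220.0, 440.0], [220.0]], [[221.0], [330.0]])
print(m["Precision"], m["Accuracy"])  # → 0.5 0.25
```

Note that Etot can exceed 1, since a frame can accumulate more errors than it has reference F0s.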

Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluating)

Precision Recall Accuracy Etot Esubs Emiss Efa
BD1 0.700 0.754 0.629 0.460 0.134 0.113 0.213
KD1 0.892 0.688 0.664 0.353 0.051 0.261 0.041
LYC1 0.651 0.703 0.557 0.601 0.134 0.164 0.304
RFF1 0.704 0.637 0.550 0.525 0.126 0.238 0.161
RFF2 0.635 0.674 0.542 0.586 0.146 0.181 0.259
YR1 0.761 0.833 0.689 0.400 0.061 0.106 0.233
YR2 0.753 0.864 0.702 0.392 0.059 0.076 0.256
YR3 0.744 0.840 0.681 0.422 0.064 0.096 0.263
YR4 0.744 0.867 0.696 0.404 0.060 0.073 0.271

download these results as csv
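The chroma mapping described above can be sketched as follows: each F0 is folded onto a single pitch-class octave before matching, so octave errors are forgiven. The helper names and the A440 reference frequency are illustrative assumptions (the pitch-class values only matter relative to each other, so the reference choice is arbitrary).

```python
import math

def to_chroma(f0_hz, ref_hz=440.0):
    """Fold an F0 onto one octave: a pitch class in [0, 12) semitones, A440-relative."""
    return (12.0 * math.log2(f0_hz / ref_hz)) % 12.0

def chroma_match(f_est, f_ref, tol_semitones=0.5):
    """True if the two pitch classes agree within a quarter tone, wrapping around the octave."""
    d = abs(to_chroma(f_est) - to_chroma(f_ref))
    return min(d, 12.0 - d) <= tol_semitones

print(chroma_match(220.0, 440.0))  # octave error, forgiven under chroma → True
print(chroma_match(440.0, 660.0))  # a fifth apart → False
```

This is why every system's chroma accuracy is at least its plain accuracy: the chroma criterion accepts everything the strict criterion accepts, plus octave-displaced estimates.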

Individual Results Files for Task 1

BD1 = Emmanouil Benetos, Simon Dixon
KD1 = Karin Dressler
LYC1 = Cheng-Te Lee, Yi-Hsuan Yang, Homer Chen
RFF1 = Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
RFF2 = Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
YR1 = Chunghsin Yeh, Axel Roebel
YR2 = Chunghsin Yeh, Axel Roebel
YR3 = Chunghsin Yeh, Axel Roebel
YR4 = Chunghsin Yeh, Axel Roebel


Info about the filenames

Filenames starting with part* come from the acoustic woodwind recording; those starting with RWC are synthesized. The instrument abbreviations are:

bs = bassoon, cl = clarinet, fl = flute, hn = horn, ob = oboe, vl = violin, cel = cello, gtr = guitar, sax = saxophone, bass = electric bass guitar

Run Times

TBA

Friedman tests for Multiple Fundamental Frequency Estimation (MF0E)

The Friedman test was run in MATLAB to test for significant differences among systems with respect to per-file accuracy.
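As a sketch of the statistic involved (the actual analysis was run in MATLAB), the Friedman test ranks the systems within each file and asks whether the mean ranks differ more than chance would allow. The function below is a minimal pure-Python version with average ranks for ties; the data are made up for illustration.

```python
def friedman_statistic(scores):
    """Friedman chi-square statistic for a blocks-by-treatments table.

    scores[i][j] is system j's accuracy on file i. Systems are ranked
    within each file (1 = worst ... k = best), with tied values given
    their average rank. Compare the result against a chi-square
    distribution with k - 1 degrees of freedom.
    """
    n = len(scores)        # number of files (blocks)
    k = len(scores[0])     # number of systems (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])  # ascending by score
        ranks = [0.0] * k
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1                     # extend over a run of tied values
            avg = (i + j) / 2.0 + 1.0      # average rank for the tied run
            for m in range(i, j + 1):
                ranks[order[m]] = avg
            i = j + 1
        for j in range(k):
            rank_sums[j] += ranks[j]
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)

# Four files, three systems, with a perfectly consistent ordering:
scores = [[0.90, 0.50, 0.10],
          [0.80, 0.40, 0.20],
          [0.95, 0.55, 0.15],
          [0.85, 0.45, 0.12]]
print(friedman_statistic(scores))  # → 8.0
```

When the omnibus test rejects, a post-hoc multi-comparison such as the Tukey-Kramer HSD procedure used here identifies which specific pairs of systems differ.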

Tukey-Kramer HSD Multi-Comparison

TeamID TeamID Lowerbound Mean Upperbound Significance
YR2 YR4 -0.9994 0.9000 2.7994 FALSE
YR2 YR1 -0.3494 1.5500 3.4494 FALSE
YR2 YR3 1.0006 2.9000 4.7994 TRUE
YR2 KD1 0.4006 2.3000 4.1994 TRUE
YR2 BD1 1.6506 3.5500 5.4494 TRUE
YR2 RFF1 3.7506 5.6500 7.5494 TRUE
YR2 RFF2 4.0506 5.9500 7.8494 TRUE
YR2 LYC1 3.8756 5.7750 7.6744 TRUE
YR4 YR1 -1.2494 0.6500 2.5494 FALSE
YR4 YR3 0.1006 2.0000 3.8994 TRUE
YR4 KD1 -0.4994 1.4000 3.2994 FALSE
YR4 BD1 0.7506 2.6500 4.5494 TRUE
YR4 RFF1 2.8506 4.7500 6.6494 TRUE
YR4 RFF2 3.1506 5.0500 6.9494 TRUE
YR4 LYC1 2.9756 4.8750 6.7744 TRUE
YR1 YR3 -0.5494 1.3500 3.2494 FALSE
YR1 KD1 -1.1494 0.7500 2.6494 FALSE
YR1 BD1 0.1006 2.0000 3.8994 TRUE
YR1 RFF1 2.2006 4.1000 5.9994 TRUE
YR1 RFF2 2.5006 4.4000 6.2994 TRUE
YR1 LYC1 2.3256 4.2250 6.1244 TRUE
YR3 KD1 -2.4994 -0.6000 1.2994 FALSE
YR3 BD1 -1.2494 0.6500 2.5494 FALSE
YR3 RFF1 0.8506 2.7500 4.6494 TRUE
YR3 RFF2 1.1506 3.0500 4.9494 TRUE
YR3 LYC1 0.9756 2.8750 4.7744 TRUE
KD1 BD1 -0.6494 1.2500 3.1494 FALSE
KD1 RFF1 1.4506 3.3500 5.2494 TRUE
KD1 RFF2 1.7506 3.6500 5.5494 TRUE
KD1 LYC1 1.5756 3.4750 5.3744 TRUE
BD1 RFF1 0.2006 2.1000 3.9994 TRUE
BD1 RFF2 0.5006 2.4000 4.2994 TRUE
BD1 LYC1 0.3256 2.2250 4.1244 TRUE
RFF1 RFF2 -1.5994 0.3000 2.1994 FALSE
RFF1 LYC1 -1.7744 0.1250 2.0244 FALSE
RFF2 LYC1 -2.0744 -0.1750 1.7244 FALSE

download these results as csv

[Figure: Friedman mean ranks for Task 1 (2011MF0task1.friedman.Friedman Mean Ranks.png)]

Task 2: Note Tracking (NT)

NT Mixed Set Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note's onset and its F0 is within ± a quarter tone of that reference note's F0; returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must also have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger.
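The two matching setups can be sketched as follows, assuming each note is an (onset, offset, F0-in-Hz) tuple; the function names and tuple layout are illustrative, but the thresholds follow the task description.

```python
import math

def pitch_ok(f_est, f_ref, tol_semitones=0.5):
    """F0 within a quarter tone (0.5 semitone) of the reference F0."""
    return abs(12.0 * math.log2(f_est / f_ref)) <= tol_semitones

def onset_match(est, ref):
    """Setup 1: onset within ±50 ms and F0 within a quarter tone; offsets ignored."""
    return abs(est[0] - ref[0]) <= 0.05 and pitch_ok(est[2], ref[2])

def onset_offset_match(est, ref):
    """Setup 2: additionally, the offset must lie within 20% of the
    reference note's duration (but at least 50 ms) of the reference offset."""
    off_tol = max(0.05, 0.2 * (ref[1] - ref[0]))
    return onset_match(est, ref) and abs(est[1] - ref[1]) <= off_tol

ref = (1.00, 2.00, 440.0)  # times in seconds
print(onset_match((1.03, 2.40, 445.0), ref))         # → True
print(onset_offset_match((1.03, 2.40, 445.0), ref))  # offset 400 ms late → False
print(onset_offset_match((1.03, 2.10, 445.0), ref))  # within 20% of duration → True
```

Because Setup 2 only adds a constraint, each system's onset-offset F-measure is bounded above by its onset-only F-measure, as the tables below show.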

A total of 34 files were used in this subtask: 16 from the woodwind recording, 8 from the IAL quartet recording and 6 piano recordings.

BD2 BD3 LYC1 RFF1 RFF2 YR1 YR3
Ave. F-Measure Onset-Offset 0.2036 0.2077 0.2076 0.1767 0.1414 0.3493 0.3392
Ave. F-Measure Onset Only 0.4465 0.4506 0.3862 0.4078 0.3564 0.5601 0.5465
Ave. F-Measure Chroma 0.2307 0.2438 0.2573 0.2029 0.1655 0.3579 0.3470
Ave. F-Measure Onset Only Chroma 0.5026 0.5232 0.4649 0.4566 0.3986 0.5647 0.5519

download these results as csv

Detailed Results

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.206 0.212 0.204 0.856
BD3 0.200 0.230 0.208 0.853
LYC1 0.190 0.237 0.208 0.829
RFF1 0.143 0.248 0.177 0.864
RFF2 0.103 0.243 0.141 0.864
YR1 0.276 0.489 0.349 0.890
YR3 0.264 0.484 0.339 0.890

download these results as csv

Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluating)

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.232 0.242 0.231 0.854
BD3 0.232 0.273 0.244 0.852
LYC1 0.234 0.297 0.257 0.823
RFF1 0.165 0.283 0.203 0.864
RFF2 0.121 0.281 0.166 0.863
YR1 0.282 0.502 0.358 0.886
YR3 0.270 0.496 0.347 0.886

download these results as csv


Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.460 0.451 0.447 0.692
BD3 0.444 0.483 0.451 0.695
LYC1 0.361 0.430 0.386 0.657
RFF1 0.339 0.550 0.408 0.617
RFF2 0.268 0.576 0.356 0.596
YR1 0.445 0.778 0.560 0.736
YR3 0.429 0.773 0.547 0.735

download these results as csv

Chroma Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.517 0.510 0.503 0.677
BD3 0.512 0.565 0.523 0.679
LYC1 0.432 0.523 0.465 0.640
RFF1 0.380 0.615 0.457 0.599
RFF2 0.300 0.642 0.399 0.565
YR1 0.448 0.787 0.565 0.708
YR3 0.433 0.783 0.552 0.701

download these results as csv

Friedman Tests for Note Tracking

The Friedman test was run in MATLAB to test for significant differences among systems with respect to per-file F-measure.

Tukey-Kramer HSD Multi-Comparison for Task 2
TeamID TeamID Lowerbound Mean Upperbound Significance
YR1 YR3 -0.8389 0.7059 2.2506 FALSE
YR1 BD3 1.3670 2.9118 4.4565 TRUE
YR1 LYC1 1.4259 2.9706 4.5153 TRUE
YR1 BD2 1.4553 3.0000 4.5447 TRUE
YR1 RFF1 1.9259 3.4706 5.0153 TRUE
YR1 RFF2 2.8964 4.4412 5.9859 TRUE
YR3 BD3 0.6611 2.2059 3.7506 TRUE
YR3 LYC1 0.7200 2.2647 3.8094 TRUE
YR3 BD2 0.7494 2.2941 3.8389 TRUE
YR3 RFF1 1.2200 2.7647 4.3094 TRUE
YR3 RFF2 2.1906 3.7353 5.2800 TRUE
BD3 LYC1 -1.4859 0.0588 1.6036 FALSE
BD3 BD2 -1.4565 0.0882 1.6330 FALSE
BD3 RFF1 -0.9859 0.5588 2.1036 FALSE
BD3 RFF2 -0.0153 1.5294 3.0741 FALSE
LYC1 BD2 -1.5153 0.0294 1.5741 FALSE
LYC1 RFF1 -1.0447 0.5000 2.0447 FALSE
LYC1 RFF2 -0.0741 1.4706 3.0153 FALSE
BD2 RFF1 -1.0741 0.4706 2.0153 FALSE
BD2 RFF2 -0.1036 1.4412 2.9859 FALSE
RFF1 RFF2 -0.5741 0.9706 2.5153 FALSE

download these results as csv

[Figure: Friedman mean ranks for Task 2 (2011MF0.Accuracy Per Song Friedman Mean Rankstask2.friedman.Friedman Mean Ranks.png)]

NT Piano-Only Overall Summary Results

This subtask is evaluated in two different ways. In the first setup, a returned note is counted as correct if its onset is within ±50 ms of a reference note's onset and its F0 is within ± a quarter tone of that reference note's F0; returned offset values are ignored. In the second setup, in addition to the above requirements, a correct returned note must also have an offset within 20% of the reference note's duration around the reference note's offset, or within 50 ms, whichever is larger. The 6 piano recordings are evaluated separately for this subtask.

BD2 BD3 LYC1 RFF1 RFF2 YR1 YR3
Ave. F-Measure Onset-Offset 0.1003 0.1136 0.1926 0.1941 0.1550 0.2127 0.1913
Ave. F-Measure Onset Only 0.5263 0.5890 0.5260 0.5205 0.4435 0.6055 0.5881
Ave. F-Measure Chroma 0.1098 0.1205 0.2068 0.2261 0.1944 0.1966 0.1800
Ave. F-Measure Onset Only Chroma 0.5400 0.5996 0.5412 0.5645 0.4930 0.5547 0.5391

download these results as csv

Detailed Results

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.113 0.091 0.100 0.818
BD3 0.127 0.103 0.114 0.819
LYC1 0.198 0.188 0.193 0.791
RFF1 0.182 0.210 0.194 0.796
RFF2 0.124 0.209 0.155 0.787
YR1 0.180 0.263 0.213 0.821
YR3 0.160 0.243 0.191 0.818

download these results as csv

Detailed Chroma Results

Here, accuracy is assessed on chroma results (i.e. all F0s are mapped to a single octave before evaluating)

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.123 0.100 0.110 0.817
BD3 0.135 0.109 0.120 0.818
LYC1 0.212 0.203 0.207 0.778
RFF1 0.211 0.246 0.226 0.794
RFF2 0.155 0.263 0.194 0.787
YR1 0.165 0.249 0.197 0.801
YR3 0.149 0.235 0.180 0.798

download these results as csv

Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.588 0.479 0.526 0.522
BD3 0.663 0.532 0.589 0.523
LYC1 0.531 0.525 0.526 0.558
RFF1 0.487 0.563 0.521 0.558
RFF2 0.355 0.596 0.443 0.541
YR1 0.504 0.793 0.606 0.545
YR3 0.484 0.783 0.588 0.541

download these results as csv

Chroma Results Based on Onset Only

Precision Recall Ave. F-measure Ave. Overlap
BD2 0.603 0.492 0.540 0.521
BD3 0.675 0.541 0.600 0.519
LYC1 0.547 0.539 0.541 0.551
RFF1 0.528 0.612 0.565 0.552
RFF2 0.394 0.665 0.493 0.530
YR1 0.460 0.734 0.555 0.537
YR3 0.442 0.726 0.539 0.533

download these results as csv

Individual Results Files for Task 2

BD2 = Emmanouil Benetos, Simon Dixon
BD3 = Emmanouil Benetos, Simon Dixon
LYC1 = Cheng-Te Lee, Yi-Hsuan Yang, Homer Chen
RFF1 = Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
RFF2 = Gustavo Reis, Francisco Fernandéz, Aníbal Ferreira
YR1 = Chunghsin Yeh, Axel Roebel
YR3 = Chunghsin Yeh, Axel Roebel