Difference between revisions of "2009:Audio Music Similarity and Retrieval Results"

From MIREX Wiki

Revision as of 18:41, 14 October 2009

Introduction

General Legend

Team ID

ANO = Anonymous
BF1 = Benjamin Fields (chr12)
BF2 = Benjamin Fields (mfcc10)
BSWH1 = Dmitry Bogdanov, Joan Serrà, Nicolas Wack, and Perfecto Herrera (clas)
BSWH2 = Dmitry Bogdanov, Joan Serrà, Nicolas Wack, and Perfecto Herrera (hybrid)
CL1 = Chuan Cao, Ming Li
CL2 = Chuan Cao, Ming Li
GT = George Tzanetakis
LR = Thomas Lidy, Andreas Rauber
ME1 = François Maillet, Douglas Eck (mlp)
ME2 = François Maillet, Douglas Eck (sda)
PS1 = Tim Pohle, Dominik Schnitzer (2007)
PS2 = Tim Pohle, Dominik Schnitzer (2009)
SH1 = Stephan Hübler
SH2 = Stephan Hübler

Broad Categories

NS = Not Similar
SS = Somewhat Similar
VS = Very Similar

Calculating Summary Measures

Fine(1) = Sum of fine-grained human similarity decisions (0-10).
PSum(1) = Sum of human broad similarity decisions: NS=0, SS=1, VS=2.
WCsum(1) = 'World Cup' scoring: NS=0, SS=1, VS=3 (rewards Very Similar).
SDsum(1) = 'Stephen Downie' scoring: NS=0, SS=1, VS=4 (strongly rewards Very Similar).
Greater0(1) = NS=0, SS=1, VS=1 (binary relevance judgement).
Greater1(1) = NS=0, SS=0, VS=1 (binary relevance judgement using only Very Similar).

(1) Normalized to the range 0 to 1.
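
The scoring schemes above can be illustrated with a short Python sketch. The page does not say how the normalization to the range 0 to 1 is done, so the sketch assumes each sum is divided by the largest sum attainable for the same number of judgments. The function name `summary_measures` and the sample judgments are hypothetical.

```python
# Weight of each broad category under the schemes listed in the legend.
BROAD_WEIGHTS = {
    "PSum":     {"NS": 0, "SS": 1, "VS": 2},
    "WCsum":    {"NS": 0, "SS": 1, "VS": 3},
    "SDsum":    {"NS": 0, "SS": 1, "VS": 4},
    "Greater0": {"NS": 0, "SS": 1, "VS": 1},
    "Greater1": {"NS": 0, "SS": 0, "VS": 1},
}
FINE_MAX = 10.0  # upper bound of the fine-grained scale


def summary_measures(broad, fine):
    """Return the normalized measures for parallel lists of judgments.

    broad: list of "NS" / "SS" / "VS" labels
    fine:  list of fine-grained scores between 0 and 10
    Assumption: each sum is divided by its maximum attainable value.
    """
    if not broad or len(broad) != len(fine):
        raise ValueError("broad and fine must be non-empty and the same length")
    n = len(broad)
    result = {"Fine": sum(fine) / (FINE_MAX * n)}
    for name, weights in BROAD_WEIGHTS.items():
        total = sum(weights[label] for label in broad)
        result[name] = total / (max(weights.values()) * n)
    return result


# Hypothetical judgments for four candidates (not MIREX data).
example = summary_measures(["VS", "SS", "NS", "VS"], [8.0, 5.0, 1.0, 9.0])
```

For these four made-up judgments PSum is 5/8 and WCsum is 7/12, which shows how the heavier Very Similar weight shifts the normalized value.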

Overall Summary Results

file /nema-raid/www/mirex/results/ams/evalutron/summary_evalutron.csv not found

Friedman's Tests

Friedman's Test (FINE Scores)

The Friedman test was run in MATLAB against the Fine summary data over the 100 queries.
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);
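
As a rough illustration of what the test operates on, the Python sketch below ranks the systems within each query and computes the mean rank per system together with the basic Friedman chi-square statistic. It is not the MATLAB procedure used here: it applies no tie correction to the statistic and does not perform the Tukey-Kramer multiple comparison. The function names and the sample matrix are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def friedman_statistic(scores):
    """scores[q][s] is the score of system s on query q.

    Returns (mean rank per system, uncorrected Friedman chi-square).
    """
    n = len(scores)     # number of queries (blocks)
    k = len(scores[0])  # number of systems (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        for s, rank in enumerate(average_ranks(row)):
            rank_sums[s] += rank
    chi_square = (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)
    return [r / n for r in rank_sums], chi_square


# Hypothetical scores: 4 queries (rows) by 3 systems (columns).
sample = [
    [0.2, 0.5, 0.9],
    [0.1, 0.6, 0.8],
    [0.3, 0.4, 0.7],
    [0.5, 0.2, 0.9],
]
mean_ranks, chi_square = friedman_statistic(sample)
```

The test ranks within each query rather than using raw scores, so a query on which every system scores low does not penalize any one system.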

file /nema-raid/www/mirex/results/ams/evalutron/evalutron.fine.friedman.tukeyKramerHSD.csv not found

https://music-ir.org/mirex/2009/results/ams/evalutron/small.evalutron.fine.friedman.tukeyKramerHSD.png

Friedman's Test (BROAD Scores)

The Friedman test was run in MATLAB against the BROAD summary data over the 100 queries.
Command: [c,m,h,gnames] = multcompare(stats, 'ctype', 'tukey-kramer','estimate', 'friedman', 'alpha', 0.05);

file /nema-raid/www/mirex/results/ams/evalutron/evalutron.cat.friedman.tukeyKramerHSD.csv not found

https://music-ir.org/mirex/2009/results/ams/evalutron/small.evalutron.cat.friedman.tukeyKramerHSD.png


Summary Results by Query

FINE Scores

These are the mean FINE scores per query assigned by Evalutron graders. The FINE scores for the 5 candidates returned per algorithm, per query, have been averaged. Values are bounded between 0.0 and 10.0. A perfect score would be 10. Genre labels have been included for reference.
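
The per-query averaging described above can be sketched in Python as follows. The record layout (query, system, score), the query label and the scores are hypothetical, since the underlying CSV is not shown on this page. The same function would apply to BROAD scores coded 0 to 2.

```python
from collections import defaultdict


def mean_score_per_query(judgments):
    """Average candidate scores for each (query, system) pair.

    judgments: iterable of (query_id, system_id, score) tuples,
    one tuple per returned candidate.
    """
    grouped = defaultdict(list)
    for query_id, system_id, score in judgments:
        grouped[(query_id, system_id)].append(score)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}


# Hypothetical FINE scores for the 5 candidates of two systems on one query.
records = (
    [("q001", "GT", s) for s in (7.0, 8.0, 6.0, 9.0, 5.0)]
    + [("q001", "LR", s) for s in (4.0, 5.0, 3.0, 6.0, 2.0)]
)
per_query = mean_score_per_query(records)
```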

file /nema-raid/www/mirex/results/ams/evalutron/fine_scores.csv not found

BROAD Scores

These are the mean BROAD scores per query assigned by Evalutron graders. The BROAD scores for the 5 candidates returned per algorithm, per query, have been averaged. Values are bounded between 0 (not similar) and 2 (very similar). A perfect score would be 2. Genre labels have been included for reference.

file /nema-raid/www/mirex/results/ams/evalutron/cat_scores.csv not found

Anonymized Metadata

Anonymized Metadata

Raw Scores

The raw data derived from the Evalutron 6000 human evaluations are located on the Audio Music Similarity and Retrieval Raw Data page.

Individual Performance Reports

Team ID

ANO = Anonymous (https://music-ir.org/mirex/2009/results/ams/statistics/ANO/report.txt)
BF1 = Benjamin Fields (chr12)
BF2 = Benjamin Fields (mfcc10)
BSWH1 = Dmitry Bogdanov, Joan Serrà, Nicolas Wack, and Perfecto Herrera (clas)
BSWH2 = Dmitry Bogdanov, Joan Serrà, Nicolas Wack, and Perfecto Herrera (hybrid)
CL1 = Chuan Cao, Ming Li
CL2 = Chuan Cao, Ming Li
GT = George Tzanetakis
LR = Thomas Lidy, Andreas Rauber
ME1 = François Maillet, Douglas Eck (mlp)
ME2 = François Maillet, Douglas Eck (sda)
PS1 = Tim Pohle, Dominik Schnitzer (2007)
PS2 = Tim Pohle, Dominik Schnitzer (2009)
SH1 = Stephan Hübler
SH2 = Stephan Hübler