2006:Symbolic Melodic Similarity Results

From MIREX Wiki

Introduction

These are the results for the 2006 running of the Symbolic Melodic Similarity task set. For background information about this task set please refer to the 2006:Symbolic Melodic Similarity page.

Each system was given a query and returned the 10 most melodically similar songs from a given collection. The collections were RISM (monophonic; 10,000), Karaoke (polyphonic; 1,000), and Mixed (polyphonic; 15,741). For each query, the results returned by all participants were grouped and evaluated by human graders using the Evalutron 6000 system, with each query evaluated by 3 different graders. Graders were asked to provide two scores: one categorical score with 3 categories (NS, SS, VS, as explained below) and one fine score in the range from 0 to 10.
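The grouping step described above can be sketched as follows. This is only an illustration: the function name `pool_candidates`, the system labels, and the document IDs are hypothetical, and the page does not say how the Evalutron 6000 actually merged or ordered the lists. The sketch assumes a simple de-duplicated union of each system's top 10.

```python
def pool_candidates(runs, depth=10):
    """Merge every system's top-`depth` list for one query into a single
    de-duplicated candidate list (assumed pooling strategy, not confirmed
    by the source). First-seen order is preserved."""
    pool = []
    seen = set()
    for ranked in runs.values():
        for doc in ranked[:depth]:
            if doc not in seen:
                seen.add(doc)
                pool.append(doc)
    return pool


# Hypothetical ranked lists returned by two systems for the same query.
runs = {
    "sysA": ["m1", "m2", "m3"],
    "sysB": ["m2", "m4"],
}
pool = pool_candidates(runs)
print(pool)
```

Because systems often return overlapping candidates, the pooled list is usually shorter than the sum of the individual lists, so graders judge each distinct candidate only once per query.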

Evalutron 6000 Summary Data

Number of evaluators = 20
Number of evaluations per query/candidate pair = 3
Number of queries per grader = 15
Ave. size of the candidate lists = 15
Ave. number of query/candidate pairs evaluated per grader = 225
Number of queries (across all subtasks) = 17

General Legend

Team ID

Prefix R = RISM collection, K = Karaoke collection, M = Mixed polyphonic collection

FH = Pascal Ferraro and Pierre Hanna
NM = Kjell Lemström, Niko Mikkilä, Veli Mäkinen and Esko Ukkonen
RT = Rainer Typke, Frans Wiering and Remco C. Veltkamp
KF = Klaus Frieler
AU = Alexandra Uitdenbogerd

Broad Categories

NS = Not Similar
SS = Somewhat Similar
VS = Very Similar

Table Headings

ADR = Average Dynamic Recall
NRGB = Normalized Recall at Group Boundaries
AP = Average Precision (non-interpolated)
PND = Precision at N Documents
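For readers unfamiliar with the last two headings, the textbook definitions of non-interpolated average precision and precision at N documents can be sketched as below. This is a generic illustration, not the evaluation script used for these results: the function names, the example ranking, and the set of relevant documents are all made up, and the page does not state which value of N or which relevance threshold was used.

```python
def precision_at_n(ranked, relevant, n):
    """Fraction of the first n returned documents that are relevant.
    Divides by n even if fewer than n documents were returned."""
    if n <= 0:
        raise ValueError("n must be positive")
    hits = sum(1 for doc in ranked[:n] if doc in relevant)
    return hits / n


def average_precision(ranked, relevant):
    """Non-interpolated AP: sum the precision at each rank where a relevant
    document appears, then divide by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Hypothetical ranked list and ground truth for one query.
ranked = ["a", "x", "b", "y", "c"]
relevant = {"a", "b", "c"}
p_at_5 = precision_at_n(ranked, relevant, 5)   # 3 relevant in the top 5
ap = average_precision(ranked, relevant)       # (1/1 + 2/3 + 3/5) / 3
print(p_at_5, ap)
```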

Calculating Summary Measures

Fine(1) = Sum of fine-grained human similarity decisions (0-10).
PSum(1) = Sum of human broad similarity decisions: NS=0, SS=1, VS=2.
WCsum(1) = 'World Cup' scoring: NS=0, SS=1, VS=3 (rewards Very Similar).
SDsum(1) = 'Stephen Downie' scoring: NS=0, SS=1, VS=4 (strongly rewards Very Similar).
Greater0(1) = NS=0, SS=1, VS=1 (binary relevance judgement).
Greater1(1) = NS=0, SS=0, VS=1 (binary relevance judgement using only Very Similar).

(1) Normalized to the range 0 to 1.
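The scoring schemes above can be illustrated with a short sketch. The weights come directly from the legend; the normalization is an assumption, since the page only says the sums are normalized to the range 0 to 1. Here each sum is divided by the maximum attainable total for the same number of judgements. The function names and sample judgements are hypothetical.

```python
# Weights for each broad-category scheme, taken from the legend above.
WEIGHTS = {
    "PSum": {"NS": 0, "SS": 1, "VS": 2},
    "WCsum": {"NS": 0, "SS": 1, "VS": 3},
    "SDsum": {"NS": 0, "SS": 1, "VS": 4},
    "Greater0": {"NS": 0, "SS": 1, "VS": 1},
    "Greater1": {"NS": 0, "SS": 0, "VS": 1},
}


def normalized_broad_score(judgements, scheme):
    """Sum the scheme's weights over the judgements and divide by the
    largest possible sum (assumed normalization)."""
    if not judgements:
        raise ValueError("need at least one judgement")
    weights = WEIGHTS[scheme]
    max_possible = max(weights.values()) * len(judgements)
    return sum(weights[j] for j in judgements) / max_possible


def normalized_fine_score(fine_scores):
    """Sum the 0-10 fine scores and divide by the maximum of 10 per
    judgement (assumed normalization)."""
    if not fine_scores:
        raise ValueError("need at least one score")
    return sum(fine_scores) / (10 * len(fine_scores))


# Hypothetical judgements for four candidates returned by one system.
judgements = ["VS", "SS", "NS", "VS"]
scores = {name: normalized_broad_score(judgements, name) for name in WEIGHTS}
fine = normalized_fine_score([8, 5, 1, 10])
print(scores, fine)
```

Under this assumption, the same four judgements give 0.625 for PSum but 0.5625 for SDsum: the heavier the weight on VS, the more each SS judgement costs relative to the maximum.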

Overall Summary Results

Visualizations

Rainer Typke has created a series of 2006:Symbolic Melodic Similarity Graphs that help us visualize the results.
Rainer Typke has also created a set of detailed representations of the results that is definitely worth exploring at http://rainer.typke.org/mirex06.0.html.

Task I: RISM Overall Summary

file /nema-raid/www/mirex/results/sms06_rism_sum.csv not found

Task I: RISM Runtime Data

file /nema-raid/www/mirex/results/sms06_rism_runtime.csv not found

Task IIa: Karaoke Overall Summary

file /nema-raid/www/mirex/results/sms06_karaoke_sum.csv not found

Task IIa: Karaoke Runtime Data

file /nema-raid/www/mirex/results/sms06_karaoke_runtime.csv not found

Task IIb: Mixed Polyphonic Overall Summary

file /nema-raid/www/mirex/results/sms06_mixed_sum.csv not found

Task IIb: Mixed Polyphonic Runtime Data

file /nema-raid/www/mirex/results/sms06_mixed_runtime.csv not found

Task I: RISM Collection Summary Results

There is an error with this data set...please stand by. file /nema-raid/www/mirex/results/sms06_rism_results3.csv not found

Task IIa: Karaoke Collection Summary Results

file /nema-raid/www/mirex/results/sms06_kar_results3.csv not found

Task IIb: Mixed Polyphonic Collection Summary Results

file /nema-raid/www/mirex/results/sms06_mix_results3.csv not found

Raw Scores

The raw data derived from the Evalutron 6000 human evaluations are located on the 2006:Symbolic Melodic Similarity Raw Data page.