Difference between revisions of "2010:Audio Cover Song Identification"

From MIREX Wiki

Revision as of 09:32, 24 May 2010

2010 AUDIO COVER SONG IDENTIFICATION TASK OVERVIEW

The text of this section is copied from the 2009 page. Please add your comments and discussions for 2010.

The Audio Cover Song task was a new task for MIREX 2006 and was last run in 2008. It was closely related to the 2010:Audio Music Similarity and Retrieval (AMS) task as the cover songs were embedded in the Audio Music Similarity and Retrieval test collection.


Task Description

Within the 1000 pieces in the Audio Cover Song database, there are embedded 30 different "cover songs", each represented by 11 different "versions", for a total of 330 audio files (16-bit, monophonic, 22.05 kHz, WAV). The "cover songs" represent a variety of genres (e.g., classical, jazz, gospel, rock, folk-rock, etc.) and the variations span a variety of styles and orchestrations.

Using each of these cover song files in turn as the "seed/query" file, we will examine the returned lists of items for the presence of the other 10 versions of the "seed/query" file.


In addition to the previous Audio Cover Song dataset, we are going to use the Mazurka dataset. We will randomly choose 11 versions of each of 49 mazurkas (49 x 11 = 539 files in total) and run this as a separate subtask. The I/O format will be the same as in previous years. Systems will return a 539x539 distance matrix.


Task specific mailing list

A specific mailing list is provided for the discussion of this task and related tasks (2010:Audio Classification (Test/Train) tasks, 2010:Audio_Cover_Song_Identification, 2010:Audio_Tag_Classification, 2010:Audio_Music_Similarity_and_Retrieval) at: https://mail.lis.uiuc.edu/mailman/listinfo/mrx-com00. If you wish to participate in any of these tasks, please sign up to this mailing list, as discussion of the task format and evaluation should be conducted there.


Command Line Calling Format

$ /path/to/submission <collection_list_file> <query_list_file> <working_directory> <output_file>
    <collection_list_file>: Text file containing 1000 full path file names for the
                            1000 audio files in the collection (including the 330 
                            query documents).
                            Example: /path/to/coversong/collection.txt
    <query_list_file>     : Text file containing the 330 full path file names for the 
                            330 query documents.
                            Example: /path/to/coversong/queries.txt
    <working_directory>   : Full path to a temporary directory where submission will 
                            have write access for caching features or calculations.
                            Example: /tmp/submission_id/
    <output_file>         : Full path to file where submission should output the distance 
                            matrix (system name line, 1000 file path rows, column index
                            row, then the 330 x 1000 data matrix).
                            Example: /path/to/coversong/results/submission_id.txt
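The calling format above takes four positional arguments. A minimal sketch of how a Python submission might read them is given below; the names `TaskArgs` and `parse_args` are hypothetical and not part of the task definition.

```python
from typing import NamedTuple


class TaskArgs(NamedTuple):
    """The four positional arguments of the calling format (hypothetical container)."""
    collection_list_file: str
    query_list_file: str
    working_directory: str
    output_file: str


def parse_args(argv):
    """Map the argument list (without the program name) onto named fields."""
    if len(argv) != 4:
        raise ValueError(
            "usage: submission <collection_list_file> <query_list_file> "
            "<working_directory> <output_file>"
        )
    return TaskArgs(*argv)
```

A submission would typically call `parse_args(sys.argv[1:])` at start-up, after importing `sys`.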

Input Files

The collection list file will be of the form:

/path/to/audio/file/000.wav\n
/path/to/audio/file/001.wav\n
/path/to/audio/file/002.wav\n
... * 996 rows omitted * ...
/path/to/audio/file/999.wav\n

The query list file will be of the form:

/path/to/audio/file/182.wav\n
/path/to/audio/file/245.wav\n
/path/to/audio/file/432.wav\n
... * 326 rows omitted * ...
/path/to/audio/file/973.wav\n

For a total of 330 rows -- query ids are randomly assigned from the pool of 1000 collection ids.

Lines will be terminated by a '\n' character.
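As an illustration of the input format, the sketch below parses a list file (one path per line, '\n'-terminated) and looks up each query's 1-based position in the collection list. The function names are hypothetical, and the sketch assumes that every query path appears verbatim in the collection list, which the description above implies but does not state in those words.

```python
def parse_list_text(text):
    """Split the contents of a list file into paths, skipping empty lines."""
    return [line for line in text.split("\n") if line.strip()]


def read_list_file(path):
    """Read a collection or query list file from disk."""
    with open(path, "r") as handle:
        return parse_list_text(handle.read())


def query_indices(collection, queries):
    """Return the 1-based collection index of each query, in query-list order.

    Assumes each query path occurs verbatim in the collection list;
    a missing path raises KeyError.
    """
    position = {path: i for i, path in enumerate(collection, start=1)}
    return [position[q] for q in queries]
```

Keeping the result in query-list order matters because the rows of the output matrix are expected in that same order.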

Output File

The only output will be a distance matrix file that is 330 rows by 1000 columns in the following format:


Distance matrix header text with system name
1\t</path/to/audio/file/track1.wav>
2\t</path/to/audio/file/track2.wav>
3\t</path/to/audio/file/track3.wav>
4\t</path/to/audio/file/track4.wav>
...
N\t</path/to/audio/file/trackN.wav>
Q/R\t1\t2\t3\t4\t...\tN
1\t<dist 1 to 1>\t<dist 1 to 2>\t<dist 1 to 3>\t<dist 1 to 4>\t...\t<dist 1 to N>
3\t<dist 3 to 1>\t<dist 3 to 2>\t<dist 3 to 3>\t<dist 3 to 4>\t...\t<dist 3 to N>

where N is 1000 (the number of candidate tracks in the dataset). The queries are drawn from this set and should bear the same track indexes where possible.

An output file might look like:

Example distance matrix 0.1
1    /path/to/audio/file/track1.wav
2    /path/to/audio/file/track2.wav
3    /path/to/audio/file/track3.wav
4    /path/to/audio/file/track4.wav
5    /path/to/audio/file/track5.wav
Q/R   1        2        3        4        5
1     0.00000  1.24100  0.2e-4   0.42559  0.21313
3     50.2e-4  0.62640  0.00000  0.38000  0.15152

Note that the index at the start of each query row refers back to the track list at the top of the distance matrix file and identifies the query track. However, as long as the query songs are listed in exactly the same order as they appear in the query list file you are passed, we will be able to interpret the data.

All distances should be zero or positive (0.0+) and should not be infinite or NaN. Values should be separated by a TAB.
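A submission may want to check these constraints before writing its output. The sketch below is one possible check, not an official validator; `invalid_cells` is a hypothetical helper that takes the matrix as a list of rows of floats.

```python
import math


def invalid_cells(matrix):
    """Return (row, column) positions whose value is negative, infinite or NaN."""
    bad = []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            # math.isfinite is False for both infinities and for NaN.
            if not math.isfinite(value) or value < 0.0:
                bad.append((r, c))
    return bad
```

An empty result means every distance is a finite value of zero or more.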

As the collection searched for covers contains N = 1000 tracks and there are 330 query tracks, the distance matrix should be preceded by a line giving the system name and by 1000 rows of file paths, and should be composed of 330 rows (one for each original track query) of 1000 tab-separated distance values. Each row corresponds to a particular query song (the track to find covers of).
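The layout described above can be sketched as a small formatting function. This is an illustration under stated assumptions, not a reference implementation: the name `format_distance_matrix` is hypothetical, query rows are labelled with their 1-based collection index, and distances are printed with five decimal places as in the example (the task text does not prescribe a precision).

```python
def format_distance_matrix(system_name, collection, query_idx, distances):
    """Build the output file text: header line, path rows, Q/R row, data rows.

    collection : list of file paths (N entries)
    query_idx  : 1-based collection index of each query, in query-list order
    distances  : one row of N floats per query, in the same order
    """
    n = len(collection)
    lines = [system_name]
    for i, path in enumerate(collection, start=1):
        lines.append(f"{i}\t{path}")
    lines.append("Q/R\t" + "\t".join(str(i) for i in range(1, n + 1)))
    for q, row in zip(query_idx, distances):
        if len(row) != n:
            raise ValueError(f"query {q}: expected {n} distances, got {len(row)}")
        # Five decimal places is an assumption borrowed from the example.
        lines.append(f"{q}\t" + "\t".join(f"{d:.5f}" for d in row))
    return "\n".join(lines) + "\n"
```

The returned string can then be written to `<output_file>` in a single call, for example with `open(path, "w").write(text)`.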

Evaluation

We will employ the same measures used in 2007:Audio Cover Song.