Difference between revisions of "2014:Query by Tapping"

From MIREX Wiki

Revision as of 17:49, 14 July 2014

Overview

The text of this section is copied from the 2013 page. Please add your comments and discussions for 2014.

The main purpose of QBT (Query by Tapping) is to evaluate MIR systems on retrieving ground-truth MIDI files from queries made by tapping the onsets of music notes into a microphone. This task provides query files in wave format as well as the corresponding human-labeled onset times in symbolic format. For this year's QBT task, we have three corpora for evaluation:

  • Roger Jang's MIR-QBT: This dataset contains both wav files (recorded via microphone) and onset files (human-labeled onset time).
    • 890 onset & .wav queries; 136 ground-truth MIDI files
  • Show Hsiao's QBT_symbolic: This dataset contains only onset files (obtained from the user's tapping on keyboard).
    • 410 onset queries; 143 ground-truth MIDI files (128 of which have at least one query)
  • Kaneshiro et al.'s QBT-Extended: This dataset contains only onset files (obtained from users tapping on a touchscreen). Documentation can be found here.
    • 3,365 onset queries (1,412 from long-term memory and 1,953 from short-term memory) from 60 participants; 51 ground-truth MIDI files
    • A hidden dataset is currently being collected, from 20 new participants

Discussions for 2014

CCRMA is very excited to be hosting the QBT task this year!

Any questions or suggestions can be added directly here, or you can send us an email, qbt | at | ccrma dot stanford dot edu

Task description

  • Evaluations are performed separately on each dataset

Subtask 1: QBT with symbolic input

  • Test database: The set of ground-truth MIDI files corresponding to each dataset.
  • Query files: Text files of onset times to retrieve target MIDIs from all datasets listed above. These onset files let participants concentrate on similarity matching instead of onset detection. Note that onset files derived from .wav files are not guaranteed to reflect perfect onset detection on the original wav query files.
  • Evaluation: Return top 10 candidates for each query file. 1 point is scored for a hit in the top 10 and 0 is scored otherwise (Top-10 hit rate). We may also consider Top-5 and Top-1 scoring.
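
The Top-10 hit rate described above can be sketched as follows. This is a minimal illustration, not the official scoring code: the function name `top_k_hit_rate` and the dictionary layout (query path mapped to a ranked candidate list, and query path mapped to its ground-truth key) are assumptions made for the example.

```python
def top_k_hit_rate(results, ground_truth, k=10):
    """Fraction of queries whose ground-truth MIDI key is among the first k candidates.

    results: {query_path: [candidate_key, ...]} in ranked order (assumed layout).
    ground_truth: {query_path: target_key} (assumed layout).
    """
    if not ground_truth:
        return 0.0
    hits = 0
    for query, target in ground_truth.items():
        # A query with no returned candidates simply scores 0.
        candidates = results.get(query, [])[:k]
        if target in candidates:
            hits += 1  # 1 point for a hit in the top k, 0 otherwise
    return hits / len(ground_truth)


# Toy data: only the first query has its target (00001) among its candidates.
results = {
    "qbtQuery/query_00001.onset": ["00025", "01003", "00001"],
    "qbtQuery/query_00002.onset": ["01547", "02313", "07653"],
}
ground_truth = {
    "qbtQuery/query_00001.onset": "00001",
    "qbtQuery/query_00002.onset": "00001",
}

print(top_k_hit_rate(results, ground_truth))       # 0.5
print(top_k_hit_rate(results, ground_truth, k=1))  # 0.0
```

Passing `k=5` or `k=1` gives the Top-5 and Top-1 variants mentioned above; both depend on the candidates being listed in ranked order.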

Subtask 2: QBT with wave input

  • Test database: About 150 ground-truth monophonic MIDI files in MIR-QBT.
  • Query files: About 800 wave files of tapping recordings to retrieve MIDIs in MIR-QBT.
  • Evaluation: Return top 10 candidates for each query file. 1 point is scored for a hit in the top 10 and 0 is scored otherwise (Top-10 hit rate). We may also consider Top-5 and Top-1 scoring.

Subtask 3: QBT-Extended with symbolic input (new for 2014)

  • This subtask uses a longer query vector concatenating tap times and (pitch) positions.
  • Development dataset: The set of ground-truth MIDI files in the QBT-Extended dataset. Both onset times and MIDI note numbers are used.
  • Query files: Text files of onset times in the QBT-Extended dataset (long-term and short-term memory queries). Both onset times and vertical coordinates of taps are considered.
  • Development evaluation: Return top 10 candidates for each query file in the development dataset. 1 point is scored for a hit in the top 10 and 0 is scored otherwise (Top-10 hit rate). We may also consider Top-5 and Top-1 scoring.
  • Test evaluation: Return top 10 candidates for each query file in the hidden dataset. 1 point is scored for a hit in the top 10 and 0 is scored otherwise (Top-10 hit rate). We may also consider Top-5 and Top-1 scoring.

Command formats

Step 0: Indexing the MIDIs collection

If your algorithm needs to pre-process (e.g., index) the database, your code should do so using the following command-line format (Note that this step is not required unless you want to index or preprocess the MIDI database).

Command format should look like this:

indexing <dbMidi.list> <dir_workspace_root>

where <dbMidi.list> is the input list of database MIDI files, each named uniq_key.mid. For example:

QBT/database/00001.mid
QBT/database/00002.mid
QBT/database/00003.mid
QBT/database/00004.mid
...

Output indexed files are placed into <dir_workspace_root>.
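
As a rough sketch of an entry point for this indexing step, the following reads <dbMidi.list>, derives each uniq_key from the file name, and writes a lookup table into <dir_workspace_root>. The output file name `index.tsv`, its tab-separated layout, and the helper names are hypothetical; a real submission would store whatever structures its matcher needs.

```python
import os
import sys


def read_midi_list(list_path):
    """Return {uniq_key: midi_path} for each non-empty line of the list file."""
    index = {}
    with open(list_path, encoding="utf-8") as f:
        for line in f:
            path = line.strip()
            if path:
                # uniq_key is the file name without directory or extension.
                key = os.path.splitext(os.path.basename(path))[0]
                index[key] = path
    return index


def main(argv):
    """Handle: indexing <dbMidi.list> <dir_workspace_root>"""
    if len(argv) != 3:
        print("usage: indexing <dbMidi.list> <dir_workspace_root>", file=sys.stderr)
        return 1
    list_path, workspace = argv[1], argv[2]
    os.makedirs(workspace, exist_ok=True)
    index = read_midi_list(list_path)
    # "index.tsv" is a hypothetical output name chosen for this sketch.
    with open(os.path.join(workspace, "index.tsv"), "w", encoding="utf-8") as out:
        for key, path in sorted(index.items()):
            out.write(f"{key}\t{path}\n")
    return 0

# A script would typically end with: sys.exit(main(sys.argv))
```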

Step 1: Training

The command format should be like this:

qbtProgram <dbMidi_list> <query_file_list> [dir_workspace_root]

Where <dbMidi_list> is a list of the MIDI files in the database to match against (see Step 0), and <query_file_list> maps each query to its associated ground truth. You can use [dir_workspace_root] to store any temporary indexing/database structures. (You can omit [dir_workspace_root] if you do not need it at all.)

Per-task input specification

If the input query files are onset files (for subtask 1), then the format of <query_file_list> is like this:

qbtQuery/query_00001.onset   00001.mid
qbtQuery/query_00002.onset   00001.mid
qbtQuery/query_00003.onset   00002.mid
...

(Please refer to the readme.txt of the downloaded MIR-QBT corpus for the format of onset files.)

If the input query files are wave files (for subtask 2), then the format of <query_file_list> is like this:

qbtQuery/query_00001.wav   00001.mid
qbtQuery/query_00002.wav   00001.mid
qbtQuery/query_00003.wav   00002.mid
...

If the input query files are 2-dimensional onset files (for subtask 3), then the format of <query_file_list> is like this:

qbtQuery/query_00001.onset   00001.mid
qbtQuery/query_00002.onset   00001.mid
qbtQuery/query_00003.onset   00002.mid
...
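
All three query list formats above share the same two-column layout, so a single reader can serve every subtask. The sketch below assumes that the two columns are separated by whitespace and that paths contain no spaces; the function name `parse_query_list` is hypothetical.

```python
import os


def parse_query_list(lines):
    """Parse <query_file_list> lines into (query_path, ground_truth_key) pairs.

    Assumes two whitespace-separated columns per line and no spaces in paths.
    """
    pairs = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        query_path, midi_name = parts
        # "00001.mid" -> "00001", matching the key style used in result files.
        key = os.path.splitext(midi_name)[0]
        pairs.append((query_path, key))
    return pairs


example = [
    "qbtQuery/query_00001.onset   00001.mid",
    "qbtQuery/query_00002.onset   00001.mid",
    "qbtQuery/query_00003.onset   00002.mid",
]
for query_path, key in parse_query_list(example):
    print(query_path, key)
```

Several queries may map to the same ground-truth MIDI file, as in the first two lines, so the pairs are kept in a list rather than keyed by MIDI name.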

Step 2: Testing

The command format should be like this:

qbtProgram <query_file_list> <resultFile> [dir_workspace_root]

Where <query_file_list> is the list of input queries, and <resultFile> is the name of the file in which your program should store its results. You can use [dir_workspace_root] to store any temporary indexing/database structures. (You can omit [dir_workspace_root] if you do not need it at all.)

<resultFile> gives the ranked top-10 candidates for each query (note that ranking of the candidates is new for 2014). For instance, <resultFile> should have the following format for subtasks 1 and 3:

qbtQuery/query_00001.onset: 00025 01003 02200 ... 
qbtQuery/query_00002.onset: 01547 02313 07653 ... 
qbtQuery/query_00003.onset: 03142 00320 00973 ... 
...

And for subtask 2:

qbtQuery/query_00001.wav: 00025 01003 02200 ... 
qbtQuery/query_00002.wav: 01547 02313 07653 ... 
qbtQuery/query_00003.wav: 03142 00320 00973 ... 
...

Where 00025 is the top-ranked MIDI file for query_00001, followed by 01003, 02200, etc. Note that the output should be the names of the MIDI files (e.g., 00025 means 00025.mid); they are not necessarily 5-digit numbers.
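
A minimal sketch of writing and reading one result line in the format shown above. The helper names are hypothetical, and the exact separator (a colon followed by single spaces) is inferred from the examples rather than from a formal specification.

```python
def strip_mid(name):
    """Drop a trailing .mid extension, so '00025.mid' becomes '00025'."""
    return name[:-4] if name.lower().endswith(".mid") else name


def format_result_line(query_path, ranked_candidates, top_n=10):
    """Build '<query_path>: key1 key2 ...' from candidates in ranked order."""
    keys = [strip_mid(c) for c in ranked_candidates[:top_n]]
    return f"{query_path}: {' '.join(keys)}"


def parse_result_line(line):
    """Split a result line back into (query_path, [key, ...])."""
    # Split on the last colon so a colon inside the path would not break parsing.
    query_path, sep, rest = line.rpartition(":")
    if not sep:
        raise ValueError(f"no ':' separator in result line: {line!r}")
    return query_path, rest.split()


line = format_result_line("qbtQuery/query_00001.onset", ["00025.mid", "01003", "02200"])
print(line)  # qbtQuery/query_00001.onset: 00025 01003 02200
```

Keys are treated as opaque strings rather than integers, since the names are not necessarily 5-digit numbers and leading zeros must be preserved.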

Potential Participants

Jorge Herrera, Hyung-Suk Kim, and Blair Kaneshiro, CCRMA, jorgeh at ccrma dot stanford dot edu

References

Chen JCC, and Chen ALP (1998). Query by rhythm: An approach for song retrieval in music databases. Research Issues in Data Engineering, Proceedings of IEEE Eighth International Workshop on Continuous-Media Databases and Applications, 139-146.

Eisenberg G, Batke JM, and Sikora T (2004). BeatBank - an MPEG-7 compliant query by tapping system. Audio Engineering Society Convention 116, paper 6136.

Eisenberg G, Batke JM, and Sikora T (2004). Efficiently computable similarity measures for query by tapping systems. Proceedings of the Seventh International Conference on Digital Audio Effects (DAFx'04), Naples, Italy, 189-192.

Hanna P, and Robine M (2009). Query by tapping system based on alignment algorithm. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1881-1884.

Hébert S, and Peretz I (1997). Recognition of music in long-term memory: Are melodic and temporal patterns equal partners? Memory & Cognition 25:4, 518-533.

Jang JSR, Lee HR, and Yeh CH (2001). Query by tapping: A new paradigm for content-based music retrieval from acoustic input. Advances in Multimedia Information Processing PCM, 590-597.

Kaneshiro B, Kim HS, Herrera J, Oh J, Berger J, and Slaney M (2013). QBT-extended: An annotated dataset of melodically contoured tapped queries. Proceedings of the 14th International Society for Music Information Retrieval Conference, Curitiba, Brazil, 329-334.

Peters G, Anthony C, and Schwartz M (2005). Song search and retrieval by tapping. Proceedings of the National Conference on Artificial Intelligence 20, 1696.

Peters G, Cukierman D, Anthony C, and Schwartz M (2006). Online music search by tapping. Ambient Intelligence in Everyday Life, 178-197.