2014:Singing Voice Separation
Revision as of 21:16, 25 August 2014
Description
The singing voice separation task solicits competing entries to blindly separate the singer's voice from pop music recordings. The entries are evaluated using standard metrics (see Evaluation below).
Task-specific mailing list
All discussions take place on the MIREX "EvalFest" list. If you have a question or comment, simply include the task name in the subject heading.
Data
A collection of 100 clips of recorded pop music (vocals plus music) is used to evaluate the singing voice separation algorithms.
Collection statistics:
- Size of collection: 100 clips
- Audio details: 16-bit, mono, 22.05 kHz, WAV
- Duration of each clip: 30 seconds
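The audio details above can be checked on a local test clip before submitting. The following is a minimal sketch using Python's standard `wave` module; the function names `describe_clip` and `matches_collection_format` are hypothetical helpers introduced here for illustration and are not part of the task infrastructure.

```python
import wave

# Format stated in the collection statistics: 16-bit, mono, 22.05 kHz.
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_SAMPLE_RATE_HZ = 22050


def describe_clip(path):
    """Read the header of a WAV file and return its basic properties."""
    with wave.open(path, "rb") as clip:
        return {
            "channels": clip.getnchannels(),
            "sample_width_bytes": clip.getsampwidth(),
            "sample_rate_hz": clip.getframerate(),
            "duration_s": clip.getnframes() / clip.getframerate(),
        }


def matches_collection_format(info):
    """Return True if the clip has the channel count, bit depth and rate listed above."""
    return (
        info["channels"] == EXPECTED_CHANNELS
        and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH_BYTES
        and info["sample_rate_hz"] == EXPECTED_SAMPLE_RATE_HZ
    )
```

The duration is reported rather than enforced, since the sketch makes no assumption about how exactly the 30-second clips are trimmed.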
Evaluation
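For evaluation we use Vincent et al.'s (2006) Source to Distortion Ratio

<math>{\rm SDR}=10\log_{10}\frac{\|s_{\rm target}\|^2}{\|e_{\rm interf}+e_{\rm noise}+e_{\rm artif}\|^2},</math>

Source to Interferences Ratio

<math>{\rm SIR}=10\log_{10}\frac{\|s_{\rm target}\|^2}{\|e_{\rm interf}\|^2},</math>

and Sources to Artifacts Ratio

<math>{\rm SAR}=10\log_{10}\frac{\|s_{\rm target}+e_{\rm interf}+e_{\rm noise}\|^2}{\|e_{\rm artif}\|^2},</math>

as implemented in BSS Eval (http://bass-db.gforge.inria.fr/bss_eval/). We rank the entries according to <math>\frac{{\rm SDR}+{\rm SIR}+{\rm SAR}}{3}.</math>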
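To make the arithmetic concrete, the sketch below computes the three ratios in dB and the ranking score (the mean of SDR, SIR and SAR). It assumes that the estimated source has already been decomposed into a target component and interference, noise and artifact error terms; in BSS Eval that decomposition is computed by the toolbox itself and is not reproduced here. All function names are hypothetical, and the signals are plain Python lists of samples.

```python
import math


def energy(signal):
    """Squared L2 norm of a sequence of samples."""
    return sum(x * x for x in signal)


def add(*signals):
    """Sample-wise sum of equally long signals."""
    return [sum(samples) for samples in zip(*signals)]


def ratio_db(numerator, denominator):
    """Energy ratio in dB; assumes both signals have non-zero energy."""
    return 10.0 * math.log10(energy(numerator) / energy(denominator))


def bss_metrics(s_target, e_interf, e_noise, e_artif):
    """Return (SDR, SIR, SAR) in dB from pre-computed decomposition terms."""
    sdr = ratio_db(s_target, add(e_interf, e_noise, e_artif))
    sir = ratio_db(s_target, e_interf)
    sar = ratio_db(add(s_target, e_interf, e_noise), e_artif)
    return sdr, sir, sar


def ranking_score(sdr, sir, sar):
    """Mean of the three ratios, the quantity used to rank entries."""
    return (sdr + sir + sar) / 3.0


# Toy example with made-up four-sample signals.
sdr, sir, sar = bss_metrics(
    [1.0, 0.0, 0.0, 0.0],  # s_target
    [0.0, 0.1, 0.0, 0.0],  # e_interf
    [0.0, 0.0, 0.1, 0.0],  # e_noise
    [0.0, 0.0, 0.0, 0.1],  # e_artif
)
score = ranking_score(sdr, sir, sar)
```

Because the score is an unweighted mean of dB values, a gain of 3 dB in any one ratio raises the ranking score by 1 dB.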
Submission format
Participants are required to submit an entry that takes a filename (in the form of *.wav) as its only argument. Each entry must write the separated voice and the separated music to *-voice.wav and *-music.wav, respectively.
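As an illustration of this calling convention, here is a minimal skeleton of an entry in Python. It assumes that `*` stands for the input filename without its `.wav` extension, so that `clip.wav` yields `clip-voice.wav` and `clip-music.wav` in the same directory; that reading is an assumption, not something the task description states. The `separate` function is a hypothetical placeholder that performs no real separation.

```python
import sys
import wave
from pathlib import Path


def output_paths(input_path):
    """Derive the two output paths from the input path (assumed naming)."""
    p = Path(input_path)
    return p.with_name(p.stem + "-voice.wav"), p.with_name(p.stem + "-music.wav")


def separate(frames):
    """Hypothetical placeholder: returns the mixture as 'voice' and silence as 'music'.

    A real entry would replace this with its separation algorithm.
    """
    return frames, bytes(len(frames))


def main(argv):
    """Entry point: expects the input WAV filename as the only argument."""
    if len(argv) != 2:
        print("usage: separate.py <clip.wav>", file=sys.stderr)
        return 1
    voice_path, music_path = output_paths(argv[1])
    with wave.open(argv[1], "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    voice, music = separate(frames)
    # Write both outputs with the same channel count, bit depth and rate as the input.
    for path, data in ((voice_path, voice), (music_path, music)):
        with wave.open(str(path), "wb") as dst:
            dst.setparams(params)
            dst.writeframes(data)
    return 0
```

When saved as a script, the file would end with a call such as `sys.exit(main(sys.argv))` so that it can be run as `python separate.py clip.wav`; the script name is likewise only an example.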
Packaging submissions
All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed).
All submissions should include a README file containing the following information:
- Command line calling format for all executables and an example formatted set of commands
- Number of threads/cores used or whether this should be specified on the command line
- Expected memory footprint
- Expected runtime
- Any required environments (and versions), e.g. python, java, bash, matlab.
Time and hardware limits
Due to the potentially high number of participants in this and other audio tasks, hard limits on the runtime of submissions are specified.
A hard limit of 24 hours will be imposed on runs. Submissions that exceed this runtime may not receive a result.
Potential Participants
name / email