## Description

The singing voice separation task solicits competing entries to blindly separate the singer's voice from pop music recordings. The entries are evaluated using standard metrics (see Evaluation below).

## Data

A collection of 100 clips of recorded pop music (vocals plus music) is used to evaluate the singing voice separation algorithms.

Collection statistics:

1. Size of collection: 100 clips
2. Audio details: 16-bit, mono, 44.1kHz, WAV
3. Duration of each clip: 30 seconds

## Evaluation

For evaluation we use Vincent et al.'s (2012) Source to Distortion Ratio (SDR), Source to Interferences Ratio (SIR), and Sources to Artifacts Ratio (SAR), as implemented by bss_eval_sources.m in BSS Eval Version 3.0. More specifically, their function will be invoked as follows:

>> trueVoice = wavread('trueVoice.wav');
>> trueKaraoke = wavread('trueKaraoke.wav');
>> trueMixed = trueVoice + trueKaraoke;
>> [estimatedVoice, estimatedKaraoke] = wrapper_function_calling_your_separation_algorithm(trueMixed);
>> [SDR, SIR, SAR] = bss_eval_sources([estimatedVoice estimatedKaraoke]', [trueVoice trueKaraoke]')

SDR =
-2.7443
2.7514

SIR =
-0.5416
9.0144

SAR =
4.5486
4.4363


The final scores are the averages of the per-clip scores over all 100 clips:

${\displaystyle GSDR={\frac {\sum _{i=1}^{100}SDR_{i}}{100}}}$,

${\displaystyle GSIR={\frac {\sum _{i=1}^{100}SIR_{i}}{100}}}$,

${\displaystyle GSAR={\frac {\sum _{i=1}^{100}SAR_{i}}{100}}}$.
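The averaging above can be sketched in a few lines of Python. This is only an illustration, not the organizers' scoring script: the function name `global_scores` is hypothetical, and it assumes one (SDR, SIR, SAR) triple per clip, since the formulas do not say which source row (voice or music) is averaged.

```python
from statistics import fmean


def global_scores(per_clip):
    """Average per-clip (SDR, SIR, SAR) triples into (GSDR, GSIR, GSAR).

    `per_clip` is a list of 3-tuples in dB, one per clip (assumed layout).
    """
    if not per_clip:
        raise ValueError("need at least one clip")
    sdr, sir, sar = zip(*per_clip)  # regroup the triples by metric
    return fmean(sdr), fmean(sir), fmean(sar)


# Made-up values for two clips, for illustration only.
scores = [(2.0, 8.0, 4.0), (4.0, 10.0, 6.0)]
gsdr, gsir, gsar = global_scores(scores)
print(gsdr, gsir, gsar)  # 3.0 9.0 5.0
```

With 100 clips the list would simply hold 100 triples; the divisor in the formulas is the list length.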

## Submission format

Participants are required to submit an entry that takes a filename (of the form *.wav) as its only argument. The entry must write the separated voice to *-voice.wav and the separated accompaniment to *-music.wav, where * is the input filename without its extension.
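As a rough illustration of this calling convention, the Python skeleton below reads one WAV file and writes the two output files next to it. It is a sketch under assumptions: the names `output_paths`, `separate`, and `main` are hypothetical, and `separate` is a placeholder that copies the mixture unchanged rather than performing any separation.

```python
import sys
import wave
from pathlib import Path


def output_paths(input_path):
    """Map 'clip.wav' to ('clip-voice.wav', 'clip-music.wav') in the same folder."""
    p = Path(input_path)
    return p.with_name(p.stem + "-voice.wav"), p.with_name(p.stem + "-music.wav")


def separate(frames):
    """Hypothetical placeholder: returns the mixture unchanged for both outputs."""
    return frames, frames


def main(argv):
    # The input filename is the only argument.
    if len(argv) != 2:
        print("usage: separate.py INPUT.wav", file=sys.stderr)
        return 2
    with wave.open(argv[1], "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    voice, music = separate(frames)
    # Write both outputs with the same format as the input.
    for path, data in zip(output_paths(argv[1]), (voice, music)):
        with wave.open(str(path), "wb") as dst:
            dst.setparams(params)
            dst.writeframes(data)
    return 0
```

A script built on this sketch would call `sys.exit(main(sys.argv))` at its entry point, so that running it on `clip.wav` produces `clip-voice.wav` and `clip-music.wav`.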

## Packaging submissions

All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed).

All submissions should include a README file containing the following information:

1. Command line calling format for all executables and an example formatted set of commands
2. Number of threads/cores used or whether this should be specified on the command line
3. Expected memory footprint
4. Expected runtime
5. Any required environments (and versions), e.g. Python, Java, Bash, MATLAB.

## Time and hardware limits

Due to the potentially high number of participants in this and other audio tasks, hard limits on the runtime of submissions are specified.

A hard limit of 24 hours will be imposed on runs. Submissions that exceed this runtime may not receive a result.

