2009:Audio Melody Extraction

From MIREX Wiki
[This page is for now largely a copy/paste of the MIREX 2006 page: [https://www.music-ir.org/mirex/2006/index.php/Audio_Melody_Extraction Audio_Melody_Extraction]]
=Goal=
 
To extract the melody line from polyphonic audio.
 
  
The deadline for this task is TBA.
=Description=

The text of this section is copied from the 2008 page. Please add your comments and discussions for 2009.
The aim of the MIREX audio melody extraction evaluation is to identify the melody pitch contour from polyphonic musical audio. The task consists of two parts: voicing detection (deciding whether a particular time frame contains a "melody pitch" or not) and pitch detection (deciding the most likely melody pitch for each time frame). The submission is structured so that these two parts can be done independently, i.e. it is possible (via a negative pitch value) to guess a pitch even for frames judged unvoiced. Algorithms that do not discriminate between melodic and non-melodic parts are also welcome!
 
(The audio melody extraction evaluation will essentially be a re-run of last year's contest, i.e. the same test data is used.)
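The two sub-tasks and the negative-pitch convention described above can be illustrated with a small per-frame scoring sketch. This is not the official MIREX evaluation code, and the 50 cent (half-semitone) pitch tolerance is an assumption, not something this page specifies:

```python
# Minimal per-frame scoring sketch (not the official MIREX code).
import math

def cents(f_est, f_ref):
    """Distance between two frequencies in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f_est / f_ref)

def score(est, ref, tol_cents=50.0):
    """Score per-frame estimates against a reference on the same time grid.

    In ref, 0 marks an unvoiced frame.  In est, a negative value means
    "frame judged unvoiced, but abs(value) is my best pitch guess" --
    the convention described above.  The 50 cent tolerance is an assumption.
    """
    voiced_hits = pitch_hits = ref_voiced = 0
    for e, r in zip(est, ref):
        if r <= 0:
            continue                      # unvoiced reference frame
        ref_voiced += 1
        if e > 0:
            voiced_hits += 1              # voicing detection hit
        if e != 0 and abs(cents(abs(e), r)) <= tol_cents:
            pitch_hits += 1               # raw pitch hit (voicing ignored)
    n = max(ref_voiced, 1)
    return {"voicing_recall": voiced_hits / n,
            "raw_pitch_accuracy": pitch_hits / n}
```

Note how a frame estimated as -220.0 counts toward raw pitch accuracy (its absolute value is compared to the reference) while still being treated as unvoiced for the voicing decision.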
  
== Discussions for 2009 ==

Your comments here.

== Dataset ==
 
* MIREX05 database: 25 phrase excerpts of 10-40 sec from the following genres: Rock, R&B, Pop, Jazz, Solo classical piano
* ISMIR04 database: 20 excerpts of about 20 s each
* CD-quality audio (PCM, 16-bit, 44100 Hz)
* single channel (mono)
* manually annotated reference data (10 ms time grid)
  
== Output Format ==
 
* In order to allow for generalization among potential approaches (i.e. frame size, hop size, etc.), submitted algorithms should output pitch estimates, in Hz, at discrete instants in time
* each line of the output file therefore contains a time stamp, followed by a space or tab, followed by the corresponding frequency value
  
== Relevant Test Collections ==
 
* For the ISMIR 2004 Audio Description Contest, the Music Technology Group of the Pompeu Fabra University assembled a diverse set of audio segments and corresponding melody transcriptions, including audio excerpts from such genres as Rock, R&B, Pop, Jazz, Opera, and MIDI. (full test set with the reference transcriptions, 28.6 MB)
* Graham's collection: the test set and further explanations can be found at http://www.ee.columbia.edu/~graham/mirex_melody/ and http://labrosa.ee.columbia.edu/projects/melody/
  
==Potential Participants==
 
* Vishweshwara Rao & Preeti Rao (Indian Institute of Technology Bombay, India)
 

Revision as of 18:08, 1 May 2009
