2010:Audio Key Detection

From MIREX Wiki
Revision as of 13:23, 26 May 2010 by AndreasEhmann

Description

Determination of the key is a prerequisite for any analysis of tonal music. As a result, extensive work has been done in the area of automatic key detection. The goal of this task is the identification of the key from music in audio format.


System Specs

Input: Call to individual .wav or .mid files, or an ASCII file list of all files (with full paths).

Ground-truth: One ground-truth file per .wav file, in ASCII tab delimited format:

<pitch (e.g. Ab, A, A#, Bb, B …, G#)>\t<major or minor>\n
where the < and > characters are not included and \t denotes a tab and \n denotes a new line.
Note: The framework is aware of the equivalence of certain notes and will handle the mapping internally.

Output: One output file per .wav file, in ASCII tab delimited format:

<pitch (e.g. Ab, A, A#, Bb, B …, G#)>\t<major or minor>\n
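The line format above can be illustrated with a minimal Python sketch. The function names `format_key_line` and `parse_key_line` are hypothetical helpers introduced here for illustration; they are not part of any MIREX tooling.

```python
from typing import Tuple

VALID_MODES = ("major", "minor")


def format_key_line(pitch: str, mode: str) -> str:
    """Build one result line: pitch, a tab, the mode, and a newline."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be 'major' or 'minor', got {mode!r}")
    return f"{pitch}\t{mode}\n"


def parse_key_line(line: str) -> Tuple[str, str]:
    """Split one tab-delimited line back into (pitch, mode)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        raise ValueError(f"expected exactly two tab-separated fields: {line!r}")
    pitch, mode = fields[0].strip(), fields[1].strip()
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be 'major' or 'minor', got {mode!r}")
    return pitch, mode
```

A submission would write the string returned by `format_key_line` to the output file for each input excerpt; the same parser can read both ground-truth and output files, since the two share one format.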

Audio: Single-channel (mono) PCM excerpts, 16-bit, 44100 Hz, synthesized from MIDI

MIDI: Excerpts of MIDI files


Evaluation Procedures

Test Set: The test set we propose to use will consist of pieces for which the keys are known. For example, symphonies and concertos by well-known composers often have the keys stated in the title of the piece. The excerpts will typically be the beginnings of the pieces, as this is the one part of a piece where the known global key can be expected to be established. Different excerpt durations will be considered: 30 seconds, 20 seconds and 10 seconds.

Input/Output: The input to the system should be some musical excerpt (either audio or MIDI) and the output should be a key name, for example C major or E flat minor. Only pitch class numbers will be taken into account during evaluation; for instance, C sharp major and D flat major will be considered equivalent.
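The enharmonic equivalence described above can be sketched as a mapping from note names to pitch class numbers (0 to 11, with C as 0). This is only an illustration of the idea, not the evaluation framework's actual code; `pitch_class` is a hypothetical helper, and the `#`/`b` spelling of accidentals is assumed from the file format.

```python
# Pitch classes of the natural notes, counted in semitones above C.
NATURAL_PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_class(name: str) -> int:
    """Map a note name such as 'C#', 'Db' or 'A' to a pitch class in 0..11."""
    name = name.strip()
    if not name:
        raise ValueError("empty note name")
    letter, accidentals = name[0].upper(), name[1:]
    if letter not in NATURAL_PITCH_CLASSES:
        raise ValueError(f"unknown note letter in {name!r}")
    pc = NATURAL_PITCH_CLASSES[letter]
    for symbol in accidentals:
        if symbol == "#":
            pc += 1  # a sharp raises the note by one semitone
        elif symbol == "b":
            pc -= 1  # a flat lowers the note by one semitone
        else:
            raise ValueError(f"unknown accidental {symbol!r} in {name!r}")
    return pc % 12
```

With this mapping, `pitch_class("C#")` and `pitch_class("Db")` return the same number, so the two spellings compare as equal.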

System Calibration: The test set will be randomly split into training and test data. Training data will be provided to the participants so that they can determine the optimal parameter settings for their algorithms.
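A random split of this kind might look like the following sketch. The split proportion and the seed are not stated in the task description, so `train_fraction` and `seed` are assumptions, and `split_collection` is a hypothetical helper.

```python
import random
from typing import List, Sequence, Tuple


def split_collection(
    paths: Sequence[str], train_fraction: float = 0.5, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle the file list reproducibly and cut it into (training, test)."""
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must lie between 0 and 1")
    shuffled = list(paths)
    # A dedicated Random instance keeps the split repeatable for a given seed
    # (train_fraction and seed are assumed values, not taken from the task).
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]
```

Fixing the seed means that every participant and the organizers can reproduce the same partition from the same file list.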

Evaluation: The error analysis will center on comparing the key identified by the algorithm to the actual key of the piece. The key of the piece is the one defined by the composer in the title of the piece. We will then determine how ‘close’ each identified key is to the corresponding correct key. Keys will be considered as ‘close’ if they have one of the following relationships: distance of perfect fifth, relative major and minor, and parallel major and minor. A correct key assignment will be given a full point, and incorrect assignments will be allocated fractions of a point according to the following table:

Relation to correct key    Points
Same                       1
Perfect fifth              0.5
Relative major/minor       0.3
Parallel major/minor       0.2
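The scoring table can be expressed as a small function, sketched below in Python. Two points are assumptions rather than statements from the task description: a perfect fifth is counted in either direction (up or down) and only between keys of the same mode, and keys are passed in as pitch class numbers (0 to 11, C as 0) plus a mode string. `key_score` is a hypothetical name.

```python
def key_score(est_pc: int, est_mode: str, ref_pc: int, ref_mode: str) -> float:
    """Score an estimated key against the reference key using the table above."""
    # Semitones from the reference tonic up to the estimated tonic.
    interval = (est_pc - ref_pc) % 12
    if est_mode == ref_mode:
        if interval == 0:
            return 1.0  # same key
        # ASSUMPTION: a fifth up (7) or down (5), same mode, both count.
        if interval in (5, 7):
            return 0.5
        return 0.0
    if interval == 0:
        return 0.2  # parallel major/minor: same tonic, different mode
    # Relative keys: the minor tonic lies 3 semitones below the major tonic.
    if ref_mode == "major" and interval == 9:
        return 0.3
    if ref_mode == "minor" and interval == 3:
        return 0.3
    return 0.0
```

For example, with C major as the reference, G major scores 0.5, A minor scores 0.3, C minor scores 0.2, and D major scores 0.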

Comments: Many excellent suggestions were made in the review process. Some of the ideas included: using actual audio files from recordings for the audio portion of the contest, employing other metrics used in information retrieval literature, using test data from a wider variety of genres, and considering the detection of key modulations.

As this is a first attempt at evaluating key-finding across different systems employing a variety of algorithm combinations, we have opted to keep the evaluation procedure as simple and streamlined as possible. The results of this contest will lay the groundwork from which we can expand the techniques for key-finding evaluation.

Relevant Test Collections

Symbolic Data: The dataset contains 500 classical music MIDI files selected from the Classical Music Archives (http://www.classicalarchives.com) and labelled with the key stated in their title.

Examples of pieces include, but are not limited to, the following:

Pieces from the Baroque period: Bach (http://www.classicalarchives.com/bach.html) – Keyboard Works, Chamber Works, and Orchestral Works. Vivaldi (http://www.classicalarchives.com/vivaldi.html) – Concerti and Chamber Works.

Pieces from the Classical period: Handel (http://www.classicalarchives.com/handel.html) – Orchestral Works, Keyboard Works, and Chamber Works. Haydn (http://www.classicalarchives.com/haydn.html) – Keyboard Works, Chamber Works, and Orchestral Works. Mozart (http://www.classicalarchives.com/mozart.html) – Keyboard Works, Symphonies and Concertos, and Chamber Works. Early Beethoven (http://www.classicalarchives.com/beethovn.html) – Piano Works, Symphonies, Concertos, and Chamber Works.

Pieces from the Romantic period: Late Beethoven (http://www.classicalarchives.com/beethovn.html) – Piano Works, Symphonies, Concertos, and Chamber Works. Brahms (http://www.classicalarchives.com/brahms.html) – Keyboard Works, Chamber Works, Concertos and Orchestral Works. Chopin (http://www.classicalarchives.com/chopin.html) – Piano Works.

Audio Data: The dataset contains the same pieces synthesized from MIDI to CD-quality (16-bit, 44100 Hz, mono) WAV files using various software MIDI synthesizers (Winamp, Cakewalk, etc.). The synthesizer for each piece was selected randomly.

By using the same data for both the symbolic and audio key-finding methods, we will be able to evaluate and compare both approaches. It should be noted that even though synthesized MIDI is a simple alternative to actual audio, it is an appropriate approach for an evaluation where we are considering both audio and symbolic algorithms. Also, this controlled method eliminates possible tuning issues that are sometimes present in recorded audio.