2005:Symbolic Key Finding
Arpi Mardirossian, Ching-Hua Chuan and Elaine Chew (University of Southern California) email@example.com
Evaluation of Key Finding Algorithms
Determination of the key is a prerequisite for any analysis of tonal music. As a result, extensive work has been done in the area of automatic key detection. However, among this plethora of key finding algorithms, what seems to be lacking is a formal and extensive evaluation process. We propose the evaluation of key-finding algorithms at the 2005 MIREX. There are significant contributions in the area of key finding for both audio and symbolic representations. Thus the same contest was also proposed for audio data.
- Olli Yli-Harja (firstname.lastname@example.org), Ilya Schmulevich (email@example.com), and Kjell Lemström (firstname.lastname@example.org): [high].
- Tuomas Eerola (email@example.com) and Petri Toiviainen (firstname.lastname@example.org): [high].
- Arpi Mardirossian (email@example.com) and Elaine Chew (firstname.lastname@example.org): [high].
- Craig Sapp (email@example.com): [moderate].
- David Temperley (firstname.lastname@example.org): [unknown].
The following evaluation outline is a general guideline that will be compatible with both audio and symbolic key finding algorithms. It is safe to assume that each key finding algorithm will have its own set of parameters. The creators of each system should pre-determine the optimal settings for these parameters. Once these settings are determined, an accuracy rate may be calculated. The input of the test should be an excerpt of each piece in the test set, and the output will be the key name, for example, C major or E flat minor. We plan to use pieces for which the keys are known, for example, symphonies and concertos by well-known composers where the key is stated in the title of the piece. The excerpt will typically be the beginning of the piece, as this is the only part of the piece in which the global, known key is guaranteed to be established.
The error analysis will center on comparing the key identified by the algorithm to the actual key of the piece. We will then determine how 'close' each identified key is to the corresponding correct key. Keys will be considered 'close' if they have one of the following relationships: distance of a perfect fifth, relative major and minor, or parallel major and minor. It can be assumed that an algorithm that returns a key closely related to the actual key is superior to one that returns an unrelated key. We may then use this information to generate further metrics.
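The closeness relationships listed above can be sketched in code. The following is a minimal illustration, not part of the proposal: the function names `parse_key` and `relationship`, the key-name format ("C major", "E flat minor"), and the decision to count a perfect fifth in either direction and only between keys of the same mode are all assumptions made here.

```python
# Hypothetical sketch: classify how an estimated key relates to the actual key.
# Keys are reduced to (tonic pitch class 0-11, mode); enharmonic spellings
# therefore collapse to the same key, which is an assumption of this sketch.

PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"sharp": 1, "flat": -1}


def parse_key(name):
    """Parse a name such as 'C major' or 'E flat minor' into (tonic, mode)."""
    parts = name.split()
    letter, mode = parts[0].upper(), parts[-1].lower()
    if letter not in PITCH_CLASSES or mode not in ("major", "minor"):
        raise ValueError(f"unrecognised key name: {name!r}")
    shift = sum(ACCIDENTALS[p.lower()] for p in parts[1:-1])
    return (PITCH_CLASSES[letter] + shift) % 12, mode


def relationship(estimated, actual):
    """Return 'correct', 'perfect fifth', 'relative', 'parallel' or 'unrelated'."""
    est_tonic, est_mode = parse_key(estimated)
    act_tonic, act_mode = parse_key(actual)
    # Semitones from the actual tonic up to the estimated tonic.
    interval = (est_tonic - act_tonic) % 12
    if est_mode == act_mode:
        if interval == 0:
            return "correct"
        # Assumption: a fifth above (7) or below (5), same mode only.
        if interval in (5, 7):
            return "perfect fifth"
        return "unrelated"
    if interval == 0:
        return "parallel"
    # The relative minor lies 3 semitones below (9 above) its relative major.
    if act_mode == "major" and interval == 9:
        return "relative"
    if act_mode == "minor" and interval == 3:
        return "relative"
    return "unrelated"
```

For example, `relationship("A minor", "C major")` gives `"relative"` and `relationship("G major", "C major")` gives `"perfect fifth"`. Counting how often each label occurs over the test set is one way such information could feed the further metrics mentioned above.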
Clearly, the optimal parameters may vary for different styles of music, and by composer. If time permits and the systems allow, we may next focus on pieces for which the algorithm has identified an incorrect key under the optimal settings of the parameters and determine whether the incorrect assignments were due to improper parameter selection. We may then calculate the percentage of pieces that had an incorrect assignment under the optimal settings but have a correct assignment with other settings.
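The percentage described above could be computed along the following lines. This is a sketch under stated assumptions: the function name `recovered_percent` and the dictionary-based inputs are hypothetical, and the denominator is taken to be the whole test set, which the text does not specify (dividing by the number of misclassified pieces would be an equally plausible reading).

```python
# Hypothetical sketch: share of pieces that are wrong under the optimal
# parameter settings but correct under at least one other setting.

def recovered_percent(actual, optimal, alternatives):
    """
    actual:       dict piece -> true key name
    optimal:      dict piece -> key returned under the optimal settings
    alternatives: dict piece -> list of keys returned under other settings
    Returns a percentage of ALL pieces in `actual` (an assumption).
    """
    if not actual:
        raise ValueError("empty test set")
    recovered = 0
    for piece, true_key in actual.items():
        if optimal[piece] == true_key:
            continue  # already correct under the optimal settings
        if any(pred == true_key for pred in alternatives.get(piece, [])):
            recovered += 1
    return 100.0 * recovered / len(actual)
```

A high value would suggest that a single global parameter setting is a poor fit for part of the test set, for instance for a particular style or composer.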
Relevant Test Collections
MIDI Collections: MIDI data are a symbolic representation of music. They provide a numeric representation of the pitch, onset/offset time, and velocity of every event in a musical piece. The Classical Archives website (http://www.classicalarchives.com) provides more than thirty thousand full-length classical music files by more than two thousand composers in MIDI format. All the files are presented with full name and composer. Also, most of the files state the key clearly. Music by different composers may be used to test the range of the algorithm. Multiple versions of a piece may be used to test the algorithms' robustness to various arrangements of instruments.
Score-based Collections: Score-based data are also symbolic representations of music. In addition to numeric event information, they provide further pitch and time structure information such as contextually correct note names, and key and time signatures. MuseData (http://www.musedata.org), for example, provides access to such a score-based collection.
The proposals contemplate two different evaluations for key estimation: one for MIDI and another for audio data. Maybe these two proposals could be merged into a single one. At least part of the data could be shared between them, by having a test collection that includes audio data and its MIDI representation, or a MIDI representation and the audio generated from it by a MIDI synthesizer. This way, we could evaluate and compare approaches dealing with MIDI and audio.
Will there be some training data, so that participants can try their algorithms?
I cannot tell whether the suggested participants are willing to participate. Another potential candidate could be: Hendrik Purwins