2005:Symbolic Genre Class

From MIREX Wiki
Revision as of 20:59, 1 February 2005 (Relevant Test Collections)


Cory McKay (McGill University) cory.mckay@mail.mcgill.ca


Genre Classification of MIDI Files


Submitted software will automatically classify MIDI recordings into genre categories.

1) Genre Categories: The genre categories will be organized hierarchically so that entries can be evaluated on both coarse and fine classification. The evaluation committee will determine the particular categories. An individual recording may belong to more than one category, as this is more realistic than requiring each recording to belong to exactly one. A total of three to five coarse categories and ten to fifteen fine categories will be used. Model classifications will be made by the evaluation committee or one of its sub-committees. Entrants will be given the selection and organization of categories so that they can configure their software accordingly before submission.
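As a rough illustration of the hierarchical organization described above, the sketch below maps fine categories to their coarse parents. The genre names and the helper `coarse_labels` are hypothetical placeholders; the actual categories were to be chosen by the evaluation committee.

```python
# Hypothetical taxonomy for illustration only; the real categories
# were to be determined by the evaluation committee.
TAXONOMY = {
    "Classical": ["Baroque", "Romantic"],
    "Jazz": ["Bebop", "Swing"],
    "Popular": ["Rock", "Country"],
}

# Invert the taxonomy so each fine category points to its coarse parent.
FINE_TO_COARSE = {
    fine: coarse for coarse, fines in TAXONOMY.items() for fine in fines
}


def coarse_labels(fine_labels):
    """Return the set of coarse categories implied by a set of fine labels."""
    return {FINE_TO_COARSE[fine] for fine in fine_labels}


# A recording may carry several fine labels, hence several coarse ones.
print(coarse_labels({"Bebop", "Rock"}))
```

With a mapping like this, a single set of fine-level predictions can be scored at both the fine and the coarse level.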

2) Training and Testing Recordings: Training and testing recordings will be chosen by the evaluation committee and kept confidential until evaluations are complete. The test recordings will then be released, copyrights permitting.

3) Input Data: Training will be performed by passing the software, as a command-line argument, a text file listing the training MIDI file paths and their model genre(s). Testing will be performed by passing the software, also as a command-line argument, a text file listing the file paths of the test MIDI recordings.

4) Output Data: The software will produce a text file listing the test recording file paths and the genre(s) assigned to each.
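The proposal does not fix a layout for these text files. The following minimal sketch assumes one recording per line, with the file path followed by zero or more genre labels separated by tabs; that layout and the helper names `parse_listing` and `format_listing` are assumptions made for illustration.

```python
def parse_listing(lines):
    """Parse 'path<TAB>genre<TAB>genre...' lines into (path, [genres]) pairs.

    The tab-separated layout is an assumption; the proposal only says
    the file lists paths and genre(s). A test listing has no genres.
    """
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        path, *genres = line.split("\t")
        entries.append((path, genres))
    return entries


def format_listing(entries):
    """Serialize (path, [genres]) pairs back into the same layout."""
    return "".join(
        "\t".join([path, *genres]) + "\n" for path, genres in entries
    )


training = parse_listing(["a.mid\tJazz\tBebop\n", "\n", "b.mid\n"])
print(format_listing(training), end="")
```

The same pair of functions could serve for the training list (paths with model genres), the test list (paths only), and the output file (paths with predicted genres).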

Potential Participants

  • George Tzanetakis (University of Victoria), gtzan@cs.uvic.ca, high likelihood
  • Cory McKay & Ichiro Fujinaga (McGill University), cory.mckay@mail.mcgill.ca, high likelihood
  • Pedro J. Ponce de Leon & Jose M. Inesta (Universidad de Alicante), pierre@dlsi.ua.es, medium likelihood
  • Roberto Basili, Alfredo Serafini & Armando Stellato (University of Rome Tor Vergata), basili@info.uniroma2.it, medium likelihood
  • Man-Kwan Shan & Fang-Fei Kuo (National Cheng Chi University), mkshan@cs.nccu.edu.tw, medium likelihood

Evaluation Procedures

Entries will be evaluated on their success rates with respect to both fine and coarse classifications. Entrants will have the option of enabling their software to output classifications of "unknown," which will be penalized less severely during evaluation than misclassifications, since in a practical context a classification flagged as uncertain is much better than a false one. Evaluation will be performed using 5-fold cross-validation.
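One way the scoring described above might be sketched is shown below. The proposal gives no numeric penalties, so the `unknown_credit` value of 0.5 is purely an assumption, as is the simplification to a single label per recording; `k_fold_indices` and `score` are hypothetical helper names.

```python
def k_fold_indices(n_items, k=5):
    """Split range(n_items) into k folds of near-equal size (round-robin).

    A real evaluation would likely shuffle the recordings first; this
    sketch stays deterministic for clarity.
    """
    return [list(range(i, n_items, k)) for i in range(k)]


def score(predicted, truth, unknown_credit=0.5):
    """Mean credit per recording: 1 for a correct label, partial credit
    for "unknown", 0 for a misclassification.

    unknown_credit is an assumed value; the proposal only states that
    "unknown" is penalized less severely than a wrong answer.
    """
    total = 0.0
    for guess, actual in zip(predicted, truth):
        if guess == actual:
            total += 1.0
        elif guess == "unknown":
            total += unknown_credit
    return total / len(truth)


folds = k_fold_indices(7, k=5)
print(folds)
print(score(["Jazz", "unknown", "Rock", "Rock"],
            ["Jazz", "Rock", "Jazz", "Rock"]))
```

In a full 5-fold run, each fold would serve once as the test set while the remaining four are used for training, and the per-fold scores would be averaged.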

Submissions in C/C++, Java, MATLAB and Python (and other languages?) will be accepted.

Relevant Test Collections

Review 1

The problem is very interesting for MIR, but too vaguely described. The role of the committee is not to propose anything, but to review the proposed evaluation sessions. Thus the author should propose a detailed list of genres and corresponding data.

I'm not against organizing the genres hierarchically and associating several genres with each file, but this raises many issues that are not discussed at all here. If a track belongs to several genres, are these genres equally weighted or not? Are they determined by asking several people to classify each track into one genre, or by asking each person to classify each track into several genres? If there are coarse categories for classical and folk music, where does the fine category of classical music adapted from folk songs belong? I suggest that the contest concentrate on the single-genre problem.

The choice of the genre classes is crucial if the contest is to be held several times. Indeed, existing databases can be reused only when the defined categories are identical each year. Obviously the list of categories should reflect the MIDI music available on the internet. It would help if some data were already labeled according to this list.

The list of relevant data should be developed. How many files are needed for learning and testing? Have the participants already collected some labeled data that they could give to the organizers? How much?

Regarding the release of the data, I think that it would be better not to release anything. The training and test data should always be accessible through the D2K interface, and thus no copyright problem would appear. Is it possible to ensure that the test data are used only for testing and not for learning? Is it possible to implement learning easily in M2K? (Each algorithm may use different structures to store learnt data.)

Finally, the evaluation procedure seems nice, but I don't have any clue whether the proposed participants are really interested.

Review 2

This is an interesting topic, one that I haven't seen much work on. I do not believe that it's difficult to get a large collection of MIDI files. Many are in the public domain, were never intended to be copyrighted, or have copyleft / Creative Commons licences. However, it's still difficult to assemble a reasonable collection of MIDI files of appropriate length which accurately represent a sufficient number of genres. This must be addressed.

A key point is that it requires the Contest Committee to hand-label a large number of MIDI files. We also need to determine what our genres are. Is the Committee capable and willing to do this? I personally would find it very difficult to determine the genre of a MIDI recording which I don't recognize. MIDI all sounds like Muzak to me, unless I know the original audio recording. Has anyone tried MIDI-based genre classification before?

I have no problems with the suggested evaluation and testing procedures.

I think we need some more feedback on whether people are really interested in this. Most researchers who use MIDI, to my knowledge, aren't concerned with genre issues. George typically works with audio, so the proposer is the only one I know of who is interested. I could be wrong, so let's ask around. We also need to explore the hand-labelling task, and to see if we can assemble a decent collection (which we should do regardless of this proposal).

If there is significant interest, and the labeling can be done, then we should accept it.