From MIREX Wiki
Introduction
Goal
To classify polyphonic music audio (in PCM format) into genre categories.
Dataset
Two datasets were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio was provided in mono at a sampling rate of either 44.1 kHz or 22.05 kHz. Further details are given in the following table:
| Dataset | Size (@ 44.1 kHz) | Number of Training Files | Number of Testing Files |
|---|---|---|---|
| Magnatune | 34.3 GB | 1005 | 510 |
| USPOP | 28.4 GB | 940 | 474 |
Results
Overall
Magnatune Dataset
| Rank | Participant | Hierarchical Classification Accuracy | Normalized Hierarchical Classification Accuracy | Raw Classification Accuracy | Normalized Raw Classification Accuracy | Runtime (s) | Machine | Confusion Matrix Files |
|---|---|---|---|---|---|---|---|---|
| 1 | Bergstra, Casagrande & Eck (2) | 77.75% | 73.04% | 75.10% | 69.49% | -- | -- | BCE_2_MTeval.txt |
| 2 | Bergstra, Casagrande & Eck (1) | 77.25% | 72.13% | 74.71% | 68.73% | 23400 | B0 | BCE_1_MTeval.txt |
| 3 | Mandel & Ellis | 71.96% | 69.63% | 67.65% | 63.99% | 8729 | R | ME_MTeval.txt |
| 4 | West, K. | 71.67% | 68.33% | 68.43% | 63.87% | 43327 | B4 | W_MTeval.txt |
| 5 | Lidy & Rauber (RP+SSD) | 71.08% | 70.90% | 67.65% | 66.85% | 6372 | B1 | LR_RP+SSD_MTeval.txt |
| 6 | Lidy & Rauber (RP+SSD+RH) | 70.88% | 70.52% | 67.25% | 66.27% | 6372 | B1 | LR_RP+SSD+RH_MTeval.txt |
| 7 | Lidy & Rauber (SSD+RH) | 70.78% | 69.31% | 67.65% | 65.54% | 6372 | B1 | LR_SSD+RH_MTeval.txt |
| 8 | Scaringella, N. | 70.47% | 72.30% | 66.14% | 67.12% | 22740 | G | SN_MTeval.txt |
| 9 | Pampalk, E. | 69.90% | 70.91% | 66.47% | 66.26% | 3312 | B0 | P_MTeval.txt |
| 10 | Ahrendt, P. | 64.61% | 61.40% | 60.98% | 57.15% | 4920 | B1 | A_MTeval.txt |
| 11 | Burred, J. | 59.22% | 61.96% | 54.12% | 55.68% | 12483 | B2 | B_MTeval.txt |
| 12 | Tzanetakis, G. | 58.14% | 53.47% | 55.49% | 50.39% | 1312 | B0 | T_MTeval.txt |
| 13 | Soares, V. | 55.29% | 60.73% | 49.41% | 53.54% | 23880 | Y | SV_MTeval.txt |
| 14 | Li, M. | TO * | -- | -- | -- | -- | -- | -- |
| 15 | Chen & Gao | DNC * | -- | -- | -- | -- | -- | -- |
USPOP Dataset
| Rank | Participant | Raw Classification Accuracy | Normalized Raw Classification Accuracy | Runtime (s) | Machine | Confusion Matrix Files |
|---|---|---|---|---|---|---|
| 1 | Bergstra, Casagrande & Eck (2) | 86.92% | 82.91% | -- | -- | BCE_2_USeval.txt |
| 2 | Bergstra, Casagrande & Eck (1) | 86.29% | 82.50% | 23400 | B0 | BCE_1_USeval.txt |
| 3 | Mandel & Ellis | 85.65% | 76.91% | 7856 | R | ME_USeval.txt |
| 4 | Pampalk, E. | 80.38% | 78.74% | 3090 | B0 | P_USeval.txt |
| 5 | Lidy & Rauber (SSD+RH) | 79.75% | 75.45% | 5164 | B1 | LR_SSD+RH_USeval.txt |
| 6 | West, K. | 78.90% | 74.67% | 18557 | B4 | W_USeval.txt |
| 7 | Lidy & Rauber (RP+SSD) | 78.48% | 77.62% | 5164 | B1 | LR_RP+SSD_USeval.txt |
| 8 | Ahrendt, P. | 78.48% | 73.23% | 9702 | B1 | A_USeval.txt |
| 9 | Lidy & Rauber (RP+SSD+RH) | 78.27% | 76.84% | 5194 | B1 | LR_RP+SSD+RH_USeval.txt |
| 10 | Scaringella, N. | 75.74% | 77.67% | 24606 | G | SN_USeval.txt |
| 11 | Soares, V. | 66.67% | 67.28% | 14369 | Y | SV_USeval.txt |
| 12 | Burred, J. | 66.03% | 72.50% | 9233 | B2 | B_USeval.txt |
| 13 | Tzanetakis, G. | 63.29% | 50.19% | 1320 | B0 | T_USeval.txt |
| 14 | Chen & Gao | 22.93% | 17.96% | N/A | Y | CG_USeval.txt |
| 15 | Li, M. | TO * | -- | -- | -- | -- |
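The results tables report both a raw and a normalized classification accuracy, and each entry links to a confusion matrix file. As a rough illustration of how the two figures can differ, the sketch below computes both from a confusion matrix. It assumes that "normalized" means the mean of the per-genre accuracies, so that every genre counts equally regardless of how many test files it has; this page does not state that definition. The matrix orientation (rows are true genres, columns are predicted genres), the function names and the `toy_confusion` values are all hypothetical and are not taken from any of the submitted result files.

```python
def raw_accuracy(confusion):
    """Fraction of all test files that were labelled correctly."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def normalized_accuracy(confusion):
    """Mean of per-genre accuracies (assumed meaning of 'normalized')."""
    per_class = [
        row[i] / sum(row)
        for i, row in enumerate(confusion)
        if sum(row) > 0  # skip genres with no test files
    ]
    return sum(per_class) / len(per_class)


# Hypothetical 3-genre confusion matrix, for illustration only.
# Assumed orientation: rows = true genre, columns = predicted genre.
toy_confusion = [
    [80, 15, 5],
    [4, 14, 2],
    [1, 1, 8],
]

raw = raw_accuracy(toy_confusion)
norm = normalized_accuracy(toy_confusion)
print(f"raw: {raw:.2%}, normalized: {norm:.2%}")
```

In this toy matrix the largest genre is also one of the best-classified ones, so the raw figure comes out higher than the normalized one. Under the same assumption, the reverse can happen when small genres are classified better than large ones.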