Difference between revisions of "2005:Audio Genre Classification Results"

From MIREX Wiki
(USPOP Dataset)
 
(9 intermediate revisions by the same user not shown)
Line 1: Line 1:
'''Goal:''' To classify polyphonic music audio (in PCM format) into genre categories.
+
==Introduction==
  
'''Dataset:''' Two sets of data were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio sampling rates used were either 44.1 KHz or 22.05 KHz (mono). More data information is in the following table:
+
===Goal===
 +
To classify polyphonic music audio (in PCM format) into genre categories.  
  
{| border="1"
+
===Dataset===
 +
Two sets of data were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio sampling rates used were either 44.1 KHz or 22.05 KHz (mono). More data information is in the following table:
 +
 
 +
{| border="1"  cellspacing="0"
 
|- style="background: yellow; text-align: center;"
 
|- style="background: yellow; text-align: center;"
 
! Dataset !! Size (@ 44.1 KHz) !! Number of Training Files !! Number of Testing Files  
 
! Dataset !! Size (@ 44.1 KHz) !! Number of Training Files !! Number of Testing Files  
Line 13: Line 17:
 
|}
 
|}
  
<br>
+
==Results==
{| border="1"
+
 
 +
===Overall===
 +
{| border="1"  cellspacing="0"
 
|- style="background: yellow; text-align: center;"
 
|- style="background: yellow; text-align: center;"
 
! colspan="3" | OVERALL  
 
! colspan="3" | OVERALL  
 
|-style="background: yellow;"
 
|-style="background: yellow;"
! Rank !! Participant !! Mean of Magnatune Hierarchical Classification <br> Accuracy and USPOP Raw Classification Accuracy   
+
! Rank !! Participant !! Mean of Magnatune Hierarchical Classification Accuracy <br> and USPOP Raw Classification Accuracy   
 
|-
 
|-
| 1 || Bergstra, Casagrande & Eck (2) || 82.34%
+
| 1 || [https://www.music-ir.org/mirex/abstracts/2005/bergstra.pdf Bergstra, Casagrande & Eck (2)] || 82.34%
 
|-
 
|-
| 2 || Bergstra, Casagrande & Eck (1) || 81.77%
+
| 2 || [https://www.music-ir.org/mirex/abstracts/2005/bergstra.pdf Bergstra, Casagrande & Eck (1)] || 81.77%
 
|-
 
|-
| 3 || Mandel & Ellis || 78.81%
+
| 3 || [https://www.music-ir.org/mirex/abstracts/2005/mandel.pdf Mandel & Ellis] || 78.81%
 
|-  
 
|-  
| 4 || West, K. || 75.29%
+
| 4 || [https://www.music-ir.org/mirex/abstracts/2005/west.pdf West, K.] || 75.29%
 
|-  
 
|-  
| 5 || Lidy & Rauber (SSD+RH) || 75.27%  
+
| 5 || [https://www.music-ir.org/mirex/abstracts/2005/lidy.pdf Lidy & Rauber (SSD+RH)] || 75.27%  
 
|-
 
|-
| 6 || Pampalk, E. || 75.14%  
+
| 6 || [https://www.music-ir.org/mirex/abstracts/2005/pampalk.pdf Pampalk, E.] || 75.14%  
 
|-
 
|-
| 7 || Lidy & Rauber (RP+SSD) || 74.78%  
+
| 7 || [https://www.music-ir.org/mirex/abstracts/2005/lidy.pdf Lidy & Rauber (RP+SSD)] || 74.78%  
 
|-
 
|-
| 8 || Lidy & Rauber (RP+SSD+RH) || 74.58%  
+
| 8 || [https://www.music-ir.org/mirex/abstracts/2005/lidy.pdf Lidy & Rauber (RP+SSD+RH)] || 74.58%  
 
|-
 
|-
| 9 || Scaringella, N. || 73.11%  
+
| 9 || [https://www.music-ir.org/mirex/abstracts/2005/scaringella.pdf Scaringella, N.] || 73.11%  
 
|-
 
|-
| 10 || Ahrendt, P. || 71.55%  
+
| 10 || [https://www.music-ir.org/mirex/abstracts/2005/ahrendt.pdf Ahrendt, P.] || 71.55%  
 
|-
 
|-
| 11 || Burred, J. || 62.63%   
+
| 11 || [https://www.music-ir.org/mirex/abstracts/2005/burred.pdf Burred, J.] || 62.63%   
 
|-
 
|-
| 12 || Soares, V. || 60.98%  
+
| 12 || [https://www.music-ir.org/mirex/abstracts/2005/soares.pdf Soares, V.] || 60.98%  
 
|-
 
|-
| 13 || Tzanetakis, G. || 60.72%  
+
| 13 || [https://www.music-ir.org/mirex/abstracts/2005/tzanetakis.pdf Tzanetakis, G.] || 60.72%  
 
|-
 
|-
 
|}
 
|}
  
<br>
+
===Magnatune Dataset===
 
+
{| border="1"  cellspacing="0"
{| border="1"
 
 
|- style="background: yellow; text-align: center;"
 
|- style="background: yellow; text-align: center;"
 
! colspan="9" | Magnatune Dataset  
 
! colspan="9" | Magnatune Dataset  
Line 56: Line 61:
 
! Rank !! Participant !! Hierarchical Classification Accuracy !! Normalized Hierarchical Classification Accuracy !! Raw Classification Accuracy !! Normalized Raw Classification Accuracy !! Runtime (s) !! Machine !! Confusion Matrix Files
 
! Rank !! Participant !! Hierarchical Classification Accuracy !! Normalized Hierarchical Classification Accuracy !! Raw Classification Accuracy !! Normalized Raw Classification Accuracy !! Runtime (s) !! Machine !! Confusion Matrix Files
 
|-
 
|-
| 1 || Bergstra, Casagrande & Eck (2) || 77.75% || 73.04% || 75.10% || 69.49% || -- || -- || BCE_2_MTeval.txt
+
| 1 || Bergstra, Casagrande & Eck (2) || 77.75% || 73.04% || 75.10% || 69.49% || -- || -- || [https://www.music-ir.org/mirex/results/2005/audio-genre/BCE_2_MTeval.txt BCE_2_MTeval.txt]
 
|-
 
|-
| 2 || Bergstra, Casagrande & Eck (1) || 77.25% || 72.13% || 74.71% || 68.73% || 23400 || B0 || BCE_1_MTeval.txt
+
| 2 || Bergstra, Casagrande & Eck (1) || 77.25% || 72.13% || 74.71% || 68.73% || 23400 || B0 || [https://www.music-ir.org/mirex/results/2005/audio-genre/BCE_1_MTeval.txt BCE_1_MTeval.txt]
 
|-
 
|-
| 3 || Mandel & Ellis ||71.96%|| 69.63%|| 67.65% ||63.99%|| 8729 ||R ||ME_MTeval.txt
+
| 3 || Mandel & Ellis ||71.96%|| 69.63%|| 67.65% ||63.99%|| 8729 ||R ||[https://www.music-ir.org/mirex/results/2005/audio-genre/ME_MTeval.txt ME_MTeval.txt]
 
|-
 
|-
| 4 ||West, K.|| 71.67%|| 68.33% ||68.43%|| 63.87%|| 43327 ||B4|| W_MTeval.txt
+
| 4 ||West, K.|| 71.67%|| 68.33% ||68.43%||63.87%||43327 ||B4|| [https://www.music-ir.org/mirex/results/2005/audio-genre/W_MTeval.txt W_MTeval.txt]
 
|-
 
|-
| 5 || Lidy & Rauber (RP+SSD) || 71.08% || 70.90% || 67.65% ||66.85% ||6372 ||B1|| LR_RP+SSD_MTeval.txt
+
| 5 || Lidy & Rauber (RP+SSD) || 71.08% || 70.90%|| 67.65%||66.85%||6372||B1||[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_RP+SSD_MTeval.txt LR_RP+SSD_MTeval.txt]
 
|-
 
|-
| 6 || Lidy & Rauber (RP+SSD+RH) || 70.88% || 70.52% || 67.25% || 66.27% || 6372 || B1 || LR_RP+SSD+RH_MTeval.txt
+
| 6 || Lidy & Rauber (RP+SSD+RH) || 70.88% ||70.52% ||67.25% ||66.27% ||6372 ||B1 ||[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_RP+SSD+RH_MTeval.txt LR_RP+SSD+RH_MTeval.txt]
 
|-
 
|-
| 7 || Lidy & Rauber (SSD+RH) || 70.78% || 69.31% || 67.65% || 65.54% || 6372 || B1 || LR_SSD+RH_MTeval.txt
+
| 7 || Lidy & Rauber (SSD+RH) ||70.78% ||69.31% ||67.65% ||65.54% || 6372 || B1 ||[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_SSD+RH_MTeval.txt LR_SSD+RH_MTeval.txt]
 
|-
 
|-
| 8 || Scaringella, N. || 70.47% || 72.30% || 66.14% || 67.12% || 22740 || G || SN_MTeval.txt
+
| 8 || Scaringella, N. || 70.47% || 72.30%|| 66.14%|| 67.12%|| 22740|| G|| [https://www.music-ir.org/mirex/results/2005/audio-genre/SN_MTeval.txt SN_MTeval.txt]
 
|-
 
|-
| 9 || Pampalk, E. || 69.90% || 70.91% || 66.47% || 66.26% || 3312 || B0 || P_MTeval.txt
+
| 9 || Pampalk, E. || 69.90% || 70.91% || 66.47% || 66.26% || 3312 || B0 || [https://www.music-ir.org/mirex/results/2005/audio-genre/P_MTeval.txt P_MTeval.txt]
 
|-
 
|-
| 10 || Ahrendt, P. || 64.61% || 61.40% || 60.98% || 57.15% || 4920 || B1 || A_MTeval.txt
+
| 10 || Ahrendt, P. || 64.61% || 61.40% || 60.98% || 57.15% || 4920 || B1 || [https://www.music-ir.org/mirex/results/2005/audio-genre/A_MTeval.txt A_MTeval.txt]
 
|-
 
|-
| 11 || Burred, J. || 59.22% || 61.96% || 54.12% || 55.68% || 12483 || B2 || B_MTeval.txt
+
| 11 || Burred, J. || 59.22% || 61.96% || 54.12% || 55.68% || 12483 || B2 ||[https://www.music-ir.org/mirex/results/2005/audio-genre/B_MTeval.txt B_MTeval.txt]
 
|-
 
|-
| 12 || Tzanetakis, G. || 58.14% || 53.47% || 55.49% || 50.39% || 1312 || B0 || T_MTeval.txt
+
| 12 || Tzanetakis, G. || 58.14% || 53.47% || 55.49% || 50.39% || 1312 || B0 || [https://www.music-ir.org/mirex/results/2005/audio-genre/T_MTeval.txt T_MTeval.txt]
 
|-
 
|-
| 13 || Soares, V. || 55.29% || 60.73% || 49.41% || 53.54% || 23880 || Y || SV_MTeval.txt
+
| 13 || Soares, V. || 55.29% || 60.73% || 49.41% || 53.54% || 23880 || Y ||[https://www.music-ir.org/mirex/results/2005/audio-genre/SV_MTeval.txt SV_MTeval.txt]
 
|-
 
|-
| 14 || Li, M. || TO * || -- || -- || -- || -- || -- || -- ||
+
| 14 || Li, M. || TO * || -- || -- || -- || -- || -- || --
 
|-
 
|-
| 15 || Chen & Gao || DNC * || -- || -- || -- || -- || -- || -- ||
+
| 15 || Chen & Gao || DNC * || -- || -- || -- || -- || -- || --
 
|-
 
|-
 
|}
 
|}
<br>
+
 
{| border="1"
+
===USPOP Dataset===
 +
{| border="1"  cellspacing="0"
 
|- style="background: yellow; text-align: center;"
 
|- style="background: yellow; text-align: center;"
 
! colspan="7" | USPOP Dataset
 
! colspan="7" | USPOP Dataset
|-style="background: yellow;"|----
+
|-style="background: yellow;"
|Rank
+
|----
|Participant
+
!Rank
|Raw Classification Accuracy
+
!Participant
|Normalized Raw Classification Accuracy
+
!Raw Classification Accuracy
|Runtime (s)
+
!Normalized Raw Classification Accuracy
|Machine
+
!Runtime (s)
|Confusion Matrix Files
+
!Machine
 +
!Confusion Matrix Files
 
|----
 
|----
 
|1
 
|1
Line 106: Line 113:
 
|
 
|
 
|
 
|
|BCE_2_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/BCE_2_USeval.txt BCE_2_USeval.txt]
 
|----
 
|----
 
|2
 
|2
Line 114: Line 121:
 
|23400
 
|23400
 
|B0
 
|B0
|BCE_1_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/BCE_1_USeval.txt BCE_1_USeval.txt]
 
|----
 
|----
 
|3
 
|3
Line 122: Line 129:
 
|7856
 
|7856
 
|R
 
|R
|ME_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/ME_USeval.txt ME_USeval.txt]
 
|----
 
|----
 
|4
 
|4
Line 130: Line 137:
 
|3090
 
|3090
 
|B0
 
|B0
|P_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/P_USeval.txt P_USeval.txt]
 
|----
 
|----
 
|5
 
|5
Line 138: Line 145:
 
|5164
 
|5164
 
|B1
 
|B1
|LR_SSD+RH_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_SSD+RH_USeval.txt LR_SSD+RH_USeval.txt]
 
|----
 
|----
 
|6
 
|6
Line 146: Line 153:
 
|18557
 
|18557
 
|B4
 
|B4
|W_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/W_USeval.txt W_USeval.txt]
 
|----
 
|----
 
|7
 
|7
Line 154: Line 161:
 
|5164
 
|5164
 
|B1
 
|B1
|LR_RP+SSD_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_RP+SSD_USeval.txt LR_RP+SSD_USeval.txt]
 
|----
 
|----
 
|8
 
|8
Line 162: Line 169:
 
|9702
 
|9702
 
|B1
 
|B1
|A_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/A_USeval.txt A_USeval.txt]
 
|----
 
|----
 
|9
 
|9
Line 170: Line 177:
 
|5194
 
|5194
 
|B1
 
|B1
|LR_RP+SSD+RH_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/LR_RP+SSD+RH_USeval.txt LR_RP+SSD+RH_USeval.txt]
 
|----
 
|----
 
|10
 
|10
Line 178: Line 185:
 
|24606
 
|24606
 
|G
 
|G
|SN_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/SN_USeval.txt SN_USeval.txt]
 
|----
 
|----
 
|11
 
|11
Line 186: Line 193:
 
|14369
 
|14369
 
|Y
 
|Y
|SV_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/SV_USeval.txt SV_USeval.txt]
 
|----
 
|----
 
|12
 
|12
Line 194: Line 201:
 
|9233
 
|9233
 
|B2
 
|B2
|B_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/B_USeval.txt B_USeval.txt]
 
|----
 
|----
 
|13
 
|13
Line 202: Line 209:
 
|1320
 
|1320
 
|B0
 
|B0
|T_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/T_USeval.txt T_USeval.txt]
 
|----
 
|----
 
|14
 
|14
Line 210: Line 217:
 
|N/A
 
|N/A
 
|Y
 
|Y
|CG_USeval.txt
+
|[https://www.music-ir.org/mirex/results/2005/audio-genre/CG_USeval.txt CG_USeval.txt]
 
|----
 
|----
 
|15
 
|15

Latest revision as of 11:41, 2 August 2010

Introduction

Goal

To classify polyphonic music audio (in PCM format) into genre categories.

Dataset

Two sets of data were used: Magnatune and USPOP. The Magnatune dataset has a hierarchical genre taxonomy, while the USPOP categories are at a single level. The audio sampling rates used were either 44.1 kHz or 22.05 kHz (mono). Further details on each dataset are given in the following table:

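{| border="1"  cellspacing="0"
|- style="background: yellow; text-align: center;"
! Dataset !! Size (@ 44.1 kHz) !! Number of Training Files !! Number of Testing Files
|-
| Magnatune || 34.3 GB || 1005 || 510
|-
| USPOP || 28.4 GB || 940 || 474
|}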

Results

Overall

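{| border="1"  cellspacing="0"
|- style="background: yellow; text-align: center;"
! colspan="3" | OVERALL
|-style="background: yellow;"
! Rank !! Participant !! Mean of Magnatune Hierarchical Classification Accuracy <br> and USPOP Raw Classification Accuracy
|-
| 1 || Bergstra, Casagrande & Eck (2) || 82.34%
|-
| 2 || Bergstra, Casagrande & Eck (1) || 81.77%
|-
| 3 || Mandel & Ellis || 78.81%
|-
| 4 || West, K. || 75.29%
|-
| 5 || Lidy & Rauber (SSD+RH) || 75.27%
|-
| 6 || Pampalk, E. || 75.14%
|-
| 7 || Lidy & Rauber (RP+SSD) || 74.78%
|-
| 8 || Lidy & Rauber (RP+SSD+RH) || 74.58%
|-
| 9 || Scaringella, N. || 73.11%
|-
| 10 || Ahrendt, P. || 71.55%
|-
| 11 || Burred, J. || 62.63%
|-
| 12 || Soares, V. || 60.98%
|-
| 13 || Tzanetakis, G. || 60.72%
|}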

Magnatune Dataset

{| border="1"  cellspacing="0"
|- style="background: yellow; text-align: center;"
! colspan="9" | Magnatune Dataset
|-style="background: yellow;"
! Rank !! Participant !! Hierarchical Classification Accuracy !! Normalized Hierarchical Classification Accuracy !! Raw Classification Accuracy !! Normalized Raw Classification Accuracy !! Runtime (s) !! Machine !! Confusion Matrix Files
|-
| 1 || Bergstra, Casagrande & Eck (2) || 77.75% || 73.04% || 75.10% || 69.49% || -- || -- || BCE_2_MTeval.txt
|-
| 2 || Bergstra, Casagrande & Eck (1) || 77.25% || 72.13% || 74.71% || 68.73% || 23400 || B0 || BCE_1_MTeval.txt
|-
| 3 || Mandel & Ellis || 71.96% || 69.63% || 67.65% || 63.99% || 8729 || R || ME_MTeval.txt
|-
| 4 || West, K. || 71.67% || 68.33% || 68.43% || 63.87% || 43327 || B4 || W_MTeval.txt
|-
| 5 || Lidy & Rauber (RP+SSD) || 71.08% || 70.90% || 67.65% || 66.85% || 6372 || B1 || LR_RP+SSD_MTeval.txt
|-
| 6 || Lidy & Rauber (RP+SSD+RH) || 70.88% || 70.52% || 67.25% || 66.27% || 6372 || B1 || LR_RP+SSD+RH_MTeval.txt
|-
| 7 || Lidy & Rauber (SSD+RH) || 70.78% || 69.31% || 67.65% || 65.54% || 6372 || B1 || LR_SSD+RH_MTeval.txt
|-
| 8 || Scaringella, N. || 70.47% || 72.30% || 66.14% || 67.12% || 22740 || G || SN_MTeval.txt
|-
| 9 || Pampalk, E. || 69.90% || 70.91% || 66.47% || 66.26% || 3312 || B0 || P_MTeval.txt
|-
| 10 || Ahrendt, P. || 64.61% || 61.40% || 60.98% || 57.15% || 4920 || B1 || A_MTeval.txt
|-
| 11 || Burred, J. || 59.22% || 61.96% || 54.12% || 55.68% || 12483 || B2 || B_MTeval.txt
|-
| 12 || Tzanetakis, G. || 58.14% || 53.47% || 55.49% || 50.39% || 1312 || B0 || T_MTeval.txt
|-
| 13 || Soares, V. || 55.29% || 60.73% || 49.41% || 53.54% || 23880 || Y || SV_MTeval.txt
|-
| 14 || Li, M. || TO * || -- || -- || -- || -- || -- || --
|-
| 15 || Chen & Gao || DNC * || -- || -- || -- || -- || -- || --
|}
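
The table lists both raw and normalized accuracies but this page does not define them. The sketch below assumes the common reading: raw accuracy is the fraction of all test files classified correctly, and normalized accuracy is the mean of the per-genre accuracies, so that every genre counts equally regardless of how many files it has. The function names and the 3-genre confusion matrix are hypothetical and are not taken from the linked confusion matrix files, whose format is not described here.

```python
def raw_accuracy(confusion):
    """Fraction of all test files assigned their true genre.

    `confusion[i][j]` counts files of true genre i predicted as genre j
    (assumed layout: rows = true genre, columns = predicted genre).
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def normalized_accuracy(confusion):
    """Mean of per-genre accuracies (assumed definition of 'normalized').

    Genres with no test files are skipped to avoid dividing by zero.
    """
    per_genre = [
        row[i] / sum(row) for i, row in enumerate(confusion) if sum(row) > 0
    ]
    return sum(per_genre) / len(per_genre)


# Hypothetical, deliberately unbalanced matrix: 100, 50 and 10 test files.
EXAMPLE = [
    [80, 15, 5],
    [10, 30, 10],
    [2, 3, 5],
]

if __name__ == "__main__":
    print(f"raw:        {raw_accuracy(EXAMPLE):.2%}")
    print(f"normalized: {normalized_accuracy(EXAMPLE):.2%}")
```

Under this assumed definition, a classifier that does well on the largest genres but poorly on small ones scores higher on raw than on normalized accuracy, which would be consistent with rows such as Bergstra, Casagrande & Eck (2), where raw accuracy is 75.10% and normalized raw accuracy is 69.49%.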

USPOP Dataset

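{| border="1"  cellspacing="0"
|- style="background: yellow; text-align: center;"
! colspan="7" | USPOP Dataset
|-style="background: yellow;"
! Rank !! Participant !! Raw Classification Accuracy !! Normalized Raw Classification Accuracy !! Runtime (s) !! Machine !! Confusion Matrix Files
|-
| 1 || Bergstra, Casagrande & Eck (2) || 86.92% || 82.91% || || || BCE_2_USeval.txt
|-
| 2 || Bergstra, Casagrande & Eck (1) || 86.29% || 82.50% || 23400 || B0 || BCE_1_USeval.txt
|-
| 3 || Mandel & Ellis || 85.65% || 76.91% || 7856 || R || ME_USeval.txt
|-
| 4 || Pampalk, E. || 80.38% || 78.74% || 3090 || B0 || P_USeval.txt
|-
| 5 || Lidy & Rauber (SSD+RH) || 79.75% || 75.45% || 5164 || B1 || LR_SSD+RH_USeval.txt
|-
| 6 || West, K. || 78.90% || 74.67% || 18557 || B4 || W_USeval.txt
|-
| 7 || Lidy & Rauber (RP+SSD) || 78.48% || 77.62% || 5164 || B1 || LR_RP+SSD_USeval.txt
|-
| 8 || Ahrendt, P. || 78.48% || 73.23% || 9702 || B1 || A_USeval.txt
|-
| 9 || Lidy & Rauber (RP+SSD+RH) || 78.27% || 76.84% || 5194 || B1 || LR_RP+SSD+RH_USeval.txt
|-
| 10 || Scaringella, N. || 75.74% || 77.67% || 24606 || G || SN_USeval.txt
|-
| 11 || Soares, V. || 66.67% || 67.28% || 14369 || Y || SV_USeval.txt
|-
| 12 || Burred, J. || 66.03% || 72.50% || 9233 || B2 || B_USeval.txt
|-
| 13 || Tzanetakis, G. || 63.29% || 50.19% || 1320 || B0 || T_USeval.txt
|-
| 14 || Chen & Gao || 22.93% || 17.96% || N/A || Y || CG_USeval.txt
|-
| 15 || Li, M. || TO *
|}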
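
The Overall score is described as the mean of the Magnatune hierarchical classification accuracy and the USPOP raw classification accuracy. As a cross-check, the sketch below recomputes that mean from the two per-dataset tables for the 13 participants that appear in the Overall table. The function names are hypothetical, and rounding half-up to two decimals is an assumption, chosen because it reproduces the published figures (for example, (77.75 + 86.92) / 2 = 82.335, listed as 82.34%).

```python
from decimal import Decimal, ROUND_HALF_UP

# (Magnatune hierarchical accuracy, USPOP raw accuracy) in percent,
# copied from the per-dataset tables on this page.
ACCURACIES = {
    "Bergstra, Casagrande & Eck (2)": ("77.75", "86.92"),
    "Bergstra, Casagrande & Eck (1)": ("77.25", "86.29"),
    "Mandel & Ellis": ("71.96", "85.65"),
    "West, K.": ("71.67", "78.90"),
    "Lidy & Rauber (SSD+RH)": ("70.78", "79.75"),
    "Pampalk, E.": ("69.90", "80.38"),
    "Lidy & Rauber (RP+SSD)": ("71.08", "78.48"),
    "Lidy & Rauber (RP+SSD+RH)": ("70.88", "78.27"),
    "Scaringella, N.": ("70.47", "75.74"),
    "Ahrendt, P.": ("64.61", "78.48"),
    "Burred, J.": ("59.22", "66.03"),
    "Soares, V.": ("55.29", "66.67"),
    "Tzanetakis, G.": ("58.14", "63.29"),
}


def overall_score(magnatune, uspop):
    """Mean of the two accuracies, rounded half-up to two decimals.

    Decimal is used instead of float so that values such as 82.335
    round deterministically (the rounding mode itself is an assumption).
    """
    mean = (Decimal(magnatune) + Decimal(uspop)) / 2
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def overall_ranking(accuracies):
    """Return (participant, score) pairs sorted by descending score."""
    scores = {name: overall_score(m, u) for name, (m, u) in accuracies.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for rank, (name, score) in enumerate(overall_ranking(ACCURACIES), start=1):
        print(f"{rank:2d}  {name:32s} {score}%")
```

Participants without a score on both datasets (Li, M. and Chen & Gao) are left out, matching their absence from the Overall table.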