2014:Discovery of Repeated Themes & Sections Results


Revision as of 09:34, 21 October 2014

Introduction

THIS PAGE IS STILL UNDER CONSTRUCTION. At present it just contains a copy of last year's results. If you are looking for the 2014 results, please check back here again on Thu 23 October.

The task: algorithms take a piece of music as input, and output a list of patterns repeated within that piece. A pattern is defined as a set of ontime-pitch pairs that occurs at least twice (i.e., is repeated at least once) in a piece of music. The second, third, etc. occurrences of the pattern will likely be shifted in time and/or transposed, relative to the first occurrence. Ideally an algorithm will be able to discover all exact and inexact occurrences of a pattern within a piece, so in evaluating this task we are interested in both:

  • (1) to what extent an algorithm can discover one occurrence, up to time shift and transposition, and;
  • (2) to what extent it can find all occurrences.

The metrics establishment recall, establishment precision and establishment F1 address (1), and the metrics occurrence recall, occurrence precision, and occurrence F1 address (2).
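
As a rough illustration of how (1) might be scored, the sketch below treats each pattern as a set of (ontime, pitch) pairs and compares a ground-truth prototype with an algorithm-output pattern up to time shift and transposition. The similarity measure (overlap after the best translation, divided by the size of the larger set), the function names, and the toy data are all assumptions made for illustration; the task page linked under Results holds the authoritative definitions.

```python
from itertools import product

def cardinality_score(p, q):
    """Largest overlap between p and q under any translation, over max(|p|, |q|)."""
    p, q = set(p), set(q)
    if not p or not q:
        return 0.0
    best = 0
    # Each vector from a point of p to a point of q is a candidate translation.
    for (px, py), (qx, qy) in product(p, q):
        dx, dy = qx - px, qy - py
        shifted = {(x + dx, y + dy) for x, y in p}
        best = max(best, len(shifted & q))
    return best / max(len(p), len(q))

def establishment_scores(ground_truth, output):
    """Return (precision, recall, F1) computed from pattern prototypes only."""
    if not ground_truth or not output:
        return 0.0, 0.0, 0.0
    # Recall: for each ground-truth pattern, the most similar output pattern.
    recall = sum(max(cardinality_score(g, o) for o in output)
                 for g in ground_truth) / len(ground_truth)
    # Precision: for each output pattern, the most similar ground-truth pattern.
    precision = sum(max(cardinality_score(g, o) for g in ground_truth)
                    for o in output) / len(output)
    total = precision + recall
    f1 = 0.0 if total == 0 else 2 * precision * recall / total
    return precision, recall, f1

# Toy data (hypothetical): a four-note theme, a shifted and transposed copy,
# and a fragment containing only its first two notes.
theme = [(0, 60), (1, 62), (2, 64), (3, 65)]
transposed = [(8, 67), (9, 69), (10, 71), (11, 72)]
fragment = [(0, 60), (1, 62)]

print(establishment_scores([theme], [transposed]))
print(establishment_scores([theme], [fragment]))
```

The shifted and transposed copy scores as a full match because the comparison ignores translation, whereas the two-note fragment covers only half of the theme.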

Contribution

Existing approaches to music structure analysis in MIR tend to focus on segmentation (e.g., Weiss & Bello, 2010). The contribution of this task is to afford access to the note content itself (please see the example in Fig. 1A), requiring algorithms to do more than label time windows (e.g., the segmentations in Figs. 1B-D). For instance, a discovery algorithm applied to the piece in Fig. 1A should return a pattern corresponding to the note content of and , as well as a pattern corresponding to the note content of . This is because occurs again independently of the accompaniment in bars 19-22 (not shown here). The ground truth also contains nested patterns, such as in Fig. 1A being a subset of the sectional repetition , reflecting the often-hierarchical nature of musical repetition. While we recognise the appealing simplicity of linear segmentation, in the Discovery of Repeated Themes & Sections task we are demanding analysis at a greater level of detail, and have built a ground truth that contains overlapping and nested patterns.


MozartK282Mvt2.png

Figure 1. Pattern discovery v segmentation. (A) Bars 1-12 of Mozart’s Piano Sonata in E-flat major K282 mvt.2, showing some ground-truth themes and repeated sections; (B-D) Three linear segmentations. Numbers below the staff in Fig. 1A and below the segmentation in Fig. 1D indicate crotchet beats, from zero for bar 1 beat 1.


For a more detailed introduction to the task, please see 2014:Discovery_of_Repeated_Themes_&_Sections.

Ground Truth and Algorithms

The ground truth, called the Johannes Kepler University Patterns Test Database (JKUPTD-Aug2013), is based on motifs and themes in Barlow and Morgenstern (1948), Schoenberg (1967), and Bruhn (1993). Repeated sections are based on those marked by the composer. These annotations are supplemented with some of our own where necessary. A Development Database (JKUPDD-Aug2013) enabled participants to try out their algorithms. For each piece in the Development and Test Databases, symbolic and synthesised audio versions are crossed with monophonic and polyphonic versions, giving four versions of the task in total: symPoly, symMono, audPoly, and audMono. Algorithms submitted to the task are shown in Table 1.


Sub code | Submission name | Abstract | Contributors
Task Version symMono
NF1 | MotivesExtractor | PDF | Oriol Nieto, Morwaread Farbood
OL1 | PatMinr | PDF | Olivier Lartillot
VM1 | VM1 | PDF | Gissel Velarde, David Meredith
VM2 | VM2 | PDF | Gissel Velarde, David Meredith
NF1'13 | motives_mono | PDF | Oriol Nieto, Morwaread Farbood
DM10'13 | SIATECCompressSegment | PDF | David Meredith
Task Version symPoly
NF1 | MotivesExtractor | PDF | Oriol Nieto, Morwaread Farbood
NF2'13 | motives_poly | PDF | Oriol Nieto, Morwaread Farbood
DM10'13 | SIATECCompressSegment | PDF | David Meredith
Task Version audMono
NF1 | MotivesExtractor | PDF | Oriol Nieto, Morwaread Farbood
NF3'13 | motives_audio_mono | PDF | Oriol Nieto, Morwaread Farbood
Task Version audPoly
NF1 | MotivesExtractor | PDF | Oriol Nieto, Morwaread Farbood
NF4'13 | motives_audio_poly | PDF | Oriol Nieto, Morwaread Farbood

Table 1. Algorithms submitted to DRTS. Strong-performing algorithms from 2013 (submission codes ending '13) are included for the sake of extra comparisons.

Results

For mathematical definitions of the various metrics, please see 2014:Discovery_of_Repeated_Themes_&_Sections#Evaluation_Procedure.

In Brief

Nieto and Farbood (2014a) submitted to all four versions of the task (symbolic-monophonic, symbolic-polyphonic, audio-monophonic, audio-polyphonic), as they did last year (Nieto and Farbood, 2013). On the audio-monophonic version of the task, their NF1 algorithm’s scores were up by an average of .14 (establishing at least one occurrence of each ground truth pattern) and .11 (retrieving all occurrences of a discovered ground truth pattern) compared to last year (see Figs. 30 and 33). There were slighter increases in the audio-polyphonic version of the task. Their work on extracting repetitive structure remains at the forefront of research attempting to cross the audio-symbolic divide (Nieto & Farbood, 2014b; Collins et al., 2014).

Lartillot (2014a, 2014b) submitted an incremental pattern mining algorithm to the symbolic-monophonic version of the task this year. The musical dimensions represented (e.g., chromatic pitch, diatonic pitch) are able to vary throughout the course of a pattern occurrence. The ability to vary representation within an occurrence should mean that Lartillot’s OL1 algorithm is well prepared for retrieving both exact and inexact occurrences of motifs and themes. This does seem to be the case, with OL1 the strongest performer on the occurrence F1 metric (Fig. 9).

Velarde and Meredith (2014) submitted a wavelet-based method to the symbolic-monophonic version of the task this year. According to Friedman's test, this algorithm, VM1, was significantly stronger than NF1 and OL1 (both comparisons Bonferroni-corrected) at discovering at least one occurrence of each ground truth pattern (Fig. 2). While VM1 also seems to find many occurrences of each ground truth pattern (with high occurrence recall in Fig. 7, and in Fig. 3 on a per-pattern basis), it may also find quite a few false-positive occurrences (with lower occurrence precision in Fig. 8). (To avoid a bias toward the more numerous submissions of Velarde and Meredith (2014), VM1 was preselected for comparison with Nieto and Farbood's (2014a) and Lartillot's (2014a) submissions, based on performance on the Development Database.)

symMono

(Submission OL1 did not complete on piece 5. The task captain took the decision to assign the mean of the evaluation metrics for OL1 calculated across the remaining pieces.)
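
The substitution described above amounts to filling a missing per-piece score with the mean of the scores that are available. A minimal sketch follows; the function name and the numbers are hypothetical and are not the actual OL1 results.

```python
def fill_missing_with_mean(scores):
    """Replace None entries with the mean of the available per-piece scores."""
    available = [s for s in scores if s is not None]
    if not available:
        raise ValueError("no completed pieces to average over")
    mean = sum(available) / len(available)
    return [mean if s is None else s for s in scores]

# Hypothetical values of one metric for five pieces; the run on piece 5 did not complete.
per_piece = [0.5, 0.25, 0.75, 0.5, None]
filled = fill_missing_with_mean(per_piece)
print(filled)
```

Filling with the mean leaves the algorithm's average over pieces unchanged, but it understates the variability across pieces, which is worth bearing in mind when reading the per-piece plots.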

11symMonoEstRecPerPatt2014.png

Figure 2. Establishment recall on a per-pattern basis. Establishment recall answers the following question. On average, how similar is the most similar algorithm-output pattern to a ground-truth pattern prototype?


14symMonoOccRecPerPatt2014.png

Figure 3. Occurrence recall on a per-pattern basis. Occurrence recall answers the following question. On average, how similar is the most similar set of algorithm-output pattern occurrences to a discovered ground-truth occurrence set?


11symMonoEstRec2014.png

Figure 4. Establishment recall averaged over each piece/movement. Establishment recall answers the following question. On average, how similar is the most similar algorithm-output pattern to a ground-truth pattern prototype?


12symMonoEstPrec2014.png

Figure 5. Establishment precision averaged over each piece/movement. Establishment precision answers the following question. On average, how similar is the most similar ground-truth pattern prototype to an algorithm-output pattern?


13symMonoEstF12014.png

Figure 6. Establishment F1 averaged over each piece/movement. Establishment F1 is an average of establishment precision and establishment recall.


14symMonoOccRecP752014.png

Figure 7. Occurrence recall (c = .75) averaged over each piece/movement. Occurrence recall answers the following question. On average, how similar is the most similar set of algorithm-output pattern occurrences to a discovered ground-truth occurrence set?


15symMonoOccPrecP752014.png

Figure 8. Occurrence precision (c = .75) averaged over each piece/movement. Occurrence precision answers the following question. On average, how similar is the most similar discovered ground-truth occurrence set to a set of algorithm-output pattern occurrences?


16symMonoOccF1P752014.png

Figure 9. Occurrence F1 (c = .75) averaged over each piece/movement. Occurrence F1 is an average of occurrence precision and occurrence recall.
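
One simplified reading of the occurrence metrics in Figs. 7-9 is sketched below: only pattern pairs whose establishment similarity reaches a threshold count as "discovered", and for those pairs the sets of occurrences are compared by best match. The threshold name `c`, comparing occurrences in place (without translation), and all function names are assumptions for illustration rather than the official evaluation code.

```python
def overlap_score(p, q):
    """|p ∩ q| / max(|p|, |q|) for two occurrences compared in place."""
    p, q = set(p), set(q)
    return len(p & q) / max(len(p), len(q)) if p and q else 0.0

def occurrence_scores(ground_truth, output, establishment, c=0.75):
    """Return (precision, recall, F1) over discovered patterns only.

    ground_truth and output are lists of patterns; each pattern is a
    non-empty list of occurrences, and each occurrence is a collection of
    (ontime, pitch) pairs. establishment[i][j] is the establishment
    similarity of ground-truth pattern i and output pattern j.
    """
    row_best, col_best = {}, {}
    for i, gt_occs in enumerate(ground_truth):
        for j, out_occs in enumerate(output):
            if establishment[i][j] < c:
                continue  # output pattern j does not count as a discovery of i
            rec = sum(max(overlap_score(g, o) for o in out_occs)
                      for g in gt_occs) / len(gt_occs)
            prec = sum(max(overlap_score(g, o) for g in gt_occs)
                       for o in out_occs) / len(out_occs)
            row_best[i] = max(row_best.get(i, 0.0), rec)
            col_best[j] = max(col_best.get(j, 0.0), prec)
    recall = sum(row_best.values()) / len(row_best) if row_best else 0.0
    precision = sum(col_best.values()) / len(col_best) if col_best else 0.0
    total = precision + recall
    f1 = 0.0 if total == 0 else 2 * precision * recall / total
    return precision, recall, f1

# Toy data (hypothetical): the ground truth has two occurrences of a motif,
# but the algorithm returned only the first of them.
gt = [[[(0, 60), (1, 62)], [(4, 60), (5, 62)]]]
out = [[[(0, 60), (1, 62)]]]
precision, recall, f1 = occurrence_scores(gt, out, establishment=[[1.0]])
print(precision, recall, f1)
```

In this toy case every returned occurrence is correct, so precision is perfect, while recall is halved because one of the two ground-truth occurrences was missed.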


17symMonoR3.png

Figure 10. Three-layer recall averaged over each piece/movement. Rather than using |P ∩ Q| / max{|P|, |Q|} as a similarity measure (which is the default for establishment recall), three-layer recall uses 2|P ∩ Q| / (|P| + |Q|), which is a kind of F1 measure.


18symMonoP32014.png

Figure 11. Three-layer precision averaged over each piece/movement. Rather than using |P ∩ Q| / max{|P|, |Q|} as a similarity measure (which is the default for establishment precision), three-layer precision uses 2|P ∩ Q| / (|P| + |Q|), which is a kind of F1 measure.


19symMonoTLF2014.png

Figure 12. Three-layer F1 (TLF) averaged over each piece/movement. TLF is an average of three-layer precision and three-layer recall.
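
The three-layer measures in Figs. 10-12 can be read as the same best-match computation applied at three nested levels: notes within an occurrence, occurrences within a pattern, and patterns within a piece. The sketch below follows that reading. The note-level formula 2|P ∩ Q| / (|P| + |Q|), the function names, and the toy data are assumptions for illustration, not the evaluation code used for the plots.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total

def layer1(p, q):
    """Layer 1: F1 between two occurrences viewed as sets of notes."""
    p, q = set(p), set(q)
    return 2 * len(p & q) / (len(p) + len(q)) if p or q else 0.0

def best_match(truth_items, output_items, similarity):
    """Best-match precision, recall and F1 under a given similarity."""
    if not truth_items or not output_items:
        return 0.0, 0.0, 0.0
    precision = sum(max(similarity(t, o) for t in truth_items)
                    for o in output_items) / len(output_items)
    recall = sum(max(similarity(t, o) for o in output_items)
                 for t in truth_items) / len(truth_items)
    return precision, recall, f1(precision, recall)

def layer2(truth_occurrences, output_occurrences):
    """Layer 2: F1 between two patterns, each a list of occurrences."""
    return best_match(truth_occurrences, output_occurrences, layer1)[2]

def three_layer(ground_truth, output):
    """Layer 3: (precision, recall, TLF) between two lists of patterns."""
    return best_match(ground_truth, output, layer2)

# Toy data (hypothetical): one ground-truth pattern with two occurrences,
# of which the algorithm returned only the first.
gt = [[[(0, 60), (1, 62)], [(4, 60), (5, 62)]]]
out = [[[(0, 60), (1, 62)]]]
p3, r3, tlf = three_layer(gt, out)
print(p3, r3, tlf)
```

Because the missed occurrence lowers the layer-2 score of the only pattern pair, it lowers three-layer precision and recall alike, unlike the occurrence metrics, which report the two effects separately.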


20symMonoRuntime2014.png

Figure 13. Log runtime of the algorithm for each piece/movement.

symPoly

01symPolyEstRecPerPatt2014.png

Figure 14. Establishment recall on a per-pattern basis. Establishment recall answers the following question. On average, how similar is the most similar algorithm-output pattern to a ground-truth pattern prototype?


04symPolyOccRecPerPatt2014.png

Figure 15. Occurrence recall on a per-pattern basis. Occurrence recall answers the following question. On average, how similar is the most similar set of algorithm-output pattern occurrences to a discovered ground-truth occurrence set?


01symPolyEstRec2014.png

Figure 16. Establishment recall averaged over each piece/movement. Establishment recall answers the following question. On average, how similar is the most similar algorithm-output pattern to a ground-truth pattern prototype?


02symPolyEstPrec2014.png

Figure 17. Establishment precision averaged over each piece/movement. Establishment precision answers the following question. On average, how similar is the most similar ground-truth pattern prototype to an algorithm-output pattern?


03symPolyEstF12014.png

Figure 18. Establishment F1 averaged over each piece/movement. Establishment F1 is an average of establishment precision and establishment recall.


04symPolyOccRecP752014.png

Figure 19. Occurrence recall (c = .75) averaged over each piece/movement. Occurrence recall answers the following question. On average, how similar is the most similar set of algorithm-output pattern occurrences to a discovered ground-truth occurrence set?


05symPolyOccPrecP752014.png

Figure 20. Occurrence precision (c = .75) averaged over each piece/movement. Occurrence precision answers the following question. On average, how similar is the most similar discovered ground-truth occurrence set to a set of algorithm-output pattern occurrences?


06symPolyOccF1P752014.png

Figure 21. Occurrence F1 (c = .75) averaged over each piece/movement. Occurrence F1 is an average of occurrence precision and occurrence recall.


07symPolyR32014.png

Figure 22. Three-layer recall averaged over each piece/movement. Rather than using |P ∩ Q| / max{|P|, |Q|} as a similarity measure (which is the default for establishment recall), three-layer recall uses 2|P ∩ Q| / (|P| + |Q|), which is a kind of F1 measure.


08symPolyP32014.png

Figure 23. Three-layer precision averaged over each piece/movement. Rather than using |P ∩ Q| / max{|P|, |Q|} as a similarity measure (which is the default for establishment precision), three-layer precision uses 2|P ∩ Q| / (|P| + |Q|), which is a kind of F1 measure.


09symPolyTLF2014.png

Figure 24. Three-layer F1 (TLF) averaged over each piece/movement. TLF is an average of three-layer precision and three-layer recall.


10symPolyRuntime2014.png

Figure 25. Log runtime of the algorithm for each piece/movement.

audMono

31audMonoEstRecPerPatt2014.png

Figure 26. Establishment recall on a per-pattern basis. Establishment recall answers the following question. On average, how similar is the most similar algorithm-output pattern to a ground-truth pattern prototype?


34audMonoOccRecPerPatt2014.png

Figure 27. Occurrence recall on a per-pattern basis. Occurrence recall answers the following question. On average, how similar is the most similar set of algorithm-output pattern occurrences to a discovered ground-truth occurrence set?


31audMonoEstRec2014.png

Figure 28. Establishment recall averaged over each piece/movement. Establishment recall answers the following question. On average, how similar is the most similar algorithm-output pattern to a ground-truth pattern prototype?


32audMonoEstPrec2014.png

Figure 29. Establishment precision averaged over each piece/movement. Establishment precision answers the following question. On average, how similar is the most similar ground-truth pattern prototype to an algorithm-output pattern?


33audMonoEstF12014.png

Figure 30. Establishment F1 averaged over each piece/movement. Establishment F1 is an average of establishment precision and establishment recall.


34audMonoOccRecP752014.png

Figure 31. Occurrence recall (c = .75) averaged over each piece/movement. Occurrence recall answers the following question. On average, how similar is the most similar set of algorithm-output pattern occurrences to a discovered ground-truth occurrence set?


35audMonoOccPrecP752014.png

Figure 32. Occurrence precision (c = .75) averaged over each piece/movement. Occurrence precision answers the following question. On average, how similar is the most similar discovered ground-truth occurrence set to a set of algorithm-output pattern occurrences?


36audMonoOccF1P752014.png

Figure 33. Occurrence F1 (c = .75) averaged over each piece/movement. Occurrence F1 is an average of occurrence precision and occurrence recall.


37audMonoR32014.png

Figure 34. Three-layer recall averaged over each piece/movement. Rather than using |P ∩ Q| / max{|P|, |Q|} as a similarity measure (which is the default for establishment recall), three-layer recall uses 2|P ∩ Q| / (|P| + |Q|), which is a kind of F1 measure.


38audMonoP32014.png

Figure 35. Three-layer precision averaged over each piece/movement. Rather than using |P ∩ Q| / max{|P|, |Q|} as a similarity measure (which is the default for establishment precision), three-layer precision uses 2|P ∩ Q| / (|P| + |Q|), which is a kind of F1 measure.


39audMonoTLF2014.png

Figure 36. Three-layer F1 (TLF) averaged over each piece/movement. TLF is an average of three-layer precision and three-layer recall.


40audMonoRuntime2014.png

Figure 37. Log runtime of the algorithm for each piece/movement.

audPoly

21audPolyEstRecPerPatt2014.png

Figure 38. Establishment recall on a per-pattern basis. Establishment recall answers the following question. On average, how similar is the most similar algorithm-output pattern to a ground-truth pattern prototype?


24audPolyOccRecPerPatt2014.png

Figure 39. Occurrence recall on a per-pattern basis. Occurrence recall answers the following question. On average, how similar is the most similar set of algorithm-output pattern occurrences to a discovered ground-truth occurrence set?


21audPolyEstRec2014.png

Figure 40. Establishment recall averaged over each piece/movement. Establishment recall answers the following question. On average, how similar is the most similar algorithm-output pattern to a ground-truth pattern prototype?


22audPolyEstPrec2014.png

Figure 41. Establishment precision averaged over each piece/movement. Establishment precision answers the following question. On average, how similar is the most similar ground-truth pattern prototype to an algorithm-output pattern?


23audPolyEstF12014.png

Figure 42. Establishment F1 averaged over each piece/movement. Establishment F1 is an average of establishment precision and establishment recall.


24audPolyOccRecP752014.png

Figure 43. Occurrence recall (c = .75) averaged over each piece/movement. Occurrence recall answers the following question. On average, how similar is the most similar set of algorithm-output pattern occurrences to a discovered ground-truth occurrence set?


25audPolyOccPrecP752014.png

Figure 44. Occurrence precision (c = .75) averaged over each piece/movement. Occurrence precision answers the following question. On average, how similar is the most similar discovered ground-truth occurrence set to a set of algorithm-output pattern occurrences?


26audPolyOccF1P752014.png

Figure 45. Occurrence F1 (c = .75) averaged over each piece/movement. Occurrence F1 is an average of occurrence precision and occurrence recall.


27audPolyR32014.png

Figure 46. Three-layer recall averaged over each piece/movement. Rather than using |P ∩ Q| / max{|P|, |Q|} as a similarity measure (which is the default for establishment recall), three-layer recall uses 2|P ∩ Q| / (|P| + |Q|), which is a kind of F1 measure.


28audPolyP32014.png

Figure 47. Three-layer precision averaged over each piece/movement. Rather than using |P ∩ Q| / max{|P|, |Q|} as a similarity measure (which is the default for establishment precision), three-layer precision uses 2|P ∩ Q| / (|P| + |Q|), which is a kind of F1 measure.


29audPolyTLF2014.png

Figure 48. Three-layer F1 (TLF) averaged over each piece/movement. TLF is an average of three-layer precision and three-layer recall.


30audPolyRuntime2014.png

Figure 49. Log runtime of the algorithm for each piece/movement.

Discussion

If an occurrence of a ground-truth pattern contains forty or more notes then, according to Fig. 2, it is likely that SIATECSegment (DM10, Meredith, 2013) and motives_poly (NF2, Nieto & Farbood, 2013) will return a pattern rated as at least 75% similar. When we restrict attention to these successful discoveries and ask to what extent the algorithms can retrieve all exact and inexact occurrences, we find that SIATECSegment performs relatively well (see nonzero entries for the black line in Fig. 3), with the exception of the first patterns in pieces 1 and 2. We conclude, therefore, that the discovery of repeated sections has been addressed well by the current submissions, but that the discovery of themes and motifs requires more attention in future iterations of this task.

When assembling the ground truth, we noticed that a motif most often occurs as a subset of a theme or repeated section, which is unsurprising given Drabkin’s (2001) definition of a motif as ‘the shortest subdivision of a theme or phrase that still maintains its identity as an idea’. One suggestion for future work is to apply a discovery algorithm to find repeated sections, and then to apply the algorithm again to the output sections only, in order to retrieve these nested and musically important motifs.
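
The two-pass suggestion can be sketched as follows. The discovery step here is a deliberately naive stand-in (it groups notes by the translation vector that maps them onto another note of the same set) and is not any of the submitted algorithms; `toy_discover`, `nested_discover`, and the example notes are all hypothetical.

```python
from collections import defaultdict

def toy_discover(points, min_size=2):
    """Naive stand-in for a discovery algorithm.

    For every translation vector between two notes, collect the notes that
    the vector maps onto another note, and keep groups of at least min_size.
    """
    points = sorted(set(points))
    by_vector = defaultdict(set)
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            by_vector[(x2 - x1, y2 - y1)].add((x1, y1))
    patterns = {frozenset(p) for p in by_vector.values() if len(p) >= min_size}
    # Largest patterns first; ties broken by note content for a stable order.
    return sorted(patterns, key=lambda p: (-len(p), sorted(p)))

def nested_discover(points, discover=toy_discover):
    """Pass 1 over the whole piece, pass 2 inside each discovered section."""
    nested = []
    for section in discover(points):
        # Keep only proper subsets, i.e. motifs nested within the section.
        motifs = [m for m in discover(section) if m < section]
        nested.append((section, motifs))
    return nested

# Toy piece (hypothetical): a two-note motif stated twice forms a four-note
# section, and the whole section is repeated eight beats later.
piece = [(0, 60), (1, 62), (2, 60), (3, 62),
         (8, 60), (9, 62), (10, 60), (11, 62)]
for section, motifs in nested_discover(piece):
    print(sorted(section), [sorted(m) for m in motifs])
```

Passing the discovery step in as a parameter means a real algorithm could be substituted for the toy one without changing the two-pass logic.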

Tabular Versions of Plots

symMono

<csv p=3>2014/drts/symMonoResults2014.csv</csv>

Table 2. Tabular version of Figures 4-13.


<csv p=3>2014/drts/symMonoResultsPerPatt2014.csv</csv>

Table 3. Tabular version of Figures 2 and 3.

symPoly

<csv p=3>2014/drts/symPolyResults2014.csv</csv>

Table 4. Tabular version of Figures 16-25.


<csv p=3>2014/drts/symPolyResultsPerPatt2014.csv</csv>

Table 5. Tabular version of Figures 14 and 15.


audMono

<csv p=3>2014/drts/audMonoResults2014.csv</csv>

Table 6. Tabular version of Figures 28-37.


<csv p=3>2014/drts/audMonoResultsPerPatt2014.csv</csv>

Table 7. Tabular version of Figures 26 and 27.


audPoly

<csv p=3>2014/drts/audPolyResults2014.csv</csv>

Table 8. Tabular version of Figures 40-49.


<csv p=3>2013/drts/audPolyResultsPerPatt.csv</csv>

Table 9. Tabular version of Figures 38 and 39.

References

  • Harold Barlow and Sam Morgenstern. (1948). A dictionary of musical themes. Crown Publishers, New York.
  • Siglind Bruhn. (1993). J.S. Bach's Well-Tempered Clavier: in-depth analysis and interpretation. Mainer International, Hong Kong.
  • William Drabkin. (2001). Motif. In S. Sadie and J. Tyrrell (Eds), The new Grove dictionary of music and musicians. Macmillan, London, UK, 2nd ed.
  • David Meredith. (2013). COSIATEC and SIATECCompress: Pattern discovery by geometric compression. 9th Annual Music Information Retrieval eXchange (MIREX'13), Curitiba, Brazil.
  • Oriol Nieto and Morwaread Farbood. (2014a). Submission to MIREX discovery of repeated themes and sections. 10th Annual Music Information Retrieval eXchange (MIREX'14), Taipei, Taiwan.
  • Oriol Nieto and Morwaread Farbood. (2014b). Identifying polyphonic musical patterns from audio recordings using music segmentation techniques. In Proc. ISMIR, Taipei, Taiwan.
  • Oriol Nieto and Morwaread Farbood. (2013). Discovering musical patterns using audio structural segmentation techniques. 9th Annual Music Information Retrieval eXchange (MIREX'13), Curitiba, Brazil.
  • Arnold Schoenberg. (1967). Fundamentals of Musical Composition. Faber and Faber, London.
  • Ron J. Weiss and Juan Pablo Bello. (2010). Identifying repeated patterns in music using sparse convolutive non-negative matrix factorization. In Proc. ISMIR (pp. 123-128), Utrecht, The Netherlands.