2025:Symbolic Music Generation

From MIREX Wiki
Revision as of 01:35, 9 July 2025

Description

Symbolic music generation covers a wide range of tasks and settings, including varying types of control, generation objectives (e.g., continuation, inpainting), and representations (e.g., score, performance, single- or multi-track). In MIREX, we narrow this scope each year to focus on a specific subtask.

For this year’s challenge, the selected task is Piano Music Continuation. Given a 4-measure piano prompt (plus an optional pickup measure), the goal is to generate a 12-measure continuation that is musically coherent with the prompt, forming a complete 16-measure piece. All music is assumed to be in 4/4 time and quantized to sixteenth-note resolution. The continuation should match the style of the prompt, which may vary across classical, pop, jazz, or other existing styles. Further details are provided in the following sections.
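The timeline these numbers imply can be computed directly. Below is a minimal sketch; the constant names are ours for illustration and are not part of the task specification:

```python
# Timeline arithmetic under the task's stated assumptions:
# 4/4 meter, quantized to sixteenth-note resolution.
STEPS_PER_MEASURE = 4 * 4          # 4 beats per measure, 4 sixteenths per beat

PICKUP_MEASURES = 1                # always present in the data, may be left blank
PROMPT_MEASURES = 4
GENERATION_MEASURES = 12

prompt_steps = (PICKUP_MEASURES + PROMPT_MEASURES) * STEPS_PER_MEASURE
total_steps = prompt_steps + GENERATION_MEASURES * STEPS_PER_MEASURE

# Prompt occupies time steps 0..prompt_steps-1; the generation fills the rest.
print(prompt_steps, total_steps)   # 80 272
```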

Data Format

Both the input prompt and the output generation should be stored in JSON format. Specifically, music is represented as a list of notes, each of which has start, pitch, and duration attributes.

The prompt is stored under the key prompt and lasts 5 measures (the first measure is the pickup measure). Below is an example prompt:

{
  "prompt": [
    {
      "start": 16,
      "pitch": 72,
      "duration": 6
    },
    {
      "start": 16,
      "pitch": 57,
      "duration": 14
    },
    ...
  ]
}
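A prompt file in this format can be read with Python's standard json module. The helper below is only a sketch; the function name is ours, not part of the task interface:

```python
import json

def load_prompt_notes(json_text):
    """Parse a prompt document and return its note list."""
    return json.loads(json_text)["prompt"]

# In-memory example document (a real submission would read it from a file).
doc = '{"prompt": [{"start": 16, "pitch": 72, "duration": 6}]}'
notes = load_prompt_notes(doc)

for note in notes:
    measure = note["start"] // 16   # sixteenth steps -> measure index
    print(note["pitch"], measure)   # measure 0 is the pickup measure
```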

The generation is stored under the key generation and lasts 12 measures. Below is an example generation:

{
  "generation": [
    {
      "start": 80,
      "pitch": 40,
      "duration": 4
    },
    {
      "start": 80,
      "pitch": 40,
      "duration": 4
    },
    ...
  ]
}
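Symmetrically, a generation file can be written with json.dump. A minimal sketch, with made-up note values and a hypothetical file name:

```python
import json

# Illustrative continuation notes; real values come from the generation model.
generation = [
    {"start": 80, "pitch": 60, "duration": 8},   # first step of the continuation
    {"start": 88, "pitch": 64, "duration": 8},
]

with open("generation.json", "w") as f:
    json.dump({"generation": generation}, f, indent=2)

# Round-trip check: the written file parses back to the same note list.
with open("generation.json") as f:
    assert json.load(f)["generation"] == generation
```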

In the above examples, the start and duration attributes are counted in sixteenth notes. Since the data is assumed to be in 4/4 meter and quantized to sixteenth-note resolution, the start of a prompt note must lie in the range 0-79 (steps 0-15 form the pickup measure), and the start of a generation note must lie in the range 80-271. The pitch attribute of a note must be an integer from 0 to 127, corresponding to MIDI pitch numbers.
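These range constraints are easy to check mechanically before submission. The validator below is a sketch; the function name is ours, not part of the submission interface:

```python
def validate_notes(notes, start_lo, start_hi):
    """Check each note against the ranges described above:
    start in [start_lo, start_hi], pitch in [0, 127],
    and a positive integer duration (all counted in sixteenth notes)."""
    for n in notes:
        assert isinstance(n["start"], int) and start_lo <= n["start"] <= start_hi
        assert isinstance(n["pitch"], int) and 0 <= n["pitch"] <= 127
        assert isinstance(n["duration"], int) and n["duration"] > 0

prompt = [{"start": 16, "pitch": 72, "duration": 6}]
generation = [{"start": 80, "pitch": 40, "duration": 4}]
validate_notes(prompt, 0, 79)        # prompt note starts: 0-79
validate_notes(generation, 80, 271)  # generation note starts: 80-271
```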

Data Example

Below is an example of an input prompt in the format given above. The prompt is the melody of the first phrase of Hey Jude by The Beatles.

{
  "prompt": [
    {"start": 12, "pitch": 72, "duration": 4},
    {"start": 16, "pitch": 69, "duration": 8},
    ...
  ]
}


This is an example of the generated continuation. Note that the generation starts at time step 80, immediately after the five prompt measures.

{
  "generation": [
    {"start": 80, "pitch": 41, "duration": 12},
    {"start": 80, "pitch": 65, "duration": 5},
    ...
  ]
}
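For inspection or objective measurements, a note list in this format can be rasterized into a piano roll. A stdlib-only sketch under the timeline described above (272 sixteenth steps, MIDI pitches 0-127); the function name is ours:

```python
def to_piano_roll(notes, n_steps=272, n_pitches=128):
    """Rasterize a note list into a boolean grid:
    roll[t][p] is True if pitch p sounds at sixteenth step t."""
    roll = [[False] * n_pitches for _ in range(n_steps)]
    for n in notes:
        end = min(n["start"] + n["duration"], n_steps)  # clip at the piece end
        for t in range(n["start"], end):
            roll[t][n["pitch"]] = True
    return roll

roll = to_piano_roll([{"start": 80, "pitch": 41, "duration": 12}])
print(sum(row[41] for row in roll))  # 12: the note sounds for 12 steps
```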


Evaluation and Competition Format

We will evaluate the submitted algorithms through an online subjective double-blind test. The evaluation format differs from conventional tasks in the following aspects:

  • We use a "potluck" test set. Before submitting the algorithm, each team is required to submit two prompts. The organizer team will supplement the prompts if necessary.
  • There will be no live ranking because the subjective test will be done after the algorithm submission deadline.
  • To better handle randomness in the generation algorithm, we allow cherry-picking from a fixed number of generated samples.
  • We hope to compute some objective measurements as well, but these will only be reported as a reference.

Subjective Evaluation Format

  • After each team submits the algorithm, the organizer team will use the algorithm to generate 16 continuations for each test sample. The generated results will be returned to each team for cherry-picking.
  • Only a subset of the test set will be used for subjective evaluation.
  • In the subjective evaluation, subjects will first listen to the prompt and then to the generated samples, presented in randomized order.
  • Subjects will be asked to rate each continuation on the following criteria:
    • Coherency (5-point scale)
    • Creativity (5-point scale)
    • Structuredness (5-point scale)
    • Overall musicality (5-point scale)


Important Dates (Tentative)

  • Aug 7, 2025: Submit two prompts as a part of the test set.
  • Aug 15, 2025: Submit the main algorithm.
  • Aug 20, 2025: Return the generated samples. The cherry-picking phase begins.
  • Aug 24, 2025: Submit the cherry-picked sample ids.
  • Aug 30 - Sep 5, 2025: Online subjective evaluation.
  • Sep 6, 2025: Announce the final result.


Submission

As a generative task with subjective evaluation, the submission process differs greatly from other MIREX tasks. There are four important stages:

  1. Test set submission
  2. Algorithm submission
  3. Cherry-picked sample IDs submission
  4. Evaluation form submission

Please check the Important Dates section for the detailed schedule. Failure to participate in any of the stages will result in disqualification.


Algorithm Submission

To be announced later.


Baselines

To be announced later.


Contacts

If you have any questions or suggestions about the task, please contact:

  • Ziyu Wang: ziyu.wang<at>nyu.edu
  • Jingwei Zhao: jzhao<at>u.nus.edu