2025:Symbolic Music Generation


Description

Symbolic music generation covers a wide range of tasks and settings, including varying types of control, generation objectives (e.g., continuation, inpainting), and representations (e.g., score, performance, single- or multi-track). In MIREX, we narrow this scope each year to focus on a specific subtask.

For this year’s challenge, the selected task is Piano Music Continuation. Given a 4-measure piano prompt (plus an optional pickup measure), the goal is to generate a 12-measure continuation that is musically coherent with the prompt, forming a complete 16-measure piece. All music is assumed to be in 4/4 time and quantized to sixteenth-note resolution. The continuation should match the style of the prompt, which may be classical, pop, jazz, or another existing style. Further details are provided in the following sections.

Please refer to this repository to access the baseline method and to learn more about the submission format.

Data Format

Both the input prompt and the output generation should be stored in JSON format. Specifically, the music is represented as a list of notes, each of which has start, pitch, and duration attributes.

The prompt is stored under the key prompt and lasts 5 measures (the first measure is the pickup measure). Below is an example prompt:

{
  "prompt": [
    {
      "start": 16,
      "pitch": 72,
      "duration": 6
    },
    {
      "start": 16,
      "pitch": 57,
      "duration": 14
    },
    ...
  ]
}

The generation is stored under the key generation and lasts 12 measures. Below is an example generation:

{
  "generation": [
    {
      "start": 80,
      "pitch": 40,
      "duration": 4
    },
    {
      "start": 80,
      "pitch": 40,
      "duration": 4
    },
    ...
  ]
}

In the examples above, the start and duration attributes are counted in sixteenth notes. Since the data is assumed to be in 4/4 meter and quantized to sixteenth-note resolution, the start of a prompt note should range from 0 to 79 (0–15 is the pickup measure), and the start of a generation note should range from 80 to 271. The pitch attribute of a note should be an integer from 0 to 127, corresponding to a MIDI pitch number.
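
The snippet below is a minimal sketch, not part of the official tooling, of how a submission could sanity-check these ranges before sending results; the file names input.json and sample_01.json are placeholders.

# Illustrative sanity check for the prompt/generation JSON format described above.
# The file names are only examples; substitute your own paths.
import json

def check_notes(notes, start_lo, start_hi):
    """Verify that every note has integer fields within the expected ranges."""
    for note in notes:
        assert start_lo <= note["start"] <= start_hi, f"start out of range: {note}"
        assert 0 <= note["pitch"] <= 127, f"pitch out of range: {note}"
        assert note["duration"] >= 1, f"non-positive duration: {note}"

with open("input.json") as f:          # prompt: pickup + 4 measures -> starts 0-79
    check_notes(json.load(f)["prompt"], 0, 79)

with open("sample_01.json") as f:      # generation: 12 measures -> starts 80-271
    check_notes(json.load(f)["generation"], 80, 271)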

Evaluation and Competition Format

We will evaluate the submitted algorithms through an online subjective double-blind test. The evaluation format differs from conventional tasks in the following aspects:

  • We use a "potluck" test set. Before submitting the algorithm, each team is required to submit two prompts. The organizer team will supplement the prompts if necessary.
  • There will be no live ranking because the subjective test will be done after the algorithm submission deadline.
  • To better handle randomness in the generation algorithm, we allow cherry-picking from a fixed number of generated samples.
  • We welcome both challenge participants and non-participants to submit plans for objective evaluation. Evaluation methods may be incorporated as reference benchmarks and could inform the development of future evaluation metrics.

Subjective Evaluation Format

  • After each team submits the algorithm, the organizer team will use the algorithm to generate 8 continuations for each test sample. The generated results will be returned to each team for cherry-picking.
  • Only a subset of the test set will be used for subjective evaluation.
  • In the subjective evaluation, subjects will first listen to the prompt and then to the generated samples, presented in randomized order.
  • Subjects will be asked to rate each continuation on the following criteria:
      • Coherency with the prompt (5-point scale)
      • Creativity (5-point scale)
      • Structuredness (5-point scale)
      • Overall musicality (5-point scale)

Important Dates

  • Aug 15, 2025: Submit two prompts as a part of the test set.
  • Aug 21, 2025: Submit the main algorithm.
  • Aug 26, 2025: Return the generated samples. The cherry-picking phase begins.
  • Aug 28, 2025: Submit the cherry-picked sample ids.
  • Aug 30 - Sep 5, 2025: Online subjective evaluation.
  • Sep 6, 2025: Announce the final result.

Submission

As described in the Evaluation and Competition Format, there are four types of submissions. Below is a list of them:

Task                           | Submission Method                                                                           | Deadline
2 prompts for the test set     | Email JSON files to the organizers.                                                         | Aug 15, 2025
Algorithm                      | Email code / GitHub link / Docker image to the organizers (see Algorithm Submission below). | Aug 21, 2025
Cherry-picked IDs              | Email the selected sample IDs to the organizers.                                            | Aug 28, 2025
Evaluation metric (optional)   | Email the organizers.                                                                       | Aug 21, 2025


Algorithm Submission

Participants must include a generation.sh script in their submission. The task captain will invoke the script to generate the output files using the following interface (a minimal sketch of a compatible entry point is given after the list below):

./generation.sh "/path/to/input.json" "/path/to/output_folder" n_sample
  • Input File: path to the input .json file.
  • Output Folder: path to the folder where the generated output files will be saved.
  • n_sample: number of samples to generate.
  • The script should generate n_sample output files in the specified output folder.
  • Output files should be named sequentially as sample_01.json, sample_02.json, ..., up to sample_<n_sample>.json.
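
As a rough illustration of this interface, generation.sh could forward its three arguments to a Python entry point, e.g. python generate.py "$1" "$2" "$3". The skeleton below is a sketch under that assumption: the script name generate.py and the placeholder generate_continuation function are hypothetical; only the argument order and the sample_XX.json naming follow the required format.

# Hypothetical entry point that generation.sh might call.
# Only the argument handling and output naming follow the required interface;
# generate_continuation is a stand-in for your actual model.
import json
import os
import sys

def generate_continuation(prompt_notes):
    """Placeholder: return a list of notes with starts in 80-271 (12 measures)."""
    return [{"start": 80, "pitch": 60, "duration": 4}]  # replace with model output

def main():
    input_path, output_folder, n_sample = sys.argv[1], sys.argv[2], int(sys.argv[3])
    os.makedirs(output_folder, exist_ok=True)

    with open(input_path) as f:
        prompt_notes = json.load(f)["prompt"]

    # Write n_sample files named sample_01.json, sample_02.json, ...
    for i in range(1, n_sample + 1):
        out_path = os.path.join(output_folder, f"sample_{i:02d}.json")
        with open(out_path, "w") as f:
            json.dump({"generation": generate_continuation(prompt_notes)}, f, indent=2)

if __name__ == "__main__":
    main()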

Baseline

We provide a baseline algorithm in this repository, modified from the MuseCoco model (Lu, P., et al. 2023). Please also refer to the repository for details on the data format and generation protocol.

Contacts

If you have any questions or suggestions about the task, please contact:

  • Ziyu Wang: ziyu.wang<at>nyu.edu
  • Jingwei Zhao: jzhao<at>u.nus.edu