2025 Automatic Music Transcription Challenge
Develop advanced models capable of transcribing synthesized classical music into MIDI files.
Summary
The 2025 Automatic Music Transcription (AMT) Challenge invites participants to develop computer programs capable of accurately transcribing synthesized audio recordings of classical music into MIDI files. Each submission will process 100 recordings, each up to 20 seconds long, within a maximum time limit of 4 hours (an average of 2.4 minutes per recording).
The audio data has been synthesized to sound as realistic as possible, closely resembling natural instrumental performances. Unlike previous challenges, participants will be informed of the specific instruments present in each recording. Incorrect instrument identification will incur a penalty, with smaller penalties applied if the mistake involves similar instrument families.
Evaluation criteria include the accuracy of instrument identification, pitch, onset, offset, and dynamics.
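The official scoring script is not reproduced here, but note-level transcription metrics of this kind are commonly computed with the mir_eval library; the sketch below illustrates pitch/onset/offset matching on two hypothetical notes, with the instrument and dynamics components layered on top by the organizers' own scoring.

import numpy as np
import mir_eval.transcription

# Two reference notes and two estimated notes: intervals are
# (onset, offset) in seconds; pitches are in Hz, as mir_eval expects.
ref_intervals = np.array([[0.00, 0.50], [0.50, 1.00]])
ref_pitches = np.array([440.00, 523.25])            # A4, C5
est_intervals = np.array([[0.01, 0.48], [0.52, 1.02]])
est_pitches = np.array([440.00, 523.25])

precision, recall, f1, overlap = mir_eval.transcription.precision_recall_f1_overlap(
    ref_intervals, ref_pitches, est_intervals, est_pitches,
    onset_tolerance=0.05)                           # 50 ms onset tolerance
print(f"note-level F1: {f1:.3f}")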
An Online Competition
This challenge is proudly sponsored by the IEEE Technical Community on Multimedia Computing (TCMC).
Technical Details
Participants will register on ai4musicians.org, where sample music files, including scores and audio recordings, will be provided to assist in model development. Contestants may use any public or proprietary data for training.
Submissions will be open in April 2025. Each team's program will be executed on a GPU-equipped system at Purdue's Rosen Center for Advanced Computing. Teams may submit models once every 24 hours, and a live leaderboard will display performance results. Final rankings will be determined using holdout data. Winning teams will be invited to present their solutions at the 2025 IEEE ICME Conference.
For questions and updates, join the AMT Slack Workspace →
Submission Details
Repository Access
During registration, participants must provide a link to their code repository along with a fine-grained access token so the competition backend can pull the model.
Submission Branch
Create a branch titled submission in your repository. The backend will automatically pull from this branch. Submissions are valid when:
- The submission branch exists
- New commits have been made since the last successful run
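For example, the branch can be created once and then updated for each new submission with standard Git commands:

git checkout -b submission
git push -u origin submission                 # first submission
git commit -am "Improve model" && git push    # subsequent submissions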
Environment Configuration
Include an environment.yml file in the root of your repository. This will be used to create a conda environment during execution.
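A minimal environment.yml might look like the sketch below; the package list is illustrative only and should match whatever your model actually imports:

name: amt-submission
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - torch
      - librosa
      - pretty_midi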
Model Execution Requirements
The repository must contain a main.py file in the root directory accepting these arguments:
- -i: Path to the input audio file (.mp3 format)
- -o: Path to save the output MIDI file
python main.py -i input.mp3 -o output.midi
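A minimal main.py skeleton satisfying this interface is sketched below; transcribe is a hypothetical placeholder for your own model code:

import argparse

def transcribe(audio_path: str, midi_path: str) -> None:
    # Hypothetical placeholder: load your model, run inference on the
    # input audio, and write the resulting MIDI file to midi_path.
    raise NotImplementedError

def main() -> None:
    parser = argparse.ArgumentParser(description="AMT Challenge entry point")
    parser.add_argument("-i", dest="input_path", required=True,
                        help="path to the input audio file (.mp3)")
    parser.add_argument("-o", dest="output_path", required=True,
                        help="path to save the output MIDI file")
    args = parser.parse_args()
    transcribe(args.input_path, args.output_path)

if __name__ == "__main__":
    main()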
Input File Naming Convention
Audio file names include MIDI instrument codes following the General MIDI standard. Example:
1._0_40_70.mp3
Here 0, 40, and 70 are the MIDI instrument codes present in the recording. Incorrect instrument identification incurs a scoring penalty.
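Assuming the pattern shown above generalizes (an index prefix followed by underscore-separated instrument codes before the .mp3 extension), the codes can be recovered in a few lines of Python:

from pathlib import Path

def instrument_codes(filename: str) -> list[int]:
    # "1._0_40_70.mp3" -> stem "1._0_40_70" -> codes [0, 40, 70]
    stem = Path(filename).stem           # strip the .mp3 extension
    parts = stem.split("_")              # ["1.", "0", "40", "70"]
    return [int(p) for p in parts[1:]]   # skip the leading index token

print(instrument_codes("1._0_40_70.mp3"))   # [0, 40, 70]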
Model Weights
Participants may use Git LFS (Large File Storage) for managing model weights. The backend fully supports Git LFS.
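For instance, a weights file (model.pt here is a hypothetical name) can be tracked with the standard Git LFS workflow:

git lfs install
git lfs track "*.pt"
git add .gitattributes model.pt
git commit -m "Add model weights via Git LFS"
git push origin submission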
Schedule
Cash Awards
Winners must open-source their solutions as specified in the registration agreement. Cash awards will only be provided to participants from countries not subject to U.S. embargoes or sanctions. In some cases, awards may take the form of travel grants covering conference registration, hotel, and airfare.
Sample Compositions
We are releasing 20 sample compositions featuring a diverse range of instrumental arrangements. Each composition is provided as an MP3 audio file along with its corresponding score in both MIDI and PDF formats.
Instruments included:
Sample Solution
We provide a reference implementation of the MT3 (Multi-Task Multitrack Music Transcription) model developed by Google's Magenta team. MT3 uses a Transformer-based architecture to process audio input and generate musical notation.
Note: To qualify as a winning submission, your model must achieve a transcription accuracy score higher than MT3.
MT3 Resources
- MT3 Competition-Ready Implementation →
- Original MT3 Repository by Google Magenta →
- MT3 Research Paper →
Additional Sample Implementations
Basic Pitch by Spotify — Robust pitch tracking and transcription:
ReconVAT — Semi-supervised music transcription using variational autoencoders:
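As a quick illustration of the Basic Pitch API mentioned above (per its public documentation, assuming the basic-pitch package is installed), a recording can be transcribed and written to MIDI in a few lines:

from basic_pitch.inference import predict

# predict returns the raw model output, a pretty_midi.PrettyMIDI
# object, and a list of note events.
model_output, midi_data, note_events = predict("input.mp3")
midi_data.write("output.midi")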
Organizers
Contributing Composers
For Contributing Composers
Composers retain the copyright of their works while granting royalty-free, non-exclusive rights to the challenge organizers for redistribution and analysis. Each invited composer is expected to contribute 10–30 pieces, each approximately 20 seconds long, divided into three difficulty levels: easy, medium, and difficult.
Composition Guidelines
- Tempo: 60–90 bpm
- Pitch Range: C2 to C7
- Smallest rhythmic duration: sixteenth notes/rests
- No swing rhythms; use precise notation
- No doubly-dotted notes; no trills or mordents
- Meters: 3/4, 4/4, 6/8
- Dynamic range: pp to ff
- Up to three distinct instruments per composition
- Submit files in PDF (score), MusicXML, and MIDI formats