Natooz/MidiTok

MIDI / symbolic music tokenizers for Deep Learning models 🎶

68 / 100 (Established)

This tool helps music researchers and AI developers prepare symbolic music data for machine learning models. It takes MIDI or ABC music files as input and converts them into sequences of tokens, the discrete symbols (mapped to integer IDs) that models consume. The resulting token sequences are ready for training on tasks like music generation, transcription, or analysis.
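To make the idea of converting notes into tokens concrete, here is a minimal, self-contained sketch in plain Python. It does not use MidiTok's API or reproduce any of its tokenization schemes: the `Note` type, the token names (`TimeShift_*`, `Pitch_*`, `Velocity_*`, `Duration_*`), and both helper functions are hypothetical, chosen only to illustrate how note events might be flattened into a token sequence and then mapped to integer IDs.

```python
from typing import NamedTuple


class Note(NamedTuple):
    """Hypothetical note record; not a MidiTok type."""
    start: int     # onset time in ticks
    pitch: int     # MIDI pitch number, 0-127
    velocity: int  # MIDI velocity, 1-127
    duration: int  # length in ticks


def notes_to_tokens(notes: list[Note]) -> list[str]:
    """Flatten notes into string tokens, ordered by onset and then pitch."""
    tokens: list[str] = []
    previous_start = 0
    for note in sorted(notes, key=lambda n: (n.start, n.pitch)):
        shift = note.start - previous_start
        if shift > 0:
            # Emit a time token only when the onset moves forward.
            tokens.append(f"TimeShift_{shift}")
        tokens += [
            f"Pitch_{note.pitch}",
            f"Velocity_{note.velocity}",
            f"Duration_{note.duration}",
        ]
        previous_start = note.start
    return tokens


def build_vocab(tokens: list[str]) -> dict[str, int]:
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab: dict[str, int] = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


# A two-note chord followed by a single later note.
notes = [Note(0, 60, 90, 480), Note(0, 64, 90, 480), Note(480, 67, 80, 240)]
tokens = notes_to_tokens(notes)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
```

Here `tokens` holds ten string tokens and `ids` holds the matching integers, which is the form a sequence model would train on. A real tokenizer would also handle details this sketch ignores, such as multiple tracks, tempo changes, and time quantization.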

855 stars. Actively maintained with 4 commits in the last 30 days. Available on PyPI.

Use this if you need to transform MIDI or ABC music files into a tokenized format suitable for training deep learning models for music AI tasks.

Not ideal if you are a musician looking for a digital audio workstation or a tool for direct music composition and production.

music-information-retrieval music-generation symbolic-music deep-learning-for-music computational-musicology
Maintenance 13 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 20 / 25

Stars: 855
Forks: 98
Language: Python
License: MIT
Last pushed: Mar 02, 2026
Commits (30d): 4
Dependencies: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Natooz/MidiTok"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
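The curl request above can also be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that other projects follow the same `category/owner/repo` path layout as the one URL shown, and that the endpoint returns JSON. The function names `quality_url` and `fetch_quality` are illustrative.

```python
import json
import urllib.request

# Base path taken from the curl example; the layout beneath it is assumed.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a project (assumed path layout)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch quality data for a project, assuming the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("ml-frameworks", "Natooz", "MidiTok")` would request the same URL as the curl command. With the keyless limit of 100 requests/day, it may be worth caching responses locally rather than re-fetching on every run.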