JinjieNi/MegaDLMs

GPU-optimized framework for training diffusion language models at any scale. The training backend behind Quokka, Super Data Learners, and OpenMoE 2.

Quality score: 36 / 100 (Emerging)

This is a GPU-optimized framework for training large language models (LLMs), specifically diffusion language models (DLMs), at any scale. It takes raw text data, tokenizes it, and trains a language model end to end, producing a model ready for generation tasks. It is designed for AI researchers and engineers who build and train advanced generative AI models.
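To make the training objective concrete, below is a minimal, generic sketch of one masked-diffusion training step, the common recipe behind discrete diffusion language models. It is an illustration under assumptions, not MegaDLMs' actual API: the model, mask_id, and the simplified loss (which omits the usual 1/t reweighting) are placeholders.

import torch
import torch.nn.functional as F

def masked_diffusion_step(model, tokens, mask_id):
    # Generic masked-diffusion objective (illustrative only, not the
    # MegaDLMs implementation): sample a masking ratio t per sequence,
    # replace that fraction of tokens with a mask token, and train the
    # model to recover the originals at the masked positions.
    t = torch.rand(tokens.size(0), 1, device=tokens.device)
    mask = torch.rand(tokens.shape, device=tokens.device) < t
    noisy = torch.where(mask, torch.full_like(tokens, mask_id), tokens)
    logits = model(noisy)  # assumed output shape: (batch, seq_len, vocab)
    # Simplified loss over masked positions; the full diffusion
    # objective typically reweights each term by 1/t.
    return F.cross_entropy(logits[mask], tokens[mask])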


Use this if you are developing and training state-of-the-art diffusion or autoregressive language models and need a high-performance, scalable solution optimized for GPU clusters.

Not ideal if you are looking for an off-the-shelf model to use directly, or if you need a framework for training models other than large language models.

Topics: large-language-models, generative-ai, deep-learning-training, diffusion-models, model-scalability
No license · No package · No dependents
Maintenance: 6 / 25
Adoption: 10 / 25
Maturity: 5 / 25
Community: 15 / 25

How are scores calculated? Each of the four components is scored out of 25, and the overall score is their sum.
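As a quick check, a one-line sketch of that summation (assuming plain addition, which matches the numbers displayed above):

# Assumption: the overall score is the plain sum of the four components,
# consistent with the figures shown on this page.
components = {"Maintenance": 6, "Adoption": 10, "Maturity": 5, "Community": 15}
total = sum(components.values())  # 6 + 10 + 5 + 15 = 36, i.e. 36/100
print(total)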

Stars: 327
Forks: 30
Language: Python
License: None
Last pushed: Nov 11, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/JinjieNi/MegaDLMs"

Open to everyone: 100 requests/day with no key needed. Get a free API key for 1,000 requests/day.
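For use from Python, a minimal sketch with the requests library. The endpoint URL comes from the curl example above; the response schema is not documented on this page, so the JSON is treated as opaque and printed as-is.

import requests

# Same endpoint as the curl example above; the response fields are not
# documented here, so we just iterate over whatever the API returns.
url = "https://pt-edge.onrender.com/api/v1/quality/transformers/JinjieNi/MegaDLMs"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
for key, value in resp.json().items():
    print(f"{key}: {value}")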