NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

Score: 73/100 · Verified

This project offers a GPU-optimized toolkit for building and training very large language models. It provides modular components and parallelism strategies (tensor, pipeline, and data parallelism) to scale training efficiently across many GPUs, and can serve as the foundation for custom large-scale training frameworks in AI research and development.

15,633 stars. Actively maintained with 205 commits in the last 30 days.

Use this if you are an ML engineer or researcher building custom training pipelines for large transformer models and need to scale efficiently across many GPUs.

Not ideal if you are looking for a pre-trained model to use directly or a simple library for smaller, single-GPU model training.

large-language-models distributed-training AI-research transformer-architectures GPU-optimization
No package published · no dependents
Maintenance: 22/25
Adoption: 10/25
Maturity: 16/25
Community: 25/25
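The four subscores add up to the overall score shown above: 22 + 10 + 16 + 25 = 73 out of 100.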


Stars: 15,633
Forks: 3,689
Language: Python
License:
Last pushed: Mar 13, 2026
Commits (30d): 205

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/NVIDIA/Megatron-LM"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
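The same endpoint can also be fetched from a script. A minimal Python sketch, assuming the third-party requests package is installed and that the endpoint returns JSON (the response schema is not documented on this page, so the result is printed as-is):

import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/NVIDIA/Megatron-LM"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # fail loudly on HTTP errors (presumably including a rate-limit response once the daily quota is hit)
data = resp.json()       # parsed quality data; field names are not documented here
print(data)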