liangyuwang/Tiny-Megatron

Tiny-Megatron, a minimalistic re-implementation of the Megatron library

Score: 35/100 (Emerging)

This project helps machine learning engineers and researchers understand and implement distributed training strategies for large language models. It takes a PyTorch model and an HPC cluster configuration as input, and outputs a functionally identical model that can be trained efficiently across multiple GPUs or nodes. It's designed for those learning how to scale deep learning models for faster training or to fit larger models into memory.
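
The wrap-and-train flow described above is easiest to see with plain PyTorch data parallelism. The sketch below uses vanilla DistributedDataParallel rather than Tiny-Megatron's own API (which is not documented on this page); launch it with torchrun --nproc_per_node=<num_gpus>.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE in the environment.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).cuda(local_rank)
    # Wrapping keeps the model functionally identical to the single-GPU
    # version; gradients are all-reduced across ranks during backward().
    ddp_model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(ddp_model.parameters(), lr=1e-4)

    x = torch.randn(8, 1024, device=local_rank)
    optimizer.zero_grad()
    loss = ddp_model(x).square().mean()
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()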

No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher looking to learn about or implement tensor, data, or 2D hybrid parallelism strategies for training large language models in PyTorch.
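
To make the tensor-parallelism part concrete, here is a toy Megatron-style column-parallel linear layer in plain PyTorch. It is an illustrative sketch, not Tiny-Megatron's code: it assumes torch.distributed is already initialized, and a real implementation would wrap the collective in a custom autograd function so gradients flow through it.

import torch
import torch.distributed as dist

class ColumnParallelLinear(torch.nn.Module):
    """Each rank holds one shard of the output columns of the weight."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        world_size = dist.get_world_size()
        assert out_features % world_size == 0
        self.local_out = out_features // world_size
        # Only this rank's slice of the full (out_features, in_features) weight.
        self.weight = torch.nn.Parameter(torch.empty(self.local_out, in_features))
        torch.nn.init.kaiming_uniform_(self.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Local partial result with shape (..., local_out).
        local_y = x @ self.weight.t()
        # Gather every rank's shard to reassemble the full output.
        # Note: dist.all_gather is not autograd-aware; forward-only sketch.
        shards = [torch.empty_like(local_y) for _ in range(dist.get_world_size())]
        dist.all_gather(shards, local_y)
        return torch.cat(shards, dim=-1)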

Not ideal if you need a production-ready library with advanced features like pipeline parallelism or optimizer state sharding, or if you are not comfortable with PyTorch and distributed training concepts.

distributed-deep-learning large-language-models model-training pytorch hpc-cluster-management
Stale (6m) · No package · No dependents

Maintenance: 2/25
Adoption: 6/25
Maturity: 16/25
Community: 11/25

Stars: 23
Forks: 3
Language: Python
License: Apache-2.0
Last pushed: Sep 01, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/liangyuwang/Tiny-Megatron"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
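
The same endpoint can also be scripted; a minimal stdlib-only Python sketch is below. It assumes the endpoint returns JSON (the response schema is not documented on this page).

import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "llm-tools/liangyuwang/Tiny-Megatron")

with urllib.request.urlopen(URL) as resp:
    # Assumption: the API responds with a JSON document.
    data = json.load(resp)

print(json.dumps(data, indent=2))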