Shenggan/awesome-distributed-ml

A curated list of awesome projects and papers for distributed training or inference

Score: 34 / 100 (Emerging)

This is a curated collection of resources for machine learning engineers and researchers who are working with extremely large AI models. It brings together open-source projects and research papers that focus on how to efficiently train and deploy these large models using distributed computing. If you're tackling the challenge of building or fine-tuning models that exceed the capacity of a single machine, this list provides tools and techniques to help you scale your efforts effectively.

266 stars. No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher designing, training, or deploying large-scale AI models, especially those like large language models or complex neural networks.

Not ideal if you are a data scientist or developer working with smaller models that can be handled on a single GPU or CPU, or if you are not involved in the low-level systems aspects of machine learning.

large-model-training distributed-machine-learning deep-learning-systems AI-infrastructure model-scaling
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 16 / 25


Stars: 266
Forks: 30
Language: (not listed)
License: (not listed)
Last pushed: Oct 08, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Shenggan/awesome-distributed-ml"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
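The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl command above, using only the standard library. Two things in it are assumptions, not documented behavior: that the URL path follows a category/owner/repo pattern (inferred from the single example URL), and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical. How an API key is passed is not stated on this page, so the sketch covers only the keyless request.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository without an API key.

    Assumption: the response body is JSON; the page does not say so.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the request.
    print(quality_url("ml-frameworks", "Shenggan", "awesome-distributed-ml"))
```

Calling `fetch_quality("ml-frameworks", "Shenggan", "awesome-distributed-ml")` performs the network request and counts against the 100 requests/day keyless limit, so cache the result if you poll more than occasionally.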