petuum/adaptdl

Resource-adaptive cluster scheduler for deep learning training.

Score: 58 / 100 (Established)

This project helps deep learning engineers and machine learning researchers train their models efficiently in shared cloud or on-premise computing environments. It takes your PyTorch training code and resource requirements as input, then automatically adjusts batch sizes, learning rates, and resource allocation. The result is faster, more cost-effective training and better utilization of your computing infrastructure.

453 stars. Used by 1 other package. No commits in the last 6 months. Available on PyPI.

Use this if you are training deep learning models on shared clusters or in the cloud and want to optimize resource usage and training speed without manual tuning.

Not ideal if you are only running single-node deep learning training jobs or do not need to manage shared resources efficiently.

deep-learning-operations MLOps resource-management cloud-cost-optimization model-training
Stale: no commits in the last 6 months
Maintenance: 0 / 25
Adoption: 11 / 25
Maturity: 25 / 25
Community: 22 / 25


Stars: 453
Forks: 81
Language: Python
License: Apache-2.0
Last pushed: Mar 05, 2023
Commits (30d): 0
Dependencies: 6
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/petuum/adaptdl"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
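A minimal sketch of consuming the endpoint above from Python. The JSON schema here is an assumption modeled on the fields shown on this page (overall score plus the four 25-point categories); check a live response before relying on these key names.

```python
import json

# Hypothetical response shape, mirroring the numbers on this page.
# The real API may use different field names -- verify before use.
SAMPLE_RESPONSE = json.loads("""
{
  "score": 58,
  "breakdown": {"maintenance": 0, "adoption": 11, "maturity": 25, "community": 22}
}
""")

def summarize(payload: dict) -> str:
    """Format the overall score and its per-category breakdown as one line."""
    parts = ", ".join(f"{name} {pts}/25" for name, pts in payload["breakdown"].items())
    return f"{payload['score']}/100 ({parts})"

# To hit the live endpoint instead (network required):
# import urllib.request
# url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/petuum/adaptdl"
# payload = json.loads(urllib.request.urlopen(url).read())

print(summarize(SAMPLE_RESPONSE))
```

The four category scores sum to the overall score (0 + 11 + 25 + 22 = 58), so a consumer can sanity-check a response the same way.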