petuum/adaptdl

Resource-adaptive cluster scheduler for deep learning training.

Score: 58 / 100 (Established)

This project helps deep learning engineers and machine learning researchers train their models efficiently in shared cloud or on-premise computing environments. It takes your PyTorch training code and resource requirements as input, then automatically adjusts batch sizes, learning rates, and resource allocation. The result is faster, more cost-effective training and better utilization of your computing infrastructure.

453 stars. Used by 1 other package. No commits in the last 6 months. Available on PyPI.

Use this if you are training deep learning models on shared clusters or in the cloud and want to optimize resource usage and training speed without manual tuning.

Not ideal if you are only running single-node deep learning training jobs or do not need to manage shared resources efficiently.

deep-learning-operations MLOps resource-management cloud-cost-optimization model-training
Stale: no commits in the last 6 months
Maintenance: 0 / 25
Adoption: 11 / 25
Maturity: 25 / 25
Community: 22 / 25


Stars: 453
Forks: 81
Language: Python
License: Apache-2.0
Last pushed: Mar 05, 2023
Commits (30d): 0
Dependencies: 6
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/petuum/adaptdl"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
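A minimal sketch of consuming the endpoint above from Python. The JSON schema here is an assumption modeled on the fields shown on this page (overall score plus the four 25-point categories); check a live response before relying on these key names.

```python
import json

# Hypothetical response shape, mirroring the numbers on this page.
# The real API may use different field names -- verify before use.
SAMPLE_RESPONSE = json.loads("""
{
  "score": 58,
  "breakdown": {"maintenance": 0, "adoption": 11, "maturity": 25, "community": 22}
}
""")

def summarize(payload: dict) -> str:
    """Format the overall score and its per-category breakdown as one line."""
    parts = ", ".join(f"{name} {pts}/25" for name, pts in payload["breakdown"].items())
    return f"{payload['score']}/100 ({parts})"

# To hit the live endpoint instead (network required):
# import urllib.request
# url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/petuum/adaptdl"
# payload = json.loads(urllib.request.urlopen(url).read())

print(summarize(SAMPLE_RESPONSE))
```

The four category scores sum to the overall score (0 + 11 + 25 + 22 = 58), so a consumer can sanity-check a response the same way.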