rentainhe/pytorch-distributed-training

Simple tutorials on PyTorch DDP training

/ 100

Emerging

Training large neural networks can be slow. This project provides code examples and tutorials that speed up model training by distributing the workload across multiple GPUs on a single machine using PyTorch DistributedDataParallel (DDP). It helps machine learning engineers and researchers shorten their deep learning experiments: the input is a single PyTorch model and dataset, and the result is a faster multi-GPU training workflow.
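To make "distributing the workload across multiple GPUs" concrete, here is a minimal pure-Python sketch of the data-parallel idea behind DDP. It is not code from the repository and it does not call PyTorch: the helper names (`shard_indices`, `local_gradient`, `all_reduce_mean`) and the toy model y = w * x are assumptions chosen for illustration. Each simulated process (rank) takes a strided shard of the dataset, similar to what `DistributedSampler` does without shuffling, computes a gradient on its shard, and the gradients are then averaged, which is the role the all-reduce step plays in DDP.

```python
def shard_indices(dataset_len, world_size, rank):
    """Strided split of sample indices, one shard per process (hypothetical helper)."""
    indices = list(range(dataset_len))
    remainder = dataset_len % world_size
    if remainder:
        # Pad by wrapping around so every rank gets the same number of samples.
        indices += indices[: world_size - remainder]
    return indices[rank::world_size]


def local_gradient(w, xs, ys):
    """Gradient of mean squared error w.r.t. w for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def all_reduce_mean(values):
    """Stand-in for the gradient all-reduce: every rank ends up with the mean."""
    return sum(values) / len(values)


# Toy dataset: y = 3x, eight samples, four simulated GPUs.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3 * x for x in xs]
world_size = 4
w = 0.0

# Each rank computes a gradient on its own shard only.
per_rank = []
for rank in range(world_size):
    idx = shard_indices(len(xs), world_size, rank)
    per_rank.append(local_gradient(w, [xs[i] for i in idx], [ys[i] for i in idx]))

averaged = all_reduce_mean(per_rank)   # what every rank applies after the all-reduce
full = local_gradient(w, xs, ys)       # single-process gradient on the whole batch

print(per_rank, averaged, full)
```

With equally sized shards, the averaged gradient matches the gradient a single process would compute on the full batch, which is why data-parallel training can divide the per-step work across GPUs without changing the update. In real PyTorch code this bookkeeping is handled by `torch.nn.parallel.DistributedDataParallel` together with `torch.utils.data.distributed.DistributedSampler`.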

286 stars. No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher struggling with long training times for your deep learning models on a single machine with multiple GPUs.

Not ideal if you are looking for distributed training across multiple machines or if you are not using PyTorch for your deep learning projects.

deep-learning model-training GPU-acceleration neural-networks performance-optimization

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 21 / 25

Stars 286

Forks

Language Python

License none

Higher-rated alternatives

deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference...

helmholtz-analytics/heat

Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python

hpcaitech/ColossalAI

Making large AI models cheaper, faster and more accessible

horovod/horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

bsc-wdc/dislib

The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.
