alibaba/EasyParallelLibrary

Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

/ 100

Emerging

Training large-scale deep learning models demands substantial compute and memory. This library helps deep learning engineers train bigger, more complex models across multiple GPUs. You provide your existing model code, and the library optimizes how it runs across your hardware, so larger models train faster and with less memory.

271 stars. No commits in the last 6 months.

Use this if you are a deep learning engineer struggling to train very large models due to memory constraints or slow training times, and you want to leverage distributed computing without extensive manual setup.

Not ideal if you are working with small models that don't require distributed training, or if you prefer to manually manage all aspects of your parallel training setup.

deep-learning model-training machine-learning-engineering distributed-computing gpu-optimization

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 21 / 25


Stars 271

Forks

Language Python

License Apache-2.0

Higher-rated alternatives

deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference...

helmholtz-analytics/heat

Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python

hpcaitech/ColossalAI

Making large AI models cheaper, faster and more accessible

horovod/horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

bsc-wdc/dislib

The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.
