saforem2/ezpz
Train across all your devices, ezpz 🍋
ezpz helps machine learning engineers and researchers run PyTorch models across different computing systems. You provide your existing PyTorch training script, and it handles the complexities of distributed execution, automatically adapting to the available hardware, such as GPUs, CPUs, or specialized AI accelerators. The result is a trained model, whether you ran it on a laptop or a supercomputer.
Use this if you need to run the same PyTorch training code across various hardware setups, from a single device to large clusters, without modifying your script for each environment.
Not ideal if you do not use PyTorch, or if you need fine-grained, manual control over every aspect of your distributed training setup.
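To make "adapting to the environment" concrete, here is a minimal sketch of how a launcher-agnostic script might work out its place in a distributed job. It is not ezpz's implementation or API. The names `ProcessInfo` and `detect_process_info` are hypothetical, and the sketch only reads environment variables commonly set by `torchrun` (`RANK`, `WORLD_SIZE`), Open MPI (`OMPI_COMM_WORLD_RANK`, `OMPI_COMM_WORLD_SIZE`), and PMI-based launchers (`PMI_RANK`, `PMI_SIZE`).

```python
import os
from typing import Mapping, NamedTuple, Optional


class ProcessInfo(NamedTuple):
    """Hypothetical container: this process's rank, the job size, and the launcher guess."""
    rank: int
    world_size: int
    launcher: str


# (launcher label, rank variable, world-size variable), checked in order.
_LAUNCHER_VARS = (
    ("torchrun", "RANK", "WORLD_SIZE"),
    ("openmpi", "OMPI_COMM_WORLD_RANK", "OMPI_COMM_WORLD_SIZE"),
    ("pmi", "PMI_RANK", "PMI_SIZE"),
)


def detect_process_info(env: Optional[Mapping[str, str]] = None) -> ProcessInfo:
    """Infer rank and world size from launcher environment variables.

    Falls back to a single-process setup (rank 0 of 1) when none are set,
    which is the laptop case.
    """
    env = os.environ if env is None else env
    for launcher, rank_var, size_var in _LAUNCHER_VARS:
        if rank_var in env and size_var in env:
            return ProcessInfo(int(env[rank_var]), int(env[size_var]), launcher)
    return ProcessInfo(0, 1, "single")
```

Calling `detect_process_info()` with no arguments reads the real environment. Passing a plain dictionary instead makes the logic easy to check without launching a distributed job.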
Stars: 26
Forks: 7
Language: Python
License: MIT
Last pushed: Mar 10, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/saforem2/ezpz"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
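The curl request above can also be issued from Python using only the standard library. This is a rough sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, the URL layout is inferred from the single example shown here, and the response format is not documented on this page, so the body is returned as plain text rather than parsed.

```python
import urllib.request

# Base path taken from the curl example; the category/owner/repo layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Send the GET request and return the raw response body as text."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For this repository the call would be `fetch_quality("ml-frameworks", "saforem2", "ezpz")`. Each call counts against the daily request limit.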
Higher-rated alternatives
deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference...
helmholtz-analytics/heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
horovod/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
bsc-wdc/dislib
The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.