BlackSamorez/tensor_parallel
Automatically split your PyTorch models on multiple GPUs for training & inference
This project helps machine learning practitioners and researchers run very large models, such as modern language models, across multiple GPUs. It takes an existing PyTorch model that may be too big for a single GPU and automatically shards its components, so that each GPU works on part of the model simultaneously. The result is faster training and inference for these massive models.
656 stars. Used by 1 other package. No commits in the last 6 months. Available on PyPI.
Use this if your PyTorch model is too large to fit on a single GPU or if you need to significantly speed up its training or inference using multiple GPUs.
Not ideal if you are working with smaller models that already fit comfortably on a single GPU or if you are conducting million-dollar-scale distributed training runs that require highly specialized and optimized infrastructure like DeepSpeed or Alpa.
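The core idea the library automates is tensor parallelism: each layer's weight matrix is split across devices, each device computes a partial result, and the partials are combined. Below is a minimal pure-Python sketch of the column-split variant of that idea (no GPUs or third-party packages; the helper names are illustrative, not the library's actual API).

```python
# Illustration of the column-parallel idea behind tensor parallelism:
# each "device" holds a slice of the weight matrix columns, computes its
# partial output, and the partial outputs are concatenated.

def matmul(x, w):
    # x: list of rows, w: list of rows -> standard matrix product
    cols = len(w[0])
    return [[sum(x[i][k] * w[k][j] for k in range(len(w)))
             for j in range(cols)] for i in range(len(x))]

def split_columns(w, shards):
    # Split the weight matrix column-wise into `shards` pieces,
    # mimicking what each GPU would hold under tensor parallelism.
    step = len(w[0]) // shards
    return [[row[s * step:(s + 1) * step] for row in w] for s in range(shards)]

x = [[1.0, 2.0]]                    # one input row
w = [[1.0, 2.0, 3.0, 4.0],          # 2x4 weight matrix
     [5.0, 6.0, 7.0, 8.0]]

full = matmul(x, w)                 # single-device result
parts = [matmul(x, shard) for shard in split_columns(w, shards=2)]
# Concatenate each device's partial output row-by-row.
merged = [sum(rows, []) for rows in zip(*parts)]
print(full == merged)  # True: sharded and unsharded paths agree
```

In the real library the splitting, communication, and gradient handling happen automatically when you wrap a model, so no manual sharding code like the above is needed.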
Stars: 656
Forks: 44
Language: Python
License: MIT
Category: ml-frameworks
Last pushed: Jan 02, 2024
Commits (30d): 0
Dependencies: 2
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/BlackSamorez/tensor_parallel"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
Related frameworks
pymc-devs/pytensor
PyTensor allows you to define, optimize, and efficiently evaluate mathematical expressions...
arogozhnikov/einops
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
lava-nc/lava-dl
Deep Learning library for Lava
tensorly/tensorly
TensorLy: Tensor Learning in Python.
tensorpack/tensorpack
A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility