Shenggan/atp
Adaptive Tensor Parallelism for Foundation Models
This project helps machine learning engineers and researchers efficiently train and serve very large AI models, commonly called foundation models. Given an existing model architecture and training setup, it optimizes how computations are partitioned across multiple GPUs or machines using adaptive tensor parallelism. The result is shorter training times and more efficient inference for large models.
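The repository's own API is not shown on this page, so the following is only a minimal NumPy sketch of the underlying idea of tensor parallelism: a layer's weight matrix is split column-wise across devices (plain arrays stand in for GPUs here), each "device" computes a partial output, and the shards are gathered back together. Nothing here reflects atp's actual interface.

```python
import numpy as np

# Conceptual sketch of column-wise tensor parallelism (NOT the atp API):
# the weight matrix of a linear layer is sharded across hypothetical
# devices, each computes a partial result, and concatenation recovers
# the full, unsharded output.

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # batch of input activations
W = rng.standard_normal((8, 16))   # full weight matrix of a linear layer

# Shard the weights column-wise across two hypothetical devices.
W_shards = np.split(W, 2, axis=1)

# Each device multiplies the same input by its own shard independently.
partial_outputs = [x @ shard for shard in W_shards]

# Gathering (concatenating) the partials reproduces the unsharded matmul.
y_parallel = np.concatenate(partial_outputs, axis=1)
y_full = x @ W
assert np.allclose(y_parallel, y_full)
```

In a real system each shard lives on a different GPU and the gather is a collective communication step; the arithmetic equivalence shown above is what makes the split correct.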
No commits in the last 6 months.
Use this if you are working with extremely large AI models and need to reduce their training or inference time by optimizing how they utilize distributed hardware.
Not ideal if you are working with smaller models or do not have access to a distributed computing environment with multiple GPUs.
Stars: 9
Forks: —
Language: Python
License: MIT
Category:
Last pushed: Dec 15, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Shenggan/atp"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
Higher-rated alternatives
AliHaiderAhmad001/GPT-from-Scratch-with-Tensorflow
Implementation for "Improving Language Understanding by Generative Pre-Training" paper
HomebrewML/HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
akshat0123/GPT-1
Pytorch implementation of GPT-1
qiqiApink/MotionGPT
The official PyTorch implementation of the paper "MotionGPT: Finetuned LLMs are General-Purpose...
nawnoes/pytorch-gpt-x
An implementation of an autoregressive language model using an improved Transformer and...