nawnoes/pytorch-gpt-x
An implementation of an autoregressive language model using an improved Transformer and DeepSpeed pipeline parallelism.
This project helps machine learning researchers and engineers train large GPT-style language models on limited hardware. You provide text data, and it trains a roughly 1-billion-parameter model using techniques such as ReZero residual connections and DeepSpeed pipeline parallelism, enabling efficient training on just two V100 16GB GPUs. It targets individuals or small teams building advanced natural language processing capabilities.
Use this if you need to train a large autoregressive language model efficiently on a cluster with a small number of powerful GPUs.
Not ideal if you're looking for an off-the-shelf pre-trained model or if you don't have access to specialized GPU hardware.
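One of the techniques the repo advertises is ReZero, which replaces the standard residual connection x + F(x) with x + α·F(x), where α is a learned scalar initialized to zero so each block starts as the identity and training stabilizes without LayerNorm warmup tricks. A minimal sketch of the idea in PyTorch (the block shape and names here are illustrative assumptions, not the repo's actual module):

```python
import torch
import torch.nn as nn


class ReZeroBlock(nn.Module):
    """Feed-forward residual block gated by a learned scalar alpha.

    alpha is initialized to zero (the ReZero trick), so at initialization
    the block is an exact identity function; the network gradually learns
    how much of the sublayer's output to mix in.
    """

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        # Hypothetical sublayer; in a real GPT block this would be
        # self-attention or the MLP, not necessarily this exact shape.
        self.ff = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, dim),
        )
        # The ReZero gate: a single learnable scalar, starting at zero.
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x + alpha * F(x); with alpha == 0 this is just x.
        return x + self.alpha * self.ff(x)
```

Because α starts at zero, stacking many such blocks is safe even at ~1B-parameter depth; signal passes through untouched until each layer earns its contribution, which is what makes the technique attractive for training deep Transformers on a small GPU budget.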
Stars
30
Forks
3
Language
Python
License
—
Last pushed
Jan 12, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/nawnoes/pytorch-gpt-x"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
AliHaiderAhmad001/GPT-from-Scratch-with-Tensorflow
Implementation for "Improving Language Understanding by Generative Pre-Training" paper
HomebrewML/HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
akshat0123/GPT-1
Pytorch implementation of GPT-1
qiqiApink/MotionGPT
The official PyTorch implementation of the paper "MotionGPT: Finetuned LLMs are General-Purpose...
Shenggan/atp
Adaptive Tensor Parallelism for Foundation Models