arrmansa/Gpt-Neo-Limited-Vram-Cuda
A notebook that runs GPT-Neo with low VRAM (6 GB) and CUDA acceleration by loading the model into GPU memory in smaller parts.
This is a tool for developers who want to experiment with large language models like GPT-Neo on consumer-grade hardware. By loading the model into GPU memory in smaller segments, it can run on GPUs with as little as 6 GB of video RAM, making experimentation with large models accessible on less powerful hardware.
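The core idea, loading the model onto the GPU a few blocks at a time, can be sketched in PyTorch. This is a minimal illustration under stated assumptions, not the notebook's actual code: the `nn.Linear` blocks below are hypothetical stand-ins for GPT-Neo's transformer blocks, and `forward_in_parts` is an illustrative helper name.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available; fall back to CPU otherwise.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for GPT-Neo's stack of transformer blocks.
blocks = nn.ModuleList([nn.Linear(16, 16) for _ in range(8)])

def forward_in_parts(x, blocks, group_size=2):
    """Run the blocks a group at a time, holding only one group in GPU memory."""
    x = x.to(device)
    with torch.no_grad():
        for i in range(0, len(blocks), group_size):
            group = blocks[i:i + group_size]
            group.to(device)   # load this slice of the model into GPU memory
            for block in group:
                x = block(x)
            group.to("cpu")    # evict it before loading the next slice
            if torch.cuda.is_available():
                torch.cuda.empty_cache()
    return x.cpu()

out = forward_in_parts(torch.randn(1, 16), blocks)
```

Peak GPU memory is bounded by one group of blocks plus activations, at the cost of repeated host-to-device transfers on every forward pass.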
No commits in the last 6 months.
Use this if you are a developer looking to run large language models on GPUs with limited video memory.
Not ideal if you need to run large language models on systems with ample VRAM or are looking for a pre-packaged, user-friendly application rather than a developer notebook.
Stars: 14
Forks: 6
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: May 25, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/arrmansa/Gpt-Neo-Limited-Vram-Cuda"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
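The endpoint path in the curl command above follows a `quality/<ecosystem>/<owner>/<repo>` pattern. A small Python sketch, assuming only that pattern (the response schema is not documented here, so `fetch_quality` is left as an un-invoked helper):

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem, owner, repo):
    """Build the per-repo endpoint URL shown in the curl example."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"

def fetch_quality(ecosystem, owner, repo):
    """Fetch the quality data as JSON (anonymous: 100 requests/day)."""
    with urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)

url = quality_url("transformers", "arrmansa", "Gpt-Neo-Limited-Vram-Cuda")
```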
Higher-rated alternatives
tabularis-ai/be_great
A novel approach for synthesizing tabular data using pretrained large language models
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron...
shibing624/textgen
TextGen: Implementation of Text Generation models, include LLaMA, BLOOM, GPT2, BART, T5, SongNet...
ai-forever/ru-gpts
Russian GPT3 models.
AdityaNG/kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold...