arrmansa/Basic-UI-for-GPT-J-6B-with-low-vram

A repository to run GPT-J-6B on low-VRAM machines (4.2 GB minimum VRAM for a 2000-token context, 3.5 GB for a 1000-token context). Model loading takes 12 GB of free RAM.

/ 100

Emerging

This project helps developers run a large language model (GPT-J-6B) on computers with limited graphics card memory. It takes the model and a user prompt, then splits the processing between system RAM and GPU VRAM to generate text. It is designed for AI/ML developers and researchers who need to experiment with large models on less powerful hardware.
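A rough calculation shows why the model cannot simply be placed in a small GPU, and why the description asks for about 12 GB of RAM. The sketch below is only a back-of-envelope estimate, not the repository's code: it assumes roughly 6 billion parameters (read off the model name) stored as 16-bit floats, and the function names are hypothetical.

```python
# Back-of-envelope estimate of weight memory for a ~6B-parameter model.
# Assumptions (not taken from the repository): about 6e9 parameters,
# stored at 2 bytes each (fp16). Activations and caches are ignored.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fraction_that_fits(vram_gb: float, num_params: float, dtype: str = "fp16") -> float:
    """Upper bound on the share of the weights that fits in the given VRAM."""
    return min(1.0, vram_gb / weight_memory_gb(num_params, dtype))


if __name__ == "__main__":
    params = 6e9  # assumed parameter count
    print(f"fp16 weights: {weight_memory_gb(params):.1f} GB")
    print(f"share fitting in 4.2 GB VRAM: {fraction_that_fits(4.2, params):.0%}")
```

Under these assumptions the weights alone come to about 12 GB, so only around a third of them could sit in 4.2 GB of VRAM at once. The rest has to stay in system RAM, which is consistent with the RAM requirement quoted in the description.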

113 stars. No commits in the last 6 months.

Use this if you are a developer working with large language models and your machine has limited VRAM (e.g., 4-8 GB) but sufficient RAM to load the model.

Not ideal if you have ample VRAM (12 GB+) or if you are not a developer who needs to run this kind of large language model locally.
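The guidance above can be turned into a quick hardware check. This is a minimal sketch that uses only the figures from the repository description (12 GB of free RAM, 4.2 GB of VRAM for a 2000-token context, 3.5 GB for a 1000-token context). The function and constant names are hypothetical, and the check says nothing about context lengths other than the two listed.

```python
from typing import Optional

# Thresholds copied from the repository description; names are illustrative only.
MIN_FREE_RAM_GB = 12.0
VRAM_NEEDED_GB = {2000: 4.2, 1000: 3.5}  # context length in tokens -> minimum VRAM in GB


def largest_supported_context(vram_gb: float, free_ram_gb: float) -> Optional[int]:
    """Return the largest listed context length this machine meets, or None."""
    if free_ram_gb < MIN_FREE_RAM_GB:
        return None  # not enough RAM to load the model at all
    fitting = [ctx for ctx, need in VRAM_NEEDED_GB.items() if vram_gb >= need]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    print(largest_supported_context(vram_gb=4.0, free_ram_gb=16.0))
```

A machine with 4 GB of VRAM and 16 GB of free RAM would meet the 1000-token figure but not the 2000-token one. Actual usage will vary with drivers and other processes holding GPU memory.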

AI-development large-language-models resource-constrained-ML local-model-deployment ML-experimentation

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 13 / 25


Stars

113

Forks

Language

Jupyter Notebook

License

Apache-2.0

Higher-rated alternatives

tabularis-ai/be_great

A novel approach for synthesizing tabular data using pretrained large language models

EleutherAI/gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron...

shibing624/textgen

TextGen: Implementation of Text Generation models, include LLaMA, BLOOM, GPT2, BART, T5, SongNet...

ai-forever/ru-gpts

Russian GPT3 models.

AdityaNG/kan-gpt

The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold...
