willbnu/Qwen-3.5-16G-Vram-Local

Configs, launchers, benchmarks, and tooling for running Qwen3.5 GGUF models locally with llama.cpp on a 16GB NVIDIA GPU

Score: 45 / 100 (Emerging)

This project helps individual users, such as researchers or advanced hobbyists, run large language models (specifically Qwen 3.5) on a local machine with a 16GB NVIDIA graphics card. It provides configurations, launchers, and benchmarks to get the best llama.cpp performance for tasks like coding, reasoning, or multimodal interaction: you supply Qwen 3.5 GGUF model files, and the tooling helps you run them with optimized settings.

Use this if you want to run powerful Qwen 3.5 language models on your personal machine for local data analysis, creative writing, or coding assistance, without relying on cloud services.

Not ideal if you don't have a dedicated NVIDIA GPU with at least 16GB VRAM, or if you need to deploy models for large-scale production environments.
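For orientation, a setup like this typically boils down to a llama.cpp server launch with GPU offload tuned to fit 16 GB of VRAM. The sketch below uses standard llama.cpp `llama-server` flags; the model filename, quantization, and specific values are illustrative assumptions, not taken from this repo.

```shell
# Hypothetical llama.cpp launcher for a Qwen 3.5 GGUF model on a 16 GB GPU.
# Flags are standard llama-server options:
#   -m      path to the GGUF model file (filename here is an assumption)
#   -ngl    number of transformer layers offloaded to the GPU
#   -c      context window size; reduce it if you run out of VRAM
#   --port  port for the OpenAI-compatible HTTP API
./llama-server -m models/qwen3.5-q4_k_m.gguf -ngl 99 -c 8192 --port 8080
```

Picking a quantization small enough to leave headroom for the KV cache, then adjusting `-ngl` and `-c`, is the usual trade-off on a 16 GB card.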

local-AI large-language-models personal-computing offline-AI AI-inference
No package · No dependents
Maintenance: 13 / 25
Adoption: 6 / 25
Maturity: 11 / 25
Community: 15 / 25


Stars: 21
Forks: 4
Language: Python
License: MIT
Last pushed: Mar 14, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/willbnu/Qwen-3.5-16G-Vram-Local"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.