willbnu/Qwen-3.5-16G-Vram-Local
Configs, launchers, benchmarks, and tooling for running Qwen3.5 GGUF models locally with llama.cpp on a 16GB NVIDIA GPU
This project helps individual users, such as researchers and advanced hobbyists, run large language models (specifically Qwen 3.5) on a local machine with a 16GB NVIDIA graphics card. It provides configurations and tooling tuned for tasks like coding, reasoning, and multimodal interaction: you supply Qwen 3.5 GGUF model files, and the project's configs and launchers help you get optimized local inference from them.
Use this if you want to run powerful Qwen 3.5 language models on your personal machine for local data analysis, creative writing, or coding assistance, without relying on cloud services.
Not ideal if you don't have a dedicated NVIDIA GPU with at least 16GB VRAM, or if you need to deploy models for large-scale production environments.
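As a rough illustration of the kind of launch command such configs wrap, here is a minimal sketch of serving a Qwen 3.5 GGUF file with llama.cpp's `llama-server` on a 16GB card. The model filename, context size, and port are hypothetical placeholders, not values taken from this repo; `-m`, `-ngl`, `-c`, and `--port` are standard llama.cpp flags.

```shell
# Hypothetical sketch, not the repo's actual launcher.
# Path and quantization level are illustrative.
MODEL="$HOME/models/qwen3.5-q4_k_m.gguf"

llama-server \
  -m "$MODEL" \
  -ngl 99 \
  -c 8192 \
  --port 8080
# -ngl 99 offloads all layers to the GPU; if a larger quant
# overflows 16GB VRAM, lower -ngl or shrink the context (-c).
```

A quantized ~14B model at Q4_K_M plus an 8K context typically fits within 16GB of VRAM; larger models or contexts require partial offload.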
Stars: 21
Forks: 4
Language: Python
License: MIT
Category:
Last pushed: Mar 14, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/willbnu/Qwen-3.5-16G-Vram-Local"
Open to everyone: 100 requests/day with no key needed, or get a free key for 1,000/day.
Higher-rated alternatives
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
LLM-Red-Team/qwen-free-api
🚀...
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by...
QwenLM/qwen.cpp
C++ implementation of Qwen-LM
yassa9/qwen600
Static suckless single batch CUDA-only qwen3-0.6B mini inference engine