wangcx18/llm-vscode-inference-server
An endpoint server for efficiently serving quantized open-source LLMs for code.
This project provides an alternative server for the 'llm-vscode' extension, enabling developers to run open-source code completion models locally. It takes a quantized language model file as input and serves it as an endpoint, allowing the VSCode extension to provide code suggestions and completions directly on your machine. This is ideal for software developers who want to self-host code-generating AI for their programming tasks.
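Once the server is running, VS Code needs to be told where to find it. A minimal sketch of the relevant settings.json entry, assuming the llm-vscode extension's `llm.url` setting and a server listening locally on port 8000; both the setting name and the endpoint path are assumptions to verify against the extension's documentation and this repository's README:

```jsonc
// .vscode/settings.json (or user settings)
// Hypothetical values: confirm the setting name, port, and path
// against the llm-vscode extension docs and this repo's README.
{
  "llm.url": "http://localhost:8000/generate"
}
```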
No commits in the last 6 months.
Use this if you are a software developer who wants to run an open-source, quantized code Large Language Model (LLM) locally within VSCode for code completion and generation, reducing reliance on cloud services.
Not ideal if you are not a developer or if you prefer to use cloud-hosted code completion services without managing local model serving.
Stars: 58
Forks: 10
Language: Python
License: Apache-2.0
Category:
Last pushed: Oct 15, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/wangcx18/llm-vscode-inference-server"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
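The same endpoint can also be called programmatically. A minimal Python sketch using only the standard library; the URL comes from the curl command above, while the assumption that the endpoint returns JSON (and any particular response fields) is not documented here:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-API URL for a given GitHub owner/repo pair."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality record for a repo.

    Network call; assumes the endpoint returns a JSON object
    (its exact schema is not documented on this page).
    """
    with urlopen(quality_url(owner, repo), timeout=10) as resp:
        return json.load(resp)


print(quality_url("wangcx18", "llm-vscode-inference-server"))
```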
Higher-rated alternatives
thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency,...
sophgo/LLM-TPU
Run generative AI models in sophgo BM1684X/BM1688
NotPunchnox/rkllama
Ollama alternative for Rockchip NPU: An efficient solution for running AI and Deep learning...
Deep-Spark/DeepSparkHub
DeepSparkHub selects hundreds of application algorithms and models, covering various fields of...
howard-hou/VisualRWKV
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle...