ADT109119/llamacpp-distributed-inference
A distributed LLM inference program built on llama.cpp that lets multiple computers on a local network cooperatively run inference on large language models, with a cross-platform desktop UI built with Electron.
This desktop application helps you run large language models (LLMs) by combining the processing power of multiple computers on your local network. You provide GGUF-formatted model files, and the application orchestrates distributed inference, giving you a local API endpoint to interact with the LLM. It's designed for individuals, researchers, or educators who want to run larger LLMs than a single machine can handle, or for prototyping distributed LLM systems.
No commits in the last 6 months.
Use this if you have several personal computers on a trusted local network and want to pool their resources to run large language models that might otherwise be too demanding for a single machine.
Not ideal if you need enterprise-grade security or stability, or if you are deploying LLM services in a production or public-facing environment.
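Once the cluster is running, the application exposes a local API endpoint for the model. A minimal sketch of querying it from Python, assuming an OpenAI-compatible chat-completions route (the host, port, and path below are placeholders, not values confirmed by this project):

```python
# Sketch: querying the app's local API endpoint.
# ASSUMPTIONS: base URL http://127.0.0.1:8080 and an OpenAI-style
# /v1/chat/completions route -- check the app's settings for the real values.
import json
import urllib.request


def build_request(prompt: str, base_url: str = "http://127.0.0.1:8080"):
    """Build a POST request with an OpenAI-style chat payload (schema assumed)."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and extract the assistant's reply (response shape assumed)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask("Summarize llama.cpp in one sentence."))
```

Keeping the request-building step separate from the network call makes the payload easy to inspect before pointing it at whatever endpoint the app actually reports.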
Stars: 71
Forks: 13
Language: JavaScript
License: Apache-2.0
Category:
Last pushed: Aug 24, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ADT109119/llamacpp-distributed-inference"
Open to everyone: 100 requests/day with no key; a free key raises the limit to 1,000/day.
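The same data can be fetched programmatically. A small sketch using only the Python standard library; the JSON response shape is not documented here, so it is returned as-is:

```python
# Sketch: fetching this repository's quality record from the listing API.
# The keyless tier is limited to 100 requests/day; the response schema is
# not documented in this entry, so the parsed JSON is returned unchanged.
import json
import urllib.request

QUALITY_URL = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "llm-tools/ADT109119/llamacpp-distributed-inference"
)


def fetch_quality(url: str = QUALITY_URL) -> dict:
    """GET the quality record and parse the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(json.dumps(fetch_quality(), indent=2))
```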
Higher-rated alternatives
containers/ramalama
RamaLama is an open-source developer tool that simplifies the local serving of AI models from...
av/harbor
One command brings a complete pre-wired LLM stack with hundreds of services to explore.
RunanywhereAI/runanywhere-sdks
Production-ready toolkit to run AI locally
runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
foldl/chatllm.cpp
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)