ADT109119/llamacpp-distributed-inference
A distributed LLM inference program built on llama.cpp that lets multiple computers on a local network cooperatively run inference on large language models, with a cross-platform desktop UI built with Electron.
This desktop application helps you run large language models (LLMs) by combining the processing power of multiple computers on your local network. You provide GGUF-formatted model files, and the application orchestrates distributed inference, giving you a local API endpoint to interact with the LLM. It's designed for individuals, researchers, or educators who want to run larger LLMs than a single machine can handle, or for prototyping distributed LLM systems.
No commits in the last 6 months.
Use this if you have several personal computers on a trusted local network and want to pool their resources to run large language models that might otherwise be too demanding for a single machine.
Not ideal if you need enterprise-grade security or stability, or if you are deploying LLM services in a production or public-facing environment.
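Once the cluster is running, the application exposes a local API endpoint for the model. A minimal sketch of querying it from Python, assuming an OpenAI-compatible chat-completions route (the host, port, and path below are placeholders, not values confirmed by this project):

```python
# Sketch: querying the app's local API endpoint.
# ASSUMPTIONS: base URL http://127.0.0.1:8080 and an OpenAI-style
# /v1/chat/completions route -- check the app's settings for the real values.
import json
import urllib.request


def build_request(prompt: str, base_url: str = "http://127.0.0.1:8080"):
    """Build a POST request with an OpenAI-style chat payload (schema assumed)."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and extract the assistant's reply (response shape assumed)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask("Summarize llama.cpp in one sentence."))
```

Keeping the request-building step separate from the network call makes the payload easy to inspect before pointing it at whatever endpoint the app actually reports.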
Stars: 71
Forks: 13
Language: JavaScript
License: Apache-2.0
Category:
Last pushed: Aug 24, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ADT109119/llamacpp-distributed-inference"
Open to everyone: 100 requests/day with no key; a free key raises the limit to 1,000/day.
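The same data can be fetched programmatically. A small sketch using only the Python standard library; the JSON response shape is not documented here, so it is returned as-is:

```python
# Sketch: fetching this repository's quality record from the listing API.
# The keyless tier is limited to 100 requests/day; the response schema is
# not documented in this entry, so the parsed JSON is returned unchanged.
import json
import urllib.request

QUALITY_URL = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "llm-tools/ADT109119/llamacpp-distributed-inference"
)


def fetch_quality(url: str = QUALITY_URL) -> dict:
    """GET the quality record and parse the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(json.dumps(fetch_quality(), indent=2))
```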
Higher-rated alternatives
containers/ramalama
RamaLama is an open-source developer tool that simplifies the local serving of AI models from...
av/harbor
One command brings a complete pre-wired LLM stack with hundreds of services to explore.
RunanywhereAI/runanywhere-sdks
Production-ready toolkit to run AI locally
runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
foldl/chatllm.cpp
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)