EvilFreelancer/docker-llama.cpp-rpc

This project is based on llama.cpp and builds only the RPC server, along with auxiliary utilities that run in RPC-client mode, which are required for distributed inference of Large Language Models (LLMs) and embedding models converted to the GGUF format.

Score: 39 / 100 (Emerging)

This project helps you run large language models and embedding models on your own servers without needing powerful hardware on a single machine. You provide your GGUF-formatted models, and it gives you a distributed system that can serve text completions or embeddings via a simple API. This is ideal for developers or system administrators integrating AI capabilities into their applications.
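
To make the distributed setup concrete, here is a minimal sketch of the upstream llama.cpp RPC workflow that this image packages. The worker host names and the model path are placeholders, and the exact entrypoints inside the Docker image may differ:

# On each worker node, start the llama.cpp RPC backend
# (rpc-server is the binary this project compiles).
rpc-server --host 0.0.0.0 --port 50052

# On the coordinating node, point a llama.cpp client at the workers;
# "worker1"/"worker2" and model.gguf are placeholder values.
llama-server -m model.gguf --rpc worker1:50052,worker2:50052 -ngl 99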

No commits in the last 6 months.

Use this if you need to serve large language models or embedding models efficiently across multiple CPU and GPU servers, making the most of your existing hardware.

Not ideal if you're looking for a user-friendly application to interact with LLMs directly, as this tool is focused on backend infrastructure.

Tags: AI deployment · MLOps · distributed inference · language model serving · backend development
Status: Stale (6m) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 6 / 25
Maturity: 16 / 25
Community: 15 / 25

(The four subscores sum to the overall 39 / 100.)

Stars: 23
Forks: 5
Language: Shell
License: MIT
Last pushed: May 25, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/EvilFreelancer/docker-llama.cpp-rpc"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
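
To inspect the response locally, you can pipe it through jq to pretty-print it; this assumes only that the endpoint returns JSON.

curl -s "https://pt-edge.onrender.com/api/v1/quality/llm-tools/EvilFreelancer/docker-llama.cpp-rpc" | jq .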