intentee/paddler
Open-source LLM load balancer and serving platform for self-hosting LLMs at scale 🏓🦙 An alternative to projects like llm-d, Docker Model Runner, etc., but with fewer moving parts and simple deployments built around the ggml ecosystem. Runs on CPU and GPU.
Paddler helps product and DevOps teams self-host large language models (LLMs) on their own infrastructure, instead of relying on external providers. It takes open-source LLMs and serves them efficiently and reliably, allowing you to integrate AI features into your products while maintaining control over data privacy, costs, and performance. Product leaders and engineers concerned with scaling AI features will find it useful.
1,478 stars. Actively maintained with 58 commits in the last 30 days.
Use this if you need to run LLM inference and embeddings at scale within your own organization, particularly for product features or sensitive data, and want predictable costs and reliability.
Not ideal if you prefer to use managed LLM services from cloud providers and are not interested in maintaining your own infrastructure for AI models.
Stars: 1,478
Forks: 84
Language: Rust
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 58
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/intentee/paddler"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
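The same request can be made from a script. The following is a minimal sketch in Python using only the standard library. It assumes the URL path follows the pattern category/owner/repo (inferred from the curl example, where "llm-tools" appears to be the category) and that the endpoint returns JSON; neither is confirmed by this page, and the function names are hypothetical. No API key handling is shown, because the way a key is passed is not documented here.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "llm-tools") -> str:
    """Build the request URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single curl example; "llm-tools" is presumed to be the category segment.
    """
    # Percent-encode each segment so unusual characters cannot alter the path.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "llm-tools",
                  timeout: float = 10.0):
    """Perform the GET request and decode the body.

    Assumption: the response body is JSON; its schema is not documented here,
    so the decoded value is returned as-is.
    """
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_quality("intentee", "paddler") would issue the same request as the curl command above. With the keyless limit of 100 requests/day, it is worth caching the result rather than fetching on every run.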
Related tools
BerriAI/litellm
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with...
vava-nessa/free-coding-models
Find, benchmark and install in CLI 158 FREE coding LLM models across 20 providers in real time
envoyproxy/ai-gateway
Manages Unified Access to Generative AI Services built on Envoy Gateway
theopenco/llmgateway
Route, manage, and analyze your LLM requests across multiple providers with a unified API interface.
Portkey-AI/gateway
A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with...