KubeAI and LLMKube
Both are Kubernetes operators for LLM inference and serve the same core use case: deploying and managing ML models on Kubernetes clusters. KubeAI appears more mature, with broader model support (LLMs, VLMs, embeddings, and speech-to-text), while LLMKube focuses specifically on GPU-accelerated inference for air-gapped and edge deployments.
About KubeAI
kubeai-project/kubeai
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
KubeAI helps machine learning operations teams deploy and manage AI models like large language models, embedding models, and speech-to-text systems on Kubernetes. It takes your trained ML models and makes them available for applications to use, handling tasks like intelligent scaling, model caching, and efficient request routing. This is for MLOps engineers and platform teams who need to reliably serve AI inference at scale.
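To make the deployment model concrete, here is a minimal sketch of a KubeAI `Model` custom resource built as a Python dict. The field names (`features`, `url`, `engine`, `resourceProfile`, `minReplicas`, `maxReplicas`) follow KubeAI's Model CRD as documented upstream, but the specific values (model name, URL, and resource profile) are illustrative assumptions; verify them against the CRD version installed in your cluster.

```python
# Hypothetical sketch of a KubeAI Model resource; field values are assumptions.
def kubeai_model_manifest(name: str, model_url: str) -> dict:
    """Build a manifest for a text-generation model served by vLLM."""
    return {
        "apiVersion": "kubeai.org/v1",
        "kind": "Model",
        "metadata": {"name": name},
        "spec": {
            "features": ["TextGeneration"],
            "url": model_url,                      # e.g. hf://... or ollama://...
            "engine": "VLLM",
            "resourceProfile": "nvidia-gpu-l4:1",  # assumed profile name
            "minReplicas": 0,                      # scale to zero when idle
            "maxReplicas": 3,
        },
    }

manifest = kubeai_model_manifest(
    "llama-3.1-8b", "hf://meta-llama/Llama-3.1-8B-Instruct"
)
```

A manifest like this would typically be serialized to YAML and applied with `kubectl apply`; the operator then handles provisioning, scaling, and routing for the model.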
About LLMKube
defilantech/LLMKube
Kubernetes operator for GPU-accelerated LLM inference - air-gapped, edge-native, production-ready
LLMKube helps organizations deploy large language models (LLMs) on their own computing infrastructure, whether for privacy, cost control, or air-gapped compliance. It takes your chosen LLM and hardware specifications, then manages the entire deployment process, making the model available via a standard API. This is ideal for infrastructure engineers, MLOps teams, or application developers who need to integrate LLM inference into their products while maintaining full control over their data and hardware.
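Since LLMKube exposes deployed models via a standard API, application code can talk to the in-cluster service the way it would talk to any OpenAI-compatible endpoint. The sketch below builds such a request with only the standard library; the service URL, port, path, and model name are all assumptions for illustration, so substitute whatever endpoint your deployment actually exposes.

```python
import json
from urllib import request

# Hypothetical sketch: calling an LLM behind an assumed OpenAI-compatible
# /v1/chat/completions endpoint. URL and model name are placeholders.
def chat_request(base_url: str, model: str, prompt: str) -> request.Request:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request(
    "http://llm-service.default.svc.cluster.local:8080",  # assumed in-cluster DNS name
    "llama-3.1-8b",
    "Hello!",
)
# Send with request.urlopen(req) from inside the cluster.
```

Keeping the client on the standard request shape means applications need no vendor SDK and can move between self-hosted and hosted backends by changing only the base URL.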