kubeai-project/kubeai

AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.

Quality score: 59 / 100 (Established)

KubeAI helps machine learning operations teams deploy and manage AI models like large language models, embedding models, and speech-to-text systems on Kubernetes. It takes your trained ML models and makes them available for applications to use, handling tasks like intelligent scaling, model caching, and efficient request routing. This is for MLOps engineers and platform teams who need to reliably serve AI inference at scale.

1,161 stars. Actively maintained with 4 commits in the last 30 days.

Use this if you need to deploy and manage a variety of machine learning models (especially large language models or embedding models) in a Kubernetes environment and want to optimize their performance and scalability without complex dependencies.

Not ideal if you are looking for a simple tool for local model experimentation or if your inference workloads are very small-scale and don't require Kubernetes deployment.

Tags: MLOps · AI Inference · Kubernetes Deployment · Large Language Models · Speech Processing
No package · No dependents
Maintenance: 13 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 20 / 25

How are scores calculated?
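Judging from the numbers on this page, the overall score appears to be the simple sum of the four category scores. This is a reading of the figures shown above, not a documented formula:

```python
# Category scores as listed on this page (each out of 25).
subscores = {
    "Maintenance": 13,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 20,
}

# Summing the four categories reproduces the 59 / 100 overall score.
total = sum(subscores.values())
print(total)  # 59
```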

Stars: 1,161
Forks: 125
Language: Go
License: Apache-2.0
Last pushed: Feb 23, 2026
Commits (30d): 4

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mlops/kubeai-project/kubeai"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.