tensorchord/openmodelz
Autoscale LLM (vLLM, SGLang, LMDeploy) inference on Kubernetes (and others)
This project helps data scientists and SREs quickly deploy large language models (LLMs) and other machine learning models for production use. It takes a trained model and automatically provisions the necessary infrastructure (monitoring, autoscaling, and public access), exposing a ready-to-use public endpoint. It is for anyone who needs to take a machine learning model from development to production without getting bogged down in infrastructure setup.
281 stars. No commits in the last 6 months.
Use this if you need to deploy machine learning models, especially large language models, to a production environment quickly and efficiently without manually configuring all the underlying infrastructure.
Not ideal if you need extremely fine-grained control over every aspect of your infrastructure setup or are working with very small, simple models that don't require autoscaling.
Stars: 281
Forks: 25
Language: Go
License: Apache-2.0
Category: mlops
Last pushed: Nov 03, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mlops/tensorchord/openmodelz"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
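For programmatic use, the curl command above can be reproduced in Python. This is a minimal sketch: the URL path layout (category/owner/repo) is taken from the example above, but the response schema is not documented here, so the code returns the decoded JSON as-is rather than assuming field names.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-API URL, following the path layout of the curl example."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for a repository.

    The response fields are not documented on this page, so callers
    should inspect the returned dict rather than assume specific keys.
    """
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

# Example (performs a real HTTP request; subject to the 100 requests/day limit):
# data = fetch_quality("mlops", "tensorchord", "openmodelz")
# print(data)
```

Only `urllib` from the standard library is used, so the snippet runs without extra dependencies; swap in any HTTP client you prefer.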
Higher-rated alternatives
kubeflow/katib - Automated Machine Learning on Kubernetes
kubeai-project/kubeai - AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports...
sgl-project/rbg - A workload for deploying LLM inference services on Kubernetes
beam-cloud/beta9 - Ultrafast serverless GPU inference, sandboxes, and background jobs
ptimizeroracle/ondine - The LLM Dataset Engine — batch process millions of rows with 100+ providers. Multi-row batching...