sgl-project/rbg

A workload for deploying LLM inference services on Kubernetes

Score: 57 / 100 (Established)

This Kubernetes API helps deploy and manage complex, multi-component AI inference systems, especially large language models (LLMs). It takes your specifications for the different roles of an LLM service (such as prefill and decode) and turns them into a stable, coordinated, high-performance deployment on your Kubernetes cluster. It is designed for operations engineers and MLOps teams running production AI services.
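To make the multi-role idea concrete, here is a minimal sketch of what a manifest for such a workload could look like, assuming a RoleBasedGroup-style custom resource with separate prefill and decode roles. The apiVersion, kind, image, and field names below are illustrative assumptions, not the project's actual CRD schema; check the repository for the real API before applying anything.

# Illustrative only: apiVersion, kind, image, and field names are assumptions,
# not taken from sgl-project/rbg's actual CRD; shown to convey the multi-role shape.
kubectl apply -f - <<'EOF'
apiVersion: workloads.example.io/v1alpha1    # hypothetical API group/version
kind: RoleBasedGroup                         # hypothetical kind name
metadata:
  name: llm-serving
spec:
  roles:
    - name: prefill                  # compute-bound prompt processing
      replicas: 2
      template:
        spec:
          containers:
            - name: server
              image: registry.example.com/llm-server:latest   # placeholder image
              resources:
                limits:
                  nvidia.com/gpu: "1"
    - name: decode                   # latency-sensitive token generation
      replicas: 4
      template:
        spec:
          containers:
            - name: server
              image: registry.example.com/llm-server:latest   # placeholder image
              resources:
                limits:
                  nvidia.com/gpu: "1"
EOF

The point of grouping roles like this is that both are created, scaled, and coordinated together rather than managed as unrelated Deployments, which is the coordination the description above refers to.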

Use this if you need to reliably deploy and operate stateful, performance-sensitive, and multi-role LLM inference services on Kubernetes, ensuring proper coordination and resource utilization.

Not ideal if you are deploying simple, single-component applications or if your AI inference workloads do not require intricate multi-role coordination and topology awareness.

MLOps, Kubernetes-operations, AI-inference-deployment, LLM-serving, distributed-systems-management
No package, no dependents
Maintenance: 10 / 25
Adoption: 10 / 25
Maturity: 15 / 25
Community: 22 / 25

Stars: 187
Forks: 47
Language: Go
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mlops/sgl-project/rbg"

The API is open to everyone at 100 requests/day with no key; a free key raises the limit to 1,000 requests/day.