Kubernetes LLM Serving Transformer Models

There are 4 Kubernetes LLM serving models tracked. The highest-rated is robert-mcdermott/ollama-batch-cluster, which scores 32/100 and has 30 stars.

Get all 4 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=kubernetes-llm-serving&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
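
The same request can be issued from a script. The following is a minimal sketch using only the Python standard library; the endpoint and query parameters are taken from the curl command above, while the function names `build_url` and `fetch_projects` are hypothetical helpers introduced for illustration. The shape of the JSON response is not documented here, so the sketch returns the decoded payload unchanged rather than assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="kubernetes-llm-serving", limit=20):
    """Build the dataset query URL (hypothetical helper, not part of any official client)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the JSON body.

    The response schema is an unknown here, so the decoded object is
    returned as-is instead of being unpacked into assumed fields.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs one request, which counts against the 100 requests/day keyless limit, so it is worth caching the result locally when experimenting.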

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | robert-mcdermott/ollama-batch-cluster | Large Scale Batch Processing with Ollama | 32 | Emerging |
| 2 | anmolg1997/Multi-LoRA-Serve | Multi-adapter inference gateway: one base model, many LoRA adapters... | 22 | Experimental |
| 3 | kimmmmyy223/llm-batch | 🚀 Process JSON data in batches with `llm-batch`, leveraging sequential or... | 21 | Experimental |
| 4 | Rohit2sali/vllm-multi-tenant-llm-gateway | This is vllm multi tenant large language model gateway. This system is... | 13 | Experimental |