alibaba/ServeGen

A framework for generating realistic LLM serving workloads

38 / 100 (Emerging)

This tool helps engineers and researchers simulate realistic demands on large language model (LLM) serving systems. You input parameters like request rates and model types, and it outputs a sequence of simulated user requests, mirroring the complex, dynamic patterns observed in real production environments. This is for anyone responsible for designing, evaluating, or optimizing the infrastructure that runs large language models.
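As a rough illustration of what "request rates in, simulated requests out" can look like, here is a minimal sketch. It does not use ServeGen's actual API: `SimulatedRequest` and `generate_workload` are hypothetical names, and the Poisson arrivals and log-normal token lengths are simplifying assumptions that are far cruder than the production patterns the tool is described as mirroring.

```python
import random
from dataclasses import dataclass


@dataclass
class SimulatedRequest:
    arrival_time: float  # seconds since the start of the trace
    input_tokens: int    # prompt length
    output_tokens: int   # response length


def generate_workload(rate_per_s, duration_s, seed=0):
    """Hypothetical helper (not ServeGen's API): Poisson arrivals
    with log-normally distributed token lengths."""
    rng = random.Random(seed)  # seeded so the trace is reproducible
    requests = []
    t = rng.expovariate(rate_per_s)  # first inter-arrival gap
    while t < duration_s:
        requests.append(
            SimulatedRequest(
                arrival_time=t,
                # Distribution parameters are arbitrary placeholders.
                input_tokens=max(1, int(rng.lognormvariate(5.0, 1.0))),
                output_tokens=max(1, int(rng.lognormvariate(4.0, 1.0))),
            )
        )
        t += rng.expovariate(rate_per_s)  # next inter-arrival gap
    return requests


workload = generate_workload(rate_per_s=5.0, duration_s=60.0)
print(len(workload), "requests generated")
```

A trace like this can be replayed against a serving system by sending each request at its `arrival_time`. A realistic generator would also vary the rate over time and across clients.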

106 stars. No commits in the last 6 months.

Use this if you need to test the performance, scalability, or cost-effectiveness of an LLM serving system before deploying it or making changes to your existing setup.

Not ideal if you're looking for a tool to generate synthetic text for training LLMs or for benchmarking the LLM's natural language generation quality.

LLM deployment, system architecture, performance testing, workload simulation, capacity planning
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 9 / 25
Maturity 15 / 25
Community 12 / 25

Stars: 106
Forks: 10
Language: Python
License: Apache-2.0
Last pushed: Oct 09, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/alibaba/ServeGen"

Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
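
The same request can be made from Python using only the standard library. This is a sketch under assumptions: the URL layout is copied from the curl example above, `quality_url` and `fetch_quality` are hypothetical helper names, and the body is returned as raw text decoded as UTF-8 because the response schema is not described here.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Return the raw response body as text (UTF-8 is an assumption)."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (performs a network request, so it is left commented out):
# print(fetch_quality("alibaba", "ServeGen"))
```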