alibaba/ServeGen

A framework for generating realistic LLM serving workloads

/ 100

Emerging

This tool helps engineers and researchers simulate realistic demands on large language model (LLM) serving systems. You input parameters like request rates and model types, and it outputs a sequence of simulated user requests, mirroring the complex, dynamic patterns observed in real production environments. This is for anyone responsible for designing, evaluating, or optimizing the infrastructure that runs large language models.
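To make the input/output shape described above concrete, here is a minimal sketch of a workload generator in plain Python. It is not ServeGen's API: the names `SimulatedRequest` and `generate_workload`, the parameters, and the default token lengths are all hypothetical. It also assumes Poisson arrivals and exponentially distributed token counts, which is a deliberate simplification of the complex, dynamic patterns seen in production.

```python
import random
from dataclasses import dataclass


@dataclass
class SimulatedRequest:
    """One synthetic request (hypothetical structure, not ServeGen's)."""
    arrival_time: float  # seconds since the start of the trace
    input_tokens: int    # prompt length
    output_tokens: int   # generated length


def generate_workload(rate_per_sec, duration_sec,
                      mean_input_tokens=512, mean_output_tokens=128, seed=0):
    """Return requests with Poisson arrivals over [0, duration_sec).

    Assumption: inter-arrival gaps and token counts are exponentially
    distributed. Real traces are burstier and vary by client and model.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    requests = []
    t = rng.expovariate(rate_per_sec)  # first arrival
    while t < duration_sec:
        requests.append(SimulatedRequest(
            arrival_time=t,
            input_tokens=max(1, round(rng.expovariate(1 / mean_input_tokens))),
            output_tokens=max(1, round(rng.expovariate(1 / mean_output_tokens))),
        ))
        t += rng.expovariate(rate_per_sec)  # gap to the next arrival
    return requests


if __name__ == "__main__":
    workload = generate_workload(rate_per_sec=5.0, duration_sec=60.0)
    print(f"{len(workload)} requests generated")
    for req in workload[:3]:
        print(req)
```

A trace like this could be replayed against a serving endpoint by sleeping until each `arrival_time` and sending a prompt of the given length. Fixing the seed keeps repeated runs comparable.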

106 stars. No commits in the last 6 months.

Use this if you need to test the performance, scalability, or cost-effectiveness of an LLM serving system before deploying it or making changes to your existing setup.

Not ideal if you're looking for a tool to generate synthetic text for training LLMs or for benchmarking the LLM's natural language generation quality.

LLM deployment, system architecture, performance testing, workload simulation, capacity planning

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 9 / 25

Maturity 15 / 25

Community 12 / 25

Stars: 106

Language: Python

License: Apache-2.0

Higher-rated alternatives

thu-pacman/chitu

High-performance inference framework for large language models, focusing on efficiency,...

NotPunchnox/rkllama

Ollama alternative for Rockchip NPU: An efficient solution for running AI and Deep learning...

sophgo/LLM-TPU

Run generative AI models in sophgo BM1684X/BM1688

Deep-Spark/DeepSparkHub

DeepSparkHub selects hundreds of application algorithms and models, covering various fields of...

howard-hou/VisualRWKV

VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle...
