kvcache-ai/Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

Score: 69 / 100 (Established)

This project helps large AI service providers manage and serve large language models (LLMs) more efficiently. It optimizes the process of delivering AI responses by providing a specialized system for handling the key-value cache (KVCache) — critical data that LLMs use for generating text. Organizations running advanced AI models like Kimi or similar large-scale services would use this to improve performance and reduce operational costs.

4,911 stars. Actively maintained with 111 commits in the last 30 days.

Use this if you are operating a large-scale AI service and need to optimize the performance and cost-efficiency of serving large language models.

Not ideal if you are developing small-scale AI applications or do not manage distributed, high-throughput LLM serving infrastructure.

Tags: AI-service-provision, large-language-model-deployment, AI-infrastructure-optimization, high-performance-computing, cloud-AI-operations
No package · No dependents
Maintenance 22 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 21 / 25


Stars: 4,911
Forks: 600
Language: C++
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 111

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/kvcache-ai/Mooncake"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.