TreeAI-Lab/Awesome-KV-Cache-Management
This repository is a comprehensive survey of KV cache management for Large Language Models (LLMs), collecting research papers along with links to their code.
This project is for developers who work with LLMs and need to improve their performance, particularly memory usage and inference speed. It collects and categorizes research papers on KV cache management: the KV cache stores the attention keys and values of already-processed tokens so they are not recomputed at every decoding step, at the cost of memory that grows with sequence length. The output is a curated list of research papers and their code, helping developers find methods to make their LLMs run faster and more efficiently.
Use this if you are an LLM developer or researcher looking for state-of-the-art techniques to accelerate LLM inference and manage memory more effectively.
Not ideal if you are an end-user of an LLM application and do not work directly with LLM infrastructure or model development.
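To make the trade-off concrete, here is a toy sketch of KV caching during autoregressive decoding. It is illustrative only (no learned projections, single head, NumPy instead of a real framework) and is not taken from any paper in this list; the names `attention`, `k_cache`, and `v_cache` are ours.

```python
# Toy sketch: caching keys/values during autoregressive decoding.
# Each step attends over all previous tokens, so caching K and V avoids
# recomputing them; the cache grows by one entry per generated token,
# which is the memory cost the surveyed papers compress, evict, or offload.
import numpy as np

def attention(q, K, V):
    """Single-query scaled dot-product attention."""
    scores = K @ q / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

rng = np.random.default_rng(0)
d = 4
k_cache, v_cache = [], []  # the "KV cache"
outputs = []

for step in range(3):
    x = rng.normal(size=d)          # hidden state of the new token
    # (real models would apply learned W_k, W_v projections here)
    k_cache.append(x)
    v_cache.append(x)
    outputs.append(attention(x, np.array(k_cache), np.array(v_cache)))

print(len(k_cache))  # cache length == tokens decoded so far
```

The papers collected in this repository target exactly this linear growth, e.g. by quantizing, pruning, or sharing cache entries.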
Stars
291
Forks
9
Language
—
License
—
Category
Last pushed
Dec 05, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/TreeAI-Lab/Awesome-KV-Cache-Management"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
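If you prefer to query the endpoint from Python rather than curl, a minimal sketch is below. The endpoint URL is taken from the curl command above; the response's JSON schema is not documented here, so the code just pretty-prints whatever comes back, and the helper names (`quality_url`, `fetch_quality`) are ours.

```python
# Sketch: fetch repository quality data from the pt-edge API shown above.
# The response schema is undocumented on this page, so we print it raw.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """GET the quality record (free tier: 100 requests/day, no key)."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=10) as resp:
        return json.load(resp)

if __name__ == "__main__":
    data = fetch_quality("TreeAI-Lab", "Awesome-KV-Cache-Management")
    print(json.dumps(data, indent=2))
```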
Higher-rated alternatives
ModelEngine-Group/unified-cache-management
Persist and reuse KV Cache to speedup your LLM.
reloadware/reloadium
Hot Reloading and Profiling for Python
October2001/Awesome-KV-Cache-Compression
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
alibaba/tair-kvcache
Alibaba Cloud's high-performance KVCache system for LLM inference, with components for global...
Zefan-Cai/Awesome-LLM-KV-Cache
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.