Zefan-Cai/Awesome-LLM-KV-Cache
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
This is a curated list of research papers, with associated codebases where available, focused on optimizing the key-value (KV) cache in large language models (LLMs). It helps AI researchers and practitioners stay up to date with advances in LLM inference efficiency: you get a categorized list of academic papers, often with links to their code, and an overview of different strategies for managing KV caches.
417 stars. No commits in the last 6 months.
Use this if you are an AI researcher or machine learning engineer working on the performance and efficiency of LLM inference and want to survey the latest techniques in KV cache management.
Not ideal if you are looking for an off-the-shelf tool or library to use directly in a non-research LLM application.
Stars: 417
Forks: 26
Language: —
License: GPL-3.0
Category: —
Last pushed: Mar 03, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Zefan-Cai/Awesome-LLM-KV-Cache"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
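If you would rather query the endpoint from a script than from curl, a minimal Python sketch along these lines should work. It assumes the endpoint returns a JSON document; the response schema is not documented here, so the example simply pretty-prints whatever comes back.

import json
import urllib.request

# Public endpoint shown above; per the listing, no key is needed for up to 100 requests/day.
URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Zefan-Cai/Awesome-LLM-KV-Cache"

with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)  # assumption: the endpoint returns JSON

# The field names are not documented here, so just inspect the full payload.
print(json.dumps(data, indent=2))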
Higher-rated alternatives
ModelEngine-Group/unified-cache-management: Persist and reuse KV Cache to speed up your LLM.
reloadware/reloadium: Hot Reloading and Profiling for Python.
alibaba/tair-kvcache: Alibaba Cloud's high-performance KVCache system for LLM inference, with components for global...
October2001/Awesome-KV-Cache-Compression: 📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
xcena-dev/maru: High-Performance KV Cache Storage Engine on CXL Shared Memory for LLM Inference.