Awesome-KV-Cache-Compression and Awesome-LLM-KV-Cache

These repositories complement each other within the same problem space: one curates papers specifically on KV cache compression techniques, while the other collects KV cache research more broadly, with links to corresponding implementations. Together they let researchers explore both specialized compression methods and the wider landscape of KV cache optimizations.

Awesome-KV-Cache-Compression
Maintenance 10/25 · Adoption 10/25 · Maturity 16/25 · Community 11/25
Stars: 668 · Forks: 22 · Downloads: — · Commits (30d): 0 · Language: — · License: MIT
No package published · No dependents

Awesome-LLM-KV-Cache
Maintenance 0/25 · Adoption 10/25 · Maturity 16/25 · Community 13/25
Stars: 417 · Forks: 26 · Downloads: — · Commits (30d): 0 · Language: — · License: GPL-3.0
Stale for 6 months · No package published · No dependents

About Awesome-KV-Cache-Compression

October2001/Awesome-KV-Cache-Compression

📰 Must-read papers on KV Cache Compression (constantly updating 🤗).

This resource provides a curated collection of research papers and projects focused on reducing the memory cost of running Large Language Models (LLMs). It gathers techniques for managing the KV cache, the store of per-token attention keys and values that is reused at every decoding step and grows with context length and batch size. This helps AI researchers and practitioners identify and implement methods that reduce the memory demands and costs of deploying and operating LLMs.
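To make the scale of the problem concrete, here is a back-of-envelope sketch of KV cache size. The formula itself (two tensors per layer, one for keys and one for values, each of shape [batch, kv_heads, seq_len, head_dim]) is standard; the specific configuration below (a Llama-2-7B-like model in fp16) is only an illustrative assumption, not something taken from the list.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int, bytes_per_elem: int = 2) -> int:
    """Rough KV cache size: 2 tensors (K and V) per layer, each of shape
    [batch_size, num_kv_heads, seq_len, head_dim], stored at bytes_per_elem."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem

# Illustrative assumption: a Llama-2-7B-like config (32 layers, 32 KV heads,
# head_dim 128), a 4096-token context, batch size 8, fp16 (2 bytes/element).
size = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                      seq_len=4096, batch_size=8)
print(f"{size / 2**30:.1f} GiB")  # 16.0 GiB of KV cache alone
```

At that scale the cache, rather than the weights, can dominate serving memory as context length or batch size grows, which is exactly what the compression papers collected here target.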

Large Language Models · LLM Optimization · AI Inference · Natural Language Processing · Deep Learning Efficiency

About Awesome-LLM-KV-Cache

Zefan-Cai/Awesome-LLM-KV-Cache

Awesome-LLM-KV-Cache: A curated list of 📙 Awesome LLM KV Cache Papers with Codes.

This is a curated list of research papers and associated codebases focused on optimizing the key-value (KV) cache in large language models (LLMs). It helps AI researchers and practitioners stay up to date with advances in LLM inference efficiency: a categorized list of academic papers, often with links to their code, covering different strategies for managing the KV cache.
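As a flavor of the eviction-style strategies such lists cover (for example, keeping a few initial "attention sink" tokens plus a recent window, in the spirit of StreamingLLM-like approaches), here is a minimal toy sketch. The function name, tensor shapes, and budget parameters are hypothetical illustrations, not code from any paper in the list.

```python
import torch

def evict_kv(keys: torch.Tensor, values: torch.Tensor,
             max_tokens: int, sink_tokens: int = 4):
    """Toy eviction policy: once the cached sequence exceeds max_tokens,
    keep the first sink_tokens entries plus the most recent tokens and
    drop everything in between. keys/values: [batch, heads, seq_len, head_dim]."""
    seq_len = keys.shape[2]
    if seq_len <= max_tokens:
        return keys, values
    recent = max_tokens - sink_tokens
    keys = torch.cat([keys[:, :, :sink_tokens], keys[:, :, -recent:]], dim=2)
    values = torch.cat([values[:, :, :sink_tokens], values[:, :, -recent:]], dim=2)
    return keys, values

# Example: a cache that has grown to 4096 tokens is trimmed back to 1024.
k = torch.randn(1, 32, 4096, 128)
v = torch.randn(1, 32, 4096, 128)
k, v = evict_kv(k, v, max_tokens=1024)
print(k.shape)  # torch.Size([1, 32, 1024, 128])
```

Published methods differ mainly in how they score which entries to keep or compress (attention mass, quantization error, per-head budgets, and so on); the curated papers compare these trade-offs.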

AI-research · LLM-inference · model-optimization · deep-learning-efficiency · natural-language-processing

Scores updated daily from GitHub, PyPI, and npm data.