Awesome-KV-Cache-Compression and Awesome-LLM-KV-Cache
These repositories complement each other within the same problem space: Awesome-KV-Cache-Compression curates papers specifically on KV cache compression techniques, while Awesome-LLM-KV-Cache collects KV cache research more broadly, with links to corresponding implementations. Together they let researchers explore both specialized compression methods and the wider landscape of KV cache optimizations.
About Awesome-KV-Cache-Compression
October2001/Awesome-KV-Cache-Compression
Must-read papers on KV Cache Compression (constantly updating).
This resource provides a curated collection of research papers and projects focused on optimizing the memory usage of Large Language Models (LLMs). It gathers techniques for making LLMs run more efficiently by managing the KV cache, the structure that stores attention keys and values during response generation. This helps AI researchers and practitioners identify and implement methods to reduce the memory footprint and cost of deploying and operating LLMs.
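To make the memory pressure concrete, KV cache size can be estimated directly from the model shape: two tensors (keys and values) per layer, each scaling with the number of KV heads, the head dimension, and the sequence length. The Python sketch below is illustrative only; the example configuration (32 layers, 32 KV heads, head dimension 128) is an assumption resembling a 7B-class model, not something taken from either list.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_element: int = 2,  # fp16/bf16
) -> int:
    """Estimate KV cache size: 2 tensors (K and V) per layer,
    each of shape [batch, num_kv_heads, seq_len, head_dim]."""
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_element)

# Assumed 7B-class configuration at a 4096-token context in fp16:
# 2 * 32 * 32 * 128 * 4096 * 2 bytes = 2 GiB per sequence.
print(kv_cache_bytes(32, 32, 128, 4096) / 2**30, "GiB")
```

Because this grows linearly with sequence length and batch size, compression and eviction techniques like those collected in these lists directly determine how long a context a given GPU can serve.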
About Awesome-LLM-KV-Cache
Zefan-Cai/Awesome-LLM-KV-Cache
Awesome-LLM-KV-Cache: A curated list of Awesome LLM KV Cache Papers with Codes.
This is a curated list of research papers and associated codebases focused on optimizing the Key-Value (KV) cache in large language models (LLMs). It helps AI researchers and practitioners stay up to date with advances in LLM inference efficiency. You get a categorized list of academic papers, often with links to their code, and insight into different strategies for managing KV caches; a minimal example of one such strategy is sketched below.
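For orientation, one of the simplest strategy families covered by this kind of work is recency-based eviction, where only the keys and values of the most recent tokens are kept. The sketch below is a minimal illustration of that idea, not the method of any particular paper in the list; the class name and window parameter are invented for the example.

```python
from collections import deque

import torch


class SlidingWindowKVCache:
    """Keep only the K/V entries for the most recent `window` tokens.

    A deliberately simple eviction policy; the papers collected in these
    lists cover far more sophisticated strategies (quantization, token
    selection, low-rank projection, cross-layer sharing, and so on).
    """

    def __init__(self, window: int):
        self.window = window
        # Each entry has shape [num_kv_heads, head_dim]; a deque with
        # maxlen silently drops the oldest entry once the window is full.
        self.keys: deque = deque(maxlen=window)
        self.values: deque = deque(maxlen=window)

    def append(self, k: torch.Tensor, v: torch.Tensor) -> None:
        self.keys.append(k)
        self.values.append(v)

    def materialize(self) -> tuple[torch.Tensor, torch.Tensor]:
        # Stack retained entries into [kept_len, num_kv_heads, head_dim]
        # tensors for use in the attention computation.
        return torch.stack(list(self.keys)), torch.stack(list(self.values))
```

Real systems typically combine such eviction with scoring (e.g., attention-based importance) or quantization; the categorized papers in the list compare these trade-offs in detail.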