KV Cache Optimization: Transformer Models

There are 9 KV cache optimization projects tracked. 1 scores above 70 (verified tier). The highest-rated is LMCache/LMCache at 79/100 with 7,664 stars. Only 1 of the top 10 is actively maintained.

Get all 9 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=kv-cache-optimization&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
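Once you have the JSON response, filtering it down to the verified tier is a few lines of Python. This is a minimal sketch: the `projects`, `name`, `score`, and `tier` field names are assumptions about the response schema, not confirmed from the API, and the sample payload below is inlined rather than fetched.

```python
import json

# Hypothetical response shape -- the actual schema returned by the
# pt-edge.onrender.com quality endpoint may use different field names.
sample = json.loads("""
{
  "projects": [
    {"name": "LMCache/LMCache", "score": 79, "tier": "Verified"},
    {"name": "Zefan-Cai/KVCache-Factory", "score": 47, "tier": "Emerging"},
    {"name": "DRSY/EasyKV", "score": 25, "tier": "Experimental"}
  ]
}
""")

def verified(projects, threshold=70):
    """Return names of projects at or above the verified-tier score cutoff."""
    return [p["name"] for p in projects if p["score"] >= threshold]

print(verified(sample["projects"]))  # -> ['LMCache/LMCache']
```

To run this against the live endpoint, replace the inlined `sample` with the parsed body of the curl request above.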

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | LMCache/LMCache | Supercharge Your LLM with the Fastest KV Cache Layer | 79 | Verified |
| 2 | Zefan-Cai/KVCache-Factory | Unified KV Cache Compression Methods for Auto-Regressive Models | 47 | Emerging |
| 3 | dataflowr/llm_efficiency | KV Cache & LoRA for minGPT | 41 | Emerging |
| 4 | OnlyTerp/kvtc | First open-source KVTC implementation (NVIDIA, ICLR 2026) -- 8-32x KV cache... | 38 | Emerging |
| 5 | itsnamgyu/block-transformer | Block Transformer: Global-to-Local Language Modeling for Fast Inference... | 38 | Emerging |
| 6 | OnlyTerp/turboquant | First open-source implementation of Google TurboQuant (ICLR 2026) --... | 35 | Emerging |
| 7 | codepawl/turboquant-torch | Unofficial PyTorch implementation of TurboQuant (Google Research, ICLR... | 27 | Experimental |
| 8 | DRSY/EasyKV | Easy control for Key-Value Constrained Generative LLM... | 25 | Experimental |
| 9 | DingWeiPeng/Transformer-decoder-only-with-KV-Cache | Transformer Key Value Store/Transformer Decoder Only with KV Cache | 10 | Experimental |