StargazerX0/ScaleKV
[NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
ScaleKV helps researchers and engineers working with large visual generative models reduce the memory footprint of image generation. It takes a trained visual autoregressive model and applies scale-aware KV cache compression, so the same model uses substantially less memory during image generation and becomes feasible to run on more constrained hardware. This project is aimed at those developing and deploying advanced image generation systems.
Use this if you are developing visual generative AI and need to significantly reduce the memory consumption of your large visual autoregressive models without sacrificing image quality.
Not ideal if you are working with text-based models or do not face memory constraints when generating images with visual autoregressive models.
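To make the memory concern concrete, here is a rough sketch of how a KV cache grows in a next-scale autoregressive image model. It uses the standard estimate of 2 (keys and values) x layers x cached tokens x hidden size x bytes per element. The function names, the scale schedule, and the model dimensions are illustrative assumptions, not values taken from ScaleKV, and the sketch does not implement ScaleKV's compression method.

```python
from typing import Sequence


def cached_tokens(scale_sides: Sequence[int]) -> int:
    """Total tokens held in the cache when every scale's token map is kept.

    Each scale is assumed to be a square map of side s, contributing s * s tokens.
    """
    return sum(s * s for s in scale_sides)


def kv_cache_bytes(
    scale_sides: Sequence[int],
    num_layers: int,
    hidden_dim: int,
    bytes_per_elem: int = 2,  # assumption: fp16/bf16 cache entries
    batch_size: int = 1,
) -> int:
    """Uncompressed KV cache size in bytes for one generation pass.

    The factor 2 accounts for storing both keys and values in every layer.
    """
    tokens = cached_tokens(scale_sides)
    return 2 * num_layers * tokens * hidden_dim * bytes_per_elem * batch_size


# Hypothetical multi-scale schedule and model size, for illustration only.
EXAMPLE_SCALES = (1, 2, 3, 4, 5, 6, 8, 10, 13, 16)
example_bytes = kv_cache_bytes(EXAMPLE_SCALES, num_layers=30, hidden_dim=1920)
```

Because the estimate is linear in the number of cached tokens, any method that keeps fewer cached entries per layer shrinks the footprint proportionally, and the savings multiply with batch size.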
Stars: 50
Forks: 2
Language: Python
License: MIT
Category:
Last pushed: Nov 04, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/StargazerX0/ScaleKV"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
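The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, and it assumes the endpoint returns a JSON body, which the listing does not state.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a given owner/repo pair."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the caller should inspect the raw response.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("StargazerX0", "ScaleKV")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than re-fetching on every run.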
Higher-rated alternatives
ModelCloud/GPTQModel
LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD...
intel/auto-round
🎯An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality...
pytorch/ao
PyTorch native quantization and sparsity for training and inference
bodaay/HuggingFaceModelDownloader
Simple go utility to download HuggingFace Models and Datasets
NVIDIA/kvpress
LLM KV cache compression made easy