LightCompress and shortened-llm
These tools are competitors: both aim to compress large models, with ModelTC/LightCompress offering a broader toolkit that covers several classes of large models, and Nota-NetsPresso/shortened-llm focusing specifically on LLM compression for efficient text generation.
About LightCompress
ModelTC/LightCompress
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
This toolkit helps organizations make their large AI models, such as those for generating text, images, or video, run more efficiently and use less memory. It takes an existing large model and outputs a smaller, faster version that preserves most of the original's performance. It is aimed at AI developers and MLOps engineers who need to deploy large models cost-effectively across varied hardware.
About shortened-llm
Nota-NetsPresso/shortened-llm
Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]
This project helps machine learning engineers and researchers reduce the size and improve the speed of Large Language Models (LLMs) such as LLaMA and Vicuna. By strategically removing entire blocks of the model (depth pruning), it takes an existing LLM and outputs a smaller, faster version that still performs well. This is useful for anyone who needs to deploy LLMs efficiently on limited hardware or with lower latency.
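The depth-pruning idea can be illustrated in miniature: treat a model as a stack of layer functions and drop selected layers by index, leaving a shallower stack. This is a hypothetical sketch, not the shortened-llm API; the names `depth_prune`, `run`, and the toy layers are illustrative only.

```python
# Hypothetical sketch of depth pruning, assuming a model is a plain
# stack of layer functions applied in order. Not the shortened-llm API.

def depth_prune(layers, remove):
    """Return a shallower stack with the layers at `remove` indices dropped."""
    return [layer for i, layer in enumerate(layers) if i not in remove]

def run(layers, x):
    """Apply the layer stack to an input, one layer at a time."""
    for layer in layers:
        x = layer(x)
    return x

# A toy 6-layer "model" where layer i simply adds i to its input.
layers = [(lambda x, i=i: x + i) for i in range(6)]
pruned = depth_prune(layers, remove={1, 4})  # keep layers 0, 2, 3, 5

print(len(pruned))     # 4
print(run(layers, 0))  # 15  (0+1+2+3+4+5)
print(run(pruned, 0))  # 10  (0+2+3+5)
```

In the real project the "layers" are transformer blocks chosen by an importance criterion, and the pruned model is typically fine-tuned afterwards to recover quality.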