ModelTC/LightCompress
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
This toolkit compresses large AI models (for generating text, images, or video) so they run faster and use less memory. It takes an existing large model and produces a smaller, cheaper-to-serve version with little loss in quality. It is aimed at AI developers and MLOps engineers who need to deploy such models cost-effectively across a range of hardware.
688 stars. Actively maintained with 36 commits in the last 30 days.
Use this if you need to deploy large AI models (LLMs, VLMs, video generative models) and want to reduce their size and inference costs without significant performance loss.
Not ideal if you are a general user without experience in model deployment or if you need to compress very small, specialized models.
Stars: 688
Forks: 72
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 11, 2026
Commits (30d): 36
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ModelTC/LightCompress"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
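The curl command above can also be issued from Python. A minimal sketch using only the standard library, assuming the endpoint returns JSON; the response field names are not documented on this page, so the fetch helper simply returns the decoded payload as-is:

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for an owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for a repository.

    No API key is needed for the free tier (100 requests/day).
    """
    with urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the same URL the curl example requests.
    print(quality_url("ModelTC", "LightCompress"))
```

For the higher 1,000-requests/day limit, a free key would presumably be passed as a header or query parameter; the exact mechanism is not specified here.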
Related models
p-e-w/heretic
Fully automatic censorship removal for language models
Orion-zhen/abliteration
Make abliterated models with transformers, easy and fast
YerbaPage/LongCodeZip
LongCodeZip: Compress Long Context for Code Language Models [ASE2025]
locuslab/wanda
A simple and effective LLM pruning approach.
tommasomncttn/mergenetic
Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo).