ModelTC/LightCompress
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
This toolkit compresses large AI models (for generating text, images, or video) so they run faster and use less memory. It takes an existing large model and produces a smaller, cheaper-to-serve version with little loss in quality. It is aimed at AI developers and MLOps engineers who need to deploy such models cost-effectively across a range of hardware.
688 stars. Actively maintained with 36 commits in the last 30 days.
Use this if you need to deploy large AI models (LLMs, VLMs, video generative models) and want to reduce their size and inference costs without significant performance loss.
Not ideal if you are a general user without experience in model deployment or if you need to compress very small, specialized models.
Stars: 688
Forks: 72
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 11, 2026
Commits (30d): 36
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ModelTC/LightCompress"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
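The curl command above can also be issued from Python. A minimal sketch using only the standard library, assuming the endpoint returns JSON; the response field names are not documented on this page, so the fetch helper simply returns the decoded payload as-is:

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for an owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for a repository.

    No API key is needed for the free tier (100 requests/day).
    """
    with urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the same URL the curl example requests.
    print(quality_url("ModelTC", "LightCompress"))
```

For the higher 1,000-requests/day limit, a free key would presumably be passed as a header or query parameter; the exact mechanism is not specified here.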
Related models
p-e-w/heretic
Fully automatic censorship removal for language models
Orion-zhen/abliteration
Make abliterated models with transformers, easy and fast
YerbaPage/LongCodeZip
LongCodeZip: Compress Long Context for Code Language Models [ASE2025]
locuslab/wanda
A simple and effective LLM pruning approach.
tommasomncttn/mergenetic
Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo).