LLM Compression Optimization Transformer Models

44 LLM compression optimization models are tracked. Four score above 50 (Established tier). The highest-rated is ModelTC/LightCompress at 64/100 with 688 stars. Two of the top 10 are actively maintained.

Get all 44 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-compression-optimization&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
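The same request can be made from a script. The following is a minimal Python sketch, assuming only the endpoint and the three query parameters shown in the curl command above (`domain`, `subcategory`, `limit`). The helper names `build_quality_url` and `fetch_quality` are hypothetical, the shape of the JSON response is not documented here and is returned unparsed beyond `json.loads`, and how a free key is passed is not shown because the page does not say.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="transformers",
                      subcategory="llm-compression-optimization",
                      limit=20):
    """Build the request URL (hypothetical helper).

    The defaults mirror the curl example. Whether a larger limit
    (for example 44) is accepted is an assumption to verify.
    """
    query = urlencode({"domain": domain,
                       "subcategory": subcategory,
                       "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """Fetch the URL and decode the body as JSON (hypothetical helper).

    The structure of the returned object is not described on this page,
    so the caller should inspect it before relying on any field names.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Printing the URL performs no network request; call
    # fetch_quality(url) to actually query the API.
    print(build_quality_url())
```

Keeping URL construction separate from the network call makes it easy to check the request before spending any of the 100 unauthenticated requests per day.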

# | Model | Description | Score | Tier
1 | ModelTC/LightCompress | [EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models... | 64 | Established
2 | p-e-w/heretic | Fully automatic censorship removal for language models | 62 | Established
3 | Orion-zhen/abliteration | Make abliterated models with transformers, easy and fast | 54 | Established
4 | YerbaPage/LongCodeZip | LongCodeZip: Compress Long Context for Code Language Models [ASE2025] | 54 | Established
5 | locuslab/wanda | A simple and effective LLM pruning approach. | 47 | Emerging
6 | tommasomncttn/mergenetic | Flexible library for merging large language models (LLMs) via evolutionary... | 44 | Emerging
7 | FMInference/FlexLLMGen | Running large language models on a single GPU for throughput-oriented scenarios. | 44 | Emerging
8 | luuyin/OWL | Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity... | 40 | Emerging
9 | ymcui/Chinese-Mixtral | Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs) | 40 | Emerging
10 | zyushun/Adam-mini | Code for Adam-mini: Use Fewer Learning Rates To Gain More... | 40 | Emerging
11 | horseee/Awesome-Efficient-LLM | A curated list for Efficient Large Language Models | 39 | Emerging
12 | BaiTheBest/SparseLLM | Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) | 38 | Emerging
13 | Koratahiu/Advanced_Optimizers | A family of highly efficient, lightweight yet powerful optimizers. | 38 | Emerging
14 | HOLYKEYZ/model-unfetter | The production engine for directional ablation. Unalign / remove models... | 38 | Emerging
15 | jeffreysijuntan/lloco | The official repo for "LLoCo: Learning Long Contexts Offline" | 36 | Emerging
16 | xuyang-liu16/GlobalCom2 | [AAAI 2026] Global Compression Commander: Plug-and-Play Inference... | 36 | Emerging
17 | BauplanLabs/Making-Databases-Faster-with-LLM-Evolutionary-Sampling | Repository hosting code to reproduce our paper (with Stanford and... | 35 | Emerging
18 | arcee-ai/PruneMe | Automated Identification of Redundant Layer Blocks for Pruning in Large... | 34 | Emerging
19 | asahi417/lm-vocab-trimmer | Vocabulary Trimming (VT) is a model compression technique, which reduces a... | 33 | Emerging
20 | Nota-NetsPresso/shortened-llm | Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop] | 32 | Emerging
21 | jordddan/Pruning-LLMs | The framework to prune LLMs to any size and any config. | 31 | Emerging
22 | dmis-lab/Outlier-Safe-Pre-Training | [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large... | 31 | Emerging
23 | whucs21Mzy/Model-Phase-Transitions | Navigating Model Phase Transitions to Enable Extreme Lossless Compression: A... | 30 | Emerging
24 | AndyyyYuuu/lm-is-compressor | An accurate language model is a high-compression, lossless data compressor | 29 | Experimental
25 | Scientific-Computing-Lab/Tokompiler | Scope is all you need: Transforming LLMs for HPC Code | 27 | Experimental
26 | OpenNLG/OpenBA-v2 | OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing... | 27 | Experimental
27 | oliviersaidi/PACF_LLM | Pattern-aware optimization framework achieving 93.8% complexity reduction in... | 26 | Experimental
28 | friendshipkim/overfill | Code for OverFill: Two-Stage Models for Efficient Language Model Decoding | 25 | Experimental
29 | Aaronhuang-778/SliM-LLM | [ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large... | 25 | Experimental
30 | deadlykitten4/ERC-SVD | ERC-SVD: Error-Controlled SVD for Large Language Model Compression | 25 | Experimental
31 | Pro-GenAI/ShortLang | Compressed Text for efficient LLMs | 22 | Experimental
32 | JingyangXiang/DFRot | [COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for... | 21 | Experimental
33 | bupt-ai-club/llm-compression-papers | papers of llm compression | 21 | Experimental
34 | simocolo/nnDrain | A PyTorch implementation for structural pruning applied to neural networks... | 20 | Experimental
35 | plandes/lmtask | Inferencing and Training Large Language Model Tasks | 18 | Experimental
36 | burcgokden/LLM-from-Power-Law-Decoder-Representations | Implementation of PLDR-LLM: Large Language Model from Power Law Decoder... | 18 | Experimental
37 | Exthalpy/GenLang | Self-Decoding Compression Architecture | 17 | Experimental
38 | burcgokden/PLDR-LLM-with-KVG-cache | Implementation of PLDR-LLM with KV-cache and G-cache in Pytorch for the... | 17 | Experimental
39 | arrmansa/Temporal-Neuron-Variance-Pruning-Demo | An implementation of Variance Pruning: Pruning Language Models via Temporal... | 17 | Experimental
40 | liyucheng09/llm-compressive | Longitudinal Evaluation of LLMs via Data Compression | 15 | Experimental
41 | 0xnu/multicollinearity_llm | A multicollinearity-based compression C program, identifies and removes... | 13 | Experimental
42 | chandan11248/deepseek-innovations-from-scratch | Reverse-engineering how DeepSeek achieved frontier LLM performance at a... | 13 | Experimental
43 | mrzjy/expert_choice_visualization_for_mixtral | A simple project that helps visualize expert router choices for text generation | 11 | Experimental
44 | DamianS21/parallel_llm | Parallelise LLM (GPT) outputs for better results | 10 | Experimental
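
The tier labels in the listing appear to follow the score. The sketch below is an inference from the rows above, not a documented rule: the page states only that scores above 50 are Established, while the cutoff of 30 between Emerging and Experimental is assumed from the lowest Emerging score (30) and the highest Experimental score (29). The function name `tier_for_score` is hypothetical, and how a score of exactly 50 is labelled is unknown; it is treated as Emerging here.

```python
def tier_for_score(score):
    """Map a 0-100 score to a tier label (thresholds are assumptions).

    "Above 50" for Established comes from the page summary; the
    30-point cutoff is inferred from the listed rows.
    """
    if score > 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# A few (model, score, tier) rows copied from the listing above.
SAMPLE_ROWS = [
    ("ModelTC/LightCompress", 64, "Established"),
    ("YerbaPage/LongCodeZip", 54, "Established"),
    ("locuslab/wanda", 47, "Emerging"),
    ("whucs21Mzy/Model-Phase-Transitions", 30, "Emerging"),
    ("AndyyyYuuu/lm-is-compressor", 29, "Experimental"),
    ("DamianS21/parallel_llm", 10, "Experimental"),
]

if __name__ == "__main__":
    for model, score, listed_tier in SAMPLE_ROWS:
        status = "matches" if tier_for_score(score) == listed_tier else "differs"
        print(f"{model}: {score} -> {tier_for_score(score)} ({status})")
```

Because no listed score falls between 48 and 53, the rows cannot confirm where the Established boundary sits exactly.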