LLM Compression Optimization Transformer Models

This category tracks 44 LLM compression optimization projects. Four score above 50 (Established tier). The highest-rated is ModelTC/LightCompress at 64/100 with 688 stars. Two of the top 10 are actively maintained.

Get all 44 projects as JSON (note that the example request sets limit=20):

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-compression-optimization&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
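The same request can be issued from a script. The following is a minimal sketch using only the Python standard library; it reuses the endpoint and the three query parameters shown in the curl command. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the shape of the JSON response is not documented on this page, and whether the API accepts a `limit` larger than 20 is an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="transformers",
                      subcategory="llm-compression-optimization",
                      limit=20):
    """Build the request URL from the query parameters seen in the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response schema is not described on this page, so the decoded
    object is returned as-is for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality(build_quality_url())` performs the same request as the curl command. Passing a different `limit` only changes the query string; how the server treats other values is untested here.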

#	Model	Score	Tier	Stars	Language
1	ModelTC/LightCompress [EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models...	64	Established	688	Python
2	p-e-w/heretic Fully automatic censorship removal for language models	62	Established	12,369	Python
3	Orion-zhen/abliteration Make abliterated models with transformers, easy and fast	54	Established	128	Python
4	YerbaPage/LongCodeZip LongCodeZip: Compress Long Context for Code Language Models [ASE2025]	54	Established	142	Python
5	locuslab/wanda A simple and effective LLM pruning approach.	47	Emerging	854	Python
6	tommasomncttn/mergenetic Flexible library for merging large language models (LLMs) via evolutionary...	44	Emerging	100	Jupyter Notebook
7	FMInference/FlexLLMGen Running large language models on a single GPU for throughput-oriented scenarios.	44	Emerging	9,380	Python
8	luuyin/OWL Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity...	40	Emerging	81	Python
9	ymcui/Chinese-Mixtral Chinese Mixtral mixture-of-experts large models (Chinese Mixtral MoE LLMs)	40	Emerging	610	Python
10	zyushun/Adam-mini Code for Adam-mini: Use Fewer Learning Rates To Gain More...	40	Emerging	453	Python
11	horseee/Awesome-Efficient-LLM A curated list for Efficient Large Language Models	39	Emerging	1,967	Python
12	BaiTheBest/SparseLLM Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)	38	Emerging	67	Python
13	Koratahiu/Advanced_Optimizers A family of highly efficient, lightweight yet powerful optimizers.	38	Emerging	21	Python
14	HOLYKEYZ/model-unfetter The production engine for directional ablation. Unalign / remove models...	38	Emerging	19	Python
15	jeffreysijuntan/lloco The official repo for "LLoCo: Learning Long Contexts Offline"	36	Emerging	118	Python
16	xuyang-liu16/GlobalCom2 [AAAI 2026] Global Compression Commander: Plug-and-Play Inference...	36	Emerging	39	Python
17	BauplanLabs/Making-Databases-Faster-with-LLM-Evolutionary-Sampling Repository hosting code to reproduce our paper (with Stanford and...	35	Emerging	18	Python
18	arcee-ai/PruneMe Automated Identification of Redundant Layer Blocks for Pruning in Large...	34	Emerging	263	Python
19	asahi417/lm-vocab-trimmer Vocabulary Trimming (VT) is a model compression technique, which reduces a...	33	Emerging	63	Python
20	Nota-NetsPresso/shortened-llm Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]	32	Emerging	90	Python
21	jordddan/Pruning-LLMs The framework to prune LLMs to any size and any config.	31	Emerging	95	Python
22	dmis-lab/Outlier-Safe-Pre-Training [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large...	31	Emerging	35	Python
23	whucs21Mzy/Model-Phase-Transitions Navigating Model Phase Transitions to Enable Extreme Lossless Compression: A...	30	Emerging	76	—
24	AndyyyYuuu/lm-is-compressor An accurate language model is a high-compression, lossless data compressor	29	Experimental	4	Python
25	Scientific-Computing-Lab/Tokompiler Scope is all you need: Transforming LLMs for HPC Code	27	Experimental	10	Python
26	OpenNLG/OpenBA-v2 OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing...	27	Experimental	25	Python
27	oliviersaidi/PACF_LLM Pattern-aware optimization framework achieving 93.8% complexity reduction in...	26	Experimental	1	Python
28	friendshipkim/overfill Code for OverFill: Two-Stage Models for Efficient Language Model Decoding	25	Experimental	5	Python
29	Aaronhuang-778/SliM-LLM [ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large...	25	Experimental	53	Python
30	deadlykitten4/ERC-SVD ERC-SVD: Error-Controlled SVD for Large Language Model Compression	25	Experimental	1	Python
31	Pro-GenAI/ShortLang Compressed Text for efficient LLMs	22	Experimental	4	Python
32	JingyangXiang/DFRot [COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for...	21	Experimental	29	Python
33	bupt-ai-club/llm-compression-papers papers of llm compression	21	Experimental	13	—
34	simocolo/nnDrain A PyTorch implementation for structural pruning applied to neural networks...	20	Experimental	5	Jupyter Notebook
35	plandes/lmtask Inferencing and Training Large Language Model Tasks	18	Experimental	1	Python
36	burcgokden/LLM-from-Power-Law-Decoder-Representations Implementation of PLDR-LLM: Large Language Model from Power Law Decoder...	18	Experimental	2	Python
37	Exthalpy/GenLang Self-Decoding Compression Architecture	17	Experimental	1	Jupyter Notebook
38	burcgokden/PLDR-LLM-with-KVG-cache Implementation of PLDR-LLM with KV-cache and G-cache in Pytorch for the...	17	Experimental	1	Python
39	arrmansa/Temporal-Neuron-Variance-Pruning-Demo An implementation of Variance Pruning: Pruning Language Models via Temporal...	17	Experimental	1	Jupyter Notebook
40	liyucheng09/llm-compressive Longitudinal Evaluation of LLMs via Data Compression	15	Experimental	33	Python
41	0xnu/multicollinearity_llm A multicollinearity-based compression C program, identifies and removes...	13	Experimental	2	C
42	chandan11248/deepseek-innovations-from-scratch Reverse-engineering how DeepSeek achieved frontier LLM performance at a...	13	Experimental	—	Jupyter Notebook
43	mrzjy/expert_choice_visualization_for_mixtral A simple project that helps visualize expert router choices for text generation	11	Experimental	4	Python
44	DamianS21/parallel_llm Parallelise LLM (GPT) outputs for better results	10	Experimental	1	Python
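The tier labels in the table are consistent with simple score cutoffs. The sketch below assumes thresholds inferred from the rows and the summary sentence, not from any documented scoring rule: above 50 is Established, 30 to 50 is Emerging, and below 30 is Experimental. The function name `tier_for` is hypothetical, and the score list is copied from the table.

```python
from collections import Counter

# Scores copied from the table above, in rank order (44 rows).
SCORES = [
    64, 62, 54, 54, 47, 44, 44, 40, 40, 40,
    39, 38, 38, 38, 36, 36, 35, 34, 33, 32,
    31, 31, 30, 29, 27, 27, 26, 25, 25, 25,
    22, 21, 21, 20, 18, 18, 17, 17, 17, 15,
    13, 13, 11, 10,
]


def tier_for(score):
    """Map a score to a tier using cutoffs inferred from the table (an assumption)."""
    if score > 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# Tally how many listed projects fall into each inferred tier.
tier_counts = Counter(tier_for(s) for s in SCORES)
```

Under these assumed cutoffs, `tier_counts["Established"]` matches the four projects the summary describes as scoring above 50. No listed score falls between 48 and 53, so the exact Established boundary cannot be confirmed from this table alone.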

Comparisons in this category

LightCompress and Awesome-Efficient-LLM (64 vs 39)
LightCompress and shortened-llm (64 vs 32)
LightCompress and GlobalCom2 (64 vs 36)