LLM Compression and Optimization Tools
Tools and techniques for reducing LLM size, memory footprint, and inference latency through compression, pruning, quantization, and architectural optimization. Does NOT include general model training, fine-tuning frameworks, or inference serving infrastructure.
17 LLM compression and optimization tools are tracked. One scores 70 or above (Verified tier): the highest-rated, Tencent/AngelSlim, at 70/100 with 536 stars. One of the top 10 is actively maintained.
Get all 17 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-compression-optimization&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
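The same request can be made from a script. The following is a minimal Python sketch using only the standard library; it assumes the endpoint returns JSON, as the heading above the curl command states. The helper names `build_url` and `fetch_projects` are hypothetical and exist only for this illustration. The shape of the response is not documented here, so the sketch returns the decoded payload without assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools",
              subcategory="llm-compression-optimization",
              limit=20):
    """Build the query URL (hypothetical helper, not part of any official client)."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response structure is an unknown here, so the decoded
    object is returned as-is for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request, and each call counts against the daily request limit, so it may be worth caching the result locally.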
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | Tencent/AngelSlim | Model compression toolkit engineered for enhanced usability,... | 70 | Verified |
| 2 | nebuly-ai/optimate | A collection of libraries to optimise AI model performances | | Emerging |
| 3 | antgroup/glake | GLake: optimizing GPU memory management and IO transmission. | | Emerging |
| 4 | kyo-takano/chinchilla | A toolkit for scaling law research ⚖ | | Emerging |
| 5 | liyucheng09/Selective_Context | Compress your input to ChatGPT or other LLMs, to let them process 2x more... | | Emerging |
| 6 | TsingmaoAI/MI-optimize | mi-optimize is a versatile tool designed for the quantization and evaluation... | | Emerging |
| 7 | microsoft/only_train_once | OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured... | | Emerging |
| 8 | amazon-science/llm-rank-pruning | LLM-Rank: A graph theoretical approach to structured pruning of large... | | Emerging |
| 9 | naskio/mergeui | All-in-one UI for merged LLMs in Hugging Face | | Emerging |
| 10 | LINs-lab/DeFT | [ICLR 2025] DeFT: Decoding with Flash Tree-attention for Efficient... | | Emerging |
| 11 | robtacconelli/Nacrith-GPU | Nacrith: Lossless text compression via ensemble neural arithmetic coding.... | | Emerging |
| 12 | M9rth/heretic | 🛠 Remove censorship from language models instantly using advanced... | | Experimental |
| 13 | talkking/PrunerGPT | [ICASSP2024] One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large... | | Experimental |
| 14 | louisbrulenaudet/mergeKit | Tools for merging pretrained Large Language Models and create Mixture of... | | Experimental |
| 15 | Yvancg/optimizers | A collection of minimal, dependency-free, performance-focused utilities for... | | Experimental |
| 16 | louisbrulenaudet/mergekit-assistant | Mergekit Assistant is a cutting-edge toolkit designed for the seamless... | | Experimental |
| 17 | Mikola78/trinity-large-tech-report | 🚀 Explore advanced sparse Mixture-of-Experts models with up to 400B... | | Experimental |