LLM Quantization Techniques (LLM Tools)

Tools and libraries for compressing LLM weights through quantization methods (int8, int4, binary, ternary), including inference frameworks and optimization techniques. Does NOT include general model compression, pruning, distillation, or non-quantization-based optimization approaches.
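For readers new to the category, the following is a minimal sketch of what two of the methods named above (int8 and ternary) do to a list of weights. It is illustrative only and is not taken from any tool in this list. The function names are hypothetical, the int8 variant assumes symmetric absmax scaling, and the ternary variant assumes mean-absolute-value scaling, loosely in the spirit of BitNet b1.58.

```python
def quantize_int8(weights):
    """Symmetric absmax quantization (assumed scheme): map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clip to the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def quantize_ternary(weights):
    """Ternary quantization (assumed scheme): scale by mean |w|, round, clip to {-1, 0, 1}."""
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    scale = mean_abs if mean_abs > 0 else 1.0
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from quantized values and their scale."""
    return [v * scale for v in q]
```

Calling `dequantize(*quantize_int8(weights))` returns values close to the originals, with a per-weight error of at most about half the scale. The ternary variant trades much more accuracy for a far smaller representation.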

There are 15 LLM quantization tools tracked. One scores above 50 (Established tier). The highest-rated is huawei-csl/SINQ at 60/100 with 602 stars.

Get all 15 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-quantization-techniques&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
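The same request can be made from a script. The sketch below uses only the Python standard library and rebuilds the URL from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not described on this page, the decoded JSON is returned as-is rather than parsed into fields.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="llm-quantization-techniques", limit=20):
    """Assemble the query string used by the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Perform the GET request and return the decoded JSON body unchanged."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example (performs a network request and counts against the daily limit):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

No key is passed here, which matches the unauthenticated tier described above. How a key would be supplied is not stated on this page, so it is left out.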

#	Tool	Description	Score	Tier	Stars	Language
1	huawei-csl/SINQ	Welcome to the official repository of SINQ! A novel, fast and high-quality...	60	Established	602	Python
2	SILX-LABS/QUASAR-SUBNET	QUASAR is a long-context foundation model and decentralized evaluation...	47	Emerging	7	Python
3	stackblogger/bitnet.js	BitNet.Js - A node.js implementation of the microsoft bitnet.cpp inference framework.	41	Emerging	34	HTML
4	m96-chan/0xBitNet	Run BitNet b1.58 ternary LLMs with WebGPU — in browsers and native apps	40	Emerging	10	TypeScript
5	AnswerDotAI/cold-compress	Cold Compress is a hackable, lightweight, and open-source toolkit for...	39	Emerging	148	Python
6	FMInference/H2O	[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of...	38	Emerging	506	Python
7	grctest/Electron-BitNet	Running Microsoft's BitNet via Electron, React & Astro	35	Emerging	56	JavaScript
8	tomsanbear/bitnet-rs	Implementing the BitNet model in Rust	29	Experimental	46	Rust
9	dnotitia/smoothie-qwen	A lightweight adjustment tool for smoothing token probabilities in the Qwen...	28	Experimental	104	Python
10	GURPREETKAURJETHRA/Quantize-LLM-using-AWQ	Quantize LLM using AWQ	23	Experimental	2	Jupyter Notebook
11	kevin-pek/bitnet.c	Zero-dependency implementation of BitNet neural network training and BPE...	22	Experimental	5	C
12	Artessay/ArtQuantization	ArtQuantization is developed for quantizing Large Language Models, focusing...	22	Experimental	1	Python
13	bayjarvis/llm	Fine-tuning, DPO, RLHF, RLAIF on LLMs - Qwen3, Zephyr 7B GPTQ with 4-Bit...	16	Experimental	15	Python
14	puneetkakkar/Bitnet-1.58B	Bitnet 1.58b: This project implements the innovative 1-bit LLM architecture...	13	Experimental	9	Python
15	SRafi007/Quantization-for-LLMs-An-Intuitive-Introduction	A beginner-friendly note explaining why and how quantization is used in...	11	Experimental	—	Jupyter Notebook