LLM Quantization Techniques (LLM Tools)
Tools and libraries for compressing LLM weights through quantization methods (int8, int4, binary, ternary), including inference frameworks and optimization techniques. Does NOT include general model compression, pruning, distillation, or non-quantization-based optimization approaches.
15 LLM quantization tools are tracked. One scores above 50 (the Established tier). The highest-rated is huawei-csl/SINQ at 60/100 with 602 stars.
Get all 15 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-quantization-techniques&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
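The same request can be made from a script. The following is a minimal Python sketch, assuming only what the curl command shows: the endpoint URL and the `domain`, `subcategory`, and `limit` query parameters. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the payload is returned undecoded beyond `json.load`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken verbatim from the curl command.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools",
              subcategory="llm-quantization-techniques",
              limit=20):
    """Build the request URL with the same query parameters as the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(timeout=30):
    """Fetch and parse the JSON payload (hypothetical helper).

    The response schema is an unknown here, so the parsed object is
    returned as is rather than being indexed by assumed field names.
    """
    with urlopen(build_url(), timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request):
#   data = fetch_projects()
#   print(json.dumps(data, indent=2)[:500])
```

Unauthenticated calls count against the 100 requests/day limit, so a script that runs repeatedly may want to cache the result locally.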
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | huawei-csl/SINQ | Welcome to the official repository of SINQ! A novel, fast and high-quality... | 60 | Established |
| 2 | SILX-LABS/QUASAR-SUBNET | QUASAR is a long-context foundation model and decentralized evaluation... | | Emerging |
| 3 | stackblogger/bitnet.js | BitNet.Js - A node.js implementation of the microsoft bitnet.cpp inference framework. | | Emerging |
| 4 | m96-chan/0xBitNet | Run BitNet b1.58 ternary LLMs with WebGPU, in browsers and native apps | | Emerging |
| 5 | AnswerDotAI/cold-compress | Cold Compress is a hackable, lightweight, and open-source toolkit for... | | Emerging |
| 6 | FMInference/H2O | [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of... | | Emerging |
| 7 | grctest/Electron-BitNet | Running Microsoft's BitNet via Electron, React & Astro | | Emerging |
| 8 | tomsanbear/bitnet-rs | Implementing the BitNet model in Rust | | Experimental |
| 9 | dnotitia/smoothie-qwen | A lightweight adjustment tool for smoothing token probabilities in the Qwen... | | Experimental |
| 10 | GURPREETKAURJETHRA/Quantize-LLM-using-AWQ | Quantize LLM using AWQ | | Experimental |
| 11 | kevin-pek/bitnet.c | Zero-dependency implementation of BitNet neural network training and BPE... | | Experimental |
| 12 | Artessay/ArtQuantization | ArtQuantization is developed for quantizing Large Language Models, focusing... | | Experimental |
| 13 | bayjarvis/llm | Fine-tuning, DPO, RLHF, RLAIF on LLMs - Qwen3, Zephyr 7B GPTQ with 4-Bit... | | Experimental |
| 14 | puneetkakkar/Bitnet-1.58B | Bitnet 1.58b: This project implements the innovative 1-bit LLM architecture... | | Experimental |
| 15 | SRafi007/Quantization-for-LLMs-An-Intuitive-Introduction | A beginner-friendly note explaining why and how quantization is used in... | | Experimental |