LLM Quantization Techniques (LLM Tools)

Tools and libraries for compressing LLM weights through quantization methods (int8, int4, binary, ternary), including inference frameworks and optimization techniques. Does NOT include general model compression, pruning, distillation, or non-quantization-based optimization approaches.
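The int8 method mentioned above can be illustrated with a minimal sketch: symmetric per-tensor quantization, where weights are scaled into the signed 8-bit range and rounded. The 127 clamp and round-to-nearest choice are common conventions for illustration, not the method of any specific tool listed here:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127]."""
    scale = np.max(np.abs(w)) / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 codes."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.0, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)   # close to w; error bounded by half a quantization step
```

Int4, binary, and ternary schemes follow the same scale-round-clip pattern with smaller code ranges, which is why the quantization error grows as the bit width shrinks.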

There are 15 LLM quantization tools tracked; one scores above 50 (the established tier). The highest-rated is huawei-csl/SINQ at 60/100, with 602 stars.

Get all 15 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-quantization-techniques&limit=20"
```

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
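The same request URL can be assembled programmatically. A minimal sketch that reconstructs the query string shown above; the response schema is not documented here, so the sketch stops at building the URL rather than parsing a guessed payload:

```python
from urllib.parse import urlencode

# Endpoint and parameters taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

params = {
    "domain": "llm-tools",
    "subcategory": "llm-quantization-techniques",
    "limit": 20,
}

url = f"{BASE}?{urlencode(params)}"
# Issue a GET against `url` with any HTTP client to retrieve the JSON list.
```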

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | huawei-csl/SINQ | Welcome to the official repository of SINQ! A novel, fast and high-quality... | 60 | Established |
| 2 | SILX-LABS/QUASAR-SUBNET | QUASAR is a long-context foundation model and decentralized evaluation... | 47 | Emerging |
| 3 | stackblogger/bitnet.js | BitNet.Js - A node.js implementation of the microsoft bitnet.cpp inference framework. | 41 | Emerging |
| 4 | m96-chan/0xBitNet | Run BitNet b1.58 ternary LLMs with WebGPU — in browsers and native apps | 40 | Emerging |
| 5 | AnswerDotAI/cold-compress | Cold Compress is a hackable, lightweight, and open-source toolkit for... | 39 | Emerging |
| 6 | FMInference/H2O | [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of... | 38 | Emerging |
| 7 | grctest/Electron-BitNet | Running Microsoft's BitNet via Electron, React & Astro | 35 | Emerging |
| 8 | tomsanbear/bitnet-rs | Implementing the BitNet model in Rust | 29 | Experimental |
| 9 | dnotitia/smoothie-qwen | A lightweight adjustment tool for smoothing token probabilities in the Qwen... | 28 | Experimental |
| 10 | GURPREETKAURJETHRA/Quantize-LLM-using-AWQ | Quantize LLM using AWQ | 23 | Experimental |
| 11 | kevin-pek/bitnet.c | Zero-dependency implementation of BitNet neural network training and BPE... | 22 | Experimental |
| 12 | Artessay/ArtQuantization | ArtQuantization is developed for quantizing Large Language Models, focusing... | 22 | Experimental |
| 13 | bayjarvis/llm | Fine-tuning, DPO, RLHF, RLAIF on LLMs - Qwen3, Zephyr 7B GPTQ with 4-Bit... | 16 | Experimental |
| 14 | puneetkakkar/Bitnet-1.58B | Bitnet 1.58b: This project implements the innovative 1-bit LLM architecture... | 13 | Experimental |
| 15 | SRafi007/Quantization-for-LLMs-An-Intuitive-Introduction | A beginner-friendly note explaining why and how quantization is used in... | 11 | Experimental |
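Several of the entries above target BitNet b1.58, whose ternary weights are typically produced by absmean quantization: scale each weight tensor by its mean absolute value, round, and clip to {-1, 0, 1}. A minimal sketch of that scheme; the epsilon and rounding details are illustrative assumptions, not taken from any listed repository:

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Absmean ternary quantization (as described for BitNet b1.58):
    divide by the mean |w|, round to nearest integer, clip to {-1, 0, 1}."""
    gamma = np.mean(np.abs(w)) + eps           # per-tensor absmean scale
    q = np.clip(np.round(w / gamma), -1, 1).astype(np.int8)
    return q, gamma

w = np.array([0.9, -0.1, 0.05, -1.2], dtype=np.float32)
q, gamma = ternary_quantize(w)
# Small-magnitude weights collapse to 0, which is what gives b1.58
# its ~1.58 bits (log2 of three states) per weight.
```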