dropbox/hqq

Official implementation of Half-Quadratic Quantization (HQQ)

Quality score: 54 / 100 (Established)

This project helps machine learning practitioners reduce the memory footprint of large AI models, such as Large Language Models (LLMs) or computer vision models, and speed them up. It takes an existing large model and converts its internal numerical weights into a smaller, more efficient format, with no extra calibration data required. The output is a functionally similar but more compact and faster-running version of the original model, ready for deployment or further training.
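The core idea described above is data-free, group-wise weight quantization: each group of float weights is mapped to low-bit integers plus a scale and zero-point, with no calibration dataset. The sketch below shows the generic round-to-nearest version of that idea in plain Python. It is a conceptual illustration only: HQQ's actual half-quadratic solver optimizes the quantization parameters further, and none of these function names come from the hqq codebase.

```python
# Illustrative sketch only: generic group-wise affine quantize/dequantize,
# the kind of data-free weight compression HQQ performs. HQQ's real
# half-quadratic solver refines scale/zero-point beyond this baseline.

def quantize_group(weights, nbits=4):
    """Map a group of float weights to nbits-wide integers plus scale/zero-point."""
    qmax = 2 ** nbits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero = lo
    q = [round((w - zero) / scale) for w in weights]
    return q, scale, zero

def dequantize_group(q, scale, zero):
    """Recover approximate float weights from the quantized representation."""
    return [v * scale + zero for v in q]

weights = [0.12, -0.40, 0.33, 0.05, -0.21, 0.48, -0.07, 0.26]
q, scale, zero = quantize_group(weights, nbits=4)
approx = dequantize_group(q, scale, zero)
max_err = max(abs(a - b) for a, b in zip(weights, approx))

# Each 4-bit code fits in [0, 15]; round-to-nearest bounds the error by scale/2.
assert all(0 <= v <= 15 for v in q)
assert max_err <= scale / 2 + 1e-9
```

Storing 4-bit codes plus one scale/zero pair per group is what shrinks the model roughly 4x versus float16 weights; the trade-off is the bounded reconstruction error shown above.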


Use this if you need to make very large AI models run faster and consume significantly less memory on your hardware, without sacrificing too much accuracy or requiring a complex calibration process.

Not ideal if your primary goal is maximum model accuracy at any computational cost, or if you require very fine-grained control over the quantization process for highly specialized hardware.

AI model deployment · Large Language Models · computer vision models · model optimization · deep learning inference
No Package · No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25


Stars: 917
Forks: 89
Language: Python
License: Apache-2.0
Last pushed: Feb 26, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/dropbox/hqq"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.