actypedef/ARCQuant

Code for the paper "ARCQuant: Boosting NVFP4 Quantization with Augmented Residual Channels for LLMs"

Quality score: 33 / 100 (Emerging)

This tool helps AI developers and researchers improve the accuracy of large language models (LLMs) when using highly efficient, low-precision number formats like NVFP4. It takes your existing LLM and configuration, and outputs a quantized LLM that maintains high accuracy while enabling faster and more memory-efficient inference. It's designed for machine learning engineers and researchers working on deploying LLMs in resource-constrained environments.

Use this if you need to run large language models more efficiently on hardware, but are struggling to maintain model accuracy when using low-precision quantization methods like NVFP4.

Not ideal if you are not working with large language models, or if you do not require specialized low-bit quantization for performance optimization.
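To make "low-precision formats like NVFP4" concrete: NVFP4-style quantization snaps weights to a 4-bit floating-point (E2M1) grid, with a shared scale per small block of elements. The sketch below is purely illustrative of that idea; the block size, scale handling, and rounding are simplified assumptions and are not ARCQuant's actual algorithm.

```python
# Illustrative block-scaled FP4 (E2M1) quantization sketch.
# E2M1 can represent the magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6 (signed).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
GRID = sorted({s * v for v in E2M1_GRID for s in (1.0, -1.0)})

def quantize_block(block, grid_max=6.0):
    """Quantize one block of floats to the E2M1 grid with a shared scale."""
    amax = max(abs(x) for x in block) or 1.0
    scale = amax / grid_max  # map the block's largest value onto the grid's max
    q = [min(GRID, key=lambda g: abs(x / scale - g)) for x in block]
    return [v * scale for v in q], scale  # dequantized values + the scale

deq, scale = quantize_block([0.1, -0.7, 0.25, 0.9])
```

The accuracy loss such methods fight comes from the coarse grid: here `0.1` lands on the nearest representable value rather than being kept exactly, and the error grows with outliers in the block.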

Tags: LLM deployment, model optimization, AI inference, quantization, machine learning engineering
License: none · Package: none · Dependents: none
Maintenance 10 / 25
Adoption 6 / 25
Maturity 5 / 25
Community 12 / 25


Stars: 18
Forks: 3
Language: Cuda
License: none
Last pushed: Mar 03, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/actypedef/ARCQuant"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
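The same endpoint can be called from Python. The URL pattern below is taken from the curl example above; the JSON field names in the response are not documented here, so `fetch_quality` simply returns the decoded payload as a dict.

```python
# Minimal sketch of calling the quality API from Python.
# Endpoint pattern comes from the curl example; response schema is unverified.
import json
import urllib.request
from urllib.parse import quote

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(registry: str, owner: str, repo: str) -> str:
    """Build the per-repo quality endpoint URL."""
    return f"{BASE}/{quote(registry)}/{quote(owner)}/{quote(repo)}"

def fetch_quality(registry: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for one repo."""
    with urllib.request.urlopen(quality_url(registry, owner, repo)) as resp:
        return json.load(resp)

url = quality_url("transformers", "actypedef", "ARCQuant")
print(url)
```

Swap in your own `owner`/`repo` to query other projects; `quote` keeps unusual repo names URL-safe.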