fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
QONNX lets machine learning engineers and researchers take trained neural networks from quantization-aware training frameworks such as Brevitas or QKeras and represent them in a standardized, compact ONNX-based format that supports custom integer and minifloat quantization at arbitrary precision. The output is a model ready for deployment on hardware such as FPGAs, enabling faster inference and reduced resource usage.
179 stars. Used by 1 other package. Available on PyPI.
Use this if you need to deploy deep learning models on resource-constrained hardware and require precise control over numeric representation and computational efficiency.
Not ideal if you work exclusively with full-precision neural networks or do not need specialized hardware acceleration.
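QONNX's central idea is representing quantization explicitly as operators in the graph (its Quant node) rather than baking it into the weights. As a rough illustration of the uniform fake-quantization such a node encodes, here is a minimal NumPy sketch; the exact QONNX Quant semantics (rounding mode, narrow-range option, and so on) are defined in the project's documentation, and the function name below is illustrative, not part of the qonnx API.

```python
import numpy as np

def fake_quantize(x: float, scale: float, zero_point: int,
                  bitwidth: int, signed: bool = True) -> float:
    """Illustrative uniform fake-quantization: map x onto an integer
    grid of the given bitwidth, clip to the representable range, then
    dequantize back to a float."""
    if signed:
        qmin, qmax = -(2 ** (bitwidth - 1)), 2 ** (bitwidth - 1) - 1
    else:
        qmin, qmax = 0, 2 ** bitwidth - 1
    q = np.round(x / scale + zero_point)      # quantize to integers
    q = np.clip(q, qmin, qmax)                # clip to bitwidth range
    return float((q - zero_point) * scale)    # dequantize

# Example: 0.26 snaps to the nearest 4-bit grid point at scale 0.25,
# while 10.0 saturates at the top of the signed 4-bit range (7 * 0.25).
print(fake_quantize(0.26, scale=0.25, zero_point=0, bitwidth=4))  # 0.25
print(fake_quantize(10.0, scale=0.25, zero_point=0, bitwidth=4))  # 1.75
```

Keeping quantization as an explicit graph operator is what lets downstream FPGA toolflows recover the integer arithmetic exactly instead of approximating it from float weights.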
Stars
179
Forks
57
Language
Python
License
Apache-2.0
Category
ML frameworks
Last pushed
Mar 10, 2026
Commits (30d)
0
Dependencies
11
Reverse dependents
1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/fastmachinelearning/qonnx"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
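The same request can be made from Python using only the standard library. This is a minimal sketch built around the endpoint shown in the curl example above; the shape of the returned JSON is whatever the API serves and is not assumed here.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality record (requires network access)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    data = fetch_quality("ml-frameworks", "fastmachinelearning", "qonnx")
    print(json.dumps(data, indent=2))
```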
Related frameworks
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch