fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
QONNX lets machine learning engineers and researchers take trained neural networks from quantization-aware training frameworks such as Brevitas or QKeras and represent them in a standardized, compact ONNX-based format that supports custom integer and minifloat quantization at arbitrary precision. The output is a model ready for deployment on hardware such as FPGAs, enabling faster inference and reduced resource usage.
179 stars. Used by 1 other package. Available on PyPI.
Use this if you need to deploy deep learning models on resource-constrained hardware and require precise control over numeric representation and computational efficiency.
Not ideal if you work exclusively with full-precision neural networks or do not need specialized hardware acceleration.
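QONNX's central idea is representing quantization explicitly as operators in the graph (its Quant node) rather than baking it into the weights. As a rough illustration of the uniform fake-quantization such a node encodes, here is a minimal NumPy sketch; the exact QONNX Quant semantics (rounding mode, narrow-range option, and so on) are defined in the project's documentation, and the function name below is illustrative, not part of the qonnx API.

```python
import numpy as np

def fake_quantize(x: float, scale: float, zero_point: int,
                  bitwidth: int, signed: bool = True) -> float:
    """Illustrative uniform fake-quantization: map x onto an integer
    grid of the given bitwidth, clip to the representable range, then
    dequantize back to a float."""
    if signed:
        qmin, qmax = -(2 ** (bitwidth - 1)), 2 ** (bitwidth - 1) - 1
    else:
        qmin, qmax = 0, 2 ** bitwidth - 1
    q = np.round(x / scale + zero_point)      # quantize to integers
    q = np.clip(q, qmin, qmax)                # clip to bitwidth range
    return float((q - zero_point) * scale)    # dequantize

# Example: 0.26 snaps to the nearest 4-bit grid point at scale 0.25,
# while 10.0 saturates at the top of the signed 4-bit range (7 * 0.25).
print(fake_quantize(0.26, scale=0.25, zero_point=0, bitwidth=4))  # 0.25
print(fake_quantize(10.0, scale=0.25, zero_point=0, bitwidth=4))  # 1.75
```

Keeping quantization as an explicit graph operator is what lets downstream FPGA toolflows recover the integer arithmetic exactly instead of approximating it from float weights.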
Stars
179
Forks
57
Language
Python
License
Apache-2.0
Category
ML frameworks
Last pushed
Mar 10, 2026
Commits (30d)
0
Dependencies
11
Reverse dependents
1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/fastmachinelearning/qonnx"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
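The same request can be made from Python using only the standard library. This is a minimal sketch built around the endpoint shown in the curl example above; the shape of the returned JSON is whatever the API serves and is not assumed here.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality record (requires network access)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    data = fetch_quality("ml-frameworks", "fastmachinelearning", "qonnx")
    print(json.dumps(data, indent=2))
```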
Related frameworks
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch