jianhayes/NESTQUANT

NestQuant: Post-Training Integer-Nesting Quantization for On-Device DNN [IEEE TMC 2025]

Overall score: 14 / 100 (Experimental)

This project helps machine learning engineers and researchers optimize their deep neural networks for deployment on resource-constrained mobile and edge devices. It takes an existing, trained neural network model (like a CNN or Vision Transformer) and applies a technique called integer-nesting quantization. The output is a more efficient, smaller model that maintains accuracy while requiring less memory and computational power for inference. This is ideal for those working on AI for embedded systems or mobile applications.
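As a rough illustration of the idea (not the NestQuant algorithm itself, whose details are not reproduced in this listing), the NumPy sketch below applies symmetric uniform post-training quantization and then derives a lower-bit model from the most significant bits of the stored int8 weights. The function names and the specific bit-nesting scheme are assumptions inferred from the paper title.

import numpy as np

def quantize_uniform(w, n_bits=8):
    # Symmetric uniform post-training quantization of a weight tensor.
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def nest_low_bit(q_hi, scale, low_bits=4):
    # Hypothetical nesting: reuse the top `low_bits` bits of the stored
    # int8 integers as a standalone low-precision model, widening the
    # quantization step to match. One stored tensor, two precisions.
    shift = 8 - low_bits
    q_lo = q_hi >> shift  # arithmetic shift preserves the sign
    return q_lo, scale * (2 ** shift)

w = np.random.randn(64, 64).astype(np.float32)
q8, s8 = quantize_uniform(w, n_bits=8)
q4, s4 = nest_low_bit(q8, s8, low_bits=4)
print("max int8 reconstruction error:", np.abs(w - q8 * s8).max())
print("max nested int4 reconstruction error:", np.abs(w - q4 * s4).max())

In this sketch the low-bit model costs no extra storage: a device that needs lower-precision inference simply reads the high bits of the same weight buffer.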

No commits in the last 6 months.

Use this if you need to reduce the size and computational demands of your deep learning models so they can run efficiently on devices with limited resources, such as smartphones, IoT devices, or specialized embedded hardware.

Not ideal if your models are already sufficiently small or if you are deploying them on powerful cloud-based GPUs where resource constraints are not a primary concern.

on-device AI · mobile AI · edge computing · model compression · deep learning optimization
No License · Stale 6m · No Package · No Dependents
Maintenance 2 / 25
Adoption 5 / 25
Maturity 7 / 25
Community 0 / 25


Stars: 9
Forks:
Language: Python
License: None
Last pushed: Oct 10, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/jianhayes/NESTQUANT"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
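For programmatic access, here is a minimal Python sketch using the requests library against the endpoint shown above; the response is assumed to be JSON, and no particular field names are guaranteed.

import requests

# Fetch the quality report for jianhayes/NESTQUANT from the public API.
url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/jianhayes/NESTQUANT"
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # surface HTTP errors (e.g. rate limiting) early
report = resp.json()     # assumed JSON body; inspect keys before relying on them
print(report)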