hkproj/quantization-notes
Notes on quantization in neural networks
These notes and sample code provide a practical guide to optimizing neural networks for efficiency. You'll learn how to take a standard neural network model and reduce its computational footprint so that it runs faster and consumes less memory. They are aimed at machine learning practitioners, researchers, and engineers who deploy models on resource-constrained hardware or want to speed up inference.
121 stars. No commits in the last 6 months.
Use this if you need to make your deep learning models run faster or use less memory, especially for deployment on edge devices or in high-throughput applications.
Not ideal if you are looking for a conceptual introduction to neural networks or deep learning architectures, as it focuses specifically on optimization techniques.
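The core idea behind the techniques these notes cover is quantization: mapping float weights onto a small integer range (typically 8 bits) so the model stores and computes less. As a rough illustration of the arithmetic involved, here is a minimal sketch of asymmetric min-max quantization to uint8 in NumPy; the function names and helper structure are illustrative, not taken from the repository's notebooks.

```python
import numpy as np

def quantize_uint8(x: np.ndarray):
    """Asymmetric min-max quantization of a float tensor to uint8.

    Maps [min(x), max(x)] linearly onto [0, 255]; returns the quantized
    tensor plus the (scale, zero_point) needed to dequantize.
    """
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0  # guard against a constant tensor (hi == lo)
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover an approximate float tensor from the quantized one."""
    return (q.astype(np.float32) - zero_point) * scale

if __name__ == "__main__":
    weights = np.random.randn(4, 4).astype(np.float32)
    q, scale, zp = quantize_uint8(weights)
    restored = dequantize(q, scale, zp)
    # Rounding of both the values and the zero point bounds the
    # per-element reconstruction error by roughly one scale step.
    print("max abs error:", np.abs(weights - restored).max())
```

Storing `q` instead of `weights` cuts memory 4x versus float32; the `(scale, zero_point)` pair is the only per-tensor overhead. Real frameworks (PyTorch, QKeras, Brevitas) apply the same affine mapping per tensor or per channel, often calibrated on representative data rather than raw min/max.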
Stars: 121
Forks: 24
Language: Jupyter Notebook
License: —
Category: —
Last pushed: Dec 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/hkproj/quantization-notes"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...