cedrickchee/awesome-ml-model-compression
Awesome machine learning model compression research papers, quantization, tools, and learning material.
This resource helps machine learning engineers and researchers make large AI models smaller, faster, and more efficient for real-world deployment. It provides a curated list of research papers, articles, and tools focused on techniques such as quantization and pruning, giving practitioners concrete methods for reducing model size and improving inference speed.
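As a minimal illustration of the kind of technique this list catalogues (a sketch, not code from the repository), the snippet below shows symmetric int8 post-training quantization of a weight vector in plain Python: floats are mapped to a small integer range with a per-tensor scale, then approximately recovered on dequantization.

```python
# Illustrative sketch of symmetric int8 quantization, the core idea behind
# many post-training quantization methods covered by this list.
# Not taken from the repository; for exposition only.

def quantize(weights, num_bits=8):
    """Map floats to signed integers using a single per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [qi * scale for qi in q]

weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Rounding error is bounded by half the scale step.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, round(max_err, 4))
```

Real libraries (e.g. PyTorch's quantization tooling, Brevitas, QKeras from the alternatives below) add per-channel scales, calibration, and quantization-aware training on top of this basic idea.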
539 stars. No commits in the last 6 months.
Use this if you are a machine learning practitioner looking to optimize the size and speed of your deep learning models for deployment on resource-constrained devices or to reduce operational costs.
Not ideal if you are looking for ready-to-use, drag-and-drop software solutions without needing to understand the underlying research or techniques.
Stars
539
Forks
61
Language
—
License
MIT
Category
ml-frameworks
Last pushed
Sep 21, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/cedrickchee/awesome-ml-model-compression"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...