chester256/Model-Compression-Papers
Papers for deep neural network compression and acceleration
This collection helps machine learning engineers and researchers optimize deep learning models for better performance on resource-constrained devices. It provides a curated list of research papers on techniques such as quantization, pruning, and knowledge distillation. Given an existing neural network, the papers offer guidance on how to make that model smaller, faster, and more efficient.
401 stars. No commits in the last 6 months.
Use this if you need to deploy large neural networks on edge devices, mobile phones, or embedded systems with limited computational power and memory.
Not ideal if you are looking for ready-to-use code implementations or a general overview of deep learning without a focus on model efficiency.
Stars: 401
Forks: 80
Language: —
License: —
Category: —
Last pushed: Jun 21, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/chester256/Model-Compression-Papers"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
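For programmatic access, the same data can be fetched from the endpoint shown in the curl example above. A minimal Python sketch, assuming the endpoint returns JSON (the response fields are not documented here, so none are hard-coded):

```python
# Sketch: query the pt-edge quality API for a GitHub repo.
# The endpoint comes from the curl example above; everything else
# (function names, the assumption that the response is JSON) is illustrative.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"


def quality_url(owner: str, repo: str) -> str:
    """Build the API URL for a given owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report (requires network access)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


print(quality_url("chester256", "Model-Compression-Papers"))
```

Within the free tier, no authentication header is needed; a key (once obtained) would simply raise the daily request limit.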
Higher-rated alternatives
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...