chester256/Model-Compression-Papers
Papers for deep neural network compression and acceleration
This collection helps machine learning engineers and researchers optimize deep learning models for better performance on resource-constrained devices. It provides a curated list of research papers on techniques such as quantization, pruning, and knowledge distillation. Given an existing neural network, the papers offer guidance on how to make that model smaller, faster, and more efficient.
401 stars. No commits in the last 6 months.
Use this if you need to deploy large neural networks on edge devices, mobile phones, or embedded systems with limited computational power and memory.
Not ideal if you are looking for ready-to-use code implementations or a general overview of deep learning without a focus on model efficiency.
Stars: 401
Forks: 80
Language: —
License: —
Category: —
Last pushed: Jun 21, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/chester256/Model-Compression-Papers"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
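For programmatic access, the same data can be fetched from the endpoint shown in the curl example above. A minimal Python sketch, assuming the endpoint returns JSON (the response fields are not documented here, so none are hard-coded):

```python
# Sketch: query the pt-edge quality API for a GitHub repo.
# The endpoint comes from the curl example above; everything else
# (function names, the assumption that the response is JSON) is illustrative.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"


def quality_url(owner: str, repo: str) -> str:
    """Build the API URL for a given owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report (requires network access)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


print(quality_url("chester256", "Model-Compression-Papers"))
```

Within the free tier, no authentication header is needed; a key (once obtained) would simply raise the daily request limit.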
Higher-rated alternatives
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...