ciodar/deep-compression
PyTorch Lightning implementation of the paper "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding". This repository reproduces the paper's main findings on the MNIST and Imagenette datasets.
This project helps machine learning engineers and researchers reduce the size and computational demands of deep neural networks. By applying pruning, quantization, and Huffman encoding, it takes an existing trained model and outputs a significantly smaller, more efficient version while maintaining accuracy. It is designed for those working with image classification models such as LeNet or AlexNet who need to deploy them in resource-constrained environments.
No commits in the last 6 months.
Use this if you need to deploy large deep learning models on devices with limited memory or processing power, or if you want to speed up inference times significantly without a major loss in accuracy.
Not ideal if your primary goal is to train new, unoptimized models from scratch, or if you require absolute state-of-the-art accuracy at any computational cost.
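The pipeline described above (prune small weights, then share the survivors through a small codebook) can be illustrated on a single weight matrix. This is a minimal NumPy sketch for intuition only; the function names and the plain k-means loop are my own, not the repository's PyTorch Lightning API:

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` is reached."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w).ravel())[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def kmeans_quantize(w: np.ndarray, n_clusters: int, iters: int = 20):
    """Cluster the surviving (nonzero) weights into a shared codebook."""
    nonzero = w[w != 0]
    # Linear initialization between min and max, as in the Deep Compression paper.
    centroids = np.linspace(nonzero.min(), nonzero.max(), n_clusters)
    for _ in range(iters):
        assign = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            members = nonzero[assign == c]
            if members.size:
                centroids[c] = members.mean()
    # Rebuild the matrix: every surviving weight snaps to its nearest centroid.
    assign = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
    quantized = w.copy()
    quantized[w != 0] = centroids[assign]
    return quantized, centroids

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
pruned = magnitude_prune(w, sparsity=0.75)
quantized, codebook = kmeans_quantize(pruned, n_clusters=4)
print(f"nonzero weights after pruning: {np.count_nonzero(pruned)}/{w.size}")
print(f"distinct nonzero values after quantization: {np.unique(quantized[quantized != 0]).size}")
```

After this step the matrix stores only cluster indices plus a tiny codebook, which is what makes the subsequent Huffman coding stage effective.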
Stars: 35
Forks: 3
Language: Jupyter Notebook
License: MIT
Category: ml-frameworks
Last pushed: Dec 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ciodar/deep-compression"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-mmlab/mmengine: OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas: neural network quantization in PyTorch
fastmachinelearning/qonnx: arbitrary-precision quantized neural networks in ONNX
google/qkeras: a quantization deep learning library for TensorFlow Keras
tensorflow/model-optimization: a toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...