ciodar/deep-compression
PyTorch Lightning implementation of the paper "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding". This repository reproduces the paper's main findings on the MNIST and Imagenette datasets.
This project helps machine learning engineers and researchers reduce the size and computational demands of deep neural networks. By applying pruning, quantization, and Huffman encoding, it takes an existing trained model and outputs a significantly smaller, more efficient version while maintaining accuracy. It is designed for those working with image classification models such as LeNet or AlexNet who need to deploy them in resource-constrained environments.
No commits in the last 6 months.
Use this if you need to deploy large deep learning models on devices with limited memory or processing power, or if you want to speed up inference times significantly without a major loss in accuracy.
Not ideal if your primary goal is to train new, unoptimized models from scratch, or if you require absolute state-of-the-art accuracy at any computational cost.
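The pipeline described above (prune small weights, then share the survivors through a small codebook) can be illustrated on a single weight matrix. This is a minimal NumPy sketch for intuition only; the function names and the plain k-means loop are my own, not the repository's PyTorch Lightning API:

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` is reached."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w).ravel())[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def kmeans_quantize(w: np.ndarray, n_clusters: int, iters: int = 20):
    """Cluster the surviving (nonzero) weights into a shared codebook."""
    nonzero = w[w != 0]
    # Linear initialization between min and max, as in the Deep Compression paper.
    centroids = np.linspace(nonzero.min(), nonzero.max(), n_clusters)
    for _ in range(iters):
        assign = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            members = nonzero[assign == c]
            if members.size:
                centroids[c] = members.mean()
    # Rebuild the matrix: every surviving weight snaps to its nearest centroid.
    assign = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
    quantized = w.copy()
    quantized[w != 0] = centroids[assign]
    return quantized, centroids

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
pruned = magnitude_prune(w, sparsity=0.75)
quantized, codebook = kmeans_quantize(pruned, n_clusters=4)
print(f"nonzero weights after pruning: {np.count_nonzero(pruned)}/{w.size}")
print(f"distinct nonzero values after quantization: {np.unique(quantized[quantized != 0]).size}")
```

After this step the matrix stores only cluster indices plus a tiny codebook, which is what makes the subsequent Huffman coding stage effective.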
Stars: 35
Forks: 3
Language: Jupyter Notebook
License: MIT
Category: ml-frameworks
Last pushed: Dec 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ciodar/deep-compression"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-mmlab/mmengine: OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas: neural network quantization in PyTorch
fastmachinelearning/qonnx: arbitrary-precision quantized neural networks in ONNX
google/qkeras: a quantization deep learning library for TensorFlow Keras
tensorflow/model-optimization: a toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...