elephantmipt/compressors

A small library with distillation, quantization and pruning pipelines

/ 100

Emerging

For machine learning practitioners, this library helps compress large deep learning models for deployment. Its distillation pipeline takes an existing, powerful model (the 'teacher') and transfers its knowledge to a smaller, faster model (the 'student'). The result is a compressed model that performs nearly as well as the original but needs fewer computational resources, which suits real-time applications and environments with limited processing power.
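The teacher-to-student transfer described above is commonly implemented as a soft-target loss. The following is a minimal sketch in plain Python, not the API of this library: the function names `log_softmax` and `distillation_loss` and the default `temperature` are assumptions chosen for illustration. It computes the KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature as in the standard knowledge-distillation formulation.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_norm for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, times T^2.

    Hypothetical helper for illustration; a real pipeline would usually
    combine this term with the ordinary supervised loss on true labels.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    print(distillation_loss(teacher, teacher))          # student matches teacher
    print(distillation_loss([3.0, 2.0, 1.0], teacher))  # student disagrees
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge; a higher temperature exposes more of the teacher's relative preferences among the non-top classes.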

No commits in the last 6 months.

Use this if you need to reduce the size and computational cost of your trained deep learning models in computer vision or natural language processing without significantly sacrificing performance.

Not ideal if you are working with non-deep-learning models, or if your primary goal is to train models from scratch rather than to optimize existing ones.

deep-learning-optimization model-deployment computer-vision natural-language-processing machine-learning-engineering

Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 10 / 25


Language: Python

License: MIT

Higher-rated alternatives

ModelCloud/GPTQModel

LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD...

intel/auto-round

🎯An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality...

pytorch/ao

PyTorch native quantization and sparsity for training and inference

bodaay/HuggingFaceModelDownloader

Simple go utility to download HuggingFace Models and Datasets

NVIDIA/kvpress

LLM KV cache compression made easy
