openvinotoolkit/nncf
Neural Network Compression Framework for enhanced OpenVINO™ inference
This tool helps machine learning engineers and AI solution developers make neural network models run faster and use less memory with minimal accuracy loss. You provide an existing PyTorch, ONNX, or OpenVINO model and a small sample of your data, and it produces an optimized, compressed version of the model ready for deployment. It is ideal for anyone who needs to deploy AI models on devices with limited resources or improve the performance of existing AI applications.
1,136 stars. Actively maintained with 35 commits in the last 30 days. Available on PyPI.
Use this if you need to accelerate the inference speed of your neural network models or reduce their memory footprint for deployment on edge devices or in resource-constrained environments.
Not ideal if you are looking for a tool to train new models from scratch or if you require extreme precision where even a minimal accuracy drop is unacceptable.
Stars: 1,136
Forks: 290
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 35
Dependencies: 13
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/openvinotoolkit/nncf"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
Related repositories
huggingface/optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers...
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
huggingface/optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
eole-nlp/eole
Open language modeling toolkit based on PyTorch
huggingface/optimum-habana
Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)