openvinotoolkit/nncf
Neural Network Compression Framework for enhanced OpenVINO™ inference
This tool helps machine learning engineers and AI solution developers make neural network models run faster and use less memory with minimal accuracy loss. You provide an existing PyTorch, ONNX, or OpenVINO model and a small sample of your data, and it produces an optimized, compressed version of the model ready for deployment. It is ideal for anyone who needs to deploy AI models on devices with limited resources or improve the performance of existing AI applications.
1,136 stars. Actively maintained with 35 commits in the last 30 days. Available on PyPI.
Use this if you need to accelerate the inference speed of your neural network models or reduce their memory footprint for deployment on edge devices or in resource-constrained environments.
Not ideal if you are looking for a tool to train new models from scratch or if you require extreme precision where even a minimal accuracy drop is unacceptable.
Stars: 1,136
Forks: 290
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 35
Dependencies: 13
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/openvinotoolkit/nncf"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
Related repositories
huggingface/optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers...
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
huggingface/optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
eole-nlp/eole
Open language modeling toolkit based on PyTorch
huggingface/optimum-habana
Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)