model-optimization and neural-compressor

These are complementary tools that operate on different model formats: model-optimization targets TensorFlow/Keras models, while neural-compressor quantizes ONNX models. Users would choose one based on their model's framework rather than use the two together.

Metric          model-optimization          neural-compressor
Score           64 (Established)            47 (Emerging)
Maintenance     13/25                       10/25
Adoption        10/25                       9/25
Maturity        16/25                       16/25
Community       25/25                       12/25
Stars           1,565                       99
Forks           346                         9
Downloads       —                           —
Commits (30d)   1                           0
Language        Python                      Python
License         Apache-2.0                  Apache-2.0
Package         No package, no dependents   No package, no dependents

About model-optimization

tensorflow/model-optimization

A toolkit for optimizing Keras and TensorFlow ML models for deployment, including quantization and pruning.

This toolkit helps machine learning engineers and researchers make their trained Keras and TensorFlow models smaller and faster. It takes an existing, functional machine learning model and applies optimization techniques like quantization or pruning. The output is a more efficient model that performs similarly but requires less computational power and memory, ideal for deploying on devices with limited resources.
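To make the pruning technique concrete, here is a minimal pure-Python sketch of magnitude pruning, the idea behind the toolkit's pruning support. This illustrates the concept only; it is not the tensorflow_model_optimization API (the function name `prune_low_magnitude` is borrowed from the library's terminology, but the implementation here is a toy).

```python
# Illustrative sketch, not the tensorflow_model_optimization API: magnitude
# pruning zeroes out the weights with the smallest absolute values, yielding
# a sparse layer that compresses well and can skip work at inference time.

def prune_low_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest |value|."""
    n_prune = int(len(weights) * sparsity)
    # Rank weight indices by absolute magnitude; the smallest get pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
print(prune_low_magnitude(weights, 0.5))
# → [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

The real library applies this per-layer during training with a sparsity schedule, so the network can recover accuracy as weights are removed; the sketch above only shows the one-shot selection rule.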

Tags: ML model deployment · edge AI · model optimization · resource-constrained devices · embedded ML

About neural-compressor

onnx/neural-compressor

Model compression for ONNX

This tool helps AI practitioners make their large language models (LLMs) and other AI models run faster and use less memory with minimal accuracy loss. You provide an existing ONNX model, and it outputs a quantized version that is more efficient. It targets machine learning engineers and data scientists deploying models, especially on Intel hardware.
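As a rough picture of what quantization does to a tensor of weights, here is a pure-Python sketch of affine int8 quantization: floats are mapped to 8-bit integers via a scale and zero-point, cutting storage roughly 4x versus float32. This is a conceptual illustration under simplified assumptions (per-tensor, min/max calibration), not neural-compressor's API.

```python
# Illustrative sketch, not the neural-compressor API: affine post-training
# quantization maps floats onto int8 with a scale and zero-point so that
# dequantized values stay within one quantization step of the originals.

def quantize_int8(values):
    """Affine-quantize floats to int8; returns (ints, scale, zero_point)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0          # guard against constant input
    zero_point = round(-128 - lo / scale)      # int that represents 0.0's offset
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(v - zero_point) * scale for v in q]

w = [0.0, 0.5, -1.0, 2.0, 1.25]
q, s, z = quantize_int8(w)
restored = dequantize(q, s, z)
# Each restored weight is within one quantization step of the original.
assert all(abs(a - b) <= s for a, b in zip(w, restored))
```

Real tools add per-channel scales, calibration over sample data, and quantized operator kernels, but the scale/zero-point mapping above is the core transformation.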

Tags: AI model deployment · LLM optimization · machine learning engineering · model efficiency · deep learning inference

Scores updated daily from GitHub, PyPI, and npm data.