snap-research/F8Net
[ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
This project helps machine learning engineers and researchers deploy neural networks more efficiently by making them smaller and faster. It takes a trained neural network and replaces its floating-point multiplications with fixed-point 8-bit integer multiplications. The output is an optimized network that preserves accuracy while requiring less compute and memory, making it well suited to devices with limited resources.
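The core idea of fixed-point arithmetic can be sketched in a few lines of plain Python (illustrative only; the function names below are not from the F8Net codebase, and F8Net itself additionally learns per-layer fractional lengths): a float is mapped to an 8-bit integer with a chosen number of fractional bits, and two such values multiply as ordinary integers, with the product's fractional length being the sum of the inputs' fractional lengths.

```python
def quantize(x, frac_bits, n_bits=8):
    """Map a float to a signed fixed-point integer with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    q = round(x * scale)
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    return max(lo, min(hi, q))  # clip to the 8-bit signed range [-128, 127]

def dequantize(q, frac_bits):
    """Recover the real value represented by the fixed-point integer."""
    return q / (1 << frac_bits)

# Two values, each quantized with 5 fractional bits
a, b = 0.75, -1.25
qa, qb = quantize(a, 5), quantize(b, 5)   # 24, -40

# Their product is a pure integer multiply; the result has 5 + 5 = 10 fractional bits
qprod = qa * qb
print(dequantize(qprod, 10))  # -0.9375, exactly 0.75 * -1.25 in this case
```

On hardware this means the expensive operation is a small integer multiply, with scaling reduced to bit shifts, which is what makes the approach attractive for edge deployment.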
No commits in the last 6 months.
Use this if you need to deploy machine learning models on edge devices or in environments where computational resources and energy are scarce.
Not ideal if your primary concern is achieving the absolute highest model accuracy and you have ample computational resources for inference.
Stars
93
Forks
15
Language
Python
License
—
Category
Last pushed
May 05, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/snap-research/F8Net"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...