snap-research/F8Net
[ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
This project helps machine learning engineers and researchers deploy neural networks more efficiently by making them smaller and faster. It takes a trained neural network and replaces its floating-point multiplications with fixed-point 8-bit integer multiplications. The output is an optimized network that preserves accuracy while requiring less compute and memory, making it well suited to devices with limited resources.
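The core idea of fixed-point arithmetic can be sketched in a few lines of plain Python (illustrative only; the function names below are not from the F8Net codebase, and F8Net itself additionally learns per-layer fractional lengths): a float is mapped to an 8-bit integer with a chosen number of fractional bits, and two such values multiply as ordinary integers, with the product's fractional length being the sum of the inputs' fractional lengths.

```python
def quantize(x, frac_bits, n_bits=8):
    """Map a float to a signed fixed-point integer with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    q = round(x * scale)
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    return max(lo, min(hi, q))  # clip to the 8-bit signed range [-128, 127]

def dequantize(q, frac_bits):
    """Recover the real value represented by the fixed-point integer."""
    return q / (1 << frac_bits)

# Two values, each quantized with 5 fractional bits
a, b = 0.75, -1.25
qa, qb = quantize(a, 5), quantize(b, 5)   # 24, -40

# Their product is a pure integer multiply; the result has 5 + 5 = 10 fractional bits
qprod = qa * qb
print(dequantize(qprod, 10))  # -0.9375, exactly 0.75 * -1.25 in this case
```

On hardware this means the expensive operation is a small integer multiply, with scaling reduced to bit shifts, which is what makes the approach attractive for edge deployment.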
No commits in the last 6 months.
Use this if you need to deploy machine learning models on edge devices or in environments where computational resources and energy are scarce.
Not ideal if your primary concern is achieving the absolute highest model accuracy and you have ample computational resources for inference.
Stars
93
Forks
15
Language
Python
License
—
Category
Last pushed
May 05, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/snap-research/F8Net"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...