joennlae/halutmatmul
Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator
This project speeds up deep neural network inference by replacing matrix multiplications with hashed lookups into precomputed tables, making them more energy-efficient and faster. It takes standard neural network models as input and produces an optimized version that runs on the Stella Nera accelerator hardware, achieving high accuracy at significantly lower power. It is aimed at AI hardware designers and researchers deploying efficient machine learning models.
216 stars. No commits in the last 6 months.
Use this if you are designing custom hardware accelerators for AI and need to dramatically improve the energy and area efficiency of deep learning inference.
Not ideal if you are looking for a software-only solution for general-purpose CPUs or GPUs without specialized hardware integration.
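To make the core idea concrete, here is a minimal NumPy sketch of lookup-table matrix multiplication in the spirit of this project. It is an illustrative assumption, not the repository's actual implementation: halutmatmul learns its encodings with a Maddness-style hashing tree, whereas this sketch simply picks prototypes by nearest-neighbor search.

```python
import numpy as np

# Hypothetical sketch of lookup-table matmul: instead of computing A @ B
# directly, quantize each row of A to per-subspace prototypes, then
# replace multiplies with table lookups and additions.
rng = np.random.default_rng(0)
N, D, M = 8, 16, 4      # A is N x D, B is D x M
C, K = 4, 16            # C subspaces of width D // C, K prototypes each

A = rng.standard_normal((N, D))
B = rng.standard_normal((D, M))
width = D // C

# "Training": pick K prototypes per subspace (here sampled from A itself;
# the real project learns them with a hashing decision tree).
protos = np.stack(
    [A[rng.integers(0, N, K), c * width:(c + 1) * width] for c in range(C)]
)  # shape (C, K, width)

# Encode: nearest prototype index per subspace for every row of A.
codes = np.empty((N, C), dtype=np.int64)
for c in range(C):
    sub = A[:, c * width:(c + 1) * width]
    dist = ((sub[:, None, :] - protos[c][None, :, :]) ** 2).sum(-1)
    codes[:, c] = dist.argmin(1)

# Precompute lookup tables: prototype-subspace dot products with B.
luts = np.stack(
    [protos[c] @ B[c * width:(c + 1) * width] for c in range(C)]
)  # shape (C, K, M)

# Inference: the matmul becomes C table lookups plus adds per output row.
approx = sum(luts[c][codes[:, c]] for c in range(C))
exact = A @ B
print(np.abs(approx - exact).mean())
```

Once the tables are built, inference needs no multiplications at all, which is what makes a hardware realization of the lookup-and-accumulate datapath so area- and energy-efficient.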
Stars: 216
Forks: 14
Language: Python
License: MIT
Category: ML Frameworks
Last pushed: Dec 10, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/joennlae/halutmatmul"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
apache/tvm
Open Machine Learning Compiler Framework
uxlfoundation/oneDNN
oneAPI Deep Neural Network Library (oneDNN)
Tencent/ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
OpenMined/TenSEAL
A library for doing homomorphic encryption operations on tensors
iree-org/iree-turbine
IREE's PyTorch Frontend, based on Torch Dynamo.