CodingPlatelets/transformer_MM
Accelerator for LLM Based on Chisel3
This project provides hardware designs, written in Chisel3, for accelerating large language model (LLM) computations. It implements the core matrix operations and attention mechanisms of transformers directly in hardware, enabling faster training and inference than general-purpose processors. It is aimed at hardware engineers and researchers building custom AI accelerator chips for LLMs.
Use this if you are designing custom hardware (like an ASIC or FPGA) for large language models and need highly optimized arithmetic units and memory controllers.
Not ideal if you are a software developer looking for a Python library or an end-user running LLMs on standard GPUs or CPUs.
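To give a flavor of the kind of arithmetic unit such a project composes into larger matrix-multiply engines, here is a minimal Chisel3 sketch of a multiply-accumulate (MAC) cell. This is an illustrative assumption, not code from the repository; the module name, port names, and widths are all hypothetical.

```scala
// Illustrative sketch only — a generic Chisel3 MAC cell of the sort
// used as the building block of systolic matrix-multiply arrays.
// Not taken from CodingPlatelets/transformer_MM; all names are assumed.
import chisel3._

class MacCell(width: Int) extends Module {
  val io = IO(new Bundle {
    val a   = Input(SInt(width.W))        // one matrix operand
    val b   = Input(SInt(width.W))        // the other matrix operand
    val acc = Output(SInt((2 * width).W)) // running dot-product sum
  })
  // Accumulate a * b on every clock edge; double-width register
  // avoids overflow of the product.
  val accReg = RegInit(0.S((2 * width).W))
  accReg := accReg + (io.a * io.b)
  io.acc := accReg
}
```

A grid of such cells, with operands streamed along rows and columns, yields the systolic-array style of matrix multiplication that LLM accelerators typically rely on.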
Stars: 12
Forks: 1
Language: Scala
License: LGPL-3.0
Category:
Last pushed: Dec 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/CodingPlatelets/transformer_MM"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
Higher-rated alternatives
quic/efficient-transformers
This library empowers users to seamlessly port pretrained models and checkpoints on the...
ManuelSLemos/RabbitLLM
Run 70B+ LLMs on a single 4GB GPU — no quantization required.
alpa-projects/alpa
Training and serving large-scale neural networks with auto parallelization.
arm-education/Advanced-AI-Hardware-Software-Co-Design
Hands-on course materials for ML engineers to master extreme model quantization and on-device...
IST-DASLab/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes...