AbdelStark/attnres
Rust implementation of Attention Residuals from MoonshotAI/Kimi
This project helps machine learning researchers and Rust engineers experiment with a new type of neural network layer called Attention Residuals. It provides the building blocks for creating Transformer models that can learn to adjust how they combine information from different depths of the network. You input model configurations and data, and it outputs the processed data or model weights for analysis.
Use this if you are a researcher validating a paper's findings, or a Rust engineer building new Transformer models and want to explore the Attention Residuals concept.
Not ideal if you need a production-ready solution, require PyTorch ecosystem compatibility, or need validated performance on GPU deployments.
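To make the "combine information from different depths" idea concrete, here is a minimal Rust sketch of one plausible attention-residual scheme: each layer holds learned mixing weights over the outputs of all earlier depths and blends them into its residual input. The `AttnResidual` type, its fields, and the initialization are illustrative assumptions, not this crate's actual API.

```rust
/// Hypothetical sketch: learned per-depth mixing weights for one
/// layer's residual input. Not the crate's real interface.
struct AttnResidual {
    /// One weight per earlier depth (the last entry is the
    /// immediately preceding activation).
    mix: Vec<f32>,
}

impl AttnResidual {
    fn new(depth: usize) -> Self {
        // Initialize to a plain residual connection: full weight on
        // the most recent activation, zero on earlier depths. Training
        // would then adjust these weights.
        let mut mix = vec![0.0; depth];
        if depth > 0 {
            mix[depth - 1] = 1.0;
        }
        AttnResidual { mix }
    }

    /// Blend activations from all earlier depths into one residual
    /// vector. `history[d]` holds the output of layer `d`.
    fn combine(&self, history: &[Vec<f32>]) -> Vec<f32> {
        let dim = history[0].len();
        let mut out = vec![0.0; dim];
        for (w, h) in self.mix.iter().zip(history) {
            for (o, x) in out.iter_mut().zip(h) {
                *o += w * x;
            }
        }
        out
    }
}

fn main() {
    // Two earlier depths, hidden size 3.
    let history = vec![vec![1.0, 2.0, 3.0], vec![0.5, 0.5, 0.5]];
    let layer = AttnResidual::new(2);
    let residual = layer.combine(&history);
    // With the default init this reduces to the ordinary residual,
    // i.e. the most recent activation passes through unchanged.
    assert_eq!(residual, vec![0.5, 0.5, 0.5]);
    println!("{:?}", residual);
}
```

The point of the design is that a plain Transformer is the special case where each layer's weight vector is one-hot on the previous depth; the learned weights let the network depart from that default.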
Stars
47
Forks
5
Language
Rust
License
MIT
Category
Transformers
Last pushed
Mar 18, 2026
Monthly downloads
9
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/AbdelStark/attnres"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
jadore801120/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
bhavnicksm/vanilla-transformer-jax
JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al....
kyegomez/SparseAttention
Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with...
sunnynguyen-ai/llm-attention-visualizer
Interactive tool for analyzing attention patterns in transformer models with layer-wise...