AbdelStark/attnres
Rust implementation of Attention Residuals from MoonshotAI/Kimi
This project helps machine learning researchers and Rust engineers experiment with a new type of neural network layer called Attention Residuals. It provides the building blocks for creating Transformer models that can learn to adjust how they combine information from different depths of the network. You input model configurations and data, and it outputs the processed data or model weights for analysis.
Use this if you are a researcher validating a paper's findings, or a Rust engineer building new Transformer models and want to explore the Attention Residuals concept.
Not ideal if you need a production-ready solution, require PyTorch ecosystem compatibility, or need validated performance on GPU deployments.
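To make the "combine information from different depths" idea concrete, here is a minimal Rust sketch of one plausible attention-residual scheme: each layer holds learned mixing weights over the outputs of all earlier depths and blends them into its residual input. The `AttnResidual` type, its fields, and the initialization are illustrative assumptions, not this crate's actual API.

```rust
/// Hypothetical sketch: learned per-depth mixing weights for one
/// layer's residual input. Not the crate's real interface.
struct AttnResidual {
    /// One weight per earlier depth (the last entry is the
    /// immediately preceding activation).
    mix: Vec<f32>,
}

impl AttnResidual {
    fn new(depth: usize) -> Self {
        // Initialize to a plain residual connection: full weight on
        // the most recent activation, zero on earlier depths. Training
        // would then adjust these weights.
        let mut mix = vec![0.0; depth];
        if depth > 0 {
            mix[depth - 1] = 1.0;
        }
        AttnResidual { mix }
    }

    /// Blend activations from all earlier depths into one residual
    /// vector. `history[d]` holds the output of layer `d`.
    fn combine(&self, history: &[Vec<f32>]) -> Vec<f32> {
        let dim = history[0].len();
        let mut out = vec![0.0; dim];
        for (w, h) in self.mix.iter().zip(history) {
            for (o, x) in out.iter_mut().zip(h) {
                *o += w * x;
            }
        }
        out
    }
}

fn main() {
    // Two earlier depths, hidden size 3.
    let history = vec![vec![1.0, 2.0, 3.0], vec![0.5, 0.5, 0.5]];
    let layer = AttnResidual::new(2);
    let residual = layer.combine(&history);
    // With the default init this reduces to the ordinary residual,
    // i.e. the most recent activation passes through unchanged.
    assert_eq!(residual, vec![0.5, 0.5, 0.5]);
    println!("{:?}", residual);
}
```

The point of the design is that a plain Transformer is the special case where each layer's weight vector is one-hot on the previous depth; the learned weights let the network depart from that default.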
Stars
47
Forks
5
Language
Rust
License
MIT
Category
Transformers
Last pushed
Mar 18, 2026
Monthly downloads
9
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/AbdelStark/attnres"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
jadore801120/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
bhavnicksm/vanilla-transformer-jax
JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al....
kyegomez/SparseAttention
Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with...
sunnynguyen-ai/llm-attention-visualizer
Interactive tool for analyzing attention patterns in transformer models with layer-wise...