# Attention Mechanism Implementations for Transformer Models
This dataset tracks 24 attention mechanism implementation projects. Two score above 50 (Established tier). The highest-rated is microsoft/LoRA at 57/100 with 13,320 stars.
Fetch the project list as JSON (note that `limit=20` returns only the top 20 of the 24 tracked projects):

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=attention-mechanism-implementations&limit=20"
```

The API is open to everyone at 100 requests/day with no key; a free key raises the limit to 1,000 requests/day.
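The same query can be made from Python with the standard library. This is a minimal sketch: only the endpoint URL and the `domain`, `subcategory`, and `limit` query parameters come from the docs above; the shape of the JSON payload is not documented here, so `fetch_projects` simply returns the decoded body, and the function names are illustrative.

```python
# Minimal client sketch for the quality-dataset endpoint.
# Assumptions: only the base URL and query parameter names are taken from
# the docs above; the response schema is not documented, so the decoded
# JSON is returned as-is.
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Build the query URL with percent-encoded parameters."""
    params = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{params}"


def fetch_projects(domain: str, subcategory: str, limit: int = 20):
    """Fetch and decode the JSON payload (keyless access: 100 requests/day)."""
    with urllib.request.urlopen(build_url(domain, subcategory, limit)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(build_url("transformers", "attention-mechanism-implementations"))
```

Using `urllib.parse.urlencode` keeps the URL valid even if a subcategory slug contains characters that need escaping.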
| # | Model | Description | Tier |
|---|-------|-------------|------|
| 1 | microsoft/LoRA | Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large... | Established |
| 2 | jadore801120/attention-is-all-you-need-pytorch | A PyTorch implementation of the Transformer model in "Attention is All You Need". | Established |
| 3 | bhavnicksm/vanilla-transformer-jax | JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al.... | Emerging |
| 4 | kyegomez/SparseAttention | PyTorch implementation of the sparse attention from the paper: "Generating... | Emerging |
| 5 | AbdelStark/attnres | Rust implementation of Attention Residuals from MoonshotAI/Kimi | Emerging |
| 6 | sunnynguyen-ai/llm-attention-visualizer | Interactive tool for analyzing attention patterns in transformer models with... | Emerging |
| 7 | kyegomez/AoA-torch | Implementation of Attention on Attention in Zeta | Emerging |
| 8 | takara-ai/SwarmFormer | A PyTorch implementation of SwarmFormer for text classification. | Emerging |
| 9 | takara-ai/go-attention | A full attention mechanism and transformer in pure Go. | Emerging |
| 10 | MurrellGroup/InvariantPointAttention.jl | Julia implementation of AlphaFold 2's Invariant Point Attention | Emerging |
| 11 | SingleZombie/LLSA | Official implementation of Log-linear Sparse Attention (LLSA). | Emerging |
| 12 | tranquoctrinh/transformer | This is a PyTorch implementation of the Transformer model in the paper... | Emerging |
| 13 | HKUNLP/efficient-attention | [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control... | Emerging |
| 14 | mrcabbage972/simple-toolformer | A Python implementation of Toolformer using Huggingface Transformers | Emerging |
| 15 | Awni00/abstract_transformer | This is the project repo associated with the paper "Disentangling and... | Emerging |
| 16 | tobifinn/ensemble_transformer | Official PyTorch implementation of "Self-Attentive Ensemble Transformer:... | Experimental |
| 17 | ghosthamlet/transformers-rs | Rust implementation of paper: Attention Is All You... | Experimental |
| 18 | Nemesis-12/multihead-latent-attention | Implementation of Multi-head Latent Attention (MLA) from DeepSeek-V2 | Experimental |
| 19 | cnygaard/FractalHTransformer | Fractal Hierarchical Transformer: multi-resolution causal attention patterns... | Experimental |
| 20 | wiedersehne/Paramixer | Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product... | Experimental |
| 21 | adi-mish/miniformer | Miniformer is a lightweight PyTorch transformer library for researchers,... | Experimental |
| 22 | romizone/simulasiLLM | 🧠 Interactive LLM Attention Simulation: Visualize how GPT-2 transformers... | Experimental |
| 23 | kesimeg/LORA-turkish-clip | Finetuning CLIP using LoRA for Turkish language | Experimental |
| 24 | MaxLSB/linformer | Linformer implementation and comparison with vanilla transformers | Experimental |