kyegomez/ShallowFF
Zeta implementation of "Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers"
This project helps machine learning engineers and researchers experiment with transformer architectures. It replaces the traditional attention mechanism inside a transformer's encoder-decoder block with a simpler, shallow feed-forward network. You provide numerical sequence data, and it outputs processed sequences from a transformer-like model, potentially with faster training or inference.
Use this if you are developing transformer models and want to experiment with alternative, potentially more lightweight, internal architectures for improved performance or efficiency.
Not ideal if you are a practitioner looking for an off-the-shelf solution for a specific NLP task or if you are not familiar with deep learning model architecture.
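The core idea above, swapping an attention sublayer for a shallow feed-forward network that maps the whole sequence at once, can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the repo's actual API; the class name, dimensions, and initialization are all hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


class ShallowFF:
    """Shallow feed-forward block standing in for a self-attention sublayer.

    Instead of computing pairwise attention scores, the whole
    (seq_len, d_model) input is flattened, passed through a single
    hidden layer, and reshaped back to the original sequence shape.
    Hypothetical sketch, not the repository's actual interface.
    """

    def __init__(self, seq_len, d_model, d_hidden, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = seq_len * d_model
        # One hidden layer ("shallow"): in_dim -> d_hidden -> in_dim
        self.w1 = rng.normal(0.0, 0.02, size=(in_dim, d_hidden))
        self.b1 = np.zeros(d_hidden)
        self.w2 = rng.normal(0.0, 0.02, size=(d_hidden, in_dim))
        self.b2 = np.zeros(in_dim)
        self.seq_len, self.d_model = seq_len, d_model

    def __call__(self, x):
        # x: (seq_len, d_model) -> (seq_len, d_model), drop-in for attention
        flat = x.reshape(-1)
        h = relu(flat @ self.w1 + self.b1)
        out = h @ self.w2 + self.b2
        return out.reshape(self.seq_len, self.d_model)


block = ShallowFF(seq_len=8, d_model=16, d_hidden=64)
x = np.ones((8, 16))
y = block(x)
print(y.shape)  # (8, 16) -- same shape contract as an attention sublayer
```

Note the trade-off this illustrates: unlike attention, the flattened feed-forward block is tied to a fixed maximum sequence length, which is one reason the paper positions it as an experimental alternative rather than a general replacement.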
Stars: 12
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Feb 07, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kyegomez/ShallowFF"
Open to everyone — 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
Higher-rated alternatives
philipperemy/keras-attention
Keras Attention Layer (Luong and Bahdanau scores).
tatp22/linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
datalogue/keras-attention
Visualizing RNNs using the attention mechanism
ematvey/hierarchical-attention-networks
Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is...
thushv89/attention_keras
Keras Layer implementation of Attention for Sequential models