kyegomez/ShallowFF
Zeta implementation of "Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers"
This project helps machine learning engineers and researchers experiment with transformer architectures. It replaces the traditional attention mechanism inside a transformer's encoder-decoder block with a simpler, shallow feed-forward network. You provide numerical sequence data, and it outputs processed sequences from a transformer-like model, potentially with faster training or inference.
Use this if you are developing transformer models and want to experiment with alternative, potentially more lightweight, internal architectures for improved performance or efficiency.
Not ideal if you are a practitioner looking for an off-the-shelf solution for a specific NLP task or if you are not familiar with deep learning model architecture.
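The core idea above, swapping an attention sublayer for a shallow feed-forward network that maps the whole sequence at once, can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the repo's actual API; the class name, dimensions, and initialization are all hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


class ShallowFF:
    """Shallow feed-forward block standing in for a self-attention sublayer.

    Instead of computing pairwise attention scores, the whole
    (seq_len, d_model) input is flattened, passed through a single
    hidden layer, and reshaped back to the original sequence shape.
    Hypothetical sketch, not the repository's actual interface.
    """

    def __init__(self, seq_len, d_model, d_hidden, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = seq_len * d_model
        # One hidden layer ("shallow"): in_dim -> d_hidden -> in_dim
        self.w1 = rng.normal(0.0, 0.02, size=(in_dim, d_hidden))
        self.b1 = np.zeros(d_hidden)
        self.w2 = rng.normal(0.0, 0.02, size=(d_hidden, in_dim))
        self.b2 = np.zeros(in_dim)
        self.seq_len, self.d_model = seq_len, d_model

    def __call__(self, x):
        # x: (seq_len, d_model) -> (seq_len, d_model), drop-in for attention
        flat = x.reshape(-1)
        h = relu(flat @ self.w1 + self.b1)
        out = h @ self.w2 + self.b2
        return out.reshape(self.seq_len, self.d_model)


block = ShallowFF(seq_len=8, d_model=16, d_hidden=64)
x = np.ones((8, 16))
y = block(x)
print(y.shape)  # (8, 16) -- same shape contract as an attention sublayer
```

Note the trade-off this illustrates: unlike attention, the flattened feed-forward block is tied to a fixed maximum sequence length, which is one reason the paper positions it as an experimental alternative rather than a general replacement.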
Stars: 12
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Feb 07, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kyegomez/ShallowFF"
Open to everyone — 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
Higher-rated alternatives
philipperemy/keras-attention
Keras Attention Layer (Luong and Bahdanau scores).
tatp22/linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
datalogue/keras-attention
Visualizing RNNs using the attention mechanism
ematvey/hierarchical-attention-networks
Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is...
thushv89/attention_keras
Keras Layer implementation of Attention for Sequential models