brandokoch/attention-is-all-you-need-paper
Original Transformer paper: implementation of Vaswani, Ashish, et al. "Attention Is All You Need." Advances in Neural Information Processing Systems. 2017.
This project provides a clear, runnable implementation of the original Transformer architecture. It lets machine learning researchers and students train the model on text data and then use it for translation, for example from English to German. This makes it well suited for learning about or experimenting with foundational natural language processing models.
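The core mechanism the paper introduces is scaled dot-product attention, softmax(QKᵀ/√d_k)·V. A minimal NumPy sketch of that formula follows; it is an illustration of the technique, not code taken from this repository:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V as in the paper."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (seq_q, seq_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the exponentials
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V, weights

# Toy example: 3 query positions attending over 4 key/value positions, d_k = 8.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8); each row of w sums to 1
```

The full Transformer stacks this into multi-head attention with learned projections, plus positional encodings and feed-forward layers, which is what the notebook in this repo walks through.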
243 stars. No commits in the last 6 months.
Use this if you are a machine learning researcher or student who wants to understand and experiment with the core Transformer architecture from the 'Attention Is All You Need' paper.
Not ideal if you need a production-ready, highly optimized machine translation system, or if you're not familiar with machine learning development concepts.
Stars
243
Forks
54
Language
Jupyter Notebook
License
MIT
Category
ml-frameworks
Last pushed
Apr 29, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/brandokoch/attention-is-all-you-need-paper"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
philipperemy/keras-attention
Keras Attention Layer (Luong and Bahdanau scores).
tatp22/linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
ematvey/hierarchical-attention-networks
Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is...
datalogue/keras-attention
Visualizing RNNs using the attention mechanism
thushv89/attention_keras
Keras Layer implementation of Attention for Sequential models