gordicaleksa/pytorch-original-transformer
My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. IWSLT pretrained models are currently included.
This project offers a foundational implementation of the original Transformer model for anyone eager to learn its inner workings. It translates between English and German sentences and is aimed at machine learning students, researchers, and practitioners who want to understand the core concepts behind modern language models.
1,085 stars. No commits in the last 6 months.
Use this if you are studying neural machine translation and want to explore the Transformer architecture with practical examples and visualizations.
Not ideal if you need a production-ready, state-of-the-art machine translation system or a library for high-performance NLP applications.
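For readers studying the architecture, the core operation the repository implements is scaled dot-product attention. The following is a minimal NumPy sketch of that equation from the original paper, not code taken from this repo:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al.)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # (seq_q, seq_k) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over the key positions
    return weights @ V, weights

# Toy example: 3 query positions attending over 4 key/value positions, d_k = 8
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
# out has shape (3, 8); each row of w is a probability distribution over the keys
```

The repository's playground.py visualizes exactly these attention weight matrices, so this sketch maps directly onto what you will see there.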
Stars
1,085
Forks
188
Language
Jupyter Notebook
License
MIT
Category
Last pushed
Dec 27, 2020
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/gordicaleksa/pytorch-original-transformer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
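The same endpoint can be called from Python with only the standard library. Only the URL below comes from the page; the JSON field names in the response are whatever the API returns and are not documented here, so the fetch helper is a sketch:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload (response schema depends on the API)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

url = quality_url("gordicaleksa", "pytorch-original-transformer")
# data = fetch_quality("gordicaleksa", "pytorch-original-transformer")  # network call
```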
Related models
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features...
kanishkamisra/minicons
Utility for behavioral and representational analyses of Language Models
lucidrains/simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
lucidrains/dreamer4
Implementation of Danijar's latest iteration for his Dreamer line of work
Nicolepcx/Transformers-in-Action
This is the corresponding code for the book Transformers in Action