SirawitC/Transformer_from_scratch_pytorch
Build a Transformer model from scratch using PyTorch to understand its inner workings and gain hands-on experience with deep learning models in PyTorch.
This project provides a detailed guide for machine learning engineers and researchers who want to build a Transformer model from scratch using PyTorch. It explains each core component, such as tokenization, positional encoding, and multi-head attention, and shows how they fit together. The result is a working Transformer model, ideal for those who want to understand the foundational architecture behind modern NLP models like BERT and GPT.
Use this if you are a deep learning practitioner who wants to understand the inner workings of Transformer models by implementing one yourself, rather than just using a pre-built library.
Not ideal if you are looking for a plug-and-play solution to immediately apply a Transformer model to a real-world problem without needing to understand its intricate components.
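To give a flavor of the kind of component you would implement in a project like this, here is a minimal sketch of the standard sinusoidal positional encoding from the original Transformer architecture. This is the textbook formulation, not necessarily this repository's exact implementation; the function name and shapes are illustrative.

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Standard sinusoidal positional encoding.

    Returns a (seq_len, d_model) tensor where even dimensions use sine and
    odd dimensions use cosine, each at geometrically decreasing frequencies.
    """
    # (seq_len, 1) column of positions 0..seq_len-1
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    # Frequencies 1 / 10000^(2i / d_model) for each pair of dimensions
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dims: sine
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dims: cosine
    return pe

pe = sinusoidal_positional_encoding(seq_len=50, d_model=64)
print(pe.shape)  # torch.Size([50, 64])
```

In a full model, this tensor is simply added to the token embeddings before the first attention layer, which is one of the steps the guide walks through.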
Stars
42
Forks
7
Language
Python
License
MIT
Category
Last pushed
Nov 25, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/SirawitC/Transformer_from_scratch_pytorch"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
lvapeab/nmt-keras
Neural Machine Translation with Keras
dair-ai/Transformers-Recipe
🧠A study guide to learn about Transformers
jaketae/ensemble-transformers
Ensembling Hugging Face transformers made easy
lof310/transformer
PyTorch implementation of the current SOTA Transformer. Configurable, efficient, and...
jiangtaoxie/SoT
SoT: Delving Deeper into Classification Head for Transformer