shreydan/scratchformers
Building various transformer model architectures and their modules from scratch.
This project offers clear, step-by-step implementations of various transformer architectures, like LLaMA, GPT-2, and CLIP. It takes foundational machine learning concepts and translates them into working models and modules for tasks such as natural language processing, image analysis, and even medical image segmentation. Machine learning engineers and researchers looking to understand and build these models from first principles would find this valuable.
No commits in the last 6 months.
Use this if you are a machine learning engineer or researcher who wants to learn how modern transformer models are built from scratch, understand their internal mechanics, or implement them for specific applications.
Not ideal if you are an end-user simply looking to apply pre-trained transformer models or leverage existing APIs for immediate tasks without diving into their architectural details.
Stars: 13
Forks: 2
Language: Jupyter Notebook
License: —
Category:
Last pushed: Mar 14, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/shreydan/scratchformers"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
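The same request can be sketched in Python using only the standard library. This is a minimal illustration, not a documented client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout (a category segment followed by owner and repository) is inferred from the single URL shown above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.parse
import urllib.request

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the request URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, as inferred
    from the one example URL on this page.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return API_BASE + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch and decode the response (hypothetical helper).

    Assumption: the body is JSON; the response schema is not
    documented here, so the result is returned as parsed, unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("transformers", "shreydan", "scratchformers")
# print(data)
```

Because keyless access is limited to 100 requests per day, it may be worth caching the result locally rather than calling `fetch_quality` repeatedly.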
Higher-rated alternatives
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features...
kanishkamisra/minicons
Utility for behavioral and representational analyses of Language Models
lucidrains/simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
lucidrains/dreamer4
Implementation of Danijar's latest iteration for his Dreamer line of work
Nicolepcx/Transformers-in-Action
This is the corresponding code for the book Transformers in Action