Transformer Architecture Education Transformer Models

There are 63 transformer architecture education models tracked. One scores above 70 (verified tier). The highest-rated is huggingface/transformers at 87/100 with 157,811 stars. One of the top 10 is actively maintained.

Get all 63 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=transformer-architecture-education&limit=20"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
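The same request can be sketched in Python using only the standard library. The endpoint and query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical. The response schema is not documented here, so the sketch returns the decoded JSON without assuming any field names. Whether a larger `limit` returns all 63 projects in one call is also an assumption to check against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the query URL with the same parameters as the curl example."""
    params = {
        "domain": "transformers",
        "subcategory": "transformer-architecture-education",
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON response (hypothetical helper).

    No API key is sent, so the keyless daily quota applies.
    The structure of the returned object is not assumed here.
    """
    with urlopen(build_url(limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects()` issues one request against the daily quota, so caching the result locally is a reasonable choice when iterating on analysis code.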

#	Model	Score	Tier	Stars	Language
1	huggingface/transformers 🤗 Transformers: the model-definition framework for state-of-the-art machine...	87	Verified	157,811	Python
2	kyegomez/LongNet Implementation of plug in and play Attention from "LongNet: Scaling...	51	Established	714	Python
3	pbloem/former Simple transformer implementation from scratch in pytorch. (archival, latest...	49	Emerging	1,092	Python
4	NVIDIA/FasterTransformer Transformer related optimization, including BERT, GPT	48	Emerging	6,398	C++
5	kyegomez/SimplifiedTransformers SimplifiedTransformer simplifies transformer block without affecting...	47	Emerging	15	Python
6	ARM-software/keyword-transformer Official implementation of the Keyword Transformer: https://arxiv.org/abs/2104.00769	47	Emerging	138	Jupyter Notebook
7	ChangwenXu98/TransPolymer Implementation of "TransPolymer: a Transformer-based language model for...	45	Emerging	85	Python
8	IBM/regression-transformer Regression Transformer (2023; Nature Machine Intelligence)	45	Emerging	159	Python
9	bytedance/effective_transformer Running BERT without Padding	44	Emerging	480	C++
10	bayesgroup/code_transformers Empirical Study of Transformers for Source Code & A Simple Approach for...	43	Emerging	66	Python
11	ShivamRajSharma/Transformer-Architectures-From-Scratch Implementation of transformers based architecture in PyTorch.	43	Emerging	55	Python
12	dashstander/block-recurrent-transformer Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag...	41	Emerging	85	Python
13	Breeze648/Transformer-from-Scratch This repository is intended as an AI paper reproduction / from-scratch Transformer implementation. ...	41	Emerging	33	Python
14	octanove/shiba Pytorch implementation and pre-trained Japanese model for CANINE, the...	41	Emerging	89	Python
15	YadaYuki/transformer-from-scratch Transformer from scratch 🙊 (English to Japanese Translator by PyTorch)	40	Emerging	31	Python
16	Whiax/BERT-Transformer-Pytorch Basic implementation of BERT and Transformer in Pytorch in one short python...	38	Emerging	45	Python
17	pmichel31415/are-16-heads-really-better-than-1 Code for the paper "Are Sixteen Heads Really Better than One?"	38	Emerging	175	Shell
18	dcaffo98/transpormer TranSPormer: a transformer for the Travelling Salesman Problem	38	Emerging	26	Python
19	amazon-science/transformers-data-augmentation Code associated with the "Data Augmentation using Pre-trained Transformer...	37	Emerging	51	Python
20	THUDM/Multilingual-GLM The multilingual variant of GLM, a general language model trained with...	35	Emerging	62	Python
21	forgi86/sysid-transformers Code to reproduce the results of the paper In-context learning for...	34	Emerging	19	Jupyter Notebook
22	nanowell/Differential-Transformer-PyTorch PyTorch implementation of the Differential-Transformer architecture for...	34	Emerging	86	Python
23	submarat/removing-layer-norm Transformers Don’t Need LayerNorm at Inference Time	33	Emerging	3	Python
24	chrisjob1021/transformer-examples A collection of educational toy implementations and examples of key...	33	Emerging	3	Jupyter Notebook
25	shamspias/Transformers-and-Large-Language-Models-From-Basics-to-Frontier-Research Dive into the transformative world of NLP with this guide on Transformers....	32	Emerging	5	—
26	IParraMartin/An-Explanation-Is-All-You-Need The original transformer implementation from scratch. It contains...	31	Emerging	44	Python
27	LoserCheems/WonderfulMatrices Wonderful Matrices to Build Small Language Models	29	Experimental	44	Python
28	fabienfrfr/tptt 😊 TPTT: Transforming Pretrained Transformers into Titans	29	Experimental	60	Python
29	HSaurabh0919/CTransformers Implementing wide variety of transformers, fine tuning as well as trying...	29	Experimental	3	Jupyter Notebook
30	kyegomez/MLXTransformer Simple Implementation of a Transformer in the new framework MLX by Apple	27	Experimental	19	Python
31	januverma/transformers-stuff Codes, scripts, and notebooks on various aspects of transformer models.	27	Experimental	27	Jupyter Notebook
32	BruinGrowly/URI_Transformer URI-Transformer: Universal Reality Interface - A revolutionary artificial...	26	Experimental	9	Python
33	SauravP97/toy-transformer A decoder only Transformer implementing masked attention	24	Experimental	11	Python
34	abgache/NanoGPL Small test generative pre-trained LAM (Linear Attention Mechanism).	24	Experimental	1	Python
35	daniel-furman/polyglot-or-not Are foundation LMs multilingual knowledge bases? (EMNLP 2023)	22	Experimental	19	Jupyter Notebook
36	TomasrRodrigues/TinyGPT A research-grade PyTorch implementation of a decoder-only transformer from...	21	Experimental	—	Python
37	kyegomez/HeptapodLM An Implementation of an Transformer model that generates tokens non-linearly...	21	Experimental	10	Python
38	MyDarapy/gpt-1-from-scratch Rewriting and pretraining GPT-1 from scratch. Implementing Multihead...	20	Experimental	7	Python
39	FareedKhan-dev/best-introduction-to-transformer transformer again in the same manner as I did in my previous blog (for both...	20	Experimental	8	—
40	fattorib/tritonformer Trainable transformer with fwd+bwd ops in Triton, matching the performance...	20	Experimental	6	Python
41	Rohan-Thoma/Coding-attention-from-scratch This repository consists code for executing attention mechanism from scratch...	18	Experimental	2	Jupyter Notebook
42	jongoiko/minigpt Training a tiny GPT-like Transformer language model	18	Experimental	1	Jupyter Notebook
43	ashleysally00/transformers-and-attention Detailed guide to Transformer models that includes both technical and...	17	Experimental	—	—
44	scttfrdmn/local-code-model Pure Go implementation of a GPT-style transformer from scratch - educational...	17	Experimental	—	Go
45	DataWorshipper/Machine_Translation This repository implements a Machine Translation system from scratch using...	17	Experimental	1	Jupyter Notebook
46	ambideXtrous9/Transformer-from-Scratch Transformer from Scratch	16	Experimental	—	Python
47	tsvlgd/gpt-from-scratch decoder-only Transformer (GPT) language model coded from scratch in pytorch	15	Experimental	2	Jupyter Notebook
48	GabMartino/TransformerForDummies Annotated implementation of vanilla Transformers to guide through all the...	15	Experimental	10	Python
49	Ultron09/Numpy-Transformer A pure NumPy implementation of GPT built from scratch for educational...	15	Experimental	2	Python
50	gatorduck/Creating_Custom_Decoder_Transformer Custom decoder Transformer that treats a patient's medical journey like a...	14	Experimental	—	Jupyter Notebook
51	ZZZ150751/cs336_spring2025_assignment1 Implementation of a Decoder-only Transformer language model from scratch for...	14	Experimental	1	Python
52	driellecristine/BERT-Contrastive-LoRA Enhance BERT fine-tuning for intent classification using supervised...	14	Experimental	—	Python
53	Harsha-hue/visual-transformer-guide I built a visual guide explaining how Transformers work. Tokenization...	14	Experimental	1	HTML
54	tulasinnd/Transformer-Decoder-Evolution This repository contains various decoder-only transformer versions built...	13	Experimental	—	Jupyter Notebook
55	wahabzh/transformer-from-scratch 🤖 Complete Transformer implementation from scratch using PyTorch. Trained on...	13	Experimental	—	Jupyter Notebook
56	ledesma-ivan/How-Transformer-LLMs-Work Understand the architecture behind modern Large Language Models. This...	13	Experimental	—	—
57	sourize/Decodex This project implements a decoder-only GPT model from scratch using PyTorch.	13	Experimental	—	Jupyter Notebook
58	Hunain0786/miniTransformer Mini Transformer (Implemented From Scratch) A from-scratch implementation...	13	Experimental	—	Python
59	xmarva/transformer-based-architectures Breakdown of SoTA transformer-based architectures	11	Experimental	2	Jupyter Notebook
60	Pavansomisetty21/Attention-is-All-You-Need-The-Transformer-architecture In this we explore detailed architecture of Transformer	11	Experimental	—	Jupyter Notebook
61	nabeelshan78/gpt-forge-from-scratch-transformer A clean, modular implementation of a decoder-only Transformer (mini-GPT)...	11	Experimental	2	Jupyter Notebook
62	SrEntropy/nanoGPT-Transformer Mastering every concept from the seminal 2017 paper "Attention Is All You...	10	Experimental	1	Jupyter Notebook
63	coxy1989/tfmr Keras/Tensorflow implementation of the decoder from the transformer as...	10	Experimental	2	HTML