LLM Implementation From Scratch Transformer Models

Educational repositories focused on building Large Language Models from first principles using PyTorch, emphasizing step-by-step understanding of transformer architecture, tokenization, and training mechanics. Does NOT include fine-tuning existing models, inference optimization, or production deployment frameworks.

There are 52 LLM-implementation-from-scratch models tracked. 4 score above 50 (Established tier). The highest-rated is rasbt/LLMs-from-scratch at 66/100 with 87,892 stars. 1 of the top 10 is actively maintained.

Get all 52 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-implementation-from-scratch&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
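The same request can be made from Python. This is a minimal sketch using only the standard library: the endpoint and query parameters are copied from the curl command above, while the helper names `build_url` and `fetch_projects` are hypothetical. The shape of the JSON response is not documented on this page, so the sketch returns the parsed payload as-is rather than assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain, subcategory, limit):
    """Build the query URL with the same parameters as the curl command."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and return the decoded JSON payload.

    The response structure is an unknown here, so nothing is assumed
    about its keys; inspect the result before indexing into it.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_url("transformers", "llm-implementation-from-scratch", 20)
# data = fetch_projects(url)  # uncomment to send the request (counts toward the daily quota)
```

The network call is left commented out so the snippet can be run without spending one of the 100 keyless requests per day.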

#	Model	Score	Tier	Stars	Language
1	rasbt/LLMs-from-scratch Implement a ChatGPT-like LLM in PyTorch from scratch, step by step	66	Established	87,892	Jupyter Notebook
2	facebookresearch/LayerSkip Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative...	52	Established	361	Python
3	FareedKhan-dev/train-llm-from-scratch A straightforward method for training your LLM, from downloading data to...	52	Established	531	Jupyter Notebook
4	kmeng01/rome Locating and editing factual associations in GPT (NeurIPS 2022)	51	Established	737	Python
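5	datawhalechina/llms-from-scratch-cn Only basic Python required: build a large language model from zero; build GLM4, Llama3, and RWKV6 step by step from zero and gain a deep understanding of how large models work	48	Emerging	4,010	Jupyter Notebook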
6	geeks-of-data/knowledge-gpt Extract knowledge from all information sources using gpt and other language...	47	Emerging	291	Python
7	codewithdark-git/Building-LLMs-from-scratch This repository guides you through the process of building a GPT-style Large...	47	Emerging	51	Jupyter Notebook
8	analyticalrohit/llms-from-scratch Build a ChatGPT like LLM from scratch in PyTorch, explained step by step.	45	Emerging	26	Jupyter Notebook
9	huangwl18/language-planner Official Code for "Language Models as Zero-Shot Planners: Extracting...	43	Emerging	278	Jupyter Notebook
10	therealoliver/Deepdive-llama3-from-scratch Achieve the llama3 inference step-by-step, grasp the core concepts, master...	42	Emerging	626	Jupyter Notebook
11	skyloevil/llm-scratch-pytorch llm-scratch-pytorch - The code is designed to be beginner-friendly, with a...	41	Emerging	100	Jupyter Notebook
12	clabrugere/scratch-llm Implements a LLM similar to Meta's Llama 2 from the ground up in PyTorch,...	40	Emerging	38	Python
13	OpenSparseLLMs/LLaMA-MoE-v2 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of...	40	Emerging	93	Python
14	FareedKhan-dev/create-million-parameter-llm-from-scratch Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.	39	Emerging	201	Jupyter Notebook
15	HxCodeWarrior/StellarByte Implement a basic Transformer decoder-only model from scratch, then upgrade it to build an LLM of your own	38	Emerging	6	Python
16	zhanshijinwat/Steel-LLM Train a 1B LLM with 1T tokens from scratch by personal	38	Emerging	791	Jupyter Notebook
17	joelbarmettlerUZH/ConceptFormer Towards Finding the Essence of Everything in Large Language Models	37	Emerging	13	Python
18	vipulraheja/iterater Official implementation of the paper "IteraTeR: Understanding Iterative...	35	Emerging	80	Python
19	UCSB-NLP-Chang/ULD Implementation of paper 'Reversing the Forget-Retain Objectives: An...	33	Emerging	26	Python
20	bloomberg/minilmv2.bb Our open source implementation of MiniLMv2...	33	Emerging	61	Python
21	jpwahle/emnlp23-paraphrase-types The official implementation of the EMNLP 2023 paper "Paraphrase Types for...	32	Emerging	12	Python
22	Mmorgan-ML/Phase-Slip-Sampler Phase-Slip is a stochastic intervention architecture that operates on the...	32	Emerging	6	Python
23	ai-art-dev99/llm-from-scratch Build a Large Language Model From Scratch	31	Emerging	22	Jupyter Notebook
24	nishantb06/smolLM Reverse Engineering SmolLM2 model and training it from scratch	27	Experimental	1	Python
25	newfull5/NLLB-200-Distilled-350M-en-ko nllb-200 distilled 350M for English to Korean translation	27	Experimental	28	Jupyter Notebook
26	rafaelvp-db/db-ancient-code-translation Simple repo showing code-to-code and code-to-text capabilities using LLMs on...	25	Experimental	5	Python
27	shreyansh26/LLM-Sampling A collection of various LLM sampling methods implemented in pure Pytorch	24	Experimental	28	Python
28	NaS-Research/knowledge-model Our knowledge system systematically ingests, processes, and indexes...	23	Experimental	1	Python
29	NamrataThakur/Large_Language_Model_From_Scratch_Implementation Implementing an LLM from scratch block-by-block using PyTorch	23	Experimental	—	Jupyter Notebook
30	Arlchoose-code/Indonesian-LLM-Starter A starter kit for building your own Indonesian Large Language Model (LLM)...	22	Experimental	1	Python
31	SoelMgd/Poker_Transformers LLMs trained for Poker	21	Experimental	9	Jupyter Notebook
32	Swamy-s-Tech-Skills-Academy-2026/llms-from-scratch-practice Hands-on learning repository for building a GPT-style Large Language Model...	21	Experimental	—	Jupyter Notebook
33	ldr7/language_model_from_scratch Build a language model from scratch.	21	Experimental	1	Jupyter Notebook
34	bijinc/speculoos efficient speculative sampling for language models	21	Experimental	—	Python
35	bassrehab/speculative-decoding Reference implementation of LLM inference acceleration techniques. Includes...	20	Experimental	1	Python
36	ghassenov/llm_from_scratch A GPT-2 model from scratch built to explore the inner workings of...	20	Experimental	4	Jupyter Notebook
37	adarsh-crafts/llama-llm-from-scratch Educational, from-scratch implementation of a LLaMA-style LLM using PyTorch...	20	Experimental	4	Jupyter Notebook
38	wasim/scaling-specialization-dense-lms Do dense LMs develop MoE-like specialization as they scale? Measure it,...	20	Experimental	1	Python
39	RobinSmits/Schaapje Schaapje - A Dutch Small Language Model	18	Experimental	2	Jupyter Notebook
40	theosorus/French-Language-Model In this project, I built a French Large Language Model only with pytorch	18	Experimental	7	Python
41	VisualJoyce/TERepo [ACL 2023] A Text Editing Repository for reproduction and innovation.	17	Experimental	1	Python
42	mohitpg/LLMs-from-scratch A collection of LLMs implemented from scratch using pytorch	17	Experimental	1	Python
43	YUGESHKARAN/Clash_of_Clans_Language_Model A mini language model from scratch using PyTorch, with approximately 2.96...	13	Experimental	—	Jupyter Notebook
44	harishm17/build-llm-from-scratch From‑scratch LLM notebooks: Transformers, BPE tokenizer, PyTorch...	13	Experimental	—	Jupyter Notebook
45	eryk-mazus/no-reason step-by-step cot decoding	12	Experimental	5	Python
46	kreasof-ai/LLM-from-scratch LLM from scratch, no pretrained models, no HF transformers	12	Experimental	7	Jupyter Notebook
47	haukzero/Speculative-Demo A simple implementation of speculative inference	11	Experimental	3	Python
48	bpevangelista/llms_learning ML – From Scratch to Llama2, Mistral and Phi-2 in Pytorch	11	Experimental	3	Cuda
49	wangtz19/DecodingStrategy Unofficial implementations for optimized decoding strategies of large language models	10	Experimental	2	Jupyter Notebook
50	Dhyanesh18/llm-from-scratch In this i have explored different parts of an LLM from the tokenizer to the...	10	Experimental	1	Jupyter Notebook
51	kjpou1/llm-zero-to-trained Building a Large Language Model from scratch for deep understanding —...	10	Experimental	1	Jupyter Notebook
52	jongwooko/CR-ILD Code for the paper "Revisiting Intermediate Layer Distillation for...	10	Experimental	2	Python

Comparisons in this category

LLMs-from-scratch and llms-from-scratch-cn (66 vs 48)
LLMs-from-scratch and train-llm-from-scratch (66 vs 52)
LLMs-from-scratch and create-million-parameter-llm-from-scratch (66 vs 39)
LLMs-from-scratch and llm-scratch-pytorch (66 vs 41)
LLMs-from-scratch and Building-LLMs-from-scratch (66 vs 47)
LLMs-from-scratch and scratch-llm (66 vs 40)
LLMs-from-scratch and llms-from-scratch (66 vs 45)
train-llm-from-scratch and llms-from-scratch (52 vs 45)
llm-scratch-pytorch and scratch-llm (41 vs 40)
llms-from-scratch and llm-scratch-pytorch (45 vs 41)