LLM Implementation From Scratch Transformer Models

Educational repositories focused on building Large Language Models from first principles using PyTorch, emphasizing step-by-step understanding of transformer architecture, tokenization, and training mechanics. Does NOT include fine-tuning existing models, inference optimization, or production deployment frameworks.

There are 52 LLM-implementation-from-scratch projects tracked. Four score above 50 (the Established tier). The highest-rated is rasbt/LLMs-from-scratch at 66/100 with 87,892 stars. Only 1 of the top 10 is actively maintained.

Get all 52 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-implementation-from-scratch&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
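The same request can be made from Python. The following is a minimal sketch using only the standard library; it reuses the endpoint and query parameters from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the sketch returns the parsed payload without assuming any particular fields. Whether raising `limit` returns all 52 projects is also an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="llm-implementation-from-scratch",
              limit=20):
    """Build the request URL (hypothetical helper, not part of any official client)."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and parse the JSON payload; the response schema is not assumed."""
    with urlopen(build_url(limit=limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects()` performs one keyless request, which counts against the 100 requests/day allowance, so caching the result locally is a sensible habit when experimenting.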

# | Model | Description | Score | Tier
1 | rasbt/LLMs-from-scratch | Implement a ChatGPT-like LLM in PyTorch from scratch, step by step | 66 | Established
2 | facebookresearch/LayerSkip | Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative... | 52 | Established
3 | FareedKhan-dev/train-llm-from-scratch | A straightforward method for training your LLM, from downloading data to... | 52 | Established
4 | kmeng01/rome | Locating and editing factual associations in GPT (NeurIPS 2022) | 51 | Established
5 | datawhalechina/llms-from-scratch-cn | Requires only basic Python; build a large language model from scratch; build GLM4, Llama3, and RWKV6 step by step from scratch for a deep understanding of how large models work | 48 | Emerging
6 | geeks-of-data/knowledge-gpt | Extract knowledge from all information sources using gpt and other language... | 47 | Emerging
7 | codewithdark-git/Building-LLMs-from-scratch | This repository guides you through the process of building a GPT-style Large... | 47 | Emerging
8 | analyticalrohit/llms-from-scratch | Build a ChatGPT like LLM from scratch in PyTorch, explained step by step. | 45 | Emerging
9 | huangwl18/language-planner | Official Code for "Language Models as Zero-Shot Planners: Extracting... | 43 | Emerging
10 | therealoliver/Deepdive-llama3-from-scratch | Achieve the llama3 inference step-by-step, grasp the core concepts, master... | 42 | Emerging
11 | skyloevil/llm-scratch-pytorch | lm-scratch-pytorch - The code is designed to be beginner-friendly, with a... | 41 | Emerging
12 | clabrugere/scratch-llm | Implements a LLM similar to Meta's Llama 2 from the ground up in PyTorch,... | 40 | Emerging
13 | OpenSparseLLMs/LLaMA-MoE-v2 | 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of... | 40 | Emerging
14 | FareedKhan-dev/create-million-parameter-llm-from-scratch | Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture. | 39 | Emerging
15 | HxCodeWarrior/StellarByte | Implement a basic decoder-only Transformer model from scratch, then upgrade it to build an LLM of your own | 38 | Emerging
16 | zhanshijinwat/Steel-LLM | Train a 1B LLM with 1T tokens from scratch by personal | 38 | Emerging
17 | joelbarmettlerUZH/ConceptFormer | Towards Finding the Essence of Everything in Large Language Models | 37 | Emerging
18 | vipulraheja/iterater | Official implementation of the paper "IteraTeR: Understanding Iterative... | 35 | Emerging
19 | UCSB-NLP-Chang/ULD | Implementation of paper 'Reversing the Forget-Retain Objectives: An... | 33 | Emerging
20 | bloomberg/minilmv2.bb | Our open source implementation of MiniLMv2... | 33 | Emerging
21 | jpwahle/emnlp23-paraphrase-types | The official implementation of the EMNLP 2023 paper "Paraphrase Types for... | 32 | Emerging
22 | Mmorgan-ML/Phase-Slip-Sampler | Phase-Slip is a stochastic intervention architecture that operates on the... | 32 | Emerging
23 | ai-art-dev99/llm-from-scratch | Build a Large Language Model From Scratch | 31 | Emerging
24 | nishantb06/smolLM | Reverse Engineering SmolLM2 model and training it from scratch | 27 | Experimental
25 | newfull5/NLLB-200-Distilled-350M-en-ko | nllb-200 distilled 350M for English to Korean translation | 27 | Experimental
26 | rafaelvp-db/db-ancient-code-translation | Simple repo showing code-to-code and code-to-text capabilities using LLMs on... | 25 | Experimental
27 | shreyansh26/LLM-Sampling | A collection of various LLM sampling methods implemented in pure PyTorch | 24 | Experimental
28 | NaS-Research/knowledge-model | Our knowledge system systematically ingests, processes, and indexes... | 23 | Experimental
29 | NamrataThakur/Large_Language_Model_From_Scratch_Implementation | Implementing an LLM from scratch block-by-block using PyTorch | 23 | Experimental
30 | Arlchoose-code/Indonesian-LLM-Starter | A starter kit for building your own Indonesian Large Language Model (LLM)... | 22 | Experimental
31 | SoelMgd/Poker_Transformers | LLMs trained for Poker | 21 | Experimental
32 | Swamy-s-Tech-Skills-Academy-2026/llms-from-scratch-practice | Hands-on learning repository for building a GPT-style Large Language Model... | 21 | Experimental
33 | ldr7/language_model_from_scratch | Build a language model from scratch. | 21 | Experimental
34 | bijinc/speculoos | Efficient speculative sampling for language models | 21 | Experimental
35 | bassrehab/speculative-decoding | Reference implementation of LLM inference acceleration techniques. Includes... | 20 | Experimental
36 | ghassenov/llm_from_scratch | A GPT-2 model from scratch built to explore the inner workings of... | 20 | Experimental
37 | adarsh-crafts/llama-llm-from-scratch | Educational, from-scratch implementation of a LLaMA-style LLM using PyTorch... | 20 | Experimental
38 | wasim/scaling-specialization-dense-lms | Do dense LMs develop MoE-like specialization as they scale? Measure it,... | 20 | Experimental
39 | RobinSmits/Schaapje | Schaapje - A Dutch Small Language Model | 18 | Experimental
40 | theosorus/French-Language-Model | In this project, I built a French Large Language Model only with PyTorch | 18 | Experimental
41 | VisualJoyce/TERepo | [ACL 2023] A Text Editing Repository for reproduction and innovation. | 17 | Experimental
42 | mohitpg/LLMs-from-scratch | A collection of LLMs implemented from scratch using PyTorch | 17 | Experimental
43 | YUGESHKARAN/Clash_of_Clans_Language_Model | A mini language model from scratch using PyTorch, with approximately 2.96... | 13 | Experimental
44 | harishm17/build-llm-from-scratch | From-scratch LLM notebooks: Transformers, BPE tokenizer, PyTorch... | 13 | Experimental
45 | eryk-mazus/no-reason | Step-by-step CoT decoding | 12 | Experimental
46 | kreasof-ai/LLM-from-scratch | LLM from scratch, no pretrained models, no HF transformers | 12 | Experimental
47 | haukzero/Speculative-Demo | A simple implementation of speculative inference | 11 | Experimental
48 | bpevangelista/llms_learning | ML - From Scratch to Llama2, Mistral and Phi-2 in PyTorch | 11 | Experimental
49 | wangtz19/DecodingStrategy | Unofficial implementations for optimized decoding strategies of large language models | 10 | Experimental
50 | Dhyanesh18/llm-from-scratch | In this I have explored different parts of an LLM from the tokenizer to the... | 10 | Experimental
51 | kjpou1/llm-zero-to-trained | Building a Large Language Model from scratch for deep understanding —... | 10 | Experimental
52 | jongwooko/CR-ILD | Code for the paper "Revisiting Intermediate Layer Distillation for... | 10 | Experimental