codewithdark-git/Building-LLMs-from-scratch
This repository guides you through building a GPT-style large language model (LLM) from scratch in PyTorch. The structure and approach are inspired by the book Build a Large Language Model (From Scratch) by Sebastian Raschka.
The project helps machine learning engineers and researchers understand LLMs from the ground up: a guided journey that starts from raw text data and ends with a trained language model capable of generating human-like text. It targets readers with a background in machine learning and Python who want a deep grasp of LLM architectures.
Use this if you are a machine learning engineer or researcher who wants to learn the fundamental components and training process of a GPT-style LLM by building one yourself.
Not ideal if you are a data scientist or developer looking to simply use an existing LLM or fine-tune a pre-trained model for an application without diving into its internal architecture.
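To give a feel for the kind of component the repository walks you through, here is a minimal NumPy sketch of causal (masked) self-attention, the core operation of a GPT-style transformer. This is an illustrative simplification, not code from the repository: a real GPT block uses learned query/key/value projections and multiple heads, while this sketch uses the input directly as Q, K, and V.

```python
import numpy as np

def causal_self_attention(x):
    """Simplified single-head causal self-attention.

    x: array of shape (seq_len, d_model), used directly as Q, K, and V
    (a real GPT layer would first apply learned W_q, W_k, W_v projections).
    """
    seq_len, d_k = x.shape
    # Scaled dot-product similarity between every pair of positions.
    scores = x @ x.T / np.sqrt(d_k)
    # Causal mask: position i may not attend to positions j > i.
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Row-wise softmax (shift by the row max for numerical stability).
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of the (visible) value vectors.
    return weights @ x

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # 4 tokens, 8-dimensional embeddings
out = causal_self_attention(x)
print(out.shape)                  # (4, 8)
```

Because of the causal mask, the first token can attend only to itself, so its output equals its input vector; later tokens mix information from all earlier positions.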
Stars
51
Forks
16
Language
Jupyter Notebook
License
MIT
Category
Last pushed
Oct 29, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/codewithdark-git/Building-LLMs-from-scratch"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
facebookresearch/LayerSkip
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
FareedKhan-dev/train-llm-from-scratch
A straightforward method for training your LLM, from downloading data to generating text.
kmeng01/rome
Locating and editing factual associations in GPT (NeurIPS 2022)
datawhalechina/llms-from-scratch-cn
Build a large language model from scratch with only basic Python; step by step, construct GLM4/Llama3/RWKV6 from zero and deeply understand how large models work