rasbt/LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

/ 100

Established

This project provides practical code and guidance for building your own GPT-like large language model (LLM) from the ground up. You'll learn how to take raw text data, process it, and train a functional LLM that can generate text or follow instructions. It is designed for AI practitioners, machine learning engineers, and researchers who want to understand LLMs deeply by implementing one.

87,892 stars. Actively maintained with 8 commits in the last 30 days.

Use this if you are a machine learning engineer or researcher who wants to learn the inner workings of large language models by implementing one yourself, rather than just using existing frameworks.

Not ideal if you are looking for a pre-built, production-ready LLM solution or a high-level API to integrate into an existing application without diving into the underlying code.

AI development, natural language processing, machine learning engineering, deep learning research, custom model training

No Package, No Dependents

Maintenance 17 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 23 / 25

Stars 87,892

Forks 13,408

Language Jupyter Notebook

License —

Related models

facebookresearch/LayerSkip

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

FareedKhan-dev/train-llm-from-scratch

A straightforward method for training your LLM, from downloading data to generating text.

kmeng01/rome

Locating and editing factual associations in GPT (NeurIPS 2022)

datawhalechina/llms-from-scratch-cn

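Requires only basic Python to build a large language model from scratch; build GLM4, Llama3, and RWKV6 step by step from scratch and gain a deep understanding of how large models work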

geeks-of-data/knowledge-gpt

Extract knowledge from all information sources using gpt and other language models. Index and...
