jingyaogong/minimind

🚀🚀 "Large model": train a 26M-parameter GPT entirely from scratch in just 2 hours! 🌏

/ 100

Established

This project helps AI researchers and students understand and build small language models from the ground up. It provides the code and tools to train a functional model from scratch, taking raw text data as input and producing a trained, lightweight LLM. It suits anyone who wants to learn the inner workings of LLM development without massive computing resources.

41,159 stars. Actively maintained with 21 commits in the last 30 days.

Use this if you want to learn the fundamental algorithms and training processes behind large language models by building one yourself on consumer-grade hardware.

Not ideal if you primarily need to fine-tune existing, large, production-ready language models or integrate them into applications without understanding their core mechanics.

AI-research natural-language-processing machine-learning-education neural-network-training computational-linguistics

No Package, No Dependents

Maintenance 20 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 21 / 25


Stars: 41,159

Forks: 4,979

Language: Python

License: Apache-2.0

Recent Releases

v2 (21 Oct 2025)

minimind-v1 (02 Sep 2024)

Related models

kyegomez/TeraGPT

Train a production-grade GPT in less than 400 lines of code. Better than Karpathy's version and GIGAGPT

theosorus/GPT2-Hasktorch

GPT2 implementation in Haskell with the Hasktorch library, inspired by Andrej Karpathy's Pytorch...

noah-hein/mazeGPT

AI model for making mazes that extends OpenAI's GPT2 model

RohitPawar001/GPT-2-Implementation

This repository contains the implementation of OpenAI's GPT-2 with LORA, QLORA, RLHF, PPO, GRPO,...

miguelvanegas-c/LLMImplementation

This repository demonstrates how to build a functional AI agent using LangChain in Python,...
