joyehuang/minimind-notes

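🚀 [Build an LLM from Scratch] A minimalist guide to the principles and practice of training large models, including core code and comparative experiments for Transformer, Pretraining, and SFT. | A minimal, principle-first guide to understanding and building LLMs from scratch.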

/ 100

Emerging

This project offers a hands-on guide to deeply understand how large language models (LLMs) like GPT or Llama are trained. It uses simplified code and comparative experiments to show not just 'how to do it,' but 'why it works.' It's for AI/ML engineers, students, and researchers who want to master the core principles behind LLM development.

Use this if you are preparing for LLM-related job interviews, or if you want to move beyond just using AI frameworks to truly understanding the underlying mechanics of modern LLMs.

Not ideal if you are completely new to deep learning with PyTorch, only want to quickly deploy models without understanding their internal workings, or need production-ready code with best practices.

AI-engineering machine-learning-education large-language-models deep-learning-research AI-career-development

No Package, No Dependents

Maintenance 10 / 25

Adoption 8 / 25

Maturity 13 / 25

Community 13 / 25

Stars

Forks

Language: Python

License: Apache-2.0

Higher-rated alternatives

AI-Hypercomputer/maxtext

A simple, performant and scalable Jax LLM!

rasbt/reasoning-from-scratch

Implement a reasoning LLM in PyTorch from scratch, step by step

mindspore-lab/mindnlp

MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...

mosaicml/llm-foundry

LLM training code for Databricks foundation models

rickiepark/llm-from-scratch

Code repository for the book "Learning LLMs by Building One from the Ground Up" (Gilbut, 2025)
