datawhalechina/llms-from-scratch-cn

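Only basic Python is required to build a large language model from scratch; build GLM4, Llama3, and RWKV6 step by step and gain a deep understanding of how large models work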

/ 100

Emerging

This project provides a hands-on guide to building large language models (LLMs) from scratch. You'll start with basic Python knowledge and learn to implement the core architectures of models like GLM4, Llama3, and RWKV6. This is ideal for machine learning engineers, AI researchers, or data scientists who want to deeply understand how these powerful models are constructed.

4,010 stars. No commits in the last 6 months.

Use this if you want to understand the fundamental building blocks and internal mechanisms of large language models by coding them yourself, rather than just using existing APIs.

Not ideal if you are looking to quickly deploy or fine-tune existing large language models for immediate application without diving into their architectural details.

AI development, Machine learning engineering, Natural language processing, Deep learning research, Model architecture

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 22 / 25

Stars: 4,010

Forks: 552

Language: Jupyter Notebook

License: —

Higher-rated alternatives

rasbt/LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

facebookresearch/LayerSkip

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

FareedKhan-dev/train-llm-from-scratch

A straightforward method for training your LLM, from downloading data to generating text.

kmeng01/rome

Locating and editing factual associations in GPT (NeurIPS 2022)

geeks-of-data/knowledge-gpt

Extract knowledge from all information sources using gpt and other language models. Index and...
