datawhalechina/tiny-universe

"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe

48 / 100 (Emerging)

This project offers a comprehensive, 'from-scratch' guide to building large language model (LLM) systems. It starts with foundational principles and walks you through constructing a complete LLM ecosystem, including the core model, Retrieval-Augmented Generation (RAG) framework, Agent system, and evaluation tools. The target audience is deep learning practitioners who have some experience with LLMs and want to deepen their understanding by building systems from the ground up.

4,598 stars.

Use this if you are an AI/ML practitioner with basic LLM application knowledge and want to understand the underlying principles and build your own custom LLM components without relying on high-level frameworks.

Not ideal if you are looking for a quick way to integrate pre-built LLMs or frameworks into an application, as this project focuses on foundational understanding and manual implementation.

Large Language Models, Deep Learning Engineering, AI System Design, Natural Language Processing, Machine Learning Research
No License, No Package, No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 20 / 25
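
The four sub-scores listed above add up to the headline figure of 48 / 100. The short check below only illustrates that arithmetic; treating the overall score as a plain sum of the four categories is an assumption based on the numbers shown here, not a documented formula.

```python
# Sub-scores exactly as listed above, each out of 25.
subscores = {
    "Maintenance": 10,
    "Adoption": 10,
    "Maturity": 8,
    "Community": 20,
}

# Assumption: the overall score is the plain sum of the four categories.
total = sum(subscores.values())
maximum = 25 * len(subscores)

print(f"{total} / {maximum}")  # prints "48 / 100"
```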


Stars: 4,598
Forks: 450
Language: Jupyter Notebook
License: None
Last pushed: Feb 12, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/datawhalechina/tiny-universe"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
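
The same request can be made from Python with only the standard library. This is a minimal sketch: the endpoint URL is taken from the curl command above, while the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for a repository.

    Assumption: the endpoint returns a JSON body; its fields are not
    described on this page, so the parsed result is returned unmodified.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("datawhalechina", "tiny-universe")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than refetching on every run.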