AIDajiangtang/LLM-from-scratch
Learn large language models from scratch: Transformer, GPT-2, and BERT pre-training and fine-tuning
This project offers detailed notebooks and guides for understanding and implementing Large Language Models (LLMs) such as the Transformer, GPT-2, and BERT from their foundational components. It covers pre-training, fine-tuning, and practical applications such as text classification, sentiment analysis, and chatbot construction, giving AI developers and researchers hands-on experience building custom LLM solutions.
No commits in the last 6 months.
Use this if you are an AI developer or researcher who wants to learn the mechanics of LLMs, from basic components to advanced applications, through practical code examples and clear explanations.
Not ideal if you are looking for a pre-built, ready-to-deploy LLM solution without diving into the underlying code and training processes.
Stars
37
Forks
2
Language
Jupyter Notebook
License
—
Category
Last pushed
Jul 01, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/AIDajiangtang/LLM-from-scratch"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
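The endpoint above appears to follow an `owner/repo` path pattern. A minimal Python sketch for constructing and calling it, assuming that pattern generalizes to other repositories (the response schema is not documented here, so it is printed raw):

```python
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a GitHub owner/repo pair."""
    return f"{BASE}/{owner}/{repo}"

url = quality_url("AIDajiangtang", "LLM-from-scratch")

# Uncomment to fetch the raw JSON response (no key needed, 100 requests/day):
# with urllib.request.urlopen(url) as resp:
#     print(resp.read().decode())
```

The network call is left commented out so the sketch runs offline; only the URL construction is shown as working code.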
Higher-rated alternatives
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
rasbt/reasoning-from-scratch
Implement a reasoning LLM in PyTorch from scratch, step by step
mindspore-lab/mindnlp
MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...
mosaicml/llm-foundry
LLM training code for Databricks foundation models
rickiepark/llm-from-scratch
Code repository for "Build an LLM from Scratch" (Gilbut, 2025)