therealoliver/Deepdive-llama3-from-scratch
Implement Llama3 inference step by step: grasp the core concepts, master the derivations, and write the code.
This project helps AI engineers and researchers understand, step by step, how the Llama3 large language model generates text. Starting from foundational mathematical concepts and the model architecture, it provides a clear, annotated walkthrough of the Llama3 inference process, with deep dives into core mechanisms such as KV-Cache and attention. It is aimed at readers who want to master the principles behind large language models.
626 stars. No commits in the last 6 months.
Use this if you are an AI engineer or researcher who wants to deeply understand the Llama3 model's internal workings, from basic principles to detailed code implementation.
Not ideal if you are looking for a pre-built tool to simply use Llama3 for text generation without delving into its underlying mechanics.
Stars: 626
Forks: 50
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Feb 24, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/therealoliver/Deepdive-llama3-from-scratch"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
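The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout (category, then owner, then repository name) is inferred from the single curl URL above, and the response is returned as raw text because its schema is not described here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository given as 'owner/name'.

    Assumption: the path is <base>/<category>/<owner>/<name>, as in the
    one example URL shown on this page.
    """
    owner, name = repo.split("/", 1)
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(name)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text.

    The body is not parsed, since the response format is not documented here.
    """
    with urlopen(build_quality_url(category, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("transformers", "therealoliver/Deepdive-llama3-from-scratch")` issues the same GET request as the curl command. Without a key, the 100 requests/day limit stated above applies, so results are worth caching locally when polling several repositories.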
Higher-rated alternatives
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
facebookresearch/LayerSkip
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
FareedKhan-dev/train-llm-from-scratch
A straightforward method for training your LLM, from downloading data to generating text.
kmeng01/rome
Locating and editing factual associations in GPT (NeurIPS 2022)
datawhalechina/llms-from-scratch-cn
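Build a large language model from scratch with only basic Python knowledge; build GLM4, Llama3, and RWKV6 step by step from scratch and gain a deep understanding of how large models work.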