ksm26/Pretraining-LLMs
Master the essential steps of pretraining large language models (LLMs). Learn to create high-quality datasets, configure model architectures, execute training runs, and assess model performance for efficient and effective LLM pretraining.
This course teaches you how to train large language models (LLMs) from the ground up: preparing large text datasets, configuring model architectures, running training efficiently, and evaluating the resulting model. It is aimed at machine learning engineers, data scientists, and researchers who need to build specialized language models.
No commits in the last 6 months.
Use this if you need to create a custom language model for specific tasks or domains, rather than relying solely on existing general-purpose models.
Not ideal if you primarily need to fine-tune an existing LLM for a specific task, as this focuses on the foundational pretraining process.
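The core of the pretraining workflow described above is next-token prediction over a text corpus. The sketch below is a deliberately tiny, illustrative example of that objective (a character-level bigram model trained with cross-entropy in PyTorch); it is not the course's actual code, and the corpus, model, and hyperparameters are toy assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "corpus" and character-level tokenizer (illustrative only).
text = "pretraining large language models from scratch " * 20
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
ids = torch.tensor([stoi[ch] for ch in text])


class BigramLM(nn.Module):
    """Smallest possible LM: an embedding table that maps each
    token id directly to logits over the next token."""

    def __init__(self, vocab_size):
        super().__init__()
        self.logits = nn.Embedding(vocab_size, vocab_size)

    def forward(self, x):
        return self.logits(x)


model = BigramLM(len(vocab))
opt = torch.optim.AdamW(model.parameters(), lr=1e-2)

# Pretraining objective: predict token t+1 from token t.
inputs, targets = ids[:-1], ids[1:]

losses = []
for step in range(200):
    logits = model(inputs)
    loss = nn.functional.cross_entropy(logits, targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

A real pretraining run swaps the bigram table for a Transformer, the toy string for billions of tokens, and adds batching, learning-rate schedules, and checkpointing, but the training loop keeps this same shape.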
Stars
27
Forks
10
Language
Jupyter Notebook
License
—
Category
Last pushed
Aug 07, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ksm26/Pretraining-LLMs"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
rasbt/reasoning-from-scratch
Implement a reasoning LLM in PyTorch from scratch, step by step
mindspore-lab/mindnlp
MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...
mosaicml/llm-foundry
LLM training code for Databricks foundation models
rickiepark/llm-from-scratch
Code repository for the Korean edition of *Build a Large Language Model (From Scratch)* (Gilbut, 2025)