ChaitanyaK77/Building-a-Small-Language-Model-SLM-
This repository provides a Jupyter Notebook for building a small language model from scratch using the 'TinyStories' dataset. It covers data preprocessing, BPE tokenization, binary storage, GPU memory management, and training a Transformer in PyTorch, then generates sample stories to test the model. Ideal for learning NLP and PyTorch.
This project provides a step-by-step guide in a Jupyter Notebook for building a small language model. You'll learn how to take raw text data, process it into a format a machine can understand, train a neural network, and then generate new, short stories similar to the input. This is designed for AI/ML practitioners, researchers, or students looking to understand how language models work from the ground up.
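The pipeline above (tokenize raw text into integer ids, store them compactly in a binary file, read them back for training) can be sketched with Python's standard library alone. This is a minimal illustration, not the notebook's actual code: a real run would use a BPE tokenizer such as tiktoken rather than this toy word-level vocabulary.

```python
# Sketch: text -> integer ids -> binary file -> back into memory.
# A toy word-level vocab stands in for BPE tokenization here.
import os
import tempfile
from array import array

corpus = "once upon a time there was a tiny story"
words = corpus.split()
vocab = {w: i for i, w in enumerate(sorted(set(words)))}

# Encode to uint16 ids ("H"), the common choice for small vocabularies.
ids = array("H", (vocab[w] for w in words))

# Write the ids as raw binary: 2 bytes per token, no overhead.
path = os.path.join(tempfile.mkdtemp(), "tokens.bin")
with open(path, "wb") as f:
    ids.tofile(f)

# Read them back; a training loop would slice batches from this array.
loaded = array("H")
with open(path, "rb") as f:
    loaded.fromfile(f, len(ids))

assert loaded == ids  # round-trip is lossless
```

Storing tokens as raw uint16 is the same idea larger projects implement with `np.memmap`: the training loop can then sample windows from the file without re-tokenizing the corpus each epoch.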
No commits in the last 6 months.
Use this if you are an AI/ML enthusiast or student who wants to learn the fundamental components and training process of a language model using standard hardware.
Not ideal if you need a production-ready large language model, or if you want advanced natural language processing capabilities without building the underlying model yourself.
Stars: 32
Forks: 11
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Jun 07, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ChaitanyaK77/Building-a-Small-Language-Model-SLM-"
Open to everyone — 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
Higher-rated alternatives
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
rasbt/reasoning-from-scratch
Implement a reasoning LLM in PyTorch from scratch, step by step
mindspore-lab/mindnlp
MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...
mosaicml/llm-foundry
LLM training code for Databricks foundation models
rickiepark/llm-from-scratch
Code repository for the Korean book *LLM: Learn by Building from the Ground Up* (Gilbut, 2025)