shivendrra/SmallLanguageModel

An LLM cookbook for building your own model from scratch, all the way from gathering data to training.

/ 100

Emerging

This repository helps machine learning engineers and researchers build their own large language models (LLMs) from the ground up. You provide raw text data, and it guides you through collecting and processing that data, then training a custom BERT, GPT, or Seq-2-Seq model. The output is a functional language model tailored to your specific data and needs.

168 stars. No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher who wants to understand and build a custom large language model from scratch, rather than fine-tuning a pre-existing one.

Not ideal if you're looking for a simple tool to fine-tune an existing LLM or need a ready-to-use solution without delving into model architecture and training.

natural-language-processing machine-learning-engineering language-model-development deep-learning-research

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 16 / 25


Stars

168

Forks

Language

Jupyter Notebook

License

MIT

Higher-rated alternatives

AI-Hypercomputer/maxtext

A simple, performant and scalable Jax LLM!

rasbt/reasoning-from-scratch

Implement a reasoning LLM in PyTorch from scratch, step by step

mindspore-lab/mindnlp

MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...

mosaicml/llm-foundry

LLM training code for Databricks foundation models

rickiepark/llm-from-scratch

Code repository for the book <밑바닥부터 만들면서 공부하는 LLM> ("Learning LLMs by Building Them from Scratch"; Gilbut, 2025)
