LLMs-from-scratch and scratch-llm

These are complements rather than competitors. LLMs-from-scratch is a comprehensive, step-by-step teaching resource for building transformer-based LLMs, covering architecture, training, and inference. scratch-llm is a lightweight, ground-up implementation that replicates Llama 2's design for educational purposes. Together they let learners study both a general approach and a specific modern architecture.

LLMs-from-scratch: 66 (Established)
Maintenance 17/25, Adoption 10/25, Maturity 16/25, Community 23/25
Stars: 87,892
Forks: 13,408
Downloads:
Commits (30d): 8
Language: Jupyter Notebook
License:
No Package, No Dependents

scratch-llm: 40 (Emerging)
Maintenance 0/25, Adoption 7/25, Maturity 16/25, Community 17/25
Stars: 38
Forks: 9
Downloads:
Commits (30d): 0
Language: Python
License: MIT
Stale 6m, No Package, No Dependents

About LLMs-from-scratch

rasbt/LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

This project provides the practical code and guidance to build your own custom GPT-like large language model (LLM) from the ground up. You'll learn how to take raw text data, process it, and train a functional LLM that can generate text or follow instructions. This is designed for AI practitioners, machine learning engineers, and researchers who want to deeply understand and implement LLMs.

AI development, natural language processing, machine learning engineering, deep learning research, custom model training
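
The "raw text in, training data out" step described above can be illustrated with a minimal sketch. It is not taken from the repository: the character-level vocabulary is a simplifying assumption standing in for a real tokenizer, and the helper names (build_vocab, encode, make_training_pairs) are hypothetical. It shows the usual next-token setup, where each target sequence is the input sequence shifted by one position.

```python
def build_vocab(text):
    """Map each distinct character to an integer id (a stand-in for a real tokenizer)."""
    chars = sorted(set(text))
    return {ch: i for i, ch in enumerate(chars)}


def encode(text, vocab):
    """Convert a string into a list of integer token ids."""
    return [vocab[ch] for ch in text]


def make_training_pairs(token_ids, context_length, stride):
    """Slide a window over the token ids; targets are the inputs shifted by one."""
    pairs = []
    for start in range(0, len(token_ids) - context_length, stride):
        inputs = token_ids[start:start + context_length]
        targets = token_ids[start + 1:start + context_length + 1]
        pairs.append((inputs, targets))
    return pairs


text = "hello world"
vocab = build_vocab(text)
ids = encode(text, vocab)
pairs = make_training_pairs(ids, context_length=4, stride=4)

for inputs, targets in pairs:
    print(inputs, "->", targets)
```

A model trained on such pairs learns to predict each next token from the tokens before it. Setting the stride equal to the context length, as here, avoids overlapping windows, while a smaller stride yields more (overlapping) examples from the same text.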

About scratch-llm

clabrugere/scratch-llm

Implements an LLM similar to Meta's Llama 2 from the ground up in PyTorch, for educational purposes.

This project offers a clear, basic implementation of a large language model like Meta's Llama, built using PyTorch. It helps developers and researchers understand how these models work internally by showing the mechanics of components like positional encoding and attention. The project takes text data, processes it, and demonstrates the core computational steps that lead to a trained language model.

deep-learning-education natural-language-processing machine-learning-engineering neural-network-architecture LLM-development
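
To make the attention mechanics mentioned above concrete, here is a minimal sketch of causal scaled dot-product attention in plain Python. It is not the repository's code: it assumes a single head, leaves out the learned query/key/value projections and any positional encoding, and the function names are hypothetical. The point is only to show how each position mixes the value vectors of itself and earlier positions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position i only attends to positions <= i."""
    dim = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        # Scores against the current and earlier positions only (the causal mask).
        scores = [
            sum(qc * kc for qc, kc in zip(q, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy input: three positions with two-dimensional vectors, used as Q, K, and V alike.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = causal_attention(x, x, x)

for row in result:
    print(row)
```

The first position can only see itself, so its output equals its own value vector. Later positions return a weighted average of the value vectors seen so far. A PyTorch implementation would typically express the same computation with batched matrix multiplications and a triangular mask.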

Scores updated daily from GitHub, PyPI, and npm data.