HariomJangra/project-lumen
A 128M-parameter language model built from scratch for learning how large language models work.
Project Lumen helps AI researchers and developers understand how modern large language models work by providing a fully built, fully documented example. It takes raw text, processes it, and trains a language model capable of generating text or following instructions, so users can explore every step of development.
Use this if you are an AI researcher, student, or developer who wants to learn the internal mechanics of building a large language model from scratch.
Not ideal if you need an off-the-shelf, production-ready language model for immediate deployment in an application.
Stars: 8
Forks: 1
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Oct 25, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/HariomJangra/project-lumen"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
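The curl call above can also be scripted. A minimal Python sketch, assuming only that the endpoint returns a JSON body (the response schema is not documented in this listing, so the result is returned as a plain dict):

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a given GitHub owner/repo pair."""
    return f"{BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality record; assumes the endpoint returns JSON."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Network call; counts against the 100 requests/day anonymous limit.
    print(fetch_quality("HariomJangra", "project-lumen"))
```

Authenticated requests (the 1,000/day tier) are omitted here because the key-passing mechanism is not described in this listing.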
Higher-rated alternatives
NX-AI/xlstm: Official repository of the xLSTM.
sinanuozdemir/oreilly-hands-on-gpt-llm: Mastering the Art of Scalable and Efficient AI Model Deployment
DashyDashOrg/pandas-llm: Pandas-LLM
wxhcore/bumblecore: An LLM training framework built from the ground up, featuring a custom BumbleBee architecture...
MiniMax-AI/MiniMax-01: The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model &...