casinca/LLM-quest
Verbose implementations of LLM architectures, techniques, and research papers from scratch. DeepSeek, Qwen3..., RLHF, MoE, Multimodal...
This project offers detailed, from-scratch implementations of various large language model (LLM) architectures and advanced techniques. It provides a transparent view of how complex LLMs like DeepSeek, Qwen3, and Gemma are built, along with methods for alignment (like RLHF) and multimodal capabilities. It is aimed at AI researchers, machine learning engineers, and students who want to understand, experiment with, and learn the mechanics behind state-of-the-art LLMs.
Use this if you are an AI researcher or machine learning engineer looking to deeply understand, reverse-engineer, and experiment with the internal workings of modern LLMs and their underlying techniques from first principles.
Not ideal if you are looking for an out-of-the-box LLM to use in an application, or if you need a high-level library for rapid prototyping without delving into the architectural details.
Stars: 12
Forks: 1
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/casinca/LLM-quest"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
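The same request can be made from Python. The sketch below uses only the standard library and rebuilds the URL shown in the curl command. The helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page, so treat the parsing step as illustrative.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay inside it.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and parse the body.

    Assumption: the response is JSON. The page does not document the
    response format, so json.load may need to be replaced.
    """
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("casinca", "LLM-quest")` issues the request. Keyless access is limited to 100 requests/day, so a script that polls many repositories may want to cache results locally.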
Higher-rated alternatives
Goekdeniz-Guelmez/mlx-lm-lora
Train Large Language Models on MLX.
uber-research/PPLM
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
VHellendoorn/Code-LMs
Guide to using pre-trained large language models of source code
ssbuild/chatglm_finetuning
ChatGLM-6B fine-tuning and Alpaca fine-tuning
jarobyte91/pytorch_beam_search
A lightweight implementation of Beam Search for sequence models in PyTorch.