YichenZW/llm-arch-table
Living comparison table of LLM architectural choices (norm, attention, MoE, positional encoding, and more) from the Original Transformer (2017) to frontier models (2026). Based on Harm de Vries's figure, Sebastian Raschka's Big LLM Architecture Comparison, and Tatsunori Hashimoto's Stanford CS 336 lecture.
Stars: —
Forks: —
Language: —
License: —
Category: —
Last pushed: Mar 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/YichenZW/llm-arch-table"
Open to everyone: 100 requests/day with no API key. A free key raises the limit to 1,000 requests/day.
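The same endpoint can be queried from Python. Below is a minimal sketch using only the standard library; the endpoint URL comes from the curl command above, but the shape of the JSON response is not documented on this page, so the code returns the decoded payload as-is without assuming specific fields:

```python
import json
import urllib.request

# Base endpoint taken from the curl example above (free tier, no key needed).
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE_URL}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str):
    """GET the quality data for one repository and decode the JSON body.

    The response schema is not documented here, so no specific fields
    are assumed; callers get the raw decoded payload.
    """
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)
```

Calling `fetch_quality("YichenZW", "llm-arch-table")` issues the same request as the curl command shown above.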
Higher-rated alternatives
gustavecortal/gpt-j-fine-tuning-example
Fine-tuning 6-Billion GPT-J (& other models) with LoRA and 8-bit compression
Ebimsv/LLM-Lab
Pretraining and Finetuning Language Model
msmrexe/pytorch-lora-from-scratch
A from-scratch PyTorch implementation of Low-Rank Adaptation (LoRA) to efficiently fine-tune...
linhaowei1/Fine-tuning-Scaling-Law
🌹[ICML 2024] Selecting Large Language Model to Fine-tune via Rectified Scaling Law
aamanlamba/phi3-tune-payments
Bidirectional fine-tuning of Microsoft's Phi-3-Mini model for payment transaction processing...