YichenZW/llm-arch-table
Living comparison table of LLM architectural choices (norm, attention, MoE, positional encoding, and more) from the Original Transformer (2017) to frontier models (2026). Based on Harm de Vries's figure, Sebastian Raschka's Big LLM Architecture Comparison, and Tatsunori Hashimoto's Stanford CS 336 lecture.
Stars: —
Forks: —
Language: —
License: —
Category: —
Last pushed: Mar 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/YichenZW/llm-arch-table"
Open to everyone: 100 requests/day with no API key. A free key raises the limit to 1,000 requests/day.
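The same endpoint can be queried from Python. Below is a minimal sketch using only the standard library; the endpoint URL comes from the curl command above, but the shape of the JSON response is not documented on this page, so the code returns the decoded payload as-is without assuming specific fields:

```python
import json
import urllib.request

# Base endpoint taken from the curl example above (free tier, no key needed).
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE_URL}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str):
    """GET the quality data for one repository and decode the JSON body.

    The response schema is not documented here, so no specific fields
    are assumed; callers get the raw decoded payload.
    """
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)
```

Calling `fetch_quality("YichenZW", "llm-arch-table")` issues the same request as the curl command shown above.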
Higher-rated alternatives
gustavecortal/gpt-j-fine-tuning-example
Fine-tuning 6-Billion GPT-J (& other models) with LoRA and 8-bit compression
Ebimsv/LLM-Lab
Pretraining and Finetuning Language Model
msmrexe/pytorch-lora-from-scratch
A from-scratch PyTorch implementation of Low-Rank Adaptation (LoRA) to efficiently fine-tune...
linhaowei1/Fine-tuning-Scaling-Law
🌹[ICML 2024] Selecting Large Language Model to Fine-tune via Rectified Scaling Law
aamanlamba/phi3-tune-payments
Bidirectional fine-tuning of Microsoft's Phi-3-Mini model for payment transaction processing...