Mixture-of-Experts LLMs: Transformer Models

23 mixture-of-experts LLM projects are tracked. One scores above 50 (Established tier). The highest-rated is EfficientMoE/MoE-Infinity at 50/100 with 288 stars.

Get all 23 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=mixture-of-experts-llms&limit=23"
```

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
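A minimal sketch of filtering the returned projects by score, assuming the endpoint returns a JSON array of objects with `name`, `score`, and `tier` fields (the exact response schema is not documented here, so the sample payload below is hypothetical):

```python
import json

# Hypothetical sample of the API response; the real schema may differ.
payload = json.loads("""
[
  {"name": "EfficientMoE/MoE-Infinity", "score": 50, "tier": "Established"},
  {"name": "raymin0223/mixture_of_recursions", "score": 47, "tier": "Emerging"},
  {"name": "rioyokotalab/optimal-sparsity", "score": 29, "tier": "Experimental"}
]
""")

def top_projects(projects, min_score=40):
    """Keep projects at or above a score threshold, sorted high-to-low."""
    kept = [p for p in projects if p["score"] >= min_score]
    return sorted(kept, key=lambda p: p["score"], reverse=True)

for p in top_projects(payload):
    print(f'{p["score"]:>3}  {p["tier"]:<12} {p["name"]}')
```

To run this against the live endpoint, replace the sample payload with the body of the curl response above.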

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | EfficientMoE/MoE-Infinity | PyTorch library for cost-effective, fast and easy serving of MoE models. | 50 | Established |
| 2 | raymin0223/mixture_of_recursions | Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive... | 47 | Emerging |
| 3 | AviSoori1x/makeMoE | From scratch implementation of a sparse mixture of experts language model... | 46 | Emerging |
| 4 | thu-nics/MoA | [CoLM'25] The official implementation of the paper | 46 | Emerging |
| 5 | jaisidhsingh/pytorch-mixtures | One-stop solutions for Mixture of Expert modules in PyTorch. | 46 | Emerging |
| 6 | CASE-Lab-UMD/Unified-MoE-Compression | The official implementation of the paper "Towards Efficient Mixture of... | 44 | Emerging |
| 7 | MoonshotAI/MoBA | MoBA: Mixture of Block Attention for Long-Context LLMs | 44 | Emerging |
| 8 | efeslab/fiddler | [ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration | 42 | Emerging |
| 9 | FareedKhan-dev/qwen3-MoE-from-scratch | A Step-by-Step Implementation of Qwen 3 MoE Architecture from Scratch | 39 | Emerging |
| 10 | ByteDance-Seed/FlexPrefill | Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse... | 38 | Emerging |
| 11 | lliai/D2MoE | D^2-MoE: Delta Decompression for MoE-based LLMs Compression | 37 | Emerging |
| 12 | SkyworkAI/MoE-plus-plus | [ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with... | 36 | Emerging |
| 13 | dmis-lab/Monet | [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers | 35 | Emerging |
| 14 | CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths | The open-source Mixture of Depths code and the official implementation of... | 34 | Emerging |
| 15 | cmu-flame/FLAME-MoE | Official repository for FLAME-MoE: A Transparent End-to-End Research... | 32 | Emerging |
| 16 | rioyokotalab/optimal-sparsity | [ICLR 2026 Oral] Optimal Sparsity of Mixture-of-Experts Language Models for... | 29 | Experimental |
| 17 | robinzixuan/FROST | [ICLR 2026] FROST: Filtering Reasoning Outliers with Attention for Efficient... | 28 | Experimental |
| 18 | UNITES-Lab/HEXA-MoE | Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE... | 24 | Experimental |
| 19 | Spico197/MoE-SFT | 🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction... | 23 | Experimental |
| 20 | zhongshsh/MoExtend | ACL 2024 (SRW), Official Codebase of our Paper: "MoExtend: Tuning New... | 21 | Experimental |
| 21 | lorenzflow/robust-moa | This is the official repository for the paper: This is your Doge: Exploring... | 20 | Experimental |
| 22 | RoyZry98/T-REX-Pytorch | [Arxiv 2025] Official code for T-REX: Mixture-of-Rank-One-Experts with... | 20 | Experimental |
| 23 | Devanik21/HAG-MoE | HAG-MoE introduces a revolutionary approach to artificial intelligence by... | 18 | Experimental |