CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths
The open-source Mixture of Depths code and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for Enabling Dynamic Depth in Transformers" (EMNLP 2025).
This project helps machine learning engineers and researchers optimize large language models (LLMs) for efficiency. It takes an existing transformer model and training data as input and produces a more efficient version of the model that uses fewer computational resources during inference. It is aimed at practitioners who deploy and fine-tune LLMs.
Use this if you need to reduce the computational cost of running your large language models without significantly sacrificing performance.
Not ideal if you are looking for a completely new model architecture or if your primary concern is improving baseline model accuracy rather than efficiency.
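To make the dynamic-depth idea concrete, here is a minimal NumPy-only sketch of a Mixture-of-Depths-style layer: a learned router scores each token, only the top fraction of tokens pass through the expensive block, and the rest skip it via the residual path. This is an illustration of the general technique, not this repository's implementation; the router shape, the `capacity` parameter, and the sigmoid gating are assumptions.

```python
import numpy as np

def mod_layer(x, router_w, block, capacity=0.5):
    """Mixture-of-Depths-style layer (illustrative, not the repo's API).

    A linear router scores each token; only the top `capacity` fraction
    is routed through `block`, the rest take the residual path unchanged.
    """
    n, d = x.shape
    scores = x @ router_w              # (n,) router logits; router_w: (d,)
    k = max(1, int(n * capacity))      # number of tokens that get full depth
    top = np.argsort(scores)[-k:]      # indices of tokens routed into the block
    out = x.copy()
    # Gate the block output by the router score (sigmoid) so the router
    # receives gradient signal during training.
    gate = 1.0 / (1.0 + np.exp(-scores[top]))
    out[top] = x[top] + gate[:, None] * block(x[top])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))            # 8 tokens, hidden size 4
w = rng.normal(size=4)                 # router weights (assumed linear router)
y = mod_layer(x, w, lambda h: np.tanh(h), capacity=0.5)
```

With `capacity=0.5`, only half of the tokens incur the cost of the block, which is the source of the inference savings; router-tuning, per the paper's title, fine-tunes only the router weights rather than the full model.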
Stars
28
Forks
3
Language
Python
License
—
Category
transformers
Last pushed
Feb 28, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
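The curl call above can also be issued from Python with the standard library. The URL is taken from the page; the response schema and the `X-API-Key` header name are assumptions, so treat this as a sketch rather than official client code.

```python
import json
import urllib.request
from urllib.parse import quote

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, owner, repo):
    """Build the quality-endpoint URL shown on the page for a given repo."""
    return f"{BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"

def fetch_quality(category, owner, repo, api_key=None):
    """GET the quality record as a dict; a key raises the 100/day limit
    to 1,000/day. The `X-API-Key` header name is an assumption."""
    req = urllib.request.Request(quality_url(category, owner, repo))
    if api_key:
        req.add_header("X-API-Key", api_key)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

url = quality_url("transformers", "CASE-Lab-UMD",
                  "Router-Tuning-Mixture-of-Depths")
```

Calling `fetch_quality(...)` performs the same request as the curl example; without a key it counts against the 100-requests/day allowance.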
Higher-rated alternatives
EfficientMoE/MoE-Infinity
PyTorch library for cost-effective, fast and easy serving of MoE models.
raymin0223/mixture_of_recursions
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation...
AviSoori1x/makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej...
thu-nics/MoA
[CoLM'25] The official implementation of the paper
jaisidhsingh/pytorch-mixtures
One-stop solutions for Mixture of Expert modules in PyTorch.