CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths
The open-source Mixture of Depths code and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for Enabling Dynamic Depth in Transformers" (EMNLP 2025).
This project helps machine learning engineers and researchers optimize large language models (LLMs) for efficiency. It takes an existing transformer model and training data as input and produces a more efficient version of the model that uses fewer computational resources during inference. It is aimed at practitioners who deploy and fine-tune LLMs.
Use this if you need to reduce the computational cost of running your large language models without significantly sacrificing performance.
Not ideal if you are looking for a completely new model architecture or if your primary concern is improving baseline model accuracy rather than efficiency.
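To make the dynamic-depth idea concrete, here is a minimal NumPy-only sketch of a Mixture-of-Depths-style layer: a learned router scores each token, only the top fraction of tokens pass through the expensive block, and the rest skip it via the residual path. This is an illustration of the general technique, not this repository's implementation; the router shape, the `capacity` parameter, and the sigmoid gating are assumptions.

```python
import numpy as np

def mod_layer(x, router_w, block, capacity=0.5):
    """Mixture-of-Depths-style layer (illustrative, not the repo's API).

    A linear router scores each token; only the top `capacity` fraction
    is routed through `block`, the rest take the residual path unchanged.
    """
    n, d = x.shape
    scores = x @ router_w              # (n,) router logits; router_w: (d,)
    k = max(1, int(n * capacity))      # number of tokens that get full depth
    top = np.argsort(scores)[-k:]      # indices of tokens routed into the block
    out = x.copy()
    # Gate the block output by the router score (sigmoid) so the router
    # receives gradient signal during training.
    gate = 1.0 / (1.0 + np.exp(-scores[top]))
    out[top] = x[top] + gate[:, None] * block(x[top])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))            # 8 tokens, hidden size 4
w = rng.normal(size=4)                 # router weights (assumed linear router)
y = mod_layer(x, w, lambda h: np.tanh(h), capacity=0.5)
```

With `capacity=0.5`, only half of the tokens incur the cost of the block, which is the source of the inference savings; router-tuning, per the paper's title, fine-tunes only the router weights rather than the full model.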
Stars
28
Forks
3
Language
Python
License
—
Category
transformers
Last pushed
Feb 28, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
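The curl call above can also be issued from Python with the standard library. The URL is taken from the page; the response schema and the `X-API-Key` header name are assumptions, so treat this as a sketch rather than official client code.

```python
import json
import urllib.request
from urllib.parse import quote

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, owner, repo):
    """Build the quality-endpoint URL shown on the page for a given repo."""
    return f"{BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"

def fetch_quality(category, owner, repo, api_key=None):
    """GET the quality record as a dict; a key raises the 100/day limit
    to 1,000/day. The `X-API-Key` header name is an assumption."""
    req = urllib.request.Request(quality_url(category, owner, repo))
    if api_key:
        req.add_header("X-API-Key", api_key)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

url = quality_url("transformers", "CASE-Lab-UMD",
                  "Router-Tuning-Mixture-of-Depths")
```

Calling `fetch_quality(...)` performs the same request as the curl example; without a key it counts against the 100-requests/day allowance.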
Higher-rated alternatives
EfficientMoE/MoE-Infinity
PyTorch library for cost-effective, fast and easy serving of MoE models.
raymin0223/mixture_of_recursions
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation...
AviSoori1x/makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej...
thu-nics/MoA
[CoLM'25] The official implementation of the paper
jaisidhsingh/pytorch-mixtures
One-stop solutions for Mixture of Expert modules in PyTorch.