ilyalasy/moe-routing
Analysis of token routing for different implementations of Mixture of Experts
This tool helps researchers and AI practitioners understand how different Mixture of Experts (MoE) large language models (LLMs) distribute input tokens across their specialized 'expert' subnetworks. Given a RedPajama dataset, it produces routing data and visualizations showing which experts receive which tokens. It is aimed primarily at people researching or working with the architecture and efficiency of MoE LLMs; a minimal sketch of this kind of routing analysis appears after the usage notes below.
No commits in the last 6 months.
Use this if you are developing or studying Mixture of Experts LLMs and need to analyze token routing patterns to optimize performance or understand architectural behavior.
Not ideal if you are looking for a tool to train, fine-tune, or simply use an LLM for text generation or other end-user applications.
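For context, the core of such an analysis is tallying which experts a router assigns each token to. Below is a minimal, hypothetical sketch (not code from this repository) of top-k softmax routing with a randomly initialized linear router in PyTorch; all sizes and the top_k value are illustrative assumptions.

```python
# Minimal sketch of MoE token routing analysis (illustrative, not this repo's code).
# Assumes a single linear router, softmax gating, and top-k expert selection.
import torch

torch.manual_seed(0)

num_tokens, hidden_dim, num_experts, top_k = 1024, 64, 8, 2  # illustrative sizes

# Stand-in for the hidden states entering one MoE layer.
hidden_states = torch.randn(num_tokens, hidden_dim)

# A router is typically a linear map from hidden states to per-expert logits.
router = torch.nn.Linear(hidden_dim, num_experts, bias=False)

with torch.no_grad():
    logits = router(hidden_states)                      # (num_tokens, num_experts)
    probs = torch.softmax(logits, dim=-1)
    top_probs, top_experts = probs.topk(top_k, dim=-1)  # each token's chosen experts

# Per-expert load: how many token slots each expert receives.
load = torch.bincount(top_experts.flatten(), minlength=num_experts)
print("tokens per expert:", load.tolist())
print("load fraction:", (load.float() / load.sum()).tolist())
```

In a real analysis the logits would come from a trained MoE checkpoint (for example, captured with forward hooks) rather than a random router, and the per-expert counts would typically be plotted per layer.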
Stars: 10
Forks: —
Language: Jupyter Notebook
License: —
Category: ml-frameworks
Last pushed: Mar 22, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ilyalasy/moe-routing"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
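If you prefer Python over curl, a minimal equivalent using requests might look like the following; the response schema is not documented here, so the sketch just prints the parsed JSON.

```python
# Minimal sketch: fetch the same repository data in Python instead of curl.
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ilyalasy/moe-routing"
resp = requests.get(url, timeout=10)  # no API key needed at the free tier
resp.raise_for_status()
print(resp.json())                    # schema undocumented here; inspect the output
```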
Higher-rated alternatives
AdaptiveMotorControlLab/CEBRA
Learnable latent embeddings for joint behavioral and neural analysis - Official implementation of CEBRA
theolepage/sslsv
Toolkit for training and evaluating Self-Supervised Learning (SSL) frameworks for Speaker...
PaddlePaddle/PASSL
PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as Vision...
YGZWQZD/LAMDA-SSL
30 Semi-Supervised Learning Algorithms
ModSSC/ModSSC
ModSSC: A Modular Framework for Semi-Supervised Classification