microsoft/Tutel

Tutel MoE: an optimized Mixture-of-Experts library, supporting GPT-OSS/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4

Score: 58 / 100 (Established)

This project helps large language model (LLM) developers and AI practitioners deploy powerful models like DeepSeek, Kimi, and Qwen3 efficiently. It takes pre-trained LLMs with optimized weights (e.g., in FP8/NVFP4/MXFP4 formats) and produces high-performance, long-context text-generation or multimodal audio services. Its end-users are AI engineers and researchers building and deploying LLM-powered applications.
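For orientation, below is a minimal single-GPU sketch of building an MoE layer with Tutel's moe_layer interface, based on the usage shown in the project's README. The dimensions are illustrative, and argument names may differ between Tutel versions.

import torch
from tutel import moe as tutel_moe

# Illustrative sizes; real models use much larger dimensions.
model_dim, hidden_size, num_local_experts = 1024, 4096, 2

moe_layer = tutel_moe.moe_layer(
    gate_type={'type': 'top', 'k': 2},      # top-2 token routing
    model_dim=model_dim,                    # embedding width
    experts={
        'type': 'ffn',                      # feed-forward experts
        'count_per_node': num_local_experts,
        'hidden_size_per_expert': hidden_size,
        'activation_fn': lambda x: torch.nn.functional.relu(x),
    },
).to('cuda')                                # the library targets GPUs

x = torch.randn(4, 512, model_dim, device='cuda')  # (batch, tokens, dim)
y = moe_layer(x)                                   # output has the same shape
print(y.shape)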

976 stars. Maintained, with 1 commit in the last 30 days.

Use this if you need to serve large Mixture-of-Experts (MoE) language models like DeepSeek or Kimi with optimal speed and efficiency on powerful GPUs from NVIDIA or AMD.

Not ideal if you are a casual user looking for a simple desktop AI tool or if you don't have access to high-performance GPU hardware (like A100, H100, or MI300).

large-language-models model-deployment ai-inference gpu-optimization deep-learning-operations
No package. No dependents.
Maintenance: 13 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 19 / 25

How are scores calculated?
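
The overall score appears to be the sum of the four 25-point subscores: 13 + 10 + 16 + 19 = 58.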

Stars: 976
Forks: 107
Language: C
License: MIT
Last pushed: Mar 06, 2026
Commits (30d): 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/microsoft/Tutel"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
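
For scripted use, here is a minimal Python sketch that fetches the same endpoint and pretty-prints the response. The response schema is not shown on this page, so no field names are assumed.

import json
import urllib.request

# Public endpoint from the curl example above; no API key needed
# for up to 100 requests/day.
URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/microsoft/Tutel"

with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)

# Pretty-print whatever JSON the API returns.
print(json.dumps(data, indent=2))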