kyegomez/MuonClip

This repository is an open-source implementation of the MuonClip optimization strategy from Moonshot AI's KIMI K2 model.

Score: 27 / 100 (Experimental)

This project provides an optimization method for training large language models. Given your model's current parameters, gradients, and attention logit statistics, it produces updated, more stable parameters. Data scientists and machine learning engineers working on transformer-based models will find it useful for improving training efficiency and robustness.

Use this if you are training large transformer models and need an optimizer that offers improved stability and token-efficient updates, especially when attention logits are at risk of exploding.

Not ideal if you are working with non-transformer models, small datasets, or if you prefer simpler optimization algorithms like Adam or SGD.
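The stabilization idea behind MuonClip, as described by Moonshot AI, is a QK-clip step: after each optimizer update, if the largest observed attention logit exceeds a threshold, the query and key projection weights are rescaled so their product shrinks back under it. Below is a minimal, dependency-free sketch of that rescaling step; the function name, threshold value, and plain nested-list "matrices" are illustrative assumptions, not this repository's actual API.

```python
import math

def qk_clip(w_q, w_k, max_logit, tau=100.0):
    """Illustrative QK-clip step (not the repo's real API).

    If the peak attention logit seen this step exceeds the threshold
    tau, scale both weight matrices by sqrt(tau / max_logit), so the
    q.k products (and hence future logits) shrink by tau / max_logit.
    """
    if max_logit > tau:
        gamma = math.sqrt(tau / max_logit)  # split the shrink factor evenly
        w_q = [[v * gamma for v in row] for row in w_q]
        w_k = [[v * gamma for v in row] for row in w_k]
    return w_q, w_k
```

For example, with a peak logit of 400 against a threshold of 100, both matrices are scaled by 0.5, cutting the logit magnitude by a factor of 4; a peak logit below the threshold leaves the weights untouched.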

Tags: Large Language Models, Transformer Training, Deep Learning Optimization, AI Model Stability, Natural Language Processing
No package. No dependents.
Maintenance: 6 / 25
Adoption: 6 / 25
Maturity: 15 / 25
Community: 0 / 25


Stars: 17
Forks:
Language:
License: Apache-2.0
Last pushed: Nov 07, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kyegomez/MuonClip"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.