microsoft/mup

maximal update parametrization (µP)

Score: 53 / 100 (Established)

When training large neural networks, it is often hard to find hyperparameters (such as the learning rate) that remain good as the model grows. This library lets deep learning practitioners avoid re-tuning those hyperparameters every time they scale up a model. By changing how a PyTorch network is initialized and how its parameters are updated, it ensures that hyperparameters tuned on a small model remain effective on much larger versions, saving significant time and compute.
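Concretely, adopting µP takes only a few changes to a standard PyTorch setup. The sketch below follows the pattern documented in the project's README; the MLP architecture, the specific widths, and the learning rate are illustrative.

```python
import torch
import torch.nn as nn
from mup import MuReadout, set_base_shapes, MuAdam

class MLP(nn.Module):
    def __init__(self, width, d_in=32, d_out=10):
        super().__init__()
        self.hidden = nn.Linear(d_in, width)
        # µP change #1: the output layer becomes a MuReadout instead of a
        # plain nn.Linear, so its scale is handled correctly as width grows.
        self.readout = MuReadout(width, d_out)

    def forward(self, x):
        return self.readout(torch.relu(self.hidden(x)))

# The target model, plus a "base" and a "delta" model that differ only in
# the dimensions you intend to scale (here, the hidden width).
model = MLP(width=1024)
set_base_shapes(model, MLP(width=8), delta=MLP(width=16))

# µP change #2: use mup's optimizer wrappers instead of torch.optim so
# per-parameter learning rates scale with width as µP prescribes.
optimizer = MuAdam(model.parameters(), lr=1e-3)
```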

1,689 stars. Used by 1 other package. No commits in the last 6 months. Available on PyPI.

Use this if you are developing large neural network models and want to find optimal hyperparameters once on a smaller model, then confidently transfer those settings to much larger versions without extensive re-tuning.

Not ideal if you are working with small neural networks or are not experiencing issues with hyperparameter stability when scaling up your models.
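The intended workflow behind the "use this if" case above is to sweep hyperparameters on a narrow proxy model and reuse the winners at full width. A minimal sketch, reusing the illustrative MLP class from the example above (the swept learning rate value here is made up):

```python
# Hypothetical result of a learning-rate sweep on a narrow proxy model.
best_lr = 3e-4

for width in (256, 4096):
    model = MLP(width=width)
    # The same base/delta shapes are used regardless of the target width.
    set_base_shapes(model, MLP(width=8), delta=MLP(width=16))
    # Under µP, the proxy's best learning rate should remain near-optimal
    # as width grows, so no re-tuning is needed at the larger width.
    optimizer = MuAdam(model.parameters(), lr=best_lr)
    # ... train as usual ...
```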

deep-learning neural-network-training hyperparameter-tuning model-scaling machine-learning-engineering
Stale for 6 months
Maintenance: 0 / 25
Adoption: 11 / 25
Maturity: 25 / 25
Community: 17 / 25

Stars: 1,689
Forks: 105
Language: Jupyter Notebook
License: MIT
Last pushed: Jul 17, 2024
Commits (30d): 0
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/microsoft/mup"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
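
The same endpoint can be consumed programmatically. A minimal sketch in Python, assuming the endpoint returns JSON (the response schema is not documented here, so the body is printed as-is rather than guessing at fields):

```python
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/transformers/microsoft/mup"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
# Assumption: the response body is JSON.
print(resp.json())
```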