ScalingOpt/SGG

[ACL 2025 Main] Taming LLMs by Scaling Learning Rates with Gradient Grouping

Score: 14 / 100 (Experimental)

This project helps machine learning engineers and researchers train large language models (LLMs) and other large models efficiently. It wraps your existing adaptive optimizer, such as AdamW, and rescales its learning rates by grouping gradient statistics during training. The result is a more stable and faster training process, leading to better model performance.

No commits in the last 6 months.

Use this if you are training large language models and want to improve training stability, accelerate convergence, and enhance compatibility with parameter-efficient fine-tuning techniques.

Not ideal if you are working with smaller models where basic adaptive optimizers already provide satisfactory training performance and stability.
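
As a rough illustration of the wrapper idea described above, here is a minimal, self-contained sketch. It is not the SGG implementation from this repository: the GroupedLRScaler class, its quantile-based grouping of gradient norms, and the scaling heuristic are assumptions standing in for the paper's gradient-statistic clustering and group-wise learning-rate scaling.

# Conceptual sketch only: wrap AdamW and rescale per-tensor learning rates
# by grouping parameters according to their current gradient statistics.
# Class name, grouping rule, and scaling heuristic are illustrative assumptions.
import torch
from torch import nn

class GroupedLRScaler:
    def __init__(self, params, base_lr=1e-3, num_groups=3, weight_decay=0.01):
        self.params = [p for p in params if p.requires_grad]
        self.base_lr = base_lr
        self.num_groups = num_groups
        # One param group per tensor so each tensor can receive its own lr.
        self.opt = torch.optim.AdamW(
            [{"params": [p], "lr": base_lr} for p in self.params],
            weight_decay=weight_decay,
        )

    @torch.no_grad()
    def step(self):
        # 1. Collect one gradient statistic per tensor (log of the gradient norm).
        stats = torch.tensor(
            [p.grad.norm().clamp_min(1e-12).log().item()
             if p.grad is not None else 0.0
             for p in self.params]
        )
        # 2. Group tensors into quantile buckets of that statistic
        #    (a simple stand-in for the clustering used in the paper).
        edges = torch.quantile(stats, torch.linspace(0, 1, self.num_groups + 1))
        group_ids = torch.bucketize(stats, edges[1:-1])
        # 3. Give each group a learning-rate multiplier from the gap between
        #    the global mean statistic and the group mean: groups with larger
        #    gradients get a smaller lr, and vice versa.
        global_mean = stats.mean()
        for gid, param_group in zip(group_ids.tolist(), self.opt.param_groups):
            group_mean = stats[group_ids == gid].mean()
            scale = torch.exp(global_mean - group_mean).clamp(0.1, 10.0).item()
            param_group["lr"] = self.base_lr * scale
        # 4. Delegate the actual parameter update to the wrapped AdamW.
        self.opt.step()

    def zero_grad(self, set_to_none=True):
        self.opt.zero_grad(set_to_none=set_to_none)

# Usage: behaves like a normal optimizer inside a training loop.
model = nn.Linear(16, 4)
optimizer = GroupedLRScaler(model.parameters(), base_lr=1e-3)
x, y = torch.randn(8, 16), torch.randn(8, 4)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()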

large-language-model-training deep-learning-optimization model-fine-tuning neural-network-training
No License · Stale (6 months) · No Package · No Dependents

Maintenance: 2 / 25
Adoption: 5 / 25
Maturity: 7 / 25
Community: 0 / 25


Stars: 9
Forks:
Language: JavaScript
License: None
Last pushed: Jul 15, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ScalingOpt/SGG"

The API is open to everyone at 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
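
For programmatic access from Python, a minimal sketch is shown below. It uses only the standard library and the endpoint URL from the curl example above; the shape of the JSON response is not documented here, so the snippet just pretty-prints whatever is returned.

# Fetch the same quality data shown on this page (sketch; response fields
# are not documented here, so we simply pretty-print the JSON).
import json
import urllib.request

url = "https://pt-edge.onrender.com/api/v1/quality/transformers/ScalingOpt/SGG"
with urllib.request.urlopen(url, timeout=30) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))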