cliang1453/SAGE

No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)

Score: 29 / 100 (Experimental)

This project helps machine learning engineers and researchers fine-tune large transformer models more efficiently. By applying a sensitivity-guided adaptive learning rate during fine-tuning, it takes a pre-trained transformer model and training data (such as the GLUE benchmark tasks) and produces a fine-tuned model that converges faster and can achieve better task performance. It is aimed at practitioners working with state-of-the-art natural language processing or related deep learning models.
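
To give a rough feel for the core idea, here is a minimal toy sketch of a sensitivity-guided per-parameter learning rate: sensitivity is approximated as |theta * grad| (the first-order loss change from zeroing a weight), smoothed with an EMA, and low-sensitivity weights receive larger steps. This is an illustrative simplification, not the authors' SAGE optimizer; the class name, hyperparameters, and clamp bound are invented here, and the paper's actual update rule differs.

    import torch

    class SensitivityGuidedSGD:
        """Toy optimizer: per-weight LR scaled by inverse smoothed sensitivity."""

        def __init__(self, params, base_lr=2e-5, beta=0.9, eps=1e-8):
            self.params = list(params)
            self.base_lr = base_lr
            self.beta = beta  # EMA factor for the sensitivity estimate
            self.eps = eps
            self.sens = [torch.zeros_like(p) for p in self.params]

        @torch.no_grad()
        def step(self):
            for p, s in zip(self.params, self.sens):
                if p.grad is None:
                    continue
                # Sensitivity ~ first-order estimate of the loss change
                # from zeroing this weight: |theta * grad|.
                s.mul_(self.beta).add_((p * p.grad).abs(), alpha=1.0 - self.beta)
                # Low-sensitivity weights get a larger step ("no parameters
                # left behind"); the clamp bound is arbitrary, chosen only
                # to keep this toy update numerically stable.
                scale = (s.mean() / (s + self.eps)).clamp(max=10.0)
                p.add_(p.grad * scale, alpha=-self.base_lr)

In the paper this scaling is combined with an Adam-style optimizer; the sketch isolates only the sensitivity-based rescaling.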

No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher looking to fine-tune large transformer models like BERT or RoBERTa with improved training speed and performance.

Not ideal if you are looking for a general-purpose machine learning library or if you are not working specifically with large transformer architectures.

natural-language-processing deep-learning transformer-models model-training machine-learning-research
Status: Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 7 / 25
Maturity: 16 / 25
Community: 6 / 25

Stars: 29
Forks: 2
Language: Python
License: MIT
Last pushed: Feb 09, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/cliang1453/SAGE"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
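
For scripting, the same endpoint can be called from Python. A minimal sketch using requests, assuming the endpoint returns JSON as implied by the curl example above (the response schema is not documented on this page, so the payload is printed as-is):

    import requests

    # Same quality-data endpoint as the curl example above.
    url = "https://pt-edge.onrender.com/api/v1/quality/transformers/cliang1453/SAGE"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()

    # Field names are not documented here, so just dump the payload.
    print(resp.json())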