AGI-Arena/MARS
The official implementation of "MARS: Unleashing the Power of Variance Reduction for Training Large Models"
MARS is an optimizer for training large-scale deep learning models, particularly large language models such as GPT-2. Given a model architecture and training data, it produces a trained model that converges faster and reaches a lower validation loss than traditional optimizers. It is intended for machine learning engineers and researchers who are pretraining or fine-tuning large models.
716 stars. Actively maintained with 2 commits in the last 30 days.
Use this if you are a machine learning engineer or researcher looking to significantly improve the efficiency and final performance of your large model training, especially for natural language processing tasks.
Not ideal if you are working with smaller models or simpler machine learning tasks where traditional optimizers already perform adequately, as the overhead might not be justified.
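For intuition, here is a toy sketch of the variance-reduction idea behind optimizers like MARS: the current stochastic gradient is corrected using the previous gradient before the parameter update. This is an illustrative example only (the function names, constants, and exact update rule below are not taken from the repo), demonstrated on a deterministic 1-D quadratic:

```python
# Toy sketch of a variance-reduced gradient step (STORM-style gradient
# correction). This is NOT the MARS repo's actual API or exact update rule.
def variance_reduced_descent(grad_fn, x0, lr=0.1, gamma=0.5, steps=100):
    """Gradient descent where each step uses g_t + gamma * (g_t - g_{t-1})."""
    x = x0
    prev_grad = None
    for _ in range(steps):
        g = grad_fn(x)
        # Correction term: amplify the change between consecutive gradients,
        # which reduces the variance of the update direction in the
        # stochastic setting.
        c = g if prev_grad is None else g + gamma * (g - prev_grad)
        prev_grad = g
        x -= lr * c
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_star = variance_reduced_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

In the deterministic case above the correction term simply accelerates convergence toward the minimizer at x = 3; the variance-reduction benefit shows up when `grad_fn` returns noisy minibatch gradients.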
Stars: 716
Forks: 49
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 04, 2026
Commits (30d): 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/AGI-Arena/MARS"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
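If you prefer calling the endpoint from Python rather than curl, here is a minimal sketch using only the standard library. The endpoint URL is taken from the curl example above; the shape of the JSON response is an assumption, so inspect it before relying on specific field names:

```python
# Sketch: query the quality API from Python (standard library only).
# The response's JSON field names are assumptions -- inspect the actual
# payload before depending on them.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str,
                  timeout: float = 10.0) -> dict:
    """Fetch and decode the quality record for one repository."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# Usage (performs a network request, counted against the daily limit):
# data = fetch_quality("transformers", "AGI-Arena", "MARS")
```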
Related repositories
scaleapi/llm-engine
Scale LLM Engine public repository
modelscope/easydistill
a toolkit on knowledge distillation for large language models
AGI-Edgerunners/LLM-Adapters
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient...
Wang-ML-Lab/bayesian-peft
Bayesian Low-Rank Adaptation of LLMs: BLoB [NeurIPS 2024] and TFB [NeurIPS 2025]
sangmichaelxie/doremi
Pytorch implementation of DoReMi, a method for optimizing the data mixture weights in language...