amazon-science/mezo_svrg

Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"

Experimental

This repository helps AI/ML researchers and practitioners fine-tune large language models (LLMs) more memory-efficiently. It takes a pre-trained Hugging Face LLM and a GLUE benchmark dataset, then applies variance-reduced zeroth-order optimization, which estimates gradients from forward passes rather than backpropagation, to improve the model's performance on specific tasks. It is aimed at researchers working on deep learning optimization or natural language processing.
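To give a feel for what the paper title means by a variance-reduced zeroth-order method, here is a minimal sketch in plain Python. It is not the repository's code: a toy linear-regression loss stands in for the LLM and the GLUE data, and the names `make_toy_data`, `spsa_grad`, and `zo_svrg`, along with all hyperparameter values, are hypothetical choices made for illustration. The sketch assumes a two-point (SPSA-style) gradient estimate combined with an SVRG-style anchor correction.

```python
import random


def make_toy_data(n=32, seed=1):
    """Noiseless linear-regression data standing in for a real fine-tuning task."""
    rng = random.Random(seed)
    w_true = [1.0, -2.0, 0.5]  # hypothetical target weights
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        y = sum(wi * xi for wi, xi in zip(w_true, x))
        data.append((x, y))
    return data, w_true


def loss(w, data):
    """Half mean squared error of a linear model over the given examples."""
    total = 0.0
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += (pred - y) ** 2
    return total / (2 * len(data))


def spsa_grad(w, data, z, eps=1e-3):
    """Two-point zeroth-order gradient estimate along direction z.

    Only loss evaluations (forward passes) are used, no backpropagation.
    """
    w_plus = [wi + eps * zi for wi, zi in zip(w, z)]
    w_minus = [wi - eps * zi for wi, zi in zip(w, z)]
    scale = (loss(w_plus, data) - loss(w_minus, data)) / (2 * eps)
    return [scale * zi for zi in z]


def zo_svrg(data, dim, steps=3000, q=5, batch_size=4, lr=0.01, seed=0):
    """SVRG-style loop built on zeroth-order estimates (illustrative only)."""
    rng = random.Random(seed)
    w = [0.0] * dim
    anchor, anchor_grad = list(w), [0.0] * dim
    for t in range(steps):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        if t % q == 0:
            # Every q steps: refresh the anchor with a full-batch estimate.
            anchor = list(w)
            anchor_grad = spsa_grad(anchor, data, z)
            step = anchor_grad
        else:
            # Otherwise: minibatch estimate at w, corrected by the same
            # minibatch estimate at the anchor plus the stored anchor estimate.
            batch = rng.sample(data, batch_size)
            g_w = spsa_grad(w, batch, z)
            g_a = spsa_grad(anchor, batch, z)
            step = [a - b + c for a, b, c in zip(g_w, g_a, anchor_grad)]
        w = [wi - lr * si for wi, si in zip(w, step)]
    return w


if __name__ == "__main__":
    data, w_true = make_toy_data()
    w = zo_svrg(data, dim=len(w_true))
    print("initial loss:", loss([0.0] * len(w_true), data))
    print("final loss:  ", loss(w, data))
```

Evaluating the minibatch estimate at both the current weights and the anchor with the same perturbation `z` is what lets the two noisy terms largely cancel; that cancellation is the variance-reduction idea this sketch is meant to show.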

No commits in the last 6 months.

Use this if you are an AI/ML researcher or practitioner looking to fine-tune large language models with state-of-the-art, memory-efficient, variance-reduced optimization methods to achieve better performance on NLP benchmarks.

Not ideal if you are looking for a no-code solution or a tool for general-purpose NLP tasks beyond LLM fine-tuning and optimization research.

AI/ML Research, Natural Language Processing, Large Language Models, Model Fine-tuning, Deep Learning Optimization

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 16 / 25

Community 0 / 25

Language: Python

License: Apache-2.0

Higher-rated alternatives

scaleapi/llm-engine

Scale LLM Engine public repository

AGI-Arena/MARS

The official implementation of MARS: Unleashing the Power of Variance Reduction for Training Large Models

modelscope/easydistill

a toolkit on knowledge distillation for large language models

AGI-Edgerunners/LLM-Adapters

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient...

Wang-ML-Lab/bayesian-peft

Bayesian Low-Rank Adaptation of LLMs: BLoB [NeurIPS 2024] and TFB [NeurIPS 2025]
