zhuhanqing/APOLLO

APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Outstanding Paper Honorable Mention

Quality score: 42 / 100 (Emerging)

This project offers a memory-efficient optimizer for training and fine-tuning large language models (LLMs). It lets machine learning engineers and researchers achieve AdamW-level training quality with significantly reduced memory consumption: you swap it into your training loop in place of a standard optimizer, and it approximates AdamW's update behavior while keeping optimizer state at an SGD-like memory footprint, freeing GPU memory for larger models or batch sizes.
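A minimal sketch of where such an optimizer slots in: the snippet below is a standard PyTorch training step with AdamW, and the commented lines show the intended drop-in swap. The apollo_torch import path, the APOLLOAdamW class name, and its arguments are assumptions for illustration, not the repository's confirmed API; check the project's README for the actual interface.

import torch
import torch.nn as nn

# from apollo_torch import APOLLOAdamW  # hypothetical import, see repo README

model = nn.TransformerEncoderLayer(d_model=512, nhead=8)

# Baseline optimizer; the memory-efficient swap keeps the same hyperparameters
# but stores optimizer state at an SGD-like memory cost.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# optimizer = APOLLOAdamW(model.parameters(), lr=1e-4)  # hypothetical swap

inputs = torch.randn(16, 32, 512)   # dummy (seq, batch, d_model) batch
loss = model(inputs).pow(2).mean()  # dummy loss for illustration
loss.backward()
optimizer.step()
optimizer.zero_grad()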


Use this if you are pre-training or fine-tuning large language models (LLMs) and are constrained by GPU memory but still need AdamW-level performance.

Not ideal if you are working with smaller models that don't face memory limitations during training, or if you require an optimizer for non-LLM machine learning tasks.

large-language-models LLM-training deep-learning-optimization model-fine-tuning GPU-memory-management
No package · No dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 10 / 25


Stars: 271
Forks: 13
Language: Python
License:
Last pushed: Nov 29, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/zhuhanqing/APOLLO"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
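For programmatic access, a minimal Python sketch using only the standard library is shown below. It targets the endpoint given above; the shape of the response is not documented here, so the sketch assumes a JSON body and simply prints whatever comes back.

import json
import urllib.request

# Public endpoint from above: 100 requests/day without a key.
URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/zhuhanqing/APOLLO"

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)  # assumes a JSON response body

print(json.dumps(data, indent=2))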