sail-sg/Adan
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Adan is an optimization algorithm designed to speed up the training of deep learning models. Given your model's parameters and a learning rate, it produces parameter updates that converge faster than traditional optimizers. It is aimed at machine learning practitioners and researchers training advanced deep models across a range of domains.
808 stars. No commits in the last 6 months.
Use this if you are training large deep learning models such as Vision Transformers, BERT, GPT-2, or text-to-3D generation models and want faster convergence, potentially with higher learning rates than traditional optimizers tolerate.
Not ideal if you are working with simpler machine learning models, or if per-GPU memory footprint is a critical constraint and distributed training is not an option.
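To make the "adaptive Nesterov momentum" idea concrete, here is a minimal pure-Python sketch of one Adan update step, following the update rule from the Adan paper (first moment of gradients, first moment of gradient differences, and a second moment of the combined term). This is an illustrative reimplementation, not the repository's code: the actual project ships a fused PyTorch optimizer, and this sketch omits its bias-correction terms for brevity. The default betas `(0.98, 0.92, 0.99)` match the PyTorch convention used by the repo.

```python
import math

def adan_step(theta, grads, prev_grads, m, v, n,
              lr=1e-3, betas=(0.98, 0.92, 0.99), eps=1e-8, weight_decay=0.0):
    """One Adan update over lists of scalar parameters (illustrative sketch;
    bias correction from the official implementation is omitted)."""
    b1, b2, b3 = betas
    new_theta, new_m, new_v, new_n = [], [], [], []
    for t, g, pg, mi, vi, ni in zip(theta, grads, prev_grads, m, v, n):
        d = g - pg                                  # gradient difference (Nesterov-style term)
        mi = b1 * mi + (1 - b1) * g                 # EMA of gradients
        vi = b2 * vi + (1 - b2) * d                 # EMA of gradient differences
        ni = b3 * ni + (1 - b3) * (g + b2 * d) ** 2  # EMA of squared combined gradient
        eta = lr / (math.sqrt(ni) + eps)            # adaptive per-parameter step size
        t = (t - eta * (mi + b2 * vi)) / (1 + lr * weight_decay)  # decoupled weight decay
        new_theta.append(t); new_m.append(mi); new_v.append(vi); new_n.append(ni)
    return new_theta, new_m, new_v, new_n

# Toy usage: minimize f(theta) = theta**2 starting from theta = 5.
theta, m, v, n = [5.0], [0.0], [0.0], [0.0]
prev_g = None
for _ in range(500):
    g = [2.0 * t for t in theta]        # gradient of theta**2
    pg = g if prev_g is None else prev_g  # first step: no gradient history yet
    theta, m, v, n = adan_step(theta, g, pg, m, v, n, lr=0.05)
    prev_g = g
```

Note how the gradient-difference term `v` is what distinguishes Adan from Adam: it injects a Nesterov-style look-ahead correction without requiring an extrapolated gradient evaluation.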
Stars
808
Forks
71
Language
Python
License
Apache-2.0
Category
Last pushed
Jun 08, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/sail-sg/Adan"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
nschaetti/EchoTorch
A Python toolkit for Reservoir Computing and Echo State Network experimentation based on...
metaopt/torchopt
TorchOpt is an efficient library for differentiable optimization built upon PyTorch.
gpauloski/kfac-pytorch
Distributed K-FAC preconditioner for PyTorch
opthub-org/pytorch-bsf
PyTorch implementation of Bezier simplex fitting
pytorch/xla
Enabling PyTorch on XLA Devices (e.g. Google TPU)