Gradient Descent Optimizers (ML Frameworks)
Implementations and variants of optimization algorithms (SGD, Adam, RMSprop, etc.) for training neural networks. Does NOT include hyperparameter tuning tools, learning rate schedulers as standalone tools, or general black-box optimization frameworks.
There are 71 gradient descent optimizer frameworks tracked; 11 score above 50 (the established tier). The highest-rated is nschaetti/EchoTorch at 60/100, with 490 stars.
Fetch the top 20 projects as JSON (raise `limit` to retrieve all 71):

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=gradient-descent-optimizers&limit=20"
```

The API is open to everyone at 100 requests/day with no key; a free key raises the limit to 1,000/day.
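For scripted access, the same query can be built and its response filtered in Python. This is a sketch only: the `projects`/`name`/`tier` response fields are assumptions about the payload shape, not documented API behavior.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def quality_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Build the query URL for the quality-dataset endpoint shown above."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"

def established_only(payload: str) -> list:
    """Keep only established-tier framework names from a JSON response.

    The {"projects": [{"name": ..., "tier": ...}]} shape is an assumption;
    adjust the keys to match the actual response.
    """
    data = json.loads(payload)
    return [p["name"] for p in data.get("projects", []) if p.get("tier") == "Established"]

url = quality_url("ml-frameworks", "gradient-descent-optimizers", limit=71)
print(url)

# A hypothetical response fragment, used here only to exercise the filter.
sample = '{"projects": [{"name": "nschaetti/EchoTorch", "tier": "Established"}]}'
print(established_only(sample))
```

Paging through with a larger `limit` (as above) avoids hand-stitching multiple responses together.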
| # | Framework | Description | Score | Tier |
|---|---|---|---|---|
| 1 | nschaetti/EchoTorch | A Python toolkit for Reservoir Computing and Echo State Network... | 60 | Established |
| 2 | metaopt/torchopt | TorchOpt is an efficient library for differentiable optimization built upon PyTorch. | | Established |
| 3 | gpauloski/kfac-pytorch | Distributed K-FAC preconditioner for PyTorch | | Established |
| 4 | opthub-org/pytorch-bsf | PyTorch implementation of Bezier simplex fitting | | Established |
| 5 | pytorch/xla | Enabling PyTorch on XLA Devices (e.g. Google TPU) | | Established |
| 6 | stanford-centaur/PyPantograph | A Machine-to-Machine Interaction System for Lean 4. | | Established |
| 7 | SimplexLab/TorchJD | Library for Jacobian descent with PyTorch. It enables the optimization of... | | Established |
| 8 | clovaai/AdamP | AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant... | | Established |
| 9 | kozistr/pytorch_optimizer | Optimizer, LR scheduler, and loss function collections in PyTorch | | Established |
| 10 | xiaoyuxie-vico/PyDimension | Dimensionless learning | | Established |
| 11 | Tony-Y/pytorch_warmup | Learning Rate Warmup in PyTorch | | Established |
| 12 | lean-dojo/LeanDojo-v2 | LeanDojo-v2 is an end-to-end framework for training, evaluating, and... | | Emerging |
| 13 | NoteDance/optimizers | This project implements optimizers for TensorFlow and Keras, which can be... | | Emerging |
| 14 | augustepoiroux/LeanInteract | LeanInteract: A Python Interface for Lean 4 | | Emerging |
| 15 | kach/gradient-descent-the-ultimate-optimizer | Code for our NeurIPS 2022 paper | | Emerging |
| 16 | nlesc-dirac/pytorch | Improved LBFGS and LBFGS-B optimizers in PyTorch. | | Emerging |
| 17 | ildoonet/pytorch-gradual-warmup-lr | Gradually-Warmup Learning Rate Scheduler for PyTorch | | Emerging |
| 18 | locuslab/optnet | OptNet: Differentiable Optimization as a Layer in Neural Networks | | Emerging |
| 19 | evanatyourservice/kron_torch | An implementation of PSGD Kron second-order optimizer for PyTorch | | Emerging |
| 20 | facebookresearch/theseus | A library for differentiable nonlinear optimization | | Emerging |
| 21 | sail-sg/Adan | Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models | | Emerging |
| 22 | j-w-yun/optimizer-visualization | Visualize TensorFlow's optimizers. | | Emerging |
| 23 | OptimalFoundation/nadir | Nadir: Cutting-edge PyTorch optimizers for simplicity & composability! | | Emerging |
| 24 | 100/Solid | A comprehensive gradient-free optimization framework written in Python | | Emerging |
| 25 | lixilinx/psgd_torch | PyTorch implementation of preconditioned stochastic gradient descent (Kron... | | Emerging |
| 26 | muooon/EmoNavi | An emotion-driven optimizer that feels loss and navigates accordingly. | | Emerging |
| 27 | ayaka14732/tpu-starter | Everything you want to know about Google Cloud TPU | | Emerging |
| 28 | warner-benjamin/optimi | Fast, Modern, and Low Precision PyTorch Optimizers | | Emerging |
| 29 | JGalego/torchlib | Deep learning meets Lean4 | | Emerging |
| 30 | team-approx-bayes/ivon | IVON optimizer for neural networks based on variational learning. | | Emerging |
| 31 | nanowell/AdEMAMix-Optimizer-Pytorch | The AdEMAMix Optimizer: Better, Faster, Older. | | Emerging |
| 32 | gugarosa/otorchmizer | Otorchmizer is a PyTorch-based library consisting of meta-heuristic... | | Emerging |
| 33 | tianrui-qi/ADMM-for-SVM | Alternating Direction Method of Multipliers for Support Vector Machine | | Emerging |
| 34 | ltatzel/PyTorchHessianFree | PyTorch implementation of the Hessian-free optimizer | | Emerging |
| 35 | gugugu12138/AdaptoFlux | An algorithm that implements intelligence based on a Method pool (a... | | Emerging |
| 36 | IMvision12/AdEMAMix-Optimizer-Keras | A Keras 3 Implementation of AdEMAMix Optimizer | | Emerging |
| 37 | kiligon/spotax | CLI tool for running JAX training on Google Cloud Spot TPUs with automatic... | | Emerging |
| 38 | instadeepai/sebulba | The Sebulba architecture to scale reinforcement learning on Cloud TPUs in JAX | | Emerging |
| 39 | SirRob1997/Crowded-Valley---Results | This repository contains the results for the paper: "Descending through a... | | Emerging |
| 40 | Axect/pytorch-scheduler | A comprehensive, research-driven collection of learning rate schedulers for... | | Emerging |
| 41 | MoFHeka/xla-launcher | XLA Launcher is a high-performance, lightweight C++ library designed to... | | Emerging |
| 42 | thieu1995/GrafoRVFL | GrafoRVFL: A Gradient-Free Optimization Framework for Boosting Random Vector... | | Emerging |
| 43 | wassname/viz_torch_optim | Videos of deep learning optimizers moving on 3D problem-landscapes | | Emerging |
| 44 | yinleung/FSGDM | [ICLR 2025] On the Performance Analysis of Momentum Method: A Frequency... | | Emerging |
| 45 | e-sensing/torchopt | R implementation of advanced optimizers for torch | | Emerging |
| 46 | Brokttv/optimizers-from-scratch | Training models with different optimizers using NumPy only. Featuring SGD,... | | Experimental |
| 47 | thetechdude124/Adam-Optimization-From-Scratch | Implementing the ADAM optimizer from the ground up with PyTorch and... | | Experimental |
| 48 | AroMorin/DNNOP | Deep Neural Network Optimization Platform with Gradient-based, Gradient-Free... | | Experimental |
| 49 | fabian-sp/MoMo | MoMo: Momentum Models for Adaptive Learning Rates | | Experimental |
| 50 | Gunale0926/Grams | Grams: Gradient Descent with Adaptive Momentum Scaling (ICLR 2025 Workshop) | | Experimental |
| 51 | OpenEnvision-Lab/ScalingOPT | ScalingOPT [LLM] | | Experimental |
| 52 | ChrisPinedaSanhueza/nested-learning-optimizer | Optimize TensorFlow models with the Nested Learning Optimizer for improved... | | Experimental |
| 53 | adrienkegreisz/ano-optimizer | Lightweight and customizable optimizer compatible with PyTorch and TensorFlow. | | Experimental |
| 54 | bangyen/leansharp | Formal verification of Z-Score filtered Sharpness-Aware Minimization (SAM)... | | Experimental |
| 55 | AhmedMostafa16/EXAdam | Official implementation of EXAdam optimizer from the paper... | | Experimental |
| 56 | aytugyuruk/optimizer-comparisions-training-with-limited-epochs | Optimizer Comparison Study: Empirical analysis of SGD vs Adam performance... | | Experimental |
| 57 | nfocardoso/thermopt | Drop-in PyTorch optimizer that beats AdamW with lower variance | | Experimental |
| 58 | shreyansh26/ML-Optimizers-JAX | Toy implementations of some popular ML optimizers using Python/JAX | | Experimental |
| 59 | adrienkegreisz/ano-experiments | The source code of the ANO paper: a robust optimizer for deep learning in... | | Experimental |
| 60 | nisheethjaiswal/ROLLING-DOWN-A-CROWDED-VALLEY-OF-OPTIMIZERS-DEVELOPMENTS-FROM-SGD | Deep Learning Optimizers | | Experimental |
| 61 | smithhenryd/Lazy-Training | Yale S&DS 432 final project studying lazy training dynamics for... | | Experimental |
| 62 | wyzjack/AdaM3 | [ICDM 2023] Momentum is All You Need for Data-Driven Adaptive Optimization | | Experimental |
| 63 | tony-wade/optimizers | Extension optimizers for PyTorch. | | Experimental |
| 64 | Figirs/Neural-Flow-Optimizer | A Python-based library for optimizing gradient descent in deep neural networks. | | Experimental |
| 65 | NekkittAY/MAMGD_Optimizer | Gradient optimization method using exponential damping and second-order... | | Experimental |
| 66 | imehranasgari/DL_Optimizer_RMSpropNesterov_Custom | Custom RMSprop optimizer with Nesterov momentum in pure Python/NumPy. Built... | | Experimental |
| 67 | motasemwed/optimization-algorithms-comparison | A practical comparison of classical optimization algorithms (GD, SGD,... | | Experimental |
| 68 | i207M/MultiAdam | Code for MultiAdam: Parameter-wise Scale-invariant Optimizer for... | | Experimental |
| 69 | dandycheng/ml-gradient-descent-optimization | Gradient descent optimization algorithms comparison coded from scratch.... | | Experimental |
| 70 | dscamiss/generalized-newtons-method | PyTorch implementation of the generalized Newton's method for learning rate selection | | Experimental |
| 71 | Mac0490/Neural-Network-Optimization-Hessian-Based-Analysis | This project investigates the relationship between neural network... | | Experimental |
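Several of the experimental entries above (e.g. Brokttv/optimizers-from-scratch, thetechdude124/Adam-Optimization-From-Scratch) reimplement the classic update rules directly. As a minimal sketch of what such a from-scratch Adam looks like on a toy quadratic, using the standard Adam equations (the function names and hyperparameters here are illustrative, not taken from any listed repository):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient and its
    square, with bias correction for the zero-initialized moments."""
    m = b1 * m + (1 - b1) * g          # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * g * g      # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)          # bias-corrected estimates
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = ||w||^2 from a random start.
rng = np.random.default_rng(0)
w = rng.normal(size=3)
m = np.zeros_like(w)
v = np.zeros_like(w)
start_loss = float(w @ w)
for t in range(1, 201):
    g = 2.0 * w                        # gradient of ||w||^2
    w, m, v = adam_step(w, g, m, v, t)
final_loss = float(w @ w)
print(start_loss, final_loss)
```

The per-coordinate division by the square root of the second moment is what distinguishes Adam from plain momentum SGD: each parameter gets its own effective step size.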