Gradient Descent Optimizers (ML Frameworks)

Implementations and variants of optimization algorithms (SGD, Adam, RMSprop, etc.) for training neural networks. Does NOT include hyperparameter tuning tools, learning rate schedulers as standalone tools, or general black-box optimization frameworks.

There are 71 gradient descent optimizer frameworks tracked. 11 score 50 or higher (the Established tier). The highest-rated is nschaetti/EchoTorch at 60/100 with 490 stars.

Get all 71 projects as JSON:

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=gradient-descent-optimizers&limit=71"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
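The same query can be issued programmatically. The sketch below builds the request URL from the parameters shown in the curl example; it is a minimal illustration, and the commented-out fetch assumes the endpoint returns plain JSON (the response schema is not documented here).

```python
from urllib.request import urlopen
from urllib.parse import urlencode
import json

BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def quality_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Build the query URL for the quality endpoint shown above."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE}?{query}"

url = quality_url("ml-frameworks", "gradient-descent-optimizers", limit=71)

# Fetching counts against the 100 requests/day anonymous limit:
# data = json.load(urlopen(url))
```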

| # | Framework | Description | Score | Tier |
|---|-----------|-------------|-------|------|
| 1 | nschaetti/EchoTorch | A Python toolkit for Reservoir Computing and Echo State Network... | 60 | Established |
| 2 | metaopt/torchopt | TorchOpt is an efficient library for differentiable optimization built upon PyTorch. | 59 | Established |
| 3 | gpauloski/kfac-pytorch | Distributed K-FAC preconditioner for PyTorch | 58 | Established |
| 4 | opthub-org/pytorch-bsf | PyTorch implementation of Bezier simplex fitting | 58 | Established |
| 5 | pytorch/xla | Enabling PyTorch on XLA Devices (e.g. Google TPU) | 57 | Established |
| 6 | stanford-centaur/PyPantograph | A Machine-to-Machine Interaction System for Lean 4. | 56 | Established |
| 7 | SimplexLab/TorchJD | Library for Jacobian descent with PyTorch. It enables the optimization of... | 55 | Established |
| 8 | clovaai/AdamP | AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant... | 54 | Established |
| 9 | kozistr/pytorch_optimizer | optimizer & lr scheduler & loss function collections in PyTorch | 52 | Established |
| 10 | xiaoyuxie-vico/PyDimension | Dimensionless learning | 51 | Established |
| 11 | Tony-Y/pytorch_warmup | Learning Rate Warmup in PyTorch | 50 | Established |
| 12 | lean-dojo/LeanDojo-v2 | LeanDojo-v2 is an end-to-end framework for training, evaluating, and... | 48 | Emerging |
| 13 | NoteDance/optimizers | This project implements optimizers for TensorFlow and Keras, which can be... | 47 | Emerging |
| 14 | augustepoiroux/LeanInteract | LeanInteract: A Python Interface for Lean 4 | 47 | Emerging |
| 15 | kach/gradient-descent-the-ultimate-optimizer | Code for our NeurIPS 2022 paper | 47 | Emerging |
| 16 | nlesc-dirac/pytorch | Improved LBFGS and LBFGS-B optimizers in PyTorch. | 47 | Emerging |
| 17 | ildoonet/pytorch-gradual-warmup-lr | Gradually-Warmup Learning Rate Scheduler for PyTorch | 47 | Emerging |
| 18 | locuslab/optnet | OptNet: Differentiable Optimization as a Layer in Neural Networks | 46 | Emerging |
| 19 | evanatyourservice/kron_torch | An implementation of PSGD Kron second-order optimizer for PyTorch | 45 | Emerging |
| 20 | facebookresearch/theseus | A library for differentiable nonlinear optimization | 45 | Emerging |
| 21 | sail-sg/Adan | Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models | 45 | Emerging |
| 22 | j-w-yun/optimizer-visualization | Visualize Tensorflow's optimizers. | 45 | Emerging |
| 23 | OptimalFoundation/nadir | Nadir: Cutting-edge PyTorch optimizers for simplicity & composability! 🔥🚀💻 | 44 | Emerging |
| 24 | 100/Solid | 🎯 A comprehensive gradient-free optimization framework written in Python | 43 | Emerging |
| 25 | lixilinx/psgd_torch | Pytorch implementation of preconditioned stochastic gradient descent (Kron... | 42 | Emerging |
| 26 | muooon/EmoNavi | An emotion-driven optimizer that feels loss and navigates accordingly. | 40 | Emerging |
| 27 | ayaka14732/tpu-starter | Everything you want to know about Google Cloud TPU | 39 | Emerging |
| 28 | warner-benjamin/optimi | Fast, Modern, and Low Precision PyTorch Optimizers | 39 | Emerging |
| 29 | JGalego/torchlib | Deep learning meets Lean4 🔥✅ | 38 | Emerging |
| 30 | team-approx-bayes/ivon | IVON optimizer for neural networks based on variational learning. | 37 | Emerging |
| 31 | nanowell/AdEMAMix-Optimizer-Pytorch | The AdEMAMix Optimizer: Better, Faster, Older. | 36 | Emerging |
| 32 | gugarosa/otorchmizer | 🐦 Otorchmizer is a PyTorch-based library consisting of meta-heuristic... | 36 | Emerging |
| 33 | tianrui-qi/ADMM-for-SVM | Alternating Direction Method of Multipliers for Support Vector Machine | 36 | Emerging |
| 34 | ltatzel/PyTorchHessianFree | PyTorch implementation of the Hessian-free optimizer | 35 | Emerging |
| 35 | gugugu12138/AdaptoFlux | An algorithm that implements intelligence based on a Method pool (a... | 34 | Emerging |
| 36 | IMvision12/AdEMAMix-Optimizer-Keras | A Keras 3 Implementation of AdEMAMix Optimizer | 34 | Emerging |
| 37 | kiligon/spotax | CLI tool for running JAX training on Google Cloud Spot TPUs with automatic... | 34 | Emerging |
| 38 | instadeepai/sebulba | The Sebulba architecture to scale reinforcement learning on Cloud TPUs in JAX | 33 | Emerging |
| 39 | SirRob1997/Crowded-Valley---Results | This repository contains the results for the paper: "Descending through a... | 32 | Emerging |
| 40 | Axect/pytorch-scheduler | A comprehensive, research-driven collection of learning rate schedulers for... | 32 | Emerging |
| 41 | MoFHeka/xla-launcher | XLA Launcher is a high-performance, lightweight C++ library designed to... | 32 | Emerging |
| 42 | thieu1995/GrafoRVFL | GrafoRVFL: A Gradient-Free Optimization Framework for Boosting Random Vector... | 31 | Emerging |
| 43 | wassname/viz_torch_optim | Videos of deep learning optimizers moving on 3D problem-landscapes | 31 | Emerging |
| 44 | yinleung/FSGDM | [ICLR 2025] On the Performance Analysis of Momentum Method: A Frequency... | 30 | Emerging |
| 45 | e-sensing/torchopt | R implementation of advanced optimizers for torch | 30 | Emerging |
| 46 | Brokttv/optimizers-from-scratch | training models with different optimizers using NumPy only. Featuring SGD,... | 28 | Experimental |
| 47 | thetechdude124/Adam-Optimization-From-Scratch | 📈 Implementing the ADAM optimizer from the ground up with PyTorch and... | 28 | Experimental |
| 48 | AroMorin/DNNOP | Deep Neural Network Optimization Platform with Gradient-based, Gradient-Free... | 27 | Experimental |
| 49 | fabian-sp/MoMo | MoMo: Momentum Models for Adaptive Learning Rates | 27 | Experimental |
| 50 | Gunale0926/Grams | Grams: Gradient Descent with Adaptive Momentum Scaling (ICLR 2025 Workshop) | 27 | Experimental |
| 51 | OpenEnvision-Lab/ScalingOPT | ScalingOPT [LLM] | 25 | Experimental |
| 52 | ChrisPinedaSanhueza/nested-learning-optimizer | 🚀 Optimize TensorFlow models with the Nested Learning Optimizer for improved... | 22 | Experimental |
| 53 | adrienkegreisz/ano-optimizer | Lightweight and customizable optimizer compatible with PyTorch and TensorFlow. | 22 | Experimental |
| 54 | bangyen/leansharp | Formal verification of Z-Score filtered Sharpness-Aware Minimization (SAM)... | 22 | Experimental |
| 55 | AhmedMostafa16/EXAdam | Official implementation of EXAdam optimizer from the paper... | 21 | Experimental |
| 56 | aytugyuruk/optimizer-comparisions-training-with-limited-epochs | Optimizer Comparison Study - Empirical analysis of SGD vs Adam performance... | 21 | Experimental |
| 57 | nfocardoso/thermopt | Drop-in PyTorch optimizer that beats AdamW with lower variance | 21 | Experimental |
| 58 | shreyansh26/ML-Optimizers-JAX | Toy implementations of some popular ML optimizers using Python/JAX | 21 | Experimental |
| 59 | adrienkegreisz/ano-experiments | The source code of the ANO's paper – a robust optimizer for deep learning in... | 21 | Experimental |
| 60 | nisheethjaiswal/ROLLING-DOWN-A-CROWDED-VALLEY-OF-OPTIMIZERS-DEVELOPMENTS-FROM-SGD | Deep Learning Optimizers | 21 | Experimental |
| 61 | smithhenryd/Lazy-Training | Yale S&DS 432 final project studying lazy training dynamics for... | 21 | Experimental |
| 62 | wyzjack/AdaM3 | [ICDM 2023] Momentum is All You Need for Data-Driven Adaptive Optimization | 19 | Experimental |
| 63 | tony-wade/optimizers | Extension optimizers for the PyTorch. | 18 | Experimental |
| 64 | Figirs/Neural-Flow-Optimizer | A Python-based library for optimizing gradient descent in deep neural networks. | 14 | Experimental |
| 65 | NekkittAY/MAMGD_Optimizer | Gradient optimization method using exponential damping and second-order... | 13 | Experimental |
| 66 | imehranasgari/DL_Optimizer_RMSpropNesterov_Custom | Custom RMSprop optimizer with Nesterov momentum in pure Python/NumPy. Built... | 13 | Experimental |
| 67 | motasemwed/optimization-algorithms-comparison | A practical comparison of classical optimization algorithms (GD, SGD,... | 13 | Experimental |
| 68 | i207M/MultiAdam | Code for MultiAdam: Parameter-wise Scale-invariant Optimizer for... | 12 | Experimental |
| 69 | dandycheng/ml-gradient-descent-optimization | Gradient descent optimization algorithms comparison coded from scratch.... | 11 | Experimental |
| 70 | dscamiss/generalized-newtons-method | PyTorch implementation of the generalized Newton's method for learning rate selection | 11 | Experimental |
| 71 | Mac0490/Neural-Network-Optimization-Hessian-Based-Analysis | This project investigates the relationship between neural network... | 10 | Experimental |