andreped/GradientAccumulator

:dart: Gradient Accumulation for TensorFlow 2

Score: 41 / 100 (Emerging)

When training large deep learning models with TensorFlow 2, large batch sizes quickly exhaust GPU memory. This project lets you simulate much larger batch sizes than your GPU can natively handle: it wraps your existing TensorFlow model, processes each batch in smaller chunks, and accumulates the gradients until a full virtual batch is complete before applying a single weight update. This helps deep learning researchers and practitioners train complex models that would otherwise be infeasible due to hardware constraints.
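To illustrate the idea (not this library's actual API), here is a minimal framework-agnostic sketch in plain Python: a toy gradient function stands in for backprop, and one "virtual" batch is processed as several micro-batches whose gradients are summed and applied in a single averaged update. All names (`grad`, `accumulated_step`) are hypothetical.

```python
def grad(weights, micro_batch):
    # Toy stand-in for backprop: gradient of mean squared error for y = w*x.
    w = weights[0]
    return [sum(2 * (w * x - y) * x for x, y in micro_batch) / len(micro_batch)]

def accumulated_step(weights, batch, accum_steps, lr=0.01):
    """One optimizer step over `accum_steps` equally sized micro-batches."""
    size = len(batch) // accum_steps
    acc = [0.0] * len(weights)
    for i in range(accum_steps):
        micro = batch[i * size:(i + 1) * size]
        g = grad(weights, micro)
        acc = [a + gi for a, gi in zip(acc, g)]  # accumulate, don't apply yet
    # Single update with the averaged gradient: for equal micro-batches this
    # matches one large-batch step, but only `size` samples are in memory at once.
    return [w - lr * a / accum_steps for w, a in zip(weights, acc)]

data = [(x, 3.0 * x) for x in range(1, 9)]  # fit y = 3x starting from w = 0
w = accumulated_step([0.0], data, accum_steps=4)
```

With equally sized micro-batches, averaging the accumulated gradients gives exactly the full-batch gradient, which is why the virtual batch behaves like a real one.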

No commits in the last 6 months.

Use this if you are a deep learning researcher or practitioner using TensorFlow 2 and need to train models with very large batch sizes but are limited by your GPU's memory.

Not ideal if you are not using TensorFlow 2 or are not dealing with GPU memory limitations during model training.

Tags: deep-learning-training · neural-network-optimization · GPU-memory-management · large-scale-AI · model-training-efficiency
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 17 / 25

How are scores calculated?

Stars: 53
Forks: 11
Language: Python
License: MIT
Last pushed: Feb 11, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/andreped/GradientAccumulator"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.