rasbt/pytorch-memory-optim
This repository contains the code accompanying my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog post.
This project offers practical code examples and scripts for PyTorch developers training large language models (LLMs) and vision transformers. It demonstrates techniques for reducing the memory footprint during model training, helping you work with larger models or limited GPU resources. Starting from baseline PyTorch training code, it walks through modifications that lower GPU memory usage and shows their measured effect.
No commits in the last 6 months.
Use this if you are a PyTorch developer hitting "out of memory" errors or looking to reduce GPU memory usage when training large AI models.
Not ideal if you are not a PyTorch developer, or if you want a fully automated, black-box memory optimization solution that requires no changes to your code.
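As a flavor of what the repository covers, the sketch below demonstrates two widely used PyTorch memory-saving techniques: activation (gradient) checkpointing and mixed-precision autocast. This is a minimal illustration, not code from the repository; the `Block` model is a hypothetical stand-in for an LLM or ViT layer, and bfloat16 is chosen so the example also runs on CPU.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Hypothetical toy block -- stands in for a transformer layer.
class Block(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )

    def forward(self, x):
        return self.net(x)

model = nn.ModuleList([Block() for _ in range(4)])
x = torch.randn(8, 64, requires_grad=True)

# Technique 1: activation checkpointing -- skip storing intermediate
# activations and recompute them during the backward pass,
# trading extra compute for lower peak memory.
def forward_with_checkpointing(x):
    for block in model:
        x = checkpoint(block, x, use_reentrant=False)
    return x

# Technique 2: mixed-precision autocast -- run matmul-heavy ops in a
# lower-precision dtype to shrink activation memory.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = forward_with_checkpointing(x)

out.float().sum().backward()  # gradients flow through the checkpoints
```

On a GPU you would typically use `device_type="cuda"` with float16 or bfloat16 plus a `GradScaler` where needed; the blog post benchmarks these and further techniques (such as lower-precision optimizers and parameter offloading) in combination.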
Stars: 92
Forks: 11
Language: Python
License: Apache-2.0
Category:
Last pushed: Jul 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/rasbt/pytorch-memory-optim"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
Higher-rated alternatives
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
rasbt/reasoning-from-scratch
Implement a reasoning LLM in PyTorch from scratch, step by step
mindspore-lab/mindnlp
MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...
mosaicml/llm-foundry
LLM training code for Databricks foundation models
rickiepark/llm-from-scratch
Code repository for *Build an LLM from Scratch, Step by Step* (Gilbut, 2025)