rasbt/pytorch-memory-optim
This repository contains the code accompanying my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog post.
This project offers practical code examples and scripts for PyTorch developers training large language models (LLMs) and vision transformers. It demonstrates techniques for reducing the memory footprint during model training, helping you work with larger models or limited GPU resources. Starting from baseline PyTorch training code, it walks through modifications that lower GPU memory usage and shows their measured effect.
No commits in the last 6 months.
Use this if you are a PyTorch developer hitting "out of memory" errors or looking to reduce GPU memory usage when training large AI models.
Not ideal if you are not a PyTorch developer, or if you want a fully automated, black-box memory optimization solution that requires no changes to your code.
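As a flavor of what the repository covers, the sketch below demonstrates two widely used PyTorch memory-saving techniques: activation (gradient) checkpointing and mixed-precision autocast. This is a minimal illustration, not code from the repository; the `Block` model is a hypothetical stand-in for an LLM or ViT layer, and bfloat16 is chosen so the example also runs on CPU.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Hypothetical toy block -- stands in for a transformer layer.
class Block(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )

    def forward(self, x):
        return self.net(x)

model = nn.ModuleList([Block() for _ in range(4)])
x = torch.randn(8, 64, requires_grad=True)

# Technique 1: activation checkpointing -- skip storing intermediate
# activations and recompute them during the backward pass,
# trading extra compute for lower peak memory.
def forward_with_checkpointing(x):
    for block in model:
        x = checkpoint(block, x, use_reentrant=False)
    return x

# Technique 2: mixed-precision autocast -- run matmul-heavy ops in a
# lower-precision dtype to shrink activation memory.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = forward_with_checkpointing(x)

out.float().sum().backward()  # gradients flow through the checkpoints
```

On a GPU you would typically use `device_type="cuda"` with float16 or bfloat16 plus a `GradScaler` where needed; the blog post benchmarks these and further techniques (such as lower-precision optimizers and parameter offloading) in combination.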
Stars: 92
Forks: 11
Language: Python
License: Apache-2.0
Category:
Last pushed: Jul 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/rasbt/pytorch-memory-optim"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
Higher-rated alternatives
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
rasbt/reasoning-from-scratch
Implement a reasoning LLM in PyTorch from scratch, step by step
mindspore-lab/mindnlp
MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...
mosaicml/llm-foundry
LLM training code for Databricks foundation models
rickiepark/llm-from-scratch
Code repository for *Build an LLM from Scratch, Step by Step* (Gilbut, 2025)