liangyuwang/Tiny-DeepSpeed
Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library
This project helps deep learning developers understand and experiment with techniques that reduce GPU memory usage when training large models like GPT-2. It applies parallelism strategies to your existing PyTorch training code, significantly lowering the GPU memory footprint. This is ideal for machine learning engineers and researchers who work with large language models or other deep neural networks and are hitting GPU memory limits.
No commits in the last 6 months.
Use this if you are a deep learning developer struggling with GPU memory limitations when training large models and want to understand how distributed training strategies like ZeRO can help.
Not ideal if you are looking for a production-ready, fully-featured distributed training library or if you are not a developer working with deep learning models.
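To make the core idea concrete, below is a minimal sketch of ZeRO-1 style optimizer-state sharding in plain PyTorch. It illustrates the general technique only; build_sharded_optimizer and zero1_step are hypothetical names, not Tiny-DeepSpeed's actual API, and the code assumes a torch.distributed process group is already initialized with one process per GPU.

import torch
import torch.distributed as dist

def build_sharded_optimizer(model, rank, world_size, lr=1e-4):
    # Each rank "owns" every world_size-th parameter tensor, so the
    # optimizer's moment buffers exist only once across the process
    # group. (Real implementations balance shards by element count.)
    owned = [p for i, p in enumerate(model.parameters())
             if i % world_size == rank]
    return torch.optim.AdamW(owned, lr=lr)

def zero1_step(model, optimizer, world_size):
    params = list(model.parameters())

    # 1) Average gradients across ranks, as in ordinary data parallelism.
    for p in params:
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad.div_(world_size)

    # 2) Update only the locally owned shard of parameters, then free
    #    all gradients (owned or not).
    optimizer.step()
    for p in params:
        p.grad = None

    # 3) Broadcast each parameter from its owning rank so every replica
    #    finishes the step with identical weights.
    for i, p in enumerate(params):
        dist.broadcast(p.data, src=i % world_size)

Because each rank stores optimizer state for only 1/world_size of the parameters, optimizer memory shrinks roughly linearly with the number of GPUs, which is the main saving ZeRO-1 offers.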
Stars
50
Forks
10
Language
Python
License
Apache-2.0
Category
Last pushed
Aug 20, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/liangyuwang/Tiny-DeepSpeed"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
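The same data can be fetched from Python; a minimal sketch using the requests library, assuming the endpoint returns JSON:

import requests

url = ("https://pt-edge.onrender.com/api/v1/quality/"
       "llm-tools/liangyuwang/Tiny-DeepSpeed")
resp = requests.get(url, timeout=10)  # no API key needed up to 100 requests/day
resp.raise_for_status()               # fail loudly on HTTP errors
print(resp.json())                    # repository quality metrics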
Higher-rated alternatives
Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
catherinesyeh/attention-viz
Visualizing query-key interactions in language + vision transformers (VIS 2023)
microsoft/Text2Grad
🚀 Text2Grad: Converting natural language feedback into gradient signals for precise model...
FareedKhan-dev/Building-llama3-from-scratch
LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its...
huangjia2019/llm-gpt
From classic NLP to modern LLMs: building language models step by step. Companion repository for the 异步图书 book 《GPT图解：大模型是怎样构建的》 (Illustrated GPT: How Large Models Are Built)...