liangyuwang/Tiny-DeepSpeed

Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library

Score: 43/100 (Emerging)

This project helps deep learning developers understand and experiment with techniques for reducing GPU memory usage when training large models such as GPT-2. Starting from ordinary PyTorch training code, it applies distributed parallelism strategies to lower the per-GPU memory footprint. It is aimed at machine learning engineers and researchers who work with large language models or other deep neural networks and are hitting GPU memory limits.

No commits in the last 6 months.

Use this if you are a deep learning developer struggling with GPU memory limitations when training large models and want to understand how distributed training strategies like ZeRO can help.

Not ideal if you are looking for a production-ready, fully-featured distributed training library or if you are not a developer working with deep learning models.
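The note above mentions ZeRO. For readers new to the idea, the following is a minimal, illustrative sketch of ZeRO stage 1 (optimizer-state sharding) written in plain PyTorch with torch.distributed. It is not Tiny-DeepSpeed's or DeepSpeed's API; the toy model, the round-robin partitioning, and the training loop are assumptions made purely to show why sharding optimizer state cuts per-GPU memory.

import os
import torch
import torch.distributed as dist
import torch.nn as nn

def main():
    # Launch with: torchrun --nproc_per_node=<num_gpus> zero1_sketch.py
    dist.init_process_group("nccl" if torch.cuda.is_available() else "gloo")
    rank = dist.get_rank()
    world = dist.get_world_size()
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    if torch.cuda.is_available():
        torch.cuda.set_device(local_rank)
    device = torch.device(f"cuda:{local_rank}" if torch.cuda.is_available() else "cpu")

    # A toy model standing in for GPT-2; every rank holds a full replica.
    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).to(device)
    params = list(model.parameters())

    # ZeRO stage 1 idea: each rank keeps Adam's moment buffers for only
    # its own slice of the parameters instead of for all of them.
    owned = [p for i, p in enumerate(params) if i % world == rank]
    optimizer = torch.optim.Adam(owned, lr=1e-3)

    for _ in range(10):
        x = torch.randn(32, 512, device=device)
        y = torch.randint(0, 10, (32,), device=device)
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()

        # Gradients are averaged across ranks, as in ordinary data parallelism.
        for p in params:
            dist.all_reduce(p.grad)
            p.grad /= world

        # Each rank updates only the parameters whose optimizer state it owns...
        optimizer.step()
        for p in params:
            p.grad = None

        # ...then the owners broadcast the updated values so all replicas match.
        for i, p in enumerate(params):
            dist.broadcast(p.data, src=i % world)

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Because Adam keeps two extra buffers per parameter, sharding that state across N GPUs removes most of the optimizer's memory overhead on each device; later ZeRO stages extend the same idea to gradients and the parameters themselves.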

deep-learning-optimization gpu-memory-management distributed-training large-language-models machine-learning-engineering
Stale (6m) · No Package · No Dependents
Maintenance: 2/25
Adoption: 8/25
Maturity: 16/25
Community: 17/25


Stars: 50
Forks: 10
Language: Python
License: Apache-2.0
Last pushed: Aug 20, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/liangyuwang/Tiny-DeepSpeed"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
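If you prefer Python over curl, the same request can be made with the standard library. The URL is taken verbatim from the command above; that the endpoint returns JSON, and which fields it contains, are assumptions, so this sketch simply prints the raw response.

import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "llm-tools/liangyuwang/Tiny-DeepSpeed")

with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)          # assumes the endpoint returns JSON

print(json.dumps(data, indent=2))   # field names are not documented here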