0xNaN/edufsdp

A minimal, educational implementation of Fully Sharded Data Parallel (FSDP).

/ 100

Experimental

This project offers a clear and simplified look at how Fully Sharded Data Parallel (FSDP) works. It takes a complex distributed training strategy and breaks down its core components like parameter sharding and data gathering. Machine learning practitioners, especially those learning about or implementing large model training, can use this to understand the underlying mechanisms without getting lost in optimization details.

Use this if you are a machine learning engineer or researcher who wants to deeply understand the fundamental concepts of FSDP for distributed model training.

Not ideal if you need a production-ready FSDP implementation with advanced features like mixed precision, communication-computation overlap, or CPU offloading.

distributed-training large-language-models deep-learning-engineering model-optimization

No License No Package No Dependents

Maintenance 10 / 25

Adoption 5 / 25

Maturity 5 / 25

Community 6 / 25

How are scores calculated?

Stars

Forks

Language

Python

License

—

Higher-rated alternatives

deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference...

helmholtz-analytics/heat

Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python

hpcaitech/ColossalAI

Making large AI models cheaper, faster and more accessible

horovod/horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

bsc-wdc/dislib

The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.

Explore ML Frameworks

All categories Trending ML Framework directory Insights