rkhan055/SHADE
SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
SHADE helps machine learning engineers and researchers accelerate deep learning training on large datasets spread across multiple machines. It identifies and caches the most important data samples during distributed training, cutting repeated fetches from remote storage. You supply your existing deep learning model and dataset; the output is reduced training time.
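The core idea above — keep the most important samples resident and evict the rest — can be sketched as a small importance-aware cache. This is a minimal illustration of the concept, not SHADE's actual implementation; the class name, scoring, and eviction policy here are assumptions (SHADE's paper describes a more sophisticated, rank-based scheme):

```python
class ImportanceCache:
    """Hypothetical sketch of an importance-aware sample cache:
    retain the samples with the highest importance scores (for
    example, per-sample training loss) and evict the least
    important cached sample when the cache is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = {}     # sample_id -> data
        self.importance = {}  # sample_id -> score

    def put(self, sample_id, data, score):
        if sample_id in self.samples:
            # Refresh data and score for an already-cached sample.
            self.samples[sample_id] = data
            self.importance[sample_id] = score
            return
        if len(self.samples) >= self.capacity:
            # Find the least-important cached sample.
            victim = min(self.importance, key=self.importance.get)
            if score <= self.importance[victim]:
                return  # new sample is no more important; skip caching
            del self.samples[victim]
            del self.importance[victim]
        self.samples[sample_id] = data
        self.importance[sample_id] = score

    def get(self, sample_id):
        # None signals a cache miss: fetch from backing storage.
        return self.samples.get(sample_id)
```

In a training loop, `put` would be called after each forward pass with the sample's loss as its score, so the cache gradually converges on the hardest (most informative) samples.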
No commits in the last 6 months.
Use this if you are a machine learning engineer or researcher experiencing slow deep learning model training due to data fetching bottlenecks in a distributed computing environment.
Not ideal if you are working with small datasets or training models on a single machine where data caching is less critical for performance.
Stars: 36
Forks: 9
Language: Python
License: MIT
Category:
Last pushed: Mar 01, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/rkhan055/SHADE"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
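The curl command above can be reproduced from Python using only the standard library. This is a sketch under stated assumptions: only the SHADE URL is confirmed by this listing, the path layout for other repositories is inferred from that example, and the JSON response schema is not documented here:

```python
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for a repository.
    Path layout is inferred from the single example above."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo):
    """GET the endpoint and parse the JSON body.
    The response schema is undocumented here, so the parsed
    object is returned as-is for the caller to inspect."""
    with urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Same request as the curl example (counts against the
    # 100 requests/day anonymous quota).
    print(fetch_quality("ml-frameworks", "rkhan055", "SHADE"))
```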
Higher-rated alternatives
deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference...
helmholtz-analytics/heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
horovod/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
bsc-wdc/dislib
The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.