cake-lab/DELI
Optimizes loading of training data from cloud bucket storage for cloud-based distributed deep learning. Official repository for "Quantifying and Improving Performance of Distributed Deep Learning with Cloud Storage", published at IC2E 2021.
When training large-scale deep learning models in the cloud, the training data is typically stored in cloud storage buckets. This project speeds up loading that data, reducing the time the training loop spends waiting on I/O and making distributed training faster and more cost-effective. It is aimed at deep learning engineers and MLOps specialists working on cloud infrastructure.
No commits in the last 6 months.
Use this if you are a deep learning engineer training models with large datasets stored in cloud storage buckets, and you're experiencing slow data loading times during distributed training.
Not ideal if your deep learning models are small, you train on a single machine, or your data is primarily stored on local disks rather than cloud buckets.
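To illustrate the bottleneck DELI targets, here is a minimal sketch (not DELI's actual API — `fetch_sample` is a hypothetical stand-in for a slow cloud-bucket read) of overlapping storage fetches with training steps via a background prefetch thread, so the training loop rarely blocks on I/O:

```python
import queue
import threading
import time

def fetch_sample(i):
    # Hypothetical stand-in for a cloud-bucket read with network latency.
    time.sleep(0.01)
    return f"sample-{i}"

def prefetching_loader(n, depth=4):
    # Fetch samples in a background thread; the bounded queue lets
    # storage reads overlap with the consumer's training steps.
    q = queue.Queue(maxsize=depth)

    def worker():
        for i in range(n):
            q.put(fetch_sample(i))
        q.put(None)  # sentinel: no more samples

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not None:
        yield item

samples = list(prefetching_loader(8))
print(samples[0])  # sample-0
```

This is the general prefetch-pipeline pattern; DELI's contribution is quantifying and optimizing this for distributed training against real cloud storage backends.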
Stars
11
Forks
1
Language
Jupyter Notebook
License
MIT
Category
Last pushed
Jan 01, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/cake-lab/DELI"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
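For programmatic use, the endpoint above follows a `category/owner/name` path. A small sketch of building the URL and reading a response (the response field names here are assumptions, not documented API output):

```python
import json

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, repo: str) -> str:
    # Build the API endpoint for a repository given its category
    # and "owner/name" slug.
    return f"{API_BASE}/{category}/{repo}"

url = quality_url("ml-frameworks", "cake-lab/DELI")
print(url)

# Hypothetical response shape -- actual field names may differ.
sample = json.loads('{"stars": 11, "forks": 1, "license": "MIT"}')
print(sample["stars"])  # 11
```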
Higher-rated alternatives
deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference...
helmholtz-analytics/heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
horovod/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
bsc-wdc/dislib
The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.